It turns out that with Twitter data alone, we can go quite some way into figuring out someone's personality.
— Anthony Goldbloom
I got really excited about the idea of a data-driven startup just as I was starting Kaggle.
You want to evaluate future borrowers, but in order to train an algorithm that will help you identify future defaults, you have to train it and evaluate it on past data.
Business analytics or predictive modelling is a $100 billion industry, and $41 billion is spent on outsourced business analytics every year. I think that's about twice the size of the movie industry - it's really big.
I love kite surfing and mountain bike riding. It's kind of interesting; my kite surfing ability has probably deteriorated with the rate of Kaggle's success.
Startup stories are always smoother in the telling than they are in reality. A startup is not one, but a series of 'Aha!' moments, and some which seem like 'Aha!' moments but turn out not to be.
Big data is mostly about taking numbers and using those numbers to make predictions about the future. The bigger the data set you have, the more accurate the predictions about the future will be.
We think Facebook and Google know a lot about us - who knows more about us than AmEx, MasterCard and Visa? They know exactly what we spend and where we spend it... so they're looking at ways to unlock it.
If competition for Kaggle's top talent becomes fierce enough among banks, insurance companies, hedge funds - we hope the world's best data scientists will earn more than $50 million per year, just like the world's best hedge fund managers.
Without the discipline of having a wife to come home to, you end up just working all the time.
As a young entrepreneur starting an enterprise company, be prepared for the fact that you'll need to get involved in enterprise sales. Everyone wants to speak to the founder, and this is also how you'll get feedback on your product. It's worth bringing in early somebody with enterprise sales experience.
Companies are getting bitten by hiring a data scientist who isn't really a data scientist.
Our view is that the very best data miners or statisticians can earn as much as the very best golfers or tennis players.
I think I'm okay at everything and not spectacular at any one thing that Kaggle does.