
BLOG | Follow TMC Data Science in their Kaggle competition

In this blog we keep you updated on the progress, challenges and wins of the TMC Data Science team in their first Kaggle competition.


August 8, 2018 | This week, we as TMC Data Science Employeneurs continued to fight for our place in the rankings of a Kaggle competition. After the pseudo-leak last week that Romain talked about in his article, we started off again using one of the kernels from the forums. This put us back on equal footing with the other contestants.


Since all of the ‘TMC Datathoners’ (the name we gave our group) are at different positions on the learning curve, the reasons everyone joined differ as well. While the more experienced among us do it to stay up to date with the best algorithms available, others joined to learn from their colleagues. This is a perfect example of one of the five pillars of TMC’s Employeneurship model: by working in business cells with their own technical expertise and niche market knowledge, like-minded people work together and valuable knowledge is easily shared. 


Team building is an important aspect as well. Thanks to the weekly meetings, we now see people coming together and spending time working towards a common goal. Even over the weekends, two of the Employeneurs met up for ‘co+(ffee/ding)’ at a local coffee place. Personally, this is one of the things I like about TMC. Since many of the Employeneurs come from outside the Netherlands when they start at TMC, they often face the challenge of building a new social circle. Initiatives like the Kaggle competition help them meet new people and make the transition to a new environment a lot easier.

For me, this has already been a great opportunity to learn. When we first started, competing with data scientists from all over the world seemed very daunting. However, after just the first meeting we had as a group, I realised the potential our team has. The drive of the more experienced data scientists to help out the other members of the team is the main reason I think this is a great initiative, and I truly believe this is not the last competition we will take part in.

SHOOT FOR THE TOP 10% | Romain Huet

July 26, 2018 | As my colleague Valentin explains in his article, we as the Employeneurs of TMC Data Science are participating in a Kaggle competition. I am helping him organize our participation. Since it is our first competition together, our objective is to submit a collaborative piece of work and to finish within the top 10%.

Teaching and learning

To do so, every week we gather cheerful and eager Employeneurs around pizzas to contribute to this project and learn how to use machine learning on a real-world problem. Some of us are more experienced, with a strong background in machine learning. Helping the others keep up with what is happening is therefore a challenge in itself.

In these weekly meetings everyone shares what they have done during the previous week. That leads to open discussions and questions that help everyone learn more about machine learning, especially the curious and the beginners. It is time-consuming for those with more experience, who contribute to the competition themselves and also have to explain things to the others. However, like any teacher, you are happy when people understand and improve in their work.

Learn from failure

The competition we are working on was launched by Santander Group, a Spanish bank, to help them identify the value of transactions for each potential customer. One nice property of the data is that no domain knowledge is required, so we can all focus on pre-processing the data and on the machine learning part. By working with Kaggle “Kernels”, that is, code shared by other Kagglers, we reached the top 14%, until a “leak” appeared. In this kind of competition anything can happen, and within a matter of hours you can find yourself at the bottom of the leaderboard. This pseudo-leak is actually a hack that gives a better understanding of the data. Now everyone, including us, is taking advantage of this data hack by working with “Kernels” shared on the forums.

Participating in such a competition lets you learn about machine learning faster and, with the help of competent people, see how quickly the field is evolving. As for me, in addition to learning, I can pass my knowledge on to others, which in turn helps me realize that I have much more to learn in this amazing field.
