Empty energy drink cans, timers ticking away on giant blue screens, that bug you think you squashed, being exhausted but still trying to fix your model. Excerpts from nightmares, depending on which end you’re standing on.
On the 20th and the 21st of September 2019, over a span of 25 hours, a young cohort of individuals came together to solve a task that had been a problem for the ECB for some time: how to automatically classify products into categories so that inflation rates can be monitored and forecasted?
The purpose of this Hackathon can be distilled into a question: Can you teach a machine a task that would require thousands of hours of human labour and months of training? I know at least 40 people who were keen to find out for themselves. The top 3 teams were also given the opportunity to present their work at the ECB Inflation Conference a few days later.
Working on such intensive projects right from the beginning was one of the most important reasons I chose the Master in Applied Data Science Program at Frankfurt School. The chance to learn while solving real-world tasks, to be in a class with people from the widest range of experiences and backgrounds, and to have fun while doing so was too good an opportunity to pass up. The great promise of Machine Learning is to improve existing domains. People who understand a domain (say, Finance) and learn how to apply Machine Learning to it will have an enviable skill-set. Frankfurt School provides the perfect platform to achieve this.
The Hackathon's task falls under a burgeoning field within Machine and Deep Learning called Natural Language Processing, or NLP. It involves finding ways to teach a computer the semantics of a language. One way to achieve this is to represent every word as a numerical vector. The words must be mapped into a high-dimensional space such that semantically similar words lie close to each other.
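To make the idea of "close to each other" concrete, here is a minimal sketch that compares word vectors using cosine similarity. The three-dimensional vectors are invented by hand purely for illustration; real word embeddings are learned from large amounts of text and typically have hundreds of dimensions. The names `word_vectors` and `cosine_similarity` are hypothetical helpers, not part of any Hackathon solution.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration.
# Real embeddings are learned from text and have hundreds of dimensions.
word_vectors = {
    "milk":   [0.9, 0.8, 0.1],
    "cheese": [0.8, 0.9, 0.2],
    "laptop": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two food words should score higher than a food word and a gadget.
sim_food = cosine_similarity(word_vectors["milk"], word_vectors["cheese"])
sim_mixed = cosine_similarity(word_vectors["milk"], word_vectors["laptop"])

print(f"milk vs cheese: {sim_food:.2f}")
print(f"milk vs laptop: {sim_mixed:.2f}")
```

With these toy values, "milk" and "cheese" point in almost the same direction, while "milk" and "laptop" do not, which is the property a good embedding is meant to capture.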
Now imagine mapping the description, name and category of each product into a high-dimensional space. We can then draw hyperplanes in this space to demarcate the regions that belong to each category. Any future text will be assigned to one of these demarcated regions depending on how it is worded. There are many ways to achieve this. One can use classical (but surprisingly effective) Machine Learning algorithms like SVMs and Random Forests to classify, or use more sophisticated techniques, such as a Long Short-Term Memory (LSTM) neural network or transfer learning with pre-trained language models. Fine-tuning the parameters of such models can be an arduous task, and one needs to rely on experience and heuristics to do so.
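As a rough sketch of the classical route, the snippet below turns product descriptions into TF-IDF vectors and fits a linear SVM with scikit-learn. Everything here is an assumption made for illustration: the product texts and the three categories are invented, the dataset is far too small to be realistic, and this is not the approach of any particular team. It assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data, invented for illustration only.
descriptions = [
    "whole milk 1 litre carton",
    "organic wholemeal bread loaf",
    "cheddar cheese block 200g",
    "mens cotton t-shirt blue",
    "womens denim jeans slim fit",
    "wool winter socks pack",
    "wireless bluetooth headphones",
    "usb-c phone charger cable",
    "led television 40 inch",
]
categories = [
    "food", "food", "food",
    "clothing", "clothing", "clothing",
    "electronics", "electronics", "electronics",
]

# TF-IDF maps each text to a sparse numerical vector; the linear SVM
# then learns separating hyperplanes between the categories.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(descriptions, categories)

# Unseen product texts are assigned to the category whose region they fall in.
new_products = [
    "fresh milk and cheese",
    "cotton socks",
    "bluetooth phone charger",
]
predictions = model.predict(new_products).tolist()

for text, label in zip(new_products, predictions):
    print(f"{text!r} -> {label}")
```

A linear model on TF-IDF features is a common baseline for text classification because it trains quickly and leaves few parameters to tune, which matters when the clock is running. Swapping `LinearSVC` for another scikit-learn classifier, such as `RandomForestClassifier`, only requires changing the last step of the pipeline.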
A Hackathon is exactly that: finding a way to hack a problem through experimentation and reaching a solution over many iterations, like a long marathon. To future participants still mulling over whether to take part: go for it! It will be the most amazing learning experience you can have. It doesn’t matter if you are from a finance, natural science or management background. I saw people with no background in Machine Learning turn rough ideas in their heads into code, build models with some of the highest accuracies and amaze a jury that had decades of experience in the domain. Having to solve a task against the clock pushes people further than they thought possible, and the ability to deliver under time pressure is one of the most desirable skills in a modern knowledge worker.