THE FUTURE IS HERE

Rethinking Data Science, Machine Learning, and AI

Vincent Warmerdam is a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning.

In this podcast, Hugo and Vincent discuss:

– Questioning established methods and exploring alternative approaches, such as how we think about recommender systems
– The pitfalls of following trends without critical thinking, like blindly applying supervised learning techniques
– The role of doubt in fostering innovation, using examples from Vincent’s experience in building real-world ML systems
– The power of visualization and human intuition, showcasing how visual insights can outperform complex models
– The stories we tell with data and their implications, drawing from Vincent’s upcoming book, “Data Science Fiction”
– Strategies for understanding algorithmic systems to prevent failure, such as rethinking evaluation metrics and label quality
– Fostering a culture of open-mindedness and continuous learning, with examples from Vincent’s work in the data science community

Vincent shares insights from his experience as an engineer, researcher, team lead, and educator in data science. He discusses his work at :probabl and how it relates to the broader themes of the conversation, such as the importance of focusing on the problem rather than just the tools.

This episode offers ideas for rethinking your approach to data science and machine learning, with concrete examples and case studies to inspire a more creative and critical mindset in your work.

About Vincent Warmerdam

Vincent Warmerdam is a senior data professional and machine learning engineer at :probabl. He is known for defending common sense over hype in data science and has a preference for simpler, scalable solutions. Vincent is the creator of calmcode.io, co-founder and co-chair of PyData Amsterdam, and has been involved in numerous data projects and open-source initiatives. His upcoming book, “Data Science Fiction,” explores the stories we tell with data and their implications.

0:00 – Introduction: Vincent’s background in data science and ML
5:50 – Real-world data science failure mode: The theater expansion case study
11:24 – Rethinking common approaches in ML and data science
18:21 – Case study: World Food Organization and the importance of problem framing
28:22 – Systems thinking in data science and its practical applications
35:50 – Open-source in action: Introduction to the Scikit-Lego project
42:00 – The overlooked importance of UI/UX in data science projects
52:49 – Live demo: Neural search tool for arXiv papers using embeddings
1:04:00 – Discussion on the role of LLMs in data science workflows
1:24:31 – Innovation in ML pipelines: Demo of the Scikit-Playtime project
1:33:43 – Future trends in AI/ML and the importance of knowledge sharing