Rethinking Data Science, Machine Learning, and AI
Vincent Warmerdam is a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning.
In this podcast, Hugo and Vincent discuss:
- Questioning established methods and exploring alternative approaches, such as how we think about recommender systems
- The pitfalls of following trends without critical thinking, like blindly applying supervised learning techniques
- The role of doubt in fostering innovation, using examples from Vincent's experience in building real-world ML systems
- The power of visualization and human intuition, showcasing how visual insights can outperform complex models
- The stories we tell with data and their implications, drawing from Vincent's upcoming book, "Data Science Fiction"
- Strategies for understanding algorithmic systems to prevent failure, such as rethinking evaluation metrics and label quality
- Fostering a culture of open-mindedness and continuous learning, with examples from Vincent's work in the data science community
Vincent shares insights from his experience as an engineer, researcher, team lead, and educator in data science. He also discusses his work at :probabl and how it relates to the broader themes of the conversation, such as focusing on the problem rather than just the tools.
The episode offers ideas for rethinking your approach to data science and machine learning, with concrete examples and case studies to encourage a more creative and critical mindset in your work.
About Vincent Warmerdam
Vincent Warmerdam is a senior data professional and machine learning engineer at :probabl. He is known for defending common sense over hype in data science and has a preference for simpler, scalable solutions. Vincent is the creator of calmcode.io, co-founder and co-chair of PyData Amsterdam, and has been involved in numerous data projects and open-source initiatives. His upcoming book, "Data Science Fiction," explores the stories we tell with data and their implications.
0:00 - Introduction: Vincent's background in data science and ML
5:50 - Real-world data science failure mode: The theater expansion case study
11:24 - Rethinking common approaches in ML and data science
18:21 - Case study: World Food Organization and the importance of problem framing
28:22 - Systems thinking in data science and its practical applications
35:50 - Open-source in action: Introduction to the Scikit-Lego project
42:00 - The overlooked importance of UI/UX in data science projects
52:49 - Live demo: Neural search tool for arXiv papers using embeddings
1:04:00 - Discussion on the role of LLMs in data science workflows
1:24:31 - Innovation in ML pipelines: Demo of the Scikit-Playtime project
1:33:43 - Future trends in AI/ML and the importance of knowledge sharing