THE FUTURE IS HERE

Dive into Deep Learning: Coding Session #4 – Attention Mechanism I (Americas/EMEA)

📌 Session #4 – Attention mechanism (Transformer) implementation (Part I)
📌 Introduction, Coding

About:
The goal of this series is to provide code-focused sessions by reimplementing selected models from the interactive open-source book “Dive into Deep Learning” (http://d2l.ai/) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola. In this session, we will discuss the attention mechanism, covering parts of Chapter 10: Attention Mechanisms (https://d2l.ai/chapter_attention-mechanisms/).
We recommend that interested participants read this chapter beforehand to take full advantage of the session.
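
To give a flavor of the session, here is a minimal sketch of scaled dot-product attention, the building block at the heart of Chapter 10. It is written in PyTorch purely for illustration; the session itself may follow the book's own implementation.

import math
import torch

def scaled_dot_product_attention(queries, keys, values):
    # queries: (batch, n_queries, d); keys/values: (batch, n_kv, d)
    d = queries.shape[-1]
    # Score each query against every key, scaled by sqrt(d) so the
    # softmax does not saturate when d is large.
    scores = torch.bmm(queries, keys.transpose(1, 2)) / math.sqrt(d)
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    # Each output is an attention-weighted average of the values.
    return torch.bmm(weights, values)

# Toy check: 2 queries attending over 5 key-value pairs of dimension 8.
q, k, v = torch.randn(1, 2, 8), torch.randn(1, 5, 8), torch.randn(1, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 2, 8])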

These sessions are meant for people who are interested in implementing models from scratch and/or have never implemented a model before. We hope to help participants either get started on their Machine Learning journey or deepen their knowledge if they already have prior experience.

We will try to achieve this by:

* Helping participants build their models end-to-end by reimplementing models from scratch, and discussing which modules/elements need to be included (e.g. data preprocessing, dataset generation, data transformation, etc.) to train an ML model (see the sketch after this list).

* Discussing and resolving coding questions participants might have during the sessions.
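
For reference, below is a hypothetical end-to-end training skeleton in PyTorch showing those pieces together; the dataset, model, and hyperparameters are illustrative placeholders, not the sessions' actual code.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Data preprocessing / dataset generation (toy regression data).
X = torch.randn(256, 4)
y = X @ torch.tensor([1.0, -2.0, 0.5, 3.0]).unsqueeze(1) + 0.1 * torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# 2. Model definition (a small MLP, as in Session 1).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))

# 3. Training loop: forward pass, loss, backward pass, parameter update.
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")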

📌 Session Leads: Devansh Agarwal and Kshitij Aggarwal

Devansh Agarwal is a Data Scientist at BMS. He holds a Ph.D. in Astronomy, during which he developed pipelines using high-performance computing and machine learning to aid the discovery of astronomical objects. https://www.linkedin.com/in/devanshkv/

Kshitij Aggarwal is a 4th-year graduate student in the Department of Physics and Astronomy at West Virginia University. He uses data analysis, machine learning, and high-performance computing to discover and study a new class of astronomical objects called Fast Radio Bursts. https://kshitijaggarwal.github.io/
https://www.linkedin.com/in/kshitijaggarwal13/

โ— FULL CURRICULUM
๐Ÿ“Œ Session 1:
Coding environment setup example and book presentation
A quick review of ML domains (supervised/unsupervised/RL)
General architecture/components of ML code
Implementation of a simple MLP model
http://d2l.ai/chapter_introduction/index.html

📌 Session 2:
CNN model (LeNet/ResNet) implementation
http://d2l.ai/chapter_convolutional-neural-networks/index.html

📌 Session 3:
RNN model (LSTM) implementation
http://d2l.ai/chapter_recurrent-neural-networks/index.html

📌 Session 4:
Attention mechanism (Transformer) implementation (Part I)
http://d2l.ai/chapter_attention-mechanisms/index.html

📌 Session 5:
Attention mechanism (Transformer) implementation (Part II)
http://d2l.ai/chapter_attention-mechanisms/index.html

📌 Session 6:
Generative adversarial networks (DCGAN) implementation
http://d2l.ai/chapter_generative-adversarial-networks/index.html

=========================
MLT (Machine Learning Tokyo)

site: https://machinelearningtokyo.com/
github: https://github.com/Machine-Learning-Tokyo
slack: https://machinelearningtokyo.slack.com/messages
discuss: https://discuss.mltokyo.ai/
twitter: https://twitter.com/__MLT__
meetup: https://www.meetup.com/Machine-Learning-Tokyo/
facebook: https://www.facebook.com/machinelearningtokyo