THE FUTURE IS HERE

Dive into Deep Learning: Coding Session #4 – Attention Mechanism I (Americas/EMEA)

📌 Session #4 – Attention mechanism (Transformer) implementation (Part I)
📌 Introduction, Coding

About:
The goal of this series is to provide code-focused sessions by reimplementing selected models from the interactive open-source book “Dive into Deep Learning” (http://d2l.ai/) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola. In this session, we will discuss the attention mechanism, covering parts of Chapter 10: Attention Mechanisms (https://d2l.ai/chapter_attention-mechanisms/).
We recommend that interested participants read this chapter beforehand to take full advantage of the session.
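
To give a flavor of the session, here is a minimal sketch of scaled dot-product attention, the building block at the heart of Chapter 10. It is written in PyTorch purely for illustration; the session itself may follow the book's own implementation.

import math
import torch

def scaled_dot_product_attention(queries, keys, values):
    # queries: (batch, n_queries, d); keys/values: (batch, n_kv, d)
    d = queries.shape[-1]
    # Score each query against every key, scaled by sqrt(d) so the
    # softmax does not saturate when d is large.
    scores = torch.bmm(queries, keys.transpose(1, 2)) / math.sqrt(d)
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    # Each output is an attention-weighted average of the values.
    return torch.bmm(weights, values)

# Toy check: 2 queries attending over 5 key-value pairs of dimension 8.
q, k, v = torch.randn(1, 2, 8), torch.randn(1, 5, 8), torch.randn(1, 5, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 2, 8])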

These sessions are meant for people who are interested in implementing models from scratch and/or have never implemented a model before. We hope to help participants either get started on their Machine Learning journey or deepen their knowledge if they already have prior experience.

We will try to achieve this by:

* Helping participants build their models end-to-end by reimplementing models from scratch, and discussing which modules/elements need to be included (e.g. data preprocessing, dataset generation, data transformation, etc.) to train an ML model (see the sketch after this list).

* Discussing and resolving coding questions participants might have during the sessions.
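
For reference, below is a hypothetical end-to-end training skeleton in PyTorch showing those pieces together; the dataset, model, and hyperparameters are illustrative placeholders, not the sessions' actual code.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Data preprocessing / dataset generation (toy regression data).
X = torch.randn(256, 4)
y = X @ torch.tensor([1.0, -2.0, 0.5, 3.0]).unsqueeze(1) + 0.1 * torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# 2. Model definition (a small MLP, as in Session 1).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))

# 3. Training loop: forward pass, loss, backward pass, parameter update.
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")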

📌 Session Leads: Devansh Agarwal and Kshitij Aggarwal

Devansh Agarwal is a Data Scientist at BMS. He holds a Ph.D. in Astronomy, during which he developed pipelines using high-performance computing and machine learning to aid the discovery of astronomical objects. https://www.linkedin.com/in/devanshkv/

Kshitij Aggarwal is a 4th-year graduate student in the Department of Physics and Astronomy at West Virginia University. He uses data analysis, machine learning, and high-performance computing to discover and study a new class of astronomical objects called Fast Radio Bursts. https://kshitijaggarwal.github.io/
https://www.linkedin.com/in/kshitijaggarwal13/

โ— FULL CURRICULUM
๐Ÿ“Œ Session 1:
Coding environment setup example and book presentation
A quick review of ML domains (supervised/unsupervised/RL)
General architecture/components of ML code
Implementation of a simple MLP model
http://d2l.ai/chapter_introduction/index.html

📌 Session 2:
CNN model (LeNet/ResNet) implementation
http://d2l.ai/chapter_convolutional-neural-networks/index.html

📌 Session 3:
RNN model (LSTM) implementation
http://d2l.ai/chapter_recurrent-neural-networks/index.html

📌 Session 4:
Attention mechanism (Transformer) implementation (Part I)
http://d2l.ai/chapter_attention-mechanisms/index.html

📌 Session 5:
Attention mechanism (Transformer) implementation (Part II)
http://d2l.ai/chapter_attention-mechanisms/index.html

📌 Session 6:
Generative adversarial networks (DCGAN) implementation
http://d2l.ai/chapter_generative-adversarial-networks/index.html

=========================
MLT (Machine Learning Tokyo)

site: https://machinelearningtokyo.com/
github: https://github.com/Machine-Learning-Tokyo
slack: https://machinelearningtokyo.slack.com/messages
discuss: https://discuss.mltokyo.ai/
twitter: https://twitter.com/__MLT__
meetup: https://www.meetup.com/Machine-Learning-Tokyo/
facebook: https://www.facebook.com/machinelearningtokyo