THE FUTURE IS HERE

Prafulla Dhariwal (OpenAI) – Jukebox: A Generative Model for Music

Presentation recorded June 19, 2020

Abstract: Music is an extremely challenging domain for generative modeling: it's highly diverse, listeners are sensitive to small errors, and it has extremely long-range dependencies to learn when generated as raw audio. We show it's possible to generate music with singing directly in the raw audio domain. We tackle the long sequence lengths of raw audio by using a multi-scale VQ-VAE to compress it to discrete codes, and model those codes with autoregressive Transformers. We show that the combined model at scale can generate high-fidelity, diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable.
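The core compression idea the abstract mentions can be illustrated with a toy sketch. This is not the actual Jukebox code (which uses learned convolutional encoders and large codebooks); it is a minimal, hypothetical vector-quantization example showing how frames of raw audio are mapped to discrete codebook indices, which a Transformer could then model as a token sequence:

```python
# Hypothetical toy sketch of vector quantization (not the actual Jukebox
# implementation): raw-audio frames are replaced by the index of their
# nearest codebook vector, turning continuous audio into discrete tokens.
import math

def quantize(frames, codebook):
    """Map each frame to the index of its nearest codebook vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return [min(range(len(codebook)), key=lambda k: dist(f, codebook[k]))
            for f in frames]

def decode(codes, codebook):
    """Lossily reconstruct frames by looking codes back up in the codebook."""
    return [codebook[k] for k in codes]

# Tiny illustrative codebook and "audio" frames (values are made up).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (-0.8, -1.1)]
codes = quantize(frames, codebook)   # discrete token sequence
recon = decode(codes, codebook)      # lossy reconstruction
```

In Jukebox this quantization is applied at multiple temporal scales, so the coarsest code sequence is short enough for a Transformer to capture minutes-long structure.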

Bio: Prafulla Dhariwal is a research scientist at OpenAI leading work on generative models under the guidance of Ilya Sutskever. His work focuses on modeling high-dimensional data while preserving fidelity and diversity, with prominent works including Glow, a normalizing flow that generates high-resolution images with fast sampling; and the Variational Lossy Autoencoder, a way to understand and prevent latent collapse with autoregressive decoders in VAEs. In the past, he has also worked on reinforcement learning, including PPO, a popular on-policy RL algorithm; and GamePad, an environment that makes it easier to apply RL to formal theorem proving. He obtained his undergraduate degree from MIT in 2017 with a double major in Computer Science and Mathematics.