Build Hour: GPT-Realtime-2

Build with the next wave of realtime voice AI. In this Build Hour, you’ll learn how to use GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper to build low-latency voice agents that can translate live speech, reason across tools, operate apps, and support more natural voice-to-voice and voice-to-action experiences.

In this session, Teri Yu (Product) and Erika Kettleson (Solutions Engineering) will cover:
• Building with new realtime audio models for translation, streaming speech-to-text, and intelligent voice agents
• Using GPT-Realtime-2 capabilities like preambles, 128K context, parallel tool calling, domain understanding, context over turns, and controllable expressiveness
• Creating voice-powered workflows for shopping and product analytics dashboards
• Customer Spotlight on how Sierra (https://sierra.ai/) is designing production customer experience agents with guardrails, VAD tuning, tracing, redaction, evals, and customer-specific harnesses

👉 Realtime Voice Blog: https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/
👉 Voice Agents Docs: https://developers.openai.com/api/docs/guides/voice-agents
👉 Playground: https://platform.openai.com/audio/realtime
👉 Follow along with the code repo: http://github.com/openai/build-hours
👉 Sign up for upcoming live Build Hours: https://webinar.openai.com/buildhours

00:00 Welcome and intro
02:06 Realtime voice models overview
02:26 GPT-Realtime-Translate and GPT-Realtime-Whisper demo
04:36 GPT-Realtime-2: three ways to build with voice AI
05:14 What’s new in GPT-Realtime-2
06:58 Demo: Voice-powered search agent
12:32 Demo: Product analytics dashboard
17:24 What can you build with voice AI?
18:36 Customer spotlight: Sierra
29:56 Q&A
42:05 Resources & Upcoming Build Hours
