THE FUTURE IS HERE

OpenAI's New GPT-3.5 Embedding Model for Semantic Search

In this video, we’ll learn how to use OpenAI’s new embedding model text-embedding-ada-002.

We will learn how to use the OpenAI Embedding API to generate language embeddings and then index those embeddings in the Pinecone vector database for fast and scalable vector search.

This is a powerful and common combination for building semantic search, question-answering, threat detection, and other applications that rely on NLP and search over a large corpus of text data.
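As a rough illustration of what happens at query time — this is a generic sketch of the idea, not Pinecone's internals — semantic search boils down to embedding the query with the same model used for the corpus, then ranking stored vectors by similarity (typically cosine):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-d "embeddings" standing in for the 1536-d vectors ada-002 returns.
index = {
    "doc-1": [0.9, 0.1, 0.0],
    "doc-2": [0.0, 1.0, 0.2],
    "doc-3": [0.85, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # imagine this came from embedding the user's question

# Rank stored documents by similarity to the query, best match first.
ranked = sorted(index, key=lambda k: cosine_similarity(query, index[k]), reverse=True)
print(ranked)  # doc-1 is closest to the query vector
```

Pinecone performs this ranking at scale over millions of vectors; in the video we only call its `upsert` and `query` operations rather than computing similarities ourselves.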

Everything will be implemented with OpenAI's new GPT-3.5-class embedding model, text-embedding-ada-002. It is their latest embedding model: 10x cheaper than earlier embedding models, more performant, and capable of encoding roughly 10 pages of text into a single vector embedding.

🌲 Pinecone docs:
https://docs.pinecone.io/docs/openai
Colab notebook:
https://github.com/pinecone-io/examples/blob/master/integrations/openai/semantic_search_openai.ipynb

🎙️ Support me on Patreon:
https://patreon.com/JamesBriggs

👾 Discord:
https://discord.gg/c5QtDB9RAP

🤖 AI Dev Studio:
https://aurelio.ai/

🎉 Subscribe for Article and Video Updates!
https://jamescalam.medium.com/subscribe
https://medium.com/@jamescalam/membership

00:30 Semantic search with OpenAI GPT architecture
03:43 Getting started with OpenAI embeddings in Python
04:12 Initializing connection to OpenAI API
05:49 Creating OpenAI embeddings with ada
07:24 Initializing the Pinecone vector index
09:04 Getting dataset from Hugging Face to embed and index
10:03 Populating vector index with embeddings
12:01 Semantic search querying
15:09 Deleting the environment
15:23 Final notes