THE FUTURE IS HERE

OpenAI's CLIP for Zero Shot Image Classification

State-of-the-art (SotA) computer vision (CV) models are characterized by a *restricted* understanding of the visual world specific to their training data [1].

These models can perform *very well* on specific tasks and datasets, but they do not generalize well: they cannot handle new classes or images outside the domain they were trained on.

Ideally, a CV model should learn the contents of images without excessive focus on the specific labels it is initially trained to understand.

Fortunately, OpenAI’s CLIP has proved itself as an incredibly flexible CV classification model that often requires *zero* retraining. In this chapter, we will explore CLIP in zero-shot image classification.
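The zero-shot idea can be sketched in a few lines with the Hugging Face `transformers` implementation of CLIP: we score an image against free-text candidate labels, with no retraining. This is a minimal sketch, not the article's exact code; the model checkpoint, example image URL, and label prompts below are illustrative choices.

```python
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative checkpoint choice; other CLIP variants work the same way.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Example image (a COCO validation photo) and candidate labels written
# as natural-language prompts -- the labels are arbitrary assumptions here.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

# Encode both modalities and compare them in CLIP's shared embedding space.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax turns
# them into a probability over the candidate labels.
probs = outputs.logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))
```

Because the labels are just text, swapping in a new set of classes requires no retraining at all.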

🌲 Pinecone article:
https://pinecone.io/learn/zero-shot-image-classification-clip/

🤖 70% Discount on the NLP With Transformers in Python course:
https://bit.ly/3DFvvY5

🎉 Subscribe for Article and Video Updates!
https://jamescalam.medium.com/subscribe
https://medium.com/@jamescalam/membership

👾 Discord:
https://discord.gg/c5QtDB9RAP