Natural Language Processing – Tokenization (NLP Zero to Hero – Part 1)



Welcome to Zero to Hero for Natural Language Processing using TensorFlow! If you’re not an expert on AI or ML, don’t worry — we’re taking the concepts of NLP and teaching them from first principles with our host Laurence Moroney (@lmoroney).

In this first lesson we’ll talk about how to represent words in a form that a computer can process, with a view to later training a neural network to understand their meaning.
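To make the idea of representing words as numbers concrete, here is a minimal pure-Python sketch. It is not the TensorFlow tokenizer used in the video. The helper names `tokenize`, `fit_word_index`, and `texts_to_sequences_sketch` and the two sample sentences are made up for illustration. It assumes a simple scheme: lowercase the text, strip punctuation, split on whitespace, and number the words from 1 in order of frequency.

```python
import string

# Translation table that deletes ASCII punctuation (an assumption of this
# sketch; a real tokenizer may use a different filter set).
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def tokenize(sentence):
    """Lowercase a sentence, drop punctuation, and split on whitespace."""
    return sentence.lower().translate(_STRIP_PUNCT).split()


def fit_word_index(sentences):
    """Build a word -> integer mapping, most frequent words first (hypothetical helper)."""
    counts = {}
    for sentence in sentences:
        for word in tokenize(sentence):
            counts[word] = counts.get(word, 0) + 1
    # sorted() is stable, so words with equal counts keep first-seen order.
    ordered = sorted(counts, key=lambda w: -counts[w])
    # Start numbering at 1, leaving 0 free (for example, for padding).
    return {word: i for i, word in enumerate(ordered, start=1)}


def texts_to_sequences_sketch(sentences, word_index):
    """Replace each known word with its integer; unknown words are skipped."""
    return [
        [word_index[w] for w in tokenize(s) if w in word_index]
        for s in sentences
    ]


sentences = ["I love my dog", "I love my cat!"]
word_index = fit_word_index(sentences)
sequences = texts_to_sequences_sketch(sentences, word_index)

print(word_index)
print(sequences)
```

Both sentences share the integers for "i", "love", and "my" and differ only in the last token. That shared structure is what a neural network can later learn from.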

Hands-on Colab →

NLP Zero to Hero playlist →
Subscribe to the TensorFlow channel →


Oumelkheir official says:

What an amazing and simple way of explaining, thank you

Sharawy Abdul says:

Thank you so much, this is very well explained.

Rx - Pert says:

Excellent! How do I leverage k-means clustering to find similarities or segment sentences from one another?

Shubham Mohan says:

great video

Yunis Huseynzade says:

Thank you so much. But I have a question: how can I use words in a language other than English? For example, building an NLP model in Azerbaijani.

Eyasu Lencha says:

Amazing presentation. Thanks, dear, for the info.

Gobal Krishnan V says:

Super, Laurence, you did super. TensorFlow is super.

Muskan Jain says:

@lmoroney I have come across chatbot deployments recently. It is said that chatbots have a problem with continued conversation. But my question is: why can't we add an LSTM on top of an LSTM model? I mean, suppose we could provide memory across sentences as well as memory within a particular sentence; then it might be able to store the essentials of the previous conversation. Please help me with this query. I am new to NLP and very excited to learn more.

Asad Anees says:

Thanks! Laurence Moroney, you are a blessing for us! Awesome information.

Orr Burgel says:

Wow, you unlocked the best teacher achievement. This was super easy to understand, like the cat in the box theory.

ishan ghutake says:

Suppose you have 30 text files in one folder; how do you tokenize the words?

ipek baris says:

Thank you for the video. Sometimes an exclamation mark can be informative for tasks such as sentiment classification, but the tokenizer filters it out. Is there a way to prevent this?

Rahul Bhardwaj says:

Great, thanks for the info!

Mohith Shivu says:

Sir, my name is Mohith. I am a final-year BE student. Can you help me with some doubts about NLP? I am working on data generalization and data sanitization. Our task is identifying whether a given text is sanitized or not, and generalized or not. How does this work in Python? Can you help me out, sir, please? It would be helpful to me.

Xuân Tùng Nguyễn says:

Hi. What happens if I set num_words to 0? I tried and it still prints all the words.

Ronnie Rendel says:

Amazingly well said

Aravind Ravindranatha says:

I need your advice on finding text similarity.

ramesh chandra srivastava says:

This code has an Apache license, so can it be reused?

Niggo Hazerdart says:

Great explanation, thanks a lot!!!

Babu Sivaprakasam says:

Best intro video. Glued to your presentation style

kelvin smith says:

Lol, you explained this so well that it made me want to implement my own library for tokenization


