Natural Language Processing (Part 2): Data Cleaning & Text Pre-Processing in Python


This six-part video series goes through an end-to-end Natural Language Processing (NLP) project in Python to compare stand up comedy routines.

– Natural Language Processing (Part 1): Introduction to NLP & Data Science
– Natural Language Processing (Part 2): Data Cleaning & Text Pre-Processing in Python
– Natural Language Processing (Part 3): Exploratory Data Analysis & Word Clouds in Python
– Natural Language Processing (Part 4): Sentiment Analysis with TextBlob in Python
– Natural Language Processing (Part 5): Topic Modeling with Latent Dirichlet Allocation in Python
– Natural Language Processing (Part 6): Text Generation with Markov Chains in Python

All of the supporting Python code can be found here:


okwuazu ifeanyi says:

You are the best teacher I’ve seen on this platform. Thank you 🙏

Chrisogonas O. Mc'Odhiambo says:

Well illustrated! Thanks for putting this together, Alice.

Somansh Reddy says:

Well explained!

zoeyx ster says:

this is good! was literally falling asleep with the other youtube videos on this topic, then yours came on! excellent! thanks.

alexander garcia says:

you are amazing!

Kavindu Dananjaya says:

Wow you are THE best tutor for ML. Thank you so much

Afsheen Maroof says:

I encountered errors like "lower is not a function", etc.

Neroc ozzardand says:

I think I might know why you like Ali Wong's comedy. It's because you're racist.

hon luu says:

AWESOME and one of the best explanations out there! would love to see you continue with more videos

Reza Abdi says:

Thanks a lot for these clean and informative videos. I'm wondering if you have any suggestions for avoiding memory failures when building the Document-Term Matrix?
The problem shows up here, in this line of your code:
data_dtm = pd.DataFrame(data_cv.toarray(), columns=cv.get_feature_names())

Shivaansh Parmar says:

great work really!!!

Rutuja Konde says:

You're just awesome

Bravo Lulu says:

You are awesome

MdSdn says:

Hey, great videos you've got here, but I have a question. I need to semantically compare two sentences: for example, I receive a query and compare it against a sample of phrases, and I have to return the phrase that most resembles the query. Do you have any recommendations for me?

Abhishek Sharma says:

Great explanation. Please grow your channel and make more videos.

noe escalante says:

I could only follow the tutorial up to this part:

ModuleNotFoundError Traceback (most recent call last)
<ipython-input-17-747683bd6db8> in <module>
----> 1 from wordcloud import WordCloud
2 from wordcloud import WordCloud
3 wc = WordCloud(stopwords=stop_words, background_color="white", colormap="Dark2",
4 max_font_size=150, random_state=42)

ModuleNotFoundError: No module named 'wordcloud'

I don't know what to do 🙁 Help! I'm stuck.

Justin Huang says:

this was so clear…i wish i found this 2 months before i started my NLP project lol

Ahmed Qassim says:

This is amazing. Thanks a lot ⚘
