The algorithm of choice, at least at a basic level, for text analysis is often the Naive Bayes classifier. Part of the reason for this is that text data is almost always massive in size. The Naive Bayes algorithm is so simple that it can be used at scale very easily with minimal process requirements. Playlist link: https://www.youtube.com/watch?v=FLZvOKSCkxY&list=PLQVvvaa0QuDf2JswnfiGkliBInZnIC4HL&index=1 sample code: http://pythonprogramming.net http://hkinsley.com https://twitter.com/sentdex http://sentdex.com http://seaofbtc.com