Friday, June 10, 2016

Using NLTK to classify text with predefined libraries

Install and import the libraries below
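The original import listing is not reproduced here, so the following is only a minimal sketch. It assumes the NLTK package for tokenizing and stemming and scikit-learn for the count vectorizer and Naive Bayes steps; the exact set of libraries the post used is an assumption.

```python
# Install once from a shell (assumed package names on PyPI):
#   pip install nltk scikit-learn

import csv  # standard library, used to read the training CSV

# NLTK pieces for the optional tokenize-and-stem step
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

# scikit-learn pieces for term counts, TF-IDF weighting and Naive Bayes
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
```

`wordpunct_tokenize` and `PorterStemmer` are chosen here because they work without downloading extra NLTK data.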

rohitgopidi
Read the training dataset from a CSV file. The data can also come from any other file format or source.
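A minimal sketch of this step, assuming a CSV with two hypothetical columns named `text` and `label`. An in-memory string stands in for the real file so the example runs on its own; with a real file you would pass `open("train.csv", newline="")` instead (the filename is also an assumption).

```python
import csv
import io

# Hypothetical sample data standing in for the real training CSV.
SAMPLE_CSV = """text,label
I love this phone,positive
Great camera and screen,positive
Worst battery ever,negative
Terrible screen and poor sound,negative
"""

def read_training_data(handle):
    """Return (texts, labels) read from a CSV handle with text/label columns."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        texts.append(row["text"])
        labels.append(row["label"])
    return texts, labels

texts, labels = read_training_data(io.StringIO(SAMPLE_CSV))
print(len(texts), "training rows, labels:", sorted(set(labels)))
```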
Once the training data is read, you can tokenize and stem it if you prefer. This step is optional, because tokenization can also be done in a later step while calculating TF-IDF.
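One way this optional step could look, as a sketch rather than the post's original code. It assumes NLTK's `wordpunct_tokenize` and `PorterStemmer`; the helper name `tokenize_and_stem` is hypothetical.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

stemmer = PorterStemmer()

def tokenize_and_stem(text):
    """Lowercase the text, split it into tokens, keep alphabetic ones, stem each."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(token) for token in tokens if token.isalpha()]

tokens = tokenize_and_stem("The phones were working nicely!")
print(tokens)
```

Stemming maps inflected forms such as "phones" and "working" to a shared stem, which shrinks the vocabulary the classifier has to learn.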
Append the tokenized content. This can be skipped if you did not tokenize in the previous step.
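A sketch of the append step, assuming the stemmed tokens of each row are joined back into one string and collected in a list so a vectorizer can consume them later. The sample rows and the name `processed_texts` are hypothetical.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

stemmer = PorterStemmer()

# Hypothetical training rows; in practice these come from the CSV.
texts = ["I love this phone", "Worst battery ever"]

processed_texts = []
for text in texts:
    stems = [
        stemmer.stem(token)
        for token in wordpunct_tokenize(text.lower())
        if token.isalpha()
    ]
    # Append one space-joined string of stems per training row.
    processed_texts.append(" ".join(stems))

print(processed_texts)
```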
Fit a count vectorizer to get the term counts, which indicate how important each term is in a document
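A minimal sketch of the vectorizing step, assuming scikit-learn's `CountVectorizer` followed by `TfidfTransformer` for the TF-IDF weighting mentioned earlier. The sample documents are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Hypothetical documents; in practice use the (optionally stemmed) training texts.
docs = [
    "love this phone",
    "great camera and screen",
    "worst battery ever",
]

# Raw term counts: one row per document, one column per vocabulary term.
count_vect = CountVectorizer()
X_counts = count_vect.fit_transform(docs)

# Reweight the counts so terms that appear in many documents count for less.
tfidf = TfidfTransformer()
X_tfidf = tfidf.fit_transform(X_counts)

print(X_counts.shape, sorted(count_vect.vocabulary_))
```

`CountVectorizer` tokenizes and lowercases on its own by default, which is why the earlier tokenizing step can be skipped.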
Use the Naive_Bayes library to train and predict
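A sketch of training and predicting, assuming "Naive_Bayes" refers to scikit-learn's `sklearn.naive_bayes` module and that the multinomial variant is used, which is a common choice for term counts. The training rows are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical labelled training data.
train_texts = [
    "love this phone",
    "great camera and screen",
    "worst battery ever",
    "terrible screen and poor sound",
]
train_labels = ["positive", "positive", "negative", "negative"]

# Turn the texts into a term-count matrix.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Train the classifier on the counts and their labels.
clf = MultinomialNB()
clf.fit(X_train, train_labels)

# Sanity check: predict the labels of the training rows themselves.
train_predictions = clf.predict(X_train)
print(list(train_predictions))
```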
Test your trained model by submitting a new sentence
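A sketch of trying the model on unseen text. It assumes scikit-learn and bundles the vectorizer and classifier with `make_pipeline`, so the new sentence goes through exactly the same vectorizing as the training data. The training rows and the helper `classify` are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labelled training data.
train_texts = [
    "love this phone",
    "great camera and screen",
    "worst battery ever",
    "terrible screen and poor sound",
]
train_labels = ["positive", "positive", "negative", "negative"]

# Vectorizer and classifier chained into one model.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

def classify(sentence):
    """Return the predicted label for a single new sentence."""
    return model.predict([sentence])[0]

print(classify("love the great camera"))
print(classify("worst battery"))
```

Words the model never saw during training, such as "the" here, are simply ignored by the vectorizer.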
