Learn Natural Language Processing: From Beginner to Expert
March 27, 2018
This 28-part course consists of tutorials, quizzes, hands-on assignments and real-world projects to learn natural language processing.
Natural language processing comprises a set of computational techniques for understanding natural languages such as English, Spanish, Chinese, etc.
The primary objectives of this course are as follows:
- Understand and implement NLP techniques for sentiment classification, information retrieval (search engines) and topic modeling.
- Understand and implement NLP techniques for uncovering text syntax and structure, that is, predicting part-of-speech tags, parse tree structure, named entities like people and places, etc.
- Understand and implement NLP techniques for some non-traditional topics such as language identification, spelling correction, and creating word clouds.
- Bonus: Understand and implement deep learning methods for NLP (also called Deep NLP), and apply them to text generation and language translation. These methods represent the state of the art for advanced tasks such as language translation, question answering, speech recognition, and music composition, and they power systems like the Google Assistant and Amazon Alexa.
Prerequisites: Python and Linear Algebra, Statistics and Probability (Review).
Related course: Machine Learning.
Enroll to add this course to the top of your Home Page. Get started with the first tutorial below.
Introduction to Natural Language Processing
This section has just one tutorial, which introduces NLP and discusses its applications, its challenges, and the various approaches to it.
Text Classification and TF-IDF
We first introduce TF-IDF (term frequency-inverse document frequency), a widely used measure in NLP for weighing the importance of different words. This helps us in search engine ranking (also called document retrieval), finding similar or related documents, and so on.
Then we move on to text classification (separating different types of documents into pre-defined categories), and use the learned concepts to implement sentiment classification from scratch. Lastly, we’ll cover topic modeling (organizing documents into groups when pre-defined categories are not available).
- TF-IDF: Vector representation of Text
- Quiz: TF-IDF (+ search engines, related articles recommender)
- Hands-on Project: Implementing a Search engine from scratch
- Text Classification (Topic Categorization, Spam filtering, etc.)
- Hands-on Assignment: Sentiment Classification with Naive Bayes
- Topic modeling with LDA
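To make the TF-IDF idea concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization, term frequency normalized by document length, and the unsmoothed weighting tf × log(N / df); the tutorials may use a different variant. The names `DOCS`, `tfidf_vectors` and `search` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter

# Toy corpus, made up for this illustration.
DOCS = ["the cat sat", "the dog sat", "the cat ran"]

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

def search(query, docs):
    """Rank document indices by the summed TF-IDF weight of the query terms."""
    vectors = tfidf_vectors(docs)
    terms = query.lower().split()
    scores = [sum(vec.get(term, 0.0) for term in terms) for vec in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

vectors = tfidf_vectors(DOCS)
print(vectors[0]["the"])       # 0.0: "the" occurs in every document
print(search("dog", DOCS)[0])  # 1: only the second document mentions "dog"
```

A word that appears in every document gets a weight of zero, while rarer words such as "dog" dominate the ranking, which is the behaviour a search engine wants.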
Understanding Text Syntax and Structures
In this section, we will explore methods for understanding the syntax and structure of text. This sort of processing is important in order to understand the meaning of sentences (in contrast to just predicting a category), and is used in applications like question answering and text summarization.
In the first tutorial, we’ll see how to identify the part of speech of each individual word, that is, whether it is a noun, pronoun, adjective, verb, etc. In the second, we’ll see how to separate a sentence into noun phrases, verb phrases, and so on. Then, we’ll see how to parse a sentence, which reveals not only the roles of individual words and phrases, but also the relations between them. Lastly, we’ll see how to detect named entities in text, like people, places, companies, and so on.
- Part of Speech tagging: Understanding Text Syntax and Structures, Part 1
- Chunking (Shallow Parsing): Understanding Text Syntax and Structures, Part 2
- Parsing: Understanding Text Syntax and Structures, Part 3
- Quiz: Text Syntax and Structures (Parsing) (+Question Answering)
- Introduction to Named Entity Recognition with Examples and Python Code for training Machine Learning model
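As a rough illustration of chunking, the sketch below groups words into noun phrases once their part-of-speech tags are known. It assumes the sentence has already been tagged with Penn Treebank style tags (the tags here are written by hand, not predicted), and it uses one simple pattern: an optional determiner, any number of adjectives, then one or more nouns. The function names are hypothetical, and real chunkers use richer grammars or learned models.

```python
import re

def tag_to_symbol(tag):
    """Collapse a Penn Treebank style tag into a one-letter class."""
    if tag == "DT":
        return "D"  # determiner
    if tag.startswith("JJ"):
        return "J"  # adjective
    if tag.startswith("NN"):
        return "N"  # noun
    return "O"      # anything else

def noun_phrases(tagged):
    """Return noun phrases matching: optional determiner, adjectives, nouns."""
    # One symbol per token, so regex match offsets are token indices.
    symbols = "".join(tag_to_symbol(tag) for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in re.finditer(r"D?J*N+", symbols)
    ]

# Hand-tagged example sentence (an assumption for this sketch).
sentence = [
    ("the", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumped", "VBD"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
print(noun_phrases(sentence))  # ['the quick brown fox', 'the lazy dog']
```

Encoding each tag as a single character keeps the pattern readable and lets an ordinary regular expression do the grouping.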
Other Topics in NLP: Word Clouds, Language Identification and Spell Checking
Word Clouds are a great way of visualizing a document or a set of documents. You can see a word cloud for NLP at the top of this article.
Language Identification is the problem of identifying whether a given text is written in English, Spanish, French, etc. The last tutorial in this section describes an algorithm to perform automatic spelling correction. We’ll implement both language identification and spelling correction in our hands-on assignments.
- Word Clouds: An Introduction with Code (in Python) and Examples
- Natural Language Identification: What it is, why it is important, and how it works
- Hands-on Assignment: Implementing Language Identification from Scratch
- Spell Checking and Correction
- Hands-on Assignment: Implementing Spell Check from Scratch
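One common way to approach spelling correction, which may or may not be the algorithm covered in the tutorial, is to generate every string within one edit of the misspelled word and pick the candidate that is most frequent in a reference corpus. The sketch below assumes a tiny made-up word-frequency table (`WORD_COUNTS`) and handles only single-edit errors; `edits1` and `correct` are hypothetical names.

```python
import string
from collections import Counter

# Toy frequency table; a real corrector would count words in a large corpus.
WORD_COUNTS = Counter(
    "the quick brown fox jumps over the lazy dog and "
    "the spelling of the language needs correction".split()
)

def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    """Return the most frequent known word within one edit, else the input."""
    word = word.lower()
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word
    return max(candidates, key=lambda w: WORD_COUNTS[w])

print(correct("speling"))  # spelling
print(correct("teh"))      # the
```

Words that are already known, or that have no known neighbour within one edit, are returned unchanged.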
List of Project Ideas
This is a list of about 10 project ideas (with links to datasets and suggested algorithms). It is recommended that you do at least one end-to-end project.
Bonus: Deep Learning approaches to NLP
Deep Learning is an advanced set of machine learning techniques that has achieved state-of-the-art results in language translation, question answering, speech recognition, and music composition, and that powers systems like the Google Assistant and Amazon Alexa.
In this section, we’ll first introduce core concepts in deep learning (2 tutorials + 1 quiz + 1 assignment), and then discuss deep learning methods for NLP (3 tutorials + 1 quiz + 1 project).
We didn’t want to make this course too long, so the tutorials below might feel like they’re going too fast. If you would like to learn deep learning and Deep NLP more thoroughly, see: Learn Deep Learning: From Beginner to Expert.
- A First Look at Neural Networks
- Quiz: Deep Learning and Neural Networks
- Computational Graphs and Backpropagation
- Hands-on Assignment: Training your first Neural Network
- Quiz: The Importance of “Good” Gradients
- Recurrent Neural Networks and Long Short-Term Memory Networks
- Deep Natural Language Processing
- Quiz: Recurrent Neural Networks
- Hands-on Assignment: Text Generation using Recurrent Neural Networks
- Sequence to Sequence Learning with Neural Networks