CS 6501-007 Natural Language Processing
- Instructor: Yangfeng Ji
- Semester: Spring 2021
- Location: Online
- Time: Monday and Wednesday 12:30 PM - 1:45 PM
- TA:
- Office Hours:
  - Yangfeng Ji: Tuesday 11 AM - 12 PM
  - Hannah Chen: Thursday 11 AM - 12 PM
  - Aidan San: Wednesday 4 PM - 5 PM
  - Zoom links for office hours are posted on Campuswire
- From the instructor: I lost track of email requests about class enrollment. If you are still interested in this course, please send me an email (yangfeng at virginia) with only “Course Enrollment Request” in the subject line and a short description of the related courses that you have taken so far.
1. Highlights
We will use Zoom for our online teaching. Students can find the Zoom links on Collab, under the Online Meetings tab. We will also record our lectures and upload them to Collab.
About final project
2. Course Description
Natural language processing (NLP) seeks to provide computers with the ability to process and understand human language intelligently. Examples of NLP techniques include (i) automatically translating from one natural language to another, (ii) analyzing documents to answer related questions or make related predictions, and (iii) generating texts to help story writing or build conversational agents. This course, consisting of one fundamental part and one advanced part, will give an overview of modern NLP techniques.
2.1 Topics
This course will mainly focus on applying machine learning (particularly deep learning) techniques to natural language processing. NLP topics covered by this course include:
- Text classification
- Language modeling
- Word embeddings
- Sequence labeling
- Machine translation and sequence-to-sequence models
- Some advanced topics: large-scale pre-trained language models (e.g., BERT), generative models, natural language generation, interpretability in NLP
For detailed information, please refer to the course schedule.
2.2 Prerequisites
- Proficiency in Python
This course requires some programming in both the homeworks and the final project. The preferred programming language for this course is Python (with some additional packages like SciPy, Sklearn, and PyTorch).
- Calculus and Linear Algebra
Multivariable derivatives, matrix/vector notations and operations; singular value decomposition, etc.
- Probability and Statistics
Mean and variance, multinomial distribution, conditional dependence, maximum likelihood estimation, Bayes' theorem, etc.
- Foundations of Machine Learning
Logistic regression, cross-validation, optimization with gradient descent, bias-variance decomposition, etc.
2.3 Textbooks
Supplemental materials
3. Assignments and Final Project
- Homework (64%):
- There will be four homeworks, each worth 16%.
- Students are allowed to discuss homework with their classmates. However, directly copying answers from others is considered plagiarism.
- Project (36%):
There is only one course project, and its credit breaks down into three parts:
- Project proposal: 12%
- Final project presentation: 8%
- Final project report: 16%
- Other than using machine learning libraries such as Sklearn, PyTorch, and TensorFlow, students need to implement the rest of the proposed model themselves. Copying code from any resource (e.g., GitHub, Bitbucket, and GitLab) is prohibited and will be considered plagiarism.
- Students should team up for this project; each group should have 3 to 4 students.
Last updated on Jan. 27, 2021