Highlights
1. Course Description
Natural language processing (NLP) seeks to provide computers with the ability to process and understand human language intelligently. Examples of NLP techniques include (i) automatically translating from one natural language to another, (ii) analyzing documents to answer related questions or make related predictions, and (iii) generating texts to help story writing or build conversational agents. This course, consisting of one fundamental part and one advanced part, will give an overview of modern NLP techniques.
1.1 Topics
This course will mainly focus on applying machine learning (particularly deep learning) techniques to natural language processing. The NLP topics covered by this course include:
- Text classification
- Word embeddings
- Language modeling
- Sequence-to-sequence models and machine translation
- Large language models and text generation
- NLP applications
For detailed information, please refer to the course schedule.
1.2 Prerequisites
- Proficiency in Python
This course requires some programming in both the homework assignments and the final project. The preferred programming language for this course is Python (with some additional packages such as SciPy, scikit-learn, and PyTorch).
- Calculus and Linear Algebra
Multivariable derivatives, matrix/vector notation and operations, singular value decomposition, etc.
- Probability and Statistics
Mean and variance, multinomial distribution, conditional dependence, maximum likelihood estimation, Bayes' theorem, etc.
- Foundations of Machine Learning
Logistic regression, cross-validation, optimization with gradient descent, bias-variance decomposition, etc.
2. Course Information
2.1 Instructor and TAs
- Instructor: Yangfeng Ji (Office hour: Wednesday 1:30 PM - 2:30 PM; Location: Rice 510)
- Semester: Fall 2024
- Location: Olsson Hall 005
- Time: TuTh 12:30 PM - 1:45 PM
- TAs:
  - Caroline Gihlstorf (Office hour: Monday 11 AM - 12 PM; Location: Rice 414)
  - Elizabeth Palmieri (Office hour: Tuesday 2 - 3 PM; Location: Rice 414)
  - Nibir Chandra Mandal (Office hour: Thursday 2 - 3 PM; Location: Rice 414)
2.2 Course Schedule
- Schedule
- Piazza for online discussion. Students enrolled in this class will be added to the course Piazza automatically.
- Homework submission template for homework assignments
- All course lectures will be recorded and uploaded to Canvas automatically. Please do not distribute the videos outside the class. Please refer to this page for more information about the recording policy.
3. Assignments and Final Project
- Homework (60%):
  - There will be four homework assignments, one for each main topic covered in this course.
  - Each homework assignment is worth 15%.
- Project (40%):
  There is only one course project, and its credit breaks down into four parts. Students should team up for this project; each group should have 2 - 3 students.
  - Project proposal: 10%
  - Mid-term report: 10%
  - Final project report: 10%
  - Final project presentation: 10%
In both the homework assignments and the final project, students may use machine learning libraries such as scikit-learn, PyTorch, and TensorFlow, but need to implement the rest of the proposed model themselves. Copying code from any resource (e.g., GitHub, Bitbucket, and GitLab) is prohibited and will be considered plagiarism.
3.1 Collaboration policy
For homework assignments:
- Students should be fully responsible for the answers in their own submissions.
- Students are allowed to discuss homework with their classmates. If you discuss homework with your classmates, please disclose their names in your submission. Directly copying answers from others is considered plagiarism.
- Students are encouraged to enjoy the advancement of NLP techniques and use generative AI tools (e.g., OpenAI ChatGPT and Microsoft Copilot) for their study. If you do use generative AI, an acknowledgment of the AI tools is mandatory. In addition, students are responsible for both the content and the correctness of their own submissions. Instructors reserve the right to request further clarification about submissions, and the grading will reflect students’ understanding of their own answers.
For the final project, replace the word “student(s)” with “group(s)”.
4. Additional Information
Last updated on Aug. 23, 2024