Machine Learning

UVa CS 4774 (Spring 2023)

1. Introduction

The idea of this final project is to give students an opportunity to work on some “real-world” problems. For the final project, instead of providing pre-processed data, we will provide the raw datasets. Students can pick one of the datasets; picking a dataset is equivalent to picking one of the tasks listed below.

The theme of this project is “Machine Learning for Social Good”.

2. Project Options

Option 1: Detecting Bias in Social Media Posts

Option 2: Brain Tumor Classification (MRI)

Option 3: Detecting Malicious and Benign Websites

On the Kaggle website, each dataset comes with several pieces of demo code, which may not be exactly aligned with our project requirements. Students can use them as a reference, but please do so with caution. Directly copying from the example code will be considered plagiarism.

3. Project Group Signup

We will use Canvas to sign up groups. Currently, Canvas allows self sign-up, which means students can create their own groups.

You can also sign up using the Google form if you have trouble using Canvas.

4. Project Report Guideline

Please create an IPython notebook for this project, either on your local machine or on Google Colab. Your report will be that notebook file. Please keep all the code and required outputs in the notebook for grading.

Please use the following section titles to organize your report and implementation.

Section 1: Data Preprocessing

Although all three options are classification tasks, each option uses a different type of data. Therefore, each option has different feature engineering requirements.

Please keep the pre-processed data for the tasks in the following sections.
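One way to keep the pre-processed data available for the later sections is to write it to disk once and reload it when needed. The following is only a minimal sketch using Python's standard `pickle` module. The helper names `save_preprocessed` and `load_preprocessed`, the file name, and the toy values are all hypothetical and are not part of the project requirements.

```python
import pickle
import tempfile
from pathlib import Path


def save_preprocessed(features, labels, path):
    """Write features and labels to disk so later sections can reload them."""
    with open(path, "wb") as f:
        pickle.dump({"features": features, "labels": labels}, f)


def load_preprocessed(path):
    """Reload the features and labels written by save_preprocessed."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["features"], data["labels"]


# Toy stand-in for real pre-processed data (hypothetical values).
toy_features = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
toy_labels = [1, 0, 1]

# Hypothetical cache location; any writable path would do.
cache_path = Path(tempfile.gettempdir()) / "preprocessed_demo.pkl"
save_preprocessed(toy_features, toy_labels, cache_path)
features, labels = load_preprocessed(cache_path)
```

In a notebook, the save step would typically sit at the end of Section 1 and the load step at the start of each later section, so that the later sections do not depend on re-running the pre-processing cells.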

Option 1: Detecting Bias in Social Media Posts

TODO

Option 2: Brain Tumor Classification (MRI)

TODO

Option 3: Detecting Malicious and Benign Websites

TODO

Section 2: Data Splits

TODO

Comment:

Section 3: Build classifiers

TODO

Comments:

Section 4: Hyper-parameter tuning

For each classifier

TODO

Section 5: Analysis

Please answer the following two questions

TODO

5. Submission Guideline

For the final project report

For the final project presentation

6. Project Presentation

The project presentation should last 6 to 8 minutes. Each team should prepare a slide deck of about seven slides, including the title page.

The expected content from the presentation includes

Note that