{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CME193 Assignment 2\n", "\n", "### Due Sunday 17th Feb 5PM\n", "\n", "In this assignment you will implement some machine learning algorithms on the [Congressional Voting Records Dataset](https://archive.ics.uci.edu/ml/datasets/congressional+voting+records). The goal of the assignment is to write a Python script that reads in the dataset from the internet, processes it, builds a few models, and outputs some graphs.\n", "\n", "You can use this notebook to write code and check that it works, but once you are sure that everything works you will put all your code in a script that can be called from the command line. It is always a good habit to convert the code you write in notebooks into clean scripts so that it can be reused with relative ease later on.\n", "\n", "Note: Most programming courses provide starter code to help students complete the assignments. This is done so that students do not waste time writing boilerplate code, and also to help graders by standardising the code they have to read. Unfortunately, this leaves many students able only to fill in code, without the confidence to create a project from scratch. For this reason, only minimal starter code is provided in this assignment and you are required to submit a script.\n", "\n", "Make sure you refer to the lecture notebooks in case you forget how to do any of the operations mentioned below." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset\n", "\n", "The dataset we will be working with in this assignment is the [Congressional Voting Records Dataset](https://archive.ics.uci.edu/ml/datasets/congressional+voting+records) for 1984. Open the link and read the description of the dataset, and make sure you understand what the columns and rows represent.\n", "\n", "The following code downloads the dataset into a pandas DataFrame." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | party | \n", "Vote_0 | \n", "Vote_1 | \n", "Vote_2 | \n", "Vote_3 | \n", "Vote_4 | \n", "Vote_5 | \n", "Vote_6 | \n", "Vote_7 | \n", "Vote_8 | \n", "Vote_9 | \n", "Vote_10 | \n", "Vote_11 | \n", "Vote_12 | \n", "Vote_13 | \n", "Vote_14 | \n", "Vote_15 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "republican | \n", "n | \n", "y | \n", "n | \n", "y | \n", "y | \n", "y | \n", "n | \n", "n | \n", "n | \n", "y | \n", "? | \n", "y | \n", "y | \n", "y | \n", "n | \n", "y | \n", "
1 | \n", "republican | \n", "n | \n", "y | \n", "n | \n", "y | \n", "y | \n", "y | \n", "n | \n", "n | \n", "n | \n", "n | \n", "n | \n", "y | \n", "y | \n", "y | \n", "n | \n", "? | \n", "
2 | \n", "democrat | \n", "? | \n", "y | \n", "y | \n", "? | \n", "y | \n", "y | \n", "n | \n", "n | \n", "n | \n", "n | \n", "y | \n", "n | \n", "y | \n", "y | \n", "n | \n", "n | \n", "
3 | \n", "democrat | \n", "n | \n", "y | \n", "y | \n", "n | \n", "? | \n", "y | \n", "n | \n", "n | \n", "n | \n", "n | \n", "y | \n", "n | \n", "y | \n", "n | \n", "n | \n", "y | \n", "
4 | \n", "democrat | \n", "y | \n", "y | \n", "y | \n", "n | \n", "y | \n", "y | \n", "n | \n", "n | \n", "n | \n", "n | \n", "y | \n", "? | \n", "y | \n", "y | \n", "y | \n", "y | \n", "