Note: this site is still up to serve as a reference for people who took the course in Spring of '20. If you're currently enrolled, please head to the site for the current semester.

Syllabus

Welcome to Data Programming II! In this course, you'll build on the coding and analysis skills learned in Data Programming I. In addition to writing code to get correct answers, you'll learn to think about how to write code that is efficient. Instead of just analyzing datasets we give you, you'll be collecting your own datasets. In the last third of the semester, we'll have a light introduction to machine learning, prioritizing hands-on projects over deeper theory.

Additions To Syllabus Made During Semester

New Reading (Feb 23)

DS100 Book (see below).

New Reading (Mar 25)

Scipy Lecture Notes (see below)

New Submission Rules (Feb 17)

Autograder Policy

The autograder will be run periodically during the days prior to a project deadline (exact schedule tbd). Because of this, we expect you to try submitting your project early, and make sure nothing crashes. However, this should not be a substitute for running test.py locally. You should only try submitting once you pass the tests locally. The responsibility of making sure your code does not crash is therefore placed on the student.

Resubmissions

Unexpected Crash: If your code crashed or received a significantly lower score than expected, even though you had submitted it (and it ran successfully) before the deadline, you will be able to resubmit until the following Friday at midnight without being penalized. For the resubmission not to be counted as late, you will have to coordinate with your TA.
Other resubmit: If after the deadline you are not satisfied with the grade the autograder gives you, you may resubmit afterwards but this will eat into your late days.

Note that TAs will start grading submissions the weekend after the deadline. Therefore, any projects that are not submitted by Friday at midnight might not get feedback.

Special Cases: If there were extenuating circumstances (e.g., documented illness) that prevented you from completing your work on time, and you have insufficient late days remaining, please speak with your lab TA. If you're struggling to keep up with the course, please set up an appointment with the instructor (Tyler) to discuss how you can get back on track.

Readings

We'll assign readings from four main sources this semester (all free). Stay on top of them!

Think Python 2nd Edition by Allen B. Downey: Read Online
Automate the Boring Stuff with Python by Al Sweigart: Read Online
Principles and Techniques of Data Science by Sam Lau, Joey Gonzalez, and Deb Nolan: Read Online
Scipy Lecture Notes by many contributors: Read Online

Getting Help

There are a few ways to get help:

Check for walk-in office hours (instructor and TA): go to "Resources" menu and click "Office Hours"
Find your TA's name and email here
Email me at tharter@wisc.edu to schedule an appointment outside of office hours

Unfortunately, we won't have peer mentors this first semester, because there's nobody who has previously taken the course that we could hire. If you like helping people and do reasonably well in CS 320, please consider applying (at the end of the semester) to be a peer mentor for this course in future semesters. It pays $13/hour.

Online Communication

Besides email (described above), there are four ways we'll communicate outside of class.

1. Piazza: You can ask questions (and see the other questions) here. Do not post code snippets that are >5 lines long; that's considered cheating.

2. Canvas: We'll make announcements on Canvas and periodically upload grades there (detailed feedback will only be on this site, however)

3. Class Forms: We have various forms for you to leave (optionally anonymous) feedback and report exam conflicts.

4. Code Review: You will upload projects using this tool. Via the same tool, TAs will leave comments on your code. Even projects scoring 100% often have a lot of room for improvement, so please take these seriously. When submitting, you can ask for specific kinds of feedback, based on what coding skills you're most interested in developing.

Lab

Each week, we'll post a lab document with exercises you can work on to solidify what we've covered in class. Sometimes we'll also introduce topics we didn't have time for in lecture. The emphasis will usually be on preparing for the project, so make sure you work through the weekly lab before asking us for help on the project.

Lab exercises are ungraded, and you're free to do them from home. However, we encourage you to do them during your scheduled lab time. This gives you an opportunity to meet other 320 students and to get help from the TAs and mentors running the lab. You can also use this time to ask questions about the projects or other course material if you like.

We encourage you to work on your own laptops, but the labs have some Windows desktops you can use if you prefer. In order to sign onto the lab machines, you'll need to activate your CS account.

Grading

Grading is based on six projects, each worth 10% of your grade, and two exams, each worth 20% of your grade.

At the end of the semester, we'll look at the distribution of scores and set a curve, with thresholds for each letter grade. When setting the curve, we'll guarantee the following:

at least 95% guarantees an A
at least 85% guarantees a B (or better)
at least 75% guarantees a C (or better)
at least 65% guarantees a D (or better)
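
The weights and guarantees above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, scores are assumed to be percentages from 0 to 100, and the real curve may be more generous than the guaranteed thresholds.

```python
# Weights (in percent) from the grading breakdown above.
PROJECT_WEIGHT = 10  # each of the six projects
EXAM_WEIGHT = 20     # each of the two exams

# Guaranteed thresholds; the actual curve may turn out lower than these.
GUARANTEES = [(95, "A"), (85, "B"), (75, "C"), (65, "D")]


def overall_percent(project_scores, exam_scores):
    """Combine six project scores and two exam scores (each 0-100)."""
    if len(project_scores) != 6 or len(exam_scores) != 2:
        raise ValueError("expected six project scores and two exam scores")
    weighted = PROJECT_WEIGHT * sum(project_scores) + EXAM_WEIGHT * sum(exam_scores)
    return weighted / 100


def guaranteed_grade(percent):
    """Return the letter grade guaranteed at this percent, or None if none is."""
    for threshold, letter in GUARANTEES:
        if percent >= threshold:
            return letter
    return None


if __name__ == "__main__":
    percent = overall_percent([100] * 6, [80, 90])
    print(percent, guaranteed_grade(percent))
```

For example, perfect project scores with exam scores of 80 and 90 give 60 + 34 = 94 overall, which guarantees at least a B.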

Exams

Exams are closed-book and closed-laptop. The exams will be multiple-choice scantron (use a #2 pencil). They are cumulative.

You will be allowed one 8.5-by-11 inch notesheet (may be printed or written on both sides), which you must turn in with your exam.

We'll schedule a McBurney exam with 1.5x time accommodations. Feel free to email us if you need other accommodations. More information about McBurney is here.

Exam Times:

Midterm: Wed Mar 11 @ 7:15-9:15 pm in Ingraham B10
Final: Sun May 3 @ 12:25-2:25 pm in TBD

Find your location here. Report any exam conflicts at least 10 days in advance here.

Projects

Late Policy: You will have 7 late days, which you can use across projects at your own discretion. You may use all your late days on the same project, if you like. Using late days on a project does not defer the deadline for subsequent projects, so be careful not to let work pile up. You may not use late days on the last project. Late days are automatically applied if a project is turned in late. After late days are exhausted, anything late will receive zero credit by default. Please talk to me if you're falling this far behind.
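
To make the bookkeeping concrete, here is a rough sketch of how late days could be tallied. It is not the actual grading script. It assumes lateness is counted in whole days, and that a late project you cannot fully cover with your remaining balance gets zero credit without using up any days; the function name is hypothetical.

```python
TOTAL_LATE_DAYS = 7  # from the late policy above


def apply_late_days(days_late_per_project):
    """Tally late days across projects, listed in due-date order.

    Returns (late days remaining, 1-based numbers of zero-credit projects).
    Assumption: a project that is later than the remaining balance allows
    receives zero credit and does not consume any late days.
    """
    remaining = TOTAL_LATE_DAYS
    zero_credit = []
    last_index = len(days_late_per_project) - 1
    for index, days_late in enumerate(days_late_per_project):
        if days_late == 0:
            continue  # on time, nothing to charge
        if index == last_index or days_late > remaining:
            # No late days on the last project, or not enough days left.
            zero_credit.append(index + 1)
        else:
            remaining -= days_late
    return remaining, zero_credit


if __name__ == "__main__":
    print(apply_late_days([0, 2, 0, 3, 0, 0]))
    print(apply_late_days([3, 3, 3, 0, 0, 1]))
```

In the second call, the first two projects use six of the seven days, so the third project (three days late) and the last project (late at all) would both receive zero credit by default.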

Partners: You may work alone, or with one project partner of your choosing. You can partner with different people for different projects if you like. You may also partner with people in a different section of CS 320. You and your partner should program in pairs (two people sitting in front of a screen at the same time). Take turns driving (i.e., writing code) and giving advice. The point of partners is to learn from your peers, not to do half the work. Some exam questions will be specifically written to find cases where one partner does the work and the other partner does not understand what is going on.

Submission: You will upload either a .py file or a .ipynb file (as specified) for each project with the submission tool. Only one partner should upload the project on behalf of both people. If you have any issues using the tool, make sure you're using the latest version of Chrome, because that is the only browser I support (and if that doesn't work, I'll help you troubleshoot).

Code Review: A TA will give you detailed comments on specific parts of your assignment. This feedback process is called a "code review", and is a common requirement in industry before a programmer is allowed to add her code changes to the main codebase. Read your code reviews carefully; even if you receive 100% on your work, we'll often give you tips to save effort in the future.

Project Grading: Grades will be based on automatic tests that we run. We'll share the tests with you before the due date, so you should rarely be surprised by your grade. Here are the cases where you might get a different final grade than what you see when you run the tests:

Configuration Issues: There are ways to write code that will only work on certain computers (e.g., Windows but not on a Mac). We'll talk about these cases in class and teach you to write code that should be able to run anywhere. If you make a mistake that makes the tests fail for us even though you passed them yourself, we'll let you resubmit a corrected version (in a timely fashion). You should also check with us before using any pip packages that we may not support.
Randomness: There are some bugs in code that don't cause problems every time, resulting in tests sometimes failing and sometimes passing. As in the "Configuration Issues" case, we'll work with you to let you fix the issue.
Cheating: obviously if you cheated (e.g., copied another student's code), the tests are irrelevant, and you have much bigger things to worry about than your grade on a particular assignment.
Faking: sometimes, we will specify that you solve a problem in a specific way (so that you learn a particular skill). If, upon inspection of your code, we see you solved the problem in a way that doesn't meet the project specification, you'll lose points on that project (even if our automatic tests already tentatively gave you a better score). Or, if your code is written specifically to defeat the tests, you'll lose points (see this Wikipedia article for an example of a defeat device in the real world). For example, suppose your program is supposed to take the length of a square's side as input and then output the area of the square, and we have a test that verifies your program outputs 100 when we input 10. If you submit a program that always outputs 100 (regardless of the input) because you know we only test with 10, you'll lose points.
Being a bad partner: if your partner complains that you didn't do any work (or that you did all the work, refusing to let your partner write any code), you may lose points (we'll meet with you first to hear your side, of course).
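
The square example from the "Faking" case can be written out as a short sketch. The function names here are invented for illustration, and the single check stands in for the one published test described above.

```python
def square_area(side):
    """Honest solution: actually computes the area from the input."""
    return side * side


def square_area_faked(side):
    """Faked solution: hard-codes the one answer the published test expects."""
    return 100


def passes_published_test(area_function):
    """Stand-in for the single shared test: an input of 10 should give 100."""
    return area_function(10) == 100


if __name__ == "__main__":
    # Both versions pass the one published test...
    print(passes_published_test(square_area), passes_published_test(square_area_faked))
    # ...but only the honest one is right for any other input.
    print(square_area(3), square_area_faked(3))
```

Both functions pass the shared test, which is exactly why we also inspect the code: the faked version is wrong for every side length other than 10.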

Project grading is results-oriented. That means it doesn't matter how much effort you put in; it only matters how well your code works. This means it is essential that your code runs. If we can't run your code for a project, you'll get a zero on that project, because the tests will fail. We'll never fix the code for you, and we'll never manually give a better grade for code that "looks" almost correct.

Extra Support/Accommodations

As a student, you may experience a range of issues that can cause barriers to learning. These might include strained relationships, anxiety, high levels of stress, alcohol/drug problems, feeling down, loss of a loved one, and/or loss of motivation. It is important to be aware that services exist on campus to support students who find themselves in these situations. There are several avenues for support, including University Health Services and the Dean of Students Office. You can learn more about free, confidential mental health services at UHS by calling 608-265-5600 Opt. 2 or visiting uhs.wisc.edu. Drop-in staff are available daily at the Dean of Students Office to support students and answer questions. To learn more about the Dean of Students Office, please call 608-263-5700 or visit doso.students.wisc.edu.

Cheating

Yeah, of course you shouldn't cheat, but what is cheating? The most common form of academic misconduct in these classes involves copying/sharing code for programming projects. Here's an overview of what you can and cannot do:

Acceptable

any collaboration with your project partner
talking about code (without looking at it or reading it aloud) with anybody
sharing pseudocode/diagrams with anybody
pointing friends to online resources (Piazza, Stackoverflow, etc.)
doing worksheets with friends
copying code from online examples that are NOT specific to your project (if project solutions are leaked online, you may not use them). If you copy code, you must cite it in your code with a comment (think of it like citing a quote in an essay; without the citation, you're plagiarizing).

NOT Acceptable

sitting near people you know during an exam
looking at somebody else's project code and typing it line-by-line
copying/pasting code to/from a non-partner (with or without citation)
emailing your code to another student
taking pictures of project solutions
accessing project code from previous semesters
copying code even if you change some things (e.g., variable names)

One action that students wonder about is whether it's acceptable to LOOK at another student's code. While this is not strictly forbidden, it's often dangerous. Many students find it difficult to look without ultimately copying. What's somewhat safer is when a student who has completed a problem looks at the code of a student still working on the problem. In this case, the more-advanced student can often provide useful advice and debugging tips that help the less-advanced student without causing the less-advanced student to copy code. Just make sure the more-advanced student isn't telling the less-advanced student exactly what to type, line by line. The more-advanced student also shouldn't be looking back at their own code while helping.

Similarity Detection: of course, with >100 students, it's hard for a human TA to notice similar code across two submissions. Thus, we use automated tools to look for similarities across submissions. Such similarity detection is an active area of computer science research, and the result is tools that detect code copying even when students methodically rename all variables and shuffle the order of their code. We take cheating detection seriously to make the course fair to students who put in the honest effort.
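
To see why renaming variables doesn't hide copying, consider this toy sketch. It is NOT the tool we use, and real detectors are far more sophisticated (this one does not handle reordered code, for instance). It uses Python's standard ast module to erase every variable, argument, and function name, then compares what is left; the class and function names are made up for this example.

```python
import ast


class Anonymizer(ast.NodeTransformer):
    """Replace every variable, argument, and function name with '_'."""

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_arg(self, node):
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)  # also anonymize names inside the function
        return node


def fingerprint(source):
    """Return a string describing the code's structure with names erased."""
    tree = Anonymizer().visit(ast.parse(source))
    return ast.dump(tree)


if __name__ == "__main__":
    original = "def area(side):\n    result = side * side\n    return result\n"
    renamed = "def f(x):\n    y = x * x\n    return y\n"
    different = "def f(x):\n    return x + x\n"
    print(fingerprint(original) == fingerprint(renamed))    # same structure
    print(fingerprint(original) == fingerprint(different))  # different structure
```

The first two snippets differ only in their names, so their fingerprints match; the third has a different structure, so it does not.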

Citing Code: you can copy small snippets of code from stackoverflow (and other online references) if you cite them. For example, suppose I need to write some code that gets the median number from a list of numbers. I might search for "how to get the median of a list in python" and find a solution at https://stackoverflow.com/questions/24101524/finding-median-of-list-in-python.

I could (legitimately) paste code from that page into my code, as long as it has a comment as follows:

    # copied from https://stackoverflow.com/questions/24101524/finding-median-of-list-in-python
    def median(lst):
      sortedLst = sorted(lst)
      lstLen = len(lst)
      index = (lstLen - 1) // 2

      if (lstLen % 2):
        return sortedLst[index]
      else:
        return (sortedLst[index] + sortedLst[index + 1])/2.0

In contrast, copying from a nearly complete project (that accomplishes what you're trying to do for your project) is not OK. When in doubt, ask us! The best way to stay out of trouble is to be completely transparent about what you're doing.

Recommendation Letters

Earning a recommendation letter is much harder than earning an A in this course. At a minimum, I'll want to see you doing something complex and interesting beyond the assignments. For a typical letter, I'll have collaborated with a student on some project for multiple months, with many iterations of feedback.

Most grad schools require recommenders to fill out long forms rating students on various abilities (see an example below). If you're asking me, make sure I would be able to fill out such a form without needing to put "I don't know" as my answer to many of the questions.