Homework 1

General assignment information

Tutorials

Coding

  1. Find a dataset.
    • It must have:
      • At least one numeric column
      • Between one thousand and one million rows
        • If it's larger than that, you can filter it down.
    • Don't spend too long on this step.
  2. If there's more than one numeric column, pick one.
  3. Create a new notebook.
  4. Using pandas:
    1. Read in the data.
    2. Compute:
      • The mean
      • The median
      • The mode
    3. Do a groupby() with an aggregation.
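As a rough illustration of step 4, here is a minimal sketch. The tiny inline dataset and the column names `city` and `temperature` are made up for the example and are not part of the assignment; your own data will come from a file and will have different columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for a real dataset. With a real file you would
# call pd.read_csv("your_file.csv") instead of reading from a string.
csv_text = """city,temperature
Austin,30.5
Austin,32.0
Boston,21.0
Boston,21.0
Denver,25.5
"""

# 1. Read in the data.
df = pd.read_csv(io.StringIO(csv_text))

# If a real dataset had more than a million rows, one way to cut it down
# would be a random sample, e.g.:
# df = df.sample(n=1_000_000, random_state=0)

# The numeric column chosen for the analysis (assumed name).
column = "temperature"

# 2. Compute the mean, median, and mode.
mean_value = df[column].mean()
median_value = df[column].median()
# mode() returns a Series because several values can tie for most common.
mode_values = df[column].mode()

# 3. A groupby() with an aggregation: the mean temperature per city.
mean_by_city = df.groupby("city")[column].mean()

print(mean_value, median_value, mode_values.tolist())
print(mean_by_city)
```

Note that `mode()` gives back a Series rather than a single number, so a column with ties will show more than one value.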

Now turn in the assignment.

Tutorials, continued

  1. Read The Joys (and Woes) of the Craft of Software Engineering
    • Note that not everything in it applies to data analysis.
  2. Filtering/indexing DataFrames
  3. Learn about functions
  4. Coding Style Guides - Please skim these; I don't expect you to understand and follow everything in them. The most important guidelines to pay attention to are indentation and keeping each statement on its own line.
  5. Guide to commenting your code
  6. Quartz Guide to Bad Data
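To give a feel for what the filtering, functions, and style tutorials cover, here is a small sketch. The DataFrame contents and the helper name `rows_above` are invented for illustration; the linked tutorials may approach these topics differently.

```python
import pandas as pd

# Made-up data for the example.
df = pd.DataFrame(
    {
        "city": ["Austin", "Boston", "Denver"],
        "temperature": [30.5, 21.0, 25.5],
    }
)


def rows_above(frame, column, threshold):
    """Return the rows of frame where column is greater than threshold."""
    # A boolean Series: True for each row that passes the test.
    is_above = frame[column] > threshold
    # Indexing with the boolean Series keeps only the True rows.
    return frame[is_above]


# Each statement sits on its own line, and the function body is indented
# consistently with four spaces.
warm = rows_above(df, "temperature", 25)
warm_cities = warm["city"].tolist()
print(warm_cities)
```

Wrapping the filter in a function means the same logic can be reused with a different column or threshold without copying code.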

Optional

Participation

Reminder about the between-class participation requirement.