Statistics 216 Notes

    The idea is to have one nice big thing of notes that is sanely ordered. We’ll see how long this lasts.

    Activity 1

    Coin flipping activity

    Maybe check out Benford’s Law if you’re interested in learning more.

    why no numbers?

    The lecture covered part of section 1.1 in the book:

    • What is statistics? The study of data.

    • What is data? Data is tables.

    • What are tables? Tables are rows and columns.

    • What are rows? Rows are units, the subjects of inquiry.

    • What are columns? Columns are variables, the characteristics of the units.

    In class we came up with two little examples for data:

    student   over 21   hair color   male/female
    Bob       no        brown        male
    Sarah     no        black        female
    Jean      yes       brown        female

    student   net worth    height   distance home
    Kristin   $2,000       5’5”     200 mi
    Jordan    $1,000,000   6’0”     15 mi
    Brad      $12          5’7”     2,012 mi

    Note the difference between the types of variables in the first table (the one on the left in class) and those in the second table (the one on the right). The variables in the second table are numbers. We can add them or multiply them or divide or whatever. They are called quantitative variables. The variables in the first table aren’t really numbers. We can’t add them up or multiply them. We call these types of variables categorical variables. One important point: note that the “over 21” variable is categorical. Some folks mistakenly consider this a quantitative variable because it involves a number, but its values are just “yes” and “no”. It is, in fact, a categorical variable.
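    To make the rows/columns idea and the two variable types concrete, here is a minimal Python sketch of the two class tables. Each row is a unit (a student) and each key is a variable. The names `first_table`, `second_table`, and `variable_type` are made up for this illustration, heights are converted to inches, and the rule "stored as a number means quantitative" is a simplification that only works because of how the values were typed in here.

```python
# Each dict is one row (a unit); each key is one column (a variable).
first_table = {
    "Bob":   {"over 21": "no",  "hair color": "brown", "male/female": "male"},
    "Sarah": {"over 21": "no",  "hair color": "black", "male/female": "female"},
    "Jean":  {"over 21": "yes", "hair color": "brown", "male/female": "female"},
}

# Assumption: dollars, inches (5'5" = 65, 6'0" = 72, 5'7" = 67), and miles.
second_table = {
    "Kristin": {"net worth": 2000,    "height": 65, "distance home": 200},
    "Jordan":  {"net worth": 1000000, "height": 72, "distance home": 15},
    "Brad":    {"net worth": 12,      "height": 67, "distance home": 2012},
}


def variable_type(table, variable):
    """Call a variable quantitative if every unit's value is stored as a number."""
    values = [row[variable] for row in table.values()]
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "categorical"


for table in (first_table, second_table):
    first_row = next(iter(table.values()))
    for variable in first_row:
        print(variable, "->", variable_type(table, variable))

# Arithmetic makes sense for a quantitative variable...
heights = [row["height"] for row in second_table.values()]
mean_height = sum(heights) / len(heights)
print("mean height (inches):", mean_height)

# ...but for a categorical variable the natural summary is a count per category.
over_21_counts = {}
for row in first_table.values():
    over_21_counts[row["over 21"]] = over_21_counts.get(row["over 21"], 0) + 1
print("over 21 counts:", over_21_counts)
```

    Notice that “over 21” comes out categorical: even though the variable mentions a number, its values are “yes” and “no”, and all we can do with them is count.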

    We talked a little about how we could expect some of the variables to be distributed. I said that I figured that a height variable would look something like this:

    And then I said maybe the “distance home” variable would look a little different. Maybe most people would live close by and fewer would live farther away, but no one could live a negative distance from campus. Maybe the distribution would look like this:
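    One way to get a feel for the two shapes described above is to simulate them. The sketch below is only an illustration under made-up assumptions, not class data: heights are drawn from a bell-shaped (normal) distribution centered at 67 inches with a spread of 3, and distances are drawn from an exponential distribution with a mean of 100 miles, which piles up near zero, trails off to the right, and is never negative. The helper `text_histogram` is a hypothetical name written just for this example.

```python
import random


def text_histogram(values, bin_width, start=0.0):
    """Return one line per bin, with one '*' for each value that falls in the bin."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        counts[index] = counts.get(index, 0) + 1
    lines = []
    for index in range(min(counts), max(counts) + 1):
        left_edge = start + index * bin_width
        lines.append(f"{left_edge:7.0f} | " + "*" * counts.get(index, 0))
    return lines


rng = random.Random(216)  # fixed seed so the picture is repeatable

# Assumption: heights in inches, roughly bell-shaped around 67.
heights = [rng.gauss(67, 3) for _ in range(200)]

# Assumption: distances in miles, mostly small, a few very large, never negative.
distances = [rng.expovariate(1 / 100) for _ in range(200)]

print("height (inches)")
print("\n".join(text_histogram(heights, bin_width=2, start=55)))
print()
print("distance home (miles)")
print("\n".join(text_histogram(distances, bin_width=50)))
```

    Turn your head sideways when reading the output: the height bars should bulge in the middle, while the distance bars should be longest at the top (close to campus) and thin out as the distance grows.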