Tuesday, February 23, 2010

Statistics, Day 5

On Monday, we looked at where the data come from for the statistical calculations. We discussed how surveys occur and how the people are selected from a larger group to participate in the survey. Students need to be able to identify ...
  1. The population: The larger group that the survey is trying to learn about.
  2. The sample: The smaller group that actually participates in the survey.
  3. The sampling method: One of the 5 methods that we discussed in class: random, stratified random, self-selected, systematic, and convenience samples.
  4. Whether or not the sample has bias: Does the sample leave out part of a group from the population? Is one group overrepresented or underrepresented within the sample?
  5. Whether or not the question has bias: Does the question project the expected or desired response of the interviewer?
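
As a rough illustration of how four of the sampling methods differ, here is a minimal Python sketch. The population of 100 numbered students in grades 9 through 12, the sample sizes, and all variable names are made up for this example and are not from the textbook. A self-selected sample is not shown because it depends on who volunteers.

```python
import random

rng = random.Random(2010)  # fixed seed so the sketch gives the same picks each run

# Hypothetical population: 100 students numbered 1-100, 25 in each grade 9-12.
population = [{"id": i, "grade": 9 + (i - 1) // 25} for i in range(1, 101)]

# Random sample: every student has the same chance of being chosen.
random_sample = rng.sample(population, 12)

# Systematic sample: choose a starting point, then take every 10th student.
start = rng.randrange(10)
systematic_sample = population[start::10]

# Stratified random sample: split the population into groups (here, grades),
# then choose randomly within each group.
stratified_sample = []
for grade in (9, 10, 11, 12):
    group = [s for s in population if s["grade"] == grade]
    stratified_sample.extend(rng.sample(group, 3))

# Convenience sample: whoever is easiest to reach, here the first 12 on the list.
convenience_sample = population[:12]

# Every student in this convenience sample is a 9th grader,
# so grades 10-12 are left out of it entirely.
print(sorted({s["grade"] for s in convenience_sample}))
```

In this made-up list the convenience sample contains only 9th graders, which is the kind of bias described in item 4.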

Homework: Page 361: #2 - 10 even; Page 362: #2 - 10 even


Statistics, Day 4

On Friday, we went further into the concept of mean absolute deviation. We looked at two sets of grades for students who have the same average or mean (75%) and the same range (from 60% to 90%). However, they were very different students. One had grades spread fairly evenly across the span from 60 to 90, while the other had grades that were either in the high 80s to 90 or in the low 60s. I used these sets to introduce the statistic that measures the deviation away from the mean, or a set of data's variability. To calculate the mean absolute deviation:
  1. Calculate the mean.
  2. Find the distance between each number in the set and the mean. (Distance is always positive, so use the absolute value of the difference.) These are the deviations.
  3. Find the average of all deviations by adding them together and dividing by the number of data values in the set. This is the mean absolute deviation (MAD).

The higher the value of the MAD, the less consistent the data are around the mean. In other words, the data have more variability away from the mean.
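
The three steps can be sketched in Python as follows. The two grade lists are invented for illustration (they are not the sets used in class), chosen so that both have a mean of 75 and a range from 60 to 90.

```python
def mean_absolute_deviation(data):
    """Return the MAD of a non-empty list of numbers."""
    # Step 1: calculate the mean.
    mean = sum(data) / len(data)
    # Step 2: find each value's distance from the mean (always positive).
    deviations = [abs(value - mean) for value in data]
    # Step 3: average the deviations.
    return sum(deviations) / len(deviations)

# Hypothetical grades: same mean (75) and same range (60 to 90).
balanced = [60, 70, 80, 90]    # spread evenly across the span
clustered = [60, 62, 88, 90]   # only low 60s and high 80s to 90

print(mean_absolute_deviation(balanced))   # 10.0
print(mean_absolute_deviation(clustered))  # 14.0
```

The second list has the larger MAD, so its grades are less consistent around the mean even though the mean and range match.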

Homework: Page 365; #9 - 14 and 17 c & d


Thursday, February 18, 2010

Statistics, Day 3

We spent much of class time today going over the Box & Whisker plot and its characteristics. We discussed where the box can appear with respect to the median. The box, which is defined by the lower and upper quartiles, can appear anywhere around the median, even with one end lined up with the median, for instance when the lower quartile and the median turn out to be the same number. But the box cannot occur completely below or completely above the median because the lower quartile must be less than or equal to the median and the upper quartile must be greater than or equal to the median.

We also talked about the significance of the range and IQR. While the mean, median, and mode show the central tendency of the data, the range and IQR show the spread of the data. We talk about the data being dispersed or spread out when the range or IQR are larger values. We consider the data to be condensed, dense, or consistent when the range or IQR are smaller values. For example, consider two students who both have 75 averages. One has grades spanning from 72 to 78, and the other has grades spanning from 60 to 90. While they have the same average, the differing ranges imply that the second student is capable of better grades but is not consistently working at their potential (high Bs). The first student is much more consistent with their grades and is probably a solid C student.
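
To make the two-student comparison concrete, here is a small Python sketch. The individual grades are assumed for illustration; only the averages (75) and the spans (72 to 78 and 60 to 90) come from the example above.

```python
def data_range(data):
    """Range = greatest value minus smallest value."""
    return max(data) - min(data)

# Hypothetical grades matching the description: both average 75.
first_student = [72, 74, 76, 78]
second_student = [60, 70, 80, 90]

for name, grades in (("first", first_student), ("second", second_student)):
    average = sum(grades) / len(grades)
    print(name, "average:", average, "range:", data_range(grades))
```

Both averages come out to 75, but the ranges are 6 and 30, which is the difference in consistency that the average alone hides.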

We started discussing the mean absolute deviation (MAD) today. We will continue to discuss it and work through calculations tomorrow.

Homework: Complete the bottom of the front page of the notes. The assignment listed at the top of the notes is due Monday, February 22nd.

Remember, project topics need to be approved by tomorrow.


Wednesday, February 17, 2010

Statistics, Day 2

Wednesday, February 17th



We discussed the concepts of the 5-Number Summary, Interquartile Range (IQR) and the Box-and-Whisker Plot. Students have made Box & Whisker plots in previous math courses. To make this graph, students must first find the 5-Number Summary. The 5-Number Summary is made up of the following 5 numbers from the data set: minimum, lower quartile, median, upper quartile, and maximum. The interquartile range is used to show how spread out or compacted the middle 50% of the data values are. To find these values, the students need to follow these steps:
  1. Order the data from least to greatest.
  2. Identify the median.
  3. Identify the median of the lower half of the data values. (Do not include the overall median in the lower half of the data values.) This is the lower quartile, also called the first quartile or simply Q1.
  4. Identify the median of the upper half of the data values. (Do not include the overall median in the upper half of the data values.) This is the upper quartile, also called the third quartile or simply Q3.
  5. Identify the smallest value (minimum) and the greatest value (maximum) of the data.
  6. The interquartile range is found by subtracting the first quartile from the third quartile (IQR = Q3 - Q1).
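
The steps above can be sketched in Python as follows. This sketch uses the method described here, where the overall median is left out of both halves when the number of data values is odd; calculators and software sometimes use other quartile rules and can give slightly different answers. The function name and the sample data are made up for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of 2 or more numbers."""
    ordered = sorted(data)                 # Step 1: order least to greatest
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # values below the middle position
    # With an odd count, skip the overall median itself in the upper half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical data set with 7 values.
values = [7, 3, 9, 12, 5, 15, 10]
minimum, q1, med, q3, maximum = five_number_summary(values)
iqr = q3 - q1                              # Step 6: IQR = Q3 - Q1

print(minimum, q1, med, q3, maximum)       # 3 5 9 12 15
print("IQR:", iqr)                         # IQR: 7
```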

To make a Box and Whisker Plot, follow these steps:

  1. Create a number line or x-axis that spans at least from the minimum to the maximum values.
  2. Use a short, vertical tic-mark to indicate the numbers in the 5-Number Summary in the space above the number line.
  3. Draw a segment connecting the first two tic-marks for a "whisker" connecting the minimum to the first quartile.
  4. Draw a segment connecting the last two tic-marks for a "whisker" connecting the third quartile to the maximum.
  5. Connect the second and fourth tic-marks across the top and across the bottom to create a "box" around the middle 50% of the data values.
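
As a rough picture of what these steps produce, the Python sketch below draws a one-line text version of the plot from a 5-Number Summary. It is only an approximation of the hand-drawn graph: the function name, the character width, and the sample summary are assumptions for this example, and it expects the maximum to be greater than the minimum.

```python
def text_box_plot(minimum, q1, median, q3, maximum, width=31):
    """Return a one-line text box-and-whisker plot (requires maximum > minimum)."""
    span = maximum - minimum

    def column(value):
        # Place a data value on the line, from column 0 to column width - 1.
        return round((value - minimum) * (width - 1) / span)

    line = [" "] * width
    # Whiskers: minimum to Q1, and Q3 to maximum.
    for col in range(column(minimum), column(q1)):
        line[col] = "-"
    for col in range(column(q3), column(maximum) + 1):
        line[col] = "-"
    # Box: covers the middle 50% of the data, from Q1 to Q3.
    for col in range(column(q1), column(q3) + 1):
        line[col] = "="
    # Tick marks for the five numbers in the summary.
    for value in (minimum, q1, median, q3, maximum):
        line[column(value)] = "|"
    return "".join(line)

# Hypothetical summary: min 60, Q1 65, median 75, Q3 85, max 90.
plot = text_box_plot(60, 65, 75, 85, 90)
print(plot)  # |----|=========|=========|----|
```

The short whiskers and long box in this made-up example show that the lowest and highest quarters of the data are packed more tightly than the middle half.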

The numbers in the 5-Number Summary show how the data are spread or compacted. Each successive number marks off another 25% of the data values. However, in some of these spans the distances between the actual data values are smaller (we talk about these spans being more dense), and in others the distances are larger (we talk about these spans being spread out).

Example:

Homework: Page 371, #1 - 9. Additionally, create a Box & Whisker Plot for #1 - 4 and #6 - 7.

Students received a project today. They need to select a measurable idea to compare between 2 populations (example, how many hours do boys vs. girls spend on Facebook?) and make a hypothesis about the expected results. They will have to collect at least 16 data values for each population (a minimum of 32 data values total). Then, they have to generate all of our statistics with each set of data: mean, mean absolute deviation (to be introduced tomorrow), and the 5-number summary. They need to make a Box & Whisker plot for both sets of data. They may make a histogram for extra credit. They then need to draw a conclusion based on the statistics. The final product can be a report, a poster, a PowerPoint, etc ... This project is due on Wednesday, February 24th.

There will be a test on Statistics (Unit 4 Part 2) on Friday, February 26th.





Statistics, Day 1

Tuesday, February 16th

We reviewed the concepts of central tendency: mean, median, and mode. The mean is the average of the values in a data set. The median is the middle number of the values in the data set when it is arranged in numerical order. The mode is the value that occurs the most often. These concepts were taught to the students in previous math courses. However, we now want them to be able to pick one (or more) of those values that best represents the data. Students should consider whether or not outliers move the mean further up or down away from the center. They should also consider if having a dense group of numbers positions the median further away from part of the data values.
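
To see how an outlier can pull the mean away from the center while the median and mode stay put, here is a short Python sketch using the standard `statistics` module. The grade values are invented for illustration.

```python
from statistics import mean, median, mode

# Hypothetical set of grades, then the same set with one very low outlier.
grades = [70, 72, 72, 75, 78]
with_outlier = grades + [10]

for label, data in (("original", grades), ("with outlier", with_outlier)):
    print(label, "mean:", round(mean(data), 1),
          "median:", median(data), "mode:", mode(data))
```

In this made-up set the median and mode remain 72 after the outlier is added, but the mean drops by more than 10 points, so the median or mode would represent these grades better than the mean.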

We also reviewed the concepts of a frequency table and a histogram. Both of these can have the data values grouped together or listed individually. They should have between 5 and 10 spans or individual data values. A frequency table lists the data values or spans of the data values in one column and then records the number of times those values occur in the data. A histogram shows the data values or spans of the data values on the x-axis and then has bars going up with a height equal to the frequency for that value. A histogram is a bar graph, but it is slightly more specific: it must have a continuous set of numbers on the x-axis. (A bar graph can have words recorded along the horizontal axis.)
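
A minimal Python sketch of a grouped frequency table follows, with a text histogram built from it. The grades, the choice of 5 spans of width 10, and the function name are assumptions for this example. The span labels assume whole-number data, and the bars run sideways here instead of going up from the x-axis.

```python
def frequency_table(data, low, width, spans):
    """Count how many whole-number values fall in each span of the given width."""
    counts = [0] * spans
    for value in data:
        index = (value - low) // width     # which span this value belongs to
        if 0 <= index < spans:             # ignore values outside the table
            counts[index] += 1
    # Each row: (first value in span, last value in span, frequency).
    return [(low + k * width, low + (k + 1) * width - 1, counts[k])
            for k in range(spans)]

# Hypothetical grades grouped into 5 spans: 50-59, 60-69, 70-79, 80-89, 90-99.
grades = [55, 62, 67, 71, 74, 75, 78, 83, 85, 88, 91, 95]
table = frequency_table(grades, low=50, width=10, spans=5)

for first, last, count in table:
    # One "#" per data value, so the bar length equals the frequency.
    print(f"{first}-{last} | {count} | {'#' * count}")
```

For these made-up grades the frequencies are 1, 2, 4, 3, and 2, and the spans cover the numbers from 50 to 99 with no gaps, which is what makes this a histogram and not just a bar graph.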

Homework: Histogram Worksheet and Page 365, #1 - 8, 15 b & c, 16, and 17 a & b.


Tuesday, February 2, 2010

Compound Probability Day 2

We started working our way through a packet today that will advance our understanding of compound probabilities. The problems in this packet use contingency tables and Venn diagrams to help us see the numbers where events overlap and where events are disjoint.

Homework: Complete Example 2, Example 3, Practice 1, and Practice 2.