Sunday, June 26, 2016

Coursera Basic Statistics correlation and regression quiz 2 answers

Here are the solutions to the week 2 quiz on correlation and regression from the Coursera Basic Statistics online course.
1.  You want to visualise the results of a study. When assessing only one ordinal or nominal variable it is sufficient to use a (1) .... When looking at the relationship between two of these ordinal or nominal variables you'd better use a (2) .... When you're assessing the correlation between two continuous variables it's best to use a (3) ... Fill in the right words on the dots.
(1) Frequency table, (2) Contingency table, (3) Scatterplot

2.  Which statement(s) about correlations is/are right?
I. When dealing with a positive Pearson's r, the line goes up.
II. When the observations cluster around a straight line we're dealing with a linear relation between the variables.
III. The steeper the line, the smaller the correlation.

Statements I and II are true; statement III is false.

3. You've collected the following data about the amount of chocolate people eat and how happy these people are.
Amount of chocolate bars a week: 2, 4, 1.5, 2, 3.
Grades for happiness: 7, 3, 8, 8, 6.
(Note, the numbers are in the right order so person one eats 2 chocolate bars and scores her happiness with a 7.)
Compute the Pearson's r.

-0.96
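
If you want to check this by hand, here is a minimal Python sketch that computes Pearson's r for the five data points in the question. The function name `pearson_r` and the variable names are my own choices for illustration, not something from the course.

```python
import math

# Data from question 3: chocolate bars per week and happiness grades,
# listed in the same person order as in the question.
bars = [2, 4, 1.5, 2, 3]
happiness = [7, 3, 8, 8, 6]


def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products of deviations, divided by the
    square root of the product of the two sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(bars, happiness)
print(round(r, 2))  # -0.96
```

The sum of cross-products is -8, and the sums of squared deviations are 4 and 17.2, so r = -8 / sqrt(4 * 17.2), which rounds to -0.96.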

4. You've investigated how eating chocolate bars influences a student's grades. You've done this by asking people to keep track of their chocolate intake (in bars per week) and by assessing their exam results one day later. Which statement(s) about the regression line y-hat = 0.66x + 1.99 is/are true?
If you eat one more chocolate bar a week, your grade becomes 0.66 higher.
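
A quick way to see what the slope means is to plug two intakes that differ by one bar into the regression line. This is only a sketch; the function name `predicted_grade` and the intakes 3 and 4 are arbitrary choices of mine.

```python
def predicted_grade(bars_per_week):
    """Regression line from question 4: y-hat = 0.66x + 1.99."""
    return 0.66 * bars_per_week + 1.99


# Compare two hypothetical intakes that differ by exactly one bar.
difference = predicted_grade(4) - predicted_grade(3)
print(round(difference, 2))  # 0.66
```

Whatever pair of intakes you pick, the difference in predicted grade per extra bar is the slope, 0.66.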

5. A professor uses the following formula to grade a statistics exam:
y-hat = 0.5 + 0.53x. After obtaining the results the professor realizes that the grades are very low, so he might have been too strict. He decides to level up all results by one point. What will be the new grading equation?

y-hat = 1.5 + 0.53x
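
Raising every grade by one point only changes the intercept; the slope stays the same. A small sketch to confirm this, assuming some example values of x (the question does not say what x measures, so the values 0, 5 and 10 are made up):

```python
def old_grade(x):
    """Original grading formula: y-hat = 0.5 + 0.53x."""
    return 0.5 + 0.53 * x


def new_grade(x):
    """Formula after adding one point to every result."""
    return 1.5 + 0.53 * x


# Hypothetical values of x, chosen only for illustration.
for x in [0, 5, 10]:
    shift = new_grade(x) - old_grade(x)
    print(x, round(shift, 2))  # the shift is 1.0 for every x
```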

6. What is the explained variance? And how can you measure it?
The explained variance is the percentage of the variance in the dependent variable that can be explained using the formula of the regression line. You can measure this with r-squared.
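
To make this concrete, here is a sketch that reuses the chocolate data from question 3 (my choice; question 6 itself has no data). It fits the least-squares line and computes r-squared as one minus the residual sum of squares divided by the total sum of squares.

```python
# Data borrowed from question 3, purely for illustration.
bars = [2, 4, 1.5, 2, 3]
happiness = [7, 3, 8, 8, 6]

n = len(bars)
mean_x = sum(bars) / n
mean_y = sum(happiness) / n

# Least-squares slope and intercept for a simple linear regression.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bars, happiness))
sxx = sum((x - mean_x) ** 2 for x in bars)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variance left over after using the line versus total variance in y.
predicted = [slope * x + intercept for x in bars]
ss_residual = sum((y - p) ** 2 for y, p in zip(happiness, predicted))
ss_total = sum((y - mean_y) ** 2 for y in happiness)

r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 2))  # 0.93
```

So for that data the regression line explains about 93% of the variance in happiness, which matches the square of the Pearson's r of about -0.96.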

7. You want to know how much of the variance in your dependent variable Y is explained by your independent variable X. Determine for the following three cases how much variance is explained and arrange the cases in ascending order (from lower to higher explained variance).
(2) (1) (3)

8. A teacher asks his students to fill in a form about how many cigarettes they smoke every week and how much they weigh. After obtaining the results he makes a scatterplot and analyses the datapoints. He computes the Pearson's r to assess the correlation. He finds a correlation of .80. He concludes that smoking more cigarettes causes high body weight. What is wrong with this analysis?
He concludes that smoking causes high body weight. This is not possible on the basis of a correlation alone: a correlation shows that two variables are related, not that one causes the other.

9. What can you conclude about a Pearson's r that is bigger than 1?
This is impossible. Correlations are always between -1 and 1.

10. Why do you use squared residuals when computing the regression line?
Because positive and negative residuals cancel each other out (i.e. their sum equals zero), whereas squared residuals are never negative and so cannot cancel.
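
The sketch below shows why the plain sum of residuals is useless for choosing a line. It reuses the chocolate data from question 3 (my choice, for illustration) and compares the least-squares line with a flat line at the mean of y. Both have residuals that sum to zero, but only the squared residuals reveal which line fits better.

```python
# Data borrowed from question 3, purely for illustration.
bars = [2, 4, 1.5, 2, 3]
happiness = [7, 3, 8, 8, 6]


def residuals(slope, intercept):
    """Observed y minus predicted y for each data point."""
    return [y - (slope * x + intercept) for x, y in zip(bars, happiness)]


n = len(bars)
mean_x = sum(bars) / n
mean_y = sum(happiness) / n

# Least-squares line.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bars, happiness))
sxx = sum((x - mean_x) ** 2 for x in bars)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

fitted = residuals(slope, intercept)  # residuals of the least-squares line
flat = residuals(0.0, mean_y)         # residuals of a flat line at the mean

# The raw sums are both (numerically) zero, so they cannot tell the lines apart.
print(round(abs(sum(fitted)), 6), round(abs(sum(flat)), 6))

# The sums of squared residuals differ clearly.
ss_fitted = sum(e ** 2 for e in fitted)
ss_flat = sum(e ** 2 for e in flat)
print(round(ss_fitted, 2), round(ss_flat, 2))  # 1.2 17.2
```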

2 comments:

  1. The answer to the first question is wrong. The right answer is (1) Frequency table, (2) Contingency table, (3) Scatterplot.
