Sunday, June 26, 2016

Coursera R lab - Correlation and Regression Answers

Here are the answers to the R lab "Correlation and Regression" from the second week of Coursera's Basic Statistics online course. Copy the R code below, paste it into the exercise, and you should get a 100% result.

Scatterplots

# Plot height and weight of the "women" dataset. Make the title "Heights and Weights"

plot(women$weight, women$height, main = "Heights and Weights")

Making a Contingency Table

# Make a contingency table of tobacco consumption and education

table(smoking$tobacco, smoking$student)

Calculating Percentage From Your Contingency Table

# What percentage of high school students smoke 0-9g of tobacco?

38.6

# Of the students who smoke the most, what percentage are in university?

57.7
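The two percentages above are read off the contingency table in different directions: the first is a column percentage (within an education group), the second a row percentage (within a tobacco group). A minimal sketch of how that could be computed with prop.table() follows. The lab's `smoking` data is not reproduced here, so the data frame below is an invented stand-in with made-up category labels and counts; it will not give 38.6 or 57.7.

```r
# Hypothetical stand-in for the lab's `smoking` data; labels and counts are invented
smoking <- data.frame(
  tobacco = c(rep("0-9g", 3), rep("10-19g", 2), "20-29g",
              "0-9g", "10-19g", rep("20-29g", 2)),
  student = c(rep("high school", 6), rep("university", 4))
)

# Rows are tobacco groups, columns are education groups
counts <- table(smoking$tobacco, smoking$student)

# Column percentages: share of each education group in each tobacco group
col_pct <- prop.table(counts, margin = 2) * 100

# Row percentages: share of each tobacco group in each education group
row_pct <- prop.table(counts, margin = 1) * 100

# Percentage of high school students who smoke 0-9g (in the invented data)
col_pct["0-9g", "high school"]

# Of those who smoke the most, percentage in university (in the invented data)
row_pct["20-29g", "university"]
```

The margin argument decides which total each count is divided by, so it is worth checking which group the question conditions on before picking 1 or 2.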

Interpreting Your Scatterplot

This is the graph you created of heights and weights. Based on your graph, what can you say about the relationship between height and weight?

Possible Answers

It is linear and positive

Pearson's R I

Pearson's r is a measure of how strongly the variables are correlated with each other. Look at the graph on the right. Which of the following Pearson's r values are likely to belong to this graph?

1.0

Pearson's R II

Which of the following Pearson's r values are likely to belong to this graph?

0.418

Pearson's R III

Which of the following Pearson's r values are likely to belong to this graph?

-0.77

Pearson's R IV

Which of the following Pearson's r values are likely to belong to this graph?

-0.26

Calculating Correlation Using R

# Calculate the correlation between var1 and var2

cor(var1,var2)
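For readers who want to see what cor() computes, here is a small sketch that works out Pearson's r by hand and compares it with the built-in. The lab does not show the contents of var1 and var2, so this example borrows the money and prosocial values from the later regression exercise as an assumption.

```r
# Assumption: stand-in values borrowed from the later money/prosocial exercise
var1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
var2 <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)

# Deviations of each variable from its own mean
dev1 <- var1 - mean(var1)
dev2 <- var2 - mean(var2)

# Pearson's r: sum of cross-products scaled by the spread of both variables
r_manual <- sum(dev1 * dev2) / sqrt(sum(dev1^2) * sum(dev2^2))

# The built-in gives the same value
r_builtin <- cor(var1, var2)
all.equal(r_manual, r_builtin)
```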

Finding The Line

# actual values of y

y <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)

# predicted values of y according to line 1

y1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# sum of the errors of line 1

sum(y - y1)

# predicted values of y according to line 2

y2 <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

# sum of the errors of line 2

sum(y - y2)

 

# calculate the squared error of line 1

sum((y-y1)^2)

 

# calculate the squared error of line 2

sum((y-y2)^2)
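Why square the errors at all? A small sketch using the vectors from this exercise: the raw errors of line 1 cancel out to zero even though the line misses several points, so the plain sum says little about fit, while the squared error does. The helper name `squared_error` is made up for illustration and is not part of the lab.

```r
y  <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)
y1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y2 <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

# Hypothetical helper: sum of squared differences between actual and predicted
squared_error <- function(actual, predicted) sum((actual - predicted)^2)

# Raw errors: positive and negative misses cancel each other
raw_line1 <- sum(y - y1)
raw_line2 <- sum(y - y2)

# Squared errors: every miss counts, large misses count more
sq_line1 <- squared_error(y, y1)
sq_line2 <- squared_error(y, y2)

# The line with the smaller squared error fits the data better
best <- if (sq_line1 < sq_line2) "line 1" else "line 2"
best
```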

Interpreting The Line

# How prosocial would we predict someone to be when they receive 6 units of money?

6

# How prosocial was the person who received 6 units of money in our study?

10
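The gap between those two answers is the residual for that person. A minimal sketch, assuming the line in the exercise graph is line 1 from "Finding The Line" (prosocial equal to money); that assumption is consistent with the answers above but is not stated in the lab.

```r
money     <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
prosocial <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)

# Assumption: the plotted line predicts prosocial = money (line 1)
predicted <- money

# Position of the person who received 6 units of money
i <- which(money == 6)

predicted[i]   # what the line predicts
prosocial[i]   # what was actually observed

# Residual: observed value minus predicted value
residual <- prosocial[i] - predicted[i]
residual
```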

The Regression Equation

26.1

Describing The Line

The equation for the blue line in the graph is Y = 1 + 0.7818(X). What is the equation for the red line?

Y = 4 + 0.7818(x)

Finding The Regression Coefficients in R

# Our data

money <- c(1,2,3,4,5,6,7,8,9,10)

prosocial <- c(3, 2, 1, 4, 5, 10, 8, 7, 6,9)

# Find the regression coefficients

lm(prosocial~money)
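Printing the lm() result shows the intercept and slope, but it can be handy to pull them out as numbers. The sketch below uses coef() for that and cross-checks the values against the usual least-squares formulas; the variable names `fit`, `slope_manual` and `intercept_manual` are chosen here for illustration.

```r
money     <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
prosocial <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)

# Fit the regression and extract the coefficients by name
fit       <- lm(prosocial ~ money)
coefs     <- coef(fit)
intercept <- coefs[["(Intercept)"]]
slope     <- coefs[["money"]]

# Cross-check with the least-squares formulas
slope_manual     <- cov(money, prosocial) / var(money)
intercept_manual <- mean(prosocial) - slope_manual * mean(money)

c(intercept = intercept, slope = slope)
```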

Using lm() To Add A Regression Line To Your Plot

# Your plot

plot(money, prosocial, xlab = "Money", ylab = "Prosocial Behavior")

# Store your regression coefficients in a variable called "line"

line <- lm(prosocial ~ money)

# Use "line" to tell abline() to make a line on your graph

abline(line)

Adding A Line

# Your plot

plot(money, prosocial, xlab = "Money", ylab = "Prosocial Behavior")

# Your regression line

line <- lm(prosocial ~ money)

abline(line)

# Add a line that shows the mean of the dependent variable

abline(mean(prosocial), 0)

R Squared I

# Calculate the R squared of prosocial and money

cor(prosocial,money)^2

R Squared II

In addition to being the reduction in residual error from using the regression line over the mean line and, of course, the Pearson correlation coefficient squared, how else can we describe the R squared?

The variation in the dependent variable explained by the independent variable
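The descriptions of R squared in this question can be checked against each other numerically. A minimal sketch with the money and prosocial data from the earlier exercises (the variable names are chosen here for illustration):

```r
money     <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
prosocial <- c(3, 2, 1, 4, 5, 10, 8, 7, 6, 9)

fit <- lm(prosocial ~ money)

# Squared error around the regression line and around the mean line
sse_line <- sum(residuals(fit)^2)
sse_mean <- sum((prosocial - mean(prosocial))^2)

# 1. Proportional reduction in error from using the regression line
r2_reduction <- 1 - sse_line / sse_mean

# 2. Pearson correlation coefficient squared
r2_cor <- cor(prosocial, money)^2

# 3. Value reported by summary() for the fitted model
r2_lm <- summary(fit)$r.squared

c(r2_reduction, r2_cor, r2_lm)
```

All three numbers agree for a simple regression with one predictor, which is why the lab can use them interchangeably.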

Correlation and Causation

You measured how much money people have and their education level in a town. The graph on the right shows the results. We cannot say that more education causes more money; we say it is related to more money. Which of the following is not a reason why education is only related to money?

There could be a third unmeasured variable that influences only money

 

Putting It Together: Regression

# your data

money <- c(4, 3, 2, 2, 8, 1, 1, 2, 3, 4, 5, 6, 7, 9, 9, 8, 12)

education <- c(3, 4, 6, 9, 3, 3, 1, 2, 1, 4, 5, 7, 10, 8, 7, 6, 9)

 

# calculate the correlation between X and Y

cor(money, education)

 

# save regression coefficients as object "line"

line <- lm(money ~ education)

 

# print the regression coefficients

line

 

# plot Y and X

plot(education, money, main = "My Scatterplot")

 

# add the regression line

abline(line)

Putting It Together: Contingency Tables

# percentage of people with high money that are university educated

83.3

# percentage of people with low money that are high school educated

72.7

# what kind of education is linked to more money?

"university"
