Fundamentals

LendingClub

First, load the data from https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv and create a dataframe called "loans":

loans <- readr::read_csv("https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv")
grade_code(correct = "Good answer, well done!")

Then, show some summary statistics from the new loans dataframe:

loans <- readr::read_csv("https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv")

# Insert the code below
summary(loans)
grade_code(correct = "Good answer, well done!")

Rename the response loans$not.fully.paid to loans$default, for ease of use:

loans <- readr::read_csv("https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv")

# Insert the code below
loans$default <- loans$not.fully.paid
grade_code(correct = "Good answer, well done!")

Fit a logistic regression model to predict default from the FICO score alone and display the results table:

loans <- readr::read_csv("https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv")
loans$default <- loans$not.fully.paid

# Insert the code below
loans.glm1 <- glm(default ~ fico, family = binomial, data = loans)
summary(loans.glm1)
grade_code(correct = "Good answer, well done!")
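
The results table can also be read programmatically. The following is a minimal sketch on simulated data, not the LendingClub file, so that it runs without a network connection. The data frame `sim`, the seed, and the generating coefficients 7 and -0.012 are hypothetical choices made only for illustration.

```r
# Simulated stand-in for the loans data (hypothetical values, not real loans)
set.seed(1)
n <- 500
fico <- round(runif(n, min = 600, max = 850))
p_default <- plogis(7 - 0.012 * fico)       # assumed true relationship
sim <- data.frame(fico = fico,
                  default = rbinom(n, size = 1, prob = p_default))

# Same model form as in the exercise: default explained by fico alone
sim.glm1 <- glm(default ~ fico, family = binomial, data = sim)

# Coefficient table as a matrix: estimate, standard error, z value, p-value
coef_table <- coef(summary(sim.glm1))
print(coef_table)

# Exponentiated coefficients are on the odds scale
odds_ratios <- exp(coef(sim.glm1))
print(odds_ratios)
```

With the real data, replacing `sim` by `loans` should give the same table layout, with different estimates.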

Based on the previous estimates, let us make some assumptions to see how they would change the results. Let us assume that the FICO score coefficient is -0.011 and that we have the following assumption for the confidence interval:
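
As a rough illustration of what a coefficient of -0.011 means on the odds scale, the sketch below converts it to an odds ratio. The 50-point increase and the variable names are hypothetical choices, not part of the exercise.

```r
# Coefficient on fico, as assumed in the text
b_fico <- -0.011

# Odds ratio for a one-point increase in the FICO score
or_1 <- exp(b_fico)

# Odds ratio for a larger increase (50 points is an illustrative choice)
delta <- 50
or_delta <- exp(b_fico * delta)

# Percentage change in the odds of default for that increase
pct_change <- (or_delta - 1) * 100

print(c(or_1 = or_1, or_delta = or_delta, pct_change = pct_change))
```

Because the coefficient is negative, both odds ratios are below 1: a higher FICO score is associated with lower odds of default under this assumption.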

NFL

We have data from the National Football League on the distance of each field goal attempt (x, in yards) and whether or not the field goal was successful (y = 1 if successful, 0 otherwise). A logistic regression model was fit, and it was found that \(b_0 = 5.7\) and \(b_1 = 0.1\).
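
A minimal sketch of how these coefficients translate into success probabilities, using the logistic function \(p(x) = 1 / (1 + e^{-(b_0 + b_1 x)})\). The helper `success_prob` and the distances 20, 40 and 60 yards are hypothetical. The coefficients are used exactly as stated; since a positive slope makes success more likely at longer distances, the sketch also evaluates the sign-flipped slope, which is an assumption and not something the text states.

```r
# Coefficients as stated in the text
b0 <- 5.7
b1 <- 0.1

# Hypothetical helper: success probability for an attempt from x yards
success_prob <- function(x, b0, b1) {
  plogis(b0 + b1 * x)  # equivalent to 1 / (1 + exp(-(b0 + b1 * x)))
}

distances <- c(20, 40, 60)  # illustrative distances in yards

# Probabilities with the slope as stated
p_stated <- success_prob(distances, b0, b1)

# Probabilities with the slope sign flipped (assumption, for comparison)
p_negative <- success_prob(distances, b0, -b1)

print(data.frame(distance = distances,
                 p_stated = p_stated,
                 p_negative = p_negative))
```

With the sign-flipped slope, the probability of success falls as distance grows and equals one half at 57 yards, where \(b_0 + b_1 x = 0\).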