
Fundamentals

LendingClub

First, load the data from https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv and create a dataframe called "loans":

loans <- readr::read_csv("https://www.warin.ca/datalake/courses_data/qmibr/session7/loans.csv")

Then, show some summary statistics from the new loans dataframe:

# Summary statistics for each column of loans
summary(loans)

Rename the response loans$not.fully.paid to loans$default, for ease of use:

# Copy the response into a column with a shorter name (the original column is kept)
loans$default <- loans$not.fully.paid

Fit a logistic regression model to predict default based on FICO score alone and display the results table:

# Logistic regression of default on FICO score
loans.glm1 <- glm(default ~ fico, family = binomial, data = loans)
summary(loans.glm1)
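
For reference, with the binomial family and its default logit link, the model fitted above has the following form. Here \(p\) is the probability that default equals 1, and \(\beta_0\) and \(\beta_1\) are symbols introduced here for the intercept and the fico slope:

\[
\log\frac{p}{1-p} = \beta_0 + \beta_1\,\mathrm{fico},
\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1\,\mathrm{fico})}}
\]

The Estimate column of the results table reports \(\beta_0\) and \(\beta_1\) on this log-odds scale.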

Based on the previous estimates, let us make some assumptions to see how they would change the results. Assume that the FICO score coefficient is -0.011 and that we have the following assumption for the confidence interval:
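
To make the assumed slope concrete, a coefficient on the log-odds scale can be converted to an odds ratio by exponentiating it. The sketch below does this for the assumed value of -0.011. The variable names and the 10-point comparison are illustrative choices, not part of the tutorial.

```r
# Assumed FICO slope on the log-odds scale (value taken from the text above)
b_fico <- -0.011

# Multiplicative change in the odds of default for a 1-point FICO increase
odds_ratio_1pt <- exp(b_fico)

# Same for a 10-point increase (illustrative step size)
odds_ratio_10pt <- exp(10 * b_fico)

# Rounded for display: about 0.989 and 0.896
round(c(one_point = odds_ratio_1pt, ten_points = odds_ratio_10pt), 3)
```

Under this assumption, each additional FICO point lowers the odds of default by about 1.1%, and ten additional points lower them by about 10.4%.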

NFL

We have data from the National Football League on the distance of each field goal attempt (x, in yards) and whether or not the field goal was successful (y = 1 if successful, 0 otherwise). A logistic regression model was fit, giving \(b_0 = 5.7\) and \(b_1 = 0.1\).
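
Written out, a fitted logistic model of this kind gives the estimated success probability at distance \(x\) as follows. The symbol \(\hat{p}\) is notation introduced here; \(b_0\) and \(b_1\) are the estimates quoted above:

\[
\hat{p}(x) = \frac{e^{b_0 + b_1 x}}{1 + e^{b_0 + b_1 x}},
\qquad
\frac{\hat{p}(x)}{1 - \hat{p}(x)} = e^{b_0 + b_1 x}
\]

Each additional yard therefore multiplies the estimated odds of success by \(e^{b_1}\).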

Logistic regression 1/2