ghostyc
The data concern whether people default on a loan taken from a particular bank, at identical interest rates and for a fixed period. The information on each individual is their sex (male or female), their income (in pounds), whether the person is a home owner or not, their age (in years), and the amount of the loan (in pounds).
The information recorded is whether the individual defaulted on the loan or not. Study the data and try to understand the relation between a person's characteristics and defaulting. Specifically, what is your estimated probability that a female aged 42, who is not a home owner, has an income of 23,500, and took a loan of 12,000, defaults on the loan?
The table holding the data has the following headings:
m/f: male=1, female=0
age: age in years
home: home=1 is a home owner, home=0 is not a home owner
inc: income
loan: amount of loan
def: default=1, non-default=0.
The dataset is given in the file "tabl3.txt".
I know this has something to do with a binary response, and that we should probably use a GLM to model it. However, I am stuck. My problem is identifying which variable is the response in this case.
If I use
[tex]
\log \frac{p}{1-p} = \text{default} = \text{sex} + \text{age} + \text{income} + \text{home} + \text{loan}
[/tex]
with a logit link, it just does not make sense to me, because "default" takes the value 1 or 0, so the predicted value would then also be 1 or 0.
Any suggestions?
For those who know R, here is my code.
Code:
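# Read the data and name the columns (assumes the file has no header row)
Q3 <- read.table("tabl3.txt")
colnames(Q3) <- c("Sex", "Age", "Home", "Inc", "Loan", "Def")
# Treat the 0/1 indicator columns as factors
Q3$Sex <- as.factor(Q3$Sex)
Q3$Home <- as.factor(Q3$Home)
Q3$Def <- as.factor(Q3$Def)
summary(Q3)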
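For reference, here is a minimal sketch of the kind of fit I have in mind, in case it helps to point out where I am going wrong. It is only an illustration: the data frame below is simulated as a stand-in for "tabl3.txt" (the coefficients used to generate it are arbitrary, not estimates), and the names `fit`, `new_person` and `p_hat` are my own. It uses `glm` with a binomial family and then `predict` with `type = "response"` for the person described in the question.

```r
# Synthetic stand-in for "tabl3.txt": the real file is not reproduced here,
# so these values are simulated purely to make the sketch runnable.
set.seed(1)
n <- 200
Q3 <- data.frame(
  Sex  = factor(rbinom(n, 1, 0.5), levels = c(0, 1)),
  Age  = round(runif(n, 20, 65)),
  Home = factor(rbinom(n, 1, 0.5), levels = c(0, 1)),
  Inc  = round(runif(n, 10000, 60000)),
  Loan = round(runif(n, 1000, 30000))
)
# Simulated 0/1 outcome (arbitrary coefficients, not estimates from real data)
eta <- -1 + 0.0001 * Q3$Loan - 0.00005 * Q3$Inc
Q3$Def <- rbinom(n, 1, plogis(eta))

# Logistic regression with the 0/1 column Def as the response;
# the logit link is applied to p = P(Def = 1), not to Def itself.
fit <- glm(Def ~ Sex + Age + Home + Inc + Loan,
           family = binomial(link = "logit"), data = Q3)

# The individual from the question: female (0), aged 42,
# not a home owner (0), income 23,500, loan 12,000.
new_person <- data.frame(
  Sex  = factor(0, levels = c(0, 1)),
  Age  = 42,
  Home = factor(0, levels = c(0, 1)),
  Inc  = 23500,
  Loan = 12000
)

# type = "response" returns the estimated probability of default
p_hat <- predict(fit, newdata = new_person, type = "response")
print(p_hat)
```

With the real data, the simulated block would be replaced by the `read.table` lines above. The number printed here has no meaning beyond showing that the prediction comes out as a probability between 0 and 1 rather than as a 0 or a 1.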
Thanks!