This is a data set from Prosper, which is America’s first marketplace lending platform, with over $9 billion in funded loans. This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, and many others. This data dictionary explains the variables in the data set.
The distribution of our main variable of interest, BorrowerAPR, is multimodal, with many peaks. A credit rating of 'C' was the most common both before 2009 and after July 2009. Professionals, computer programmers, and executives top the list of occupations by loans taken. The $25,000 - $49,999 income range had the most loans, followed closely by the $50,000 - $74,999 range. The data also shows that the larger the loan, the lower the annual interest rate tends to be. BorrowerAPR was positively correlated with Estimated Effective Yield, and Original Loan Amount was positively correlated with the monthly loan payment. Prosper Rating (Alpha) was negatively correlated with BorrowerAPR. In conclusion, loans rated HR had a higher borrower annual interest rate than loans rated AA, which had the lowest.
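The correlations described above could be checked with a short script. The following is a minimal sketch, not the code used for this analysis: it computes a Pearson correlation coefficient using only the Python standard library. The helper names `pearson` and `column_correlation` are hypothetical, and the CSV path and column names passed to them are assumptions about how the data set is stored.

```python
import csv
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def column_correlation(path, col_a, col_b):
    """Correlate two numeric CSV columns, skipping rows where either is blank.

    The path and column names are supplied by the caller; nothing here
    assumes a particular file layout beyond a header row.
    """
    xs, ys = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            a, b = row.get(col_a), row.get(col_b)
            if a and b:
                xs.append(float(a))
                ys.append(float(b))
    return pearson(xs, ys)
```

Assuming the data were saved as `prosperLoanData.csv` with columns named `LoanOriginalAmount` and `MonthlyLoanPayment` (both names are assumptions), a call such as `column_correlation("prosperLoanData.csv", "LoanOriginalAmount", "MonthlyLoanPayment")` would return a value between -1 and 1. A value near 1 indicates a strong positive linear relationship and a value near -1 a strong negative one.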
For my presentation, I focused on variables that affect the borrower's interest rate, such as Income Range, Employment Status, Occupation, and Prosper Rating (Alpha).