   The purpose of this project was to analyze data from the Home Mortgage Disclosure Act and determine whether there is a bias against a particular community.

Initial project.jpg

What did the data look like?

  The initial data set consisted of about 20 columns and a little over 12 thousand rows.

Raw Data 2.jpg
Raw Data 1.jpg

   The first step in getting a better understanding of the data was to create a new column to group our mortgage applicants by income class. This was done using the "IF" function.

IF function to add something to the data.jpg
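
For readers who prefer code to spreadsheets, the same grouping can be sketched in Python. This is only an illustration of the nested "IF" idea: the function name `income_class` and the cut-offs of 50 and 100 (thousand) are assumptions made for the example, not the thresholds used in the workbook.

```python
def income_class(income_thousands, low_max=50, middle_max=100):
    """Mimic a nested Excel IF: return an income class label.

    low_max and middle_max are hypothetical cut-offs (in thousands),
    chosen only for illustration.
    """
    if income_thousands <= low_max:
        return "Low"
    if income_thousands <= middle_max:
        return "Middle"
    return "High"


# Made-up yearly incomes, in thousands
incomes = [24, 50, 76, 130]
classes = [income_class(value) for value in incomes]
print(classes)
```

Each applicant gets exactly one label, which is what makes the later comparisons by income level possible.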

Then, I used VLOOKUP to quickly summarize the action taken on each application.

Vlookup Function.jpg
Action Taken.jpg
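
A lookup like this amounts to mapping a numeric code to a readable label and then counting the labels. Here is a minimal Python sketch of that idea. The code-to-label table reflects my reading of the HMDA "action taken" codes and should be checked against the official data dictionary; the sample codes and the `describe_action` helper are made up for the example.

```python
from collections import Counter

# Code-to-label table (verify against the official HMDA data dictionary)
ACTION_TAKEN = {
    1: "Loan originated",
    2: "Application approved but not accepted",
    3: "Application denied by financial institution",
    4: "Application withdrawn by applicant",
    5: "File closed for incompleteness",
    6: "Loan purchased by the institution",
}


def describe_action(code):
    """Return the label for a code, like an exact-match VLOOKUP."""
    return ACTION_TAKEN.get(code, "Unknown code")


# Made-up action codes for four applications
codes = [1, 3, 1, 4]
labels = [describe_action(code) for code in codes]
summary = Counter(labels)
print(summary)
```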

Let's analyze our data

Histogram.jpg

Here, I started with a histogram that gives a better understanding of our applicants' income distribution.

We notice that most of our applicants fall between 26 and 76 thousand in yearly income. This range corresponds to low- to middle-income households.

Let's dig deeper, shall we?

Stacked bar chart.jpg

This stacked bar chart shows an interesting fact about our data. On one hand, the Black or African American community seems to have the lowest approval rate. On the other hand, the White community seems to have the highest approval rate.

Could there be a bias?

Pivot Income Level.jpg

This pivot table shows a comparison between the two races for each level of income. Across the board, for the same levels of income, White applicants seem to get approved at a higher rate.
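
The comparison behind a pivot table like this one is an approval rate computed per (race, income level) cell. The sketch below shows that computation in plain Python. The rows are invented for illustration and the `approval_rates` helper is hypothetical; neither comes from the HMDA data set.

```python
from collections import defaultdict


def approval_rates(rows):
    """Return {(race, income_class): approval rate} from (race, class, approved) rows."""
    counts = defaultdict(lambda: [0, 0])  # key -> [approved count, total count]
    for race, income_class, approved in rows:
        cell = counts[(race, income_class)]
        cell[0] += int(approved)
        cell[1] += 1
    return {key: approved / total for key, (approved, total) in counts.items()}


# Invented rows, NOT real HMDA records
rows = [
    ("White", "Middle", True),
    ("White", "Middle", True),
    ("White", "Middle", False),
    ("Black or African American", "Middle", True),
    ("Black or African American", "Middle", False),
]
rates = approval_rates(rows)
for key, rate in sorted(rates.items()):
    print(key, round(rate, 2))
```

Holding the income class fixed inside each key is what lets us compare the two groups at the same income level.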

Income may not be a factor!

Then what is?

Hypothesis Testing

The null hypothesis

The approval rate for middle-income African-Americans is equal to the approval rate for middle-income white non-Hispanic Americans.

The alternative hypothesis

There is a statistically significant difference between the approval rates of these two populations in our data. The approval rate for middle-income white non-Hispanic Americans is higher than that of their African-American counterparts.

Conclusion

T-testing.jpg

Based on the test, there is evidence at the 95% confidence level that African Americans are disproportionately denied loans compared to white non-Hispanics. Our p-value is lower than the standard significance level of 0.05, so we reject the null hypothesis: the approval rate for middle-income white non-Hispanic applicants is significantly higher than that of their African-American counterparts.
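
To make the decision rule concrete, here is a small Python sketch of a one-sided test on two approval rates. Note that it uses a two-proportion z-test with a normal approximation, which is a stand-in for the t-test run in the spreadsheet, not a reproduction of it. The counts are hypothetical and the function name `two_proportion_z_test` is my own.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(approved_a, total_a, approved_b, total_b):
    """One-sided test of H1: rate_a > rate_b, using a pooled standard error."""
    rate_a = approved_a / total_a
    rate_b = approved_b / total_b
    pooled = (approved_a + approved_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


# Hypothetical counts, NOT the figures from the HMDA data
z, p_value = two_proportion_z_test(
    approved_a=800, total_a=1000,  # group A: 80% approved
    approved_b=140, total_b=200,   # group B: 70% approved
)
reject_null = p_value < 0.05
print(round(z, 2), reject_null)
```

With a p-value below 0.05 we reject the null hypothesis of equal approval rates, which mirrors the reasoning in the conclusion above.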
