New York City has published data on student SAT scores by high school, along with additional demographic data sets. Above, we combined the following data sets into a single, clean pandas dataframe:
New York City has a significant immigrant population and is very diverse, so comparing demographic factors such as race, income, and gender with SAT scores is a good way to determine whether the SAT is a fair test. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair.
From the result above, the percentage of white students (the white_per column) has the strongest positive correlation with sat_score, followed by the percentage of Asian students (the asian_per column).
The strongest negative correlation is with the percentage of students eligible for the free and reduced-price lunch program (FRL, the frl_percent column). The higher a school's share of FRL-eligible students, the lower its average SAT score tends to be.
It is almost undeniable that ethnicity and family wealth come into play, which is consistent with public opinion.
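As a rough sketch of how a ranking like this can be produced, the snippet below computes the Pearson correlation of each column with sat_score and sorts the result. The small dataframe is a made-up stand-in for the combined data set (the values are invented for illustration); only the column names white_per, asian_per, frl_percent, and sat_score come from the text above.

```python
import pandas as pd

# Toy stand-in for the combined dataframe. All values are invented
# for illustration; they are not taken from the real NYC data.
combined = pd.DataFrame({
    "sat_score":   [1100, 1200, 1300, 1500, 1700],
    "white_per":   [2.0, 5.0, 10.0, 30.0, 45.0],
    "asian_per":   [3.0, 8.0, 6.0, 25.0, 40.0],
    "frl_percent": [90.0, 80.0, 70.0, 40.0, 20.0],
})

# Pearson correlation of every column with sat_score. Drop sat_score's
# trivial correlation with itself, then sort from most positive to most negative.
correlations = (
    combined.corr()["sat_score"]
    .drop("sat_score")
    .sort_values(ascending=False)
)
print(correlations)
```

Sorting puts the strongest positive correlations at the top and the strongest negative ones at the bottom, which makes both ends of the ranking easy to read off.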
Less obvious is that saf_t_11 and saf_s_11, which measure how teachers and students perceive safety at school, also correlate highly with sat_score. Let's look into this further.
As you can see from the above, students at schools where they feel safer generally do better on the SAT.
From the graph above, we can see that schools with a higher percentage of white and Asian students do better on the SAT, while the opposite is true for schools with a higher percentage of Black and Hispanic students. We will investigate this further.
Let's look at the Hispanic case first.
As expected, there is a negative correlation. Let's look at the schools with a high Hispanic percentage.
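A filter of this kind might look like the sketch below. The column name hispanic_per is assumed by analogy with white_per and asian_per, the "SCHOOL NAME" column and the 95% cutoff are hypothetical choices, and the rows are invented placeholder schools.

```python
import pandas as pd

# Placeholder rows; the school names and numbers are invented for illustration.
schools = pd.DataFrame({
    "SCHOOL NAME":  ["School A", "School B", "School C", "School D"],
    "hispanic_per": [99.8, 40.0, 96.7, 12.5],
    "sat_score":    [950, 1250, 1010, 1600],
})

# Assumed cutoff for "high Hispanic percentage"; adjust as needed.
HIGH_HISPANIC_THRESHOLD = 95.0

# Boolean mask keeps only the rows above the cutoff.
high_hispanic = schools[schools["hispanic_per"] > HIGH_HISPANIC_THRESHOLD]
print(high_hispanic[["SCHOOL NAME", "hispanic_per", "sat_score"]])
```

The same pattern with a `<` comparison and a second condition on sat_score could be used to list schools at the other end of the distribution.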
Many of them are schools that mainly serve immigrants.
Most of them are STEM schools, which are probably associated with higher school fees.
From the graph, schools with more female than male students do slightly better on the SAT.
It is not very obvious from the graph, but the higher scores lean toward schools with a higher percentage of female students. However, all-girls schools do not do noticeably better than mixed-gender schools.
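One simple way to check the all-girls comparison numerically is to compare group means, as in the sketch below. The column name female_per is an assumption (by analogy with the other percentage columns), and the rows are invented for illustration.

```python
import pandas as pd

# Invented example rows; female_per is an assumed column name.
schools = pd.DataFrame({
    "female_per": [100.0, 100.0, 55.0, 48.0, 62.0, 30.0],
    "sat_score":  [1200, 1250, 1300, 1150, 1400, 1100],
})

# Treat a school as all-girls when every enrolled student is female.
is_all_girls = schools["female_per"] == 100.0

# Average SAT score within each group.
all_girls_mean = schools.loc[is_all_girls, "sat_score"].mean()
mixed_mean = schools.loc[~is_all_girls, "sat_score"].mean()
print(f"all-girls: {all_girls_mean:.1f}, mixed: {mixed_mean:.1f}")
```

With only a handful of all-girls schools in a real data set, a difference in means like this should be read as a rough indication rather than firm evidence.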
In the U.S., high school students take Advanced Placement (AP) exams to earn college credit. There are AP exams for many different subjects. Let's see whether AP participation correlates with sat_score.
There seems to be a correlation; however, the data appear to split into two groups: an upper group with a strong correlation between AP test-taker percentage and SAT score, and a lower group with no correlation at all. The groups divide at an overall SAT score of around 1300.
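The two-group pattern can be checked by splitting the schools at the 1300 mark and computing the correlation separately in each half. The sketch below does this on invented data; ap_per is a hypothetical column name for the AP test-taker share, and the toy values were chosen so that the two groups behave differently.

```python
import pandas as pd

# Invented data: ap_per is a hypothetical column holding the share of
# students who took at least one AP exam.
combined = pd.DataFrame({
    "ap_per":    [0.10, 0.20, 0.30, 0.40, 0.20, 0.40, 0.60, 0.80],
    "sat_score": [1150, 1100, 1200, 1120, 1400, 1600, 1800, 2000],
})

# Split point taken from the observation that the groups divide near 1300.
SPLIT = 1300

upper = combined[combined["sat_score"] >= SPLIT]
lower = combined[combined["sat_score"] < SPLIT]

# Pearson correlation between AP share and SAT score within each group.
upper_r = upper["ap_per"].corr(upper["sat_score"])
lower_r = lower["ap_per"].corr(lower["sat_score"])
print(f"upper r = {upper_r:.2f}, lower r = {lower_r:.2f}")
```

Comparing the two coefficients side by side gives a quick numerical counterpart to what the scatter plot shows visually.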