Analyzing SAT Performance of NYC Public Schools

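This project explores the SAT performance of New York City (NYC) public schools using the dataset schools.csv. The analysis addresses three questions: which schools have the best math results, which ten schools have the highest combined SAT scores, and which borough shows the largest variation in combined scores.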

Project Overview

Problem/Goal:

To analyze the SAT performance of NYC public schools and provide insights that can help stakeholders make informed decisions.

Data Source:

schools_data

The dataset schools_data (loaded from schools.csv in the code below) includes, for each NYC public school, its name, its borough, and its average SAT scores in reading, math, and writing.

Key Insights:

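  • Top Schools for Math Performance: Identified schools whose average math score is at least 80% of the 800-point maximum (640 or higher); Stuyvesant High School leads with 754.
  • Top 10 Schools Based on Combined SAT Scores: Ranked schools by the sum of their average math, reading, and writing scores; Stuyvesant High School ranks first with 2144.
  • Borough with Largest Standard Deviation in Combined SAT Scores: Manhattan shows the widest spread, with a standard deviation of 230.29 across 89 schools.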

Conclusion:

This analysis identifies the schools with the strongest math and combined SAT results and shows that Manhattan has the widest spread of combined scores. These findings can help stakeholders make informed decisions about education policy, resource allocation, and school selection.

Python Code and Outputs:

1. Which NYC schools have the best math results?


import pandas as pd

# Importing the data
schools = pd.read_csv("schools.csv")

# Top Schools for Math Performance
best_math_schools = schools[["school_name", "average_math"]]
best_math_schools = best_math_schools[best_math_schools["average_math"] >= 800 * 0.8].sort_values("average_math", ascending=False)
print(best_math_schools.head())
                    
Output:
School Name                                                            Average Math Score
Stuyvesant High School                                                 754
Bronx High School of Science                                           714
Staten Island Technical High School                                    711
Queens High School for the Sciences at York College                    701
High School for Mathematics, Science, and Engineering at City College  683
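
The filter above keeps schools whose average math score is at least 80% of the 800-point section maximum, that is, 640 or higher. A minimal sketch of that step on a small made-up table follows; the school names and scores are hypothetical and are not taken from schools.csv.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not rows from schools.csv).
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "average_math": [754, 640, 639, 512],
})

MAX_SECTION_SCORE = 800              # maximum score for a single SAT section
threshold = MAX_SECTION_SCORE * 0.8  # 80% of the maximum, i.e. 640

# Keep schools at or above the threshold, highest score first.
best_math_toy = (
    toy[toy["average_math"] >= threshold]
    .sort_values("average_math", ascending=False)
)
print(best_math_toy)
```

Because the comparison is inclusive (>=), a school scoring exactly 640 is kept, while one scoring 639 is dropped.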

2. What are the top 10 performing schools based on the combined SAT scores?


# Creating a column for total SAT scores
schools["total_SAT"] = schools["average_math"] + schools["average_writing"] + schools["average_reading"]
top_10_schools = schools[["school_name", "total_SAT"]].sort_values("total_SAT", ascending=False).head(10)
print(top_10_schools)
                    
Output:
School Name                                                            Total SAT Score
Stuyvesant High School                                                 2144
Bronx High School of Science                                           2041
Staten Island Technical High School                                    2041
High School of American Studies at Lehman College                      2013
Townsend Harris High School                                            1981
Queens High School for the Sciences at York College                    1947
Bard High School Early College                                         1914
Brooklyn Technical High School                                         1896
Eleanor Roosevelt High School                                          1889
High School for Mathematics, Science, and Engineering at City College  1889
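
The ranking contains tied totals (two schools at 2041 and two at 1889), so a fixed cut-off of ten rows could leave out a school that shares the tenth-place score. One possible way to keep such ties is pandas' nlargest with keep="all". The sketch below uses hypothetical schools and scores and is an alternative to the sort_values(...).head(10) call above, not the method used in this analysis.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not rows from schools.csv).
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "average_math": [700, 650, 640, 500],
    "average_reading": [690, 660, 650, 480],
    "average_writing": [680, 640, 660, 470],
})

# Same total as in the analysis: math + writing + reading.
toy["total_SAT"] = toy["average_math"] + toy["average_writing"] + toy["average_reading"]

# Strict top 2: exactly two rows, even though B and C are tied at 1950.
top_2_strict = toy.nlargest(2, "total_SAT")

# Top 2 keeping ties: every row sharing the cut-off score is included.
top_2_with_ties = toy.nlargest(2, "total_SAT", keep="all")

print(top_2_strict[["school_name", "total_SAT"]])
print(top_2_with_ties[["school_name", "total_SAT"]])
```

With keep="all" the result can contain more rows than requested, which may or may not be what a "top 10" list should show.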

3. Which single borough has the largest standard deviation in the combined SAT score?


# Calculating the number of schools, mean, and standard deviation of total_SAT for each borough
boroughs = schools.groupby("borough")["total_SAT"].agg(["count", "mean", "std"]).round(2)
boroughs = boroughs.rename(columns={"count": "num_schools", "mean": "average_SAT", "std": "std_SAT"})

# Keeping only the borough with the largest standard deviation
largest_std_dev = boroughs.sort_values("std_SAT", ascending=False).head(1).reset_index()
print(largest_std_dev)
                    
Output:
Borough    Number of Schools  Average SAT Score  Standard Deviation of SAT Scores
Manhattan  89                 1340.13            230.29
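
By default pandas computes the sample standard deviation (ddof=1), which is what the "std" aggregation above reports. The sketch below, on hypothetical boroughs and scores, compares it with the population standard deviation (ddof=0) and picks the borough with the widest spread.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not rows from schools.csv).
toy = pd.DataFrame({
    "borough": ["X", "X", "X", "Y", "Y", "Y"],
    "total_SAT": [1000, 1200, 1400, 1190, 1200, 1210],
})

grouped = toy.groupby("borough")["total_SAT"]

sample_std = grouped.std()            # pandas default: ddof=1 (divides by n - 1)
population_std = grouped.std(ddof=0)  # divides by n

# Borough label with the largest sample standard deviation.
widest_borough = sample_std.idxmax()

print(pd.DataFrame({"sample_std": sample_std, "population_std": population_std}).round(2))
print(widest_borough)
```

The two definitions give different values, but with many schools per borough the difference is small and is unlikely to change which borough ranks first.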