Introduction
One of the first things many people notice about professional basketball players is their size. Most NBA athletes stand well over six feet tall and look physically different from the average person, which leads to a common assumption that height is the defining characteristic of success in professional basketball.
As someone who doesn't closely follow the NBA, I became curious about whether that assumption was actually supported by data. Are taller players better scorers? Or do different physical characteristics contribute to different types of success?
To answer these questions, I analyzed nearly three decades of NBA player data using Microsoft Excel. Rather than focusing on championships or individual awards, I explored whether measurable physical characteristics—primarily height, weight, and career longevity—were associated with four fundamental performance metrics: scoring, rebounding, playmaking, and shooting efficiency.
More importantly, this project demonstrates the process of exploratory data analysis: starting with a question, testing assumptions with data, and allowing the evidence to shape the conclusions.
Defining Success
Success in basketball can mean different things depending on a player's role. Instead of attempting to create a single "greatness" score, I evaluated four independent measures of performance that are widely understood by both basketball fans and casual observers.
Scoring — Points Per Game (PPG)
Rebounding — Rebounds Per Game (RPG)
Playmaking — Assists Per Game (APG)
Shooting Efficiency — True Shooting Percentage (TS%)
Each metric represents a different way a player contributes to a team's success.
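Of the four metrics, True Shooting Percentage is the least self-explanatory, so a small sketch may help. It uses the standard definition, TS% = PTS / (2 × (FGA + 0.44 × FTA)), which folds two-pointers, three-pointers, and free throws into one efficiency number. The function and parameter names are my own for illustration, and nothing here says whether the dataset's TS% column was precomputed this way.

```python
def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """Return True Shooting Percentage as a fraction (0.58 means 58%).

    points: total points scored
    fga:    field goal attempts
    fta:    free throw attempts
    """
    # 0.44 is the conventional weight for free throw attempts.
    shooting_possessions = 2 * (fga + 0.44 * fta)
    if shooting_possessions == 0:
        raise ValueError("TS% is undefined with no shot attempts")
    return points / shooting_possessions


# Hypothetical stat line: 20 points on 15 field goal attempts and 5 free throws.
example_ts = true_shooting_pct(points=20, fga=15, fta=5)
```

Because the denominator counts every kind of scoring attempt, TS% is comparable across players with very different shot profiles.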
Dataset
The analysis was conducted using an NBA player statistics dataset obtained from Kaggle containing approximately thirty seasons of player information.
To support different types of analysis, the dataset was separated into two tables.
Season Dataset
Approximately 12,800 player-season records
One record for each player during each NBA season
Career Dataset
I used Pivot Tables in Microsoft Excel to summarize the season-level data into career averages, producing a dataset of approximately 2,500 unique players.
Each player record included:
Height
Weight
Career Length (Seasons)
Career Points Per Game
Career Rebounds Per Game
Career Assists Per Game
Career True Shooting Percentage
Additional supporting metrics
Using career averages ensured that each player contributed equally to the analysis regardless of how many seasons they played.
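For readers who do not use Excel, the season-to-career roll-up can be sketched in a few lines of Python. This is only an illustration of the idea, not the workbook itself: the rows and field layout are hypothetical, and it assumes a career average is the plain mean of a player's season values, unweighted by games played.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical season-level rows: (player, season, points per game).
season_rows = [
    ("Player A", "2001-02", 10.0),
    ("Player A", "2002-03", 14.0),
    ("Player B", "2001-02", 6.0),
]


def career_averages(rows):
    """Collapse season rows into one record per player."""
    by_player = defaultdict(list)
    for player, _season, ppg in rows:
        by_player[player].append(ppg)
    return {
        player: {"seasons": len(values), "career_ppg": mean(values)}
        for player, values in by_player.items()
    }


career = career_averages(season_rows)
```

Each player ends up as exactly one record, which is what keeps long careers from outweighing short ones in the later steps.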
Methodology
The project followed a structured exploratory data analysis process.
1. Data Preparation
Cleaned the dataset
Created career-level summaries using Pivot Tables
Calculated career length
Validated the data for missing values and duplicates
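The validation step can be sketched as two small checks. This is an assumed approach rather than a record of what was done in Excel: the field names are hypothetical, and treating a repeated (player, season) pair as a duplicate is my own choice of key.

```python
from collections import Counter

# Hypothetical rows, including one duplicate and one missing height.
rows = [
    {"player": "Player A", "season": "2001-02", "height_cm": 198.0},
    {"player": "Player A", "season": "2001-02", "height_cm": 198.0},
    {"player": "Player B", "season": "2001-02", "height_cm": None},
]


def find_missing(rows, fields):
    """Return the indexes of rows where any listed field is empty."""
    return [
        i for i, row in enumerate(rows)
        if any(row.get(field) in (None, "") for field in fields)
    ]


def find_duplicates(rows, key_fields):
    """Return the key tuples that appear more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return [key for key, n in counts.items() if n > 1]


missing = find_missing(rows, ["height_cm"])
duplicates = find_duplicates(rows, ["player", "season"])
```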
2. Descriptive Statistics
Calculated the following to better understand the distribution of the data before investigating relationships:
Mean
Median
Minimum
Maximum
Standard Deviation
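The same five summary statistics can be reproduced with Python's standard library. The heights below are made-up values for illustration only. The sketch uses the sample standard deviation, which is an assumption, since Excel offers both sample and population versions.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return the five summary statistics used in this step."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "stdev": stdev(values),  # sample standard deviation (n - 1)
    }


# Hypothetical heights in centimetres, not taken from the dataset.
heights_cm = [185.0, 190.0, 198.0, 201.0, 211.0]
summary = describe(heights_cm)
```

Comparing the mean with the median is a quick way to spot skew before moving on to correlations.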
3. Correlation Analysis
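Pearson correlation coefficients were calculated to measure the strength and direction of the linear relationship between each physical characteristic and each performance metric. The coefficient ranges from -1 to 1, with values near 0 indicating little linear relationship.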
Physical Characteristics
Height
Weight
Career Length
Performance Metrics
Points Per Game
Rebounds Per Game
Assists Per Game
True Shooting Percentage
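To make the calculation concrete, here is a minimal Pearson correlation written out by hand. Excel's CORREL function returns the same coefficient, although the write-up does not depend on which function was used. The function name and inputs are illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from the means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return co_deviation / (spread_x * spread_y)
```

In this project, xs would be a physical characteristic such as height and ys a career metric such as rebounds per game, with one pair per player.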
4. Quartile Analysis
Players were divided into quartiles for each performance metric.
Comparing the average height of the top-performing players against the bottom-performing players made it easier to identify practical differences that correlation coefficients alone might not reveal.
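The quartile comparison can be sketched as follows. The eight players and their field names are hypothetical, and splitting by rank into four equal groups is an assumption, since the exact way the quartiles were cut in Excel is not described here.

```python
from statistics import mean


def quartile_height_gap(players, metric):
    """Mean height of the top quartile minus mean height of the bottom
    quartile, with quartiles formed by ranking players on `metric`."""
    ranked = sorted(players, key=lambda p: p[metric])
    size = len(ranked) // 4
    if size == 0:
        raise ValueError("need at least four players")
    bottom, top = ranked[:size], ranked[-size:]
    return mean(p["height_cm"] for p in top) - mean(p["height_cm"] for p in bottom)


# Hypothetical players, not taken from the dataset.
players = [
    {"height_cm": 183, "rpg": 2.0},
    {"height_cm": 188, "rpg": 2.5},
    {"height_cm": 193, "rpg": 3.5},
    {"height_cm": 196, "rpg": 4.0},
    {"height_cm": 201, "rpg": 5.0},
    {"height_cm": 206, "rpg": 6.5},
    {"height_cm": 208, "rpg": 8.0},
    {"height_cm": 213, "rpg": 9.5},
]
gap_cm = quartile_height_gap(players, "rpg")
```

A positive gap means the best performers on that metric are taller on average, and a negative gap means they are shorter, which is easier to communicate than a coefficient.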
Results
Does Height Influence NBA Success?
Height was not a strong predictor of overall success.
Instead, it was associated with different aspects of the game in different ways.
Scoring
Contrary to common expectations, height showed only a weak relationship with career scoring average.
Players of many different heights were capable of becoming high scorers.
Rebounding
Of the four performance metrics, rebounding showed the strongest relationship with height.
Players in the highest rebounding quartile were noticeably taller than players in the lowest quartile, suggesting that height provides a meaningful physical advantage when securing rebounds.
Playmaking
One of the most interesting findings was that shorter players consistently averaged more assists.
Although player positions were not included in the dataset, the results were consistent with different styles of play, with shorter athletes tending to contribute more as facilitators and ball handlers.
Shooting Efficiency
True Shooting Percentage showed relatively little relationship with height, suggesting that efficient scoring depends more on skill and shot selection than physical stature alone.
Key Findings
Height has little relationship with career scoring average.
Taller players consistently perform better in rebounding.
Shorter players tend to average more assists.
Different physical characteristics are associated with different styles of success rather than overall superiority.
Correlation analysis and quartile analysis together provide a more complete understanding than either technique alone.
What I Learned
This project reinforced an important lesson in data analytics.
The original hypothesis assumed that taller players would generally perform better in professional basketball. However, the data demonstrated that the relationship between physical characteristics and performance is far more nuanced.
Rather than supporting a single conclusion, the analysis showed that height is associated with specific aspects of the game, such as rebounding and assists, while having relatively little relationship with others, such as scoring and shooting efficiency.
From a data analysis perspective, this project also demonstrated the value of combining descriptive statistics, correlation analysis, and quartile segmentation to investigate real-world questions. The quartile analysis proved especially valuable because it translated statistical relationships into patterns that were easier to interpret and communicate.
Ultimately, the project highlights an important principle of analytics: assumptions should always be tested against data. The most interesting discoveries are often the ones that challenge conventional wisdom.
Tools Used
Microsoft Excel
Pivot Tables
Correlation Analysis
Quartile Segmentation
Scatter Plot Visualization
Descriptive Statistics
Generative AI (project planning, methodology review, interpretation, and editorial assistance)