r/dataisbeautiful • u/rsrgrimm • 2d ago
How did draft position affect fantasy football league performance in 2024? (12-man leagues, snake draft)
To assess how draft position affected league performance, I looked into over 400 12-man leagues (all snake drafts) and plotted win ratio, normalized points earned (normalized within a given league to account for various scoring and roster settings), and final league ranking for each draft position.
Surprisingly, 1st pick performed worst on average across all metrics.
League data was collected from the Sleeper API.
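For anyone who wants to reproduce the per-league metrics, here is a minimal sketch of what the normalization could look like. The post does not state the exact formula, so dividing by the league mean is an assumption, as is counting a tie as half a win; the function names and the toy league are made up for illustration.

```python
from statistics import mean


def normalize_points(points_by_team):
    """Scale each team's season points by the league mean (assumed formula).

    A value of 1.0 means exactly league-average scoring, which makes leagues
    with different scoring and roster settings comparable.
    """
    league_mean = mean(points_by_team.values())
    return {team: pts / league_mean for team, pts in points_by_team.items()}


def win_ratio(wins, losses, ties=0):
    """Fraction of games won, counting a tie as half a win (an assumption)."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games if games else 0.0


# Toy three-team league, not real Sleeper data.
league = {"team_a": 1500.0, "team_b": 1200.0, "team_c": 900.0}
normalized = normalize_points(league)
```

A z-score within each league would be another reasonable choice; it additionally rescales by how spread out scoring is in that league.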
23
15
u/ScientistFromSouth OC: 1 2d ago
It appears nothing happened whatsoever. Did you run any statistical tests, such as Tukey HSD, ANOVA, or something more specific for rank ordering, to test this rigorously?
8
u/SkilledB 2d ago
This reads like “I can’t read graphs”. Of course they all kinda look the same, because the tails of the box and whisker are all at 1 and 12, meaning plenty of finishes at the top and bottom from all draft spots. What’s inside the box is the significant takeaway from this.
6
u/ScientistFromSouth OC: 1 1d ago
To be fair, box plots don't make sense in this case because they are designed for continuous data, not rank order. You literally can't read them for discrete data. Second, his finding was that 1st-pick teams finished around 7th on average and 10th-pick teams around 6th, which he reported as 6.9 vs 5.7 (which, once again, doesn't make sense for a discrete rank order). A one-place difference in a game with twelve players feels like an insignificant net change and basically random.
2
u/SkilledB 1d ago
First, agreed this visualization type definitely isn’t the best for this data.
However, if the sample is big enough to not produce random results, then the 1.2 difference is huge. There isn’t supposed to be a systematic benefit to any single draft spot in the snake draft format. This data (again, assuming a large enough sample) shows that last year, there was a non-negligible difference caused by draft position.
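One way to check the "big enough sample" condition without leaning on ANOVA's assumptions is a permutation test on the difference in mean final rank between two draft slots. The sketch below is only an illustration: the function name and the toy inputs are made up, and shuffling treats teams as exchangeable, which ignores the fact that ranks within one league depend on each other. For reference, with no draft-slot effect every slot's expected final rank in a 12-team league is (1 + 12) / 2 = 6.5.

```python
import random


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in mean final rank.

    a, b: final ranks for two draft slots (e.g. 1st pick vs 10th pick).
    Returns the share of label shuffles whose absolute mean difference is
    at least as large as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / len(b))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Toy inputs, not the league data: identical groups vs fully separated groups.
p_same = permutation_p_value([1, 2, 3], [1, 2, 3], n_permutations=200)
p_apart = permutation_p_value([1] * 20, [12] * 20, n_permutations=200)
```

With real data, `a` and `b` would each hold roughly 400 final ranks, one per league.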
2
u/ScientistFromSouth OC: 1 1d ago
So a few issues:
The guy ran ANOVA to justify statistical significance, even though ANOVA only says that there is at least one statistically significant non-zero slope in a linear model fit to (ideally) independent, continuous data. That doesn't hold for rank-order data, which is neither independent nor continuous.
He seems really hung up on this 1.2 outcome and the "largest p-value". He's likely p-hacking. I would like to see a pairwise matrix comparing the relative finish positions of all draft positions. I would also like to see some multiple-test correction like Tukey HSD. However, he would need to run something designed for rank-order data specifically.
I know this sub likes aesthetic plots, but come on. For plots this wide, some grid lines that show the neutral 50% win probability, or color coding for median win rate above or below 50%, would make this infinitely better.
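To make the second point concrete, here is a rough sketch of the rank-based route in plain Python: the Kruskal-Wallis H statistic (a rank analogue of one-way ANOVA) plus a Holm step-down correction for the pairwise p-values. Holm is swapped in here instead of Tukey HSD because it works on any set of p-values. The function names and toy numbers are made up, and it still assumes independent observations, which ranks within one league are not.

```python
from collections import Counter


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running_max
    return adjusted


# Toy data: three perfectly separated groups give H = 7.2 (up to rounding).
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

With 12 draft slots there are 66 pairwise comparisons, so the correction step matters a lot. H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom to get a p-value.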
1
u/rsrgrimm 1d ago
Do you have any recommendations for a better visualization for the final ranking data? I'll admit that data visualization isn't my strongest suit, especially for this type of data. Perhaps a clustered bar chart?
6
u/rsrgrimm 2d ago edited 2d ago
Look more closely. There are clear differences between averages. The mean final ranking for 1st pick teams was 6.9 as opposed to 5.7 for 10th pick teams. That is pretty significant for a sample of 400 teams.
Before running statistical tests, the samples and differences are large enough that it is evident that draft position had an effect. After running ANOVA, for the metrics measured the largest p-value was 5e-6, so draft position was significant for each metric.
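For anyone wondering what the ANOVA is actually computing, here is a minimal sketch of the one-way F statistic in plain Python. It compares the variance between group means (one group per draft position) to the variance within groups. The function name and toy groups are made up for illustration and are not the league data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)                       # number of groups (draft positions)
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy groups with means 2, 3 and 4 and equal spread.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

Turning F into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom; `scipy.stats.f_oneway` does both steps if SciPy is available.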
4
u/ScientistFromSouth OC: 1 1d ago
Boxplots aren't appropriate for rank order data. They are designed for continuous data. These plots really don't make sense for variables defined on discrete support.
Second, while this is significant at the population level, I am not sure that it provides an individual player with actionable information. Going 10th somehow gives me a 1.2 rank advantage over going first on average. However, that feels pretty trivial in a 12-person game in a one-off instance.
Additionally, the validity of an ANOVA test that checks for nonzero slopes of a linear model fit to continuous data being used on a rank order model with discrete support is highly suspect.
1
u/rsrgrimm 1d ago edited 1d ago
That's not true. Box plots are fine for discrete data if the data is ordinal, which this is, and there are a sufficient number of possible values. You could argue there are not enough possible values here (I admit the whiskers are pretty useless), but the means, medians, and IQR are different enough to convey useful information.
I agree that you shouldn't be using this data to guide your draft position (not that many have a choice, though I suppose some leagues allow trading). Though statistically significant, the differences are relatively small and, more importantly, this data only encapsulates a single year.
"Additionally, the validity of an ANOVA test that checks for nonzero slopes of a linear model fit to continuous data being used on a rank order model with discrete support is highly suspect."
I don't necessarily agree. There are three metrics being measured here. All can be considered representative of team performance in a league and the ANOVA tests suggest a statistically significant relationship between the predictor and the response. The final league ranking is the only one of the three metrics that is discrete. Plus, I'll remind you that you are the one who asked about an ANOVA test in the first place.
5
u/rsrgrimm 2d ago
I plan on diving into this data more, so let me know if you have any suggestions for further analysis
20
u/drc500free 2d ago
Performance by drafting CMC at 1.1 vs picking anyone else.
2
u/rsrgrimm 2d ago
Yeah, I'd like to look into the effect of drafting specific players for specific picks.
8
u/JamminOnTheOne 2d ago
I’d suggest looking at multiple years of data, so it’s not overly influenced by the specific players drafted in certain spots having especially good or bad seasons (like McCaffrey at #1 in 2024).
1
u/rsrgrimm 2d ago
Yeah, I plan on doing that analysis. This data definitely isn't robust enough to indicate whether draft position has a consistent influence on season performance. But I still thought it was interesting.
2
u/zanderman12 1d ago
It's actually pretty consistent, in that the consensus #1 has busted for like 5 years in a row. I did this a few years ago, but even 4 years of data shows a similar pattern: 2022 Snake Draft Strategy Review
1
u/zanderman12 1d ago
Hey OP, I love analyzing this type of data. I've done so in previous years but haven't had the time this year. Feel free to look through my old posts for inspiration, and feel free to DM me if you have questions or want to talk anything through: Snake Draft Strategy Review: What Worked in 2023
1
u/rsrgrimm 1d ago
Awesome, will do. Thank you. I've actually read some of your posts before. Are you also collecting your data from Sleeper?
1
u/zanderman12 1d ago
I started with ESPN, but have done more with Sleeper recently. Sleeper makes it a lot easier to collect a large number of leagues and has an actually documented API, but ESPN has access to weekly historical projections, which can unlock some other analyses.
1
u/Reverie_of_an_INTP 2d ago
What do the triangles mean, and what do the dots outside the whiskers mean?
2
u/rsrgrimm 2d ago edited 1d ago
The dotted horizontal line toward the middle of the box is the mean, whereas the solid horizontal line is the median. The points of the dotted triangles indicate 1 standard deviation above and below the mean. Dots beyond the whiskers can be considered outliers: points more than 1.5 times the interquartile range (IQR, i.e. the range between the 25th and 75th percentiles) below the 25th percentile or above the 75th percentile.
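As a rough illustration of that outlier rule, the sketch below computes the 1.5 x IQR fences with Python's `statistics.quantiles`. The function names and sample values are made up, and plotting libraries may use a slightly different quartile method, so the exact fences can differ a little from what a given chart draws.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Lower and upper outlier fences: k * IQR beyond the 25th/75th percentiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(values, k=1.5):
    """Values that fall outside the fences and would be drawn as dots."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Made-up sample: nine ordinary values and one extreme one.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
fences = tukey_fences(sample)
flagged = outliers(sample)
```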
0
99
u/Nickyjha 2d ago
not really surprising when the consensus number 1 pick ended up only playing 4 games and finishing as RB73