r/dataisbeautiful 2d ago

How did draft position affect fantasy football league performance in 2024? (12-man leagues, snake draft)

To assess how draft position affected league performance, I looked at over 400 12-man leagues (all snake drafts) and plotted win ratio, normalized points scored (normalized within each league to account for varying scoring and roster settings), and final league ranking for each draft position.

Surprisingly, 1st pick performed worst on average across all metrics.

League data collected from Sleeper API.
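For anyone curious about the normalization step, here's a rough sketch of the idea (not my exact code; column names are placeholders for the flattened Sleeper data):

```python
# Sketch of the within-league normalization: scale points so leagues with
# different scoring settings are comparable, then summarize by draft slot.
# (I z-score here for illustration; the actual normalization may differ.)
import pandas as pd

# one row per team: league_id, draft_slot (1-12), points_for, wins, losses, final_rank
df = pd.read_csv("sleeper_teams_2024.csv")  # placeholder for the flattened Sleeper export

df["points_norm"] = df.groupby("league_id")["points_for"].transform(
    lambda s: (s - s.mean()) / s.std(ddof=0)
)
df["win_ratio"] = df["wins"] / (df["wins"] + df["losses"])

summary = df.groupby("draft_slot")[["win_ratio", "points_norm", "final_rank"]].mean()
print(summary)
```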

40 Upvotes

15

u/ScientistFromSouth OC: 1 2d ago

It appears nothing happened whatsoever. Did you run any statistical tests, such as Tukey's HSD, ANOVA, or something more specific to rank-ordered data, to test this rigorously?

10

u/SkilledB 2d ago

This reads like "I can't read graphs". Of course they all kinda look the same, because the whiskers all reach 1 and 12, meaning plenty of finishes at the top and bottom from every draft spot. What's inside the boxes is where the significant takeaways are.

6

u/ScientistFromSouth OC: 1 1d ago

To be fair, box plots then don't make sense in this case, because they are designed for continuous data and not rank order. You can't really read them for discrete data. Second, his statistical finding was that the 1st pick finished around 7th place on average and the 10th pick around 6th, which he reported as 6.9 vs 5.7 (decimals that once again don't make much sense for a discrete rank order). A one-place difference in a game with twelve players feels like an insignificant net change and basically random.

2

u/SkilledB 1d ago

First, agreed this visualization type definitely isn’t the best for this data.

However, if the sample is big enough to not produce random results, then the 1.2 difference is huge. There isn't supposed to be a systematic benefit to any single draft spot in a snake draft format. This data (again, assuming a large enough sample) shows that last year there was a non-negligible difference caused by draft position.

3

u/ScientistFromSouth OC: 1 1d ago

So a few issues:

  1. The guy ran ANOVA to justify statistical significance, even though that only tells you there is at least one statistically significant non-zero slope in a linear model fit to (ideally independent) continuous data, which isn't the case here: rank-order data is neither independent nor continuous.

  2. He seems really hung up on this 1.2 outcome and the "largest p-value". It smells like p-hacking. I would like to see a pairwise comparison matrix of all the draft slots and their relative finish positions, with some multiple-test correction like Tukey's HSD, although he would really need to run something designed for rank-order data (rough sketch of what I mean after this list).

  3. I know this sub likes aesthetic plots, but come on. For plots this wide, some grid lines marking the neutral 50% win rate, or color coding for whether the median win rate sits above or below 50%, would make this infinitely better.
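Something along these lines is what I mean, swapping ANOVA/Tukey for rank-aware tests: Kruskal-Wallis as the omnibus test, then Holm-corrected pairwise Mann-Whitney U tests. Just a sketch with assumed column names, not a drop-in analysis:

```python
# Rank-aware alternative to ANOVA + Tukey's HSD: Kruskal-Wallis omnibus test,
# then Holm-corrected pairwise Mann-Whitney U tests on final rank.
from itertools import combinations

import pandas as pd
from scipy.stats import kruskal, mannwhitneyu
from statsmodels.stats.multitest import multipletests

df = pd.read_csv("sleeper_teams_2024.csv")  # placeholder table: draft_slot, final_rank, ...
groups = {slot: g["final_rank"].values for slot, g in df.groupby("draft_slot")}

# Omnibus test across all 12 draft slots
stat, p_omnibus = kruskal(*groups.values())
print(f"Kruskal-Wallis: H={stat:.2f}, p={p_omnibus:.2g}")

# Pairwise matrix with Holm correction for the 66 slot-vs-slot comparisons
pairs = list(combinations(sorted(groups), 2))
pvals = [mannwhitneyu(groups[a], groups[b], alternative="two-sided").pvalue for a, b in pairs]
reject, p_adj, _, _ = multipletests(pvals, method="holm")
for (a, b), p, r in zip(pairs, p_adj, reject):
    if r:
        print(f"slot {a} vs slot {b}: adjusted p={p:.3g}")
```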

1

u/rsrgrimm 1d ago

Do you have any recommendations for a better visualization for the final ranking data? I'll admit that data visualization isn't my strongest suit, especially for this type of data. Perhaps a clustered bar chart?
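To make the clustered bar idea concrete, something like this is what I was picturing (just a sketch; column names are placeholders for however the data is stored):

```python
# Rough sketch of a clustered bar chart: for each draft slot, the count of
# leagues in which that slot finished at each final rank.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sleeper_teams_2024.csv")  # placeholder table: draft_slot, final_rank, ...
counts = pd.crosstab(df["draft_slot"], df["final_rank"])

counts.plot(kind="bar", figsize=(14, 5), width=0.8)
plt.xlabel("Draft slot")
plt.ylabel("Number of leagues")
plt.legend(title="Final rank", ncol=6, fontsize="small")
plt.tight_layout()
plt.show()
```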

7

u/rsrgrimm 2d ago edited 2d ago

Look more closely. There are clear differences between the averages. The mean final ranking for 1st-pick teams was 6.9, as opposed to 5.7 for 10th-pick teams. That is pretty substantial for a sample of 400+ teams per draft slot.

Even before running statistical tests, the samples and differences are large enough that it is evident draft position had an effect. After running ANOVA, the largest p-value across the metrics measured was 5e-6, so draft position was significant for each metric.
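For reference, the ANOVA per metric was nothing fancier than something along these lines (a sketch, not my exact code; column names are placeholders):

```python
# Sketch of a one-way ANOVA per metric: does draft slot explain variation in
# the metric across the leagues? (column names are placeholders)
import pandas as pd
from scipy.stats import f_oneway

df = pd.read_csv("sleeper_teams_2024.csv")  # placeholder for the flattened Sleeper export

for metric in ["win_ratio", "points_norm", "final_rank"]:
    groups = [g[metric].values for _, g in df.groupby("draft_slot")]
    stat, p = f_oneway(*groups)
    print(f"{metric}: F={stat:.2f}, p={p:.2g}")
```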

4

u/ScientistFromSouth OC: 1 1d ago

Boxplots aren't appropriate for rank order data. They are designed for continuous data. These plots really don't make sense for variables defined on discrete support.

Second, while this is significant at the population level, I am not sure it provides actionable information for an individual player. Going 10th somehow gives me a 1.2-rank advantage over going 1st on average, but that feels pretty trivial in a 12-person game in a one-off instance.

Additionally, the validity of an ANOVA test, which checks for non-zero slopes in a linear model fit to continuous data, when applied to rank-order data with discrete support is highly suspect.
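To put that 1.2 in perspective, something like a common-language effect size would be more useful to an individual: the probability that a randomly chosen 10th-pick team finishes ahead of a randomly chosen 1st-pick team (0.5 = no edge). A sketch, not from the OP's analysis, with assumed column names:

```python
# Common-language effect size for 1st vs 10th pick, derived from the
# Mann-Whitney U statistic: P(10th-pick team outranks 1st-pick team),
# with ties split 50/50. Column names are placeholders.
import pandas as pd
from scipy.stats import mannwhitneyu

df = pd.read_csv("sleeper_teams_2024.csv")  # placeholder for the flattened Sleeper export
rank1 = df.loc[df["draft_slot"] == 1, "final_rank"]
rank10 = df.loc[df["draft_slot"] == 10, "final_rank"]

# Lower final_rank = better finish, so "less" tests whether 10th picks finish better
u, p = mannwhitneyu(rank10, rank1, alternative="less")
p_superiority = 1 - u / (len(rank1) * len(rank10))
print(f"P(10th pick finishes ahead of 1st pick) ~= {p_superiority:.2f}, p = {p:.3g}")
```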

1

u/rsrgrimm 1d ago edited 1d ago

That's not true. Box plots are fine for discrete data if the data is ordinal, which this is, and there are a sufficient number of possible values. You could argue there are not enough possible values here (I admit the whiskers are pretty useless), but the means, medians, and IQR are different enough to convey useful information.

I agree that you shouldn't be using this data to guide your draft position (not that many have a choice, though I suppose some leagues allow trading). Though statistically significant, the differences are relatively small and, more importantly, this data only encapsulates a single year.

> Additionally, the validity of an ANOVA test, which checks for non-zero slopes in a linear model fit to continuous data, when applied to rank-order data with discrete support is highly suspect.

I don't necessarily agree. There are three metrics being measured here. All can be considered representative of team performance in a league and the ANOVA tests suggest a statistically significant relationship between the predictor and the response. The final league ranking is the only one of the three metrics that is discrete. Plus, I'll remind you that you are the one who asked about an ANOVA test in the first place.