How to use women’s football data

Zurich's Swiss defender Vanessa Bernauer (L) works around Juventus' Italian forward Cristiana Girelli during the Women's UEFA Champions League Group C football match between Juventus and Zurich on December 15, 2022 at the Juventus stadium in Turin. (Photo by Marco Bertorello/AFP via Getty Images)

There is a shift going on in women’s football journalism and writing: analysis is becoming a bigger part of written work, which is very exciting to see. Part of that analysis is using data to back up a narrative, a story, or an assessment of a performance. For a long time this wasn’t possible, because women’s football data either wasn’t available or sat behind paywalls (think of InStat, Wyscout, Opta, and StatsBomb). This has changed recently with the introduction of women’s data on FBRef.

FBRef is a free-to-access website that uses Opta data. Since the switch to Opta, advanced data is available on the site for eight of the top competitions in women’s football. Advanced data includes expected goals metrics, passing metrics, and shot-creating actions. You can explore it yourself on the FBRef website.

What’s important when having this wealth of data is to know how to use it.

Context with football data

Data without context is useless. First of all, look critically at the data in front of you. What do you want to say? What makes this data suitable for your purpose? Take pass accuracy, one of the most frequently used metrics. What does it actually tell us? Without knowing the player’s position, the types of passes made, and where those passes were played to, we can’t really draw conclusions. So it’s very important to collect your data, clean it, and translate it into what is relevant for your research or article.
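To make the pass accuracy point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the players, the numbers, and the split into "short" and "long" passes are invented and do not come from FBRef or any other provider. It shows how two players can share the same overall pass accuracy while having very different passing profiles.

```python
# Hypothetical passing records: names, positions and numbers are made up.
# Each pass type maps to (completed, attempted).
players = [
    {"name": "Player A", "position": "CB",
     "passes": {"short": (380, 400), "long": (20, 100)}},
    {"name": "Player B", "position": "AM",
     "passes": {"short": (240, 300), "long": (160, 200)}},
]


def accuracy(completed, attempted):
    """Pass accuracy as a percentage; None when no passes were attempted."""
    if attempted == 0:
        return None
    return round(100 * completed / attempted, 1)


def pass_profile(player):
    """Overall accuracy plus accuracy per pass type for one player."""
    completed = sum(c for c, _ in player["passes"].values())
    attempted = sum(a for _, a in player["passes"].values())
    profile = {"overall": accuracy(completed, attempted)}
    for pass_type, (c, a) in player["passes"].items():
        profile[pass_type] = accuracy(c, a)
    return profile


for player in players:
    print(player["name"], player["position"], pass_profile(player))
```

Both invented players end up on 80% overall, but one gets there through safe short passes and the other through a much riskier mix. The headline number alone hides that difference.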

When you write about data and make visualisations, make sure you explain the data. Not everyone knows what expected goals means, but a short explanation makes it clear and gives the reader a contextual framework for the data.

Representation

Every data analysis you make should be representative. By that, we mean that the sample needs to be suitable for analysis. In practice, the players you analyse should:

  • Have played an equal number of minutes, or at least a minimum number of minutes
  • Play in similar positions/roles
  • Be judged on metrics relevant to their positions/roles
  • Be compared within leagues or cups of an equal level
  • Be analysed over at least one season, preferably longer

This will make your analysis meaningful and valid. Comparing a one-game overperformance to a 10-game streak, for example, will not give trustworthy results.
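The criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the players and numbers are invented, the 900-minute threshold is an arbitrary choice (roughly ten full matches), and the field names are hypothetical rather than taken from FBRef. The sketch keeps only players in the same position who have played enough minutes, then converts their totals to a per-90-minutes rate so that different playing times can be compared fairly.

```python
MIN_MINUTES = 900  # assumed threshold, roughly ten full matches

# Hypothetical season totals; not real players or real numbers.
players = [
    {"name": "Forward A", "position": "FW", "minutes": 1800, "shots": 60},
    {"name": "Forward B", "position": "FW", "minutes": 990, "shots": 44},
    {"name": "Forward C", "position": "FW", "minutes": 90, "shots": 6},
    {"name": "Defender D", "position": "DF", "minutes": 1710, "shots": 19},
]


def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    return round(total * 90 / minutes, 2)


def comparable_shots(players, position, min_minutes=MIN_MINUTES):
    """Shots per 90 for players in one position who meet the minutes threshold."""
    sample = [
        (p["name"], per_90(p["shots"], p["minutes"]))
        for p in players
        if p["position"] == position and p["minutes"] >= min_minutes
    ]
    # Highest rate first
    return sorted(sample, key=lambda row: row[1], reverse=True)


for name, rate in comparable_shots(players, "FW"):
    print(f"{name}: {rate} shots per 90")
```

Forward C has the highest rate in this made-up sample (6 shots in a single 90-minute appearance), but is left out because one match is not a representative sample. Defender D is excluded because shots are not the most relevant metric for comparing a defender with forwards.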

Advanced data available in women’s football

There are two platforms that work with Opta data and are free to use. The first is The Analyst, which has advanced data for the following leagues:

  • English WSL
  • American NWSL
  • German Bundesliga
  • Italian Serie A
  • French D1 Arkema

For a more in-depth look at statistics and data, with the option to download them, you can visit FBRef. It has data on the following competitions:

  • English WSL
  • American NWSL
  • German Bundesliga
  • Italian Serie A
  • French D1 Arkema
  • Australian A-League
  • Spanish Liga F
  • UWCL
Credit: FBRef.

Besides in-depth and advanced data, FBRef also has scout reports on individual players in those leagues, just like the one above.

Another important point is that not every league has complete data, so the data is not equally useful across leagues. Incomplete data can give a skewed view, so it’s always important to be mindful of that.

The most important thing is to be deliberate in selecting and using your data. What do you want to tell, and how is data going to support that? Being critical of data will lead to a higher-quality article and/or story.
