Sport Science Data Sets

Introduction

The Society for Transparency, Openness, and Replication in Kinesiology (STORK) recently called on sport science researchers to raise the standards of research in the field. STORK has asked researchers to be forthright about their methodologies and the reporting of their results, and to provide accessible data sets. Many researchers have heeded the call and, in particular, released the data sets used in their research.




Greater accessibility to data sets gives independent researchers the opportunity to rerun the analyses outlined in publications and to run new analyses that may provide new insights. The ability to work with real data sets also invites innovation within the field of sport science. For example, students working with actual human data will be exposed to messy data, incorrect inputs, and inefficient data collection techniques. Students will also learn how to clean the data, transform it when necessary, and improve reporting. Doing so can bring to light the need to learn the nuances of statistical analysis and to improve how we record data. Moreover, if more students graduate with a comprehensive understanding of how to clean and analyse data and effectively report outcomes, they will unwittingly push sport scientists to consider new or more advanced modelling approaches. That is, if the current standard of practice is to report descriptive statistics, assess normality, and report correlations, researchers may be avoiding the use of, for example, resampling techniques. An influx of young researchers who can perform t-tests and compute correlation matrices can give researchers the opportunity to consider different approaches to their research questions.
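To make the mention of resampling techniques more concrete, here is a minimal sketch of a percentile bootstrap confidence interval for a correlation, using only the Python standard library. The variable names (jump_height, sprint_time) and the data are synthetic and purely illustrative; they do not come from any of the data sets discussed in this post, and the function names are my own.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def bootstrap_ci(x, y, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Pearson's r."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        # Resample (x, y) pairs with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # Skip degenerate resamples where r is undefined (zero variance).
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        estimates.append(pearson_r(xs, ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Synthetic, illustrative data only (assumption: 30 hypothetical athletes).
data_rng = random.Random(42)
jump_height = [data_rng.gauss(35, 5) for _ in range(30)]
sprint_time = [4.0 - 0.02 * h + data_rng.gauss(0, 0.1) for h in jump_height]

r = pearson_r(jump_height, sprint_time)
ci_lower, ci_upper = bootstrap_ci(jump_height, sprint_time)
print(f"r = {r:.2f}, 95% bootstrap CI = [{ci_lower:.2f}, {ci_upper:.2f}]")
```

The percentile method is only one of several bootstrap intervals, and it was chosen here because it is the simplest to read. With a real data set, you would replace the synthetic lists with the two columns of interest.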

This post is an example of how students can use data sets to rerun analyses and assess whether their results match the published ones. The Discussion section outlines how the results in the post are close to, but do not match, those of the original research and hypothesises why this might be the case.
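One simple way to check whether a rerun analysis matches a publication is to ask whether each recomputed statistic agrees with the reported value at the precision the paper reported. The sketch below assumes values reported to two decimal places; the function name and the numbers are hypothetical placeholders, not results from any real study.

```python
def matches_reported(recomputed, reported, decimals=2):
    """Return True if the recomputed value agrees with the reported value
    to within half a unit of the last reported decimal place."""
    tolerance = 0.5 * 10 ** (-decimals)
    # The tiny extra margin guards against floating-point noise.
    return abs(recomputed - reported) <= tolerance + 1e-12


# Hypothetical placeholder values: (recomputed, reported in the paper).
comparisons = {
    "mean": (27.314, 27.31),
    "standard deviation": (4.862, 4.90),
    "correlation": (0.4449, 0.44),
}

for name, (recomputed, reported) in comparisons.items():
    status = "matches" if matches_reported(recomputed, reported) else "differs"
    print(f"{name}: recomputed {recomputed}, reported {reported} ({status})")
```

A statistic flagged as "differs" is not necessarily an error in the paper. It may point to different data cleaning steps, software defaults, or rounding conventions, which is exactly the kind of thing worth investigating.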

Results

The table below lists a handful of data sets that were publicly released on sites such as Mendeley Data, OSF, and figshare. They cover various areas of sport and exercise science, including player profiling, gait mechanics, and aging.

Over time, the table will reference a greater number of data sets. Feel free to keep an eye on this post. Alternatively, you can sign up to the newsletter and receive the occasional email with updates.

Conclusion

Access to publicly available data sets can drive the field of sport and exercise science forward by exposing young researchers to the kinds of data sets they might encounter in their academic careers. Doing so can improve the robustness of statistical methods and the reporting of results, and give researchers opportunities to explore new and innovative modelling techniques in future research.


