Cleaning Catapult Data

A Simple Script to Clean Catapult Data

The Problem

When working with raw Catapult GPS or IMU data, you need to handle the first few lines of metadata, which include info like the date, time, and location of the session. This information is less structured than you would normally find, being written in point form. Below it are properly formatted variables and their observations.

Getting Started

Step 1 is to load your packages and understand the data format.

Packages Loaded: tidyverse for clean code & janitor to handle variable names
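
A minimal sketch of this first step, assuming both packages are installed. The file name catapult_export.csv is a hypothetical placeholder, not a name from the original post. Printing the first few lines as plain text is one way to see where the metadata ends and the real header begins.

```r
library(tidyverse)  # loads readr, dplyr, and friends
library(janitor)    # provides clean_names()

# Hypothetical path: replace with the location of your own raw export
raw_path <- "catapult_export.csv"

# Read the first 10 lines as plain text (no parsing) to inspect the layout
if (file.exists(raw_path)) {
  print(read_lines(raw_path, n_max = 10))
}
```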

As you can see, it’ll be much more efficient to remove the first 8 rows rather than work around them.

The skip Argument

read_csv() from the readr package (loaded with the tidyverse package) has a skip argument that skips a given number of lines at the top of the file before reading the data set.

Skip the first 8 rows to return a data set that is easier to work with.
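
The skip step can be sketched as below. So that it runs on its own, the example writes a small made-up file to a temporary location: 8 placeholder metadata lines followed by a header and two rows. The column names and values are illustrative assumptions, not Catapult's actual export layout. With your own data, pass the path to your raw file instead.

```r
library(readr)

# Made-up stand-in for a raw export: 8 metadata lines, then header + data.
# Contents are illustrative only, not the real Catapult format.
mock_file <- tempfile(fileext = ".csv")
write_lines(
  c(
    paste("Metadata line", 1:8),
    "Timestamp,Heart Rate,Velocity",
    "0.0,120,1.5",
    "0.1,121,1.7"
  ),
  mock_file
)

# skip = 8 drops the metadata, so the 9th line is read as the header row
session <- read_csv(mock_file, skip = 8)
session
```

If your export has a different number of metadata lines, adjust the value of skip to match.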

Easily Clean Variable Names

The janitor package can be used to handle variable names. R isn’t a fan of names with spaces like Heart Rate, which have to be wrapped in backticks every time you use them. Using janitor::clean_names() replaces all spaces with an underscore and converts all uppercase text to lowercase. The end result is a simple variable name that R can handle, and you don’t have to worry about which letters are capitalized.
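
A short sketch of the renaming step on a small made-up table. The column names here are assumptions chosen to resemble a raw export, not taken from a real file.

```r
library(tibble)
library(janitor)

# Made-up table with spaces and capitals in the column names
raw <- tibble(
  `Heart Rate`  = c(120, 121),
  `Player Load` = c(0.5, 0.7),
  Velocity      = c(1.5, 1.7)
)

# clean_names() converts the names to lowercase snake_case
cleaned <- clean_names(raw)
names(cleaned)
# "heart_rate" "player_load" "velocity"
```

In a pipeline, the same call can follow the import directly, for example read_csv(...) %>% clean_names().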

