Data cleaning and merging involves transforming your data into a format that’s ready to use for analysis.
It’s likely that the data you need for analysis is held in different places. You might have a spreadsheet with participants’ group assignments (whether they are in the intervention or the control group) in one place, and a spreadsheet of outcomes (eg, whether or not they applied for childcare) in another. Merging means ensuring that all this information is in one place. You may do something similar to this in your local authority already.
How to clean and merge your data
Cleaning and merging your data involves the steps below:
Cleaning
- Remove any duplicates
- Make sure the values of each variable (eg, parent names) are written in a consistent format
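If you prefer to work in code rather than a spreadsheet, the two cleaning steps above could be sketched in Python roughly as follows. The records, the column names (parent_name, group) and the helper functions are made up for illustration and are not taken from the guides.

```python
# Hypothetical participant records; names and columns are assumptions for illustration.
participants = [
    {"parent_name": "  jane SMITH ", "group": "Intervention"},
    {"parent_name": "Jane Smith", "group": "Intervention"},
    {"parent_name": "Omar  Ali", "group": "Control"},
]


def clean_name(name):
    """Trim spaces, collapse repeated spaces and use consistent capitalisation."""
    return " ".join(name.split()).title()


def clean_records(records):
    """Standardise names first, then drop rows that are exact duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        tidy = dict(row, parent_name=clean_name(row["parent_name"]))
        key = tuple(sorted(tidy.items()))  # a hashable fingerprint of the whole row
        if key not in seen:
            seen.add(key)
            cleaned.append(tidy)
    return cleaned


cleaned_participants = clean_records(participants)
```

The formatting step runs before the duplicate check so that two spellings of the same name are recognised as the same person. Simple capitalisation rules will not suit every name, so it is worth checking the cleaned names by eye.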
Merging
- Decide which variable will be used for merging
- Clean the merging variable in both datasets, if you haven’t already
- Perform the merging itself
- Check that everything worked as expected
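As a rough illustration, the merging steps above might look like this in Python. It assumes a hypothetical participant_id column is the merging variable and that each ID appears only once in each table. The data and function names are invented for the example.

```python
# Hypothetical data: group assignments and outcomes held in two separate tables.
assignments = [
    {"participant_id": "P001", "group": "Intervention"},
    {"participant_id": "P002", "group": "Control"},
    {"participant_id": "P003", "group": "Control"},
]
outcomes = [
    {"participant_id": "p001 ", "applied": "Applied"},
    {"participant_id": "P002", "applied": "Did not apply"},
]


def clean_id(value):
    """Clean the merging variable: remove stray spaces and use upper case."""
    return value.strip().upper()


def merge_on_id(left, right, key="participant_id"):
    """Attach the matching row from `right` to each row in `left`."""
    lookup = {clean_id(row[key]): row for row in right}
    merged, unmatched = [], []
    for row in left:
        row_id = clean_id(row[key])
        match = lookup.get(row_id)
        if match is None:
            unmatched.append(row_id)  # keep a note of rows with no partner
            continue
        merged.append({**row, **match, key: row_id})
    return merged, unmatched


merged, unmatched = merge_on_id(assignments, outcomes)

# Check that everything worked as expected: no row should be lost silently.
assert len(merged) + len(unmatched) == len(assignments)
print(len(merged), "rows merged; no outcome found for:", unmatched)
```

In this made-up example, participant P003 has a group assignment but no outcome, so it is reported as unmatched rather than dropped without warning. A list like this is a useful prompt to go back and check the original spreadsheets.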
Re-coding your variables
- Convert text variables such as ‘Applied / Did not apply’ into 0s and 1s so that Excel can use them in the analysis.
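The same re-coding can be sketched in Python. The choice of 1 for ‘Applied’ and 0 for ‘Did not apply’ is an assumption made for this example; whichever coding you choose, apply it consistently and keep a note of it.

```python
# Assumed coding for illustration: 1 = applied, 0 = did not apply.
RECODE = {"applied": 1, "did not apply": 0}


def recode_applied(label):
    """Turn a text outcome into 0 or 1, ignoring case and extra spaces."""
    key = " ".join(label.split()).lower()
    if key not in RECODE:
        # Stop on unexpected values instead of guessing how to code them.
        raise ValueError(f"Unexpected value: {label!r}")
    return RECODE[key]


responses = ["Applied", "Did not apply", "applied "]
coded = [recode_applied(response) for response in responses]
```

Raising an error on an unexpected value (for example a blank cell or a typo) means the problem is caught at the cleaning stage and not carried into the analysis.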
We have produced two short guides to walk you through this process. You can download them below. Begin with the Word guide, which explains the process, and then practise using the Excel guide:
You can also watch the short video below, which walks you through how to do it.