Here's an extra (Mal)teaser: what have chocolate bars got to do with data quality?
Think about a factory line making your favourite chocolate bar. Quality assurance is a key part of every step. If the product is faulty then no one will eat it, and the company won't make any money.
They know they have to invest to save. Not only in the machines on the production line, but in the raw ingredients too. They could have the most reliable, efficient and best factory machines, but if the raw ingredients contained bits of hair or sand, or lumps of unmixed fat, then they would waste money because they'd have a terrible product at the end. The machines help, but you have to make sure the quality of your raw ingredients is right.
So, what’s the link be-Twix-t chocolate bars and data quality?
Replace the ‘factory machines’ with IT systems and the ‘raw ingredients’ with data. If I can stretch the analogy a little further, reactive data fixing is like paying someone to pick the bits out of the finished chocolate bars!
You wouldn’t eat a poor quality chocolate bar…
...so why accept poor quality data? We all consume data. It fuels our understanding of the environment and the decisions we make. Data is vital; we use it to allocate resources, make regulatory decisions, drive environmental improvements and look after our staff, to name just a few uses. Chocolate is a luxury but data is not! Approaching data quality in the right way matters.
As in a chocolate factory, we need to be clear about the quality of the product we are producing and to build in checks along the way to make sure we're on track. If we don't check our data we won't know its quality, and we won't know it's wrong until it's too late.
When errors are discovered we often default to cleansing them in an ad hoc and resource-intensive way. Lo and behold, the quality (and fitness for purpose) of the data begins to drop again straight after the 'fixes' are applied, because we don't get to the root cause of the problem. Research from the wider data community suggests that, on average, organisations waste 15–18% of their overall budget dealing with data issues. That's a huge amount of wasted money and effort.
The manufacturing world wouldn’t accept such waste, so why accept it with our data?
Our challenge is to change our approach to data quality from reactive fixing to proactive changes.
As we publish more of the information we hold as Open Data, we must become comfortable with others using our data. Some of our datasets are better than others, both in the quality of the data and in the quality of the processes that support them. We may worry about the quality of the data and about others using it without understanding its detail. But as we share it more widely, we shouldn't Wispa it: our customers need to understand our data's strengths and limitations, so that they can make better use of it.
What can you do?
If you look after data: It needn't be a 'rocky road'. Our approach to proactively monitoring data quality, called Data Quality Action Plans (DQAPs), is based on what you need the data to do. These plans help you identify the root causes of your data quality issues and give you an evidence base to support putting effective, long-lasting fixes in place.
If you don't look after data directly: We all make and use data, whether it's environmental data, time recording or expenses, and we are all responsible for its quality. Remember, there's no substitute for getting it right first time, and we all have a responsibility to input accurate data. If you have a concern about the data you are inputting or using, raise it. Y'or-kie* to saving time and effort in the long run!
*You are key. It's stretching it a bit, admittedly.