

Data Cleansing Mistakes you Need to Avoid

Data is a hot topic at the moment, which is unsurprising given the major changes in legislation we have had to adhere to, as well as the fines and breaches we have seen over the past 12 months. For many businesses, data is what makes the world go around, but the way it is handled has recently come under major scrutiny from regulators, heightening the importance of toeing the line.

When done correctly, data cleansing is a prime way of managing your company databases. Merging your disparate records, eliminating unwanted entries, and consolidating everything into a single view will help your business remain compliant with regulations such as the GDPR, as well as making it more efficient.

However, cleaning up your company’s database isn’t as simple as choosing any old software. There are a number of mistakes which you MUST avoid in order to maintain your business’s success throughout 2019.


1. Buying Cheap Data

This first mistake is a big no-no. There are many pros and cons to buying data in order to reach out to prospects. The key is to sift the legitimate companies trying to sell you data from the non-legitimate ones. According to the DMA, you should avoid firms that offer you thousands of records for pennies, and instead work with companies that will help you to select targeted prospects.

This makes sense: cheap data may not be of the highest quality and might result in a very low match rate. That means wasted expenditure and a list of bad contacts that you won’t be able to get any use out of.

2. Not Matching Your Database Against Suppression Files

Did you know that there are 3,000 changes made to people’s personal information on a daily basis? Running your database against a series of suppression files is an integral part of the data cleansing process. Suppression files identify who in your database has either died or moved address.

This is extremely important when it comes to your marketing campaigns. For example, contacting those in your database who have sadly passed away will upset their bereaved relatives and friends. Equally, if your customers have moved house and their new addresses have not been traced, this will likely result in a loss of customers and revenue.


Additionally, by not suppressing your data you run the risk of mailing to those registered with marketing preference services. Consumers and businesses have the option to register with the MPS, TPS, and CTPS in order to limit the amount of marketing that they receive, and if you contact customers registered with these services then you could face a hefty fine.
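
To make the idea concrete, here is a minimal sketch of how a suppression check might work. It is an illustration only: the field names (`name`, `postcode`, `reason`), the sample records and the simple exact-key comparison are all assumptions, and real suppression files and matching rules are considerably more involved.

```python
def normalise(name, postcode):
    """Build a comparison key: lower-cased name, postcode without spaces."""
    return (" ".join(name.lower().split()), postcode.replace(" ", "").upper())


def apply_suppression(contacts, suppression):
    """Split contacts into (mailable, suppressed) using an exact key match."""
    # Hypothetical layout: each suppression row carries a 'reason' such as
    # 'deceased', 'moved' or 'preference service'.
    reasons = {normalise(r["name"], r["postcode"]): r["reason"] for r in suppression}
    mailable, suppressed = [], []
    for contact in contacts:
        reason = reasons.get(normalise(contact["name"], contact["postcode"]))
        if reason is None:
            mailable.append(contact)
        else:
            suppressed.append({**contact, "reason": reason})
    return mailable, suppressed


# Made-up sample data, for illustration only.
contacts = [
    {"name": "Ann Example", "postcode": "AB1 2CD"},
    {"name": "Bob Sample", "postcode": "EF3 4GH"},
    {"name": "Cat Placeholder", "postcode": "IJ5 6KL"},
]
suppression = [
    {"name": "bob  sample", "postcode": "ef34gh", "reason": "deceased"},
    {"name": "Cat Placeholder", "postcode": "IJ5 6KL", "reason": "moved"},
]

mailable, suppressed = apply_suppression(contacts, suppression)
```

In this sketch only the first contact stays on the mailing list, while the other two are held back along with the reason they were flagged.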

3. Failing to Dedupe Your Database

Business data decays on average by 37% each year, and consumer data by 13%. In addition to this natural decay, you may also have captured duplicate details through the various touch points your customers have with your business. When it comes to cleaning your database, it’s imperative that you choose a solution that dedupes your records as well as suppressing them.

Failure to do so may result in misdeliveries if you hold multiple records for certain customers, containing both their current and previous addresses. Additionally, according to the Royal Mail Insight Report, duplicate customer records are likely to cause breaches when it comes to using permissions accurately, honouring the right to be deleted, or responding to a subject access request (SAR).
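
As a rough illustration of deduping, the sketch below keeps one record per customer and prefers the most recently updated one, so that a previous address does not survive alongside the current one. The key (name plus email), the `updated` field and the sample records are assumptions made for the example, not a description of how any particular product works.

```python
from datetime import date


def dedupe(records):
    """Keep one record per (name, email) key, preferring the newest 'updated' date."""
    best = {}
    for rec in records:
        key = (" ".join(rec["name"].lower().split()), rec["email"].strip().lower())
        if key not in best or rec["updated"] > best[key]["updated"]:
            best[key] = rec
    return list(best.values())


# Made-up sample data: the first two rows are the same customer captured twice.
records = [
    {"name": "Ann Example", "email": "ann@example.com",
     "address": "1 Old Road", "updated": date(2017, 5, 1)},
    {"name": "ann example ", "email": "Ann@Example.com",
     "address": "9 New Street", "updated": date(2018, 11, 3)},
    {"name": "Bob Sample", "email": "bob@example.com",
     "address": "4 Mill Lane", "updated": date(2018, 2, 14)},
]

unique = dedupe(records)
```

Here the two "Ann Example" rows collapse into one, keeping the newer address. Keeping the newest record is only one possible rule; you may prefer to merge fields or to send conflicts for manual review.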

4. Accepting a Fuzzy Match

If your business is comparing data cleansing providers, then you need to make sure you analyse the results carefully. Whilst one company may say that they have matched your customers’ addresses against the Royal Mail PAF at 97%, another may report only 93%. The latter is not necessarily wrong, however, so it is very important that you ask the cleansing companies to compare and analyse the difference.

There is always the question of fuzzy matching versus rules-based matching: the provider with the higher match rate may simply have assumed that the addresses in your database are correct when in fact they are not. For example, according to statistics from Zoopla published by the BBC, there are 2,431 roads named High Street, followed by Station Road in second place (1,929), Church Lane in third place (1,547), and lastly Church Street (1,404).


Most towns are large enough to have several districts containing these street names, so it is wrong to make any assumption without rules-based matching on the district name, postcode, or similar. A fuzzy match generally looks good, but it rests on an assumption which can go very wrong. Hopewiser pride themselves on using only rules-based matching in order to give the most accurate possible match rate.
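
The difference can be shown with a small sketch. Everything in it is hypothetical: the two-row reference list stands in for a real address file, the fuzzy matcher uses a generic string similarity score from Python's standard library, and the rule (street and postcode must both agree) is just one example of a rule. It does not represent Hopewiser's or any other provider's actual method.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a reference address file.
REFERENCE = [
    {"street": "High Street", "district": "Northtown", "postcode": "NT1 1AA"},
    {"street": "High Street", "district": "Southtown", "postcode": "ST2 2BB"},
]


def fuzzy_match(record, threshold=0.8):
    """Return the first reference entry whose street name is similar enough."""
    for ref in REFERENCE:
        score = SequenceMatcher(
            None, record["street"].lower(), ref["street"].lower()
        ).ratio()
        if score >= threshold:
            return ref  # assumes the first similar street is the right one
    return None


def rules_match(record):
    """Match only when street and postcode both agree after normalising."""
    def norm(text):
        return "".join(text.lower().split())

    hits = [
        ref for ref in REFERENCE
        if norm(ref["street"]) == norm(record["street"])
        and norm(ref["postcode"]) == norm(record.get("postcode", ""))
    ]
    # Refuse to guess: no match unless exactly one entry satisfies the rule.
    return hits[0] if len(hits) == 1 else None


record = {"street": "High Street", "postcode": "ST2 2BB"}
fuzzy_result = fuzzy_match(record)
rules_result = rules_match(record)
no_postcode_result = rules_match({"street": "High Street"})
```

In this sketch the fuzzy matcher happily returns the Northtown entry because the street name alone looks right, while the rules-based matcher returns the Southtown entry that the postcode points to, and returns no match at all when the postcode is missing. The fuzzy approach would score a higher match rate on such data, but some of those matches would be wrong.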

We hope that you have found this blog helpful and that you now know what mistakes to avoid when it comes to cleaning your database. Data cleansing is vitally important to your business, and using a reputable provider can help you to get a better return from your data.

Updated 22nd February 2023.