Customer Data Management: Data Cleansing Bureaux
Expertise on tap
With high-quality data processing moving steadily up the agenda, isn't it time you looked at what your MSP can do to help you crank your data cleansing activities up a notch, asks James Lawson?
Rather than just a provider of processing services, why not view your marketing services provider (MSP) as a centre of excellence? Data processing can be more of an art than a science and the UK's bureaux are full of experienced staff whose knowledge can help build a deeper understanding of your own customer and prospect data. The peculiarities and errors they uncover can point to underlying problems with software and business processes, or even fraud.
Hidden Attributes
"We like to see ourselves as the place to go to for advice, a little like your high street bank," says Rob Salmon, managing director of meta-morphix. "We can consult where a client may have merged two subsidiaries or acquired a new business with multiple databases, and wants to discuss how to treat the data. We can also talk about how to set up a database initially so that it can support business intentions."
A bureau is often best placed to advise on data capture priorities and database structure, such as how many address lines should be employed, whether the postcode should be in a separate field or if the title field should be limited to a drop-down list. Which variables will be needed if you want to calculate a lifetime value? Obviously there are many software vendors who are expert in this area too, but the consultative approach of vendors and MSPs can differ subtly.
"We would consult on the use of software and strategies to create and maintain a single view, and can help with routines to, for example, improve matching," says Steve Tuck, Chief Strategy Officer at data quality software vendor Datanomic. "We wouldn't give specific advice on which reference file to use in applications like tagging and would tend to refer that to one of our MSP partners. They do that all the time."
Highlighting the partnership role that many MSPs have with their clients, Scott Logie, Occam's Strategy Director, notes that his team are continually trying to optimise client procedures.
"We process around 20 campaigns a month for Sainsbury's Finance," he says. "We're very familiar with it and have speeded things up over time. We look for the optimum file format to supply so that's it ready to go as soon as they receive it with no further re-formatting."
As an example, Occam might include derived variables so that the client doesn't have to spend time calculating them, or find the exact format that the call centre being used requires for its dialler.
"We can end up being the expert in their data," says Logie, "which can be handy when the only in-house data processing expert moves on."
In a similar vein, Ai Data Intelligence will send in consultants to advise on getting the best out of in-house processing and analysis, while dbg assists with and hosts induction sessions for new staff members who join its clients' companies. This sort of consultative advice is especially valuable to novices who want to carry out standard operations like deduping and suppression where the right level of matching is key.
Given that every reference file is out of date the minute it's shipped, it's impossible to ever match every single record absolutely accurately. So what sequence of reference files do you process against and do you risk under- or over-matching?
That depends on what you are going to do with the data. For prospect use, losing a few records isn't such a big deal, so you would go for over-matching and perhaps settle for a slightly less accurate but cheaper series of processing operations using fewer reference files. But where high-value customer files are concerned, losing even one policy holder would be a very bad thing for a financial services outfit, so the opposite applies.
"Some clients would rather not change anything at all than risk introducing errors," says David Aitken, Technical Director at Data Discoveries, who consults on both in-house processing and his company's own bureau service. "You have to adjust the confidence level carefully based on the customer's requirements."
Other hard-to-answer processing questions that bureaux can help with include: How do you choose the master record when merging multiple dupes into one? How do you decide which are the most reliable fields from which to build each final merged record? And do the input files need any pre-processing to better enable accurate matching?
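There is no single right answer to the master record question, but one simple rule can be sketched as follows: take the most recently updated record as the master and fill its blank fields from older duplicates. The field names and the "most recent wins" rule are assumptions made for illustration, not a description of any bureau's method.

```python
from datetime import date


def merge_duplicates(records):
    """Merge duplicate records into one.

    Assumed rule: the most recently updated record is the master,
    and any blank field on it is filled from the next most recent duplicate.
    """
    ordered = sorted(records, key=lambda r: r["updated"], reverse=True)
    merged = dict(ordered[0])  # copy so the input record is not modified
    for other in ordered[1:]:
        for field, value in other.items():
            if not merged.get(field) and value:
                merged[field] = value
    return merged


# Hypothetical duplicates of the same customer
dupes = [
    {"name": "J Smith", "email": "", "phone": "01632 960123",
     "updated": date(2009, 5, 1)},
    {"name": "John Smith", "email": "john@example.com", "phone": "",
     "updated": date(2010, 3, 1)},
]
master = merge_duplicates(dupes)
```

A real brief might instead rank fields by source reliability, which is exactly the kind of decision the questions above are getting at.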
Aitken also notes that he can help clients interpret the processed files and understand that even the best processing has its limits. For example, name and address quality may vary widely, with perfect contacts at one end and unmatchable rubbish at the other, or a file can be unexpectedly filled with foreign or business records.
"The best and worst are always easy," says Aitken. "It's the ones in the middle that are hard. There can be a big difference between a client's requirements and what is actually possible."
"You have to balance the time and money spent processing against the maximum amount of deliverable records," agrees Logie. "Does a sale have a margin of 50p or £3000? Clients really do value our advice where a decision is not clear-cut."
Bureaux will usually deal more effectively with many classic howlers - such as "The Old Schoolhouse" being mistaken for a business address - simply because, over time, they have built up their own vast custom reference files. This means that they can spot the address wrinkles that PAF gets wrong, match to esoteric obscenities inserted by online form fillers and disaffected contact centre agents, or employ lists covering every forename under the sun.
A broad range of reference data to match against and perhaps tag from can also help them flesh out the picture of a customer and enable more accurate profiling. "They may have RFM scores but we can tell them much more about why certain ones are the best customers," says Jon Cano-Lopez, Managing Director at Ai Data Intelligence.
Bureaux that build their experience into their software are also better placed to decide which data items should be changed. It's fairly simple to spot suspect strings (as data folk call a sequence of characters) within client data that exactly match unwanted words like obscenities but it's tougher to work out if they should be flagged up.
For example, a crude matching algorithm might highlight a perfectly valid address component such as Penistone, while a better developed one would consider the context and leave it in peace.
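The difference between the two approaches can be sketched in a few lines. The crude check looks for a suspect string anywhere in the text, while the second only flags it when it appears as a whole word. The word list here uses mild stand-ins and is purely hypothetical; real bureau lists and context rules are far richer than a whole-word test.

```python
import re

# Hypothetical, deliberately mild stand-ins for a real unwanted-word list
SUSPECT_WORDS = {"hell", "damn"}


def crude_flag(text: str) -> bool:
    """Flag the text if a suspect string appears anywhere, even inside a word."""
    lowered = text.lower()
    return any(word in lowered for word in SUSPECT_WORDS)


def contextual_flag(text: str) -> bool:
    """Flag the text only if a suspect string appears as a whole word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(token in SUSPECT_WORDS for token in tokens)
```

Here "12 Shelley Road" trips the crude check but passes the whole-word one, in the same way that a valid place name should be left alone.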
"Names like Tony and Anthony, Liz, Beth and Elizabeth on their own relate to different people but, combined with other information, can help limit customer complaints," notes Darren Wall, Head of Data and Bureau Services at dbg. "If Tony asks not be mailed, I am sure Anthony requires the same."
Bureau can also spot, rectify and then tell their clients about other classic problems, such as inconsistent use of address lines. Good software will not only look at the match between the first address line in one file and the second in another but also the first to the second or third, and so on.
"Many in-house software solutions are capable of this but few will also allow for client-specific anomalies such as the words 'never mail' that may appear as the first address line," adds Wall.
Corruption in supplied files is another typical problem. This can happen for many reasons, a common one being Excel's tendency to strip out the leading zero from records - something that can play havoc with telephone numbers. Again based on hard-won experience, sophisticated bureau software can also search for phonetic mistranslations and typing errors, which are also common in call centre-derived files.
Default values entered by lazy or harassed staff are perhaps the most popular "spot" of all. Even if it is not immediately clear to the client, a good operator should spot that most of the customer base was born on the first of January. "Be suspicious if 50% of your database contains whatever the first option was for staff," notes Salmon.
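As an illustration of the leading-zero problem, the sketch below repairs telephone numbers that have been stored as numbers. It assumes UK national numbers that should be 11 digits long and begin with 0, so a 10-digit value with no leading zero is taken to have lost one. That rule is an assumption for this example and would not hold for every number range.

```python
def normalise_phone(raw) -> str:
    """Return the digits of a phone number, restoring a stripped leading zero.

    Assumption: a 10-digit value that does not start with 0 is a UK national
    number whose leading zero was dropped by a spreadsheet.
    """
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    if len(digits) == 10 and not digits.startswith("0"):
        return "0" + digits
    return digits
```

The more robust fix is upstream: import such columns as text so the zero is never lost in the first place.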
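Salmon's rule of thumb lends itself to a quick check: count the most common value in a field and raise a flag if it accounts for half or more of the filled entries. The function name is hypothetical and the 50% threshold is taken from the quote above; the sample dates are made up.

```python
from collections import Counter


def dominant_value(values, threshold=0.5):
    """Return (value, share) if one value fills at least `threshold` of the
    non-blank entries in a field, otherwise None."""
    filled = [v for v in values if v not in (None, "")]
    if not filled:
        return None
    value, count = Counter(filled).most_common(1)[0]
    share = count / len(filled)
    return (value, share) if share >= threshold else None


# Made-up date-of-birth column where a default has crept in
dobs = ["01/01/1970"] * 6 + ["12/03/1982", "05/11/1975", "23/07/1990", "30/09/1968"]
suspect = dominant_value(dobs)  # ("01/01/1970", 0.6)
```

A flagged field is a prompt to look at the input screen and the staff process behind it, not proof of an error.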
So what can or should the client do when errors are pointed out to them? A very common recommendation is the need to install addressing software. This enables agents to carry out swift and right-first-time input of postally correct details either from a supplied street, post town and house number, or more usually, from a postcode and house number.
Field autofilling means that operator keying errors are massively reduced and good software will also pop up possible address matches as the operator enters more data until only one unique address remains - so minimising the time to input each one.
"Overall name and address quality has improved a lot," states Cano-Lopez. "I think that's partly down to the increased use of rapid addressing."
Other errors, like entering default values in order to process a purchase more quickly, imply a need for more staff training and perhaps a look at how user friendly the data input process is. Overly complex sets of code are often ignored by staff, whether at retail sales points or in a mailing house.
"Simplification of codes makes it easier for data handlers to pick up the relevant information," explains Logie.
Being ambitious with initial data collection and asking too much of staff and customers can mean ending up with a worthless record. Better to follow a "drip, drip" approach where, for example, only the basic name, address and email are captured first time around and staff gather extra data items during subsequent transactions. The same applies in spades online.
Duplicates are where much of the valuable intelligence found in data processing lies. It's common to find duplicates when merging transactional data from postal, web and phone channels. Is this a collection of single buyers at the same address who buy through different channels or are they the same person?
"Multibuyers" are highly prized as they tend to be loyal and high spenders. Therefore it makes sense to flag them up and, if possible, treat them differently to maximise their contribution.
With files like NCOA and GAS Reactive now widely available, finding new addresses for goneaway customers and prospects is a lot easier than it used to be. But what about when a new address for a goneaway ends up as a duplicate of an existing record?
This could just be because an existing customer has already started transacting from their new address - in which case the record histories should be merged as this looks like a very loyal customer - or there may be a more sinister explanation.
"By using files intelligently in combination, we can spot where someone moves regularly," says Salmon. "That might potentially be a 'fraud string' where someone moves regularly to avoid detection."
However, he adds that there are also many innocuous reasons for one individual transacting from multiple addresses. Where someone is buying alternately from two or more addresses, they may have just moved - or might indeed own two or even more houses. Perhaps they have a holiday house or chalet that they hire every year over Christmas and want food and drink delivered.
Another interesting finding is where an order comes from a name and address listed on a deceased reference file. Besides paranormal activity, this can also indicate fraud - or it may be that a widow or widower is simply using their dead partner's account.
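A simple way to surface the pattern Salmon describes is to count how many times each individual changes address within a set period. The sketch below flags anyone with three or more moves inside a year. Both figures are hypothetical, and given the innocent explanations above, a flag should only ever trigger a closer look.

```python
from collections import defaultdict
from datetime import date


def frequent_movers(moves, min_moves=3, window_days=365):
    """Return the ids of people with at least `min_moves` address changes
    inside any `window_days` period.

    `moves` is an iterable of (person_id, move_date) pairs.
    """
    by_person = defaultdict(list)
    for person_id, move_date in moves:
        by_person[person_id].append(move_date)

    flagged = set()
    for person_id, dates in by_person.items():
        dates.sort()
        # Slide over each run of `min_moves` consecutive moves
        for i in range(len(dates) - min_moves + 1):
            if (dates[i + min_moves - 1] - dates[i]).days <= window_days:
                flagged.add(person_id)
                break
    return flagged


# Made-up move history
history = [
    ("A", date(2010, 1, 10)), ("A", date(2010, 4, 2)), ("A", date(2010, 8, 20)),
    ("B", date(2005, 6, 1)), ("B", date(2010, 2, 1)),
]
to_review = frequent_movers(history)  # {"A"}
```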
The main thing is to spot these anomalies and cross-reference them with any other data, or more likely, to simply make an outbound call or two to find out what's going on.
Learn From Errors
Although taking responsibility for your own data is essential, the subtle indicators to be found within the warp and weft of customer files make a strong argument for choosing to work with an expert bureau: few in-house software packages are able to question why a commonly-used stop file has been omitted from a processing brief or why a particular pattern of duplicates might exist in the input file.
"You can take the learnings from processing back to the business," concludes Salmon. "At the very least, you don't want to be correcting the same errors over and over and over again."
Article by James Lawson from Database Marketing, Sep 2010