Tips to Minimize the Impact of Data Quality Issues
The most dreaded conversation a digital analytics professional has to face begins when an executive, printed report in hand, asks, “Why don’t these numbers look right?” Quick as a cat, the analyst has to gain her bearings and decide whether the data is really the problem or the executive just misunderstands the metrics. More often than not, the numbers are indeed wrong, and in the worst-case scenario, this is the first time the analyst is hearing about it.
Sadly, this worst-case scenario happens a lot. It is a common refrain in a three-act tragedy that we analytics professionals keep reenacting throughout our careers. Let me break down the play:
1. As a prelude, our hero has worked hard for months or even years to gather the right data from the company’s digital properties via a structured system of mobile app and website tags.
2. She has also spent enormous effort persuading the executive team to review website tracking reports at all, since “data-driven decisions” is a rote phrase that every exec has learned to recite but few have made part of their actual routine.
3. Cut to the scene above, and the rest of the play is the fallout. The ironic twist is that the discovery of bad or missing data actually serves to dissuade executive sponsors from trusting these reports going forward, worsened by the fact that the gap or anomaly lives forever in the organization’s reporting system. So even if she manages to fix it quickly, our analytics guru will typically have to keep explaining the blip for weeks and months to come. (She might also start worrying about her job security.)
The obvious remedy is to always push your tracking requirements through the same QA rigor that the visible features of your site enjoy. Sadly, few companies make this a priority, since data quality usually takes a back seat to the user experience. A “we can fix that stuff later” attitude often kicks in when deadlines loom and the stakes are high.
Tip #1: Break Glass in Case of Emergency
Some analytics vendors have a bail-out plan to manage reporting disasters when they occur. Adobe Analytics, for instance, introduced Processing Rules a few years ago, letting you fix broken messages sent by your website tags to Adobe on the fly. I always urge people to use tools like this sparingly, and only as a temporary bandage to get their broken analytics by while robust fixes are pushed through the standard development process. I also warn them to be really careful as they apply such rules, because a single imperfect rule can do vastly more damage than the issue it was meant to fix in the first place.
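Processing Rules are configured in the Adobe Analytics admin interface, not in code, but their if/then logic can be sketched as a tiny function. The rule below is an illustrative model only (the variable and context data names are made up, and real rules live in the admin UI): if a hit arrives with a blank page name, patch it from a context data value the tags did manage to send.

```python
def apply_rule(hit):
    """Model of a processing rule: if pageName arrives blank, copy it
    from a context data variable set by the tags.
    (Illustrative sketch only; real rules are built in the admin UI.)"""
    if not hit.get("pageName"):
        hit["pageName"] = hit.get("contextData", {}).get("page", "unknown")
    return hit

# A broken hit from misfiring tags: pageName was never populated.
broken_hit = {"pageName": "", "contextData": {"page": "checkout:step1"}}
print(apply_rule(broken_hit)["pageName"])  # checkout:step1
```

Note that the rule only fires when `pageName` is empty. Drop that condition and the rule would clobber every healthy hit too, which is exactly how a single imperfect rule ends up doing more damage than the original issue.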
But you shouldn’t wait until disaster strikes to start researching the toolsets your vendor provides for emergencies. Access to the Processing Rules I just mentioned is blocked even to account administrators until they pass a short exam and certify with the Adobe Analytics account management team, so don’t think you can just Google the instructions on the morning you wake up to find your tags misfiring and fix it all on the spot. Imagine a team of firefighters arriving at your burning building and then pulling out a manual to find out what all of the buttons and switches on the side of the fire truck are for. Take the time now to prepare a response plan for the data quality emergencies that will inevitably arise.
Tip #2: Honk If You’re a Robot
Over my 10-year career as a web analyst, I’ve typically focused my pre-release testing on the things we were adding or changing, with just a token glance at the structure of data that had been deployed and tested in the months and years before. More than once I’ve been burned for not performing full regression tests on analytics, but honestly, who has the time and inclination to do all of that tedious checking? Enterprise websites are simply too big, and their tracking requirements too complex, for each and every analytics tag to be checked manually and then re-checked before every release. Not to mention that debugging analytics data requires skills in short supply at most companies, and folks like us usually wear more than one hat and can be stretched pretty thin. And absolutely not to be mentioned is the fact that we’re sometimes lazy. Which is precisely why we should be delegating this work to robots that will quietly and continually verify that the tracking is in place, now and in the future. ObservePoint is, of course, one of the vendors of this tag-checking robotic technology.
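The kind of check such a robot runs on every page can be sketched in a few lines. This is a minimal, hypothetical example, not ObservePoint’s actual product or API: scan a page’s source for the analytics library and the variables your tracking spec requires (the snippet names and patterns below are illustrative only).

```python
import re

# Hypothetical tracking spec: every page must load the analytics library
# and set these variables. Names and patterns are illustrative only.
REQUIRED_SNIPPETS = {
    "library": r'src="[^"]*AppMeasurement\.js"',
    "pageName": r's\.pageName\s*=',
    "channel": r's\.channel\s*=',
}

def audit_page(html):
    """Return the list of required snippets missing from the page source."""
    return [name for name, pattern in REQUIRED_SNIPPETS.items()
            if not re.search(pattern, html)]

sample = '''
<script src="/js/AppMeasurement.js"></script>
<script>s.pageName = "home";</script>
'''
print(audit_page(sample))  # ['channel'] -- the sample never sets s.channel
```

Point a crawler at your sitemap, run a check like this on every URL on a schedule, and the robot flags the missing `s.channel` the day it breaks instead of the day an executive notices.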
Last week I was walking a marketing executive through the process of verifying a series of analytics tags. She asked whether it wouldn’t be easier just to send it to a QA team to check. Maybe the first time, I told her. But multiply that single, manual effort by the number of development releases you foresee over the coming months and years. Then work into that equation a periodic bump for every new tagging requirement that comes along as your company’s sites and measurement needs evolve. I don’t even know how to do that math, but I know it’s a big number, and I also know that I would run out of steam after just two or three iterations. Testing the same things over and over just won’t hold my attention for very long, no matter how bad it is for me professionally to have to manage yet another data quality crisis.
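That back-of-the-envelope math can be made concrete. With purely hypothetical figures (a biweekly release cadence, a one-day manual audit, and a handful of new tagging requirements per year), the yearly cost of hand-checking compounds quickly:

```python
# Hypothetical figures for the cost of re-checking tags by hand.
hours_per_manual_audit = 8        # one full regression pass
releases_per_year = 26            # biweekly deployment cadence
new_requirements_per_year = 6     # each adds roughly an hour to the audit

total_hours = 0
audit = hours_per_manual_audit
for release in range(1, releases_per_year + 1):
    total_hours += audit
    # Periodic bump: every few releases a new tagging requirement lands,
    # permanently lengthening every future audit.
    if release % (releases_per_year // new_requirements_per_year) == 0:
        audit += 1
print(total_hours)  # 280 hours of manual checking in year one alone
```

Seven working weeks a year of repetitive checking, under generous assumptions, is exactly the kind of workload that belongs with the robots.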
Tip #3: Apply an Ounce of Prevention
My first two tips have described how to catch and manage data anomalies as soon as they appear in production. But why should they appear there at all? Our primary goal shouldn’t be to minimize the impact of disasters, but to prevent them from occurring in the first place.
A few months into my first consulting job after leaving Omniture, my boss told me how frustrating it was that all of the reports he pulled needed to be presented with asterisks. What he meant was that, after we successfully identified and resolved a series of data problems, he was forced to remember that this Visits report couldn’t bridge Date Range X, or that Revenue attribution couldn’t be trusted in Month Y, and so on.
Test-driven development has become standard practice in software engineering. It’s high time analytics architects got on board. The same robotic minions that are available for production environments can be unleashed upon staging environments. This requires rigor, and a full-frontal assault on the mindset I described at the beginning of this article, where analytics issues are not considered show-stoppers during the QA-release phase. The reward for such discipline is that your analytics reports will finally mean what they say, period (vs. asterisk).
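A pre-release analytics test can be as simple as capturing the tracking calls a staging build fires and asserting that each one carries the variables your spec requires. A minimal sketch, assuming a hypothetical required-variable list and a beacon URL captured from a proxy or browser tooling (the variable names and host are illustrative):

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical spec: variables every analytics image request from a
# product page must carry. Names are illustrative only.
REQUIRED_VARS = {"pageName", "ch", "events", "v1"}

def validate_beacon(beacon_url):
    """Return the set of required variables missing from a tracking call."""
    params = parse_qs(urlparse(beacon_url).query)
    return REQUIRED_VARS - params.keys()

# A beacon captured from a staging build; the developers forgot v1.
captured = ("https://metrics.example.com/b/ss/rsid/1/"
            "?pageName=product%3Ashoes&ch=products&events=prodView")
print(sorted(validate_beacon(captured)))  # ['v1']
```

Wire a check like this into the release pipeline so a non-empty result fails the build, and the missing variable is a blocked deployment in staging rather than an asterisk in next quarter’s report.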
My career as an analytics consultant didn’t begin the day Josh James hired me at Omniture as an account manager. It came about a year later, when I was on the phone with my contact at Tiffany, one of the Omniture clients I was supporting. It was a horrible call, where I had to explain that the Omniture reports were blank for the campaign they had just run because the campaign respondents were never tagged. I decided then and there that I didn’t want to be an account manager any more; I wanted to build and maintain implementations for clients so that calls like that could be avoided. Over the ten years since that day, I’ve had some successes to be sure, but also many failures where I had to spend time and effort cleaning up data quality issues created by broken website tags.
Knowing how to quickly detect and manage those disasters when they occur is an important skill for analytics and data governance teams. At the same time, faithfully applying tag verification regression tests in pre-production environments will make those disasters increasingly rare.