In Data Quality Management Process: Part 1, I discussed how the two most important questions in data quality management, what to audit and when to audit, play off each other on a tactical level. Once you understand the difference between the sweeping nature of an audit and the focused, high-frequency nature of a simulation, understanding the triggers for data collection (when to audit) informs the segmentation technique (what to audit).
Segmentation techniques create the linkage between business needs for data quality review and the technical configuration of the auditing system.
Read previous posts on audit timing and frequency, segmentation and targeting, proactive alerting and monitoring, qualifications of a data quality analyst, general data quality practices.
Before we dive into segmentation techniques, let’s remember that the best approach to digital content auditing is focused, defined, and purposeful. Ideally, using the Data Quality Management Process, you can create a problem statement that clearly defines why you need to conduct an audit, what you need to audit, and the results you expect to achieve. For example, you ought to be able to state:
“Because our web team rolled out a new home page and primary content area for our corporate brand, we need to audit this section of the web site to ensure our Tag Management System, Primary Analytics System of Record and other essential tags have successfully survived the migration to the production environment, free of errors that will cause data corruption down the road.”
Segmentation Techniques
Follow these three steps to build your audit segmentation / simulation targeting plan in ObservePoint.
Consider data collection triggers
Identifying what has happened to raise the need for an audit or simulation will point you in the right direction.
Combine information
Consider the reason that an audit is necessary and combine that with information about the structure of your web site / channel / content group / asset groups. Which assets are affected by the internal or external collection trigger?
Create a technical definition of the segment
Define the segment in ObservePoint in preparation for the site scan or simulation. During this process, you’ll need to set several parameters. This is where audits and simulations start to differ.
Common settings
Setting | Definition |
---|---|
Starting Page | The URL of the page that the audit or simulation will begin on |
Location | The physical location of the server that requests pages from the web server |
User Agent | The user agent string passed by the ObservePoint server to the site’s web server |
Schedule | Frequency of site audit or simulation |
Clear Cookies | Either clears cookies between page visits or allows cookies to persist through sessions during the audit or simulation |
Silent Mode | Cancels the final request to the vendor to prevent server call inflation. Imposes some functionality limitations |
Load Videos | Allows videos that begin to play OnPageLoad to play so that ObservePoint will see tags that fire at that point in time |
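If it helps to plan a segment before you open the tool, you can write the common settings down as a small record. The sketch below is only a planning aid: the class name, field names, and every value in it are assumptions made for illustration, and none of it is an ObservePoint API or export format.

```python
from dataclasses import dataclass, asdict


@dataclass
class CommonSettings:
    """Planning record mirroring the common settings table (hypothetical names)."""
    starting_page: str           # URL the audit or simulation begins on
    location: str                # where the requesting server sits
    user_agent: str              # user agent string sent to the site
    schedule: str                # how often the audit or simulation runs
    clear_cookies: bool = True   # clear cookies between page visits
    silent_mode: bool = False    # suppress the final request to the vendor
    load_videos: bool = False    # let autoplay videos run so their tags fire


# Illustrative values only; swap in whatever your own segment calls for.
homepage_audit = CommonSettings(
    starting_page="https://www.example.com/",
    location="US West",
    user_agent="Mozilla/5.0 (compatible; ExampleAuditBot/1.0)",
    schedule="weekly",
)

for name, value in asdict(homepage_audit).items():
    print(f"{name}: {value}")
```

Writing the plan down this way makes it easy to review with the team that raised the data collection trigger before anything is configured.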
Unique to Audits
Setting | Definition |
---|---|
Page Limit | Limits the number of pages an audit will crawl, in the case that it is not first limited by a filter or the site index |
Include List | URL white list as a regular expression |
Exclude List | URL black list as a regular expression |
User Session | Adds a set of actions to the audit – such as completing form fields to log in to a protected content area |
Inside the audit settings, you’ll also need to limit the scope of the content consumed by the site crawler. This is done using the “include” and “exclude” filter sections. Think of these as white and black lists that match on URL strings using regular expressions. You can use our regex tester to check the validity of your regex string.
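To get a feel for how include and exclude filters interact, here is a minimal Python sketch that applies both lists to a handful of URLs. The patterns, the example.com URLs, and the `in_segment` helper are all hypothetical, and the sketch assumes a URL is in scope when it matches at least one include pattern and no exclude pattern, with patterns matched anywhere in the URL. Confirm how your auditing tool actually evaluates its filters before relying on this.

```python
import re

# Hypothetical patterns for illustration only.
INCLUDE_PATTERNS = [r"^https?://www\.example\.com/products/"]
EXCLUDE_PATTERNS = [r"\.pdf$", r"/products/archive/"]


def in_segment(url, include=INCLUDE_PATTERNS, exclude=EXCLUDE_PATTERNS):
    """Return True if url matches any include pattern and no exclude pattern."""
    included = any(re.search(pattern, url) for pattern in include)
    excluded = any(re.search(pattern, url) for pattern in exclude)
    return included and not excluded


candidate_urls = [
    "https://www.example.com/products/widgets",
    "https://www.example.com/products/archive/2012-widgets",
    "https://www.example.com/products/spec-sheet.pdf",
    "https://www.example.com/blog/launch-news",
]

# Keep only the URLs that fall inside the audit segment.
segment = [url for url in candidate_urls if in_segment(url)]
print(segment)
```

Running a sample of real URLs through a check like this is a quick way to catch a pattern that is too greedy or too narrow before you spend an audit on it.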
Unique to Simulations
Setting | Definition |
---|---|
Actions | Directs the simulation engine to interact with the web site in specific ways. These include SELECT UI element; NAVTO a url; CLICK UI element; INPUT a string into a field; CHECK or UNCHECK a box; EXECUTE code against a page; WATCH a page for a specified period of time |
Monitor | Defines a rule to test against during every page load; can test for the presence and configuration of tags in specific scenarios |
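A simulation is easier to review when the journey is written out step by step before it is configured. The sketch below models a journey as an ordered list of actions using the action names from the table above. The `Action` class, its fields, the CSS selectors, and the login URL are assumptions made for illustration; this is a way to draft and sanity-check a plan, not ObservePoint's own syntax.

```python
from dataclasses import dataclass
from typing import Optional

# Action names as listed in the Actions setting above.
ACTION_KINDS = {"SELECT", "NAVTO", "CLICK", "INPUT",
                "CHECK", "UNCHECK", "EXECUTE", "WATCH"}


@dataclass(frozen=True)
class Action:
    """One step in a planned simulation (hypothetical structure)."""
    kind: str
    target: str = ""              # URL, UI element selector, or code
    value: Optional[str] = None   # text for INPUT, seconds for WATCH

    def __post_init__(self):
        # Reject anything that is not one of the listed action names.
        if self.kind not in ACTION_KINDS:
            raise ValueError(f"unknown action kind: {self.kind}")


# Illustrative login journey; every selector and URL here is made up.
login_journey = [
    Action("NAVTO", "https://www.example.com/login"),
    Action("INPUT", "#email", "test.user@example.com"),
    Action("CLICK", "#submit"),
    Action("WATCH", value="5"),
]


def describe(actions):
    """Return a numbered, human-readable summary of the journey."""
    return [f"{i}. {a.kind} {a.target}".strip()
            for i, a in enumerate(actions, 1)]


for line in describe(login_journey):
    print(line)
```

Validating the action names up front catches typos in the plan early, which matters more as journeys grow past a few steps.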
Now that you’ve configured your segment in the ObservePoint technical environment, you’re ready to execute the audit or simulation against your content. In the third post, we’ll cover data quality analysis.
About the Author