Digital in Context: Integrating Business Data in 2018
Welcome, everyone, and thank you for taking the time to attend my talk today. There’s a lot of great content being presented at the Analytics Summit today and I’m thrilled to be a part of it, so thanks for coming to my session.
Today I want to talk about organizational context for digital analytics data. When I say that, I mean expanding your resources for analysis beyond the clickstream data that traditionally defines the analytics data set, and bringing in data from around your organization to add context to the parts you’re directly responsible for. We’ll talk a little bit about what that means first, with an overview of why this matters. I’ll cover what kinds of context you want to apply and how you can make that available in your organization. We’ll also talk about how to curate the data you need to provide this context, and then I’ll tell you that now is the time to do this.
I want to mention here that we’re starting off with a bit of an assumption. This is about expanding on your clickstream data, and to do that effectively, you need good clickstream data. This means having a solid, governable data layer in your implementation, as well as ongoing QA to ensure you’re consistently getting good analytics data into your tool and through any other marketing pixels that you’re firing on your site.
Our host, ObservePoint, of course, is fantastic when it comes to the governance piece for your clickstream components, so have a chat with them if you’re at all concerned with your data quality.
In sort of a typical fashion, I’m going to start with a little bit of history. Let’s take a quick look back at working in the analytics space, starting with web analytics in 2010. Until this decade, our primary source of data when examining the performance of a web property has been the data generated within the web property itself: the clickstream data.
The move from log file analysis to browser tracking is, at this time, still fresh in memory, and widespread adoption of the benefits of that move, namely the tracking of non-pageview events, is really just getting underway. Our view of attribution at this time is similarly confined to the clickstream. The HTTP referrer tells us everything we know about how the visitor came to the site and what efforts may have brought them there.
By about 2014, it’s becoming clear that a little more context would pay off. And it really started to surface in the most obvious and easiest data to obtain, namely marketing cost data. Online advertisers demanded a better understanding of how the money they spent was being translated into revenue. So it’s becoming clear here that the dollars going in need to be accounted for in reporting.
What we learned by bringing this marketing cost data in is that this information can have a profound impact on the purchasing decisions recommended by analysts. We’ll see examples of this later, but the move from relying on how much revenue a campaign generated, to how much return it generated, produces far greater insights. I want to point out, this is also an important step for the digital analytics profession itself.
Finance and C-suite executives consider these things second nature, and analysts began to be able to speak the same language as them and actually produce numbers that Finance deems relevant. As they do that, they become more relevant in the organization that they serve.
Unfortunately, a number of analysts seem to have stopped there, largely because it’s often organizationally and politically difficult to assemble other sources of data. Analysts and marketers often work closely together already. Often they’re on the same team and sometimes, by virtue of the fact that they’re the same people, data sharing is much easier. But the lessons we learn from incorporating that contextual data are far too powerful to ignore.
Obviously, data from other areas of the business is likely to have an impact on the stories we tell as analysts and the things that we recommend. Even in the most fractured organization, where each team produces its output as sort of a black box, there’s value in looking at the business holistically. For any organization that isn’t already doing this, your 2018 planning had better include a strategy for pulling in data from across teams. It’s important for your organization’s survival and the kinds of decisions it makes, and it’s important for your own career as analysts fight to stay relevant.
To put it another way, clickstream data is this black lion here. It’s formidable, it’s powerful, it may be the only thing you bring to the game and you likely wouldn’t be faulted for doing so.
But in the context of other data, suddenly the clickstream data—our black lion in the middle there—seems woefully inadequate. It forms the body of our work and it holds the other pieces together, but it’s just part of the picture. The whole story, our Voltron here, is made of many parts, but when we bring them together, we get something far more powerful. So 2018 is the time to assemble your Voltron. How will you do this?
If you’re going to build your Voltron over the coming year, you’re going to need some support. This project can be a significant undertaking, and we’ve seen a number of organizations fumble on this kind of thing before. You need to assemble your team of pilots who will bring all of Voltron’s parts together. You’ll want a champion. This could be you. This could be your boss. This could be a task you assign to an analytics agency. At any rate, someone needs to own this endeavor and see it through to completion.
Next, you need executive buy-in. That is, the higher-ups need to know what you’re doing, agree that it’s a crucial step in building your organization’s data assets, and most importantly, empower your champion to get the job done. It’s not always easy to get other teams to supply you with what you need to make your analysis more reliable and credible, but having their bosses telling them to listen to you definitely helps.
You need a shopping list of data. What are Voltron’s parts? What data is relevant to your organization and impacts the digital channel? We’ll talk a little bit about this. And you need to choose where and how you’re going to store all of this data. We’ll go over some options, but basically you need to have a home base for Voltron.
The first thing you need to plan out is what data you’re going to be bringing into the picture. Maybe you have data about what your office furniture is worth and how long it typically lasts before it needs replacement. That’s useful in some contexts, but in the context of digital analytics, not so much. We need to think about what data could potentially change our minds about the kinds of conclusions we draw and stories we tell when we’re investigating the data. Here are some examples: marketing costs, merchandising costs, customer acquisition and CRM data, refunds, and finance adjustments.
Let’s look at a specific example. Let’s consider three marketing campaigns in a somewhat idealized example. Let’s say we ran an email campaign, a display campaign, and a third campaign through remarketing display ads. If we’re looking at purely clickstream data, the only real indicator of performance is going to be revenue. How much money did each campaign generate for the business? When we look at those numbers for these campaigns, there’s a clear winner. The display buy comes out on top, but as we saw, adding more context to our data could change things. Let’s consider the effect of adding our marketing costs.
Guess what? Email is cheap compared to advertising on other sites, so the story is quite different when we bring in that tidbit of information. You would have been a fool to recommend re-funneling email investment into display on the basis of revenue generated; display has a lower ROI when we consider how much it costs to run those ads. Now the email campaign comes in on top. But as you might have guessed, I’m not done.
What about how much it costs to produce or acquire the merchandise in each case? Say we’re at an apparel company. The three campaigns were run earlier in the summer and focused on summer and beach attire. The email campaign pitched swimsuits, the display campaign advertised sandals, and the remarketing campaign went after people who were looking at our shorts. Well, one of those things is cheaper to produce and obtain, so when we factor in the cost of goods sold per campaign, we’re back to the display campaign winning out. It earned a higher return when we consider the additional costs. But hold on, those cheap sandals may have generated a number of returns via email and phone. What happens if we factor that in?
When we factor in those costs, it turns out our crappy sandals were not a huge hit with the people who bought them; they had a higher rate of returns, and hence refunds issued. Yet again we have a different story to tell. The remarketing campaign wins, but it does so largely because of the products it sold, which were of higher quality. That is, people love the shorts, and they barely led to any returns. But as we saw earlier, email was still a more effective medium.
Now we have a lot more to say. We may be back to recommending investing more into email, but leverage that medium to sell the shorts we were retargeting, and now you’re printing money. That’s a bit of a simplistic example that ignores some facts about how people react to some of these marketing types (the shorts may have a lower rate of return because people have already invested time in the purchase), but it does illustrate my point. With each new piece of contextual information you pile on, the story you’re telling becomes far more accurate and compelling, if a bit muddier.
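To make the arithmetic concrete, here’s a little Python sketch of those layers. Every number here is invented purely for illustration; the point is how the “winner” flips as each layer of context is added.

```python
# Invented figures for the three campaigns in our apparel example.
campaigns = {
    "email (swimsuits)":    {"revenue": 40000, "marketing_cost": 1000,  "cogs": 22000, "refunds": 3000},
    "display (sandals)":    {"revenue": 60000, "marketing_cost": 15000, "cogs": 18000, "refunds": 12000},
    "remarketing (shorts)": {"revenue": 50000, "marketing_cost": 8000,  "cogs": 20000, "refunds": 500},
}

def winner(metric):
    """Return the campaign name that maximizes the given metric."""
    return max(campaigns, key=lambda name: metric(campaigns[name]))

# Layer 1: clickstream only -- raw revenue.
by_revenue = winner(lambda c: c["revenue"])
# Layer 2: add marketing cost -- return on ad spend.
by_roi = winner(lambda c: (c["revenue"] - c["marketing_cost"]) / c["marketing_cost"])
# Layer 3: add cost of goods sold -- gross margin.
by_margin = winner(lambda c: c["revenue"] - c["marketing_cost"] - c["cogs"])
# Layer 4: add refunds -- net contribution.
by_net = winner(lambda c: c["revenue"] - c["marketing_cost"] - c["cogs"] - c["refunds"])
```

With these made-up numbers, display wins on revenue, email wins on ROI, display wins again on margin, and remarketing wins once refunds are counted, exactly the flip-flopping we just walked through.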
Beyond this kind of analysis using cost and refund data, you may want to pull other kinds of data into your accounts. One example springs to mind: CRM data may have all kinds of relevant demographic dimensions, lifetime value metrics, and other information about your customers that might not be in your analytics tool. Bringing that in at an aggregated level gives you powerful fodder for segment building.
Let’s say that you agree with me that this is a crucial thing you should be doing. How do we accomplish it? One component is going to be deciding where your more complete data set should live. The starter approach is often to get data output from the other sources and mash it together in spreadsheets like Google Sheets or Excel. That might get you up and running quickly, but there’s a high chance of error there, a lot of manual work, and only manual work can validate the data.
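Even at this starter stage, you can cut down on the manual copy-and-paste errors by scripting the merge. Here’s a minimal sketch, assuming two hypothetical CSV exports (file contents inlined for demonstration, and the campaign and column names are made up) joined on campaign name:

```python
import csv
import io

# Hypothetical export from the analytics tool: revenue per campaign.
analytics_csv = """campaign,revenue
summer_email,40000
summer_display,60000
"""

# Hypothetical export from the ad platform: spend per campaign.
cost_csv = """campaign,cost
summer_email,1000
summer_display,15000
"""

# Index the cost rows by campaign name for the join.
costs = {row["campaign"]: float(row["cost"])
         for row in csv.DictReader(io.StringIO(cost_csv))}

merged = []
for row in csv.DictReader(io.StringIO(analytics_csv)):
    cost = costs.get(row["campaign"])  # None would flag a join miss for QA
    revenue = float(row["revenue"])
    merged.append({
        "campaign": row["campaign"],
        "revenue": revenue,
        "cost": cost,
        "roi": (revenue - cost) / cost if cost else None,
    })
```

The same logic scales to reading real export files with `open()` instead of inline strings, and the explicit join-miss check is exactly the kind of validation a manual spreadsheet mash-up tends to skip.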
The other two popular options are either to import your data directly into your analytics tool, or to put the data in another data store that can be easily joined with your clickstream data. In the former case, you make the data the most readily available for reporting using the built-in or custom reporting features in your analytics tool. But if you have a hardcore data science team, they may prefer to get the raw data outside of the tool and mash it up with the clickstream data there. Let’s look at some examples of each approach.
In Adobe Analytics, there’s a Data Sources feature that will let you push some of this data directly into Adobe Analytics for the purpose of including it in reporting. It has a SOAP-based API and an FTP connection, either of which will get your data into the system.
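As a rough sketch of what the FTP route looks like: Data Sources expects a tab-delimited file whose columns match the data source you configured in the admin console, followed by an empty `.fin` file of the same name to signal the upload is complete. The host, credentials, column names, and file name below are all placeholders, not real values.

```python
import ftplib
import io

# Tab-delimited rows; the column names must match the data source
# you configured in the Adobe Analytics admin console (placeholders here).
header = ["Date", "Evar 1", "Event 1"]  # e.g. date, campaign id, marketing cost
rows = [
    ["01/15/2018", "summer_email", "1000"],
    ["01/15/2018", "summer_display", "15000"],
]
payload = "\n".join("\t".join(r) for r in [header] + rows) + "\n"

def upload(host, user, password, basename="costs_20180115"):
    """Upload the data file, then a zero-byte .fin file to tell
    Adobe the upload is finished and ready for processing."""
    with ftplib.FTP(host, user, password) as ftp:
        ftp.storbinary(f"STOR {basename}.txt", io.BytesIO(payload.encode()))
        ftp.storbinary(f"STOR {basename}.fin", io.BytesIO(b""))
```

Scheduling something like this to run after your cost reports land is what takes the human error out of the loop.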
In Google Analytics, you have the option to define various types of data sets and then import data manually in the user interface, upload it through the Management API, or use third-party services that allow you to automate the data transformation and upload process using SFTP and email endpoints.
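For the Management API route, the upload body is just a CSV whose header uses GA’s scheme names and matches the Data Set you defined under the property. A minimal sketch of building that payload, with the account, property, and data set IDs as obvious placeholders:

```python
import csv
import io

# Cost-data columns use GA's internal scheme names; these must match
# the Data Set schema defined under Property > Data Import.
fieldnames = ["ga:date", "ga:medium", "ga:source", "ga:adCost"]
rows = [
    {"ga:date": "20180115", "ga:medium": "email",
     "ga:source": "newsletter", "ga:adCost": "1000"},
    {"ga:date": "20180115", "ga:medium": "display",
     "ga:source": "adnetwork", "ga:adCost": "15000"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
upload_body = buf.getvalue()

# With google-api-python-client and valid credentials, the upload call
# looks roughly like this (IDs below are placeholders):
# from googleapiclient.http import MediaInMemoryUpload
# analytics.management().uploads().uploadData(
#     accountId="12345",
#     webPropertyId="UA-12345-1",
#     customDataSourceId="abcdef",
#     media_body=MediaInMemoryUpload(upload_body.encode(),
#                                    mimetype="application/octet-stream"),
# ).execute()
```

The actual upload call is commented out since it needs credentials, but the CSV shape is the part people most often get wrong.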
Whether you’re in Adobe Analytics or Google Analytics, automate as much as possible and reduce the amount of manual labor you need to do to ensure your data is properly uploaded in a governable fashion. That is, you don’t want human error creeping in here.
As for storing the data in a separate data store, there are a huge number of options. You might have your own data warehouse, or you can use cloud-based databases like those offered by Google Cloud Platform, Amazon Web Services, or Microsoft Azure. We’ll look at one specific example here: storing data in BigQuery alongside the data exported from Google Analytics 360. In this case, you’ll want to produce an automated import of your data, using BigQuery’s API to regularly push the relevant contextual data into BigQuery tables. You can then run queries in BigQuery to combine the data from multiple sources, and use reporting tools like Google Data Studio or Tableau to produce the end results.
This has the advantage of offering you some pretty raw data to work with, and it has the disadvantage of only giving you some pretty raw data to work with. If you look at this big SQL statement here, this is from Google’s own BigQuery cookbook. It offers an example of calculating the profitability of each product sold using data from Google Analytics along with separate tables of product costs and refunds. This SQL is powerful stuff, and if you’re a technically competent analyst or a data scientist, this may be your preferred approach.
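To give a flavor of that kind of query without reproducing the cookbook verbatim, here’s a simplified sketch in the same spirit. The project, dataset, and table names are placeholders standing in for your GA 360 export and your uploaded cost and refund tables, and the column names on the context tables are invented for illustration:

```python
# Placeholder table/column names; the GA 360 export stores product
# revenue multiplied by 10^6, hence the division below.
PROFIT_QUERY = """
SELECT
  prod.productSKU AS sku,
  SUM(prod.productRevenue) / 1e6 AS revenue,
  SUM(costs.unit_cost * prod.productQuantity) AS cogs,
  SUM(IFNULL(refunds.refund_amount, 0)) AS refunds
FROM `my_project.ga_export.ga_sessions_*` AS sessions,
  UNNEST(sessions.hits) AS hits,
  UNNEST(hits.product) AS prod
JOIN `my_project.context.product_costs` AS costs
  ON costs.sku = prod.productSKU
LEFT JOIN `my_project.context.product_refunds` AS refunds
  ON refunds.sku = prod.productSKU
GROUP BY sku
"""

# With the google-cloud-bigquery client library, running it would look like:
# from google.cloud import bigquery
# client = bigquery.Client()
# rows = client.query(PROFIT_QUERY).result()
```

Note the LEFT JOIN on refunds: not every product has refunds, and an inner join there would silently drop the well-behaved products from the report.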
I like to see a little more democratization of data, making it accessible to wider audiences. For this reason, I prefer the approach of putting the data into the analytics tool, where reporting can easily happen for the less in-depth users, and then exports to services like BigQuery can happen for the heavy lifters.
Once you’ve determined your strategy for storing data, it’s time to start talking to people about what data you can get. Here’s where collaboration with others takes off, and, if you’re in a large enterprise, politics comes into play. My advice here is to become everyone’s best friend. That’s easier said than done, but my point is mainly that it isn’t only your analysis that benefits from having all this context available: looking at the business holistically benefits the business holistically.
You want to emphasize the value you’re bringing to the table when you ask for data from other teams. For your Finance team, you’re bringing data that they can finally trust to be accurate and relevant. If they ply you with their adjusted refund data, you can in turn give them a view of the digital channel that makes sense in their context. For your merchandising team, if you have one, you can provide insights into which products do well, informing their buying decisions for the next fiscal period. The more you can make them look like rockstars, showing that their decisions are having a positive impact on the business and that they’re using the digital channel correctly, the better.
Really make it a team effort. Inform everyone that your goal is to make the whole team look awesome to the executives in your organization. People can be really insecure about their data. What’s to say you’re not going to weaponize this data to make them look bad? It can be reassuring to let them know your goal isn’t to judge how people are doing; your goal is to put everyone’s heads together and to make incremental improvements on top of the awesome they already are. Identify the key stakeholders in each department who need to contribute and keep them in the loop on the project.
Again, it’s a team effort. This is the least technical, least analytical portion of being an analyst, but it’s still part of the job. Analysts need to be able to interface with various teams. The data they get is crucial to their jobs. Analysts need to be able to sell what they’re doing to the people they need data from, and to the executives who sign their checks.
When it comes to actually receiving the data, be clear about your requirements, but also minimize those requirements. By that I mean don’t create busywork. Other teams should be able to set up a process in which they do minimal work to get you the data you need. The best approach is a system where the work they do by default just feeds your data. Where other teams use specific tools to keep track of their work, like CRM and order management systems, try to hook into those tools and their reporting to get the data you need. Do they have an API? Can they send reports via email? These options provide a way to automate the process.
Another important thing to consider is governance of the data you’re getting. You need to be able to trust the data you get, and you may get data that conflicts with your own. A common case of this is revenue. The order management system, the finance team, and your analytics tool may disagree on what the revenue numbers are. You need a reconciliation process to track the variance between these sources. Decide what’s an acceptable variance between these numbers, and at what threshold you should consider there to be a data quality problem and start a remediation process. Voltron can’t have a loose cannon.
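That reconciliation check can be as simple as a couple of functions run on a schedule. The thresholds here are invented policy numbers purely for illustration; pick ones that reflect your own organization’s tolerance:

```python
def revenue_variance(analytics_rev, finance_rev):
    """Relative variance between two revenue figures, treating the
    finance number as the baseline source of truth."""
    return abs(analytics_rev - finance_rev) / finance_rev

def classify(variance, warn=0.02, fail=0.05):
    """Hypothetical policy: under 2% is normal tracking loss,
    2-5% warrants investigation, above 5% triggers remediation."""
    if variance < warn:
        return "ok"
    if variance < fail:
        return "investigate"
    return "remediate"
```

For example, analytics reporting $97,000 against Finance’s $100,000 is a 3% variance, which under these made-up thresholds lands in the “investigate” bucket rather than triggering a full remediation process.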
A quick example of how a customer of ours accomplished this, in much the same fashion as our examples. They have a number of direct API integrations that push data into Google Analytics from advertising platforms like Bing Ads, Facebook, and Yahoo Gemini. Google AdWords and DoubleClick are integrated directly into Google Analytics. We power the transfer of their data through a tool called Analysis Engine, which is provided by a spin-off company of ours.
Once they’ve configured their internal systems to do so, they send us weekly reports consisting of revenue data and merchandise cost data, which gets pushed into Google Analytics on a regular basis. This allows us to create custom reports in Google Analytics that get much closer to the bottom-line return on investment in a variety of areas.
Here’s an example of a custom report that we’ve created for this client in Google Analytics. Numbers here have been falsified to protect the innocent and the guilty, but you can see in this report that we’re taking costs and margins into consideration to get a calculated product margin value. We’re doing this in the context of campaigns here, but we can use other dimensions in the same report to produce analysis of other areas of the business. Other reports also make use of the marketing cost data.
The technical aspects of how you get data to flow across teams vary tremendously between organizations, but the strategic approach is basically the same as I’ve outlined here. You need a champion, whether that’s you, an agency, or someone else. You need executive buy-in. You need to find out what data is available and what data is relevant. You need to decide where you’re going to put it, and you need to get everyone on board with putting it there.
Which brings me to my final bit of advice: when do you add context to your data? How about now?
Thanks again for taking time to hear my piece today. I hope you enjoy the rest of the summit and are able to participate in a lot of the other talks that are going on today. There’s great stuff out there.