Lori McNeill, Riptide Analytics - No Dev Resources? No Problem! 6 Technical Hacks for Analytics Practitioners

November 22, 2016

Slide 1:

I’m excited to be part of this summit. I think it’s fantastic that ObservePoint’s put this together for the “measure” community, and I know everyone from both sides is getting a ton of value from their participation. I’m looking forward to contributing to that in this session, of course, and enjoying the rest of the awesome talks that are lined up for today.

Slide 2:

First off, I want to tell you a bit about Riptide Analytics. We’re a consultancy that partners with companies and organizations across several different verticals to enable their success at every stage of their analytics process. Companies partner with us to ensure that they get the data they need to run an analytics- and results-driven decision support system. We help them execute business decisions supported by data to optimize their marketing, their processes, and their business outcomes.

Slide 3:

As for my part, I’m Riptide’s Chief Technical Adviser, but I’m also a concerned analytics practitioner. I’ve been in the trenches for the last 15 years, first as a scientist, then as a web analyst, and more and more as just an “anything analyst” (that’s a thing now, according to an industry survey report). I’m concerned that, even though the importance of analytics to a business’s success is pretty mainstream today, we analysts are somehow not the center of the universe. Our companies often don’t prioritize aligning resources to support and augment our roles. Often, there’s nothing even nefarious behind it. It’s not that they don’t care; it’s just not sensible given overall constraints. Nonetheless, I’m still left feeling concerned about how this impacts our ability to fulfill the promise of our analytics roles. We have important work to do and we soldier on, in spite of resource constraints.

Slide 4:

But the struggle is definitely real. I know there might be some of you out there who feel a certain disconnect with what I’m saying about analysts having challenges getting resources—you’re very fortunate. If this is you, I hope you realize your good fortune.

Slide 5:

Some of us work for a dream company. I was discussing the struggle that we have with resources with a buddy once, and she looked at me and said, “You know, I can’t really say that that’s been my experience. I get help from IT when needed and the dev team is super responsive and excited about all the insights I’m going to provide.” I listened to her and I kind of smiled and nodded because, as you can probably guess, it was in Utah. Honeymooners are so precious, right?

Slide 6:

For most of us, eventually (it might be sooner, it might be later) things get real. Real quick. Quick poll, show of hands: who works at a real company with challenges like budget and constrained resources? I can’t see you, of course, but I know that many of you have your hands up, probably most of you. Except the ObservePoint employees. We work at real resource-constrained companies. We struggle, but I hope that most of you listening feel that you manage to deliver on the promise of your role most of the time anyway.

I’m here to talk about some of the things that we can do in periods of dev and IT famine. How do you fend for yourself in a more technical way so that you can keep moving your work forward and spend more of your time providing awesome value to your org? I’m going to talk to you about some of the ways that I have overcome or worked around challenges brought on by a lack of developer support.

Slide 7:

But I must warn you that this is not a best practices talk. The tips I’m going to cover today are not the best and most polished approaches to instrumenting your site, and wrangling your data in the manner that I’m about to recommend will not earn you your data science badge. Heading out with our expectations in the right place, we’re going to be talking about hacks.

Slide 8:

We’re going to be talking about workarounds. These things are going to help you be more self-sufficient so you can get on with doing what you do best. Some of these hacks may be outside of your comfort zone, but I promise we’re not going to go into the deep end. We’re not trying to replace developers. Why would we want to? We’re analysts for a reason, right? But by dipping your toe into more technical waters a bit, you should ultimately save time spent on certain tasks and definitely save yourself some frustration.

Slide 9:

Before we get into it, let’s have a brief Twitter intermission—a Twittermission, if you will. Let me ask you this: if your favorite dev resource out there were to grant you one wish today, what would you wish for? Tweet it to us. You never know who might see it, it could be your lucky day. Also, don’t forget the hashtag “Analytics Summit.”

Slide 10:

Let’s get into the typical first blocker: instrumentation. Getting your tools implemented onto your site according to your brilliant specifications can be a bit of a challenge if there’s an issue with developer support.

Slide 11:

Ideally, you would be current with the best practices in your company. You would have your tag management system in place, you would have a robust governed data layer, and you would have full control of all of your tagging. Everybody in your company would love that and wouldn’t have it any other way.

Slide 12:

But at the real companies that we work for, we know all too well that the code on the sites that we’re trying to measure doesn’t always accommodate our requirements or, even more so, our timing. Just like almost everybody else (I say “almost” because we’re excluding the HIPPOs), we have to follow the process, take a number, and join the queue.

Slide 13:

So how do we take action when we have deliverables and time-sensitive needs that can’t wait for dev to provide an edit to a data layer or other tracking code? After you spend a bit of time sulking, one thing you can do is take stock and take advantage of what is available to you now. Don’t hate, hack!

Slide 14:

Remember that a data layer is the preferred way of getting a piece of information from your website into the right spot in your data model via your tag management system. This is becoming the standard, but for a lot of us, it’s still a process. With a data layer, that data would be exposed for your TMS or any other tool that chooses to access it. This is the preferred way, but we have to remember it’s not the only way. Consider that, in order for a website to function, plenty of pieces of information have to be sent to your browser. Even if it’s not neatly packaged and governed for the purpose of analytics, a plethora of data is exposed every time a page is viewed. There’s CSS, there’s JavaScript, there’s HTML, so use those to your advantage. Instead of going without for weeks while awaiting your turn for dev resources, you can use some simple tools to extract the info you need that might be on the page already.

Slide 15:

If you have your hands in your tag management system, you’re probably used to picking up data directly from page elements, things like prices or button labels. In a bind, you can use page elements to create data that you would have liked to have exposed in a data layer when, for whatever reason, that’s not possible. You can also create data around, say, the visit state. You can deduce profile and product info. You can tally impressions of transient features, like the ones we’re all going to have as we come up on the holidays. We can use page elements to create data via inference.
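
As a rough illustration of picking a value up from a page element, here is a minimal sketch that turns a price label into a number. The selector `.product-price` in the usage note is hypothetical, and the parsing assumes US-style formatting such as "$1,249.50"; your site’s markup and currency format will differ.

```typescript
// Minimal shapes for what the sketch needs; a real browser `document`
// and its elements should satisfy them.
interface TextElement {
  textContent: string | null;
}

interface QueryRoot {
  querySelector(selector: string): TextElement | null;
}

// Turn label text such as "$1,249.50" into the number 1249.5.
// Returns undefined when the element is missing or holds no usable number.
function readPrice(root: QueryRoot, selector: string): number | undefined {
  const text = root.querySelector(selector)?.textContent ?? "";
  // Assumption: commas are thousands separators and "." is the decimal mark.
  const cleaned = text.replace(/[^0-9.]/g, "");
  if (cleaned === "") return undefined;
  const value = Number(cleaned);
  return isNaN(value) ? undefined : value;
}
```

In a browser you might call `readPrice(document, ".product-price")` from a custom script in your TMS and map the result to whatever variable your tool expects.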

Slide 16:

Take, for example, a site where visitors can log in. As you know from websites we work on and we all use every day, there’s a difference between what you see when you’re logged in versus if you’re a visitor who isn’t. In this example, the visible difference is just a minor change on the header. And as we dig into the next hack, there are also differences under the hood that we’re going to talk about tapping into. But these kinds of on-page differences can be useful, they can be enabling.

As an illustration, while working on an implementation project for a B2B portal site, I ran into an issue where the client needed to know whether or not a logged-in user had admin-level permissions. While we’d had the win of getting a tag manager installed on the site, the state of the project was such that getting a data layer implemented was not on the near-term, or even mid-term, road map. Given this obstacle, of course I had to inform the client that they were just SOL and they could contact me when they had it implemented. And in the meantime, I sent them an invoice. Just kidding. That’s not how this works. That’s not what we do. I rolled up my sleeves and hacked out a little script to look for content that appeared in the header if, and only if, the logged-in user was an administrator. When there was a match, it sent the info they needed into their analytics tool. In that case it was an event variable.

And this was in the early days of TMS adoption, before we had the more straightforward solutions for using the DOM (the Document Object Model, basically your page) like we’re getting with TMS providers’ built-in features today. It’s pretty awesome that our vendors are increasingly building in more features to make it easier for non-devs to retrieve data from a page, especially for our analyst friends who are hesitant to inject code that they’ve written. Now is this a robust best-practice approach to instrumenting the site? No, but does it work? Does it provide the needed data? More importantly, was I the hero? Absolutely.
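
To make the admin-header idea concrete, here is a small sketch of that kind of script. It is not the script from the project described above; the selector `#site-header`, the marker text "Manage Users", and the event shape are all made-up placeholders you would swap for whatever your own page and analytics tool use.

```typescript
// Minimal shapes for what the sketch needs from the page.
interface HeaderElement {
  textContent: string | null;
}

interface PageRoot {
  querySelector(selector: string): HeaderElement | null;
}

// Hypothetical values: replace with the real header selector and admin-only text.
const HEADER_SELECTOR = "#site-header";
const ADMIN_MARKER = "Manage Users";

// Hypothetical event shape; your analytics tool will have its own.
type AnalyticsEvent = { event: string; userRole: string };

// True when the header contains text that only administrators see.
function isAdminHeader(root: PageRoot): boolean {
  const text = root.querySelector(HEADER_SELECTOR)?.textContent ?? "";
  return text.indexOf(ADMIN_MARKER) !== -1;
}

// Push one event onto a queue (for example, a TMS data layer array)
// when an admin is detected. Returns whether anything was sent.
function reportAdmin(root: PageRoot, queue: AnalyticsEvent[]): boolean {
  if (!isAdminHeader(root)) return false;
  queue.push({ event: "adminDetected", userRole: "admin" });
  return true;
}
```

The inference is only as stable as the header markup, so it is worth noting somewhere which text the check depends on in case a redesign quietly breaks it.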

Slide 17:

We can tap into and make inferences from the things we see, but there are also the things we don’t necessarily see, like variables. When I talk about getting existing variables on a page, I’m referring to global JavaScript variables. This is exposed data, ripe for the taking. Whether you need to access it with code implemented inline or pipe the data through a TMS, the variables that already exist on any given page can serve a dual purpose. They make your site function and they can also satisfy your tracking requirements. In all use cases, you’ll need to know the identifiers they use and you’ll need to understand the logic or business rules that your developers have put in place, but the data is available and accessible a lot of the time.
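
Here is one way you might read an existing global variable safely, sketched under the assumption that the value lives at a dotted path such as `appState.user.tier`. That path is invented for illustration; the real names depend entirely on what your developers have put on the page.

```typescript
// Walk a dotted path starting from a root object (in a browser, the root
// would be `window`). Returns undefined if any step along the way is missing,
// so a page without the variable does not throw an error.
function readGlobal(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}
```

In a browser, `readGlobal(window, "appState.user.tier")` would return the value if it exists and `undefined` otherwise, which makes it reasonably safe to run on every page.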

Slide 18:

To implement our first hack, depending on your situation, you may be able to go ahead and ask a designer or dev to give you a quick hand with a CSS selector, an ID to help you identify an element, or even the name of a variable. And if your colleagues are really on it, you may have documentation that you can refer to. In terms of picking up data from page elements, you can also get a lot of mileage out of the preview or debugger tool for your particular tag management system, if you have your hands in that. Thanks to the click listeners and the other listeners that are built in, we’re able to effectively see our pages and our page interactions the way that our tag management system sees them. Then, if you need to fend for yourself in terms of CSS selectors and guides for your trial and error, Google will also bring you to sites like Stack Overflow and W3Schools. These are great resources as well. These days it’s relatively easy to pull data in if you know it’s there in the first place and you have a selector or name for it, but you might have to do a little light lifting to find out what’s available at the time you need it.

Slide 19:

That brings us to our next hack, which is using your browser’s dev tools. Most of us are used to using browser plugins and other tools for QA and debugging our analytics implementations. But if we go a little bit deeper, we can fend for ourselves a bit more in identifying and even retrieving available data by getting comfortable with the browser's dev tools.

Slide 20:

With every browser, you’re going to use a slightly different way of accessing and using the developer console, but there are two things that you can capitalize on for the purpose of analytics implementation. We are using our dev tools to see what page elements there are and to explore the variables available on the page.

Slide 21:

Going back to our random example from the web here, Dermstore, I’ve accessed the console in this case by right-clicking on the skincare button and inspecting element. That’s probably what a lot of you are doing in your work. The developer tools window opens up and shows me what I have to work with in terms of CSS applied. Depending on your developer’s or designer’s style, this may turn out to be a good thing or it could be a bit of a mess. What I see fewer analytics practitioners doing is using the console area down below.

In here you can enter commands or variable names to get information about the page. In this case, I’ve done the simplest thing, which is to type the word “window” into the console. This gives you everything that’s on the page, but you can quickly scroll through to see variables that are going to be useful to you. I see a few things that I would want for this logged in user. So knowing these variable names, I can set up my tag management system, or make a very clear and specific request to whoever is going to be doing the code, and I’m on my way. This example is using what’s native to Chrome, but there are other browsers and extensions such as Firebug that make the discovery process even easier for you.
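
If scrolling through everything under `window` feels like a lot, a small filter can narrow the list. The sketch below is one possible approach, not a feature of any browser: it keeps only properties that hold data rather than functions and skips names you tell it to ignore. The `builtIns` list is something you would supply yourself, for instance the property names you see on a blank page.

```typescript
// List property names that look like site-specific data.
// `win` is the object to inspect; `builtIns` are names to ignore.
function listDataGlobals(win: Record<string, unknown>, builtIns: string[]): string[] {
  const names: string[] = [];
  for (const name of Object.keys(win)) {
    const value = win[name];
    // Functions and empty slots are rarely what an analyst is after.
    if (typeof value === "function" || value === undefined) continue;
    if (builtIns.indexOf(name) !== -1) continue;
    names.push(name);
  }
  return names.sort();
}
```

In TypeScript you would pass the browser object as `window as unknown as Record<string, unknown>`; pasted into the console as plain JavaScript (with the type annotations removed), `window` works directly.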

Slide 22:

Now that we’ve gotten past some of those blockers related to a lack of developer resources at the data collection step, let’s move on to the pain points around getting access to the data you need to deliver insights. These days, it’s extremely rare, for me, to produce any kind of analysis using only one data source and without going through some kind of detailed process prior to working with the data.

Slide 23:

At “dream companies”, and even some real companies, there’s an enormous investment in data infrastructure because at some companies, the product or service depends on it. Sometimes the analytics function benefits from this and they are fortunate enough to get a piece of that data nirvana. But let’s be real, how many of us have that luxury?

Slide 24:

How many of you are just praying you don’t run out of memory on your own computers at the end of quarter? Yep, I see you. I feel you. I’ve definitely been there. The reality is, we also work off of downloaded data on our local machines. Our data repository is on our own computers and our local files. That’s why so many of you can benefit from this next hack.

Slide 25:

Supporting your local mini data mart. You already have the data there, you’re already downloading it so this is a tip that embraces the reality that your own computer is already where you need to be. And if we think about it, what are we really trying to accomplish with having that big data warehousing dream? What pain are we trying to solve?

Slide 26:

We’re trying to pull data down from several sources, maybe automate the transformation of that disparate data into some integrated summaries, and we are certainly looking to have the transformed data loaded somewhere for easy access as needed. I’ve seen these problems addressed in a myriad of ways, including without an enterprise data warehouse. One thing that’s consistent is the technical thought process. What we’re looking at here is the two S’s of a data mart. It’s possible to structure your data extracts to really support your analyses and standardize your workflow. It’ll save you time, and you’re prepping a model that you would need to communicate anyway once your favorite developer grants you that wish that you tweeted earlier.

So the two S’s: data sources and structure. Just like that big (probably never going to happen) IT project, you will need to take stock of the data sources that you’re extracting from and be clear on the data you need from within those sources. Then, for the typical data mart structure, the simple approach you’re going to use on your own computer, you’ll need to think about organizing your data a little bit differently. Your data extracts are going to need one table or tab (or date range, depending on what you’re working with) containing unique rows of facts, and another containing the dimensions. The facts are the what of the data that you’re working with, and the dimensions describe those facts. So fact tables will tell you leads generated, transactions, which shoes you sold, and then you tie that into dimensions, which tell you that the shoes are blue or the lead came from a certain kind of web form. The facts change more than the dimensions. And you’re already pulling and saving your data week after week, so why not capitalize on your efforts by structuring the data as you would have in a data mart for ease of use. While you’re at it, you can be sure to pull in all of the data from all sources.
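
To show the facts-and-dimensions split in miniature, here is a sketch with made-up shoe data. The field names and rows are invented for illustration; the point is only that the fact rows carry a key, and the descriptive attributes (like color) live once in the dimension table.

```typescript
// Dimension rows describe things and change rarely.
interface ProductDim {
  productId: string;
  name: string;
  color: string;
}

// Fact rows record what happened and grow with every extract.
interface SaleFact {
  date: string;
  productId: string;
  units: number;
}

// Made-up sample data.
const productDims: ProductDim[] = [
  { productId: "P1", name: "Runner", color: "blue" },
  { productId: "P2", name: "Loafer", color: "brown" },
];

const saleFacts: SaleFact[] = [
  { date: "2016-11-01", productId: "P1", units: 3 },
  { date: "2016-11-01", productId: "P2", units: 1 },
  { date: "2016-11-02", productId: "P1", units: 2 },
];

// Sum units per color by looking each fact up in the dimension table.
function unitsByColor(facts: SaleFact[], dims: ProductDim[]): Record<string, number> {
  const colorById: Record<string, string> = {};
  for (const d of dims) colorById[d.productId] = d.color;

  const totals: Record<string, number> = {};
  for (const f of facts) {
    // Facts with no matching dimension row are grouped as "unknown".
    const color = colorById[f.productId] ?? "unknown";
    totals[color] = (totals[color] ?? 0) + f.units;
  }
  return totals;
}
```

With the sample rows, `unitsByColor(saleFacts, productDims)` gives 5 units for blue and 1 for brown. The same lookup is what a VLOOKUP between two tabs or a join between two Access tables would do.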

Slide 27:

Speaking of data from different sources, data blending is something you hear a lot of talk about today. We often present metrics from different, pertinent data sources side-by-side out of necessity. This next technique ties into the purpose for hacking out your mini data mart in the first place. Bringing data sources together and preparing them specifically to serve your analytics needs is definitely worth the effort. Depending on the data you’re bringing together, though, you might be met with some challenges. Don’t worry, there’s always a hack for that.

Slide 28:

Common challenges in data blending include issues around missing data or dimensions.

Slide 29:

You might encounter issues when working with projections because you need future dates. Or maybe you’re needing to work with two data sources that are both incomplete, which is always a joy. A workaround for this is to use a master data file. Tableau users might recognize this as something called scaffolding the data. You’re basically just going to create something (a table, a worksheet) that is a sort of superset of the data that you need to blend. You use this to bridge gaps that are causing issues.
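
As a small sketch of the scaffolding idea, the code below builds a complete list of dates and hangs two incomplete sources off it, so every date gets a row even when one or both sources have nothing for it. The source names (`visits`, `leads`) and the choice of a daily date scaffold are assumptions for illustration; a scaffold can just as well be a list of products or regions.

```typescript
// Build every date (YYYY-MM-DD) from start to end inclusive.
// UTC is used throughout so time zones cannot shift a day.
function dateScaffold(start: string, end: string): string[] {
  const days: string[] = [];
  const cursor = new Date(start + "T00:00:00Z");
  const last = new Date(end + "T00:00:00Z");
  while (cursor.getTime() <= last.getTime()) {
    days.push(cursor.toISOString().slice(0, 10));
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return days;
}

interface BlendedRow {
  date: string;
  visits: number | null;
  leads: number | null;
}

// Join two incomplete sources onto the scaffold. Gaps become null
// instead of silently dropping the row.
function blendOnScaffold(
  days: string[],
  visits: Record<string, number>,
  leads: Record<string, number>
): BlendedRow[] {
  return days.map((date) => ({
    date,
    visits: visits[date] ?? null,
    leads: leads[date] ?? null,
  }));
}
```

Because the scaffold can run past today, the same trick gives you empty rows for future dates to drop projections into.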

Slide 30:

Fortunately, data extraction and shaping is an area of hackery [sic] where there are several tools available, and even more emerging, to help you be self-sufficient. There are tools like Tableau in conjunction with ETL-specific tools like Alteryx, specialty utilities such as unSampler, and Excel plugins, of course. They’re all geared toward making the data extraction and preparation step much less manual and time-consuming for us analysts. Another thing you’ll want to leverage in hacking this out is some of the desktop database tools, like Microsoft Access, if you have it. There are free alternatives like OpenOffice.

Slide 31:

The last couple of hacks I want to get into touch on another area where we tend to be dependent on available dev resources to make it happen. That is the area of automating recurring tasks and deliverables.

Slide 32:

Wouldn’t it be great if we had all our recurring reports nicely standardized and refreshable and available on a pull instead of a push basis? Isn’t that the dream all these BI platforms promise us? Build it once and move on.

Slide 33:

Obviously, in actuality, we’re still having to repeat steps over and over and email many things out to many people. Let’s look at two more tips to address aspects of the automation issue.

Slide 34:

For some of us, there’s no getting around emailing reports, be they Excel files or some format of PDFs or findings and recommendations. One of the primary drivers for this dream of adopting a reporting platform is usually a desire to have a way for stakeholders to get a report on-demand rather than having it languishing in their inboxes at some arbitrary interval. Until you get your enterprise BI tool up and running, one hack that will give your stakeholders a small taste of the good life is to create a faux reporting portal.

Slide 35:

Create a portal of your own. The tech requirements are pretty low here. You would just need a place to host the files, a place to share the links, and, to take it up a notch, you’ll also want to add some helpful descriptions and contextual information. The hosting place can be anywhere your stakeholders have access to and will actually use. It’s somewhere they already live. I’ve seen team collaboration tools like Basecamp work really well for this purpose. There’s also Google Drive if your company will allow it. There’s a simple intranet page, if you have one or will be allowed to set one up. While it’s great if the sharing can happen in these environments to ensure people have access to what they need, a PDF with a collection of links to your hosting location can also work great. Or you can email your stakeholders if you have to. Providing direction and context is essential for anything you deliver. Once you start getting into the self-service world, even if you do get your dream and dev and IT get that reporting portal set up for you and your stakeholders actually get access to it, it’s still a great practice to create documentation around those reports to create a better experience for the end user.
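
If your hosting place accepts a plain HTML page, a faux portal can be as small as a generated list of links with descriptions. This is only a sketch of one way to do it; the report titles, URLs, and page layout are placeholders.

```typescript
interface ReportLink {
  title: string;
  url: string;
  description: string;
}

// Escape text so titles and descriptions cannot break the markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a heading plus one list item per report: link first, context after.
function renderPortal(heading: string, reports: ReportLink[]): string {
  const items = reports.map(
    (r) =>
      `  <li><a href="${escapeHtml(r.url)}">${escapeHtml(r.title)}</a>: ${escapeHtml(r.description)}</li>`
  );
  return [`<h1>${escapeHtml(heading)}</h1>`, "<ul>", ...items, "</ul>"].join("\n");
}
```

Keeping the report list in one small array means adding next quarter’s report is a one-line change, and the descriptions give stakeholders the context they would otherwise email you to ask for.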

Slide 36:

Last, but not least, I encourage you to look for ways to hack your workflows, even if you can only get to a partial solution. As I mentioned, there are quite a few utilities now to make analysts’ jobs easier. Often, we’re not able to automate our full deliverable, for many reasons, but even if you’re able to just schedule each download or set up templates for the usual data transformations that you’re working with, you will save time and a little bit of sanity. Let’s face it, it’s kind of frustrating to do something several times a week on the computer, because that’s what computers are there for. If you have to repeat the same thing multiple times, you really do want it automated.

Slide 37:

I hope that you come away from this talk encouraged and knowing that you needn’t have your work come to a grinding halt at those times when there’s little to no dev support for your role. You can often hack a way into collecting the data that you need, have data available in a form that you can use, and realize the benefits of some light automation.

Slide 38:

If we’re willing to dabble, just a bit, in the technical underpinnings of our profession and be resourceful, which we are, we can forge ahead in times when dev resources are lacking.

Slide 39:

Let’s keep the conversation going today. Here’s your Twitter reminder. I look forward to interacting with you all there.

Slide 40:

You’re welcome to connect with me on Twitter and LinkedIn.

Slide 41:

And thank you very much for attending my talk. I’ve assembled a list of some helpful resources for anybody looking to learn more about some of the more technical aspects of our profession. You’re welcome to check that out at bit.ly/analyticstech.
