Struggling with data quality? You’re not alone.
A breakdown in communication between Analytics, QA, IT, or UX is a common cause. In this presentation, Jose Bergiste, Director of Data & Analytics Engineering, shares tips and tricks to:
- Achieve alignment across teams
- Understand the role of different tools like Jira, Asana, and others in working across teams
Hello there and welcome to this presentation. Today we're going to talk about how to raise the quality of data in your organization, and it takes a village to do that. We're going to get right into it and talk through some tips and tricks for how to do that best for your company. Now, if you're struggling with data quality, just know that you're definitely not alone. This is an extremely common problem across the industry, and a lot of the time the issue involves many parties, with a breakdown in communication between the various teams that impact data quality: your QA team, your analytics team, your IT team, and your UX team as well. So what I hope is that today you come away with some practical things you can use within your org to make a difference in increasing your data quality.
So first, just a little bit about myself. I really love all things analytics, software development, and product management. I'm currently at Cognetik, a consulting company that does soup-to-nuts data work, from strategy all the way through analysis, with engineering and QA in between. I have about 17 years' experience across many different roles: I started off as a graphic designer, then became a web developer, then went into digital analytics engineering, did a lot of solution architecture, and have also managed people at several organizations, typically mid-to-large-size ones. So I hope I can share some tips and tricks with you today from my years of experience seeing things that work and things that don't. A lot of the time, it's the little things that make a big difference.
So I'm hoping we can all pick up some knowledge as we go through this presentation. We've got quite a bit to cover today; I've broken this issue down into six pieces. First we're going to talk about the nuances of accurate data. When we're trying to solve an inaccurate-data problem, we want to make sure we understand all the things that impact it and can define exactly what it is we're trying to get to. Then we'll talk about why it takes a village to really solve this problem. We'll get into some tips and tricks in terms of keys to success for coordinating many different teams. We'll talk a little bit about documentation, and what I'm calling dynamic documentation, and then we'll talk about evolution.
Once you have a good process going, how do you keep it going and continue to make progress over time? And we'll wrap up with some final thoughts at the end. So let's get into the nuances of accurate data. There are really three main things to consider here. First, when we say something's accurate, what are we assuming? What sources are we comparing? Then we'll talk a bit about the challenges of doing comparisons. And then we'll talk about what you should target in terms of saying something is exactly accurate versus not. We've got three examples here of something that could potentially be inaccurate. In the first example, you've got a decrease in orders in your digital analytics reports. Is that good or bad? Can we say that number is accurate on its own?
But if you were to compare that with another system, say your eCommerce backend system, and you see that for the same time period your orders are flat, then obviously that is potentially a data quality issue. On the other end of the spectrum, in the third example on the screen here, let's say you're seeing an increase in visitor counts in your reports, but there's nothing else to compare that to; no other system captures that information. But say you know that a marketing campaign was just released, and you know that would have increased your visitor counts. So, as you can see in these examples, we can potentially say something's inaccurate or not because we're comparing it to something else. And the moral of the story here is to always ask the question: what are we comparing this to?
If we say it's not accurate, or if we say it is accurate, based on what? Those comparisons could be explicit, where you're comparing two systems to each other, or they could be implicit, where you're comparing a system to some knowledge you have or some trend you're aware of. But in either case, always ask the question: what are we really measuring against? Comparisons are very challenging, and there are several reasons for that. One is that you want to make sure you're comparing the right sources. If you've got two systems, are they the right things to compare to each other? That's the first challenge. Then there's also an assumption that at least one of the systems is accurate.
So in the example I gave earlier, we're comparing against a backend eCommerce system. But is that accurate? The answer may be yes based on many other factors, but it's important to ask yourself that, particularly when it's an implicit comparison, because you want to make sure you're comparing against something that has some good logic to it. The other part that I see a lot of people struggle with is time horizon. Generally you'll be able to match days and weeks and things of that nature, so you want to ensure you're looking at the right dates when you compare things. But time zones can trip people up as well: one system may be recording something in one time zone and another in a different time zone, and that can throw things off.
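As a small illustration of the time-zone pitfall, here's a sketch in Python (the data and function name are hypothetical, not from the talk) showing how an order logged late in the evening in one time zone lands on a different calendar day once normalized to UTC:

```python
from datetime import datetime, timezone, timedelta
from collections import Counter

def daily_counts(timestamps, tz):
    """Bucket naive timestamps (recorded in time zone tz) by UTC calendar day."""
    counts = Counter()
    for ts in timestamps:
        utc_ts = ts.replace(tzinfo=tz).astimezone(timezone.utc)
        counts[utc_ts.date()] += 1
    return counts

# A hypothetical order recorded at 11:30 PM Eastern (UTC-5) falls on the
# *next* UTC day -- a common reason two systems' daily totals disagree.
eastern = timezone(timedelta(hours=-5))
orders = [datetime(2024, 1, 15, 23, 30)]
print(daily_counts(orders, eastern))  # counted under 2024-01-16, not 01-15
```

Normalizing both systems to the same zone (or at least knowing each system's zone) before comparing daily numbers avoids a whole class of false mismatches.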
Segmentation is also critical, in the sense that one system may be looking at a slice of users, maybe all mobile users, while another system is looking at everything together, and your numbers won't match if that's the case. Calculation methods are essentially how a system calculates visits or visitors or any metric. It's commonly known that the major analytics vendors, for instance, will never match each other because of how they calculate certain things, and this is true for a lot of other systems as well. So when you're comparing two numbers from two different places, make sure you understand how they're calculated.

And if you don't understand the discrepancy, or there is a problem, granularity is very important as well. That has to do with how an aggregate number got calculated. If we use the orders example again: if you see total orders, can you actually look at the data on a transaction-by-transaction basis? It's usually great if there's some key that can tie data sets together, so that a transaction ID in your backend system can be compared to one in your digital analytics system. Keep these things in mind as you're doing comparisons. And with all of that said, it's usually better to target consistency over perfection. You want your systems to stay within a certain percentage of each other over time, and that itself can be an indicator of data quality, rather than saying these two things must match 100% and, if they don't, your data quality isn't up there. That's the key thing to keep in mind.
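The transaction-key idea can be sketched in a few lines of Python. This is a hypothetical illustration, assuming two dictionaries of revenue keyed by transaction ID; it targets consistency (a tolerance band) rather than a 100% match, and uses the key to pinpoint exactly which transactions are missing:

```python
def compare_by_key(backend, analytics, tolerance_pct=5.0):
    """Join two systems' order revenue on transaction ID and report how far
    apart the totals are, plus which transactions never made it across."""
    missing = [k for k in backend if k not in analytics]
    backend_total = sum(backend.values())
    analytics_total = sum(analytics.get(k, 0.0) for k in backend)
    pct_diff = abs(backend_total - analytics_total) / backend_total * 100
    return pct_diff <= tolerance_pct, pct_diff, missing

# Hypothetical revenue figures keyed by transaction ID
backend   = {"T1": 100.0, "T2": 50.0, "T3": 25.0}
analytics = {"T1": 100.0, "T2": 50.0}  # T3 never reached analytics

ok, diff, missing = compare_by_key(backend, analytics)
print(f"within tolerance: {ok}, off by {diff:.1f}%, missing: {missing}")
```

The percentage band is the "consistency over perfection" target: the alert fires when the gap grows beyond what's normal for these two systems, not whenever they fail to match exactly.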
All right, the next section I want to talk about is why it even takes a village to solve this measurement problem. The simple answer is that applications are very complicated. This is one example; in your organization it could look different, and most likely more complicated than what I'm showing here. If we look at a mobile app and what it takes to really get the data out of it: you have a UI framework that's concerned with how the app works, there are APIs flying everywhere, there are some hidden data sources, there's a content management system driving the content, and you've got SDKs everywhere. One of those SDKs, hopefully, is a tag manager that's trying to instrument measurement data and send it to your Adobe Analytics or Google Analytics or your advertising networks. And all of this is so you can use the data to make your app better. The key thing here is that to get high-quality data, you're really playing in an ecosystem with many other concerns. So keep in mind that there are many other concerns when it comes to building an app, and many points of failure; some of those are system failures, and some are people making mistakes. A tip here is to learn as much as possible about how things work behind the scenes, so that you can understand the data as much as possible. And there are many people involved in getting these things done. Over here I've got four roles, and I want to emphasize that I'm calling these roles and not positions.
A position is tied to a person: one person doing one thing. In some organizations there may be people who do many of these roles together, or the roles are split up even further. This is a general example where you usually have user experience, which is concerned with how something looks; software engineering, which is concerned with how it functions; quality assurance, which is doing all the testing; and digital analytics, which is concerned with reports. The moral of the story is that there are a lot of people involved, they're going to have different concerns, and they're going to have different skill sets and specializations. The key is to work together with these different groups, to really try to understand each other, and to do a lot of teaching as well: share your knowledge with different groups and learn as much as possible from them in order to work together in a seamless way. Now I'm going to share some keys to success: how do you coordinate all of these different people and teams in order to get the high-quality insights and data that move your business?
I look at it with two key concepts: leadership and ownership. Leadership to me is someone who's really in charge of managing the process. They're looking at the higher level and making sure people are working well together; they're moving obstacles out of the way to make their team's lives easier, and they're not necessarily down in the weeds. Then you have your owners. Your owners are responsible for doing the work; they take responsibility and accountability for their part. In many organizations there can be many leaders and owners, and in some organizations a leader is an owner for certain parts of the work. Again, these are roles, not necessarily people. But to be successful, it's important to have strong leadership and strong ownership, so that you can look at the overall process and problem while also having people responsible for the specific work needed to achieve the goal. And trust between leaders and owners is extremely important here; the larger you are, the more trust you're going to need to get all these things done.
The other thing is communication. You've established leadership and ownership, but you have to communicate; it's essential. If you're in an organization where you're not communicating at all between these different parties, the first step is to just start talking. You can start really informally, by introducing yourself, but you've got to start the conversation. If you're further along, keep in mind that not all communication is created equal. Some communication is too casual: if I'm walking down the hallway and ask someone to implement something that will increase quality, it might be a good conversation, but they may forget it as soon as they get back to their desk, because they have a lot of other things to do. On the other end, some conversations are too formal; a 200-page document for a request is probably too much, and that can be problematic as well. Frequency can also be a problem: if someone's constantly blasting emails or constantly pointing out the same thing, it can start getting ignored because there's too much chatter. And not giving enough information can lead to confusion. So the moral of the story is that you do need to communicate; that's the baseline for being effective. But try to find a good balance, where you mix casual conversation with formal conversation, get the frequency right, and get the right amount of information to the right parties. This will depend on your organization, so find the balance that works for you.
All right, in terms of tools, there's no shortage of digital tools for communication. But I want to caution you: first of all, there isn't a single tool that I've seen, at least, that meets all the needs. We can't just purchase tool X and say that will get all our people talking and increase our data quality. I'll give you an example of how you might use several tools to accomplish the same goal. You can use a documentation platform, maybe Confluence or Google Drive, and that's a really good place for maintaining high-level definitions, your metrics, things like that. But when it comes to actually doing the work to measure certain things, a ticketing system is usually better, something like Jira or Wrike, and you can always link to the documentation from there. So that's an example where you use two systems. Then there are interactive discussions: once the ticket's out there, maybe there needs to be more explanation. If something's not clear, sure, you could write comments in the ticket, and that's probably okay as well, but sometimes it needs to be even more interactive than that, and something like Slack or MS Teams is good for getting those conversations going. For teams that are virtual, holding conference meetings works really well; you can even record the meeting sometimes, for reference later, if some people were at the meeting and others weren't. Those are some of the tricks you can use to help communication flow. And emails, to me, are good for communicating things that are well understood, or statuses; but using email for assigning tasks or having interactive discussions doesn't work very well.
So each of these is still an important piece, but keep in mind that each of these tools and technologies ought to be used in an appropriate fashion.
The other tip I want to give here is: don't try to reinvent the wheel. Let's say you're having a data quality problem at your organization, and your IT team is using Jira to track this stuff. Try to go with the flow there: actually use Jira, and figure out how they use the ticketing system to track and prioritize these things, versus using some other channel, say asking about it in Slack or sending an email. It's important to find the right balance that works for your organization and to use the right tool at the right time.
So, we've got the communication thing going; now let's talk about dynamic documentation. Documentation, I think everyone's really familiar with: you need to document things, particularly things like business goals, metrics, dimensions, the different events happening in your application, and technical architectures. All of those are great things to document. But I'll tell you this: I've seen excellent documentation that's about two years old, and most of the time that's not going to help you very much. So you have to assume that whatever documentation you're creating is going to change over time. You want to think about that up front and consider shareable documents, for instance Google Docs, or Confluence again, or SharePoint: something that can be changed, can be shared out to a wide group of people, and lets them actually get the most up-to-date version of that information.
You also want to always use versioning to make things clear, and some of these systems will help you do that automatically, which is fantastic. But you want to make it clear what's changing. Now, we've talked about dynamic documentation, but that doesn't mean chaos. It doesn't mean you let anyone change your documents at any time. You still want some governance there. If you have a team of people working on the same problem, you want people to think and discuss things, and come to a consensus, before they actually update the documentation. And for a large organization, I would certainly recommend having a governance board before you publish and approve certain changes. So those are the tips and tricks there.
All right, so assume you've done a lot of the stuff we've talked about here: you understand the nuances of data quality, you've got good communication going within the team, you've got your leaders, et cetera. Are you done? Is that it? Have you solved the data quality problem? I would say absolutely not, because everything's going to change. The business is changing, and if your business is changing, most of the time there are going to be new needs and you're going to need new data points. With every new project or initiative, you should look at that as an opportunity to improve data quality; you might as well use that opportunity, and really keep the mindset that things are changing. This is where automation really helps. A tool like ObservePoint can really help you automate data quality checks.
But one key thing I want you to take from the automation conversation is to make sure you have resources that are going to maintain and adjust the automation. You want to automate things to make your life easier, but it's not a set-it-and-forget-it situation, because per the first point, the business itself is changing and evolving, and you want your automation to evolve with it. The other thing I'd recommend is to do retrospectives: really look at what worked and what didn't, and have good conversations with your team, as in, "Hey, how did this work this time? Can we improve it?" And based on what comes out of your retrospective, you can test and adjust your processes. Don't be afraid to try new things: if something's not working very well, try to actually change the process, and have frank conversations with your team about how to make improvements. Just keep in mind that data quality is never a project. It's never, "Hey, we had a data quality project and now we're good." It's an evolving thing, and you want to stay on top of it and keep moving forward.
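As a minimal sketch of what such an automated check might look like (hypothetical names and data; this is not how ObservePoint or any specific tool works internally), here's a consistency-band alert in Python. The tolerance value itself is exactly the kind of thing a retrospective would revisit:

```python
def within_band(history, today, tolerance_pct=20.0):
    """Flag today's metric if it strays from the trailing average by more
    than tolerance_pct -- consistency over perfection, not exact matching."""
    baseline = sum(history) / len(history)
    deviation = abs(today - baseline) / baseline * 100
    return deviation <= tolerance_pct, deviation

# Hypothetical daily visit counts for the past week
last_week = [1000, 1050, 980, 1020, 990]   # baseline of about 1008
ok, dev = within_band(last_week, 1400)
if not ok:
    print(f"Alert: visits are {dev:.0f}% off baseline -- worth a look")
```

A check like this runs on a schedule and only pulls a human in when the numbers drift outside the band, which is what makes the automation sustainable, as long as someone owns tuning it as the business changes.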
And with that, I'll leave some final thoughts, things I've seen make a big impact. First, you want to democratize your insights. The teams are working hard, they're helping solve data quality problems, but for what? At the end of the day, we're doing all this because we want insights; we want to drive businesses forward. Every team member should see some of these insights and be able to see the value of what they're working on. That makes a big difference for motivation and really sets a purpose for everyone. The second tip is to celebrate your successes. You're going to have successes and you're going to have failures. Don't ignore your failures; look at them, but don't focus only on them. Look at your successes as well, so the team can feel some sense of reward and balance from the improvements being made. And last but not least, say please and thank you. There are many different people involved in the process, as you've seen in this presentation, and everyone's busy, so definitely make sure you thank people and are cordial, so that together you can accomplish the goals of your organization.
And with that, I'll say thank you very much for joining me today. I'm Jose Bergiste at Cognetik. My contact information is here, so please feel free to reach out, and I'm happy to take any questions you have today. Thank you very much.