Ep. 92 | Managing the Fabric of Customer Data

Oct 10

This week Bob Page, Chief Product Officer at Neebo.ai, joins Allison Hartsoe in the Accelerator. Neebo is a new tool (brought to you by the folks at Datameer) designed to help companies manage the massive fabric of assets that contain customer data. With a background in big data, Bob has first-hand experience with the struggles of organizations to manage and monitor their customer analytics assets.

Please help us spread the word about building your business’ customer equity through effective customer analytics. Rate and review the podcast on Apple Podcast, Stitcher, Google Play, Alexa’s TuneIn, iHeartRadio or Spotify. And do tell us what you think by writing Allison at info@ambitiondata.com or ambitiondata.com. Thanks for listening! Tell a friend!

Allison Hartsoe: 00:01 This is the Customer Equity Accelerator. If you are a marketing executive who wants to deliver bottom-line impact by identifying and connecting with revenue-generating customers, then this is the show for you. I’m your host, Allison Hartsoe, CEO of Ambition Data. Each week I bring you the leaders behind the customer-centric revolution who share their expert advice. Are you ready to accelerate? Then let’s go. Welcome everyone. Today’s show is about how to manage the massive fabric of customer data and other data sources, and to help me discuss this topic is Bob Page. Bob is now the chief product officer at Neebo, which is a creation from Datameer. Bob has been on the show before talking about analyzing massive datasets while he was at Yahoo, eBay, and Hortonworks. Bob, welcome to the show. It’s great to have you back.

Bob Page: 00:59 Thank you, Allison. It’s good to be back.

Allison Hartsoe: 01:01 So we have seen lots of different data companies come out. We’ve seen lots of different tools just exploding across the market. Can you tell us first a little bit about the challenges you experienced as you were using those tools at previous companies and why something different might be relevant today?

Bob Page: 01:22 Well, you know what, if you look at your analytics maturity model, I’ve been in all those pits of despair, right? So when I think about the explosion of data and how to solve that, how to tame it, a number of years ago, many people thought, well, we’ll put it in a data warehouse. And then unstructured data came along, and maybe we’ll put it in Hadoop, and we’ll create a big data lake where we put all of our data, and then we’ll be able to manage it there. And what you ended up with was a lot of systems and a data lake too. And so I would say that a lot of people got a lot of value out of the data lake, but it wasn’t necessarily the thing that was going to solve all the problems. Partially because the data is everywhere and it keeps on coming in from different angles. But partially because if you think in a broader sense about trying to wrangle all the assets in your environment that have to do with analytics, it’s not all just data, but you have your code, right? You have models that have been developed or snippets of SQL or any number of other pieces of code.

Allison Hartsoe: 02:20 But tribal knowledge or things that are residual across the organization.

Bob Page: 02:25 Yeah, I mean, I remember we’d do these analyses, and then someone would come back and say, when you did this analysis, say some clickstream types of things, did you filter out robots? And the analyst would say, Oh, uh, I don’t think so. Does the table I use already have robots filtered out, or is that code I need to add? And, Oh well, here’s a snippet you can add in your analysis. Oh, okay. Yeah, I’ll sort of rerun it. So some of that is about documentation, and some of it is about, do I have the right data? And some of it is about, am I using the right code?
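[Editor’s note: The “snippet you can add” that Bob describes might look something like the sketch below. It is only an illustration, not code from any of the companies mentioned. The `user_agent` field, the pattern, and the `filter_robots` name are all assumptions, and real robot detection usually relies on maintained lists rather than a short pattern.]

```python
import re

# Hypothetical pattern for illustration only; production systems typically
# use maintained bot lists instead of a three-word regular expression.
BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def filter_robots(hits, keep_robots=False):
    """Return the hits whose user agent looks human (default) or robotic.

    `hits` is assumed to be a list of dicts with a "user_agent" key.
    Setting keep_robots=True returns only the robot traffic instead.
    """
    selected = []
    for hit in hits:
        looks_like_robot = bool(BOT_PATTERN.search(hit.get("user_agent", "")))
        if looks_like_robot == keep_robots:
            selected.append(hit)
    return selected
```

[The `keep_robots` switch reflects that the same table can serve two purposes: analysis of human visitors, or a look at how much traffic the robots themselves generate.]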

Allison Hartsoe: 02:52 So the analyst has the habit of going around and around and around as they gradually refine what it is that they’re pulling from and how clean that data source is or how correct it is.

Bob Page: 03:05 Yeah. Or is this the right one? Is it the trusted one, or is it the right code to get at it? A lot of times there’s a documentation wiki somewhere that describes this, or I know someone in my organization who has already done this, who is an expert in it or owns this table, or who’s done, you know, say a CLV analysis in another department, and I want to leverage their experience. So add to that, I read in Mary Meeker’s 2017 internet report that the average enterprise marketing organization has 91 SaaS applications that they use.

Allison Hartsoe: 03:37 Wow.

Bob Page: 03:39 This is the average, and this was two years ago. So I suspect it’s larger now. Now, to be fair, some of those are not strictly around analytics; there is Skype or Box or any number of other things. But the fact remains that there is an explosion of applications that every department needs to use, not just marketing: HR and finance and engineering and IT, and you can go on down the list. Those all contribute, in one form or another, to what I would generally call assets that have to do with how analysts are going to get their job done. They need to log in to Google Analytics and do some work there, and they need to go over to HubSpot or Salesforce, you name it. Right. There’s a whole lot of different tools that we’re using these days, and it’s not just a clean, Oh, everything’s in a database somewhere.

Allison Hartsoe: 04:20 Yeah, and I think what’s interesting about what you’re saying too is, well, we all talk about blending the data together, but there’s this tweak of the data and that tweak. Like in the robots example, sometimes you do want to include robots if you’re trying to see where there’s a lot of noise coming from or whether that is driving up your IT costs, and sometimes you don’t want to include them for the purposes of analysis. So just landing the data together in a data lake and expecting it to be meaningful is something I find over and over again that a lot of executives don’t quite get.

Bob Page: 04:53 Well, and even if they do, the on-the-ground reality is that it’s not always complete. It’s not always everything that you want, because not all the data is in the data lake and not all the assets that you need are even data. Like I said, there’s code, there’s documentation, there are people, there are applications. There are a lot of assets in an organization, besides data, that one has and wants to leverage to do analysis.

Allison Hartsoe: 05:17 So you’re in charge of this new product. Can you tell us a little bit about it, now that we understand a bit about the pain?

Bob Page: 05:24 Yeah, so because I’ve seen this pain at some very large companies that actually have tried to address it in the past, I’ve had people ask me, you know, the thing that you did back then, is there something new, or is anybody sort of doing that commercially, something I can get? And when they say a thing, it’s kind of like a new category of thing. It’s not a new algorithm for analysis. It’s not a new visualization tool. Think of it more like a virtual analytics hub where it has kind of a view of all the stuff, all the assets across your organization. It doesn’t hold them, it just points to them, and when you need them, it’s got a nice way that allows you to connect to any of these analytics sources to be able to find them, discover them if you will, combine them when you need to, and then publish them back out, and do it all in sort of a collaborative way.

Allison Hartsoe: 06:11 And what is the significance of it being a pointer versus pulling it all in and blending it together?

Bob Page: 06:17 Well, if what you have instead of all the data in one place is the metadata of that stuff, then you’re able to go get the data where it lives, when you need it, and not continually push it out to a central location like a data lake.

Allison Hartsoe: 06:32 Well, and I imagine as you have more and more data that becomes almost like you don’t have enough hours in the day to get it there.

Bob Page: 06:38 Partially true. Yup. I mean, if you have a data lake and it has all the stuff that you need, then great. But even if you had a data lake and it has absolutely everything you need, and you’re not storing anything anywhere else, including on your laptop or anywhere else, then you still have other applications you need to get to. You still have documentation about best practices of how one does work around analytics. You still have people in your organization who are experts in different areas, and you’re not putting all those in the data lake. So you still need some way to sort of know that they exist, index them, and be able to search for them, access them, and then pull them in as necessary.
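[Editor’s note: The idea of a hub that points to assets rather than holding them can be sketched as a small metadata catalog. This is a minimal illustration under our own assumptions, not Neebo’s design or API. The `Asset` and `Catalog` names, the fields, and the example locations are all hypothetical.]

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Asset:
    """Metadata about an analytic asset; the asset itself stays where it lives."""
    name: str
    kind: str        # e.g. "table", "wiki", "application", "person"
    location: str    # a pointer to the source system, never a copy of the contents
    owner: str       # who to ask about it
    tags: frozenset = field(default_factory=frozenset)


class Catalog:
    """A hypothetical index of pointers that supports register and search."""

    def __init__(self):
        self._assets = []

    def register(self, asset):
        self._assets.append(asset)

    def search(self, term):
        """Return assets whose name or tags contain the term (case-insensitive)."""
        term = term.lower()
        return [
            asset for asset in self._assets
            if term in asset.name.lower()
            or any(term in tag.lower() for tag in asset.tags)
        ]


# Example entries; every name and location below is made up for illustration.
catalog = Catalog()
catalog.register(Asset("clv_scores", "table", "warehouse://marketing/clv_scores",
                       "marketing-analytics", frozenset({"clv", "customer"})))
catalog.register(Asset("Robot filtering guide", "wiki", "wiki://analytics/robot-filtering",
                       "web-analytics", frozenset({"clickstream", "robots"})))
```

[A search such as `catalog.search("clv")` only tells you that an asset exists, where it lives, and who owns it. Reading the underlying table would still go through the source system and its permissions.]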

Allison Hartsoe: 07:11 So could this be something like videos and PowerPoints and wikis, and I mean, we’re really talking about not just structured data, clearly.

Bob Page: 07:19 No, we’re not. I mean, you shouldn’t think about this as focused on data assets alone. I’d say it takes a broader view of all the things in your organization that you would consider an analytic asset.

Allison Hartsoe: 07:30 So analytics assets, including expertise or really, gosh, it’s almost mind-blowing to think about. I can see why you put it in a new category because I like the word you used previously about fabric. It’s really this kind of embedded residual knowledge that’s in all these pockets and corners of the organization that might be more difficult to pull in.

Bob Page: 07:51 Yeah, and I think that’s one of the things that I’ve seen at a lot of places, including places where I’ve worked: you don’t always know what has already happened somewhere else or what the tribal knowledge is, or even what applications or what wiki pages or what data systems are up to date, and what tables should I be using, which ones are trusted or blessed by finance. It’s hard to know where to find things. Once you find them, it’s hard to know which ones are current or which ones I should rely on, and when new ones come along, how do I link to them as well.

Allison Hartsoe: 08:22 Yeah. I heard an interesting story around this, but it wasn’t handled with the same kind of tool. It was the city of New York. The chief data officer for the city of New York is responsible for releasing a lot of information to the public because they serve the public good, and so they release things like real estate data and building data and all these different tables. But the problem is when they do that, people get in and they find different ways to put the data together that are not what they would have recommended. And they make assumptions about what a certain column name is, and it becomes a PR nightmare for them because somebody will be like, Oh, I found this amazing conclusion, and then it’s just not true. So they really struggled with this idea. What they did is basically manual documentation to try to get on top of it. But do you have other examples where somebody might have a similar problem and could make use of this?

Bob Page: 09:15 Well, I think as you talk about things like building a data culture or being more data-centric in an organization, that usually leads to this idea of data democratization. Let’s let everybody be data citizens. Let everybody have access to all the data and then do with it what they will. But the problem is exactly what you described. If you don’t understand where that data came from, how it’s generated, what it means, what’s missing from it, et cetera, et cetera, you end up potentially using it in the wrong way. So having a great library in R, for example, for manipulating the data and generating insights and doing a regression and all that doesn’t help any of that. Right? The data’s still what it is. So yeah. So having some context around the data is really, really helpful.

Allison Hartsoe: 10:00 Is it helpful for specific situations? Like do I have to be a big enterprise that really has this data democratization problem? Is that where Neebo really fits?

Bob Page: 10:10 I don’t know if it’s really big, but I would say Joe’s pizza probably doesn’t struggle with this necessarily. But once you get to a place where you’ve got more than a handful of people trying to do analytics, or you’ve got more than a few places where your data lives, or you’ve got more than a few applications that you’re using that are starting to generate insights in one form or another, which could be graphs on the screen or data exports they make available or whatever. Once you start to get more and more of these assets available, and I include the analysts here as assets for the purposes of our discussion, it starts to get complicated, and so having some way to find it, catalog it, document it, combine it, and do it all sort of in a collaborative way becomes really important.

Allison Hartsoe: 10:54 But I haven’t heard you say you need to write it or hit it with Python or SQL or any kind of like, I have to be a coder to understand or pull in this information. Is that right? You don’t have to be a coder.

Bob Page: 11:08 Well, you don’t. So this is really for the business analyst. It’s for somebody who knows how to get their job done, but they need an assist to find these assets, potentially combine them to create new assets, publish them, and do it in a collaborative way. So the system itself that we’re building is a cloud-native SaaS solution. So it’s easy just to log in, and then you can connect to whatever analytic assets that you currently have access to. It respects all the source systems. It doesn’t do anything on the security side that would undermine any security that’s currently in place in your organization. Think of all the tools you use today for analysis. I would say keep using them, right? If they meet your needs, keep using them. We’re not looking to replace that. We’re not looking to provide a great machine learning algorithm or a great visualization tool or whatever.

Bob Page: 11:59 There are lots and lots of them and more coming out every day, and Neebo is saying, great, use all those. That’s fine. It’s not the analysis and the visualization, if you will, of those assets that we are interested in. We’re interested in making sure that you are able to get the raw materials you need and do it in a collaborative way. So think of it as kind of something that lives above all the infrastructure and the analytic assets that you have, including all your tools. Like, say you’ve got some custom-created data mover tool that you can invoke. Great, point to that, right? We just want to make sure that you’ve got a one-stop shop to be able to get to all the resources that you need.

Allison Hartsoe: 12:35 And what I think is so powerful about that meta tool, that super uber tool that you’re describing with Neebo, is that for customer data, we typically see that it’s spread across the organization. It’s a very horizontal operation as opposed to how we think about organizations, usually very vertically. In order to get value and power out of the data, you have these issues of trying to get data from one team or another team. You may have to bring donuts that day or whatever to try to get hold of it. In this model, it seems like you’re really smoothing that out: you don’t have to get it, you just have to be able to see it and point to it.

Bob Page: 13:16 Right. So if somebody from the marketing department has logged into Neebo and said, I’ve got a number of tools and maybe data sources or applications or whatever documents that I want to make available to the rest of the company, then they can just basically enter the information about how to do that. Now you may come along and you’re not in marketing, and you don’t even know that some of these systems or assets exist, and you say, Whoa, look at that. Look at these things I can use. Well, that doesn’t mean you can access them, right? It just means that you know that they exist. You still have to have permission, right, from IT or whatever the governance of your company is. But at least you know now that there’s something there that you might be able to use, and then you can go do the usual thing that one needs to do. We’re not going to circumvent the security of the systems, though. But once you can log into them, then you can utilize them without having to know anything about where they are or who owns or manages them. You just know that, Oh, I have a new set of assets that I can use in my work.

Allison Hartsoe: 14:11 Well, and also not having to rely on somebody else to pull an extract of that data that may or may not be exactly what you want because you didn’t know to tell them something. I think knowing that there’s something there is actually quite powerful. That’s a huge part of the problem.

Bob Page: 14:26 Yes, and if you know who initially connected the system, then you can go ask them for information about it. I noticed that there’s an asset here that looks like this. Is that something that you’d recommend I use in this kind of analysis, for example?

Allison Hartsoe: 14:40 So it’s kind of making data less blind. It’s not just a dumb table. It’s actually a more, it’s like a layer of meta intelligence on top of it.

Bob Page: 14:47 Yes. I’ve been careful not to say data but to say asset, because it could be an application, it could be a document, it could be a number of things that you’re using to do your analysis or to help you do your analysis along the way. For example, suppose you are deep in, say, some CLV analysis, and you realize there’s something that you want to cross-reference this with. Say, I don’t know, geographic information. And you happen to know, I don’t know how you know, maybe because you’ve typed in geographic information or something, and you see that I’ve done a lot of work in that area around geographic segmentation. And so you shoot me a note that says, do you have any, maybe, geo information that you could upload and maybe help me with my analysis? It shouldn’t take too long. And you give me essentially write access to this workspace that you’re building, and then I collaborate with you.

Bob Page: 15:32 Maybe I have something perfectly suitable on my laptop. I can upload it into Neebo, and now it becomes part of the catalog of all of the assets across the organization. We do some blending or whatever we do within our collaborative workspace, and you say, perfect, this is just what I need, and then you’re off to the races, and you’ve created a new asset. We’ve done it collaboratively, and along the way, there’s all this lineage, if you will, of all the things that we did, so that if anybody else wants to copy it, change it, modify it later, or even see how this came about, then that history is all there.

Allison Hartsoe: 16:04 It’s amazing. It’s kind of this hidden problem, like until you get into the mix of it and you’re up to your eyeballs in data, and you can’t understand that asset and the lineage isn’t clear, like until you’ve felt that pain and you realize, Oh my gosh, I’ve started with the wrong dataset. I think it becomes very obvious very quickly. So it seems like you’re ahead of the market when it comes to, yes, people feel this pain, but there’s a lot more pain coming as more and more people become able to use data, able to use these assets, able to or maybe expected to have an analytics measurable point of view in the work that they do. Would that be fair?

Bob Page: 16:44 Well, I certainly hope that we’re not too far ahead of the market. I mean, this is something that I’ve talked publicly about for a while. I actually found a presentation that I gave in 2008 talking about some of the problems we had around analysts producing analyses, you know, whether it be PDFs or PowerPoints or whatever, and then not being able to share them in any way. Where do I publish these? How can they be indexed? You could think of it as simple knowledge management back then, but it has continued to be an issue beyond just, I finished an analysis. Now it’s like the whole pipeline of analysis.

Allison Hartsoe: 17:15 Yeah, that makes sense. Now I think I saw on your site that there’s actually a demo, and you can actually download it and get into the tool. Can you talk a little bit more about where that is and where people go to check it out?

Bob Page: 17:28 Well, you can go to neebo.ai N E E B O dot AI to get a sense of what we’re doing. As you and I talk right now at the end of September 2019, we haven’t announced the product yet. So it’s sort of special for your listeners that this exists.

Allison Hartsoe: 17:43 Secret.

Bob Page: 17:44 Yes, and we’re looking at a release date of probably January of 2020, but we’re so excited about it that it’s hard to keep wraps on it and so that’s why I’ve been happy to chat with you about what we’re doing and what our vision is. If you go there to that website and you think it looks like something that you might want to kick around or test drive, there is a call to action button on there that you can click and have somebody contact you and see if it makes sense for us to kick the tires together.

Allison Hartsoe: 18:11 Are you looking for beta testers, and do you want more feedback?

Bob Page: 18:14 Yeah, we’d love it. I’d absolutely love it. You did mention one thing, you said something about downloading the application. I should say this is in the cloud. There are no downloads, and you can run it from anywhere. It just lives in the cloud.

Allison Hartsoe: 18:24 Yes, you’re right, and it’s just dating me from the idea that everything had to be downloaded. No, you’re right. It’s accessed in the cloud. It’s a good call out. Thank you. And if somebody wanted to reach out to you directly, what’s the best way for them to get in touch?

Bob Page: 18:38 They could probably just send me an email at bob.page@neebo.ai

Allison Hartsoe: 18:44 neebo.ai. So again, that’s N E E B O dot AI. As always, links to everything we discussed are at ambitiondata.com/podcast, and I will link out to the particular page that Bob referenced. Thank you Bob so much for joining us today. It’s really exciting to kind of be in this like secret release. This is a real compliment.

Bob Page: 19:05 Thanks Allison. I appreciate the opportunity to let you know what I’m up to.

Allison Hartsoe: 19:08 Remember everyone, when you use your data effectively, you can build customer equity. This is not magic. It’s just a very specific journey that you can follow to get results. Thank you for joining today’s show. This is your host, Allison Hartsoe, and I have two gifts for you. First, I’ve written a guide for the customer-centric CMO, which contains some of the best ideas from this podcast, and you can receive it right now. Simply text ambitiondata, one word, to three one nine nine six (31996), and after you get that white paper, you’ll have the option for the second gift, which is to receive The Signal. Once a month, I put together a list of three to five things I’ve seen that represent customer equity signal, not noise, and believe me, there’s a lot of noise out there. Things I include could be smart tools I’ve run across, articles I’ve shared, cool statistics, or people and companies I think are making amazing progress as they build customer equity. I hope you enjoy the CMO guide and The Signal. See you next week on the Customer Equity Accelerator.