
Evolution of the Production Analytic Platform: Bridging the Informational Divide

Originally published February 27, 2018

Ron Powell, independent analyst and industry expert for the BeyeNETWORK and executive producer of The World Transformed FastForward Series, interviews Dr. Barry Devlin, founder and principal for 9sight Consulting, to discuss the evolution of the production analytic platform that will bridge the operational/informational divide that exists today. Ron’s focus is on BI, analytics, big data and data warehousing.

Barry, first the data warehouse back in the 1980s, more recently business “unintelligence,” now production analytic platform. You seem to like drawing architectures. Why?

Barry Devlin: Thanks, Ron, for the reminder of my age! Is it really back in the ’80s? Yes, I’m afraid it is, and I’m still doing architectures after all these years.

Architectures are really important to me. Back in the ’80s, it was a question of trying to figure out how a key new paradigm shift could be supported: how to do consistent decision making when data is coming from many different platforms. Back in those days, of course, there were just different applications on a mainframe, but it was still a lot of data coming from different places. To me, architecture is really important for anything that involves a large cross-enterprise or cross-functional project. The architecture sets the scene – it sets boundaries and expectations – first for the business and then for delivery.

That’s vital because today, with modern time pressures and with vendors always pushing to sell solutions, we often end up rushing to make a product selection. I think the data lake is a very interesting example of how that has happened: in many of the data lakes I’ve seen, the architectures are actually “productologies.” They are a list of products rather than a list of functions or a list of what we need to do in order to reach a particular business goal. With a big project – whether it’s a lake, warehouse or production analytic platform – I think we must start from an architecture that sets the scene, allows the business people to understand what they’re getting, and lets us have that conversation with them before going into the implementation phase.

The architecture is like a foundation that you build something on – much like a house, right?

Barry Devlin: Absolutely. We know how important it is to have the architecture for our house first because otherwise we end up with something that doesn’t really do the job we need as a family or as a person living in the house.

Can you explain what is important about this new production analytic platform?

Barry Devlin: Yes, I can, but let’s go back to data warehousing and before. Data warehousing was the result of a concept that we came up with back in the 1970s: splitting the operational and informational worlds from one another. That made a lot of sense at the time because business was a lot slower then. We didn’t need decisions to be made at the speed of light. We didn’t have this idea that the business wanted to see every change and every transaction. I remember that in the early days of warehousing in the 1980s, the business folks were saying, “I don’t even want to see the detail. I’m happy to see the end of day or even the end of week (in some cases) because I can’t handle the chopping and changing that happens in between.”

But that was then and this is now. Technology has changed to allow much faster decision making. So this split between informational and operational systems started to cause problems, maybe as far back as ten years ago. We see these problems in operational analytics and in the Internet of Things, where we end up having to shuttle data, models, and inputs and outputs back and forth between the two systems. When we do that, we build in delays, we create data management problems, and we have a lot of issues to deal with. The production analytic platform is trying to address that. This is a problem I’ve been thinking about for a few years, and then I had some interesting conversations with my friends at Teradata, who were telling me about all of the new analytic and other types of functions that they’ve been adding and are continuing to add to their database.

Now the Teradata database is an interesting starting point because, yes, it is a data warehouse, but it is a very fast, very powerful and very production-oriented data warehouse or database. So it was clear to me that a solution to this operational/informational divide would be to create a production analytic platform built upon the strengths of traditional relational databases like Teradata or DB2. The basic idea is to build upon the features and functions of a traditional relational database in order to begin to do more advanced analytics.

What are the key characteristics of the production analytic platform?

Barry Devlin: There are seven or eight I could list, but I’ll talk about some of the key things in this platform. We’re talking about a production analytic platform based on a relational database as the cornerstone of this architecture. Then, what we really need is embedded support for a wide range of query, reporting and advanced analytic functions and for the data that is needed to support them. That means different data formats: JSON and CSV as well as the relational formats. The built-in storage, the built-in support and the built-in ability to perform all of those functions on those types of data become really important. Another key is scalability and, of course, that’s central for the data lake as well; but scalability here is about the volumes necessary for production use of the data that’s coming from the Internet of Things. You know, the initial analytical use may be bigger, but I think there’s a balance we need to reach between the different levels of size, speed and ability to handle these. There are a few others as well, but all of them are listed in my “ThoughtPoint.”
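
As a rough illustration of handling JSON, CSV and relational data inside one relational engine, the sketch below loads a JSON feed and a CSV file into tables and answers an analytic question with a single SQL query. It is only a sketch: SQLite from the Python standard library stands in for a production database such as Teradata or DB2, and the sample data, table names and the functions `load` and `average_temp_by_site` are hypothetical, not taken from the interview.

```python
import csv
import io
import json
import sqlite3

# Hypothetical inputs: IoT-style readings arrive as JSON, device metadata as CSV.
JSON_READINGS = (
    '[{"device": "pump-1", "temp_c": 71.5},'
    ' {"device": "pump-2", "temp_c": 64.0},'
    ' {"device": "pump-1", "temp_c": 75.5}]'
)
CSV_DEVICES = "device,site\npump-1,Dublin\npump-2,Cork\n"


def load(conn):
    """Parse both formats and store them as ordinary relational tables."""
    conn.execute("CREATE TABLE readings (device TEXT, temp_c REAL)")
    conn.execute("CREATE TABLE devices (device TEXT PRIMARY KEY, site TEXT)")
    # json.loads yields a list of dicts, which bind to the named placeholders.
    conn.executemany(
        "INSERT INTO readings VALUES (:device, :temp_c)",
        json.loads(JSON_READINGS),
    )
    # csv.DictReader yields one dict per CSV row, keyed by the header line.
    conn.executemany(
        "INSERT INTO devices VALUES (:device, :site)",
        csv.DictReader(io.StringIO(CSV_DEVICES)),
    )


def average_temp_by_site(conn):
    """Join the two sources and aggregate in the database itself."""
    cursor = conn.execute(
        "SELECT d.site, AVG(r.temp_c) "
        "FROM readings r JOIN devices d ON d.device = r.device "
        "GROUP BY d.site ORDER BY d.site"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
load(conn)
result = average_temp_by_site(conn)
print(result)
```

The point of the design is that the data never leaves the database between loading and analysis, so there is no shuttling of extracts between an operational system and a separate analytic one.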

Some people are saying that the data lake will kill the data warehouse, but you are suggesting that the production analytic platform is built out from the data warehouse. Why not from the data lake?

Barry Devlin: The data lake is different from the data warehouse. At its best, it is more of an informational than an operational environment. The name “production analytic platform” puts “production” first because that’s what is important: it’s about taking analytics into production. When I look at data lakes and the platforms on which data lakes are built, I think they’re still fairly immature and don’t have the same levels of reliability and maintainability – production values – that I would see in traditional relational databases. That may change, and data lake products may eventually offer those characteristics, but it will take time. Since these characteristics are at the heart of traditional relational databases – again like Teradata’s database – I think it is much easier to incorporate the production aspect of analytics in a platform that already has them. If I start from the data warehouse, which was always an informational construct, and extend it back into the operational world – a little like how we added the operational data store into the data warehouse architecture – then we’ve got a very good platform for sharing data between the data lake and the data warehouse environment and for implementing production analytics. To me, the data warehouse is the best place to do that. But, of course, the data warehouse is my baby so I’d have to say that.

One of the biggest advantages with your architecture for most enterprises is that they can just utilize the same applications and programs they are using, which makes it a lot easier for them to implement this platform, correct?

Barry Devlin: Yes, by now the data lake has been around in some form for six or seven years. So enough has been done on data lakes that one could say that either the data lake or the data warehouse could be a good starting point. My point of view and my experience with relational databases tell me that building on the longevity of the relational database – they’ve been around since the 1970s – is the way to go because a lot of reliability, maintainability and so on has been built into those databases. And I think it’s good to reuse that. I think that functionality will have to be built into the data lakes in the future, but I don’t think it’s there yet.

Thank you, Barry, for discussing with us how the production analytic platform will be able to meet the current operational and informational needs of today’s enterprises.
  • Ron Powell
    Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also executive producer of The World Transformed FastForward Series. In 2004, Ron founded the BeyeNETWORK, which was acquired by TechTarget in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.

    More articles and Ron's blog can be found in his BeyeNETWORK expert channel.
