Blog: Ronald Damhof

Ronald Damhof

I have been a BI/DW practitioner for more than 15 years. In the last few years, I have become increasingly annoyed - even frustrated - by the lack of (scientific) rigor in the field of data warehousing and business intelligence. It is not uncommon for the knowledge worker to be disillusioned by the promise of business intelligence and data warehousing because vendors and consulting organizations create their "own" frameworks, definitions, super-duper tools etc.

What the field needs is more connectedness (grounding and objectivity) to the scientific community. The scientific community needs to realize the importance of increasing their level of relevance to the practice of technology.

For the next few years, I have decided to attempt to build a solid bridge between science and technology practitioners. As a dissertation student at the University of Groningen in the Netherlands, I hope to discover ways to accomplish this. With this blog I hope to share some of the things I learn in my search and begin discussions on this topic within the international community.

Your feedback is important to me. Please let me know what you think. My email address is Ronald.damhof@prudenza.nl.

About the author

Ronald Damhof is an information management practitioner with more than 15 years of international experience in the field.

His areas of focus include:

  1. Data management, including data quality, data governance and data warehousing;
  2. Enterprise architectural principles;
  3. Exploiting data to its maximum potential for decision support.

Ronald is an Information Quality Certified Professional (International Association for Information and Data Quality; one of the first 20 to pass this prestigious exam), a Certified Data Vault Grandmaster (the only person in the world to have this level of certification), and a Certified Scrum Master. He is a strong advocate of agile and lean principles and practices (e.g., Scrum). You can reach him at +31 6 269 671 84, through his website at http://www.prudenza.nl/ or via email at ronald.damhof@prudenza.nl.

December 2009 Archives

I have just read a very intriguing paper called 'A Common Approach for OLTP and OLAP using an In-memory Column Database', written by Hasso Plattner.

It's not a revolutionary new technical approach for Data Warehousing and Business Intelligence. It's a series of smaller innovations (mostly technical, and some of them quite old) that together could lead to a paradigm shift [1] in the area of Data Warehousing.

The paper focuses on the transactional world, because that's where the disruption will originate. In short:

  • Ever increasing multi-CPU cores
  • Growth of main memory
  • The I/O revolution - SSDs
  • Column databases for transactions (!)
  • Shared Nothing approach
  • In-memory access to actual data - historic data on slower devices (or not)
  • Zero-update strategies in OLTP (recognizing the importance of history as well as the importance of parallelism)
  • Not in the paper, but I see data models for newly built OLTP systems increasingly resembling the data models of the HUB in the data warehouse architecture.
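
To make the column-database and zero-update bullets a bit more tangible, here is a minimal sketch in Python. It is not taken from Plattner's paper: the class `InsertOnlyColumnTable`, its method names and the integer logical timestamps are all hypothetical, and the whole thing assumes one small in-process table. It only shows the idea that a "change" is stored as a new row version, so history is kept for free and a query can ask for the state at any point in time.

```python
class InsertOnlyColumnTable:
    """Toy column-oriented table that never updates in place (hypothetical)."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}  # one list per column
        self.key = []         # business key of each row version
        self.valid_from = []  # logical timestamp at which the version starts

    def put(self, key, ts, **values):
        """Record a new version of `key`; older versions stay untouched."""
        self.key.append(key)
        self.valid_from.append(ts)
        for name, col in self.columns.items():
            col.append(values.get(name))

    def as_of(self, ts):
        """Return {key: row_index} of the latest version valid at time `ts`."""
        latest = {}
        for i, (k, t) in enumerate(zip(self.key, self.valid_from)):
            if t <= ts and (k not in latest or t >= self.valid_from[latest[k]]):
                latest[k] = i
        return latest

    def sum_as_of(self, column, ts):
        """Aggregate one column by scanning it; no index or stored aggregate."""
        col = self.columns[column]
        return sum(col[i] for i in self.as_of(ts).values())


orders = InsertOnlyColumnTable(["amount", "status"])
orders.put("order-1", ts=1, amount=100, status="open")
orders.put("order-2", ts=2, amount=50, status="open")
orders.put("order-1", ts=3, amount=120, status="shipped")  # a "change" is an insert

total_now = orders.sum_as_of("amount", ts=3)   # uses the newest version of order-1
total_then = orders.sum_as_of("amount", ts=2)  # the state before the change
```

Because nothing is overwritten, the same table answers both the "actual" and the "historic" question, which is exactly what a data warehouse normally has to reconstruct after the fact.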

Modern Data Warehouse and Business Intelligence architectures incorporate all of the technologies and methods mentioned above (well, they should!), and such an architecture therefore compensates for the weaknesses that are intrinsic to OLTP when it comes to OLAP.

The paper acknowledges these technologies and methods and applies them in an OLTP context. The amazing thing is that the OLTP system gets a lot faster and requires a lot less system maintenance (no indices, no aggregates, no materialized views whatsoever, huge compression factors, etc.).
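
As a rough illustration of why compression and the absence of stored aggregates can go together, the sketch below uses dictionary encoding, a compression technique commonly found in column stores. The function names `dictionary_encode` and `group_sum` and the sample data are made up for this post; a real engine does this in a far more sophisticated way, so treat it as a sketch under those assumptions.

```python
def dictionary_encode(values):
    """Replace each value by a small integer code; return (dictionary, codes)."""
    dictionary = sorted(set(values))
    position = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [position[v] for v in values]


def group_sum(dictionary, codes, measures):
    """Sum `measures` per distinct value with one pass over the encoded column."""
    totals = [0] * len(dictionary)
    for code, m in zip(codes, measures):
        totals[code] += m
    return dict(zip(dictionary, totals))


# Two columns of the same (hypothetical) sales table.
country = ["NL", "DE", "NL", "NL", "DE", "US"]
revenue = [10, 20, 30, 40, 50, 60]

dictionary, codes = dictionary_encode(country)
by_country = group_sum(dictionary, codes, revenue)
```

The repeated strings are stored once and the column itself shrinks to small integers. The aggregate per country is computed at query time by scanning that compact column, so there is no aggregate table to build or keep in sync.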

But maybe more interesting: what's the use of a data warehouse if the OLTP world adopts these technologies and methods? Well, the case for a data warehouse becomes thinner. At least for data warehouses as we know them: a materialized store of (historic and actual) data loaded from various sources.

Simply put: with the above mix of technologies and methods we are able to stop propagating data. In other words, we can just leave the data where it was initially created. The data warehouse will then focus on the metadata (which becomes hugely important!), the business rules part (although advances in this area are also big), the integration part and the fit-to-task part (making the data suitable for analytics, reporting, risk management, etc.). Oh, and data latency is non-existent.
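
A small Python sketch of what "leaving the data where it is" could look like. Everything here is hypothetical: the two source lists stand in for live OLTP tables, and `MAPPINGS` stands in for the metadata that describes how each source maps onto a shared model. The integrated view is computed at query time instead of being loaded into a separate store.

```python
# Two hypothetical source systems, each with its own column names.
crm_customers = [{"cust_id": 1, "nm": "Acme"}]
erp_customers = [{"customer_no": 2, "customer_name": "Globex"}]

# Metadata: how each source maps onto the shared (integrated) model.
MAPPINGS = [
    (crm_customers, {"customer_id": "cust_id", "name": "nm"}),
    (erp_customers, {"customer_id": "customer_no", "name": "customer_name"}),
]


def customers():
    """Virtual integrated view: rows are read from the sources at query time."""
    for source, mapping in MAPPINGS:
        for row in source:
            yield {target: row[src] for target, src in mapping.items()}


before = list(customers())
crm_customers.append({"cust_id": 3, "nm": "Initech"})  # change at the source
after = list(customers())  # the new customer is visible immediately
```

No data was copied, so there is nothing to refresh: the only thing the "warehouse" owns in this sketch is the mapping metadata.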


Data architecture - truly independent of its task (whether it's transactional or informational). Could I live to see that?

I advise people to read the paper by Lyytinen and Rose as well [1]:

Architectural innovations stand out as creative acts of adapting and applying latent technologies or potential to previously unarticulated user needs (Abernathy and Clark 1985). They radically deviate from an established trajectory of performance improvement, or redefine what performance means in a given industry (Christensen and Bower 1996). They are radical (Zaltman et al. 1977) in that they significantly depart from existing alternatives and are shaped by novel, cognitive frames that need to be deployed to make sense of the innovation (Bijker 1987). Consequently, disruptive innovations are truly transformative (Abernathy and Clark 1985). To become widely adopted, disruptive architectural innovations demand provisioning of complementary assets in the form of additional innovations that make the original innovation useful over its diffusion trajectory (Abernathy and Clark 1985; Teece 1986). By doing so, disruptive innovations destroy existing competencies (Schumpeter 1934) and break down existing rules of competition.

I believe for the industry of data warehousing the above might apply. Nowadays, the new technologies and methods mentioned are increasingly used in the Data Warehouse and Business Intelligence scene. When it hits the OLTP scene it will radically change Data Warehousing and Business Intelligence as we know it.

How long will it take? Well, the last part of the above quotation from Lyytinen and Rose suggests that real absorption might be slowed down considerably:

To become widely adopted, disruptive architectural innovations demand provisioning of complementary assets in the form of additional innovations that make the original innovation useful over its diffusion trajectory (Abernathy and Clark 1985; Teece 1986). By doing so, disruptive innovations destroy existing competencies (Schumpeter 1934) and break down existing rules of competition.

SAP, Oracle and all other vendors of OLTP applications have their work cut out for them. But I know that these guys are working hard... just listen to the SAP folks at their last summit.


[1] Lyytinen and Rose (2003), The Disruptive Nature of Information Technology Innovations, MIS Quarterly.


Posted December 12, 2009 4:13 AM