Thank you, everyone, for the great feedback so far. Let's keep going down this track until someone says it simply isn't possible. Why? Because, as many of you know, I like to jump out beyond the horizon to see what might be done "outside the box," and if there's even a remote chance an idea will take hold (because of what we see happening), then great! If not, let's ditch it in favor of something better... I must invent, but my ideas build on the work of many other individuals in the industry.
I blogged about the temperature of data a while back, and recently I've been exploring metadata. Now I ask: what about the temperature of metadata as it relates to the impact a change would have? What about notions like Architecture Mining? Metadata Mining? Correlation analysis on structures? Are these things too far out to bend the brain on?
I think not. The time is coming when the next stage to reach (beyond active/real-time) is Dynamic Data Warehousing. I've borrowed my good friend Stephen Brobst's diagram of the five stages of Data Warehousing and added a sixth (see below).
After all, this is the peak of what we are trying to reach: dynamic adaptation to the business; as the business changes, so do all the systems (especially the integration systems). When we look at the end results of implementing SEI/CMMI Level 5, ISO 9001/9002/9003, PMP best practices, ITIL documents, ISACA audits, COBIT controls, or good governance, they all reach a similar conclusion about processing: once processing routines are built, optimized, and well established, they run seamlessly day in and day out, repeatably, in the back office until new processes or new structures must be introduced.
Automation is often left out when people talk about these kinds of projects, yet automation is really a fundamental goal of IT to begin with. We should be constantly thinking of new ways to automate repeatable and consistent processes. With this in mind, why can't the structures of the data warehouse (metadata, metrics, the data itself, unstructured data, indexes, queries, code, etc.) all be subjected to the same repeatable rules? Even code has a finite execution sequence defined by compiler architectures.
Now, let's take the great leap off the edge of the horizon for a moment... What happens to systems when the business changes? What happens to architectures, or Data Models to be precise? What happens to the business processing (code) built on the source capture systems? They all change - but do they change consistently and repeatably, and can the effect of the change be measured before it is implemented?
Most of the time (today) the answer is no: some level of human intelligence must figure out where the impact is, how big it is, and where the changes need to be made, and then someone architects a patch, a band-aid, a new section of code, or a complete rewrite to serve the need. This _process_ happens over and over and over again: cost, risk, and mitigation analysis. We can safely assume that most changes happening within standard operations can be "graded" according to location, risk, and impact. We can safely say that impact measurement can be determined automatically by examining METADATA, or ontologies of metadata that define the pre-existing relationships and associations of that metadata for our tools.
When I see a new element added to a table in a source database, I typically assume it must be related to the key of the table it was added to; otherwise they would not have put the attribute there in the first place. Therefore there is no reason I shouldn't be able to construct a program that recognizes the new element and where it appears on the source, and can "grade" it according to its impact and our downstream models' ability to handle the new element dynamically.
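To make the idea concrete, here is a minimal sketch of such a program. It diffs two schema snapshots to find newly added columns, then grades each one by how many downstream structures reference the table. All table, column, and object names here are illustrative assumptions, not from any real system, and a real implementation would read schemas from a metadata repository rather than dictionaries.

```python
def detect_new_columns(previous_schema, current_schema):
    """Return {table: [new_column, ...]} by diffing two schema snapshots."""
    changes = {}
    for table, columns in current_schema.items():
        known = set(previous_schema.get(table, []))
        added = [c for c in columns if c not in known]
        if added:
            changes[table] = added
    return changes

def grade_impact(changes, downstream_refs):
    """Grade each new column: 'low' if at most one downstream object reads
    the table, 'high' if several do (more structures must adapt)."""
    grades = {}
    for table, added in changes.items():
        readers = downstream_refs.get(table, [])
        for column in added:
            grades[(table, column)] = "low" if len(readers) <= 1 else "high"
    return grades

# Illustrative snapshots: the source system grew a 'loyalty_tier' column.
previous = {"customer": ["customer_id", "name"]}
current  = {"customer": ["customer_id", "name", "loyalty_tier"]}
refs     = {"customer": ["dim_customer", "rpt_sales", "mart_load"]}

changes = detect_new_columns(previous, current)
print(grade_impact(changes, refs))  # {('customer', 'loyalty_tier'): 'high'}
```

The grading rule here is deliberately naive; the point is only that once the metadata (schemas plus downstream references) is machine-readable, the grade falls out of a mechanical comparison.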
In other words, much like applying temperature to data, I maintain that Architecture Mining (or metadata mining, or mining of ontology trees) can produce mathematical results that apply temperature ratings to the impact of schema changes. What I am saying is that new attributes appearing in XML, XSD, XSL, objects, web services, or table structures can be run through these algorithms and assigned a green, yellow, or red flag based on the confidence that the change is "easy with low impact," "somewhat challenging, but we are confident," or "too difficult to achieve without human intervention."
I further assert that these temperature ratings can be placed on a gradient scale so that alarms aren't raised unnecessarily. Like any A.I. or neural network, it would have to be trained, and occasionally re-trained or corrected; there might be the occasional false positive to deal with, but that sure beats adding 150 attributes to a table by hand just because the ERP coder was up all night implementing them in the source system.
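A sketch of that gradient scale, under assumed thresholds: a confidence score between 0.0 and 1.0 (however the impact analysis produces it) maps onto the three flags. The threshold values are pure assumptions and are exactly the knobs that training or re-training would adjust over time.

```python
# Assumed cutoffs on a 0.0-1.0 confidence gradient; these would be
# tuned (re-trained) as the system observes real change outcomes.
GREEN_THRESHOLD = 0.85   # "easy with low impact"
YELLOW_THRESHOLD = 0.50  # "somewhat challenging, but we are confident"

def temperature_flag(confidence):
    """Map an impact-confidence score to a green/yellow/red temperature."""
    if confidence >= GREEN_THRESHOLD:
        return "green"
    if confidence >= YELLOW_THRESHOLD:
        return "yellow"
    return "red"  # "too difficult to achieve without human intervention"

print(temperature_flag(0.92))  # green
print(temperature_flag(0.60))  # yellow
print(temperature_flag(0.30))  # red
```

Because the scale is a gradient rather than a binary alarm, borderline changes land in yellow for review instead of halting the load outright.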
This is only one piece of Dynamic Data Warehousing: the structural-change adaptation component. I don't think we can achieve true DDW without this component working first. Once the structures are changing seamlessly, we can begin to work on automating the adaptation of views, reports, mart loads, and processing routines.
Do you have any futuristic thoughts about DDW? I'd like to hear them.
Posted June 7, 2007 9:46 PM