Blog: Dan E. Linstedt« Necessary Shifts in the Industry | Main | Why do business changes impact my EDW so much? » Federated Star Schemas Going Super-NovaAre you caught in the explosion of your life? Have you gotten to a point where federated star schemas aren't cutting it for Enterprise Data Warehousing? Is data volume pushing impacts on the star that can't be handled? What does a change to your single enterprise conformed dimension cost? How much time does it take, and how many impacts are there to integrate new systems or more data to your existing federated system? Star schemas are GREAT ARCHITECTURES for solving subject oriented answers, they were and are not designed to be enterprise data warehouses! They get to a certain point where they get in the way of being nimble, creating workable solutions for the business, or constrain the business (because of cost or time to implement - due to impacts of changes). The business is now caught, in a super-nova exploding star. Let's take a look to see what happened, and what we can do to "fix" this situation. Super-nova star schemas, ok: how did we get here? Ok, but I don't have a super-nova on my hands, my federated data warehouse is working just fine... Why should I look at this? Signs of an impending super-nova: * Business is constantly arguing over "what the conformed dimension means", causing either dimension-itis (too many dimensions), or duplicate copies of the data in the dimension, to be slightly tweaked to meet "individual needs." * Business is constantly building their own spread-marts to answer their questions because I.T. is now "frozen" by an inability to change the federated mart (too many users, too many dependencies, too high a cost, too large an impact). * Business is constantly building their own spread-marts because I.T. won't agree to provide data in a SUBJECT ORIENTED BUSINESS FOCUSED PURPOSE, because supposedly "it's already available in the enterprise federated star." * 160 to 300 columns "appear" in a dimension, making it near impossible to manage or to provide full value to the business going forward. * "Copies" of the federated stars have propogated through-out the enterprise and then have been subsequently altered to meet specific business needs, now - the answers from those different federated stars no longer are consistent. * The star has "JUNK" and "HELPER" tables embedded all over the place. What is a JUNK table anyway? And if it's JUNK, why is it in my model? Helper table? What's it helping? is it a band-aid for the architecture? * Volumes have outgrown hardware capabilities, pushing the purchase of an appliance which only staves off the inevitable. * Real-time processing conflicts with the batch processing * Real-Time processing forces UPDATES to the data set, or causes orphaned facts, or worse yet, orphaned dimensions with keys that simply never make it to the rest of the business, and cannot be used. * When was the last time that AUDITABLE and TRACEABLE data could be produced from your federated star-schema? Can you reproduce the source system data by reversing the algorithms (mathematically) to answer an auditors questions? Well.... Hmmmmm... What on earth are you talking about? Would you take a Porsche 911 and weld a trailer hitch to the back-end, to haul a camper up a 30 degree incline in 90 degree (farenheight) weather? COULD YOU DO THIS? YES, you can modify the ARCHITECTURE and adapt the car to actually do this, but what would happen? In this situation, you probably would blow out the motor right? Any time you "adapt" an architecture, or apply band-aids to make it fit where it wasn't designed to fit, it inherits limitations and eventually "blows apart." (the wheels fall off, the band-aids don't stick anymore, it becomes un-managable). Likewise would you take a Big-Rig, chop the top, leave the frame and motor in place, place a new fiberglass body on it and try to win an auto-race? No... Could you? Yes you could, but it doesn't make sense. The Data Vault (Common Foundational Enterprise Architecture) provides this capability as a mathematically designed solution built to suit the needs of enterprise data warehousing. Star Schemas WERE and ARE designed to be a SUBJECT ORIENTED ANSWER DATABASE for use in OLAP solutions... They work wonders in pivots, and OLAP type processing, and yes - the architecture is valuable when applied appropriately, but for our sakes, don't apply the modeling paradigm to be your enterprise data warehouse. Ok, so what is available? What can I do to stop the Super-Nova? I have one that I might end-up with, can I prototype something that will keep me from getting there? With a Data Vault as the EDW, you no longer need Type 2 Dimensions to house history. Also the need for Conformed Dimensions is no longer an absolute necessity, you can do it where it makes sense. It releases the impacts of making changes because "everyone and everything" is using a particular conformed dimension. Remember, like 3rd normal form or star-schema there is a full approach to using, building, implementing a Data Vault. More on that next time. Cheers, |