Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank-you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author >

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMI Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

Hello BI Community. It's been quite a number of years since I last blogged on this platform; sorry for such a long silence. I've been off building Data Vault communities around the world. If you aren't familiar with the Data Vault Model, Methodology, and Architecture for Data Warehousing and Business Intelligence, then wander over to http://YouTube.com/LearnDataVault for a better look.

I'm here to chat with the executives, directors, and managers out there. There is HUGE value in Data Vault 2.0. In this entry I'll give you a short introduction to Data Vault.

What is Data Vault in the first place?
Data Vault is a "Common Foundational Warehouse Modeling Architecture". In other words: it is a system of business intelligence.

What is Data Vault 1.0?
Data Vault 1.0 is just a Data Vault Model: a hub-and-spoke model, to be exact, focused on scalability, flexibility, and extensibility. The model is aimed squarely at relational database management systems (RDBMS), and uses sequence numbers as its primary keys for join performance.

Why Data Vault 2.0?
The market is moving and shifting. We all see it: Hadoop, NoSQL, Big Data, Unstructured Data, Multi-structured Data, Hive, MapReduce, "R", and then vendors like Cloudera, MapR, Hortonworks, etc. But it's not just the data storage that is shifting. In the implementation space we have maturity cycles, best practices, and "better, faster, cheaper", resulting in Agile, SCRUM, Six Sigma, TQM, and CMMI for Data Warehousing and BI.

Data Vault 2.0 expanded first with the model. The data model needed to shift. Nearly every "MPP platform" will hash data sets in order to distribute them for parallel operations across the MPP environment. Sequence numbers cause bottlenecks and dependencies that cost too much in load cycles, and they cause hot-spots at query time when getting data out.

Furthermore, sequences are not cross-platform, meaning they don't work seamlessly in Hadoop or NoSQL environments. So the first thing that has been done is a change to the Data Vault Model: sequences have been replaced with Hash Keys.
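For the technical folks on your team, here is a rough sketch of the idea in Python. It is only an illustration under my own assumptions, not a prescribed implementation: the function name hash_key, the choice of MD5, the "|" delimiter, the trim-and-uppercase normalization, and the column names are all hypothetical choices made for the example. The point it shows is that two separate load processes can each compute the same key from the business key alone, without waiting on a shared sequence generator or a lookup against the parent table.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "|") -> str:
    """Derive a deterministic key from one or more business key parts.

    Assumed rules for this sketch: trim whitespace, upper-case each part,
    join the parts with a delimiter, then take the MD5 hex digest.
    """
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Two independent loaders (hypothetical rows) compute the key on their own.
# Neither one needs to ask a database for "the next sequence number".
hub_row = {"customer_hk": hash_key("cust-1001"), "customer_id": "cust-1001"}
sat_row = {"customer_hk": hash_key(" CUST-1001 "), "name": "Example Customer"}

# Both rows carry the same key, so they can be joined after parallel loads.
print(hub_row["customer_hk"] == sat_row["customer_hk"])
```

Because the key depends only on the business key and the agreed rules, the same calculation can run on a relational platform, in Hadoop, or in a NoSQL store, as long as every platform applies identical normalization.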

Ok, but what about the other components?
Data Vault 2.0 has expanded, and now includes practitioner best practices, i.e., implementation and methodology pieces. We have taken the best practices from CMMI, Six Sigma, TQM, Managed Self-Service BI, SCRUM, and Agile, and applied them with major impact to the process of building business intelligence systems; the end results speak for themselves. We now have teams of 5 or more people (around the world) answering business requirements in a mature build cycle consisting of 2-week sprints because of Data Vault 2.0.

Another question arose: the hybrid NoSQL environment is just now emerging, and the technology is brand new and shifting ALL the time (every 3 months or so there are fundamental core changes). How do you leverage this technology in conjunction with your existing investment in RDBMS (relational technology)? In other words, how do you BUILD a hybrid solution?

Obviously the last thing you want to do is drop your entire investment in relational technology that you just finished paying for over the last 5 years...

The Data Vault 2.0 architecture works together with the Data Vault 2.0 model to help you achieve these goals. You can leverage the architecture, the model changes, and the implementation best practices to "build out" a Hadoop (or vendor-provided) solution alongside your current relational platform.

In the end, Data Vault 2.0 is a very powerful tool. It can really make your enterprise BI projects work successfully, and it provides your team with the agility and scale they need to solve problems NOW, with interaction points across Agile 2-week deliveries so that you can correct the course of your efforts along the way. Don't take my word for it; read the success story I have available here: http://www.datavaultcertification.com/

It discusses a Kick Start package that I completed with QSuper (Queensland Super Fund Investments) for Australian Government workers.

If you want to know more please contact me through http://LearnDataVault.com

Thanks,
Dan Linstedt

Posted August 11, 2014 2:55 PM