I'm here to chat with the executives, directors, and managers out there. There is HUGE value in Data Vault 2.0. In this entry I'll offer a short discussion that introduces Data Vault to you.
What is Data Vault in the first place?
Data Vault is a "Common Foundational Warehouse Modeling Architecture". In other words: it is a system of business intelligence.
What is Data Vault 1.0?
Data Vault 1.0 is just a Data Vault Model (a hub-and-spoke model, to be exact) focused on scalability, flexibility, and extensibility. The model is aimed squarely at relational database management systems (RDBMS), and uses sequence numbers as its primary keys for join performance.
Why Data Vault 2.0?
The market is moving, shifting. We all see it: Hadoop, NoSQL, Big Data, Unstructured Data, Multi-structured data, Hive, MapReduce, "R", and then vendors like Cloudera, MapR, Hortonworks, etc. But it's not just the data storage that is shifting. In the implementation space we have maturity cycles, best practices, and the push for better, faster, cheaper, resulting in Agile, SCRUM, Six Sigma, TQM, and CMMI for Data Warehousing and BI.
Data Vault 2.0 expanded first with the model. The data model needed to shift. Nearly every "MPP platform" will hash data sets in order to distribute them for parallel operations across the MPP environment. Sequence numbers create bottlenecks and dependencies that cost too much in load cycles, and they cause hot spots at query time when getting data out.
Furthermore, sequences are not cross-platform, meaning they don't work seamlessly in Hadoop or NoSQL environments. So the first thing that has been done is a change to the Data Vault Model: sequences have been replaced with Hash Keys.
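For the technically curious, here is a minimal sketch of what replacing a sequence with a Hash Key can look like. It is an illustration only: the function name hub_hash_key, the choice of MD5, the "||" delimiter, and the trim-and-uppercase normalization are my assumptions for this example, not a prescribed part of the standard. The point it shows is that the key is computed from the business key itself, so any platform can derive the same value without asking a central sequence generator.

```python
import hashlib


def hub_hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Derive a deterministic hash key from one or more business key parts.

    Hypothetical helper: MD5, the delimiter, and the normalization rules
    are assumptions made for this sketch.
    """
    # Normalize each part so the same business key always hashes the same way,
    # regardless of stray whitespace or letter case in the source system.
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    # The hash depends only on the business key, not on load order or platform.
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# The same customer number yields the same key wherever it is computed.
relational_side_key = hub_hash_key("cust-1001")
hadoop_side_key = hub_hash_key("  CUST-1001 ")
keys_match = relational_side_key == hadoop_side_key
```

Because no lookup against a sequence is needed, loads on the relational side and on the Hadoop side can run in parallel and still agree on the key.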
Ok, but what about the other components?
Data Vault 2.0 has expanded, and now includes practitioner best practices, i.e., implementation and methodology pieces. We have taken the best practices from CMMI, Six Sigma, TQM, Managed Self-Service BI, SCRUM, and Agile, and applied them with major impact to the process of building business intelligence systems; the end results speak for themselves. We now have teams of 5 or more people around the world answering business requirements in a mature build cycle of two-week sprints because of Data Vault 2.0.
Another question arose: the hybrid NoSQL environment is only now emerging; the technology is brand new and shifting ALL the time (every 3 months or so there are fundamental core changes). How do you leverage this technology in conjunction with your existing investment in RDBMS (relational technology)? In other words, how do you BUILD a hybrid solution?
Obviously the last thing you want to do is drop your entire investment in relational technology that you just finished paying for over the last 5 years...
The Data Vault 2.0 architecture works together with the Data Vault 2.0 model to help you achieve these goals. You can leverage the architecture, the model changes, and the implementation best practices to "build out" a Hadoop (or vendor-provided) solution alongside your current relational platform.
In the end, Data Vault 2.0 is a very powerful tool. It can really make your enterprise BI projects work successfully, and it provides your team with the agility and scale they need to solve problems NOW, with interaction points across Agile two-week deliveries, so that you can correct the course of your efforts along the way. Don't take my word for it; read the success story I have available here: http://www.datavaultcertification.com/
It discusses a Kick Start package that I completed with QSuper (Queensland Super Fund Investments) for Australian Government workers.
If you want to know more please contact me through http://LearnDataVault.com
Posted August 11, 2014 2:55 PM