Blog: Dan E. Linstedt
Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best; Dan Linstedt, Copyright 2019
Mon, 11 Aug 2014 14:55:19 -0700

Data Vault 2.0, Big Data & NoSQL for a better look.

I'm here to chat with the executives, directors, and managers out there. There is HUGE value in Data Vault 2.0. In this entry I'll offer a short discussion that introduces Data Vault to you.

What is Data Vault in the first place?
Data Vault is a "Common Foundational Warehouse Modeling Architecture". In other words: it is a system of business intelligence.

What is Data Vault 1.0?
Data Vault 1.0 is just a Data Vault Model, a hub-and-spoke model to be exact, focused on scalability, flexibility, and extensibility. The model is aimed squarely at relational database management systems (RDBMS), and utilizes sequence numbers as its primary keys for join performance.

Why Data Vault 2.0?
The market is moving, shifting. We all see it: Hadoop, NoSQL, Big Data, Unstructured Data, Multi-structured data, Hive, MapReduce, "R", and then vendors like Cloudera, MapR, Hortonworks, etc. But it's not just the data storage that is shifting. In the implementation space we have maturity cycles, best practices, and "better, faster, cheaper", resulting in Agile, SCRUM, Six Sigma, TQM, and CMMI for Data Warehousing and BI.

Data Vault 2.0 expanded first with the model. The data model needed to shift. Nearly every "MPP platform" hashes data sets in order to distribute them across the MPP environment for parallel operations. Sequence numbers cause bottlenecks and dependencies that cost too much in load cycles, and they cause hot-spots at query time when getting data out.
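
To make the distribution point concrete, here is a minimal sketch in Python. It is an illustration only: the helper name node_for, the four-node cluster, the sample keys, and the choice of MD5 are all assumptions made for this example, and it does not reproduce how any particular MPP engine places rows. It simply shows that hashing a key gives a repeatable node assignment that spreads rows across the nodes.

```python
import hashlib
from collections import Counter


def node_for(key: str, node_count: int) -> int:
    """Pick a node for a row by hashing its key (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Treat the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % node_count


# Hypothetical business keys and a hypothetical four-node cluster.
keys = [f"CUST-{i:05d}" for i in range(10_000)]
placement = Counter(node_for(k, 4) for k in keys)

for node in sorted(placement):
    print(f"node {node}: {placement[node]} rows")
```

Because the node is derived from the key itself, any loader can work out where a row belongs without asking a central coordinator.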

Furthermore, sequences are not cross-platform, meaning they don't work seamlessly in Hadoop or NoSQL environments. So the first thing that has been done is a change to the Data Vault Model: sequences have been replaced with Hash Keys.
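
For the technically curious, a hash key can be sketched in a few lines of Python. Treat this as a sketch under stated assumptions rather than a prescribed implementation: the function name hash_key, the "|" delimiter, the trim-and-upper-case normalization, and the use of MD5 are all illustrative choices not taken from this entry.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "|") -> str:
    """Derive a deterministic key from one or more business key parts."""
    # Trim and upper-case each part so " cust-001 " and "CUST-001" match.
    normalized = delimiter.join(p.strip().upper() for p in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


# The same business key yields the same hash key on any platform,
# so a child row can be keyed without looking up its parent first.
hub_key = hash_key("cust-001")
link_key = hash_key("cust-001", "ord-9001")
print(hub_key, link_key)
```

Unlike a sequence, nothing here depends on a database-side generator, so the same key can be computed in a relational loader or in a Hadoop job and still match.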

Ok, but what about the other components?
Data Vault 2.0 has expanded, and now includes practitioner best practices, i.e., implementation and methodology pieces. We have taken the best practices from CMMI, Six Sigma, TQM, Managed Self-Service BI, SCRUM, and Agile, and applied them with major impact to the process of building business intelligence systems; the end results speak for themselves. We now have teams of 5 or more people answering business requirements in a mature build cycle consisting of two-week sprints (around the world) because of Data Vault 2.0.

Another question came up: the hybrid NoSQL environment is just now emerging, the technology is brand new, and it is shifting ALL the time (every 3 months or so there are fundamental core changes). How do you leverage this technology in conjunction with your existing investment in RDBMS (relational technology)? In other words, how do you BUILD a hybrid solution?

Obviously the last thing you want to do is drop your entire investment in relational technology that you just finished paying for over the last 5 years...

The Data Vault 2.0 architecture works together with the Data Vault 2.0 model to help you achieve these goals. You can leverage the architecture, the model changes, and the implementation best practices to "build out" a Hadoop (or vendor-provided) solution alongside your current relational platform.

In the end, Data Vault 2.0 is a very powerful tool. It can really make your enterprise BI projects work successfully, and it provides your team with the agility and scale they need to solve problems NOW, with interaction points across Agile two-week deliveries, so that you can correct the course of your efforts along the way. Don't take my word for it; read the success story I have available here:

It discusses a Kick Start package that I completed with QSuper (Queensland Super Fund Investments) for Australian Government workers.

If you want to know more please contact me through

Dan Linstedt
Mon, 11 Aug 2014 14:55:19 -0700

Data Protection in Cloud and Hosted environments
Thought Experiments Tue, 30 Mar 2010 15:05:55 -0700

Vendor of Business Metadata & Technical Metadata

I've just reviewed some technology from a vendor who manages business and technical metadata as a service platform. Let me say: I'm impressed. There are many current issues these days with different BI and EDW implementations, specifically around managing, entering, and governing the business and technical metadata. For years I've found myself using Excel and Word to handle these tasks; while they are great tools, there are some problems for the Data Steward in the form of governance, management, and usefulness of metadata. In this entry, I'll discuss a new solution from IData called the Data Cookbook.

BI Vendors Tue, 30 Mar 2010 04:10:49 -0700

Data Vault Modeling & MPP

Sorry folks, this one is a shameless plug (but I am trying to separate concepts and ideas).

I've decided to give my personal web-site a face-lift, and in doing so have begun posting all things related to Data Vault modeling and methodology there. So if you're interested in reading new posts about the Data Vault, please see:

I will reserve this blog for posting about Cloud Computing, MPP architectures in general, scalability, performance & tuning, vendors, database engines, data warehousing architecture, compliance, and anything else that I see making a difference in the BI / EDW world.

If you have an idea, or want to see me post about a specific subject, please reply with your comments here.

Thank you kindly,
Daniel Linstedt

Business Intelligence Thu, 25 Mar 2010 04:51:35 -0700

Part 2: Data Vault What Is it?

Welcome to the next installment of Data Vault Modeling and Methodology. In this entry I will attempt to address the comment I received on the last entry surrounding Data Vault and Master Data. I will continue posting as much information as I can to help spread the knowledge for those of you still questioning and considering the Data Vault. I will also try to share more success stories as we go, as much of my industry knowledge has been accrued in the field, actually building systems that have turned into successes over the years.

Ok, let's discuss the health-care provider space, conceptually managed data and master data sets, and a few other things along the way.

Business Intelligence Mon, 15 Mar 2010 18:39:56 -0700

Data Vault - what is it Exactly?

Most of you by now have heard the words: "Data Vault". When you run it through your favorite search engine you get all kinds of different hits/definitions. No surprise. So what is it that I'm referring to when I discuss "Data Vault" with BI and EDW audiences?

This entry will try to answer such basic questions, just to provide a foundation of knowledge on which to build your fact finding.

Business Intelligence Sun, 14 Mar 2010 20:52:24 -0700

BI and Cloud Computing - security, privacy, where we go

There are many things that come to mind when reading the title of this entry. It's a HUGE space with even larger prospects - from the app servers to the databases, from the tips of BI reporting all the way to ethics, security and privacy laws... And then there's the dreaded: "What if the company supporting my current cloud apps & data fails?"

Hmm, in this entry we will explore the tip of the iceberg as it were, and explore some of the notions to consider when looking at Business Intelligence and Data Warehousing in the cloud...  Why? Because the CLOUD is big, and getting bigger - and because it IS a central and important part of technology evolution in 2010.

Business Intelligence Wed, 03 Mar 2010 17:55:46 -0700

EDW/BI - What's Next in 2010?
Business Intelligence Thu, 12 Nov 2009 03:15:27 -0700

Operational DW + Operational BI = System of Record

Ok folks, I'm back. I've spent the last six months reading, writing (and arithmetic - that's a joke... ha-ha...). Seriously, I've been implementing solutions, visiting customer sites, and seeing what's going on in the world; and to that end, the DW/BI world is changing. You've seen me write about this before, but now it's right on top of you.

Like it or not...  Your Data Warehouse IS a System Of Record, and if it isn't in 2009, it WILL BE before the end of 2010.  So what does that mean?

Business Intelligence Wed, 28 Oct 2009 00:46:30 -0700

BI/EDW is Shifting - future state of 2010

There are many trends afoot in the industry, but I think there are two or three that are here to stay, and quite frankly, they will make a huge impact on the way BI/EDW is done. Some of the technologies that are ground-shaking and game-changing include Solid State Disk, Database appliances on the Cloud, and data services on the cloud. I've been watching the Database market now for several years, and here are a few of my predictions for 2010.

BI Vendors Wed, 28 Oct 2009 00:03:46 -0700

Master Data / Metadata and Ontologies
Business Intelligence Fri, 24 Jul 2009 04:27:43 -0700

Industry Logical Data Models... What do you think about them?
Business Intelligence Fri, 22 May 2009 04:26:51 -0700

Part 10: Secrets of the Masters, Dealing with Real-Time Data

The market has been asking for EDWs to deal with more and more real-time data. IT, on the other hand, has become "slower and less agile" as its current system of federated data marts gets larger and larger. In this entry we will deal with some of the issues and some of the questions, and of course offer an opinion on dealing with true real-time data sets arriving at the doorstep of the EDW.

Business Intelligence Sat, 09 May 2009 05:25:42 -0700

When does a Vendor have too much power over the Analyst?

I've always felt that this blog (with the agreement from Bill Inmon, Shawn Rogers, and Ron Powell) is a place for me to express my guarded, guided, and best possible opinion about vendors, the industry, and industry direction. I've also long believed that vendors DO HAVE GREAT PRODUCTS, but sometimes their marketing and sales campaigns get a little overzealous and advertise "features" that simply aren't quite true, or claim the product behaves in situations where it doesn't.

Recently, however, I've been getting "flak" from certain areas of the industry because some of the vendors who read my blog don't like what I'm saying. They are exerting indirect pressure on my friends and industry affiliations to "disconnect me, take me off the map," claiming I have no right to share my information on this blog, and questioning whether I should even be an industry analyst in the first place.

This entry is a question to all my readers (vendors included), and I hope you will respond with your thoughts and comments.

Thu, 07 May 2009 08:39:45 -0700

Governance, Compliance, and Accountability

There are many definitions of governance, compliance, and accountability in the industry, and it seems as though many of us are struggling to define them for the EDW/BI space in some generically acceptable way. As I've recently researched these subjects (and been involved in them for years) I've noticed a trend that many of the definitions are vertically challenged (if you will). They focus on a vertical industry rather than on horizontal enterprise data warehousing.

In this entry I'll add my two cents to this noise as just another voice of opinion on these subjects.

Compliance and Integration Wed, 06 May 2009 06:51:41 -0700