Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, unstructured data, DW, and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I serve on an academic advisory board for master's students in IT. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor for The Data Warehousing Institute, a featured speaker at industry events, and a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI/CMMI Level 5, and is the inventor of the Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog at http://www.b-eye-network.com/blogs/linstedt/.

Hello BI Community. It's been quite a number of years since I last blogged on this platform; sorry for such a long silence. I've been off building Data Vault communities around the world. If you aren't familiar with the Data Vault model, methodology, and architecture for data warehousing and business intelligence, wander over to http://YouTube.com/LearnDataVault for a closer look.

I'm here to chat with the executives, directors, and managers out there. There is HUGE value in Data Vault 2.0. In this entry I'll offer a short introduction to it.

What is Data Vault in the first place?
Data Vault is "Common Foundational Warehouse Modeling Architecture".  In other words: it is a system of business intelligence.

What is Data Vault 1.0?
Data Vault 1.0 is just a Data Vault model: a hub-and-spoke model, to be exact, focused on scalability, flexibility, and extensibility. The model is aimed squarely at relational database management systems (RDBMS) and uses sequence numbers as its primary keys for join performance.
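To make that a bit more concrete, here is a minimal sketch of the classic Data Vault 1.0 hub structure, written as Python dataclasses. The table and column names (HubCustomer, customer_bk, and so on) are my own illustrative choices, not part of any formal specification; the point is simply that a database sequence hands out the surrogate primary key.

```python
# A minimal sketch of a Data Vault 1.0 hub, expressed as Python dataclasses.
# Table and column names are illustrative only.
from dataclasses import dataclass
from datetime import datetime
from itertools import count

_next_customer_sqn = count(1)  # stand-in for an RDBMS sequence generator

@dataclass
class HubCustomer:
    customer_sqn: int     # surrogate sequence number -- the DV 1.0 primary key
    customer_bk: str      # the business key (e.g. a customer number)
    load_dts: datetime    # when the row was first loaded
    record_source: str    # which source system supplied the key

def new_hub_row(business_key: str, source: str) -> HubCustomer:
    """Assign the next sequence value to a newly seen business key."""
    return HubCustomer(next(_next_customer_sqn), business_key, datetime.utcnow(), source)
```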

Why Data Vault 2.0?
The market is moving and shifting. We all see it: Hadoop, NoSQL, Big Data, unstructured data, multi-structured data, Hive, MapReduce, "R", and vendors like Cloudera, MapR, Hortonworks, and so on. But it's not just data storage that is shifting. In the implementation space we have maturity cycles, best practices, and the drive for better, faster, cheaper; the result is Agile, Scrum, Six Sigma, TQM, and CMMI applied to data warehousing and BI.

Data Vault 2.0 expanded first with the model, because the data model needed to shift. Nearly every MPP platform hashes data sets in order to distribute them for parallel operations across the MPP environment. Sequence numbers create bottlenecks and dependencies that cost too much in load cycles, and they cause hot spots when querying data back out.

Furthermore, sequences are not cross-platform: they don't work seamlessly in Hadoop or NoSQL environments. So the first change made to the Data Vault model is this: sequences have been replaced with hash keys.
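As a rough sketch of what that replacement looks like in practice (the trimming and upper-casing rules and the choice of MD5 below are my own illustrative assumptions, not a prescription): instead of asking the database for the next sequence value, each business key is pushed through a deterministic hash, so any platform that sees the same key computes the same surrogate.

```python
# A minimal sketch of deriving a Data Vault 2.0 style hash key from a business key.
# The normalization rules and the MD5 choice here are illustrative assumptions.
import hashlib

def hash_key(*business_key_parts: str) -> str:
    """Deterministically derive a surrogate key from one or more business key parts."""
    normalized = "||".join(part.strip().upper() for part in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# The same input yields the same key on any platform -- no central sequence needed.
print(hash_key(" cust-1001 "))  # e.g. computed by the RDBMS loader
print(hash_key("CUST-1001"))    # e.g. computed by the Hadoop loader -- identical output
```

Because the key is computed rather than assigned, loads can run in parallel across platforms without coordinating on a single sequence generator.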

Ok, but what about the other components?
Data Vault 2.0 has expanded and now includes practitioner best practices, i.e., the implementation and methodology pieces. We have taken best practices from CMMI, Six Sigma, TQM, Managed Self-Service BI, Scrum, and Agile, and applied them with major impact to the process of building business intelligence systems; the end results speak for themselves. Because of Data Vault 2.0, we now have teams of five or more people, around the world, answering business requirements in a mature build cycle of two-week sprints.

Another question came to bear: the hybrid NoSQL environment is only now emerging, the technology is brand new, and it shifts ALL the time (every three months or so there are fundamental core changes). How do you leverage this technology in conjunction with your existing investment in relational (RDBMS) technology? In other words, how do you BUILD a hybrid solution?

Obviously, the last thing you want to do is drop the entire investment in relational technology that you just finished paying for over the last five years...

The Data Vault 2.0 architecture works together with the Data Vault 2.0 model to help you achieve these goals. You can leverage the architecture, the model changes, and the implementation best practices to build out a Hadoop (or vendor-provided) solution alongside your current relational platform.
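As a toy illustration of that hybrid idea (the data, the helper function, and the plain Python dicts standing in for the two platforms are all hypothetical): because both sides key their rows by the same deterministic hash of the business key, rows landed on the Hadoop side can be matched back to rows landed in the RDBMS without any cross-platform sequence lookups.

```python
# Toy illustration of the hybrid architecture: the RDBMS side and the Hadoop/NoSQL side
# each land their own satellite rows, keyed by the same deterministic hash of the
# business key. Plain dicts stand in for the real platforms.
import hashlib

def hash_key(business_key: str) -> str:
    return hashlib.md5(business_key.strip().upper().encode("utf-8")).hexdigest()

rdbms_satellite = {hash_key("CUST-1001"): {"name": "Acme Pty Ltd", "segment": "Corporate"}}
hadoop_satellite = {hash_key(" cust-1001 "): {"clickstream_events": 1843, "sentiment": 0.72}}

# Joining across platforms becomes a simple key match -- no sequence synchronization needed.
for hk, relational_row in rdbms_satellite.items():
    combined = {**relational_row, **hadoop_satellite.get(hk, {})}
    print(hk, combined)
```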

In the end, Data Vault 2.0 is a very powerful tool. It can really make your enterprise BI projects succeed, and it provides your team with the agility and scale they need to solve problems NOW, with interaction points across Agile two-week deliveries so that you can correct course along the way. Don't take my word for it; read the success story I have available here: http://www.datavaultcertification.com/

It discusses a Kick Start package that I completed with QSuper (Queensland Super Fund Investments) for Australian Government workers.

If you want to know more please contact me through http://LearnDataVault.com

Thanks,
Dan Linstedt

Posted August 11, 2014 2:55 PM
Let's face it: cloud computing, grid computing, and ubiquitous computing platforms are here to stay. More and more data, mind you, will make its way onto these platforms, and enterprises will continue to find themselves in a world of hurt if they suffer security breaches. If we think today's hackers are bad, just wait... they're after the mother lode: all customer data, massive identity theft, and so on. I'm not usually one for doom and gloom; after all, we have good resources, excellent security, VPNs, and firewalls, right? In this entry we'll explore what it *might* take to protect your data in a cloud, distributed, or hosted environment. It's a thought-provoking experiment about the future - maybe it would take a black swan?

Posted March 30, 2010 3:05 PM

I've just reviewed some technology from a vendor who manages business and technical metadata as a service platform. Let me say: I'm impressed. There are many current issues with different BI and EDW implementations, specifically around managing, entering, and governing business and technical metadata. For years I've found myself using Excel and Word to handle these tasks; while they are great tools, they pose problems for the data steward in terms of governance, management, and the usefulness of the metadata. In this entry, I'll discuss a new solution from IData called the Data Cookbook.


Posted March 30, 2010 4:10 AM

Sorry, folks; this one is a shameless plug (but I am trying to separate concepts and ideas).

I've decided to give my personal website a face-lift, and in doing so I have begun posting everything related to Data Vault modeling and methodology there. So if you're interested in reading new posts about the Data Vault, please see: http://www.DanLinstedt.com

I will reserve this blog for posts about cloud computing, MPP architectures in general, scalability, performance and tuning, vendors, database engines, data warehousing architecture, compliance, and other topics that I see making a difference in the BI/EDW world.

If you have an idea, or want to see me post about a specific subject, please reply with your comments here.

Thank you kindly,
Daniel Linstedt


Posted March 25, 2010 4:51 AM

Welcome to the next installment of Data Vault modeling and methodology. In this entry I will attempt to address the comment I received on the last entry about Data Vault and master data. I will continue posting as much information as I can to help spread the knowledge for those of you still questioning and considering the Data Vault. I will also try to share more success stories as we go, since much of my industry knowledge has been accrued in the field - actually building systems that have turned into successes over the years.

Ok, let's discuss the health-care provider space, conceptually managed data and master data sets, and a few other things along the way.


Posted March 15, 2010 6:39 PM
