Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for master's students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best. Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com


Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

December 2008 Archives

This really has nothing to do with BI, but on the other hand, nearly all of us use Windows at some point in our careers, and over the years I have latched on to some vital programs that I really like to use to keep my machine running efficiently. They are mostly fairly cheap, and work really, really well.

Disk Defragmentation:
VOPT @ http://www.vopt.com - the best program I've had in years; it is written in assembly, is fast, and will even defrag your Windows pagefile.

Registry fixers:
CCleaner - freeware, http://www.ccleaner.com - really good at cleaning up components, missing files, extensions, etc.
WinASO - http://www.winASO.com - really, really good at fixing problems in the registry; it will even compact the registry for you.

Manage your DRIVERS!
Driver Detective: http://www.drivershq.com/ - the BEST driver software for Windows boxes that I've seen in many years; it fixed problems on my old XP box too!

My personal favorites for anti-virus and software-based firewalls:
Kaspersky - http://www.kaspersky.com - doesn't invade your machine the way some other virus checkers do.
ZoneAlarm - http://www.zonealarm.com - really good, but requires quite a bit of training before it can be super effective.

So there you have it, some simple tools, really great pieces of software to own and not too expensive.

Enjoy!
Dan L


Posted December 18, 2008 3:44 PM

In this day and age everyone is cutting costs; every customer and corporate client is looking for ways and means to become lean and efficient. I've heard a lot about disparate "enterprise data junkyards" recently, especially when it comes to stove-piped solutions in which star schemas used as an EDW cause problems with IT agility. As a result of these agility problems in EDW/BI processing, business users continue to build spread-marts (Access databases alongside complicated Excel spreadsheets). In this entry we will explore this phenomenon, and discuss what executives and business users can do about this growing issue.

It's no secret that the entire world is suffering from a financial "crisis". Every company is struggling to make profits and keep its work-force. But out of "crisis" I believe comes opportunity: opportunity to break the molds, bust up the "old way" of doing business, and break down the barriers to entry and the "not invented here" syndrome. Companies and business users must seek new ways of doing business in order to compete, and to stay in business.

The pressure for companies to become more agile means enterprise IT has to become more agile too. Companies must quickly redirect IT resources and efforts to compete effectively in an increasingly competitive global marketplace.
http://esj.com/Enterprise/article.aspx?EditorialsID=2135

This is especially true around current generation 1 EDW / BI projects. The problems are evident everywhere we look. Generation 1 EDWs have been built around the notion of stove-piped answer sets (a set of answers for sales, another for HR, another for finance, and yet another for... you name it). IT has built it somewhere for a _specific_ business unit. The end result is what IT calls an EDW (loosely affiliated star schemas). Business users continue to request changes, and continue to receive ever-increasing costs and ever-increasing time to implement from IT.

Business sees this as IT non-agility. At some point the business begins to tell IT: you cost too much for a new subject-oriented star, you take too long to integrate my changes. The business users then run off to create their own "marts" and EDW-like structures in Microsoft Access databases and Excel spreadsheets. These are what we call "spread-marts". Eventually the corporation bears the brunt of this directly. Business users eventually "toss" the spread-mart over the wall to IT to handle and "integrate" with the existing data mart solutions.

* Costs of maintenance steadily rise out of control (for IT to keep up and maintain all the different components).
* Backward compatibility and integration of new spread-marts require re-engineering of existing load cycles into a number of different star schemas.
* Business ends up with disparate answer sets.
* Staging areas turn into pseudo-warehouses because IT must put history into the staging areas to satisfy compliance initiatives.

The largest problems that face business are:
1) The cost of "re-engineering" existing conformed dimensions rises out of control as "more and more conformity" is stuffed into ever more complex load routines.
2) The cost of "maintaining" multiple systems for different star schemas rises out of control, and the time to implement "re-usable components" (conformed dimensions/federated marts) becomes unbearable.

All of this occurs because the WRONG DATA MODEL has been chosen for implementation purposes within an EDW vision. IT is then seen as "slow to respond" or as "costing too much" to implement a solution, both of which lead business down the path of creating their own copies of data sets for BI analytics purposes. This is a serious lack of IT agility.

Looking at how long it takes to make required changes or enhancement from start to finish—even when some of the time lapse is outside the direct control of enterprise IT—gives the best picture of enterprise IT agility. [...] Cost : Obviously, time should be tracked as part of a measure of enterprise IT agility, but why track cost?

The answer is simple. Committing extra resources or dollars to reduce elapsed time isn’t a good solution to the agility challenge. Paying a premium to reduce elapsed time might be practical under certain circumstances, but spending extra money to “buy” agility on a regular basis may not be a good investment.

http://esj.com/Enterprise/article.aspx?EditorialsID=2135

Well, take heart. There is a solution out there. Please note: I'm not here to tell you Star Schemas are bad, quite the contrary. Star Schemas are awesome tools for OLAP and drill down, and discovery analysis. Star schemas should be used as Data Mart architectures, and should NOT be used for enterprise data warehousing architectures.

The data model chosen to act as the EDW is at the heart of the success or failure in IT agility within a BI project.

MODELS AFFECT IT AGILITY - CHOOSING THE RIGHT MODEL/ARCHITECTURE FOR THE RIGHT JOB IS CRITICAL.

Based on the process maps, we used data modeling to define the logical data model for the system database. Once we knew the new business processes and the data model, we were able to create a prototype of the user interface and the technical architecture for the system.
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=112307

Do you (the business user) find yourself asking IT to "please deliver this change to the EDW", fully expecting a 90-day time-boxed deliverable for $125k, only to have IT come back and say: well, the EDW/BI system requires retrofitting and re-engineering because we must integrate this data into existing conformed dimensions, so it will take 250 days and cost $500k? At this point do you say:

* Never mind, it costs too much... Hey, why don't you just "copy" the star schema models, change them, and build me my own...

* Never mind, it takes too long... Hey, I'm going to build this in Microsoft Access, and get my own data feeds to make this work.

If you are experiencing this, it's because the wrong architecture has been chosen for the EDW (not the data marts, but the enterprise data warehouse) and it has reached its limits for agility.
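To put numbers on the gap in the example above, here is a rough sketch that compares what the business expected with what IT quoted. The 90 day / $125k and 250 day / $500k figures come from the example; the `Estimate` class and `overrun` function are names made up purely for illustration and are not part of any tool or methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """A delivery estimate: elapsed days and cost in US dollars (hypothetical type)."""
    days: int
    cost_usd: int


def overrun(expected: Estimate, quoted: Estimate) -> tuple[float, float]:
    """Return (time multiple, cost multiple) of the quote over the expectation."""
    return quoted.days / expected.days, quoted.cost_usd / expected.cost_usd


# Figures from the example above: the business expects 90 days for $125k,
# and IT comes back with 250 days for $500k.
expected = Estimate(days=90, cost_usd=125_000)
quoted = Estimate(days=250, cost_usd=500_000)

time_multiple, cost_multiple = overrun(expected, quoted)
print(f"time: {time_multiple:.1f}x, cost: {cost_multiple:.1f}x")
# prints "time: 2.8x, cost: 4.0x"
```

Tracking both multiples for each change request over time would be one simple way to see whether agility is getting better or worse.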

When we build systems, what we want is a sure-fire way to deliver new data marts in about a 45-minute turn-around time (from the time the 2-page requirements document hits my desk to the time the user has a sample row set to play with in a data mart), provided of course that we already have the data in the EDW. This is IT agility and responsiveness. The business user no longer has any reason to "roll their own".

So what's the secret? How can we return to a good solid system? How can IT get this accomplished especially given the pain of their existing "disparate and federated EDW" architecture?
Part of the secret is in fact the Data Vault Model and the CMM-compatible Data Vault Methodology. Companies all over the world are seeing huge agility gains by implementing these components. The Data Vault model is freely available (just like 3rd normal form and Star Schema), and you can read about it on www.TDAN.com. We are doing work with intelligence agencies, government agencies, and commercial industries (very large companies) that are proving this today.

An agile IT organization can lower its operating costs, can improve overall customer service, and can find new revenue opportunities. On the other hand, things such as a disconnect between IT partners and the business, poor project management, and a large investment in legacy systems can deter IT from becoming agile.
http://enterpriseleadership.org/content.php?cid=1838

This goes back to a differentiation between the definition of an Enterprise Data Warehouse and Data Marts (or data release areas).

"You can catch all the minnows in the ocean and stack them together and they still do not make a whale," Bill Inmon, January 8, 1998.

You can also read my other entry on agility here: www.b-eye-network.com/blogs/linstedt/archives/2008/08/it_agility_and_1.php

Conclusion:
The minnows are data marts cobbled together in an attempt to solve agility problems. Today these systems are breaking down, and business users are losing (or have lost) faith in IT's ability to respond in a timely, cost-effective manner. IT needs to get back on track: they NEED to be able to create new solutions, change with the business, and get costs under control in the EDW. They MUST be flexible, scalable, and AUDITABLE all at the same time. They need to choose the right architecture for the job.

The Data Vault modeling techniques were created in 1990, and released to the public in 2001. They are currently in use at the Belastingdienst (Netherlands tax authority), Central Bureau for Statistics in the Netherlands, SNS Bank, Daimler-Chrysler, Hypotheker (Netherlands), oil and gas companies in Canada, banks around the globe, the Food & Drug Administration, the Federal Aviation Administration, and a number of other large institutions around the world. The benefits are clear, the time is now. Find out how to regain your agility and make business users happy again.

As always, I would love to hear from you.

Dan Linstedt
danL@DanLinstedt.com


Posted December 18, 2008 3:01 AM