
Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I serve on an academic advisory board for Masters students in I.T. I can't wait to hear from you in the comments on my blog entries. Thank you, and all the best, Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor for The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of the Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, has trained hundreds of industry professionals, and is the voice of Bill Inmon's blog at http://www.b-eye-network.com/blogs/linstedt/.

In my blog entry "Stuck in 1985," I discuss the nature of graphing and why I believe the current BI reporting vendors aren't doing enough to represent data for visual recognition. There's a flip side, or an underside, to this as well. The question I'm driving at here is: is accurate data visualization driven by the data modeling architecture of the warehouse behind the scenes?

I would tend to say YES, it is. In this entry, we explore the notion in a bit more depth. Take a look and let me know what you think...

I begin by pointing at visualization tools, just as I pointed at graphing tools in the last round. In this particular case, there's an open-source data visualization component called OpenVis from IBM. The enhanced data model section discusses how the "data model" plays a critical role in the visualization capabilities. With OpenVis, apparently the data model is an object-oriented component. There, they discuss the details of the data model in action.

I believe that data modeling is a key that opens many doors. The data model should be consistent (in architecture), repeatable, redundant, and flexible to change (without restating data). In this case, the components or entity types should be standardized beyond just "parent-child." To gain some sort of two-dimensional understanding of a data model, the patterns within the model itself must be easily recognizable.

“If we assume that the viewer is an expert in the subject area but not data modeling, we must translate the model into a more natural representation for them. For this purpose we suggest the use of orienteering principles as a template for our visualizations.” http://www.thearling.com/text/dmviz/modelviz.htm

In this case, orienteering is the use of "anchor points," as in a 3D landscape where we anchor ourselves to visual cues: street corners, addresses, heights of buildings, etc. In the data modeling case, orienteering could easily mean data points treated as geographical or spatial coordinates. In other words, the data model can be capable of driving multi-axis (multi-dimensional) graphing qualities; in fact, I blogged about this earlier.
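The anchor-point idea can be sketched in code. The snippet below is a hypothetical illustration (the entity names, the `grain_depth` property, and the relevance numbers are all made up for the example): it treats two properties of each model entity as spatial axes, so every entity becomes a point one can orient against, much like a landmark on a landscape map.

```python
# Hypothetical sketch: treat properties of data-model entities as spatial
# axes, so each entity becomes an "anchor point" to orient against.
entities = [
    {"name": "Customer",  "grain_depth": 1, "relevance": 0.9},
    {"name": "Order",     "grain_depth": 2, "relevance": 0.7},
    {"name": "OrderLine", "grain_depth": 3, "relevance": 0.4},
]

def to_anchor_point(entity):
    """Map an entity to an (x, y) coordinate: grain is one axis,
    a relevance score another. Adding more axes (semantics, time)
    would give the N-dimensional, multi-axis view."""
    return (entity["grain_depth"], entity["relevance"])

anchors = {e["name"]: to_anchor_point(e) for e in entities}
# e.g. anchors["Order"] is the point (2, 0.7)
```

With such a projection, "orienteering" through the model becomes a matter of navigating between nearby points rather than reading entity-relationship diagrams.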

Here is a very interesting knowledge portal used to visualize information in a moving format (theBrain). Here the data model is virtual, embedded in the software's reference layer over the content it collects. It reflects a neural net behind the scenes. What if we were to extrapolate the notions behind a neural net? What if we were to over-simplify the representation of information in a standardized data modeling format? Would we be better equipped to visualize and mine the information in its native stored format?

I’ve attempted to do just that with the Data Vault data modeling architecture. It’s a standardized set of entity types that represents a poor man’s neural net. It provides a two-dimensional data storage space with the capacity for N-dimensional bisection/associations based on the physical data stored within the entities. It is based on the business keys and the semantic definition of those business keys, along with the grain of those keys. In this manner, grain might be considered one dimension; semantic definition could be another. Within the model we can add gradient and mechanical relevance scores to assist in defining associative properties between elements. In turn, it becomes easier to represent this information in a 3D modeling format, where the data can be visualized and explored on (for instance) landscape maps.
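As a rough sketch of what standardized entity types carrying grain, semantics, and relevance scores might look like, here is a minimal, illustrative model. This is not a normative Data Vault schema; the class names follow common Data Vault vocabulary (hubs, links, satellites), but the fields and example data are assumptions made for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class Hub:
    """A business key, its declared grain, and its semantic definition."""
    name: str
    business_key: str
    grain: str      # one "dimension": the level of detail of the key
    semantics: str  # another "dimension": what the key means to the business

@dataclass
class Satellite:
    """Descriptive attributes hanging off a hub."""
    parent_hub: str
    attributes: dict

@dataclass
class Link:
    """An association between hubs, weighted by a relevance score."""
    hubs: tuple            # names of the hubs being associated
    relevance: float = 1.0 # gradient/mechanical relevance of the association

# Because every association is an explicit, scored Link, the model can be
# walked like a graph: each Link is an edge whose weight could drive how
# strongly two anchor points cluster in a visual landscape.
customer = Hub("Customer", "customer_number", "one row per customer", "a billable party")
order = Hub("Order", "order_number", "one row per order", "a confirmed purchase")
placed = Link(hubs=("Customer", "Order"), relevance=0.9)
```

The point of the sketch is the standardization: because every entity is one of a small number of recognizable types, a visualization layer can treat hubs as landmarks and links as weighted paths between them without knowing anything about the business domain.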

I believe that the key to visualization, and to a better understanding of our information, lies heavily in the architecture or data model housing that information. You can read more about the Data Vault here...

Posted September 8, 2005 8:37 AM

1 Comment

What is star schema and snowflake schema
