Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University where I participate on an academic advisory board for Master's students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author >

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

February 2009 Archives

I had a very interesting conversation with Verizon Wireless tonight.  The fact that they insist on charging me over $200 for two cell phones, plus an early termination fee of over $100, makes me sick.  Especially since I had cancelled one of the phones in October 2008, and for the other one I thought I had requested a plan change to $24 a month....  Well, let me tell you that the lack of BI and customer service is alive and well with the big corporations, never mind the small guy, never mind the fact that I've paid them tons of money for the past 10 years, never mind that I've been a loyal customer!

I have to wonder, with all these enterprise data warehouse initiatives, and business intelligence: WHO is really getting "smarter"?  and with all these governance, compliance and accountability questions: WHY don't they disclose all the information they do have in their systems?

Verizon Wireless is taking me for a ride, to the tune of $400.  They are charging me for January and February usage (which bills through to March, I was told) when I'm asking today that they cut the services, and clearly, looking at the bills for the last two months, I made ZERO calls.  Their business intelligence is lacking.  I feel as though the "big-guy" is using their one-sided notes to claim money from me, which is unjust and unfair.

Quite honestly, I asked for the phone to be cut in October - but the supervisor swore to me that she couldn't do anything.  Why?  Because "she didn't see it in her notes".  Of course she didn't see it in the notes; those notes of the call were taken by a Verizon rep in October - who really didn't want to report the fact that a customer (me) wanted to cut services!

So, instead, she continued to use the phrase: "I know what you're going through, I empathize with you..." and on and on she droned.  Well, empathy doesn't help.  The big corporation is taking me to the cleaners.

I would highly recommend to anyone out there to "beware" of the contracts you sign, as I have no legal recourse against Verizon at this point, and am stuck paying the bill.  I would also recommend to all my readers: itemize your bills carefully, watch every charge.

I honestly thought that because I was "party" to the conversation, and because I was the "customer" that this could be worked out, but nope - they wouldn't budge.   I wonder, just what do these big corporations do for their largest customers when someone from these accounts calls and says: "those charges last month, they were wrong, can you please remove them?"

One of the other pieces I can't stand here is the fact that I was party to the call, both tonight and in October.  My notes differ (my version of the truth) from what their "representative" typed in to their computer.  It's a one-sided story, and by the way, she flat-out told me there's nothing she can do for me because "the notes on the account did _not_ reflect my request to cancel service in October."  Whose account is this anyway?  Sounds to me like Verizon wants to believe they own this account no matter what.

Does Customer Service, Business Intelligence, and Compliance and Auditability ring true for these big corporations?

The problem is: their business intelligence told the rep NOTHING about my loyalty, told the rep NOTHING about how much money I've paid their company over the life of my account, told them NOTHING about how "good" I was in always paying my bills on-time... Or maybe it did, and she chose to ignore it and read me the riot-script anyway...  At the end of the day, the little guy loses, and the big guy wins.  What kind of world is this?  I'm in this game to HELP people, not hurt them; seems it's the other way around for Verizon.

Anybody else have a "bad BI experience" they want to share?

Cheers,
Dan Linstedt


Posted February 12, 2009 7:00 PM

This entry is a candid look (opinionated mind you) at what I see as the future of transformations themselves.  We will cross several subjects in this entry, as it is meant to be a look at where transformations currently happen, where they need to happen, and what's actually happening in the market place.

ETL, or Extract, Transform and Load, has been around a long, long time.  ELT (sometimes referred to as in-database or push-down processing) is new to the ETL vendor world, but a very old concept.  On the other hand, RDBMS vendors have heard the cry and have responded by continually adding new features and functionality to in-database transformation logic.

Now, enter real-time.  Ok, EAI (enterprise application integration) and message queuing have both been around a long time too, and they are also growing and changing.  Then along came BPM (business process management), which changed or morphed into BPEL (business process execution language) and BPW (business process workflows), all of which exist to engage real-time flows and manage transactions at the user level.  Oh yeah, I almost forgot: the middle-tier technology known as EII (enterprise information integration), which never really caught on but is nonetheless valuable when embedded in other technologies like web services and SOA.

Down to brass tacks...

When we look at what's around the corner we have to ask ourselves the following questions:

1.     What does compliance and auditability mean to our transformation efforts?

2.     What really and truly is so difficult about transforming the data?

3.     What do some of these complex business rules look like in transformation logic?

4.     WHY do we fundamentally rely on machines and programmatic (static) rules to alter data sets?  In other words, why do we "write" rules into SQL or transformation logic to make data "usable" by the business?

5.     Just what is considered "usable data" anyway?

Ok, enough of the esoteric stuff - I just thought we needed to ask these questions. Of course, if you have concrete answers, I'd love to hear them in your replies to this blog entry.  Now, on to more serious stuff...  where is transformation going to happen?  Especially given ever-growing data sets, and ever-decreasing latency of arrival...

I would argue that ETL is still partially viable; however, there comes a time when transformation in-stream simply falls down and is no longer feasible to execute, ESPECIALLY when loading data from the source systems IN to the EDW.  The exception to this rule is when the application is built directly on top of the business process rules application or the business workflow management system.  THEN, as the data is entered and submitted to the application, the data is "edited" or transformed before it is placed on the transaction stream.

Likewise this might occur over web-services and streaming services for data sets.

Now this raises the question again: WHAT exactly is auditable data?  WHEN is it compliant or auditable? even for the operational systems?  Is it when the user enters the data on the screen?  is it when it's first captured by the transaction system?

Ok - back to brass tacks.

In order to handle volumes of data in the EDW (flowing in and out), and decrease loading cycle times, it is absolutely imperative that the business rules or transformation logic be moved downstream of the EDW, and that it *NOT* be placed upstream between the source system and the staging area or EDW (as generally architected).  Placing it upstream causes significant re-engineering costs to be incurred, and creates an ETL bottleneck with larger data sets.

Some of this bottleneck is solved through larger hardware or 64 bit systems.  HOWEVER that's not enough anymore.

So what are you saying?

By moving the transformations downstream of the EDW (between the EDW and the data marts), we have created an architectural OPTION.  We can now CHOOSE to use ETL or ELT and leverage the RDBMS for transformation, especially if both the EDW and the data marts reside on the same database instance.  This allows us to apply the technology in the right place at the right time.  Furthermore, it makes the data in the EDW more "compliant and auditable" because it is not subject to change before loading.  (See http://www.DataVaultInstitute.com for more information.)
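
To make the "load raw, transform downstream" idea a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module.  The table names, column names and the business rule are all made up for illustration (they are my assumptions, not part of any real system), and SQLite simply stands in for whatever RDBMS hosts both the EDW and the data marts.  The point is only that the raw rows land in the EDW untouched, and the transformation runs afterwards inside the database as one set-based statement.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edw_orders_raw (      -- EDW: data stored exactly as received
        order_id   INTEGER,
        cust_name  TEXT,
        amount_txt TEXT,
        load_ts    TEXT
    );
    CREATE TABLE mart_orders (         -- data mart: business rules applied
        order_id  INTEGER,
        cust_name TEXT,
        amount    REAL
    );
""")

# Step 1 (E and L): load source rows untouched, so the EDW stays auditable.
source_rows = [
    (1, "  acme corp ", "100.50", "2009-02-05"),
    (2, "Globex",       "n/a",    "2009-02-05"),
    (3, "initech",      "75",     "2009-02-05"),
]
conn.executemany("INSERT INTO edw_orders_raw VALUES (?, ?, ?, ?)", source_rows)

# Step 2 (T): apply the business rules downstream, inside the database,
# as a single set-based INSERT ... SELECT (the push-down / ELT style).
conn.execute("""
    INSERT INTO mart_orders (order_id, cust_name, amount)
    SELECT order_id,
           UPPER(TRIM(cust_name)),
           CAST(amount_txt AS REAL)
    FROM edw_orders_raw
    WHERE amount_txt GLOB '[0-9]*'     -- made-up rule: amount starts with a digit
""")

raw_count = conn.execute("SELECT COUNT(*) FROM edw_orders_raw").fetchone()[0]
mart_rows = conn.execute(
    "SELECT order_id, cust_name, amount FROM mart_orders ORDER BY order_id"
).fetchall()
print(raw_count, mart_rows)
```

Notice that the "bad" row (the one with "n/a" for an amount) is still sitting in the raw EDW table after the mart is built.  If the business rule changes tomorrow, the mart can be rebuilt from the EDW without going back to the source system.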

Alright - the future stuff... so what do we need from ETL "vendors" in the future?

* ETL vendors must support both, ETL and ELT (in-database)

* Fully configurable temporary tables, block style processing, in-database control - all from an ETL metadata and visual GUI perspective

* FULL 100% push-down must be supported, and if "EL" needs to be added to the chain, so be it - the ETL tool will automatically set that up, and do its best to provide 100% push-down where necessary.

* For advanced developers, the ability to control "HOW" the push-down will be executed, with full overrides and step-by-step debugging IN THE DATABASE.

* Many more, which I don't have time to post now...  these are the major ones.

What does this mean to the Database Vendors?

* Ever increasing support of "faster API calls"

* More parallel API calls

* dedicated "step-by-step" debugging interfaces

* a whole lot more in-core coded transformations and complex SQL statements

* MORE BATCH oriented SQL statements, where a "batch processing size" can be set, then the statements will manage themselves

* MORE interconnection (high speed) with remote database instances.

* MORE metadata

* inclusive of versioning of every single piece of executing code

* Versioning of the TABLE structures and INDEXES

* on-the-fly indexing

* Parallel index builds DURING high speed load or batch operations

* NO MORE "TABLE COPY SWITCHING" for high-volume and high-availability.
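
As a rough illustration of the "batch processing size" item in the list above: until the database manages this by itself, the closest approximation is a client-side loop that issues one set-based statement per batch.  The sketch below uses Python's built-in sqlite3 module; the `src` and `dst` tables and the `copy_in_batches` helper are hypothetical names I made up for the example, and a real RDBMS would have its own (probably better) mechanisms.

```python
import sqlite3

def copy_in_batches(conn, batch_size):
    """Copy rows from src to dst in key-ordered chunks; return the batch count."""
    batches = 0
    last_id = 0
    while True:
        # One set-based statement per batch, resuming after the last copied key.
        conn.execute(
            "INSERT INTO dst (id, val) "
            "SELECT id, val FROM src WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        )
        conn.commit()  # each batch is its own transaction
        new_last = conn.execute(
            "SELECT COALESCE(MAX(id), 0) FROM dst"
        ).fetchone()[0]
        if new_last == last_id:  # nothing new was copied, so we are done
            break
        last_id = new_last
        batches += 1
    return batches

# Hypothetical demo data: ten rows copied four at a time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE dst (id INTEGER PRIMARY KEY, val TEXT);
""")
conn.executemany(
    "INSERT INTO src VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 11)]
)
conn.commit()

batches = copy_in_batches(conn, 4)
copied = conn.execute("SELECT COUNT(*) FROM dst").fetchone()[0]
print(batches, copied)
```

What I'm asking the database vendors for is to make this loop unnecessary: set the batch size once on the statement, and let the engine handle the chunking, the commits and the restart point.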

 

Please add some of your own thoughts to this party, I'd like to hear what you think.

As always,

Dan Linstedt
DanL@RapidACE.com  - check out a 3D Data Model Visualizer Demo!


Posted February 5, 2009 7:22 AM

We live in a world where video delivery is becoming the norm.  Business users are getting tired of "bar-charts" and "standard reports".  They want interactivity.  While drill-down was an interesting development in interactivity, there doesn't seem to have been any major advancement from the BI vendors in years.

With the advent of Flash-delivery, and Microsoft's new Silverlight platform, one would think that BI vendors would have had tremendous advances in technology recently, but no - we're still dealing with the old column based delivery mechanisms, and we think that Pivot tables are "cool"...  Man, we're stuck in the 80's here people...

I am learning Flash, along with Silverlight.  I'm also learning video, interactivity, dynamic graphics, movement, and so on.  Yeah, yeah - I can hear it now: that's old technology, web-designers have been doing this for years!  Yep...  I know. Why then can't we build BI systems and dashboards that provide this kind of interface for our business users?

Some companies claim to handle queries on the backend against VLDW, but fall down when one of the tables has 1.5 billion rows in it.  Some companies claim the latest in "drill-down" technology.  Ok-fine and dandy.  Some companies claim the latest in 3D bar charts or live graphs!  Still some companies say: we integrate with xyz column and pixel positioning systems... uh-huh....  ok - let's get down to brass tacks:

* I agree we need to deliver valuable information in a format that most business users understand, but I also believe in the power of paradigm shifts.
* Where's the tie of the reporting/BI analytics to the business rules?
* Why can't I walk through my business rule processes in 3D (like a walk down a street) and see specific analytics that make sense to that area of business?
* Where are the truly interactive charts and graphs?

I've said it before, I'll say it again: Hire a game programmer to make BI/Analytics interesting, fun and maybe even addicting!  What?  And disturb the balance?  What balance?  What's the hottest selling game out there (according to informal fads and polls and what I see selling)...  maybe Guitar Hero?  It's on all the platforms.  What does it do that makes you play it for hours on end?

INTERACT...  It gives you a set speed, a set song, a stage, and a fake guitar with 5 buttons on it - you have (what seems like) infinite combinations of notes and speeds of notes to place your fingers on the buttons.  Your skill level determines how fast the game goes.

Now there are more advanced games, like Warcraft, Doom, and so on that make use of more buttons, intellect, thinking, terrain changes (during the game).  And because you are playing against humans, you've got to be good, or get better.

Ok - so maybe the themes aren't right for BI and analytics, but jeepers creepers, when I open up an application and the data sits there - I feel like I'm sitting in an elevator in the 1970's listening to elevator music, waiting to push a floor button.  Dry, Dry, Dry... 

Why not put the "cubes" on a flash-carousel?  Why not have the cubes visualized in 3D and inter-connected?  Why not display data in a 3D sound wave format, where the head-quarters is the center of the graph?  Why not be able to fly in to the graph as drill down, fly under, fly through - re-focus the graph on a live grid?  Why not use some scientific style graphing or themed graphing techniques to represent the business and the data in a metaphorical manner?

For instance, what if I'm in the oil & gas industry, and what if I represented my business data and profitability as a land-map graph?  What if oil wells represent business units, and can run-dry if they are not profitable?
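
Just to show that the metaphor boils down to a simple mapping from business metadata to a visual state, here is a tiny sketch in Python.  Everything in it is hypothetical: the `BusinessUnit` fields, the `well_state` function, the state names and the 20% margin threshold are all assumptions I picked for the example, not anything taken from a real BI tool.

```python
from dataclasses import dataclass

@dataclass
class BusinessUnit:
    name: str
    revenue: float
    cost: float

def well_state(unit, gusher_margin=0.20):
    """Map a business unit's profitability onto an oil-well metaphor."""
    profit = unit.revenue - unit.cost
    if profit <= 0 or unit.revenue <= 0:
        return "dry"          # unprofitable units run dry
    margin = profit / unit.revenue
    return "gusher" if margin >= gusher_margin else "flowing"

# Made-up numbers, for illustration only.
units = [
    BusinessUnit("Drilling", 500.0, 350.0),
    BusinessUnit("Refining", 400.0, 380.0),
    BusinessUnit("Retail",   120.0, 150.0),
]
wells = {u.name: well_state(u) for u in units}
print(wells)
```

A 3D front end would then only need to draw each well on the land-map according to its state, which is exactly the kind of thing a game programmer does all day long.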

I believe that there is power in metadata, and what if the metadata were metaphors for the business - the entire business?  Could we develop 3D visualization techniques across metaphors and make better use of business metadata?  YOU BET!   Ahh-but wait a minute, this might require the business to get better at building, managing, and governing the metadata.  Yep.

But as sure as I sit here, I can tell you that in order for BI to "break out of its shell" and become truly USED (no doubt it's useful), I believe that 3D visualization along with themed, game-play-like consoles may be what's required.

I wonder what would happen if the world's largest company commissioned someone like DreamWorks to develop an interactive scenario game, based on a metaphorical description of their business?

Just an idea folks... think about it.  Can BI be fun in the future?  I would like to think so.  If you've got some themes or ideas for specific lines of business, I'd love to hear about them.

Cheers,
Dan Linstedt
DanL@RapidACE.com
http://www.RapidACE.com - 3D Data Model Visualizer


Posted February 2, 2009 4:04 AM