Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for master's students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author >

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

May 2006 Archives

Well, an interesting question. I just gave a presentation at Informatica World 2006 on Data Administration, the roles, responsibilities, and possibilities that the DA must undertake and manage. If you're in a DA role today, I'd be very curious to hear some of the types of "dirty jobs" that you've had to do in this role. Coming from a government background, the DA I had on the team had an interesting time in life. In this entry I'll try to describe my impressions of the DA role, and how it's expanded today - and what they should be thinking about.

Data Administration is really a role with responsibilities. It may be seen as a job title for some, but in many of the data integration projects I've worked on, it includes bits and pieces of the following additional roles: Business Process Engineer, Data Manager, Data Steward, Data Quality Metrics Manager, Logical Data Modeler, Data Flow Designer, and so on.

The DAs I've worked with had to dig deep. They had to address the Data Flow portions of the business processes (there's another entry coming on defining business processes as they relate to Business Intelligence and data integration), map these data flows in the business to the logical models, help define the metadata for the elements, and ensure the business requirements would be met by the models and processes that were to be built underneath. They also had to work as a data steward (in some cases where a specific data steward role had not been defined), which means they interfaced with the business users and managed the master data, definitions, master metadata, and business process metrics (KPAs/KPIs).

Eric Rawlins in 1995 defined the DA role this way:
Originally Published by: Database Research Group, Inc
http://www.well.com/user/woodman/organic.html

What do we mean by that in the case of data administration? We mean that DA must get out of the design review committee mentality and substitute something more value-added and flexible. It must recognize that systems tend to grow organically, and be a part of that process, rather than an instiller of order upon it.

Today, a big part of the DA role includes compliance, auditability, and managing metadata initiatives. Below is what I additionally define as a part of the DA role:
* Data administration and management are key roles in today's enterprise projects
* Data administration is a part of data management; the two should be utilized together
* Compliance, accountability, and governance provide a foundationally strong and tenable architecture, and must be a part of the DA's working knowledge

The DA must be capable of adding compliance, governance, and accountability to the systems. After all, the data that they are responsible for must be accurate; the processes that they define (as an interface between business and IT) require checking, managing, and proper accountability. Below are the top 10 problems within the DA role:

http://www.educause.edu/ir/library/text/CEM9047.txt

1. Inadequate or missing master metadata
2. Ineffective master data management
3. Incomplete logical models
4. Undefined business process models
5. Missing process control and metrics measurements
6. Non-defined user access matrices
7. Ineffective change management
8. Missing element classification system
9. Lack of user-training material
10. Mismatched data performance SLAs with DBA objectives

I'll blog more on this subject as we move forward, but for now - here are some of the things that are happening in the commercial organizations around the world:
The DA role is being cut and slashed, often to the detriment of the business - these folks know and understand the business, and are more critical TODAY than they ever have been, particularly with governance, compliance, and metadata initiatives. The DA role takes a special breed of individual: someone with business acumen and an IT background, someone who can cross the lines (though there really shouldn't be any lines anyway), and someone who can understand the data set and model the information according to the business.

Somewhere along the way, responsibility for data modeling became an IT-only specification when it really belongs in a cross-functional role, with a dotted-line report to business and a direct report to IT data management staff. Until the business begins re-defining or re-instituting the proper Data Administration groups, data modeling and Data Integration, along with the processing of that data, will continue to split further apart due to differing goals and objectives.

Thoughts? Comments?
Dan L
Daniel.Linstedt@MyersHolum.com


Posted May 29, 2006 10:38 AM

A while back I blogged about appliances, and where I thought the market is headed. Please bear in mind that I frequently like to place myself into the future and attempt to see what will happen overall. Also bear in mind that my definitions are frequently slightly different from those common in the industry. As it so happens, I had the opportunity to look at and research Appliances going forward. I'd like to draw attention to the appliances in the market space and try, over a couple of entries or so, to help define the terms more clearly, and level the expectations of what customers may be seeing out there.

Appliances are everywhere: refrigerators, toasters, ovens, dishwashers, printers/scanners/faxers/copiers, DVD/CD players, MP3 players, cell phones (that are more than cell phones), and on and on and on. So that brings us to the IT sector and the definition of appliance. What does it mean? How should we define it? Where are the boundaries? When I look to purchase an "appliance", what should I be concerned with?

These are all questions (I'm sure there are many more) that I will attempt to answer going forward. In fact, the cover of my latest CRN magazine (May 22, 2006), for the storage standoff article, carries a quote that goes a little like this:

NetApp CEO Dan Warmenhoven prepares to take on EMC end-to-end with the launch of its first SMB product, a storage appliance priced starting at $5,000 that will combine NAS, iSCSI and, ultimately, fiber channel.

So what IS an appliance?
That's a tough definition - an appliance in the kitchen may be something that has dials and buttons, timers, and thermometers - and helps you cook or bake or toast. In the IT world, the definition of APPLIANCE is very, very gray. There really isn't a clear definition of just what an appliance is, particularly if you look to IBM, Teradata, Oracle, Db2 UDB or other database vendors. Or how about firewall vendors: are those devices "appliances"? If they qualify as appliances, then we have to step back and re-think just what an appliance might be.

In my fog of concentration, I've decided that (and you may not agree - in which case, I would love to hear your comments) APPLIANCE is really a class, an organization or a hierarchy of items. At least for Business Intelligence and data warehousing, the appliance class can be broken down into hardware and platforms. From there, it can be broken down further - hardware can include storage, networking, security, etc. Software is really a part of platforms; the platforms combine hardware and software into a "white or black box" that can be purchased and plugged in to the enterprise.

AHHH Plugged-in... What the heck does that mean?
Well, plugged in is simply where we start; from there, there are different sub-classes of PLATFORM APPLIANCES that include: scalability, management, maintenance, setup, enterprise class hardware parts, off the shelf hardware parts, service levels, self-monitoring, MPP abilities, NUMA Clustering abilities, and self configuration levels.

Ok, so when a vendor says "plug-and-play", it may not necessarily be true?
Right. Sometimes the "platform appliance" requires tuning, configuration, manual manipulation. Other times the appliance really is plug-and-play into the network; it all depends on how much effort a vendor is willing to put into the engineering of their products and services.

There's a difference?
Yes, there's a difference between world-class "platform appliances" and SMB "platform appliances". World class would mean a longer mean time between failures (MTBF), increased scalability (into the hundreds of terabytes) with little to no administration, and world-class hardware (higher quality, higher price, more support from the vendor, and longer life-span). Sometimes off-the-shelf parts in an SMB appliance mean unsupported integration to a newer version of the platform. Remember that platform includes the hardware behind the scenes.

Example: an SMB that may stay small in terms of data set (say, one that sells jewelry locally) may be neither able nor willing to purchase a world-class "platform appliance", and may want the lower-end, cheaper components. They may not need 24x7x365 uptime, nor be able to afford it. Yet an international jewelry outfit may scale their data into the hundreds of terabytes; while they may start small, it doesn't mean they'll stay small. If their growth pattern can afford world-class parts, so be it.
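To put rough numbers on the uptime trade-off, here is a minimal sketch. It assumes the standard steady-state availability formula, MTBF / (MTBF + MTTR); the mean time to repair (MTTR) and all of the figures below are hypothetical and are not taken from any vendor.

```python
HOURS_PER_YEAR = 24 * 365


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the platform is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected hours of downtime in one year at the given MTBF and MTTR."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR


# Hypothetical figures, for illustration only.
smb_downtime = downtime_hours_per_year(mtbf_hours=2_000, mttr_hours=8)
enterprise_downtime = downtime_hours_per_year(mtbf_hours=50_000, mttr_hours=2)

print(f"SMB-class platform:        {smb_downtime:.1f} hours down per year")
print(f"Enterprise-class platform: {enterprise_downtime:.1f} hours down per year")
```

The local jeweler may be perfectly happy with the first number; the international outfit probably is not.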

For example, would you buy a $400 toaster for your daily toasting in your kitchen? How about a $1200 toaster or a $3000 commercial toaster they use in the hotels for feeding hundreds of guests every day? I wouldn't spend more than $120 for a toaster that might last a year or two, then buy another one when the low-end toasters have improved in quality.

My points are as follows:
* Appliances aren't always what they seem
* Appliances are a class of components; I see them more as platforms which include software, hardware, services, support, and up-time.
* "Platform" is more apt to be the proper term. Just because I buy a server, throw a database engine on it, and make it "available" through an API doesn't necessarily make it an appliance.

Finally, if I look out three to five years - in all reality the customer wants more plug-and-play with higher quality and higher class parts; services will be the value-add, and self-monitoring and self-configuration will be expected to be a part of the package - not to mention scalability. Do I call this an "Appliance"? Yes - but it's an Enterprise Class Appliance. Is it really an Appliance? No - it is most likely a pre-configured enterprise class platform solution. Can a customer call it an appliance? Possibly, but I really don't care if they call it ham and cheese, or French toast. They can label it however they wish.

I still feel there is no true or single definition of what an appliance is or should be. I'd love to hear from you: how do you define "Appliance" or "Platform" in your industry?

Thanks,
Dan L
Daniel.Linstedt@MyersHolum.com


Posted May 29, 2006 10:10 AM

The question? What does the new business initiative really need to focus on?

Today's business initiatives seem to be headed in many different directions, from SOA to MDM to registries, and business processes. The issue is that when different initiatives take on different directions (rather than a consolidated view and set of drivers) they all end up at different destinations. The cost is heart-ache, silo'd solutions, and a maintenance nightmare. The bottom line is that there is convergence afoot. I've written about this over the past 5 years in my convergence articles on TDAN, B-Eye Network, and Teradata Magazine. In this entry we'll explore what business should do, and how they should approach these very different initiatives (all with a common goal).

MDM - Master Data Management
MMDM - Master Metadata Management
SOA - Service Oriented Architectures
Registries - well, registries of web-services, taxonomies and hierarchies of access points, names, and security access restrictions; I guess one could say more metadata...
BPEL - Business Process Execution Language
BPM - Business Process Management

And of course the tools of the trade:
EAI - Enterprise Application Integration
EII - Enterprise Information Integration
ETL/ELT - Extract, Transform, Load / Extract, Load, Transform
RDBMS - Relational Database Management System

Ok now that we got that out of the way... Businesses have been divesting their interests for years (at least when it comes to I.T. projects). It's time to get a little convergence back into the mix. Businesses who start separate initiatives for each of the categories above will quickly find that they end up with one or more of the following:

* Silo'd answer sets
* Silo'd information assets
* Argumentative Fiefdoms within the kingdom (arguing over who's right and who's wrong and who has the best answers).
* IT Constrained Business - disparate projects, tons of sunk cost, high maintenance overhead
* Inconsistent standards
* Missing best practices
* Holes in the I.T. security wall (all over the place)
* Lack of IT business initiative
* Poorly motivated IT employees

And so on... Executive staff should realize that the good things in life don't come cheap, or easy. After all, they've worked extremely hard to get where they are. IT is no different, and should be treated as a single operational business unit. IT's initiatives should be aligned, but in a way that allows IT to work together rather than against each other.

So you've heard this all before have you?
I'm sure you have - it's been printed in the magazines for years, lately it was called IT alignment. Let's get back to the issues shall we?

What does this have to do with lining up: MDM, MMDM, SOA, and Registries?
Everything. Businesses today should establish an overriding IT umbrella; that umbrella is, in fact, an SOA initiative. One way to think about it is: IT is a service-based organization, and SOA is a service-based architecture from which automated services make business information, processes and descriptions available (on-demand) to the business. Let's just say SOA does for IT what JIT does for manufacturing and supply chains.

Underneath the SOA are Master Data, Master Metadata, Web Services, Registries, Auditability, EDW, OLTP, data marts, and Information Integration. All of these are the components necessary to make SOA a success. But remember, SOA is a journey not a destination - just like alignment of IT is a continuous process (it never ends).

So what do all of these have in common?
* Shared business insight
* Shared executive level sponsorship
* Shared information and data sets
* Shared asset base
* Shared security model
* Shared business processes
* Shared Metadata
* Common information dissemination model

From a project standpoint:
* Shared milestones
* Shared Risks
* Shared training
* Shared knowledge

There is also a certain dependency (order) in which these items must be executed. If one is left out of the process chain, then the business stands to suffer at the end of the day. Convergence is upon us, and real-time (active), metadata (descriptive), data sets (asset base), registries (organization of all data and metadata underneath), security and services (access layers) are all a part of the enterprise initiative to bring IT into focus.

More to come on this topic - if you have questions, I'd like to try to answer them. Feel free to ask publicly or privately.

Cheers,
Dan Linstedt
CTO, Myers-Holum, Inc


Posted May 15, 2006 5:26 AM

In a previous blog entry I discussed the nature of turning IT from a cost center into a profit center. In this entry I'll disclose a few of the requirements, and a few of the steps needed in order to head in that direction. Of course, you can always comment or send me an email - I'll be happy to discuss these items with you directly. Keep in mind the situation at the company where I did this: a. I was an employee of IT, b. I was the "new guy" in the good-old-boys-network, c. there was a huge disbelief that IT could deliver anything, let alone on-time or within budget, d. the organization was split by business contracts; my co-worker couldn't work on my project without budgetary approval even though they were a part of IT.

Also keep in mind that this company (at the time) was 7 sectors, 53 companies (all managed by P&L) including IT, and 150,000 employees. It was a large company, and a lot of hurdles had to be overcome.

Here are the steps I took to make it from scratch to success in 9 months:
1. Moderated the user community, defined roles & responsibilities, held the business users accountable for the amount of time they signed-off on to dedicate to our project.
2. Became familiar with the business language and terminology. Spent time with the business users to show I was serious. Interviewed some of them, but mostly watched them work and took lots of notes, asked lots of questions pertaining to HOW they did things. Learned what was efficient and what wasn't from a business perspective, in other words what they did that was a manual effort - that could be automated.
3. Found "friends in business" who began to believe in what I was doing. I educated them on the progress, both good and bad - they stuck with me in the "bad" times and held the nay-sayers at bay until we delivered.
4. Delivered close to on-time, and close to on-budget. We were slightly over due to scope creep and the Business Users changing the business requirements.

Once we delivered, the real magic began. First we delivered on the promises to support the reporting requirements the business had. Then, we started educating the users on everything else we had stored along the way, showing the value of the fact that the enterprise data integration store (EIDS) or EDW contained not only "good" data but "bad" data as well, and dispelling the whole notion of truth.

In this particular case, we focused on bringing finance data into the warehouse for matches against workers' hours and comparing that against contracted hours. Finance fought us tooth and nail at first, then they hired auditors to shut us down, claiming the data warehouse was wrong.... Well, if you've followed my blog, you'll understand that they couldn't shut us down, and what happened was: the auditor found a billing error that had been occurring in the operational reports for over 15 years. Once we had finance's blessing, other things became easier.
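The kind of cross-check described above can be sketched in a few lines. This is only an illustration, not the logic we actually ran: the record layout, the field names, and the sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoursRecord:
    # Hypothetical layout: one integrated row per contract in the warehouse.
    contract_id: str
    contracted_hours: float  # from the contract
    worked_hours: float      # from the workers' time records
    billed_hours: float      # from the finance system


def find_discrepancies(records, tolerance=0.0):
    """Return (contract_id, issue) pairs where the sources disagree."""
    issues = []
    for r in records:
        if abs(r.billed_hours - r.worked_hours) > tolerance:
            issues.append((r.contract_id, "billed != worked"))
        if r.worked_hours > r.contracted_hours + tolerance:
            issues.append((r.contract_id, "worked > contracted"))
    return issues


# Made-up sample rows, for illustration only.
sample = [
    HoursRecord("C-1", contracted_hours=100, worked_hours=100, billed_hours=100),
    HoursRecord("C-2", contracted_hours=80, worked_hours=80, billed_hours=95),
    HoursRecord("C-3", contracted_hours=40, worked_hours=50, billed_hours=50),
]

for contract_id, issue in find_discrepancies(sample):
    print(contract_id, issue)
```

The check only works if the store keeps the "bad" rows too; if the mismatches were cleansed away on load, there would be nothing for the auditor to find.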

The following steps are necessary to complete the transition:
1. Establish and stay close to the original business sponsors of your integration store project.
2. Bring in more data than they ask for, and begin to show them and the rest of the business HOW you can link this information together and make sense of it for the business. Put together slide shows, give presentations, and invite users to web-conferences. Ask the business user to set it up and send out the e-mails. DON'T send emails as I.T.; most of the time other business organizations will ignore you.
3. Have your business user start the presentation by explaining the value you've brought to them, have them explain a specific case or two - the more specific the better (in terms of dollars and cents/sense).
4. Tell the business user ahead of time: if we get a business user to sign up, we'll give them a log-in for 1 to 2 weeks for free. We'll track their queries and build a simple star schema (one star, single data mart, single denormalized reporting table).
5. At the end of their trial period, if they've used their data, logged in and actively "queried" it, then offer them a deal (through the business user of course): Pay to play - want to keep it? Looks like you're using the information, so help support the cause - sign an SLA and put money in the pot for storage, maintenance work, and hardware upgrades, and we'll keep your data on-line. If you can't afford it, don't want to or aren't using it, we'll drop the tables today, delete the data - and if you want it later, just ask.
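The pay-to-play decision at the end of the trial can be sketched as follows. This is a minimal sketch under stated assumptions: the 14-day window reflects the "1 to 2 weeks" above, while the `min_queries` threshold, the function name, and the return labels are hypothetical choices for illustration.

```python
from datetime import date, timedelta

TRIAL_DAYS = 14  # the free log-in lasts 1 to 2 weeks; 14 days is assumed here


def trial_decision(trial_start, query_dates, today, min_queries=5):
    """Return 'in_trial', 'offer_sla', or 'drop_tables' for one business unit.

    query_dates holds the date of each tracked query; min_queries is a
    hypothetical threshold for "actively queried".
    """
    trial_end = trial_start + timedelta(days=TRIAL_DAYS)
    if today < trial_end:
        return "in_trial"
    # Count only the queries that were run during the trial window.
    used = sum(1 for d in query_dates if trial_start <= d < trial_end)
    return "offer_sla" if used >= min_queries else "drop_tables"


start = date(2006, 5, 1)
active_user = [start + timedelta(days=i) for i in range(6)]

print(trial_decision(start, active_user, today=date(2006, 5, 20)))
print(trial_decision(start, [], today=date(2006, 5, 20)))
```

Whatever threshold you pick, agree on it with your business sponsor before the trial starts, since they are the one presenting the offer.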

What happens is that success breeds success: for every 1 to 2 business users that don't want to sign up, there will be 5 waiting in line to get their integrated data set.

This is a bottom up implementation style; please please please remember to architect TOP DOWN so that when you add new data sources, your enterprise data integration store can handle it without much effort. (See the Data Vault data modeling architecture for more information).

The point being:
1. You need an AUDITABLE and COMPLIANT historical, integrated, enterprise information store.
2. You need a business user willing to go to other business units and sing the praises of the value you brought to them.
3. You need to track queries, and logins, and data utilization from new business units.
4. You need to be willing to SHUT DOWN those who do not pay-to-play.
5. You can't be afraid to get executive sponsorship and to go to the executive staff when you need business disputes resolved.

I did it, as a peon in a corporate good-old-boys-network world. It can be done. When I arrived on site, the project was 6 months in the hole, the business users were angry (to say the least) and the money holder was ready to pull the plug. 6 months later and 3 phases later we delivered with a team of 3 people and a data architect. It can be done. We received executive visibility and sponsorship after delivering.

If you are curious, or have hurdles in your organization you'd like to overcome we can help you. We've got assessment and corporate guidance assistance available. We've been there, we know what it takes.

Got a story? I'd love to hear it.
Cheers,
Dan Linstedt
CTO, Myers-Holum, Inc
http://www.MyersHolum.com


Posted May 1, 2006 8:47 PM