Blog: Dan E. Linstedt

July 16, 2008
Part 7: Secrets of the Masters, Templates for Projects
Any time we get back to secrets, we seem to fall right back to the category of standards, standardization, measurement and enablement. The old saying is: "if you can't measure it, you can't monitor it, and if you can't monitor it - you don't know when it's broke, or you can't optimize it/fix it." Something like that, anyhow. The common feedback from the general project implementation community is usually: "Why do I need to standardize? Why should I document? Won't it take more time to follow standards than to build rapidly?"

July 19, 2007
ETL & ELT - as they relate to VLDW
ETL (extract, transform and load) and ELT (extract, load and transform) have been getting interesting rubs lately from different people in the market space. The issue is that most of what I read and see doesn't deal with super-large volumes of data. For example: every DFD (data flow diagram) I build is architected to deal with a minimum of 80 million rows per source, and that is considered a small load. The medium-sized loads deal with 150 million rows per source, and the large loads deal with 300 million and up (to 1 to 2 billion rows per load). In this entry we explore the nature of both ETL and ELT as they relate to this size of data set. We'll cover problems, issues, and architectural changes that need to happen.

May 10, 2007
EII and its future value
EII has been getting a lot of buzz lately, especially with the purchase of MetaMatrix by Red Hat. I want to turn your attention (instead) to where EII needs to go as an industry. These are my opinions, and I welcome your constructive comments.
EII (Enterprise Information Integration) is a pull technology - grabbing data on demand from all kinds of sources, and building a single integrated view of the current world of "transactional data." So what's left?

March 21, 2007
Defining Unstructured Data & DW2.0
My last post discussed the notion of unstructured data being as much as 80% of the data that we in IT will (and should) begin to deal with. One of the readers requested that I expand on what I'm including in Unstructured Data. This entry discusses the types of structured, unstructured and semi-structured data as I see them. As usual, this pertains to business knowledge, and is a huge part of DW2.0. As it turns out, it also is (or will become) a huge part of changing IT from a cost center into a profit center; why? Because if we can integrate unstructured information, and glean the knowledge from it (determine contextual linkages), we can better understand where our business holes are.

December 15, 2006
ETL, ELT - Challenges and Metadata
I had some good questions come in recently, thank you. In this entry I'll share my experiences with ETL and ELT with regard to metadata; I'll also try to elaborate on when it is right to use which type of technology. This also goes back to my original articles in Teradata Magazine on ETLT.

December 6, 2006
ETLT or ELT - Either way, pull back the sheets.
Ok, I've said it before: in previous entries here on the blog I've discussed ELT and ETL and loss of metadata. I've worked in both situations. I've worked in VLDW for 8 years, I've worked in ADW (Active or real-time data warehousing) for the past 3 to 4 years, and I've been involved with non-data warehousing data integration projects using ETL and massive volumes.
Now I'll say it again - there's a difference, a time, a process or set of processes, and singular points of architecture where everything converges. If you've seen me present VLDW, you've seen my Pyramid diagram that shows the impact of volume and latency on the number of ways to "execute". This entry revisits ELT, or ETLT, a term I wrote about in Teradata Magazine over 5 years ago. Warning: read on if you like a good "one-sided" rant... I lay it on the table here; after all my years of experience, I just need to share. As always, I'm open to other sides of the argument - and invite anyone with alternative views to comment on this entry. If you agree, I'd love to hear from you as well...

August 29, 2006
Where should Dirty Data be cleaned up?
I just blogged on the need for allowing dirty data to flow through to an auditable reporting area. There are a lot of questions about WHERE the dirty data should reside, and where the dirty data should be cleaned up. In this blog entry we'll dive in to take a look at that in a short consolidated view. If you are fighting with compliance and auditability at a systems or data level, hopefully this will be helpful.

August 20, 2006
Dirty Data Sets = Hemorrhaging Money in Business
In my most recent blog entry I discuss temperature-related data sets; near the bottom I bring up a lot of questions about large-scale data sets and dirty data. Let's pick up where I left off... One thing is clear: as we march forward, our data sets will only grow, not shrink. Something to take note of is: what exactly is "garbage data", what does it mean? Can you identify it and remove it from your systems without impact to audits? If you clear it out, are you removing the possibility of tying together or discovering a meaningful relationship across your business that you didn't have before?
If you have garbage data, does it mean your business is hemorrhaging money?

May 1, 2006
IT from Cost Center to Profit Center - Next Steps
In a previous blog entry I discussed the nature of turning IT from a cost center into a profit center. In this entry I'll disclose a few of the requirements, and a few of the steps needed in order to head in that direction. Of course, you can always comment or send me an email - I'll be happy to discuss these items with you directly. Keep in mind that at the company I did this with: a. I was an employee of IT, b. I was the "new guy" in the good-old-boys-network, c. there was a huge disbelief that IT could deliver anything, let alone on time or within budget, d. the organization was split by business contracts; my co-worker couldn't work on my project without budgetary approval even though they were a part of IT.

April 15, 2006
Demystifying SoR (System of Record) and MDM
When Claudia Imhoff and Shawn Rogers and I got together for lunch the other day, we discussed this notion of SoR - it's a very interesting take. SoR has long been held as a single definition, and has been defined as residing in the source systems. Today, there are multiple definitions (3 to be exact) of SoR, particularly since MDM evokes new notions of what SoR means to the business, as does a compliant and auditable enterprise warehouse. In this entry I'll walk through the multiple definitions of SoR. In my MDM night course in August at TDWI (2006, San Diego) I'll be discussing many of these things.

April 7, 2006
Clarifying MDM - Setting Standards
I've had a lot of great feedback on the MDM blogs that I've been adding lately, and one kind individual sent me an email asking for a couple of things, including a definition, practical criteria, a practical taxonomy, and to keep the picture simple enough for organizations to use. In this entry I will do my best to offer my *opinion* on the subject. I am open to comments, corrections, and thoughts from all of you - again, this will be only my opinion. Please note that my opinion is biased towards compliance, accountability of data, traceability, and accountability of business users, and arises from my experiences with SEI/CMM, PMP, Six Sigma, TQM, BPR, Lean Initiatives and Cycle Time Reduction. I can't wait to hear back from you.

March 1, 2006
Data Warehouse Appliance, another look
Appliance-based data warehousing is on the rise, and no wonder - the costs per terabyte are lower, and for specific applications of the warehouse, sometimes these platforms are blazingly fast. They offer plug and play technology with HA (high availability) and Fail Over just by plugging in another appliance. They offer remote management, self-updates to the BIOS and firmware, and most of them run on open operating systems like Linux. In this blog entry I'll discuss both the pros and cons of Appliance Based warehousing, but I still believe that this will be a market segment to watch, and will eventually flood the market with the backbone for high availability data integration and warehouses.

February 28, 2006
MDM Part Deux (II)
My last post generated some great responses, ranging from Master Data Management as MDM philosophy to MDM Data Marts as "better than gospel" or better than system of record on the source system.
In this entry I will take a look at MDM in a little more detail, and try to offer my viewpoints on some of the issues raised in the responses. First, thank you to all those who are responding; I enjoy reading your thoughts.

February 7, 2006
Master Data Management - Just Another Mart?
Ok, we've all heard the term: MDM or Master Data Management, but what the heck does it mean? My opinion is different than most, and in my search for the ultimate compliant warehouse I constantly battle with new acronyms... What EXACTLY is MDM? What about CDI (customer data integration)? Ok, how about these: PDM, BAM, BPR, LI, CRM, ERP, SEI, CMM, PMP, HIPAA, SARBOX, and so on? Well, I went overboard again (oh happy days...) Acronym soup is nothing new to me. I've been in the industry for over 15 years, been through the 80's and the Business Process Reengineering (BPR), also known as Lean Initiatives (LI), Six Sigma quality improvements, and ISO (International Organization for Standardization) setup. Now comes BAM (Business Activity Management), MDM (Master Data Management), CDI (Customer Data Integration), PDM (Parts Data Management), SCM (Supply Chain Management), CML (Customer Master Lists), and so on... Maybe one day we will realize that some of these acronyms are just new names for OLD (but valid) business goals. In this case, it's all about the business management, better quality, and shorter lead times, reduced overhead and increased revenue - after all, if I'm not making money, then why am I in business? In this entry I will focus on MDM and explore just what it is.

January 24, 2006
Governance is Muddy Water.
I've been doing a lot of research lately on the nature of Governance. There are a lot of misconstrued definitions in the market place, and a lot of vendors throwing around terms that they don't define.
It seems like I've found definitions for Corporate Governance, and IT Governance, and even a definition for "Country" Governance, but finding definitions of SOA Governance and Data Governance leads to muddy water. "It's like trying to catch smoke with your bare hands!" (Harry Potter, The Prisoner of Azkaban).

January 6, 2006
Redefining the EDW and ODS
This was a hot topic for most of you. With compliance breathing down our necks and the government hot on the auditing trail, we have to do something. And something we shall do! In fact, the nature and notion of EDW and ODS is changing, as I blogged in my most recent entry in this category. I made a statement: "Flip the coin, and store RAW data as-it-stood on the source system, but in an integrated fashion in your data warehouse; now what have you got? A solid architecture (if modeled properly) which allows data to be auditable from that time period before the change. The Data Warehouse has now become a system-of-record." A comment was made that this sounded like an oxymoron, and I was asked to elaborate. In this entry I'll attempt to explain what I mean by this statement. It's very possible that I didn't state it quite "correctly"....

November 3, 2005
Redefining the "Data Warehouse" and Combining ODS+EDW
Alright - let's get down to business. I've blogged before about the convergence happening in the market place, but we've not stopped to consider what should be happening across the ODS and Data Warehouse. I went to dinner with Bill Inmon the other night, and he told me his "basic definition of what a data warehouse is, is changing." I agree - and by the way, it's not just me. I've spoken with Stephen Brobst, Claudia Imhoff, Larry English, and quite a few others. The base definition of what a warehouse is and represents is changing - and for good reason.
There are compliance initiatives afoot, there are problems with multiple copies of the data hanging around in the system, and there are issues of change to be covered in the source systems...

October 20, 2005
Standards, Compliance, and Successes
I've been asked about standards, and what they contribute to the success of a project within business, particularly following the entry on Architecture, Standards, and Business. Standards contribute quite a bit, actually. But standards can also be overkill. There are some neat comments on the Agile Modeling forum regarding the use of standards, and I've spoken with Scott Ambler about some of these things (but not yet in detail). Grady Booch and I have discussed the nature of useful standards in brief conversations, from which we still have to draw some conclusions - with that, let me continue my entry.

October 3, 2005
Personal Security and your information
I've blogged about this recently: the judge in SF who basically ruled that credit card companies don't have to be accountable for telling you if your information is stolen, right? Well, here's the flip side to this story. Turns out CardSystems is having stock trouble; on-line card processing merchants have seen sales fall a couple of percentage points since the breach. Maybe they'll begin paying attention?

September 30, 2005
Can we get RFIDS for Data?
Hmmm, I've been thinking about this for quite a while. In the tangible world we have tags for physical goods - yesterday they were bar codes, today they're RFIDS and RTLS systems. Tomorrow, physical elements may be tagged with DNA sequences, or electron signatures at the nano level. Why then is it so hard to track intangible "data"? For applications we have the equivalent of software licenses, but for the actual data? Nothing.

July 27, 2005
EII - does it have a chance to survive?
EII, aka Enterprise Information Integration. Does it really have a chance to survive? Or is it just another passing fad? As an architecture it makes sense, a lot of sense - but then there's SOA, with a much larger view of the world, and a lot more integration under the covers. So is EII just the technology to make SOA work? Or is there something else going on here?

May 2, 2005
ELT and ETL - candid view of pros and cons.
Now that I've blogged on the needs for an ETL-T engine, I think it only fair to discuss what EL-T still leaves to be desired, and what is required to make EL-T perform. While ETL-T is the industry direction, EL-T has a ways to go before it can "take over". Of course the notions of ELT "successes" are highly dependent on the RDBMS engine that it puts its data in. Let's explore these notions a little deeper...

April 25, 2005
ETL, ELT, EAI, EII and E-I-E-I-O
Well well, lookie here - Old MacDonald had a farm, E-I-E-I-O. (Sorry, on a bit of a funny kick today.) What do all these things have in common? Moreover, what problem are they trying to solve? Are some of these technology stacks "sun-setting"? In this blog we explore some of these garbled acronyms, and no - I won't repeat the farm joke... We'll also take a hard look at some of the existing business issues that are forcing changes in the way we (IT) work. If nothing else, a bit of light reading - you might get a laugh or two out of this... :)

March 26, 2005
Compliance, Data Integration, Part 2
As strange as it sounds, statement-of-fact is exactly what the data warehouse should become. See Bill Inmon's article on Bill Inmon's Vision for a data warehouse.
In this second installment we explore compliance, auditability and integration routines. Let's take a look at data we think is not auditable by compliance rules...

March 24, 2005
Compliance, Data Integration, Accountability?
In this week's newsletter Bill discusses Sarbanes-Oxley and what it means to business. See Bill Inmon's newsletter article. In this blog we take it a step deeper - into the implementation world of data integration. What does compliance mean to those building ETL, EAI, EII and Web Services routines? What does it mean to the data set both IN the data warehouse and now being loaded into the data warehouse? What will data integration have to endure in the coming year or two commercially? This category of blogs will explore these questions and more.