Blog: Claudia Imhoff

Claudia Imhoff

Welcome to my blog.

This is another means for me to communicate, educate and participate within the Business Intelligence industry. It is a perfect forum for airing opinions, thoughts, vendor and client updates, problems and questions. To maximize the blog's value, it must be a participative venue. This means I will look forward to hearing from you often, since your input is vital to the blog's success. All I ask is that you treat me, the blog, and everyone who uses it with respect.

So...check it out every week to see what is new and exciting in our ever changing BI world.

About the author >

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines.

She is also the Founder of the Boulder BI Brain Trust, a consortium of independent analysts and consultants (www.BBBT.us). You can follow them on Twitter at #BBBT

Editor's Note:
More articles and resources are available in Claudia's BeyeNETWORK Expert Channel. Be sure to visit today!

 

April 2005 Archives

There have been a number of reports recently about security and privacy breaches involving customer data -- ChoicePoint, Lexis-Nexis and other information providers come to mind. What do you need to do to ensure that this never happens to your corporation's data?

I have been reading a lot lately about privacy and the need to secure customer data from hackers and outsiders who want to misuse the information. The erosion of privacy even within corporations is concerning as well. What is a corporation to do to ensure that its crown jewels -- its integrated customer data -- do not fall into the wrong hands?

It was suggested to me recently that the only way to really secure this data is to encrypt it. If the data were hacked, the intruders would get only a string of gibberish; nothing meaningful would be usable. The only people who could use the data would be those individuals holding the encryption key. Sounds good, but there are problems with this approach too.

Encryption traditionally has involved substantial overhead -- overhead to encrypt the data, which slows down processing (think of loading data into the data warehouse, for example), plus the additional overhead of decrypting the data when someone wants to use it. There is good news here, though. Encryption (and the corresponding decryption) technologies have come a long way, and encryption vendors now claim the ability to encrypt and decrypt with little or no performance hit.
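To make the idea concrete, here is a toy sketch of encrypting a customer column at rest and decrypting it only at query time. This is a one-time pad built on Python's standard `secrets` module, with a made-up customer record; it is an illustration of the concept, not production-grade cryptography:

```python
import secrets

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """XOR one-time pad: the ciphertext reveals nothing without the key."""
    assert len(key) == len(plaintext)
    return bytes(p ^ k for p, k in zip(plaintext, key))

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption reuses the same routine."""
    return encrypt(ciphertext, key)

# Hypothetical sensitive value headed for the warehouse.
customer_ssn = b"123-45-6789"

# The key is generated and held outside the warehouse (e.g., a key vault).
key = secrets.token_bytes(len(customer_ssn))

stored = encrypt(customer_ssn, key)    # what a hacker would see at rest
recovered = decrypt(stored, key)       # what an authorized query sees

assert stored != customer_ssn          # gibberish without the key
assert recovered == customer_ssn       # the round trip is lossless
```

A real deployment would use a vetted cipher such as AES through a maintained library rather than a hand-rolled pad, but the flow is the same: the warehouse stores only ciphertext, and only key holders can turn a query result back into usable data.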

The second thing about encryption -- particularly from a BI point of view -- is that once the data is decrypted and the query result returned to the user, it may be downloaded to his or her PC. Hmmm -- where's the security now? Unless the data stays encrypted even on the user's PC, it seems to me that we still have a major hole in the overall data security and privacy scenario. This is an area that could use some vendor support as well.

If you have any ideas about security and data privacy, I welcome your comments.

Yours in BI success,

Claudia


Posted April 28, 2005 10:46 AM

One of the questions that I get quite often is "Do I need to create an enterprise data model?" In the case of a recent phone call, the caller stated that his company never built applications; they only bought off-the-shelf software, so why would he need such a data model? Here are my reasons for creating the model despite the buy-versus-build philosophy.

The caller's company had bought a set of modules from a large ERP vendor. Now they were looking at another ERP vendor for a new module. They had also purchased CRM applications and even BI ones. Why should his group worry about an enterprise data model (EDM)? I told him that basically there were two main reasons (there are other smaller ones, but let's stick with the biggies):

1. The purpose of the EDM is to capture or document the "ideal state" of the corporation's data needs. In other words, what data is needed (the attributes and entities of interest to the company), how it is associated with other data (the important relationships between entities), the business rules behind the associations (is it a mandatory or optional relationship?), and so forth. When buying a COTS piece of software, you would use your data model to determine how closely the application matches your ideal state. The analysis should determine where there is a perfect match and the software will support your needs exactly; where the application is close but not perfect and some modification or tweaking of the software may be needed; and where there is no match at all, so the corporation must either change the way it currently does business or look for another application that is closer to its ideal. Without the EDM, I don't know how you can make such an assessment of fit.
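The fit assessment described above can be sketched in a few lines. All entity and attribute names here are hypothetical, invented for illustration; the idea is simply to classify each EDM entity against what a candidate package provides:

```python
# "Ideal state" from the EDM: entities and the attributes the business needs.
edm = {
    "Customer": {"customer_id", "name", "segment", "lifetime_value"},
    "Order": {"order_id", "customer_id", "order_date", "channel"},
}

# What the COTS package actually provides (hypothetical).
package = {
    "Customer": {"customer_id", "name", "segment"},
    "Invoice": {"invoice_id", "amount"},
}

def assess_fit(edm, package):
    """Classify each EDM entity as a perfect match, a partial match
    (tweaks needed), or a gap (no match at all)."""
    report = {}
    for entity, needed in edm.items():
        provided = package.get(entity)
        if provided is None:
            report[entity] = "gap"
        elif needed <= provided:
            report[entity] = "perfect match"
        else:
            missing = ", ".join(sorted(needed - provided))
            report[entity] = "partial match: missing " + missing
    return report

fit = assess_fit(edm, package)
# fit["Customer"] -> "partial match: missing lifetime_value"
# fit["Order"]    -> "gap"
```

A real assessment would of course weigh relationships and business rules as well as attribute lists, but even this simple comparison is impossible without the EDM on the left-hand side.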

2. Since no company seems to have a fully integrated set of operational systems (even these folks had different ERP and CRM vendors involved), some sort of "ideal" or master set of translation tables will indeed be needed. The EDM can serve that purpose as well. With the EDM in hand, you have a means of mapping one application to another, using the EDM as the "translation tables" that ensure smooth integration. The definitions of the data, the documented relationships between the data, and so on will ensure that a proper integration can occur.
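A minimal sketch of the EDM acting as that translation table: each application's field names are mapped to the EDM's canonical names, and records pass through the canonical names on their way from one system to the other. The field names below are hypothetical:

```python
# Each application's vocabulary mapped to the EDM's canonical names.
erp_to_edm = {"CUST_NO": "customer_id", "CUST_NM": "customer_name"}
crm_to_edm = {"AccountId": "customer_id", "AccountName": "customer_name"}

def translate(record, source_to_edm, target_to_edm):
    """Translate a record from one application's field names to another's
    by passing through the EDM's canonical names."""
    edm_to_target = {canon: field for field, canon in target_to_edm.items()}
    canonical = {source_to_edm[f]: v for f, v in record.items()}
    return {edm_to_target[name]: v for name, v in canonical.items()}

erp_record = {"CUST_NO": "10042", "CUST_NM": "Acme Corp"}
crm_record = translate(erp_record, erp_to_edm, crm_to_edm)
# crm_record == {"AccountId": "10042", "AccountName": "Acme Corp"}
```

The point is that neither application needs to know the other's vocabulary; each only needs a mapping to the EDM, which is exactly the documentation the model provides.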

Hopefully these thoughts will help you justify the need for an EDM in your own organization. As always, I welcome your input and experiences along these lines.

Yours in BI success,

Claudia


Posted April 26, 2005 10:29 AM

An excerpt from the Open Source Initiative explains why open source is important to the technologists. They state "The basic idea behind open source is very simple: When programmers can read, redistribute, and modify the source code for a piece of software, the software evolves. People improve it, people adapt it, people fix bugs. And this can happen at a speed that, if one is used to the slow pace of conventional software development, seems astonishing."

So what does this mean to you?

Yesterday I spoke with CEO Sam Mohamad and CTO Luke Lonergan from Greenplum (formerly Metapa), a company that is a big proponent of open source technology for BI, about open source and the role it will play. Here are some excerpts from that conversation:

CMI: Why is open source important in today's BI market?

Greenplum: The current applications development community favors open source because it enables them to try a greater breadth of options more quickly, which injects more innovation into the process. Given the increasing stability and features of the open source offerings in many applications, business managers are becoming accustomed to the innovation and cost-effectiveness of the model.

CMI: Why does it have more appeal for larger corporations than smaller ones?

GP: Large corporations traditionally have had more difficulty in innovating than small ones. Open source is changing the model of innovation by putting more power into the hands of the developers and departmental leaders building the newest revenue-producing products. The risk of adoption is being spread over a greater number of applications, and the best are bubbling up with the best-of-breed open source foundations. Companies like RedHat have prospered from this trend.

CMI: What are the disadvantages of open source technologies to corporations?

GP: One of the big drawbacks to adoption of open source tools is that there isn't a clear picture of how the tool will evolve over time. As a result, there is less predictability of feature and function in the open source tool sets.

What mitigates this problem is the focusing influence of companies that commercialize open source tools by concentrating the community on the most relevant feature sets for chosen audiences. Combining this focus with the appropriate support infrastructure makes the packaging more comfortable and predictable for the enterprise.


Posted April 22, 2005 3:16 PM

I can't tell you how many times BI designers have told me that their users want to see reports filled with averages -- average sales by store, average customer purchase, average inventory levels, and so on. Do people really understand how misleading and erroneous these figures can be? Here's my tip for you.

Average calculations are "smoothing" techniques; they remove the ups and downs of the actual data. They may be useful for a high-level trend or estimate, but they can be completely misleading if you base business decisions on them. Let me give you an example of what I mean.

Suppose someone in charge of inventory levels asks for the average monthly sales for product A. The answer comes back that, on average, the company sells 100 units of Product A per month. Does that mean that there should be only 100 units and no more of Product A in inventory every month? I wouldn't bet my job on that...

The average is based on the fact that 1,200 units of Product A sold over a 12-month period. It does not take into account that the product's sales are seasonal, affected by marketing campaigns, or boosted by recurring external factors like sports events. In reality, Product A is quite seasonal, with 80% (or 960 units) of its sales occurring in the summertime -- a three-month period. In fact, the units may sell within a six-week window inside even those three months. The rest are sold on either side of summer, with no sales occurring in winter.

If the inventory were stocked at 100 units per month, there would not be enough available at the height of summer, so sales would be lost. Yet the levels would be far too high in winter, causing unnecessary inventory carrying costs. That is the problem with averages...
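To put numbers on this, here is a small sketch using a hypothetical monthly breakdown consistent with the figures above (1,200 units a year, 960 of them in the three summer months, none in winter, the rest spread evenly), showing what stocking to the 100-unit average would cost:

```python
# Hypothetical monthly demand for Product A.
#                Jan Feb Mar Apr May Jun  Jul  Aug  Sep Oct Nov Dec
monthly_demand = [0,  0,  40, 40, 40, 320, 320, 320, 40, 40, 40, 0]
assert sum(monthly_demand) == 1200

average = sum(monthly_demand) // len(monthly_demand)  # 100 units/month
stock_per_month = average

# Units that could not be sold because the shelf was empty...
lost_sales = sum(max(0, d - stock_per_month) for d in monthly_demand)
# ...and units sitting idle, incurring carrying costs.
excess_stock = sum(max(0, stock_per_month - d) for d in monthly_demand)

print(lost_sales, excess_stock)  # 660 lost sales, 660 excess unit-months
```

Stocking to the smooth average misses more than half the summer demand while warehousing hundreds of idle units the rest of the year, which is exactly the trap the average conceals.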

In gathering requirements from your business users, make sure you dig a bit deeper than a cursory "I want the average sales amounts of..." It is always useful to ask why the data is wanted and how it will be used. It is also wise to store the details behind each of these averages in the warehouse so that it remains possible to look at the actual numbers as well as the averages.

I hope you find this tip useful. If you have others, please add them to this blog.

Yours in BI success,

Claudia


Posted April 20, 2005 3:15 PM

In a recent issue of Conde Nast Traveler magazine (March 2005), the editors reported on a readership poll (designed with Carnegie Mellon’s Risk Perception and Communication and Harvard’s Center for Risk Analysis) aimed at the most fundamental travel question – to go or not to go. The analytic results were surprising in some respects.

The results show that although most of us may be debating this question, we are largely choosing to take the chance and go. What they deemed remarkable is that we are NOT taking seriously the travel advisories and warnings issued by the US State Department and the Department of Homeland Security.

OK – maybe I have been a road warrior for way too long but haven’t we been under a code yellow or orange travel alert for more than three years? Does it surprise you that we travelers are no longer affected by these warnings? That we feel these warnings are influenced more by politics than reality? More by business interests and legal liabilities than real danger to the traveling public?

In short, the poll demonstrated that we don’t trust our government’s security information or analytics. The government desperately needs a reliable BI environment. It must hire the best of the best BI consultants to get its analytics straightened out and produce reliable, consistent, nonpolitical intelligence about our travel situation…

Yours in BI success,

Claudia


Posted April 18, 2005 9:43 AM