

From Business Intelligence to Enterprise IT Architecture, Part 5

Originally published August 3, 2010

This ongoing series of articles describes a new enterprise IT architecture that emerges from a reevaluation of modern business intelligence needs and technologies. The second article introduced the Business Integrated Insight (BI2) architecture as an evolution of data warehousing / business intelligence, while Parts 3 and 4 drilled into the information layer, the Business Information Resource (BIR). This article concludes our exploration of the BIR, focusing on the reliance / usage axis.

The Information Space



Figure 1: The Three Axes of the BIR Information Space

As described in Part 3 and shown one last time in Figure 1, the independent axes of the three-dimensional space along which information is positioned are:
  1. Timeliness / Consistency: Describes how information moves from creation for a specific purpose to broader, consistent usage and on to archival or deletion. (TC axis)
  2. Knowledge Density: Describes the journey of information from soft to hard and the related concept of the amount of knowledge embedded in it. (KD axis)
  3. Reliance / Usage: Describes the path of information from personal to enterprise usage and beyond and the implications for how far it can be relied upon. (RU axis)
Let’s now focus on the third dimension, the reliance / usage axis.

The RU Axis: Reliance / Usage

As we've seen in the previous two articles, the axes of the information space were chosen to correspond to fundamental issues that arise whenever we want to use information beyond the context of a single application. The TC axis addresses the basic issues of timeliness and consistency of information that arise when we combine data from two or more sources. The data classes on this axis are the foundation for classical data warehousing. The KD axis arises when we look beyond traditional hard information (or “data” in common computer usage) and consider the increasing need to include softer information (or “content”) within the scope of the architectures we use to structure and manage business information. As we saw, this axis provides important new insights suggesting that soft information should not be copied into traditional data warehouse structures. Instead, metadata should be used to bridge between these different classes of information.

The RU axis addresses yet another fundamental information management issue. This issue emerged at the moment it became possible to copy corporate information into personally managed storage. The oft-discussed problems arising from the use of spreadsheets [1] in business illustrate the issue concisely. When a business user accesses data in the corporate business intelligence (BI) system, there is a justifiable assumption about the reliability of that data. The accuracy, consistency and timeliness of the information are guaranteed by the information management processes and procedures embedded in the data warehouse and elsewhere. However, the moment this information is downloaded into a personal spreadsheet, all reliability bets are off. The information can be changed deliberately – perhaps even with criminal intent – or accidentally. Erroneous calculations may be embedded in the cells of the spreadsheet. The level of reliance that can be placed in this data is considerably less than can be placed in its data warehouse source.

Reliability on its own is interesting. But it's what you do with the information – usage – that makes this topic compelling. If our business person above uses his spreadsheet simply to check up on current sales and calculate the likelihood of reaching his month-end targets, the reduced reliability is relatively unimportant. If he brings the results to the monthly sales forecasting meeting, having the correct numbers and calculations is more important. And what if the senior vice president for sales likes the spreadsheet so much that she decides it should be used by all her reportees for the next year? It becomes pretty obvious that widespread use of such low-reliability information could lead to significant problems.

The RU axis provides a conceptual basis for examining this problem and deciding how one might best address it. As in the case of the other axes, we identify a number of broad classes of information, bearing in mind that the axis is a continuum and that the class boundaries are relatively fluid.

Let's begin in the more familiar territory around the centre of the axis and work outwards from there. The local scope of reliance / usage implies that a set of information is well understood and trusted by a defined set of users. This user set may be a department or business functional area, or may be a team temporarily assembled to undertake a particular piece of work. As we move to the enterprise class, we mean that information here can be relied upon throughout the enterprise in structure, meaning and content. Of the hard information stored in the BI environment, data marts typically fall in the local class, while the enterprise data warehouse (EDW) clearly resides at the enterprise level. Note, however, that the RU classes should not be defined in terms of applications or databases; in fact, the axis represents a social / organizational construct describing how information can be used in the context of its reliability.

While the local and enterprise classes may be familiar, the global level is seldom explicitly recognized. Information at this level can be trusted and used in the widest arenas. At its most extreme, one could even envisage a universal class of, for example, physical and mathematical constants such as the mass of a proton or the value of pi. In a business sense, the global class refers to information that can be known and trusted between organizations, at a government or international level or within an industrial or geographical grouping. In today's interconnected and highly interdependent world, information at this level of reliance / usage is vital. However, as we have seen in the ongoing, international financial crisis, such trust is in short supply, based partially on the lack of reliable information.

Moving in the other direction along the RU axis, we come to the personal class, which consists of information defined and managed by one person, such as spreadsheets, documents and other files mainly residing on personal computers. As discussed earlier, the assumption that information that is relied upon personally can be used at local or enterprise levels causes significant data management and quality problems. In fact, similar problems arise whenever we try to use information created at one level of reliance / usage at a higher level on the axis.

Vague information, the lowest class on the RU axis, is information whose reliability and possible use are indeterminate or unknown. Information from the Internet often falls into this category. Compare Wikipedia, for example, with Encyclopedia Britannica. Both address the same need – reference information – but Wikipedia information is vague: its provenance is unknown and its reliability open to question. Encyclopedia Britannica falls into the global class: its reliability is vouched for by reliable authorities. Wikipedia is free; Britannica costs money and, prior to the Internet, used to cost a lot of money. This illustrates an important characteristic of the RU axis: moving information up this axis requires an investment in expertise, time and money.
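The five classes described above, and the rule that problems arise whenever information is used at a higher level than the one at which it was created, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article stresses that the axis is a continuum with fluid boundaries, so the discrete `RUClass` enumeration is a simplification, and the names `RUClass`, `needs_promotion` and `promotion_steps` are hypothetical, not part of the BI2 architecture.

```python
from enum import IntEnum


class RUClass(IntEnum):
    """Reliance / usage classes, ordered from lowest to highest reliance.

    Assumption: the continuum is collapsed into five discrete levels.
    """
    VAGUE = 0
    PERSONAL = 1
    LOCAL = 2
    ENTERPRISE = 3
    GLOBAL = 4


def needs_promotion(source: RUClass, intended_use: RUClass) -> bool:
    """True when information would be used above the class it was created at."""
    return intended_use > source


def promotion_steps(source: RUClass, intended_use: RUClass) -> list:
    """List the classes that must be reached, lowest first, to close the gap."""
    return [RUClass(level) for level in range(source + 1, intended_use + 1)]


# The sales spreadsheet: created as personal, then mandated across the sales force.
spreadsheet = RUClass.PERSONAL
wanted = RUClass.ENTERPRISE
print(needs_promotion(spreadsheet, wanted))
print([step.name for step in promotion_steps(spreadsheet, wanted)])
```

Each step in the returned list stands for the investment in expertise, time and money that the article associates with moving information up the axis.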

Practicalities of the RU Axis

In the case of the other two axes, I've discussed technical implications; here I prefer practicalities. While it's true that the RU axis springs largely from technological advances – personal computers and, more recently, the World Wide Web – the movement and management of information along this axis depends as much on improving organizational procedures and personal behaviors as it does on any particular use of technology. Let's continue to focus on the most obvious example of reliance / usage issues – the spreadsheet problem.

Both the joy and sorrow of spreadsheets spring from an early design decision to allow the user to manipulate all the spreadsheet data and create any desired calculations almost entirely free from any computational, tracking or auditability restrictions. As an exploratory tool, this makes perfect sense. And given its origin in the world of personal computing in the late 1970s, with its embedded opposition to the centralized management approach of mainframes, such restrictions would probably have been anathema to early developers and users. Unfortunately, from the viewpoint of reliance / usage, this opposition to tracking and auditability appears to continue to this day. With the widespread and presumably enduring use of such internally unregulated spreadsheets, the options open to IT departments for managing the promotion of personal spreadsheets to group or enterprise class usage are mostly limited to the implementation of best practices and procedures in the organization.

Ideally, what is needed here is a technology that blends the ease of use of spreadsheets with an underlying tracking and auditability infrastructure in an environment that supports innovation by individual users and manages the process by which such innovation is first shared and then promoted to enterprise use under managed and controlled conditions. I've discussed these considerations extensively elsewhere [2] in the context of a relatively new and widely praised analytic tool called Lyza [3].
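To make the idea of "an underlying tracking and auditability infrastructure" more concrete, here is a minimal Python sketch of an artifact that records every change and every promotion. It is not a description of how Lyza or any other product works. The class names, the three-level `ORDER` list, and the two rules enforced (promotion moves one class at a time, and the owner cannot approve their own promotion) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed promotion path inside one organization (hypothetical).
ORDER = ["personal", "local", "enterprise"]


@dataclass
class AuditEntry:
    actor: str
    action: str
    at: datetime


@dataclass
class TrackedArtifact:
    name: str
    owner: str
    ru_class: str = "personal"
    history: list = field(default_factory=list)

    def _log(self, actor: str, action: str) -> None:
        # Every change is appended; nothing is overwritten.
        self.history.append(AuditEntry(actor, action, datetime.now(timezone.utc)))

    def edit(self, actor: str, description: str) -> None:
        self._log(actor, f"edit: {description}")

    def promote(self, target: str, approver: str) -> None:
        # Assumed rule 1: move exactly one class up at a time.
        if ORDER.index(target) != ORDER.index(self.ru_class) + 1:
            raise ValueError("promotion must move exactly one class up")
        # Assumed rule 2: someone other than the owner must approve.
        if approver == self.owner:
            raise PermissionError("owner cannot approve their own promotion")
        self._log(approver, f"promote: {self.ru_class} -> {target}")
        self.ru_class = target


forecast = TrackedArtifact(name="sales_forecast.xlsx", owner="analyst")
forecast.edit("analyst", "added month-end projection formula")
forecast.promote("local", approver="sales_manager")
for entry in forecast.history:
    print(entry.actor, "|", entry.action)
```

The point of the sketch is the separation of concerns: the user still edits freely, while the history and the approval step provide the basis for deciding how far the result can be relied upon.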

Conclusion

The provenance of information used in decision making has always been broader than the classical operational systems often considered as its sole source. With the increasing dependence on the Internet and other external sources, particularly for soft information, determining and tracking the reliability of information is key to managing its valid usage and improving its quality. The procedures for doing this – promoting information up the RU axis – are largely organizational and social in nature. However, I also believe it is possible to build into technology well-thought-out social networking functions, close attention to tracking and auditability needs, and support for promotion procedures. Such an approach is vital to allow IT to cost-effectively manage and control the much-needed promotion of information and its associated processes up the RU axis, so that personal innovation can be safely made available in the wider business as repeatable and reusable procedures in a timely manner.

This final mention of processes leads me neatly to the next article in this series where I'll describe the Business Function Assembly (BFA), the process layer of the BI2 architecture.

End Notes:
  1. Eckerson, W. W. and Sherman, R. P., “Strategies for Managing Spreadmarts: Migrating to a Managed BI Environment,” TDWI Best Practice Report, First Quarter 2008, http://bit.ly/ZVx86.
  2. Devlin, B., “Playmarts: Agility with Control, Reconnecting Business Analysts to the Data Warehouse,” December 2008, http://bit.ly/lN5UV, and “Collaborative Analytics, Sharing and Harvesting Analytic Insights across the Business,” June 2009, http://bit.ly/otVWP.
  3. See www.lyzasoft.com for further product information.

  • Barry Devlin
    Dr. Barry Devlin is among the foremost authorities in the world on business insight and data warehousing. He was responsible for the definition of IBM's data warehouse architecture in the mid '80s and authored the first paper on the topic in the IBM Systems Journal in 1988. He is a widely respected consultant and lecturer on this and related topics, and author of the comprehensive book Data Warehouse: From Architecture to Implementation.

    Barry's interest today covers the wider field of a fully integrated business, covering informational, operational and collaborative environments and, in particular, how to present the end user with a holistic experience of the business through IT. These aims, and a growing conviction that the original data warehouse architecture struggles to meet modern business needs for near real-time business intelligence (BI) and support for big data, drove Barry’s latest book, Business unIntelligence: Insight and Innovation Beyond Analytics, now available in print and eBook editions.

    Barry has worked in the IT industry for more than 30 years, mainly as a Distinguished Engineer for IBM in Dublin, Ireland. He is now founder and principal of 9sight Consulting, specializing in the human, organizational and IT implications and design of deep business insight solutions.

