
Data Warehouse Construction: A Constructional Paradigm Shift?

Originally published January 10, 2013

During a Great Cultural Paradigm Shift

Once upon a time, from 1966 to 1976, a massive political movement called "The Great Cultural Revolution" took place in China. During this period, no translated books were available in the bookshops there, apart from those written by dead communist leaders such as Marx, Engels, Lenin and Stalin. There was, however, one exception, an absolute exception! The Structure of Scientific Revolutions, the work of a living American, Thomas Samuel Kuhn (1922–1996), was translated into Chinese and available in all major bookshops and libraries. The reason for this exception should be obvious: It is about revolutions of some kind, and the Chinese of this period enthusiastically loved everything promoting revolutions of any kind. Thus, I had the luck of getting a copy and reading it for the first time only several years after its publication (1962).

The most influential term introduced in this book is undoubtedly "paradigm shift." It has been used ever since to express a radical change in a fundamental model of, or basic assumptions about, reality that substantially influences the way a certain community perceives, thinks, behaves or organizes itself. It is an expression that has never grown old. If you google for images of "paradigm shift" today and look at the resulting collection, you will realize how current and popular this term is, even after its fiftieth birthday. However, it has almost always been a challenge to use this term properly, even for Kuhn. Here, I would like to make another attempt.

Scientific and Social Paradigms and Their Shifts

Kuhn defined his scientific paradigm as a set of universally recognized scientific achievements that, for a time, provide model problems and solutions, i.e., exemplary experiments, for a community of researchers. Underpinning these exemplars are shared preconceptions that embody hidden assumptions and quasi-metaphysical elements; they are made prior to the collection of evidence and condition it.

Kuhn saw sciences as going through alternating periods of normal science and scientific revolution, i.e., mature or dominant paradigm and paradigm shift. During the former period, an existing model of reality dominates a protracted period of problem-solving, while during the latter one the model of reality itself undergoes drastic change to obtain more power for explaining the observed reality. An example in physics is the worldview transition from the Newtonian one to the Einsteinian one.

Although Kuhn did not consider the concept of paradigm appropriate for the social sciences, social scientists introduced the concept of social paradigm in their own context. There, the term denotes the set of beliefs, values, systems of thought and experiences in a society that are standard and most widely held at a given time. This paradigm is shaped by the cultural background and the historical moment of the society, and affects the way each individual in the society perceives reality and responds to that perception.

Like Kuhn, they used paradigm shift to denote a change of paradigm, here a change in how a given society goes about understanding and organizing social reality. They focused on the social circumstances that precipitate such a shift and on the effects of the shift on social institutions. This shift in the social arena, in turn, changes the way the affected individuals perceive social reality.

As a matter of fact, the core component of any paradigm is a small set of fundamental assumptions. It is on these assumptions that a model of reality, a theory of science or the beliefs and values of a community are built. All other components of the paradigm are only material or spiritual implications derived from these basic assumptions. These assumptions are set up consciously or unconsciously through our limited perceptions, observations, experiences, and abilities for imagining and reasoning. Since the logic never changes, it is changes in the fundamental assumptions that lead to a paradigm shift. Examples of old fundamental assumptions that have since changed are: Speed is not a factor affecting the characteristics of the material world; God stipulates everything as it is; etc. Paradigm shifts are generally motivated by the prospect of certain benefits, such as more explanatory power or greater effectiveness of some kind.

Constructional Paradigms

In general, there are two major constructional paradigms. One is that represented by, say, Archimedes and the other by Henry Ford.

Due to limited observations, experiences and imagination, Archimedes may have assumed that it is generally most effective and convenient to make each product individually and independently of the others. Based on this assumption, the whole society, including the organization of production, was shaped correspondingly, as described in Data Warehouse Construction: How Would Great Engineers have Done It?

The decisive observation made by Ford and his design team was that the work performed for many car components was absolutely identical. Based on this observation, they must have assumed that it could be more effective if the same work, regardless of whether or not it is for the same car, were done by the same team of workers. Presumably, they also verified this assumption by experiments. Based on and derived from this fundamental assumption, complete assembly lines, whole factories and then the entire society were organized accordingly. In this way, a constructional paradigm shift took place, and it still influences our constructional life today.

Traditional Data Warehouse Constructional Paradigm

There is at least one corner where Ford’s light has not completely broken in, i.e., the corner of data warehouse construction. Here, a mixture of the paradigms of Archimedes and Ford is still the dominant paradigm. We call it the traditional one. Based on the observations of and experiences with constructing other types of software programs, two basic assumptions were made here:
  • It should be most effective if the same type of work, e.g., design, development or test, is accomplished by the corresponding specialized team (Ford’s paradigm).

  • It should be most effective and convenient if objects to be delivered, such as design specifications or extract-transform-load (ETL) programs, are produced individually and independently from each other (Archimedes’ paradigm).

Based on these assumptions, the traditional organization and methodology for data warehouse construction are principally shaped as follows:
  1. The architect team stipulates the architectural guidelines including algorithms, standards and conventions in accordance with the business strategy, requirements and actuality of the organization.

  2. The designer team develops the program designs including mappings and procedural specifications according to the architectural guidelines provided by the architect team, one design at a time.

  3. The developer team realizes the ETL programs adhering to the corresponding specifications and mappings, one program at a time.

  4. The tester team tests these programs developed by the developer team according to the respective specifications and mappings, one program at a time.

With the terms introduced in Data Warehouse Construction: Behavior Pattern Analysis and Data Warehouse Construction: Generator, Generic Knowledge and Operative Metadata, the informational scenario of the traditional paradigm for data warehouse construction can now be illustrated by Figure 1.

Figure 1: The Traditional Paradigm for Data Warehouse Construction (B. Jiang, 2011)

Here, the object-specific knowledge, represented by the gray balls, is nothing but the operative metadata describing individual objects like tables or mappings. It is not repetitive, although the corresponding balls have the same color. All white balls, however, stand for the same domain-generic knowledge. It is distributed to hundreds or even thousands of design specifications, programs and test specifications repetitively as soon as it leaves the architect office. As emphasized by the frame in the figure, this is an essential characteristic of the traditional paradigm. That the gray balls partially cover the white ones means that both types of knowledge are typically mixed and cannot be easily separated with the traditional paradigm.
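The repetition of domain-generic knowledge can be made tangible with a small sketch. It is only an illustration under assumptions: the table names, the SQL statements and the "stamp every row with a load date" convention are hypothetical and do not come from the original projects. Each hand-written program carries its own copy of the same convention, mixed with its object-specific details.

```python
# Hypothetical hand-written ETL programs, one per target table.
# The table and column names are invented for illustration only.
HANDWRITTEN_PROGRAMS = {
    "dw_customer": (
        "INSERT INTO dw_customer (id, name, load_date) "
        "SELECT id, name, DATE('now') FROM src_customer"
    ),
    "dw_product": (
        "INSERT INTO dw_product (id, title, load_date) "
        "SELECT id, title, DATE('now') FROM src_product"
    ),
    "dw_order": (
        "INSERT INTO dw_order (id, amount, load_date) "
        "SELECT id, amount, DATE('now') FROM src_order"
    ),
}

# An assumed domain-generic convention: every loaded row is stamped
# with the load date. Every program above repeats it.
GENERIC_FRAGMENT = "DATE('now')"


def copies_of_generic_knowledge(programs, fragment):
    """Count the programs that embed their own copy of the generic fragment."""
    return sum(1 for sql in programs.values() if fragment in sql)


if __name__ == "__main__":
    copies = copies_of_generic_knowledge(HANDWRITTEN_PROGRAMS, GENERIC_FRAGMENT)
    print(f"{copies} of {len(HANDWRITTEN_PROGRAMS)} programs repeat the convention")
```

In this toy setting, a change to the convention has to be applied once per program; with hundreds or thousands of programs, the number of places to touch grows accordingly.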

A New Data Warehouse Constructional Paradigm

In the above-mentioned articles and in Data Warehouse Construction: The Real Life, we analyzed the behavior patterns that follow from the traditional data warehouse constructional paradigm and pointed out that this old paradigm is not effective because it distributes the domain-generic knowledge. Based on this observation, a new paradigm is proposed by assuming that the construction should be substantially more effective if the domain-generic knowledge is kept centralized (B. Jiang, 2011). As a matter of fact, this assumption has also been verified in large-scale implementations, with productivity improved by a factor of twenty (B. Jiang, 2011).

In comparison to Figure 1, the impact of this new paradigm on the data warehouse construction can be illustrated by Figure 2.

Figure 2: The New Paradigm for Data Warehouse Construction (B. Jiang, 2011)

Here, "DDT-team" stands for a team consisting of designers, developers, and testers as roles for simultaneous acquisition and verification of the operative metadata. They verify the just acquired metadata by using the corresponding generic programs, such as program generators, to generate programs from this metadata and then executing them. If the generated programs are not yet executable, they correct the metadata until the programs work. In principle, they no longer pay any attention to the handling of generic knowledge. Thus, they have no opportunity to "interpret" the generic knowledge either. In this way, the major cause of constructional ineffectiveness, i.e., multiple versions of the interpretation of the generic knowledge by means of programming, as extensively discussed in the above-mentioned articles and in Metathink: An Enterprise-Wide Single Version of the Truth, can be completely avoided. It is in the architect team that the domain-generic knowledge is stipulated and encapsulated or hard-coded directly into a very small number, e.g., a dozen, of generic programs such as program generators, which serve as containers for the domain-generic knowledge so as to keep it always centralized. This centralization makes effective maintenance and accurate, up-to-date documentation of the (small number of) generic programs a nonissue. Note that the operative metadata is always accurately documented and kept up to date, since it is exactly the operative metadata that was used for the program generation.
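The division of labor described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the generators used in the author's projects: the metadata layout, the table names, the load-date convention and the functions generate_load_sql, verify and build_demo_db are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical operative metadata: one entry per mapping. This is the
# object-specific knowledge that a DDT-team would acquire and correct.
MAPPINGS = [
    {"source": "src_customer", "target": "dw_customer", "columns": ["id", "name"]},
    {"source": "src_product", "target": "dw_product", "columns": ["id", "title"]},
]


def generate_load_sql(mapping):
    """Generic program: the single place where the (assumed) load convention,
    stamping every row with a load date, is encoded."""
    cols = ", ".join(mapping["columns"])
    return (
        f"INSERT INTO {mapping['target']} ({cols}, load_date) "
        f"SELECT {cols}, DATE('now') FROM {mapping['source']}"
    )


def verify(mapping, conn):
    """Generate the program from the metadata and execute it.
    Returns None on success, or the database error message on failure."""
    try:
        conn.execute(generate_load_sql(mapping))
        return None
    except sqlite3.Error as exc:
        return str(exc)


def build_demo_db():
    """Create a small in-memory database with invented source and target tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_customer (id INTEGER, name TEXT);
        CREATE TABLE dw_customer (id INTEGER, name TEXT, load_date TEXT);
        CREATE TABLE src_product (id INTEGER, title TEXT);
        CREATE TABLE dw_product (id INTEGER, title TEXT, load_date TEXT);
        INSERT INTO src_customer VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO src_product VALUES (10, 'Widget');
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for mapping in MAPPINGS:
        error = verify(mapping, conn)
        print(mapping["target"], "ok" if error is None else f"fix metadata: {error}")
```

If a mapping names a column that does not exist, verify returns the database error and the metadata entry is corrected; the generic function itself is not touched. A change to the load convention, in turn, is made once in generate_load_sql rather than in every program.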

It is noteworthy that the two types of knowledge are no longer mixed in the figure. Moreover, all other activities depicted in Figure 1 disappear completely. There are only two teams in the figure, corresponding exactly to the two types of knowledge. It should now be apparent that the organization and methodology for data warehouse construction are fundamentally changed as well in comparison with the traditional ones. Furthermore, individuals who have been involved in or trained with the new paradigm perceive the constructional reality completely differently. Almost always, they automatically look for the hidden genericity as soon as they are assigned a new task. All of these are typical phenomena accompanying paradigm shifts.

Above, we mentioned program generators as a representative of the new paradigm for the sake of presentation. In fact, within this new paradigm, there is another generic approach that is even more effective than the generators as we know them. I will discuss this in detail in my next articles.

  • It was said that Kuhn said goodbye to the term "paradigm" publicly at the Nymphenburg lecture in 1984 after about 34 unsuccessful attempts to define it accurately.

  • In this article, I gratefully used a lot of information about "paradigm shift" from en.wikipedia.org and de.wikipedia.org.
  • Bin Jiang, Ph.D.
    Dr. Bin Jiang received his master’s degree in Computer Science from the University of Dortmund / Germany in 1986. In 1992, he received his doctorate in Computer Science from ETH Zurich / Switzerland. During the research period, two of his publications in the field of database management systems were awarded as the best student papers at the IEEE Conference on Data Engineering in 1990 and 1992.

    Afterward, he worked for several major Swiss banks, insurance companies, retailers, and with one of the largest international data warehousing consulting firms as a system engineer, software developer, and application analyst in the early years, and then as a senior data warehouse consultant and architect for almost twenty years.

    Dr. Bin Jiang is a Distinguished Professor of a large university in China, and the author of the book Constructing Data Warehouses with Metadata-driven Generic Operators (DBJ Publishing, July 2011), which Dr. Claudia Imhoff called “a significant feat” and for which Bill Inmon provided a remarkable foreword. Dr. Jiang can be reached by email at bin.jiang@bluewin.ch

    Editor's Note: You can find more articles from Dr. Bin Jiang and a link to his blog in his BeyeNETWORK expert channel, Data Warehouse Realization.
