Data Warehouse Construction: Constructional Efficiency
Originally published April 4, 2013
Computational Algorithm, Problem Complexity and Computational Efficiency
In computer science, the term problem complexity usually expresses the cost of solving a given problem with the best available computational algorithm for it. The more complex a problem is, the higher the cost of solving it with that algorithm may be. Here, we are mainly interested in the time and storage that the algorithm may need. From the opposite perspective, we investigate the time and storage that different algorithms require to solve a given problem in order to compare their computational efficiency.
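To make the comparison of computational efficiency concrete, here is a minimal sketch that counts the comparisons two well-known algorithms need for the same problem, namely finding a value in a sorted list. The function names and the test data are illustrative assumptions and do not come from the article.

```python
def linear_search_steps(items, target):
    """Scan left to right; return (index, number of comparisons)."""
    steps = 0
    for index, value in enumerate(items):
        steps += 1
        if value == target:
            return index, steps
    return -1, steps


def binary_search_steps(items, target):
    """Halve the search range of a sorted list; return (index, comparisons)."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        mid = (low + high) // 2
        steps += 1
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps


# Same problem, two algorithms: look up the last element of a sorted list.
data = list(range(1024))
linear_result = linear_search_steps(data, 1023)
binary_result = binary_search_steps(data, 1023)
print("linear search comparisons:", linear_result[1])
print("binary search comparisons:", binary_result[1])
```

Both functions return the same index, but the number of comparisons grows linearly with the list length in the first case and only logarithmically in the second. This is the kind of difference that a comparison of computational efficiency is meant to reveal.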
Constructional Approach and Constructional Efficiency or Expense
Constructing a mechanism for updating a target table in the data warehouse from the data in the corresponding source tables is also a problem (just like constructing an automobile or a hotel). We are now interested not only in the time and storage needed to execute the update operation, as with the classical dogma mentioned above, but especially in the expense, i.e., the time and money, that constructing the related programs incurs under a given constructional approach. In other words, we are now interested in the constructional efficiency of different constructional approaches.
Keeping this in mind, let us analyze the constructional efficiency, or the constructional expense if expressed a little pessimistically, in more detail, namely at the level of updating a single target table.
Traditional Constructional Expense Determiners
As a matter of fact, apart from the number of attributes of the target table, the following factors mostly determine the constructional expense under the traditional constructional approach:
Mechanism Complexity Measured with the MGO Approach
From the standpoint of the new approach, i.e., the metadata-driven generic operator (MGO) approach, we now have a completely new and qualitatively different situation concerning the expense assessment. That is, once the relevant MGOs are in place, the constructional expense is determined virtually exclusively by the related operative metadata. This is due to the fundamental fact that the treatment of most of the "traditional" expense factors listed above is already accomplished as the treatment of the domain-generic knowledge. Stated another way, it has already been coded in the MGOs and, most importantly, it has been coded only once and forever. Consequently, it is no longer an expense consideration.
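The idea that the generic logic is coded once and that each target table then costs only its operative metadata can be sketched as follows. This is a hypothetical illustration and not the author's actual MGO implementation: the class `TableMetadata`, the function `generic_update`, the flag `keep_history`, and all table and column names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TableMetadata:
    """Hypothetical operative metadata describing one target table."""
    key_columns: tuple           # columns that identify a row in the target
    column_mapping: dict         # target column -> source column
    keep_history: bool = False   # illustrative flag: keep old row versions


def generic_update(target, source_rows, meta):
    """Generic update operator: written once, driven only by `meta`.

    `target` maps a key tuple to a list of row versions (last one is current).
    """
    for src in source_rows:
        row = {tgt: src[col] for tgt, col in meta.column_mapping.items()}
        key = tuple(row[k] for k in meta.key_columns)
        versions = target.setdefault(key, [])
        if versions and versions[-1] == row:
            continue                  # unchanged row, nothing to do
        if meta.keep_history or not versions:
            versions.append(row)      # new key, or a new version is kept
        else:
            versions[-1] = row        # overwrite the current version
    return target


# Two target tables: only the metadata differs, the operator is reused as is.
customer_meta = TableMetadata(
    key_columns=("customer_id",),
    column_mapping={"customer_id": "id", "city": "town"},
    keep_history=True,
)
product_meta = TableMetadata(
    key_columns=("sku",),
    column_mapping={"sku": "sku", "price": "list_price"},
)

customers = generic_update(
    {}, [{"id": 1, "town": "Bern"}, {"id": 1, "town": "Basel"}], customer_meta
)
products = generic_update(
    {}, [{"sku": "A", "list_price": 10}, {"sku": "A", "list_price": 12}], product_meta
)
```

In this sketch, turning history keeping on for a table is a single flag in its metadata, while the logic that handles history lives in `generic_update` and is never rewritten per table.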
Based on this fact, we can classify the mechanism complexity, or the related constructional expense, of an updating mechanism for a given target table by employing the MGO approach, for instance, as follows:
Expense Assessment Incommensurability between Constructional Paradigms
It should be noted that the two constructional approaches have essentially different determiners for their respective constructional expenses. A factor that is extremely challenging and expensive for one approach can be a mere banality for the other. For instance, constructing and thoroughly testing a mechanism that treats the eighth factor listed above for a target table may take several working days with the traditional approach, whereas with the MGO approach it may be only a flag setting, which in turn means only a finger movement within a fraction of a second. As mentioned above, this is because the mechanism has already been implemented generically in the related MGO and is applicable to every target table if required. This is valid for all factors listed above. Last but not least, it is worth mentioning that even the "very complex" updating enumerated above with the MGO approach is generally not more complex than the usual "simple" ones with the traditional approach.
It may be noted that the two expense assessment systems presented above are in fact incommensurable with each other. The reason is that they are defined in two paradigmatically different contexts. Between two different paradigms, things cannot always be compared meaningfully with each other, as observed by Thomas Samuel Kuhn.
Nevertheless, constructional practice offers an effective way out of this dilemma: measuring the time and money actually needed to construct, with each approach, mechanisms that have the same functionality and the same quality.
Conclusion
This and my previous articles are dedicated to a thorough analysis and detailed explanation of the reasons why the MGO approach can save so much time and money in constructing sophisticated data warehouses in comparison to the traditional approaches. As a matter of fact, the result of this conceptual work has also been confirmed by large-scale realizations, with a productivity improvement factor of twenty in terms of time and money (B. Jiang, 2011). Last but not least, the MGO approach ensures a significantly better quality extract-transform-load mechanism, the core of any data warehouse, in terms of the mechanism's:
Recent articles by Bin Jiang, Ph.D.
Copyright 2004 — 2019. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC