I've recently begun research into this area, and I am calling it "Automorphic Data Models" rather than dynamic data warehousing, because I think the concept lends itself better to that term. "Dynamic Data Warehousing" seems to be an overused, slightly abused term in the industry, and it raises quite a few questions about what it is and how it works. Vendors are also using the term to mean different things. We'll let the business and the vendors work out their definition over the next few years. For a while, in this section, I'm going to write exclusively on Automorphic Data Modeling. These entries are aimed at the researchers and the scientific people in the audience.
First, I must apologize to all those who _really_ know this stuff. I am an architect, and an information modeler at heart. I believe these connections exist to the data model architecture I wrote up, called the Data Vault Model, because it is based in spatial-temporal mathematics, and because it is based on the "poor man's definition of how the brain MIGHT store/use/retrieve information." Based on these hypotheses, I can see where the mathematics of these types apply to the model. I'd love to hear from those of you who can say why these theories will or won't work; it will be interesting to see how this progresses.
If we start with Webster's definition of automorphic, we end up with the following:
Patterned after one's self.
The conception which any one frames of another's mind is more or less after the pattern of his own mind, -- is automorphic. --H. Spenser.
However, I prefer the mathematical definition of Automorphism:
In mathematics, an automorphism is an isomorphism from a mathematical object to itself. It is, in some sense, symmetry of the object, and a way of mapping the object to itself while preserving all of its structure. The set of all automorphisms of an object forms a group, called the automorphism group. It is, loosely speaking, the symmetry group of the object. http://encyclopedia.thefreedictionary.com/automorphism
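To make the definition concrete, here is a minimal Python sketch that treats a small undirected graph as the "mathematical object" and checks which vertex relabelings map it onto itself. The graph (a square), the function names, and the tuple encoding of a mapping are all illustrative assumptions; nothing here comes from the Data Vault itself.

```python
from itertools import permutations

def is_automorphism(edges, mapping):
    """True if `mapping` (a tuple where mapping[v] is the image of vertex v)
    is a bijection on the vertices that preserves the undirected edge set."""
    if sorted(mapping) != list(range(len(mapping))):
        return False  # not a bijection on 0..n-1
    edge_set = {frozenset(e) for e in edges}
    mapped = {frozenset((mapping[a], mapping[b])) for a, b in edges}
    return mapped == edge_set

def automorphism_group(edges, n):
    """Enumerate every permutation of the vertices 0..n-1 that preserves the edges."""
    return [p for p in permutations(range(n)) if is_automorphism(edges, p)]

# A square: four vertices joined in a cycle.
SQUARE = [(0, 1), (1, 2), (2, 3), (3, 0)]

rotation = (1, 2, 3, 0)  # rotate every vertex one step around the cycle
swap = (1, 0, 2, 3)      # exchange vertices 0 and 1 only

print(is_automorphism(SQUARE, rotation))    # the rotation preserves every edge
print(is_automorphism(SQUARE, swap))        # the swap breaks the edge (1, 2)
print(len(automorphism_group(SQUARE, 4)))   # size of the symmetry group of the square
```

The rotation keeps the structure intact and the swap does not; the full set of structure-preserving relabelings is the automorphism group the definition refers to.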
Automorphic Groups: (Which is what I'd suggest the Data Vault model is built from)
In mathematics, the general notion of automorphic form is the extension to analytic functions, perhaps of several complex variables, of the theory of modular forms. It is in terms of a Lie group G, to generalize the groups SL2(R) or PSL2(R) of modular forms, and a discrete group Γ, to generalize the modular group, or one of its congruence subgroups. The formulation requires the general notion of factor of automorphy j for Γ, which is a type of 1-cocycle in the language of group cohomology. The values of j may be complex numbers, or in fact complex square matrices, corresponding to the possibility of vector-valued automorphic forms. The cocycle condition imposed on the factor of automorphy is something that can be routinely checked, when j is derived from a Jacobian matrix, by means of the chain rule. http://encyclopedia.thefreedictionary.com/Automorphic+form
Essentially, what we are doing within the Data Vault data model is a form of automorphism. The Data Vault Model is actually based on many different components of temporal mathematics and spatial mathematics. (I've listed a few of the research papers I used in the 1990s to help me construct the structural integrity of the Data Vault):
1. "Unifying Temporal Data Models via a Conceptual Model", http://www.cs.arizona.edu/~rts/pubs/ISDec94.pdf
2. "Notions of Upward Compatibility of Temporal Query Language", http://www.cs.arizona.edu/~rts/pubs/Wirtschafts.pdf
3. "Temporal Data Management", http://oldwww.cs.aau.dk/research/DP/tdb/TimeCenter/TimeCenterPublications/TR-17.pdf
4. "Spatio-Temporal Data Types: An Approach to Modeling and Querying", http://web.engr.oregonstate.edu/~erwig/papers/MovingObjects_GEOINF99.pdf
5. "Formal Semantics for Time in Databases", http://portal.acm.org/citation.cfm?id=319986&coll=portal&dl=ACM&CFID=6511873&CFTOKEN=58729889
The Data Vault model is capable of adapting and changing on the fly. It exhibits the mathematical properties of automorphism in that, through architecture mining combined with data mining, we can "learn" what architecture flaws exist, where stronger relationships exist, and where the architecture can change itself or re-connect to itself to form a stronger data model.
How does this work?
The Data Vault LINK is made up of vectors. It houses a directional connection to each HUB that it is associated with. The vector of that connection can be assigned a strength and a confidence coefficient to determine its usefulness within the data set contained within a link. Mining the data over time can produce a powerful combination of patterns of change, along with the discovery of patterns of association, whether previously unknown or the result of a pre-known state.
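One way to picture a link as a set of weighted vectors is the following Python sketch. The class names, the 0-to-1 scale, and the choice to score usefulness as strength multiplied by confidence are all assumptions made for illustration; the text above does not define how the two coefficients are combined.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(frozen=True)
class Hub:
    name: str          # business concept, e.g. "Customer"
    business_key: str  # column that holds the business key

@dataclass
class LinkVector:
    hub: Hub                 # the hub this vector points to
    strength: float = 0.0    # assumed to lie in 0..1
    confidence: float = 0.0  # assumed to lie in 0..1

    def usefulness(self) -> float:
        # Assumption: combine the two coefficients by simple multiplication.
        return self.strength * self.confidence

@dataclass
class Link:
    name: str
    vectors: List[LinkVector] = field(default_factory=list)

    def weakest(self) -> LinkVector:
        """Return the vector with the lowest usefulness score."""
        return min(self.vectors, key=LinkVector.usefulness)

# Hypothetical hubs and coefficients, for illustration only.
customer = Hub("Customer", "customer_number")
product = Hub("Product", "product_code")
purchase = Link("CustomerProduct", [
    LinkVector(customer, strength=0.9, confidence=0.8),
    LinkVector(product, strength=0.4, confidence=0.5),
])

print(purchase.weakest().hub.name)
```

Keeping the coefficients on the vector rather than on the link as a whole lets each hub connection be judged on its own, which is what the later architecture mining step would need.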
The data mining tool can then be taught "what to look for", or it can be set off in discovery mode to associate information based on a Data Vault model already constructed (using the existing Data Vault model as a starting point for the learning process), and then it can determine if any "undiscovered" relationships exist. Furthermore, the process of mining the data can then be used to assign strength and confidence coefficients to EACH of the vectors in each link, thus preparing for the architectural mining phase.
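The text does not say how the coefficients would be computed, so the sketch below borrows support and confidence from association-rule mining as stand-ins: "strength" is taken to be the share of link rows containing a pair of keys, and "confidence" the share of rows with the left key that also carry the right key. The function name and the sample rows are hypothetical.

```python
from collections import Counter

def score_associations(rows):
    """rows: list of (left_key, right_key) pairs observed in a link table.

    Returns {(left, right): (strength, confidence)} where
      strength   = count(pair) / total rows        (support)
      confidence = count(pair) / count(left key)   (estimate of P(right | left))
    """
    pair_counts = Counter(rows)
    left_counts = Counter(left for left, _ in rows)
    total = len(rows)
    return {
        pair: (count / total, count / left_counts[pair[0]])
        for pair, count in pair_counts.items()
    }

# Hypothetical customer/product rows pulled from a link.
rows = [("C1", "P1"), ("C1", "P1"), ("C1", "P2"), ("C2", "P1")]
scores = score_associations(rows)

for pair, (strength, confidence) in sorted(scores.items()):
    print(pair, round(strength, 3), round(confidence, 3))
```

A real mining tool would use richer measures and track them over time; the point here is only that each association ends up with a pair of numbers that later steps can compare against a threshold.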
So how is the Data Vault automorphic?
The Data Vault is connected (within itself) to itself via the Links and the vectors within the links. Each vector can be considered a component within the mathematical matrix of the automorphic functions. Then, the mathematics of "groups" and vector analysis can be applied to dynamically alter the matrix for a potentially different outcome.
Thus, new links can be constructed on the fly, tested, and then removed (if the human on the other end of the computer perceives no real value). They can likewise be constructed, and then old linkages can be removed, to produce an auto-morphing data structure, something akin to self-correcting. I will NOT go so far as to say it's actually learning, because it (the computer) will still not understand the CONTEXT to which it's applying the changes.
This type of system STILL requires guidance, training, and tweaking from the operators in order to achieve the desired outcomes and modifications to the model that make sense to the business, whether the business itself is commercial or government oriented.
However, this type of system can be applied (easily) directly to the Data Vault modeling constructs in order to achieve a self-changing data store, something that appears to "point out" different facts, or discover different unknown relationships, without understanding what it has. The understanding part is still up to the human.
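The construct, test, and remove cycle, with a human still in the loop, might be sketched in Python as follows. The function name, the single numeric score per link, the threshold, and the `approve` callback standing in for the operator are all assumptions for illustration, not a description of any existing tool.

```python
def morph_links(current, candidates, score, threshold, approve):
    """Return the next set of link names.

    current    : set of link names already in the model
    candidates : set of newly proposed link names
    score      : function name -> float, e.g. a mined usefulness value
    threshold  : minimum score for a link to be worth keeping
    approve    : function (action, name) -> bool, the operator's decision
    """
    # An existing link stays unless it scores poorly AND the operator agrees to drop it.
    kept = {
        name for name in current
        if score(name) >= threshold or not approve("remove", name)
    }
    # A candidate is added only if it scores well AND the operator agrees to add it.
    added = {
        name for name in candidates
        if name not in current and score(name) >= threshold and approve("add", name)
    }
    return kept | added

# Hypothetical links and scores.
scores = {
    "Customer_Product": 0.8,
    "Customer_Store": 0.1,
    "Product_Store": 0.7,
    "Customer_Supplier": 0.2,
}
current = {"Customer_Product", "Customer_Store"}
candidates = {"Product_Store", "Customer_Supplier"}

accept_all = lambda action, name: True
reject_all = lambda action, name: False

print(sorted(morph_links(current, candidates, scores.get, 0.5, accept_all)))
print(sorted(morph_links(current, candidates, scores.get, 0.5, reject_all)))
```

With an operator who rejects everything, the model does not change at all, which reflects the point above: the machine proposes, but the understanding and the final call stay with the human.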
Ok, so how does this benefit business?
Well, if we can spot relationship changes automatically (on a simplistic level), or mathematically figure out POSSIBLE infancies in our business, then we might be able to adapt our business based on the information being collected (or in some cases, adapt the source systems to collect information WE MISSED that might be vital to our operations). The data sets and the architecture of the data sets can tell us as much about our business as the processes and the business models we use.
You can find out more about the Data Vault model (for free) at: http://www.DanLinstedt.com
Hope this is interesting,
Posted December 23, 2007 6:32 AM