Hmmm, I've been thinking about this for quite a while. In the tangible world we have tags for physical goods - yesterday they were bar codes, today they're RFID tags and RTLS (real-time locating systems). Tomorrow, physical items may be tagged with DNA sequences, or electron signatures at the nano level.
Why then is it so hard to track intangible "data"? For applications we have the equivalent of software licenses, but for the actual data? Nothing.
In a world of hypothetical speculation, I would suppose that tagging every data element with an individual signature may be desirable. We could start with "units of work". Take this blog entry for example: tag the header with a signature, and tag the extended entry with a signature. The important thing is: the signature must travel with the unit of data - everywhere it goes. It becomes the unique ID for the data set, much as an RFID tag does for a physical item, and it should be tracked across the network.
What possibilities does this open up? In data warehousing we often tag our data with CRC32/CRC64 checksums and MD5 hashes (functions producing mostly unique values across a row of data). Why then can't these "keys" become universal, and shared around the world? These are standard functions that produce the same keys for the same data everywhere, as long as the data is fed to them as exactly the same bytes.
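
To make the row-tagging idea concrete, here is a minimal sketch in Python using only the standard library. The function name `row_keys`, the column separator, and the choice of UTF-8 are my assumptions for illustration, not a standard; the point is only that the same bytes yield the same keys on any machine.

```python
import hashlib
import zlib


def row_keys(row, sep="\x1f"):
    """Return (crc32, md5) keys for a row given as a sequence of column values.

    Hypothetical helper: the separator and the UTF-8 encoding are assumptions.
    Every party must agree on them, or the keys will not match.
    """
    # Join the columns with a separator unlikely to appear in the data, then
    # encode to bytes so the functions see identical input everywhere.
    payload = sep.join("" if v is None else str(v) for v in row).encode("utf-8")
    crc = zlib.crc32(payload) & 0xFFFFFFFF   # 32-bit unsigned checksum
    md5 = hashlib.md5(payload).hexdigest()   # 128-bit digest as 32 hex characters
    return crc, md5
```

Calling `row_keys(["1001", "Smith", "2005-09-30"])` on two different systems should return the same pair, provided both build the byte string the same way. That agreement on encoding and column order is the part a "universal" key would actually have to standardize.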
What would happen if we could actually tag every "word" entered into every application? I assume data traceability would increase dramatically - talk about a boost to search engines! Unfortunately the downside to these functions is that they produce fairly large keys (an MD5 key is 128 bits, often longer than the word it tags), and for the most part functions like MD5 cannot be "reversed" - so getting from a key back to its data means keeping a massive stored lookup table. Another issue is that CRC32 can produce duplicates (as can CRC64, although less frequently).
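
How often would duplicates show up? A rough estimate comes from the standard birthday approximation, sketched below. It assumes keys are spread uniformly over the key space, which is an idealization (a CRC is not designed to guarantee that), and the function name `collision_probability` is mine.

```python
import math


def collision_probability(n_items, key_bits):
    """Approximate chance that at least two of n_items share a key.

    Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**key_bits)),
    assuming keys are uniformly distributed (an idealization).
    """
    space = 2.0 ** key_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))
```

Under that assumption, 100,000 distinct items keyed with a 32-bit function already have better than even odds of at least one duplicate, while a 128-bit key at the same volume leaves the odds vanishingly small. So the trade-off is direct: shorter keys are cheaper to store but collide sooner.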
If someone were to produce a device that can "tag" data going over the internet and store the data in compressed format with the key-tag, then patterns would be easier to spot - data mining would see a huge boost, and it may be possible to aggregate what used to be seen as dissimilar data under the same keyed entry. These keys could also be shared across environments - maybe this is a call to EII vendors who are sharing data over SOA and web services?
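
As a toy illustration of such a tagging store (in memory only, nothing like a real network device), the sketch below keeps each unit of data compressed under its MD5 key-tag and records where it was seen. The class `TagStore` and its method names are hypothetical.

```python
import hashlib
import zlib


class TagStore:
    """Toy store: key-tag -> compressed data, plus the places it was seen."""

    def __init__(self):
        self._data = {}       # tag -> zlib-compressed bytes
        self._sightings = {}  # tag -> list of source labels, in arrival order

    def record(self, data: bytes, source: str) -> str:
        """Tag the data, store it once, and note where it was seen."""
        tag = hashlib.md5(data).hexdigest()
        if tag not in self._data:
            self._data[tag] = zlib.compress(data)
        self._sightings.setdefault(tag, []).append(source)
        return tag

    def lookup(self, tag: str) -> bytes:
        """Go from a key back to its data (raises KeyError if never stored)."""
        return zlib.decompress(self._data[tag])

    def sightings(self, tag: str):
        """Return the sources where this tag has been recorded."""
        return list(self._sightings.get(tag, []))
```

Recording the same bytes from two sources yields one stored entry with two sightings, which is the aggregation idea in miniature. The `lookup` method is also the stored key-to-data table that a one-way function like MD5 forces you to keep.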
For now it's a pipe dream, but it may step into reality with DNA computing or nanotechnology. Just think: Data Unique Universal Keys (DUUK) - a fascinating idea. From compliance and monitoring perspectives it opens a ton of doors.
Posted September 30, 2005 6:24 AM