Big Data Needs Context
by Bill Inmon
Originally published November 1, 2012
Context is funny. You don’t miss it or even think about it until you don’t have it. Then suddenly context becomes a really big issue.
* Not valid phone numbers
Looking at the simple file and knowing what the columns of data mean, we make several assumptions about context. For example, we assume that the phone number is current. We assume that the phone number is in the U.S. We assume that the area code designates some geographical locale. As far as the bank balance is concerned, we assume that it is the current bank balance. We assume that it is in U.S. dollars. We assume that the name in the file is associated with the phone number and the account. In short, because the data is structured, we make a lot of contextual assumptions about the data contained in the file.
Such is the nature of structured data. A big part of the “structure” of structured data is the context of the data found in the record or the file.
But when it comes to unstructured data, there is no context that can be conveniently associated with the data. For example, suppose you are reading an unstructured file. Suppose you encounter the number 7. Now what does “7” mean?
Is it the days in the week? The seven seas? The amount the Dow Jones went up this morning? The number of brothers and sisters you have? The truth is that the number “7” is naked. By itself it means nothing. In order for “7” to have meaning, it MUST have context. And with unstructured data there is no context.
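To make the contrast concrete, here is a minimal sketch in Python. The record layout, the field names and the sample values are all hypothetical, chosen only for illustration; they are not the contents of the sample file discussed above. The point is simply that a value in a structured record arrives with a field name that supplies its context, while the same value pulled out of raw text arrives with nothing.

```python
from dataclasses import dataclass, fields


# Hypothetical record layout: the field names and values are illustrative
# assumptions, not the contents of the article's sample file.
@dataclass
class AccountRecord:
    name: str
    phone_number: str   # by convention we assume: current, U.S. number
    balance_usd: float  # by convention we assume: current balance, U.S. dollars


def describe(value, context=None):
    """Return a readable interpretation of a value, given its context (if any)."""
    if context is None:
        return f"{value!r}: meaning unknown (no context)"
    return f"{value!r}: {context}"


record = AccountRecord(name="Jane Doe", phone_number="555-0100", balance_usd=7.0)

# Structured data: every value comes with a field name that supplies context.
structured = [describe(getattr(record, f.name), f.name) for f in fields(record)]

# Unstructured data: the same number found in raw text arrives "naked".
unstructured = describe(7)

for line in structured:
    print(line)
print(unstructured)
```

In the structured case the number is tied to `balance_usd`, so a reader (or a program) knows what it is. In the unstructured case nothing in the data says whether 7 is a balance, a count or a day of the week.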
So before you get all excited about “big data” and all the unstructured data you find there, you need to spend some time thinking about how you are going to apply context to your unstructured text. If you are seriously going to ponder that question, spend a few minutes on the larger question: What does context of raw text really mean? It turns out that there are many different kinds of context – some more useful than others. Some of the forms of context are:
The people who are going nuts over big data today seem either not to know or not to care that big data comes with this major issue of context.
Copyright 2004 — 2020. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC