Big Data, Text and Relational Database Management Systems
by Bill Inmon
Originally published May 3, 2012
Listen carefully to the “big data” vendors and this is what you hear: “Let’s get rid of relational.” It is like courtiers in the castle whispering, “The king must die.” What’s going on here?
Now there is no question that “Big Data DBMSs” can handle large volumes of data – look at Google if you don’t believe me. And it is true that “Big Data DBMSs” can handle text – that is essentially all they have. So big data is here to stay.
Let’s take a realistic look at what is happening here. It is true that “Big Data DBMSs” can handle huge amounts of data and can handle textual data. But – for a variety of reasons – the text found in “Big Data DBMSs” is not fit for analytical processing. “Big Data DBMSs” are optimized for collection and storage of lots of data. But when it comes time to analyze the text found in “Big Data DBMSs,” it is a different story entirely.
Why is data found in “Big Data DBMSs” not fit for analysis? There are several reasons this data is so difficult to analyze:
In order to use the text for decision-making purposes, you need to add context to the text. That is what is meant by disambiguation, and disambiguation is necessary for ANY text found anywhere, which certainly includes text found in big data. To simply jump into big data and start to use the raw text that is found there for analysis is to invite disaster.
Textual ETL disambiguates text. Textual ETL reads the text found in big data and refines it. Textual ETL then puts its output into a relational database. Once in a relational database, text can be openly used by the industry or corporate analyst.
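To make the flow concrete, here is a minimal sketch of the read, disambiguate, and load steps in Python. It is an illustration under stated assumptions, not how any textual ETL product actually works: the abbreviation "HA", the cue words in CONTEXT_RULES, the sample notes, and the refined_text table are all hypothetical, and a real tool would rely on much richer taxonomies than a five-word window.

```python
import re
import sqlite3

# Hypothetical context rules (an assumption for illustration): an ambiguous
# term mapped to (cue words, meaning) pairs, checked in order.
CONTEXT_RULES = {
    "ha": [
        ({"cardiac", "chest", "heart"}, "heart attack"),
        ({"migraine", "aspirin", "head"}, "headache"),
    ],
}

def disambiguate(doc_id, text):
    """Yield (doc_id, position, term, meaning) for each ambiguous term."""
    words = re.findall(r"[a-z]+", text.lower())
    for pos, word in enumerate(words):
        rules = CONTEXT_RULES.get(word)
        if rules is None:
            continue
        # Context = up to five words on either side of the term.
        window = set(words[max(0, pos - 5):pos + 6])
        meaning = "unresolved"
        for cues, candidate in rules:
            if cues & window:
                meaning = candidate
                break
        yield (doc_id, pos, word, meaning)

def load(conn, rows):
    """Write refined rows into a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refined_text ("
        "doc_id TEXT, position INTEGER, term TEXT, meaning TEXT)"
    )
    conn.executemany("INSERT INTO refined_text VALUES (?, ?, ?, ?)", rows)
    conn.commit()

# Made-up raw text standing in for documents pulled from a big data store.
notes = {
    "note-1": "Patient reports chest pain; possible HA, cardiac enzymes ordered.",
    "note-2": "Complains of HA since morning, took aspirin.",
}

conn = sqlite3.connect(":memory:")
for doc_id, text in notes.items():
    load(conn, disambiguate(doc_id, text))

result = conn.execute(
    "SELECT doc_id, meaning FROM refined_text ORDER BY doc_id"
).fetchall()
print(result)
```

The same raw token ends up with a different meaning in each row, and once the rows are in the table an analyst can query them with ordinary SQL.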
An interesting question is this: Does textual ETL have to place its results in a relational database? Of course not. Textual ETL can place its data anywhere. If the corporation wishes, textual ETL can place the refined data back into the big data database.
The reason textual ETL places the data in a relational database today is that this is where corporate business analysis takes place. But in the future, after textual ETL has disambiguated the raw data found in big data, there is no reason why textual ETL cannot place the refined text back into the big data environment, if that is what the corporation wishes. Textual ETL is agnostic.
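One way to picture this agnosticism is to treat the destination as a pluggable piece. The Python sketch below is only an illustration: the function names, the refined_text table, the sample row, and the use of an in-memory JSON-lines stream as a stand-in for a big data store are all assumptions, not part of any actual textual ETL product.

```python
import io
import json
import sqlite3

def relational_sink(conn):
    """Return a writer that loads refined rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS refined_text "
        "(doc_id TEXT, term TEXT, meaning TEXT)"
    )
    def write(rows):
        conn.executemany(
            "INSERT INTO refined_text VALUES (:doc_id, :term, :meaning)", rows
        )
        conn.commit()
    return write

def json_lines_sink(stream):
    """Return a writer that appends refined rows as JSON lines.

    The stream is a hypothetical stand-in for a file in a big data store.
    """
    def write(rows):
        for row in rows:
            stream.write(json.dumps(row, sort_keys=True) + "\n")
    return write

def deliver(refined_rows, sink):
    """Hand already-disambiguated rows to whichever sink was chosen."""
    sink(refined_rows)

# A made-up refined row, as if disambiguation had already run.
refined = [{"doc_id": "note-1", "term": "ha", "meaning": "heart attack"}]

conn = sqlite3.connect(":memory:")
big_data_file = io.StringIO()

deliver(refined, relational_sink(conn))           # today: relational database
deliver(refined, json_lines_sink(big_data_file))  # alternative: back to big data
```

The disambiguation step never needs to know where its output goes; only the sink passed to deliver changes.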
But in any case, before text can be used for analysis, the text must be disambiguated.
Copyright 2004 — 2020. Powell Media, LLC. All rights reserved.
BeyeNETWORK™ is a trademark of Powell Media, LLC