As you look at the Web 2.0 and Web 3.0 world, it is becoming clear that how you manage data on the internet will decide your success or failure in the new world order. Traditionally, when you build applications for the Web, you treat them as largely transactional in nature rather than data savvy. But looking at the way Flickr, Facebook, Twitter and other social networks have changed the game, we understand that we are no longer looking at a transactional silo, but rather need to react to a "long tail".
In the new world, you will need to be "big" but "nimble", large and flexible. Wow, that's a mouthful to keep saying. The reason for this thinking is that we need to look at both the structured and the unstructured data to understand customers and their needs, and react quickly to address those needs. When you talk of unstructured data in a Web world, you cannot afford to load megabytes and gigabytes of data, most of which is noise. You need to get the intelligence extracted and linked, but leave the content and the context outside. How do you accomplish this? There are some companies addressing this need, but we need a nimble and strong ETL engine to do this process.
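To make "extract the intelligence and link it, but leave the content outside" a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular Textual ETL product works: the function names, the tiny stopword list, and the use of simple word frequency as a stand-in for "intelligence" are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real one would be far larger.
STOPWORDS = {"the", "a", "an", "and", "or", "is", "was", "to", "of",
             "in", "for", "my", "it", "i"}

def extract_terms(doc_id, text, top_n=3):
    """Return (doc_id, term, count) rows for the most frequent non-stopword terms.

    Only these small rows are kept; the full text is not stored.
    The doc_id is the link back to wherever the content actually lives.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [(doc_id, term, n) for term, n in counts.most_common(top_n)]

def build_index(documents, top_n=3):
    """Run extraction over a {doc_id: text} mapping and collect the rows."""
    rows = []
    for doc_id, text in documents.items():
        rows.extend(extract_terms(doc_id, text, top_n))
    return rows

if __name__ == "__main__":
    docs = {
        "post-1": "The camera is great and the camera battery is great",
        "post-2": "Refund refund refund, I want a refund for my order",
    }
    for row in build_index(docs, top_n=2):
        print(row)
```

The rows that come out are small and structured, so they can sit next to transactional data, while the original text stays wherever it was and is reachable through the document id.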
This is where you need to look at Textual ETL and understand how to build the unstructured database. The traditional vendors are doing their part, but the end result leaves a lot to the imagination.
Textual ETL is complex. It deals with data that has minimal structure and is the complete opposite of transactional data. As we move from Web 2.0 to Web 3.0, we will encounter this hurdle, and I hope there will be tools that can handle this large-scale data management and integrate the transactional and textual data.
Posted February 12, 2010 2:22 PM