Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, unstructured data, DW, and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I serve on an academic advisory board for master's students in I.T. I can't wait to hear from you in the comments on my blog entries. Thank you, and all the best; Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor at The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI/CMMI Level 5, and is the inventor of the Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog at http://www.b-eye-network.com/blogs/linstedt/.

Recently I gave a keynote presentation at the Array Communications conference in the Netherlands.  There were 50 people in the room, and it was a good crowd.  There are quite a few shifts occurring in the EDW/BI market space, and there are good reasons why these changes are happening.  Game-changing technology is coming to the market in ETL, services, hardware, databases, and applications.

Below are some game-changing technologies that will help shape 2010 and the years to come:

  • Solid State Disk (SSD)
  • Hosted Cloud Computing
  • Column-Based Databases
  • Unstructured Information
  • Ontologies / Taxonomies
  • Mining Engines (in-database)
  • Broad-Based Web Services
  • Data Visualization
  • Flash & Silverlight Front-Ends
  • Analytic Functions in Database Engines
  • Business Rules Engines Melding with ETL and Web Services
  • DW Appliances

If you're not familiar with these technologies, it would be good to get a grasp on them as we move forward.  By the way, did you know that SSD is now available in Macintosh Powerbooks?  Solid state disk has no moving parts and is anywhere from 75x to 150x faster than standard disk devices.

Hosted cloud computing will bring the rise of EDW/BI as a service - there will be both open and closed clouds available.  The open clouds are things like Amazon EC2; closed clouds will be hosted in data centers over VPN, sponsored by vendors and the like - but both will bring on data warehouse and BI as-a-service solutions.  Mark my words - combining cloud-based services with DW appliances will make things seamless on the back end.  You will be able to mix and match technologies for appropriate purposes (column-based data stores mixed with standard or traditional database engines).  The whole game is changing!

Unstructured information is coming to the table.  Businesses are finally realizing the value hidden in their unstructured information, and the technology in this space is beginning to mature.  It's taken many years, but it's coming of age.  Unstructured information can be "hooked" dynamically to structured data sets through Data Vault Link structures.  It can be interpreted by applying ontologies and taxonomy breakdowns.  It can then be used to build dynamic or RAM-based cubes, lending itself to rapid data exploration.
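
To make the "hooking" idea concrete, here is a minimal sketch of a Data Vault-style link between a customer hub and a document hub.  This is my own illustration, not a prescribed implementation: the table names, columns, and the use of hashed surrogate keys are all assumptions for the example, and SQLite simply stands in for whatever engine you actually run.

```python
import sqlite3
import hashlib
from datetime import datetime, timezone

# Hypothetical, minimal Data Vault-style structures: a hub for customers,
# a hub for unstructured documents, and a link that "hooks" them together.
ddl = """
CREATE TABLE hub_customer (
    customer_hkey  TEXT PRIMARY KEY,   -- hashed surrogate of the business key
    customer_bk    TEXT NOT NULL,      -- business key (e.g. customer number)
    load_dts       TEXT NOT NULL,
    record_source  TEXT NOT NULL
);
CREATE TABLE hub_document (
    document_hkey  TEXT PRIMARY KEY,
    document_uri   TEXT NOT NULL,      -- pointer to the unstructured content
    load_dts       TEXT NOT NULL,
    record_source  TEXT NOT NULL
);
CREATE TABLE link_customer_document (
    link_hkey      TEXT PRIMARY KEY,
    customer_hkey  TEXT NOT NULL REFERENCES hub_customer(customer_hkey),
    document_hkey  TEXT NOT NULL REFERENCES hub_document(document_hkey),
    load_dts       TEXT NOT NULL,
    record_source  TEXT NOT NULL
);
"""

def hkey(*parts: str) -> str:
    """Deterministic surrogate key derived from the business key(s)."""
    return hashlib.md5("||".join(p.strip().upper() for p in parts).encode()).hexdigest()

conn = sqlite3.connect(":memory:")
conn.executescript(ddl)

now, src = datetime.now(timezone.utc).isoformat(), "DEMO"
conn.execute("INSERT INTO hub_customer VALUES (?,?,?,?)",
             (hkey("CUST-1001"), "CUST-1001", now, src))
conn.execute("INSERT INTO hub_document VALUES (?,?,?,?)",
             (hkey("s3://docs/contract-42.pdf"), "s3://docs/contract-42.pdf", now, src))
conn.execute("INSERT INTO link_customer_document VALUES (?,?,?,?,?)",
             (hkey("CUST-1001", "s3://docs/contract-42.pdf"),
              hkey("CUST-1001"), hkey("s3://docs/contract-42.pdf"), now, src))
conn.commit()
```

The point of the link is that the structured side (the customer) and the unstructured side (the document) stay in their own stores; only the association is modeled, so new document sources can be attached without reworking the structured model.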

In-database mining engines are making their way to the forefront.  This is going to continue: as database engines get stronger, they will absorb more and more transformation functionality.  Moving the transformations closer and closer to the data set - this is where the future is.  Real-time data mining will be available as just another transformation call.  Real-time mining will be a continually running neural network that takes transactional input and grades it on the fly.  This has been happening for years in the research community and in the robotics community - it's high time it moved into mainstream technology.
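
As a toy illustration of "grading transactional input on the fly," here is a sketch of an online scorer that updates itself as each transaction streams past.  The model, features, and synthetic label are all invented for the example; a real in-database mining engine would expose something like this as a callable function next to the data rather than as standalone Python.

```python
import math
import random

# A toy "continually running" scorer: an online logistic model that grades each
# incoming transaction the moment it arrives, then keeps learning from feedback.
class OnlineScorer:
    def __init__(self, n_features: int, lr: float = 0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def score(self, x: list) -> float:
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))          # probability-style grade

    def learn(self, x: list, label: int) -> None:
        err = self.score(x) - label                 # gradient of the log-loss
        self.b -= self.lr * err
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]

scorer = OnlineScorer(n_features=3)
for _ in range(1000):                               # stand-in for a transaction feed
    txn = [random.random(), random.random(), random.random()]
    label = int(txn[0] + txn[2] > 1.2)              # synthetic "risky" rule
    grade = scorer.score(txn)                       # grade on the fly...
    scorer.learn(txn, label)                        # ...then absorb the feedback
```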

Broad-based web services will be used, especially with the rise of cloud technologies.  What we will see is cloud-to-cloud vendors: technology that specializes in cross-cloud communication through web services and grid components.  These technologies will include compression, encryption, and secure communication channels at high speeds over "smart" routers, hubs, and switches.  In fact, it wouldn't surprise me to see routers and switches get smarter and include layers of web services within their hardware that allow cloud-to-cloud communication over secure protocols.
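
A minimal sketch of what one cloud-to-cloud call might look like at the application layer: a compressed payload pushed over an encrypted HTTPS channel.  The endpoint, payload shape, and bearer-token header are hypothetical placeholders, not a real service.

```python
import gzip
import json
import urllib.request

# One cloud pushing a compressed, HTTPS-protected payload to another cloud.
payload = json.dumps({"feed": "daily_sales",
                      "rows": [{"sku": "A-1", "qty": 3}]}).encode("utf-8")
body = gzip.compress(payload)                       # compress on the wire

req = urllib.request.Request(
    url="https://other-cloud.example.com/ingest",   # hypothetical endpoint
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",                 # tell the receiver it's compressed
        "Authorization": "Bearer <token>",          # placeholder credential
    },
)
# HTTPS provides the encrypted channel; uncomment to actually send:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.status)
```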

On the BI side of the house, we will see a continued rise in the use of Adobe Flash and Microsoft Silverlight platforms for data visualization.  We will move into the realm of multi-touch, fly-through, dynamic graphs and 3D landscapes for our data.  We will see an evolution of Second Life, or technologies similar to it, where virtual worlds will offer the BI meeting spaces of tomorrow.  But these graphs won't be static.  You'll be able to alter them, change them, play what-if games with them, add metadata on the fly, and write the new data back to dynamic cubes.
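
Here is a toy sketch of the "write it back to a dynamic cube" idea: a small RAM-based cube that a front end could re-render after a what-if adjustment is written into it.  The dimensions, measures, and data are invented for illustration.

```python
from collections import defaultdict

# A toy in-memory (RAM-based) "cube": facts aggregated by (region, product),
# with a what-if adjustment written back on the fly.
facts = [
    {"region": "EU", "product": "Widget", "revenue": 120.0},
    {"region": "EU", "product": "Gadget", "revenue": 80.0},
    {"region": "US", "product": "Widget", "revenue": 200.0},
]

cube = defaultdict(float)
for f in facts:
    cube[(f["region"], f["product"])] += f["revenue"]   # build the cube in RAM

def what_if(cube, region: str, uplift: float) -> None:
    """Write a what-if scenario back into the live cube."""
    for (r, p), value in list(cube.items()):
        if r == region:
            cube[(r, p)] = value * (1.0 + uplift)

what_if(cube, "EU", 0.10)          # play a 10% uplift scenario for EU
print(cube[("EU", "Widget")])      # the EU Widget cell now reflects the uplift
```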

As database engines, appliances, and vendors sort themselves out, more and more transformation logic will be included in these engines - to the point where some of the "transformation" notions will be engineered into the firmware or hardware level.  Continuing to move the transformations closer and closer to the storage will be the end goal.  It won't matter if the storage is old-world disk (standard disk) or SSD.  In fact, I would argue that the database manufacturers will work a deal with the SSD manufacturers and have custom "smart storage" built to include transformation functionality at the data store level.
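
As a sketch of what "transformation at the data store level" already looks like in ELT terms, here is the pushdown pattern: the transformation is expressed as SQL and executed inside the engine, next to the data, rather than in an external ETL server.  Table names and rules are hypothetical, and SQLite simply stands in for a real appliance or engine.

```python
import sqlite3

# ELT-style "pushdown": the transformation runs inside the database engine,
# next to the data, instead of in an external ETL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_orders  (order_id INTEGER, amount_usd REAL, country TEXT);
CREATE TABLE fact_orders (order_id INTEGER, amount_usd REAL, region TEXT);
INSERT INTO stg_orders VALUES (1, 19.99, 'NL'), (2, 250.00, 'US');
""")

# The cleansing / derivation logic is expressed as SQL and executed where the data lives.
conn.execute("""
INSERT INTO fact_orders (order_id, amount_usd, region)
SELECT order_id,
       ROUND(amount_usd, 2),
       CASE WHEN country IN ('NL', 'DE', 'FR') THEN 'EMEA' ELSE 'AMER' END
FROM stg_orders
WHERE amount_usd > 0
""")
conn.commit()
print(conn.execute("SELECT * FROM fact_orders").fetchall())
```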

ETL engines will refocus their efforts on managing and manipulating the metadata around the transformation logic.  Companies like Teilhard (who just sued everyone in the landscape) will have to work deals with database vendors to provide transformation functionality at the hardware level.  "Traditional" ETL vendors will no longer exist - they will change to provide service-based, cloud-based technology - again, with a heavy focus on metadata management, process design, and controls.

Did I miss something?  Comment on what you see!

Cheers,
Dan Linstedt

http://www.DataVaultInstitute.com


Posted November 12, 2009 3:15 AM

1 Comment

I guess MapReduce and the NoSQL folks? I suppose those could fit in the unstructured and analytics buckets, but I'm not sure.
Where would you see those?
Thanks
J.
