Blog: William McKnight

William McKnight

Hello and welcome to my blog!

I will periodically be sharing my thoughts and observations on information management here in the blog. I am passionate about the effective creation, management and distribution of information for the benefit of company goals, and I'm thrilled to be a part of my clients' growth plans and connect what the industry provides to those goals. I have played many roles, but the perspective I come from is benefit to the end client. I hope the entries can be of some modest benefit to that goal. Please share your thoughts and input to the topics.

About the author

William is the president of McKnight Consulting Group, a firm focused on delivering business value and solving business challenges utilizing proven, streamlined approaches in data warehousing, master data management and business intelligence, all with a focus on data quality and scalable architectures. William functions as strategist, information architect and program manager for complex, high-volume, full life-cycle implementations worldwide. William is a Southwest Entrepreneur of the Year finalist, a frequent best-practices judge, has authored hundreds of articles and white papers, and given hundreds of international keynotes and public seminars. His team's implementations from both IT and consultant positions have won Best Practices awards. He is a former IT Vice President of a Fortune company, a former software engineer, and holds an MBA. William is author of the book 90 Days to Success in Consulting. Contact William at wmcknight@mcknightcg.com.

Editor's Note: More articles and resources are available in William's BeyeNETWORK Expert Channel. Be sure to visit today!

October 2011 Archives

This week, at the PASS Summit, Microsoft unveiled its inevitable "big data" strategy.  The world of big data is the new uncharted land in information management and the big vendors are jumping on board.  "New economy" giants like eBay, Twitter, Facebook and Google are the early adopters - and many even built the big data tools that everything is based on.

It would be too easy to dismiss big data as a Valley-only phenomenon, and you shouldn't.  Microsoft's information management tools serve perhaps the widest-ranging set of clients anywhere.  Either they've made their move to "keep up with the Joneses" (Oracle had some big data announcements last week) or there must be some Global 2000 budgets in it.  The industry will not thrive without some of the latter, and that's what I'm betting on.

There's vast utility in unstructured and machine-generated data (somehow tweets count in this category).  There are also many reasons, starting with cost, why a company that finds some use for this data will choose a big data tool like Hadoop rather than a relational database management system to store it.  Yes, and they will even live with the tradeoffs: lack of ACID compliance, lack of transactions, lack of SQL (although this is eroding by the day), lack of schema sharing, the need to assemble the pieces yourself (although this is also eroding) and node failures being a way of life.  Indeed, the "secret sauce" of Hadoop is the distribution of data and recovery from node failure - RAID-like, but less costly.
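
To make the "distribution of data and recovery from node failure" idea concrete, here is a toy Python sketch. It is not Hadoop's actual placement policy: the function names, the node names and the round-robin placement are all assumptions made for illustration. The only detail taken from Hadoop itself is the default of three replicas per block.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign every block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: {nodes[(block + i) % len(nodes)] for i in range(replication)}
        for block in range(num_blocks)
    }


def recover_from_failure(placement, failed_node, live_nodes, replication=3):
    """Forget the failed node, then copy under-replicated blocks elsewhere."""
    for block, holders in placement.items():
        holders.discard(failed_node)
        if not holders:
            raise RuntimeError(f"block {block} lost every replica")
        # Any live node that does not already hold the block can take a copy.
        spares = [n for n in live_nodes if n not in holders]
        while len(holders) < replication and spares:
            holders.add(spares.pop(0))
    return placement


# Hypothetical five-node cluster holding ten blocks; node2 then dies.
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_blocks(num_blocks=10, nodes=nodes)
survivors = [n for n in nodes if n != "node2"]
recover_from_failure(placement, "node2", survivors)
```

Because every block already lives on several cheap machines, losing one of them costs a re-copy rather than a restore from backup, which is the sense in which the approach is RAID-like.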

 

At this point, it's better to play with this "hippy-developed" Hadoop (as one skeptic called it) than to ignore it, and playing is what Microsoft has chosen to do.  Microsoft is working to deploy Hadoop on Windows and cloud-based Azure.  This could really work in Microsoft's big data land grab.  It's a hedge against going too hard-core into the open-source world.  It's comfortable Windows combined with Hadoop.  For the many, many fence-sitters out there, this is good timing.  Many want to trace movements of physical objects, web clicks and other Web 2.0 activity.  They want to do this without sacrificing the enterprise standards they are used to with products like Windows and its management toolset.

Development will be done with Hortonworks, the Yahoo spin-off, and will be contributed to Apache.  This announcement follows the development of the Sqoop-compatible Microsoft SQL Server Connector for Apache Hadoop.

A simultaneous Microsoft big data announcement was an ODBC Driver to Hive.  Hive was developed by Facebook to make data access to Hadoop easier than writing MapReduce.  Every day, Facebook runs 150,000 jobs; only 500 are MapReduce and the rest are HiveQL.  HiveQL is SQL-like and, in some ways, actually exceeds SQL capabilities with complex types like associative arrays (maps), lists (arrays) and structs.  And soon, it will have an ODBC driver from Microsoft.
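
For readers who have not seen HiveQL's complex types, the sketch below pairs a small table definition with the Python structures that correspond to each column. The table, column names and sample row are hypothetical, and nothing here connects to Hive; the HiveQL is held in plain strings only so it can sit next to its in-memory equivalent.

```python
from typing import NamedTuple


class Address(NamedTuple):
    """Python stand-in for a Hive STRUCT<city:STRING, zip_code:STRING>."""
    city: str
    zip_code: str


# HiveQL kept as text for reference only (hypothetical table).
CREATE_TABLE = """
CREATE TABLE page_views (
  user_id BIGINT,
  tags    ARRAY<STRING>,
  props   MAP<STRING, STRING>,
  addr    STRUCT<city:STRING, zip_code:STRING>
)
"""
QUERY = "SELECT tags[0], props['browser'], addr.city FROM page_views"

# One row expressed with Python's list, dict and named tuple.
row = {
    "user_id": 42,
    "tags": ["sports", "news"],
    "props": {"browser": "firefox", "os": "linux"},
    "addr": Address(city="Dallas", zip_code="75001"),
}


def project(row):
    """Mimic the QUERY projection on a single in-memory row."""
    return row["tags"][0], row["props"]["browser"], row["addr"].city
```

The point of the comparison is that a single Hive column can hold a whole list, map or nested record, where a strictly relational design would normally split these into separate tables.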

 

The announcements didn't come with anything demonstrable, so apparently there's still some work to do before we have substantially more information.  Still, this is definitely worth watching as a milestone in the big data journey.


Posted October 15, 2011 2:12 PM

