

Blog: Richard Hackathorn

Richard Hackathorn

Welcome to my blog stream. I am focusing on the business value of low latency data, real-time business intelligence (BI), data warehouse (DW) appliances, use of virtual world technology, ethics of business intelligence and globalization of business intelligence. However, my blog entries may range widely depending on current industry events and personal life changes. So, readers beware!

Please comment on my blogs and share your opinions with the BI/DW community.

About the author

Dr. Richard Hackathorn is founder and president of Bolder Technology, Inc. He has more than thirty years of experience in the information technology industry as a well-known industry analyst, technology innovator and international educator. He has pioneered many innovations in database management, decision support, client-server computing, database connectivity, associative link analysis, data warehousing, and web farming. Focus areas are: business value of timely data, real-time business intelligence (BI), data warehouse appliances, ethics of business intelligence and globalization of BI.

Richard has published numerous articles in trade and academic publications, presented regularly at leading industry conferences and conducted professional seminars in eighteen countries. He writes regularly for BeyeNETWORK.com and has a channel for his blog, articles and research studies. He has been a member of the IBM Gold Consultants since its inception, and is also a member of the Boulder BI Brain Trust and the Independent Analyst Platform.

Dr. Hackathorn has written three professional texts, entitled Enterprise Database Connectivity, Using the Data Warehouse (with William H. Inmon), and Web Farming for the Data Warehouse.

Editor's Note: More articles and resources are available in Richard's BeyeNETWORK Expert Channel. Be sure to visit today!

Start of the second day and going strong! The first session was IBM Information Management Research Topics by Hamid Pirahesh, IBM Fellow and Senior Manager of Exploratory Database at IBM Research Division. Hamid has been presenting to the Gold Consultants for many years. We look forward to his visions! He focused today on the role of cloud computing and its impact on future IT. It is all about a fundamental transformation of IT. IT organizations that do not pay attention will be at a disadvantage.

New players are constantly emerging. And this discussion is not just about IBM. Most of the discussion is in the context of the open community. The issues are about cloud computing and collaboration workloads. Massively parallel analytic computation will be augmenting our data warehouses. In the post-Dot-Com era, technology is actually disrupting industry. There is an order of magnitude more data coming into the DW. As we move to petabytes of data, you also need much more computing power, with thousands of nodes. A substantial amount of VC money is flowing into this area because of its potential for disruption.

There are some massive analytic applications, such as. . . Facebook gets 500 GB per day of new web log data from 3000+ web servers, driven by 50M+ users along with categories (such as the Starbucks group on Facebook). Yahoo runs a cluster of 2000+ servers for building search indices, plus a similar cluster for research experiments. Google is similar, using a MapReduce parallel dataflow. Common characteristics are: flexible schema, complex code running in parallel, flexible resource scheduling, and failures as the rule (not the exception).

The impacts on enterprises are: billions of mobile devices, more people on social networking, location-based services, and deeper analytics about customer behavior. This will require platforms with massive parallelism, complex and changing analytic algorithms, and up to 10K nodes.

There are several major cloud computing projects underway from IBM, Microsoft, HP, CMU, U of Wisconsin, European universities and so on. These projects are using the MapReduce functional paradigm, in contrast to the SQL paradigm. Jaql serves as a query language for JSON. New offerings are emerging, like Amazon EC2/S3/EBS, Mosso Rackspace, IBM BlueCloud, and more. Terminology is evolving, with the term du jour being Data-as-a-Service (DaaS).

Hamid showed a picture of the Google data center on the Columbia River, sited there for cooling and power generation. Similarly, enterprises require:

- Secure tunnel between the corporate data center and the cloud services, which must be secure, permanent, and highly available

- SecureSphere within the cloud

- Security of cloud operations (who in the cloud can see my data?)

IBM is partnering with Google to provide a cloud to universities, as the Google/IBM Initiative. A dozen universities are using 1100+ servers, with 700+ students studying advanced computing architectures.

Deep dive into the MapReduce computation model. . . What is new here? In a gross summary, MapReduce is a sequence of partition, sort and grouping steps with high parallelism. Hamid made the point that this is not new, since it is similar to the DB2 shared-nothing architecture. So IBM has been doing MapReduce for a long time.
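To make that gross summary concrete, here is a minimal single-process sketch of the partition, sort and group sequence, using a word count as the workload. It is my own illustration, not anything Hamid showed: the function names (map_phase, partition, reduce_partition, map_reduce) and the word-count example are assumptions, and a real system would run each partition on a separate node in parallel rather than in a loop.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def map_phase(record):
    """Emit (key, value) pairs; here, one (word, 1) pair per word."""
    for word in record.lower().split():
        yield word, 1


def partition(pairs, num_partitions):
    """Route each pair to a bucket by key hash, so equal keys land together.

    This stands in for shipping rows to nodes in a shared-nothing cluster.
    """
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[hash(key) % num_partitions].append((key, value))
    return buckets


def reduce_partition(pairs):
    """Sort by key, group equal keys, and sum the values of each group."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def map_reduce(records, num_partitions=4):
    """Run map, partition, then sort/group/reduce per partition (sequentially here)."""
    pairs = (pair for record in records for pair in map_phase(record))
    totals = {}
    for bucket in partition(pairs, num_partitions).values():
        # Buckets hold disjoint key sets, so merging the results is safe.
        totals.update(reduce_partition(bucket))
    return totals
```

Calling map_reduce(["the cloud the data", "data cloud cloud"]) counts each word across both records. Because every key hashes to exactly one partition, each partition can be sorted and reduced independently, which is where the parallelism comes from.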

Hamid shared a key slide that showed a horizontal dimension for scale (from low with 20 nodes to ultra high with 10K nodes). The vertical dimension is function (low to very high). The sweet spot is the upper right quadrant, high on both dimensions. He shared his current project, which is confidential. And then, he pointed out other related projects, such as Facebook (with Hadoop applications and Hive for semi-structured data), Couch Database, ObjectGrid and so on. In summary, we need to move to the upper right quadrant of this diagram.

My take on this. . . Hamid always finishes with a flurry, regardless of how much time he is allocated. Just do not blink during his last 5 minutes. The fast fly-through of his visions is always informative, intense and fun!

[Blog stream from IBM IOD/Gold October 2008 is here]


Posted October 26, 2008 8:50 AM