For those not familiar with the concept or name, a Rube Goldberg machine, contraption, invention, device, or apparatus is a deliberately over-engineered machine that performs a very simple task in a very complex fashion, usually including a chain reaction (this is the definition used by Wikipedia). Examples of very simple tasks are pouring beer into a glass, opening a door, or switching on a TV. Rube Goldberg was an American cartoonist who was best known for drawing weird machines. Here you can find a photo showing an example of a Rube Goldberg machine. On YouTube you can find numerous videos showing such machines at work. One you have to see is the one developed by a young kid called Audri.
Why a discussion of these weird and often useless machines? Quite recently, I received an email from a customer with the following remark: "I have been thinking about the complexities of physical integration of our systems [by physical integration he means using classic ETL and duplicating data in several databases]. I wish I had Audri's YouTube video when I was trying to urge my team to consider data virtualization. After seeing that little boy, it feels as if developing and testing a system based on physical integration, is like trying to develop and test a Rube Goldberg machine."
He continues with "Knowing what I know now [after studying data virtualization more seriously], if data virtualization is comparable to using a remote control to turn a TV on and off, then physical integration is comparable to developing and using a Rube Goldberg machine to turn a TV on and off."
Evidently, this is an exaggeration, because there are still various situations in which you have to or want to deploy a form of physical integration. But there is some truth in it. Software and hardware are so much more powerful today than they were ten years ago. In fact, there is so much more "power" available that if organizations had to design their current BI systems from scratch, they would probably come up with much simpler architectures, ones in which agility would be a fundamental design factor. Data virtualization would be one of the technologies that would clearly help to develop more agile BI systems.
So, years ago, when we designed the architectures of our BI systems, they were not considered Rube Goldberg machines. They were necessities; there was no other choice. But today there is. So, if we look at these architectures today, they do resemble Rube Goldberg machines. They are like machines in which the data values roll down a spiral, are thrown from one database to another, are changed occasionally, fall off some track sporadically, and sometimes even float a few inches, before they arrive in a report.
I have decided to use Audri's film from now on to explain what the differences are between developing BI systems with and without data virtualization.
Note: If you have questions related to data virtualization, send them in. I am more than happy to answer them.
Posted November 19, 2012 11:15 AM



