Blog: Dan E. Linstedt

Dan Linstedt

Bill Inmon has given me this wonderful opportunity to blog on his behalf. I like to cover everything from DW2.0 to integration to data modeling, including ETL/ELT, SOA, Master Data Management, Unstructured Data, DW and BI. Currently I am working on ways to create dynamic data warehouses, push-button architectures, and automated generation of common data models. You can find me at Denver University, where I participate on an academic advisory board for Masters Students in I.T. I can't wait to hear from you in the comments of my blog entries. Thank you, and all the best, Dan Linstedt http://www.COBICC.com, danL@danLinstedt.com

About the author >

Cofounder of Genesee Academy, RapidACE, and BetterDataModel.com, Daniel Linstedt is an internationally known expert in data warehousing, business intelligence, analytics, very large data warehousing (VLDW), OLTP, and performance and tuning. He has been the lead technical architect on enterprise-wide data warehouse projects and refinements for many Fortune 500 companies. Linstedt is an instructor of The Data Warehousing Institute and a featured speaker at industry events. He is a Certified DW2.0 Architect. He has worked with companies including IBM, Informatica, Ipedo, X-Aware, Netezza, Microsoft, Oracle, Silver Creek Systems, and Teradata. He is trained in SEI / CMMi Level 5, and is the inventor of The Matrix Methodology and the Data Vault data modeling architecture. He has built expert training courses, trained hundreds of industry professionals, and is the voice of Bill Inmon's blog on http://www.b-eye-network.com/blogs/linstedt/.

November 2007 Archives

I recently attended the Oracle Open-World show, and it was well put together. The vendor floor was extremely busy. I learned a little bit about some up-and-coming technology called FABRIC SYSTEMS. There are a few vendors out there (it seems) trying to make a go of it. As a result of learning about fabric based systems, I'm going on a quest to learn more. What I learn, I will share here - because I believe that these types of systems will change the landscape as well.
Fabric based systems seem to have MPP in mind, but allow the individual SMP units to act together. They appear to be scalable and high powered.

Let's start with defining a FABRIC of systems:

Gt: From where does the term "fabric" come? Is the metaphor based on how -- like the threads in a real fabric -- the individual machines appear and act as one?
HAAR: That's an important part of it: that a large number of machines look and act like one ... a unified fabric of processing power so to speak. It's also a metaphor for the business and technical agility -- flexibility -- that the technology unlocks for customers. In those ways, an application fabric is kind of the antithesis to "big iron."
http://www.gridtoday.com/grid/738014.html

I've been watching, learning, and reasoning about big systems for quite a number of years. I've seen the notions of Grid, Grid computing, and application power - but up until now (when I learned about fabric) these concepts didn't seem to cross over into the hardware world where MPP and SMP live. You've heard me espouse the virtues and values of high-powered (BIG-IRON) SMP machines, and you've heard me discuss the values of MPP computing systems, at least when it comes to database power, scalability and flexibility.

Oracle (a couple of years back) and other vendors over the years have tried to bring Grid Computing to center stage. Sometimes they've succeeded (in very specific configurations), but most times they have not succeeded in constructing true GRID computing power.

Well, last week I learned about a company called Liquid Computing (http://www.liquidcomputing.com/home/home.php), the vendor behind LiquidIQ. They define themselves this way:

Liquid Computing is the only vendor offering a converged computing, networking and broadband system to scalable computing users. LiquidIQ™ delivers sustained performance over scale, is highly reliable, and it is simple – simple to implement, to operate, to scale and to reconfigure. As a result, the lifecycle economics of LiquidIQ are extremely attractive.

What's interesting about LiquidIQ is the fact that it is nothing but a massive computing engine with no local storage... You can hot-swap blade computers with multi-core processors, the network components are all high-speed, custom-built interconnects, and the backbones and motherboards all have high-speed data transfer throughput. It truly gives new meaning to MPP looking and acting like SMP on big-iron, without the SINGLE back-plane requirements.

On a business note, their technology saves huge amounts of energy (efficient design), increases uptime, and can put as many as 4,000 cores to work in a single rack... That's impressive. I didn't have time to ask them how much RAM could be placed in the box, but their base entry point is around $120,000. If you need all this computing power for large scale database work... then this would be an answer for you. More on the storage, benefits, and other items shortly...

There are a few other players in the market. One is Fabric7 - you can see an article on IT Edge, or you can visit their web-site: http://www.fabric7.com/products_q160.php

Their definition of Fabric Computing is as follows:

Fabric servers feature a highly scalable and hardware partitionable symmetric multiprocessing complex that can be configured into a flexible fabric of large or small servers. The fabric offers a pool of compute and network resources that can be dynamically provisioned into servers to deploy applications and services as business demands. The fabric can extend across multiple servers within a datacenter or across geographically dispersed sites, with a single point of control.

System Fabric Works:

Founded in 2002, System Fabric Works (SFW) develops and integrates server, storage and network software solutions that provide platforms for high-performance distributed applications throughout the enterprise. SFW uses the phrase "Fabric Computing" to describe the leverage that Remote Direct Memory Access (RDMA) architectures provide to computing, network and storage systems to deliver higher performance, lower latency and less energy consumption. SFW currently supports InfiniBand (IB) and Low Latency Ethernet/iWARP RDMA fabrics.

http://www.systemfabricworks.com/

Yeah, so what? Why should I care?
I think this is important technology. The business values and benefits accrue to SaaS providers, but also to massive world-wide consolidation efforts, and of course to the Business Intelligence community. I think the benefits include the following:
* Lower cost, lower power consumption with more computing resources
* Incredible up-time
* Massive virtualization of single platform hardware (what I'm saying here is that the RDBMS engine and the ETL engine and the BI Reporting engine can live on the same hardware, and share dynamically allocated resources, but live in separate logical spaces).
* All of a sudden, database engines like SQL Server 2005 and MySQL become serious players in BIG DATA SPACES... Why? Because scalability is hardware based, interconnectivity and network problems have been solved by the engineers, and now - hundreds of terabytes is no longer a problem on a single machine!!
* Businesses can save money by not buying "proprietary hardware/proprietary operating systems"

If we thought that MPP vendors had it good, they did - for a very long time. Now, the fabric computing space is going to make software (like Microsoft Cubes for instance) incredibly powerful and scalable - pervasive to the end-user... Appliance vendors will need to watch out for this space as well, and possibly partner with these Fabric vendors. The landscape is already changing.

Now, about the storage... Most of these fabric devices don't have on-board storage. That's OK. I spoke to Brocade recently about their efforts to provide FABRIC SWITCHED STORAGE with dedicated backbones and super-high-speed access. They've got FABRIC computing built into their latest storage SAN, with dedicated accessibility, load-balancing switches, and so on... I'm sure that IBM, EMC, Fujitsu, and Hitachi also have similar offerings, and I'd like to hear about them. All I'm saying is that high-speed storage won't be the bottleneck here.

This could mean GRID COMPUTING (on a hardware level) has finally arrived, and been packaged in a form that is usable by the masses. Fabric based computing is a huge space to watch, and as I learn more, I'll be sharing more.

If you have thoughts or experience in this area, I'd love to hear from you.

Thanks,
Dan L
DanL@DanLinstedt.com


Posted November 19, 2007 5:48 AM

There's a lot of stir in the marketplace these days around a term called Operational Business Intelligence. Over the years there's also been a lot of interest in, and usage of, the term Active Data Warehousing, which according to Bill Inmon is really "real-time data warehousing" or "operational data warehousing." What is this thing? Has it been defined before? What kinds of systems are we building these days and what is the next stage going forward? In this entry we'll look at the terms Operational Data Warehousing, Operational Business Intelligence, and Active Data Warehousing...

For the past several years there's been a growing trend in learning from the history of the strategic data warehouse, placing that knowledge nugget into a running artificially intelligent engine (live neural net) and then sending an operational transaction through the mix. Why? For one example: fraud detection.

The AI engine needs to learn from the patterns stored in the strategic elements of the warehouse, then it needs to apply (in right-time) what it knows to the incoming transaction to determine if it's fake, fraudulent, or simply an undesired action. All the while it also needs to build up an "operational view" of a stream of transactions arriving within a specific time-frame, so that it can uncover recurring fraud within a short period of time.

What is all this?
Some folks have called it Operational Business Intelligence (see Claudia Imhoff's article). In fact, this is the right name for it... It's the ability to perform action on data that makes sense to the business _when_ it makes sense to perform that action... Huh?

Right-time data arrival, coupled with strategic learning, coupled with neural nets to put the current transaction into "context", and then produce an actionable result for the business to operate from. Operational Business Intelligence.
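The flow described above can be sketched roughly in code. This is only a minimal illustration, not a real fraud engine: a simple per-account average stands in for the neural net, and every name here (Transaction, HistoricalProfile, OperationalView, score) and every threshold is a hypothetical choice made for the example. It also assumes transactions arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transaction:
    account: str
    amount: float
    timestamp: float  # seconds since some arbitrary epoch


class HistoricalProfile:
    """Stand-in for what the engine learned from warehouse history (hypothetical)."""

    def __init__(self, history):
        # history: {account: [past amounts]}; a plain average replaces the neural net.
        self.mean = {acct: sum(a) / len(a) for acct, a in history.items()}

    def is_unusual(self, txn, factor=5.0):
        # Accounts with no history have no baseline, so treat them as unusual.
        baseline = self.mean.get(txn.account)
        return baseline is None or txn.amount > factor * baseline


class OperationalView:
    """Sliding window over recent transactions (the short-term operational view)."""

    def __init__(self, window_seconds=60.0, max_per_window=3):
        self.window_seconds = window_seconds
        self.max_per_window = max_per_window
        self.recent = deque()

    def add_and_count(self, txn):
        self.recent.append(txn)
        # Drop transactions that have aged out of the window.
        while txn.timestamp - self.recent[0].timestamp > self.window_seconds:
            self.recent.popleft()
        return sum(1 for t in self.recent if t.account == txn.account)


def score(txn, profile, view):
    """Return 'flag' or 'ok' for one incoming transaction."""
    # Always update the window first, then combine both signals.
    burst = view.add_and_count(txn) > view.max_per_window
    return "flag" if burst or profile.is_unusual(txn) else "ok"
```

The strategic side (HistoricalProfile) is built once from history, while the operational side (OperationalView) changes with every arriving transaction; score() is the point where the two meet to produce something the business can act on.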

OK, the ability to have Operational BI almost demands that we build an Operational Data Warehouse, right?
Wrong. It's not always the right choice. However, once a data warehouse architecture has been "activated", the next step or next level is turning it on to an Operational level. It's nothing more, nothing less than the next step or the next evolution of an Active Data Warehouse. (You can find lots of definitions of Active Data Warehouse in the market, so I didn't feel the need to rehash them.)

Operational Data Warehousing??? impossible you say...
Not really. In fact, it takes on the notions of Operational BI and actually enables Operational BI to engage directly with the Data Warehousing levels. But it requires some sophisticated machinery, architectures, and standards. The notions of the Data Warehouse becoming a System Of Record for Operational Data are a reality with Operational DW.

In fact, some of the requirements of an Operational DW are as follows:
* It must already be an Active or Right-Time Data Warehouse
* It must contain data captured by the Data Warehouse as "source data" that is never altered, and adheres to the DW2.0 principles.
* It must have integrated, time-variant (strategic), non-volatile information
* It must have an operational business purpose that it addresses going forward
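As a rough illustration, the requirements above can be treated as a checklist. The sketch below is hypothetical: the class, field, and function names are invented for this example, and each flag simply mirrors one bullet in the list.

```python
from dataclasses import dataclass, fields


@dataclass
class WarehouseProfile:
    # One flag per requirement in the list above (names are hypothetical).
    is_active_or_right_time: bool
    keeps_unaltered_source_data: bool
    is_integrated_time_variant_non_volatile: bool
    has_operational_business_purpose: bool


def unmet_requirements(profile):
    """Return the names of the requirements the warehouse does not yet satisfy."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def qualifies_as_operational_dw(profile):
    """A warehouse qualifies only when every requirement is met."""
    return not unmet_requirements(profile)
```

For example, a warehouse that is already active but has no operational business purpose would come back from unmet_requirements() with a single entry, has_operational_business_purpose.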

What happens to the system housing an ODW?
* It is now truly 24x7x365
* It is a system-of-record
* It might be a master-data-system
* It answers business questions directly
* It captures operational data from web interfaces and web services that doesn't exist anywhere else.
* It maintains partial levels of business rules for capturing the information just discussed.
* It maintains master metadata across the enterprise
* It maintains security to all the information

And so on... All the principles of "operational systems" are added to all the principles of an "active data warehouse", which are added to the requirements from Operational Business Intelligence - to make up an Operational Data Warehouse.

We'll continue discussing this topic moving forward. As always, this is bleeding edge... or maybe not. Either way, I'd love to hear your opinions - help me formulate the definitions in the marketplace, or simply tell me that "no such thing exists..."

Thanks,
Dan L
DanL@GeneseeAcademy.com


Posted November 7, 2007 4:47 PM