
Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog for constructive communication and exchanges of ideas in the business intelligence community on topics from data warehousing to SOA to governance, and everything under the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish Krishnan is a worldwide-recognized expert in the strategy, architecture, and implementation of high-performance data warehousing solutions and big data. He is a visionary data warehouse thought leader and is ranked as one of the top data warehouse consultants in the world. As an independent analyst, Krish regularly speaks at leading industry conferences and user groups. He has written prolifically in trade publications and eBooks, contributing over 150 articles, viewpoints, and case studies on big data, business intelligence, data warehousing, data warehouse appliances, and high-performance architectures. He co-authored Building the Unstructured Data Warehouse with Bill Inmon in 2011, and Morgan Kaufmann will publish his first independent writing project, Data Warehousing in the Age of Big Data, in August 2013.

With over 21 years of professional experience, Krish has solved complex solution architecture problems for global Fortune 1000 clients, and has designed and tuned some of the world’s largest data warehouses and business intelligence platforms. He is currently promoting the next generation of data warehousing, focusing on big data, semantic technologies, crowdsourcing, analytics, and platform engineering.

Krish is the president of Sixth Sense Advisors Inc., a Chicago-based company providing independent analyst, management consulting, strategy and innovation advisory and technology consulting services in big data, data warehousing, and business intelligence. He serves as a technology advisor to several companies, and is actively sought after by investors to assess startup companies in data management and associated emerging technology areas. He publishes with the BeyeNETWORK.com where he leads the Data Warehouse Appliances and Architecture Expert Channel.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

December 2007 Archives

We have all been discussing, writing and reading about Data Warehouse Appliances. Still in its formative years, this technology is already making the rounds in the data center. Recently I had the opportunity to spend time on Appliances based on columnar databases. This technology is just awesome for Analytical applications. But why do we need a separate appliance for Analytical purposes?

The current RDBMS technologies are geared towards OLTP applications. They all provide analytical functions built on the database platform, but when these functions run on the traditional SMP architecture they cannot perform at high speeds and encounter severe disk and CPU constraints. This is especially true when running OLAP queries on the SMP architecture.

This is where a columnar database differs. Traditional databases process queries in a row-based fashion, while columnar databases process queries in a columnar fashion. The architecture of a columnar database is:

1. Columnar data storage
2. High-performance processing of data loads and updates
3. Shared-nothing and Massively Parallel Processing architecture
4. Adaptive compression of data based on data type and length
5. Data is stored compressed on the disk
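
To make the storage points above a little more concrete, here is a minimal Python sketch. It is an illustration only and not how any particular appliance works: the sample table, the column names and the functions to_columns, rle_encode and rle_decode are all made up for this example, and run-length encoding is just one simple compression scheme picked to show the idea of storing a column compressed.

```python
from itertools import groupby

# Hypothetical sample rows; the table and column names are illustrative only.
rows = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "East", "product": "B", "sales": 150},
    {"region": "East", "product": "A", "sales": 120},
    {"region": "West", "product": "A", "sales": 90},
    {"region": "West", "product": "B", "sales": 200},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


columns = to_columns(rows)
region_runs = rle_encode(columns["region"])
print(region_runs)  # [('East', 3), ('West', 2)]
```

Columns with long runs of repeated values, such as region here, shrink the most under this scheme. A real product would presumably choose an encoding per column based on data type and length, as the list above describes.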

The query processing architecture of a columnar database is:

1. Only columns relevant to the query being executed are retrieved
2. All operations are done in parallel (A traditional DBMS will scan all of the data sequentially)
3. There is a very low overhead in data retrieval
4. Data is scanned compressed and only expanded on retrieval.
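
The query-side points can be sketched in the same spirit. The snippet below is a toy example under stated assumptions: the store dictionary, its column names and the function sales_by_region are hypothetical, and the region column is assumed to be held as run-length pairs. It shows a query that reads only the two columns it needs and works on the compressed runs directly, without expanding the region column row by row. It does not attempt to show the parallel execution mentioned in point 2.

```python
# Hypothetical column store: each column is held separately.
# "region" is stored as run-length pairs (value, run_length);
# "product" and "sales" are stored as plain lists.
store = {
    "region": [("East", 3), ("West", 2)],
    "product": ["A", "B", "A", "A", "B"],
    "sales": [100, 150, 120, 90, 200],
}


def sales_by_region(store):
    """Sum sales per region, reading only the two columns the query needs."""
    totals = {}
    position = 0
    for region, run_length in store["region"]:
        # One slice per run: the region column is never expanded row by row.
        chunk = store["sales"][position:position + run_length]
        totals[region] = totals.get(region, 0) + sum(chunk)
        position += run_length
    return totals


print(sales_by_region(store))  # {'East': 370, 'West': 290}
```

Note that the product column is never touched. In a row-based layout, every row would have to be read in full to answer the same question.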

Since the underlying database is architected to store data compressed and retrieve data in columns rather than rows, there is an advantage in building a multidimensional query on this platform. While there is a potential for the columnar database to provide a platform advantage for the Analytical data warehouse, the other appliance technologies also provide a similar advantage in terms of performance.

One columnar database vendor has already proven its database's strength by running the TPC-H benchmarks.

As Operational BI matures over the next few years and the demand for operational reporting increases, the demand for data availability and data accessibility will increase as well. These are the technologies that will be deployed in the data center to take on workload from the data warehouse.

Watch this Blog for further details on this topic.

Posted December 29, 2007 6:15 PM

I have been asked this question a number of times: "What is in the data warehouse appliance? I do not feel comfortable with it being a black box." Every time, I have assured the user that there is no black box. The interfaces to manage the data warehouse appliance might not yet be as robust as those of the mainstream databases, but that is a matter of the technology itself maturing.

While the vendors out there are working to make the maintenance and management of the Data Warehouse appliance easier, here are a few things that I would like to see implemented in any of the appliance technologies.

1. GUI - A thin-client user interface for technology and user management of the appliance. While a few appliances have one, it is not yet at a level that instills confidence in users. Vendors whose technologies lack one will have to make it a priority.

2. DBA and Administrator documentation - While the appliance can run on any Linux or Unix platform, additional commands have been added for the MPP engine integration. Robust documentation on configuration and system administration will be greatly appreciated. Similarly, from a DBA perspective, documentation and a management GUI will be absolute success criteria.

3. TPC-H benchmarks - I'm not suggesting that every vendor needs to run a TPC-H benchmark. But running one will give the IT user statistics for the decision-making process and provide an apples-to-apples comparison of the platforms.

4. Reports - To ensure manageability and provide in-depth information, reports should be made available on disk and CPU utilization, data allocation, etc. These will serve as educational input and will also provide operations support.

Posted December 28, 2007 6:01 AM

Whether you like it or not, the Data Warehouse Appliance has been making the rounds and is finally getting attention. Looking at Gartner's latest Magic Quadrant, we see three vendors in the quadrant (hey, they are there with the big boys who have been around forever). By this time next year we will see more names in this area. What has made the Appliance click or get attention?

The initial attention showered on the Appliance focused on its competitive cost advantage, but not anymore. The Appliance vendors have started providing features and functionality that traditional database solutions are only putting on the roadmap for future releases. The very fact that these companies are young, ready to take on the challenge and able to provide the solution in record time is proof enough.

Appliance vendors have long shown the ability to move large volumes of data at record speeds. The technology is built to perform and in most cases eliminates the need for overheads like indexes; there are situations where you might need them, but that is a rarity. Commoditisation has proven the ability to execute on non-proprietary platforms. The ability to scale has been demonstrated by all the vendors.

Appliance vendors have been taken on as partners by leading BI vendors such as Business Objects, MicroStrategy and Informatica, to name a few.

Yes, this is a relatively new technology and there is room for improvement. There are more exciting things coming in the next year from this area, and a lot of these technologies will be future trendsetters.

Watch, read and participate in this channel for more information on the Appliances, their integration, issues, etc.

Posted December 12, 2007 10:04 PM

With the ever-growing need for data to be available in real time for consumption by business and non-business users, we are seeing a new rush for data agility and a need for a new backbone architecture for data integration. YouTube has given a new meaning to information sharing in the media; similarly, digital dashboards have become an integral decision-making component for business owners and executives. Real-time demand forecast engines have started making the supply chain more agile than ever, and real-time customer feedback has become a major investment for theme parks (e.g. EuroDisney).

What drives this demand is the need to be agile in your business. Whether it is a meat packaging plant in rural Iowa or a theme park in one of the world's largest cities, the need to be agile and responsive to the customer has brought a new meaning to data availability and data integration. I do agree that you cannot change production processes or schedules or alter already manufactured goods, but with the right information available at the right time, you can work wonders with managing your product, your offerings or services or, better yet, your production schedule.

In order to meet this ever-growing demand, technology has also been improving consistently. CPUs have become faster and less expensive overall, and memory has just about doubled in performance while dropping in price. Disk has become incredibly cheap; in fact, with the world going digital with SD cards and flash drives (even in camcorders), demand for storage will increase in the future.

In the data warehouse space, the demand for data agility has been consistently met with new and innovative offerings. Data Warehouse Appliances have established a strong footprint in data centers around the globe this year. I see this technology being embraced by data centers and data warehouse IT staff in the coming years.

Data integration architectures are being revamped to accommodate the data agility needs. DW 2.0 from Bill Inmon is pathbreaking in its unstructured data integration techniques. We are seeing the ODS being revived in light of Operational BI requirements and the data agility needs that come with them.

Retail and Financial services data requirements have just about quadrupled in the last couple of years. I'm also seeing the healthcare industry's growing pains with data, and solutions are getting ready to address these issues.

Whichever way you choose to look, the next phase of this journey for all data practitioners is going to be an interesting and rewarding one.

Posted December 5, 2007 12:00 PM