

Blog: Krish Krishnan

Krish Krishnan

"If we knew what it was we were doing, it would not be called research, would it?" - Albert Einstein.

Hello, and welcome to my blog.

I would like to use this blog for constructive communication and exchanges of ideas in the business intelligence community, on topics from data warehousing to SOA to governance, and all the topics under the umbrella of these subjects.

To maximize this blog's value, it must be an interactive venue. This means your input is vital to the blog's success. All that I ask from this audience is to treat everybody in this blog community and the blog itself with respect.

So let's start blogging and share our ideas, opinions, perspectives and keep the creative juices flowing!

About the author

Krish Krishnan is a globally recognized expert in the strategy, architecture, and implementation of high-performance data warehousing solutions and big data. He is a visionary data warehouse thought leader and is ranked as one of the top data warehouse consultants in the world. As an independent analyst, Krish regularly speaks at leading industry conferences and user groups. He has written prolifically in trade publications and eBooks, contributing over 150 articles, viewpoints, and case studies on big data, business intelligence, data warehousing, data warehouse appliances, and high-performance architectures. He co-authored Building the Unstructured Data Warehouse with Bill Inmon in 2011, and Morgan Kaufmann will publish his first independent writing project, Data Warehousing in the Age of Big Data, in August 2013.

With over 21 years of professional experience, Krish has solved complex solution architecture problems for global Fortune 1000 clients, and has designed and tuned some of the world’s largest data warehouses and business intelligence platforms. He is currently promoting the next generation of data warehousing, focusing on big data, semantic technologies, crowdsourcing, analytics, and platform engineering.

Krish is the president of Sixth Sense Advisors Inc., a Chicago-based company providing independent analyst, management consulting, strategy and innovation advisory, and technology consulting services in big data, data warehousing, and business intelligence. He serves as a technology advisor to several companies, and is actively sought after by investors to assess startup companies in data management and associated emerging technology areas. He publishes with BeyeNETWORK.com, where he leads the Data Warehouse Appliances and Architecture Expert Channel.

Editor's Note: More articles and resources are available in Krish's BeyeNETWORK Expert Channel. Be sure to visit today!

I wrote in my prior post that Big Data Appliances are coming around the corner, and it looks like deja vu all over again.

Teradata today announced the new Teradata Aster Big Data Appliance, and, as the name says it all, it is the first integrated product from the company's stables, combining Aster and its legendary platform with the bells and whistles of the Teradata platform in terms of management tools and hardware capabilities, and adding Hadoop integration through Hortonworks.

The system, in my opinion, is a perfect combination of power and ruggedness with a lot of finesse, and it differs from competitive announcements on the following grounds:

1. Aster's patented SQL-H and its legendary scalability architecture
2. More than 50 analytical functions designed to work on large data sets
3. SQL-MapReduce, combining SQL and MapReduce, which has been a core Aster strength since 2009 (see the sketch after this list)
4. Teradata Viewpoint, a well-known and tested management tool for the platform, extended to include Aster and Hadoop
5. Teradata TVI, a very sophisticated hardware support and failure-prevention software
6. Infiniband network interconnect, which makes ultra-high-performance connectivity between Aster and Hadoop, as well as scalability, a non-issue
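
To make item 3 concrete: the snippet below is a minimal Python sketch of the map/reduce pattern that SQL-MapReduce exposes from inside a SQL query. This is not Aster's actual SQL-MR syntax; the row data, keys, and function names are hypothetical, illustrative only.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative only: NOT Aster SQL-MapReduce syntax. It sketches the underlying
# pattern -- a map phase emits (key, value) pairs from row-like records, and a
# reduce phase aggregates per key -- which SQL-MapReduce lets you invoke over
# very large tables directly from SQL.

rows = [  # pretend these came from a SQL result set
    {"user_id": 1, "page": "home"},
    {"user_id": 1, "page": "cart"},
    {"user_id": 2, "page": "home"},
]

def map_phase(row):
    # Emit a (key, value) pair for each input row.
    return (row["user_id"], 1)

def reduce_phase(key, values):
    # Aggregate all values that share a key.
    return {"user_id": key, "page_views": sum(values)}

pairs = sorted((map_phase(r) for r in rows), key=itemgetter(0))
results = [
    reduce_phase(key, [v for _, v in group])
    for key, group in groupby(pairs, key=itemgetter(0))
]
print(results)  # [{'user_id': 1, 'page_views': 2}, {'user_id': 2, 'page_views': 1}]
```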

For those of you who have been around these platforms, it definitely is not a me-too solution, nor is it a case of taking existing solutions and re-integrating or re-branding them. Neither is the platform configuration a custom build. From a CIO's perspective, this is where the Teradata Aster Big Data Appliance makes a strong business case: it brings proven and tested technologies together in an appliance footprint, with all the configurations required to handle the onslaught of Big Data.

As a Big Data practitioner and a Data Warehouse evangelist, what truly is a future-think architecture from my perspective is the "unified architecture". This is where the rubber meets the road, in my opinion, and I have discussed a similar solution in the seminars and discussions that I continue to have on Big Data.



What this architecture does for you as a user is create two platforms at the same time - one for exploration and mining purposes, and the other for analytics and management reporting. You can push workloads across the different architectures and leverage the power of all the pieces of the infrastructure. With the right approach and solution architectures, enterprises can take a giant leap forward on the Big Data journey with these types of platforms.

The engineering efforts of Teradata, Aster and Hortonworks speak for themselves in terms of performance tests. I look forward to more testing and benchmarking results on large data volumes.

While all the technology announcements to date have been very innovation focused, this one makes a business case at its very introduction, and that is what gets a business executive's attention. The passion and commitment of all the teams involved shows in the appliance's performance from current tests and recorded benchmarks.

This is not the last of the appliances; there are more to come, and users will have a greater choice. If I were to compare these appliances with anything in the auto industry today, the Teradata Aster Big Data Appliance is like the next generation of Prius with more bells and whistles, while IBM Pure is like a custom-built, souped-up supercar. Both have their enthusiasts and loyalists, and both have been able to address different user needs.

Whichever direction one chooses, this is a dream-come-true era for many solution architects, DBAs and CIOs.

Posted October 18, 2012 9:07 PM
This year will be one to remember in the history of Data Warehousing. About a year ago, the frenzy of Big Data had already numbed our senses and we were wondering how to tame the beast - is Hadoop the only known savior? A year later, we are seeing a new lineup of infrastructure and technology warriors from the known stables of Data Warehouse and Database vendors - the Oracle Big Data Appliance, the IBM BigInsights platform, ParAccel, Kognitio, EMC Greenplum - and the saga goes on.

What interests me is the legacy that is being carried through as we move on from the Data Warehouse Appliance to Big Data Appliances. The original promise of the Data Warehouse Appliance was a self-contained box of hardware, software and APIs built to handle the rigors of the Data Warehouse, and the Big Data Appliances will be self-contained boxes configured to handle the demands of Big Data. In the newer situation, the issue will be where all of this fits into the ecosystem and how one handles all the workload. Well, that is for the architects to figure out over time. Of course, at the time of this writing, we still have more to come our way in all probability - never say we are done with anything in technology.

Innovation thrives and exists.

Posted October 10, 2012 4:43 PM
One of the key requirements for the next generation Data Warehouse is the ability to process streaming data and provide analytical insights while the data is in transit. The technology to handle such a requirement exists outside the DW today, and there are platforms from IBM, Teradata and other vendors that have been enabled to provide these types of features within the DW infrastructure. The real question is: does the business have the data in its hands at the right time? An even more pressing question: why does the DW need to process this type of requirement?

Let us look at the first question. Today, the business does not get access to the data at the right time. For example, credit card fraud, high-volume trade fraud or any risk management platform can process a fraud alert based on complex event processing algorithms, but the results are provided to the business user, in most cases, after the fact, when the data is at rest in the DW. This adds complexity, as someone has to trace through the data end to end to sequence the events and discover patterns, and with the added delays due to human intervention in the process, the decisions often have a larger financial impact. In the present infrastructure, applying a near real-time processing platform is expensive.

Let us look at the second question - why the DW? The answer lies in the fact that the DW is the most stable data repository and the most integrated analytical data source in any organization. The deeper question is whether this type of processing should be done in the DW, or outside the DW with just the results being integrated into the DW. The latter approach is ideal, as the volume of data we add is considerably lower compared to all the data being loaded and processed in the DW.

This is where the next generation of DW comes into play. In the future, the DW will be the analytical hub with an ecosystem of technologies surrounding it. The better-designed platforms for such streaming data analytics include Hadoop, NoSQL and Armanta Inc's platform, to name a few. These platforms can process and digest high volumes of data and provide alerts that can be caught while the transaction is in process. An example of such an implementation is a large credit card vendor providing an additional client verification service and rejecting the transaction if it is found or suspected to be fraudulent.
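
To make this concrete, here is a minimal Python sketch of in-flight alerting under assumed rules: a rolling per-card window is checked as each transaction streams past, before the data ever lands in the DW. The threshold, field names, and data are hypothetical; real platforms such as those named above use far richer complex event processing. The key point is that only the alerts - a trickle of data - need to be integrated into the DW, not the full raw stream.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Minimal, illustrative sketch of in-flight alerting. The velocity rule and
# field names are assumptions for illustration only.

WINDOW = timedelta(minutes=5)
MAX_TXNS_IN_WINDOW = 3

recent = defaultdict(deque)  # card_id -> timestamps of recent transactions

def check_transaction(txn):
    """Return an alert dict if the transaction looks suspicious, else None."""
    history = recent[txn["card_id"]]
    now = txn["timestamp"]
    # Drop events that have fallen outside the rolling window.
    while history and now - history[0] > WINDOW:
        history.popleft()
    history.append(now)
    if len(history) > MAX_TXNS_IN_WINDOW:
        return {"card_id": txn["card_id"], "reason": "velocity", "at": now}
    return None

# Usage: feed each event as it arrives; only alerts flow onward to the DW.
stream = [
    {"card_id": "4111", "timestamp": datetime(2012, 8, 28, 10, 0, s)}
    for s in range(0, 50, 10)
]
for txn in stream:
    alert = check_transaction(txn)
    if alert:
        print("ALERT:", alert)
```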

As businesses transform to a service-centric model, these types of data management needs will become the reality of the day. While there are several niche solutions and initial adopters of these types of platforms, the immediate future looks promising for lapping up all the new capabilities, especially on the Big Data platform, where cost is not a barrier to entry.

Streaming data processing and analytics will emerge as a top requirement, from a market demand and adoption perspective, in the world of Data Warehousing.

Posted August 28, 2012 6:41 PM
One of the hottest topics trending today is user adoption of in-memory data stores. While in-memory has been around for more than 20 years, there are several significant events leading to its widespread adoption, and the most common among them include:

  • Big Data
  • Commoditization of Infrastructure
  • Emergence of NoSQL databases

In a recent conversation with Roger Gaskell, the CTO of Kognitio, I had an opportunity to hear their vision for the future. It is interesting to hear that the queries and users coming to Kognitio are looking to adopt the architecture as an in-memory datastore alongside their existing EDW or new Big Data environment.

It is not a well-known fact that Kognitio had been designed as an in-memory data store in its original avatar as WhiteCross (there are several mentions on the web if you are interested). In the first 20 years of its existence, the company could not bring the in-memory design to the forefront due to cost factors. The UK-based company had been accepted as a second-generation DW Appliance vendor, though, going by fellow analyst records, their customer base was not a publicly available artifact. I have mentioned them in my coverage of the DW Appliance industry since 2007, and they have continued to remain a part of that space.

Roger briefed me about the cloud-based model from Kognitio and discussed their recent partnerships with Alteryx and Hortonworks from a big data extension perspective. In a typical Big Data solution architecture, we can see how Kognitio will fit into the big picture. Remember that you need to create a heterogeneous solution to become scalable and flexible.

While Kognitio may not have the deep pockets of another in-memory vendor, SAP, their technology is worth taking a look at, especially if you are in a non-SAP environment.

Roger cited some clients who were already engaged in technology evaluations of the Big Data platform from an in-memory perspective. However, the client names are under NDA and were not disclosed. Going by the recent spate of announcements from Kognitio and its partners, the indications are that the latest trend is in Kognitio's direction. Let us see how this evolution continues.

We will look at other technologies in this series later this week and into next.



Posted July 23, 2012 5:31 PM
In a recent series of conversations with CxOs, a common lament was the lack of actionable, or action-oriented, insights from Analytics or Business Intelligence. Upon deeper analysis of this lament, what I have seen is a "latency curve" that builds silently after the initial metrics and analytics are delivered for decision support.

We initially discovered, on a time-value curve, the data acquisition, analysis and delivery latencies that were built in due to design and architecture weaknesses. With the advent of technology and tools, we have somewhat overcome those initial latencies, and their reduction has enabled the adoption of Business Intelligence. But adoption has brought a second problem; the situation is something similar to the sequence below:

  1. Company XYZ implements a DWBI solution
  2. Users develop reports and analytics, and discover performance needs
  3. Performance and architecture are tuned
  4. Users start adopting the Analytics and Reports
  5. Executive Dashboards are commissioned and provide critical metrics
  6. An executive asks a simple question like How, Why or What? - this is where we build the secondary latencies

The post-query analysis required to provide different levels of data to the executive decision maker is where the effective use of Business Intelligence falls off the cliff. The different reports, data points and metrics quickly get transitioned from a BI tool to Excel spreadsheets, and further analysis and discovery leads to chaos; eventually, a million-dollar investment loses steam.

This is where we need to turn our attention to new data visualization techniques like mashups. Powerful tools like Tableau, Spotfire and BIS2 offer a wide variety of data mashup techniques. With a mashup, you can create summary and detail views separately and present them in one single view. This is an easier technique, as you think through the scenarios together and present better analytics as the end result (a minimal sketch of the idea follows below). With the advent of newer visualization tools, there are better techniques than drill-down and drill-across for providing actionable insights and analytics. This will reduce post-analysis latencies and drive Business Intelligence success.
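
As a minimal illustration of the summary-plus-detail idea, assuming pandas and hypothetical column names and data, the sketch below keeps each summary metric and its supporting detail rows in one structure, so the executive's "why?" does not require a round trip to Excel. It is a sketch of the concept, not how any particular BI tool implements mashups.

```python
import pandas as pd

# Illustrative sketch only: data and column names are hypothetical. The idea is
# to keep the summary and its supporting detail rows together, so the
# drill-to-detail answer is available in the same view as the headline metric.

details = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [120.0, 80.0, 200.0, 50.0],
})

# Summary layer: the metric the dashboard shows.
summary = details.groupby("region", as_index=False)["revenue"].sum()

# One "mashup" view: each summary row carries the detail rows behind it.
mashup = {
    row["region"]: {
        "total_revenue": row["revenue"],
        "detail": details[details["region"] == row["region"]],
    }
    for _, row in summary.iterrows()
}

print(summary)
print(mashup["West"]["detail"])  # the drill-to-detail answer for "why?"
```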




Posted June 14, 2012 9:47 AM
