
Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.


Wayne has been a thought leader in the business intelligence field since the early 1990s. He has conducted numerous research studies and is a noted speaker, blogger, and consultant. He is the author of two widely read books: Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2005, 2010) and The Secrets of Analytical Leaders: Insights from Information Insiders (2012).

Wayne is founder and principal consultant at Eckerson Group, a research and consulting company focused on business intelligence, analytics, and big data.

Recently in Cloud Computing Category

This is the fourth in a four-part series on cloud computing for business intelligence (BI) professionals.

Business intelligence in the Cloud is inevitable. In fact, it's already happening. Although Cloud BI hasn't slowed the growth of on-premises BI software, an increasing number of organizations are using Cloud-based BI services and many more are waiting until the time is right.

By Cloud BI, I primarily mean reports and dashboards that run in a multi-tenant, hosted environment and which users access via a Web browser. The reports and applications can be packaged (i.e., Software-as-a-Service or SaaS) or custom-built by the service provider or customer using Platform-as-a-Service (PaaS) or other tools. (See Part II of this series for more detailed explanations of SaaS and PaaS.)

Benefits. The Cloud offers numerous benefits for organizations that want to run reports and dashboards. There is no hardware and software to buy, install, tune, and upgrade. Consequently, there are no IT people to hire, pay, and manage. Applications upgrade automagically and can be scaled seamlessly. Customers pay a monthly subscription based on usage rather than an upfront licensing fee. Essentially, the Cloud speeds delivery and drives down costs. What's not to like?

Impediments. But there are concerns. One of the biggest impediments to Cloud BI today is security, or at least the perception that data in the Cloud is less secure than data housed in a corporate data center. In reality, data is actually safer in the Cloud than in many corporate data centers. The real issue is not security, it's "control." Executives today simply feel safer if their data is housed in a corporate data center. (Ironically, most companies have already outsourced sensitive data, like payroll and sales, to third party providers.) To be fair, some companies, especially those in financial services, must comply with regulations that currently require them to keep data on premise.

E-Commerce. Interestingly, many industry experts raised the same security bogeyman in regards to e-commerce. In the late 1990s, many experts said, "Consumers will never type their credit card number into a Web browser and ship it off to an unknown destination via the public internet because it could be stolen." Of course, we know how this story played out. In 2009, more than 154 million people in the U.S. bought something online, and online sales are growing four times faster than retail sales in general, according to Forrester Research. When it comes to security, convenience trumps fear, especially when fear isn't grounded in reality. The same appears to be happening with the Cloud.

The biggest challenge with running BI in the Cloud involves packaging custom BI development into a cost-effective, online service. By its nature, BI involves creating custom applications that integrate data from unique combinations of data sources. Cloud BI vendors are still figuring out how to deliver BI services without losing their shirts or turning into a custom development shop. (See Part III of this series.)

Finally, some Cloud BI vendors only deliver interactive reports and dashboards (e.g., SAP, Indicee, and GoodData), while only a few offer more in-depth analysis using on-line analytical processing (OLAP) (e.g., Birst) or pivot table functionality against big data (e.g., PivotLink). However, for most organizations getting started with BI, reporting and dashboarding functionality is more than sufficient to satisfy their information appetites.

Gaining Traction

Despite these obstacles, Cloud BI is gaining ground, according to a recent survey of the BI Leadership Forum, a global network of BI directors and other BI professionals. (See www.bileadership.com.) More than one-third of organizations are currently using the Cloud for some part of their BI program, according to the survey. (See figure 1.)

Figure 1. Are you using the Cloud for any part of your BI program?
Source: BI Leadership Forum, June, 2011. Based on 112 responses. www.bileadership.com.

Organizations that have embraced the Cloud point to "speed of deployment" (30%) and "reduced maintenance" (30%), followed by "flexibility" (19%) and "cost" (11%). (See figure 2.)

Figure 2. Motivating Factors

Momentum. So far, Cloud BI users are happy campers. Almost two-thirds (65%) said they plan to increase their usage of Cloud BI in the next 12 months. Only 3% said they would decrease usage, while 16% planned to keep their implementation the same and 16% weren't sure. (See figure 3.)

Among respondents who are not using Cloud BI, 16% said they plan to implement Cloud BI in the next 12 months and 32% were not sure. So Cloud BI has momentum. However, it may take five to 10 years for Cloud BI to reach the tipping point where it becomes a mainstream component of every BI program. Given Cloud BI's benefits, this trajectory is inevitable.

Figure 3. Future Usage

Small Companies Lead the Way

A closer look at the data confirms what many pundits have said about the target market for Cloud BI software: it's currently ideal for small companies with few IT resources, limited capital to spend on servers and software, and little to no BI expertise. Almost half of small companies under $100M in annual revenues (46%) use Cloud BI in some shape or form. In contrast, large companies with over $1B in annual revenues are far less likely to adopt the cloud (29%), while medium-sized companies with between $100M and $1B in annual revenues lag further behind, with less than one-fifth using BI in the Cloud (18%). (See figure 4.)

Figure 4. Cloud BI Deployment by Company Size

Small Companies. For small businesses without legacy BI applications, Cloud BI services are a godsend. The economics and convenience are compelling. Instead of passing around spreadsheets, small companies can implement a Cloud BI service to standardize reports and dashboards and make them available to all employees anywhere via a Web browser.

"What's refreshing for me is that I can go in at any time of day and [run a] report on any metric in our organization, such as item[s] received, delivered, [or] inspected at the category, personnel, or employee level and track it by any time period," says Wayne Deer, vice president of operations at Gazelle, an electronics recycler that uses GoodData's Cloud BI service.

Large Companies. Interestingly, large companies are the next most prevalent users of Cloud BI services. Often, it's a department head who wants to build a BI application quickly without getting corporate IT involved. Like small companies, departments at larger companies often have limited budgets and BI expertise and most don't want the headaches and expense of having to maintain servers and software.

But enterprise BI managers have assessed the potential of Cloud BI and like what they see. Unfortunately, many are hamstrung by legacy BI implementations. "I see us moving very slowly with adoption because of installed base and switching costs," wrote Darrell Piatt, Director and Chief Architect at a large professional services firm based in Virginia, in a BI Leadership Forum discussion thread. "When and if we decide to replace our BI infrastructure, Cloud BI offerings will be seriously considered."

Mid-size Companies. According to figure 4, medium-sized companies are least likely to adopt Cloud BI services. The reason is that most have already implemented an enterprise IT platform, usually Microsoft SQL Server, which bundles BI tools and applications for free. If the organization has assigned an analyst or IT administrator to build and maintain enterprise reports and dashboards using the platform, it likely has little bandwidth, incentive, or capital to change course and introduce an alternative BI stack, unless it is having difficulty meeting user requirements.

Cloud BI Vendor Perspective

Given that they provide a service that makes them an extension of an organization's IT team, Cloud BI vendors have a good handle on who their customers are and what they are doing with their services.

For example, Sam Boonin, vice president of products and marketing at GoodData, says that his company's customers fall into three camps: 1) fast-growing technology companies that run all their applications in the Cloud, 2) departments in larger companies, many of which have implemented Salesforce.com and are comfortable with the SaaS model, and 3) SaaS vendors who OEM their product.

Boonin also said that 90% of GoodData's customers start by using one of its packaged applications, which generate reports against a single SaaS-based, front-office application, such as Salesforce.com, ZenDesk, and Google Analytics, or an on-premise package, such as Microsoft Dynamics CRM. (GoodData currently offers 20 packaged applications.) These applications, which deploy in hours, start at $1,000 a month and are often bundled into third party, SaaS products.

Many customers then extend their packaged SaaS BI application by customizing data models and adding other data sources. GoodData configures or customizes the application to the customer's specifications. The process takes roughly six weeks and typically raises the monthly subscription price to between $3,000 and $10,000 a month. "For companies used to spending $500,000 on BI and getting virtually nothing, they see us as a godsend," says Boonin.

Today, almost half (43%) of GoodData's customers generate reports that run against multiple applications. These customers generate the majority of GoodData's revenues. Currently, GoodData has about 100 direct customers and 3,000 indirect customers through OEMs. This is comparable to other pureplay Cloud BI vendors. "Business is good," says Boonin. "We have a $25 billion failed market to disrupt."


Some experts see dark clouds in the Cloud BI market, and while the ramp-up of Cloud BI services hasn't been as fast as some anticipated, it's clearly catching on. The BI market poses unique challenges compared with other SaaS segments that automate operational business processes: BI applications are generally custom built and require companies to integrate data from multiple sources. Cloud BI vendors have taken different approaches to "servitizing" a custom application, which is something of a contradiction in terms. Not all have succeeded, but those still in the market are making headway. The value of Cloud computing is high, and the BI industry will eventually find a way to succeed with it.

Posted September 9, 2011 9:34 AM

This is the second in a four-part blog series on cloud computing for BI professionals.

Cloud computing offers a compelling new way for organizations to manage and consume compute resources. Rather than purchase, install, and maintain hardware and software, organizations rent shared resources from an online service provider and dynamically configure the services themselves. This model of computing dramatically speeds deployment times and lowers costs. (See prior article "What is Cloud Computing?")

Although all cloud computing shares the above attributes, it can be deployed in several different ways. The key factor is whether the cloud service provider is an external vendor or an internal IT department. There are three deployment options for cloud computing:

  • Public Cloud. Application and compute resources are managed by a third party services provider.
  • Private Cloud. Application and compute resources are managed by an internal data center team.
  • Hybrid Cloud. A private cloud that leverages the public cloud to handle peak capacity; a reserved "private" space within a public cloud; or a hybrid architecture in which some components run in a data center and others in the public cloud.

Public Cloud

Most of the discussion about cloud computing in the press refers to public cloud offerings. The public cloud offers the most potential benefits and the greatest potential risks. With a public cloud, organizations can obtain application and computing resources without having to make an upfront capital expenditure or use internal IT resources. Moreover, customers only pay for what they use on a usage or monthly subscription basis, and they can terminate at any time. Thus, public clouds accelerate deployments and reduce costs, at least in the short run. This is sweet news to BI teams that often must spend millions of dollars and months of development time before they can deliver their first application.

In addition, a public cloud obviates the need for customers to maintain and upgrade application code and infrastructure. Many public cloud customers are astonished to see new software features automatically appear in their software without notice or additional expense. And the public cloud frees up IT departments to focus on more value-added activities rather than hardware and software upgrades and maintenance. In short, there is something for everyone to like about the public cloud.

Security and Privacy. But the public cloud also comes with risks. Security and privacy are the biggest bugaboos. Some executives fear that moving data and processing beyond their own firewalls exposes them to security and privacy risks. They fear that moving data across public networks and comingling it with other companies' data in a public cloud might make it easier for sensitive corporate data to get into the wrong hands.

While security and privacy are always an issue, the fact is that most corporate resources are more secure in the public cloud than in a corporate data center. Public cloud providers, after all, specialize in data center operations and must meet the most stringent requirements for security and privacy. However, compliance regulations legally require some organizations to maintain data within corporate firewalls or to pinpoint the exact location of their data, which is generally impossible in a public cloud that virtualizes data and processing across a grid of national or international computers.

Other Challenges. The public cloud poses other challenges:

  • Reliability. Executives may question the reliability of public cloud resources. For example, Amazon EC2 has suffered two short but high-profile outages that left companies running mission-critical parts of their business there stranded, with little visibility into the nature or duration of the outage.

  • Costs. It can be extremely difficult to estimate public cloud costs because pricing is complex and companies often can't accurately estimate their usage (which is why they want to migrate workloads to the cloud in the first place).

  • Blank Slate. Administrators must redefine corporate policies and application workflows from scratch in the public cloud, which generally provides plain vanilla services.

  • Vendor and Technology Viability. The public cloud market is evolving fast so it's difficult to know which vendors and technologies will be around in the future.
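To make the cost-estimation challenge above concrete, here is a toy model of a public cloud bill. The rates and the workload figures are invented placeholders for this sketch, not any provider's actual pricing:

```python
# Toy public-cloud cost model. All rates are hypothetical placeholders,
# not any provider's actual pricing.
HOURLY_COMPUTE = 0.34      # $ per server-hour
STORAGE_GB_MONTH = 0.10    # $ per GB stored per month
TRANSFER_OUT_GB = 0.12     # $ per GB moved out of the cloud

def monthly_cost(server_hours, storage_gb, transfer_out_gb):
    """Estimate one month's bill for a simple cloud deployment."""
    return (server_hours * HOURLY_COMPUTE
            + storage_gb * STORAGE_GB_MONTH
            + transfer_out_gb * TRANSFER_OUT_GB)

# A modest BI workload: two servers around the clock, 500 GB stored,
# 50 GB of reports delivered to users over the month.
estimate = monthly_cost(server_hours=2 * 24 * 30,
                        storage_gb=500,
                        transfer_out_gb=50)
print(f"${estimate:,.2f}")  # → $545.60 for this example
```

Even this toy version shows why budgeting is hard: the bill is dominated by usage figures (hours, gigabytes moved) that the organization often cannot predict in advance.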

Private Clouds

For the reasons above, many organizations are beginning their journey into the cloud with private clouds. This is especially true in the infrastructure-as-a-service arena, where IT administrators are implementing virtualization software to consolidate servers and increase overall server utilization, flexibility, and efficiency. In addition, a private cloud gives an organization greater control over its processing and data resources, providing peace of mind for worried executives, if not greater security and privacy for sensitive data. And since a private cloud runs in an existing data center, IT administrators don't have to recreate security and other policies from scratch in a new environment.

But the private cloud has its own challenges. IT administrators have to learn and install new software (hypervisors and cloud management utilities). They need to manage two compute environments side by side and keep IT policies aligned in both. This adds to complexity and staff workload. And it goes without saying that a private cloud runs in an existing corporate data center, which carries high fixed costs to maintain.

Hybrid Cloud

Companies are increasingly pursuing a two-pronged strategy that uses the private cloud for the bulk of processing and the public cloud to handle peak loads. The key to a hybrid cloud is obtaining cloud management software that spans both private and public cloud environments. The software supports the same hypervisors used in each environment (ideally it's the same hypervisor) and has built-in interfaces to the public cloud provider so internal IT policies and virtual images can be transferred to the public cloud environment.

In addition, many public cloud vendors allow customers to carve out private clouds within the public cloud domain. For example, Amazon.com offers a virtual private cloud within its Elastic Compute Cloud (EC2) environment that lets customers reserve dedicated machines and static IP addresses, which they can link to their internal data centers via virtual private networks. Hybrid clouds are obviously more complex and challenging to manage. Currently, few people have experience blending private and public clouds in a seamless way.

Adding Public Cloud Components to a BI Architecture

Another form of hybrid cloud uses public cloud facilities to enhance an existing architecture. In a BI environment, there are several ways that organizations can mix and match public cloud offerings with their on-premises software (which may or may not be running in a private cloud).

Scenario #1 - Analytic Sandbox. When a data warehouse is running at full capacity, administrators might consider offloading complex ad hoc queries submitted by a handful of business analysts to a public cloud replica. In this scenario, complex queries submitted by the analysts are bogging down performance of the data warehouse. Since it's difficult to estimate ad hoc processing requirements and the costs of replicating a data warehouse are high, the IT staff decides it's faster and cheaper to create a new data mart in the public cloud and point the business analysts to it. The IT staff (or analysts) can increase or decrease capacity on demand using self-provisioning capabilities of the public cloud. (See figure 1.)

Figure 1. Analytic Sandbox Using a Public Cloud (click to expand)
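The division of labor in this scenario can be sketched as a simple query router. The connection strings, the analyst list, and the heuristic for spotting ad hoc queries below are all hypothetical, not part of any vendor's product:

```python
# Route analysts' ad hoc queries to a cloud-hosted replica so they don't
# bog down the production data warehouse. DSNs and the heuristic that
# classifies a query as "ad hoc" are illustrative assumptions.
WAREHOUSE_DSN = "postgresql://dw.internal/warehouse"  # on-premises DW
SANDBOX_DSN = "postgresql://sandbox.cloud/replica"    # public-cloud replica

ANALYSTS = {"analyst1", "analyst2"}  # users whose exploratory work is offloaded

def pick_target(user: str, sql: str) -> str:
    """Send analysts' complex exploratory queries to the cloud sandbox;
    everything else runs against the corporate warehouse."""
    looks_ad_hoc = sql.strip().lower().startswith("select") and "join" in sql.lower()
    if user in ANALYSTS and looks_ad_hoc:
        return SANDBOX_DSN
    return WAREHOUSE_DSN

print(pick_target("analyst1", "SELECT * FROM sales s JOIN dates d ON s.dk = d.id"))
```

In practice the replica would be kept current by the load process described next, and the "complexity" test would be far more sophisticated, but the routing idea is the same.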

The primary challenge in this scenario is the cost and time required to move data across the internet from an internal data center to the cloud. Since the initial load may take days or weeks depending on data volumes, the IT staff will usually ship a disk to the cloud provider to load manually. Thereafter, the IT staff needs to figure out whether it can move daily deltas across the internet within the allotted batch window. Considering that it takes six days to move 100GB across a T-1 line, organizations may need to forgo batch loads and instead trickle feed data into the data warehouse replica. In addition, it is often difficult to estimate pricing for such data transfers, and charges can add up quickly. Cloud providers generally charge for transferring data in and out of the cloud and for storing it. (Amazon, however, has recently discontinued fees for transferring data into EC2.)
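The six-day figure checks out with back-of-the-envelope arithmetic. A T-1 line carries 1.544 Mbps; protocol overhead is ignored here, which only makes the real number worse:

```python
# Time to push 100 GB over a T-1 line (1.544 Mbps), ignoring protocol
# overhead and assuming the line runs flat out the whole time.
T1_BITS_PER_SEC = 1.544e6
PAYLOAD_BITS = 100e9 * 8   # 100 GB expressed in bits

seconds = PAYLOAD_BITS / T1_BITS_PER_SEC
print(f"{seconds / 86400:.1f} days")  # → 6.0 days
```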

Also, depending on the speed of network connections, the business analysts might experience delays in query response times due to internet latency. Invariably, internet speeds won't match internal LAN speeds so users might notice a difference. Finally, there are security and privacy issues discussed in the previous article. (See "What is Cloud Computing?")

Scenario #2. Cloud-based Departmental Dashboard. A more common scenario is when a department head purchases a Software-as-a-Service (SaaS) BI solution from a SaaS BI vendor, of which there are many. Here, an organization's source systems and data warehouse remain in the corporate data center but the dashboard and associated data mart run in the cloud. (See figure 2.)

Figure 2. Cloud-based Departmental Dashboard

SaaS BI tools are popular among department heads who want a dashboard on the cheap and don't want to involve corporate IT. Unfortunately, designing a data mart, whether in the cloud or on premise, is never easy or quick, especially if it involves integrating multiple operational sources. (See "Expectations Versus Reality in the Cloud: Understanding the Dynamics of the SaaS BI Market.")
This is not a problem if organizations are willing to pay the costs of creating a custom data mart and wait about three to four months, which is the time it usually takes to build out a relatively complex, custom environment. It's also not a problem if they simply want to visualize an existing spreadsheet. But if they believe the cloud provides quick, easy, and inexpensive deployments for any type of BI deployment, they will be disappointed. Also, they still need to transfer data to the cloud and users may experience response time delays due to internet latencies.

Scenario #3. BI in the Cloud Without the Data. To eliminate security, privacy, and data transfer issues, companies may want to keep data locally in a corporate data center while maintaining the BI application in the cloud. (See figure 3.) BI developers can configure the SaaS BI tool to meet their branding and workflow requirements, gaining the speed and cost advantages of cloud deployments, while minimizing data security and privacy problems.

Figure 3. BI in the Cloud Without Data

While this scenario sounds like it optimally balances the risks and rewards of cloud-based BI deployments, it has a major deficiency: it requires the IT department to open a port in the corporate firewall to support incoming queries. An organization worried enough about data security to keep its data locally will likely kill the project as soon as it recognizes the vulnerability that opening presents.

Scenario #4. Data Warehouse in the Cloud. The final scenario is to put the entire data warehousing environment in the cloud. (See figure 4.) Today, this only makes sense if all your operational applications also run in the cloud. Obviously, this scenario applies to only a few companies, namely internet startups that have fully embraced cloud computing for all application processing. However, these companies have to manage all the problems associated with the public cloud (i.e., security, reliability, availability, and vendor viability). At some point in the future, this architecture may prove dominant once we get past security and latency hurdles.

Figure 4. Data Warehouse in the Cloud


There are three major deployment options for cloud computing: public, private, and hybrid. As with most things in life, there is rarely a clear-cut solution. So, too, with cloud computing. Organizations will experiment with public and private clouds, and most will probably end up with a mix of both. Most data center shops have already implemented virtualization, which is the first step on the way to private clouds. Once they get comfortable with private clouds, they will soon experiment with hybrid cloud computing to support peak computing rather than spend millions on new hardware to support a few days or weeks of peak processing a year. And if the data is particularly sensitive, they may begin with a virtual private cloud inside a public cloud data center to ease their fears about security, privacy, and reliability.

When push comes to shove, economics and convenience always trump principles and ideals. This is how e-commerce overcame the security bogeyman and gained its footing in the consumer marketplace, and I suspect the same will happen with cloud computing.

Posted July 19, 2011 8:26 AM

This is the first in a four-part blog series on cloud computing for BI professionals.

There is a lot of confusion about cloud computing, even among professionals in the field. But that's true of any new, fast-moving field with a proliferation of technologies and methods. After reading a few definitions of cloud computing that caused me to nod off at my keyboard, I created a simpler one:
Shared, online compute resources that you rent from a service provider and dynamically configure yourself.

Let's unpack this definition a bit:

  • Shared: You share compute resources with other groups or companies, even your direct competitors! Obviously, this raises security and privacy concerns.
  • Online: You access the compute resources via a Web browser or a programmatic Web application programming interface. In this respect, cloud computing delivers online "services".
  • Compute resources: Compute resources consist of the infrastructure (servers, storage, and networks), development tools, and applications. Basically, the whole stack, accessible via a Web browser or service call.
  • Rent: You only pay for what you use and you can terminate the service at any time (although there may be exit fees). This is usage-based pricing. Cloud infrastructure vendors generally charge by the hour, while cloud software providers generally charge per user per month.
  • Service provider: A service provider could be your internal IT department (private cloud) or an external company (public cloud).
  • Dynamically configure: Unlike traditional hardware and software, you don't purchase, install, test, tune, and maintain cloud-based resources. With cloud-based infrastructure, you simply configure a virtual image of your compute environment (hardware, storage, network) using a Web browser. With cloud-based software, you simply configure your application using a Web browser to conform with your branding and workflow requirements.
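The two pricing meters mentioned under "Rent" above can be contrasted with a short sketch. The rates are invented for illustration only:

```python
# Contrast the two common cloud pricing meters. Rates are invented
# placeholders, not real vendor prices.
def iaas_bill(server_hours: float, rate_per_hour: float = 0.50) -> float:
    """Cloud infrastructure is typically metered by the server-hour."""
    return server_hours * rate_per_hour

def saas_bill(users: int, months: int, rate_per_user_month: float = 30.0) -> float:
    """Cloud software is typically metered per user per month."""
    return users * months * rate_per_user_month

print(iaas_bill(720))     # one server running for a 30-day month → 360.0
print(saas_bill(25, 12))  # 25 users for a year → 9000.0
```

The point of the contrast: infrastructure costs track machine time, so they fall when you release servers, while software costs track headcount, so they fall only when you drop users.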

Three Services

As you probably have already surmised, cloud computing is divided into three classes of services, each of which can be applied to the business intelligence market: 1) software-as-a-service (applications), 2) platform-as-a-service (application development), and 3) infrastructure-as-a-service (compute resources). (See figure 1.)

Figure 1. Three Types of Cloud Services with BI Examples (click to expand)

  • Software-as-a-Service (SaaS) delivers applications. SaaS was first popularized by Salesforce.com, which was founded in 1999 to deliver online sales applications to small- and medium-sized businesses with few or no IT resources and little capital. Salesforce.com now has 92,000 customers of all sizes and has spawned a multitude of imitators. Within the BI market, many startups and established BI players are offering SaaS BI services, although the uptake of such services is slower than expected. (See "Expectations Versus Reality in the Cloud: Understanding the Dynamics of the SaaS BI Market.") SaaS BI vendors include Birst, PivotLink, GoodData, Indicee, Rosslyn Analytics, and SAP, among others.

  • Platform-as-a-Service (PaaS) enables developers to build applications online. PaaS services provide development environments, such as programming languages and databases, so developers can create and deliver applications without having to purchase and install hardware. In the BI market, the SaaS BI vendors (above) are actually PaaS BI vendors, which is the primary reason why growth of SaaS BI is slow. Before you can consume a SaaS BI application, you have to build a data mart, which is often tedious and highly customized work since it involves integrating data from multiple, unique sources, cleaning and standardizing the data, and modeling and transforming the data. SaaS BI vendors are peddling a finished product when they are actually selling a custom PaaS development effort.

  • Infrastructure-as-a-Service (IaaS) provides online computing resources (servers, storage, and networking) that customers use to augment or replace their existing compute resources. In 2006, Amazon popularized IaaS when it began renting virtualized capacity in its own data centers to outside parties. Some BI vendors are beginning to offer software components within public cloud or hosted environments. For example, analytic database vendors Vertica and Teradata are now available as services within Amazon EC2, while Kognitio offers a hosted service. ETL vendors Informatica and SnapLogic offer services in the cloud.

Key Characteristics of the Cloud

Virtualization. Virtualization is the foundation of cloud computing. You can't do cloud computing without virtualization, but virtualization by itself doesn't constitute cloud computing.

Virtualization abstracts, or virtualizes, the underlying compute infrastructure using a piece of software called a hypervisor. With virtualization, you create virtual servers (or virtual machines) to run your applications. Your virtual server can have a different operating system than the physical hardware on which it runs. For the most part, users no longer have to worry whether they have the right operating system, hardware, and networking to support an application. Virtualization shields them from the underlying complexity (as long as the IT department has created appropriate virtual machines for them to use).

With virtualization, organizations can run multiple, heterogeneous virtual servers on a single physical server to maximize utilization, or they can run a single virtual server on multiple physical servers to increase scalability. Because virtualization decouples applications from the underlying hardware, IT administrators can migrate applications to new hardware without having to reinstall software. They also can spawn multiple instances of a single application using virtual servers and run them in parallel on a single physical server to improve application performance and throughput. (See figure 2.)
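The consolidation benefit can be illustrated with a toy first-fit placement of virtual machines onto physical hosts. The host capacity and VM sizes below are made up for the example:

```python
# First-fit placement of virtual machines onto physical hosts: the basic
# consolidation move that virtualization enables. Sizes are arbitrary
# CPU units invented for this sketch.
HOST_CAPACITY = 16  # CPU units per physical server

def place(vms):
    """Assign each VM to the first host with room, opening a new host
    only when none of the existing ones fit. Returns a list of hosts,
    each a list of the VM sizes placed on it."""
    hosts = []
    for vm in vms:
        for host in hosts:
            if sum(host) + vm <= HOST_CAPACITY:
                host.append(vm)
                break
        else:
            hosts.append([vm])  # no existing host fits; open a new one
    return hosts

# Ten workloads that once ran on ten dedicated servers now fit on three.
print(len(place([4, 2, 6, 3, 5, 2, 4, 6, 3, 5])))  # → 3
```

First-fit is deliberately simple; real hypervisor managers weigh memory, I/O, and affinity constraints as well, but the utilization payoff comes from the same packing idea.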

Figure 2. Virtualization Use Cases (click to expand)
Left: heterogeneous system images and applications run on a single server, maximizing server utilization. Middle: a single image runs across multiple physical machines, increasing scalability. Right: multiple instances of an application run in parallel on a single machine, increasing efficiency.

In short, virtualization increases the flexibility, scalability, efficiency, and availability of data center resources, and it dramatically lowers data center costs by enabling the IT department to consolidate servers and reduce power, cooling, space, and staffing overhead.

    To the Cloud: Dynamic Provisioning

    Browser Interface. To turn virtualization into cloud computing, you need to add software that enables business users to dynamically provision their own virtual servers and use the servers as long as they desire.

    For instance, developers using a Web browser can configure a custom virtual server to support a new development and test bed. Or, they can select a virtual image (i.e., server and applications) from a library of virtual images created in advance by the IT department. Once the developers are finished using the virtual images, they "release" them. Thus, developers no longer need to submit requests to the IT department for servers, storage, and networking capacity. They either configure their own virtual machine or select one from a library that meets their application's processing requirements. They no longer have to wait for purchasing and legal to execute a purchase order or the IT department to install, tune, test, and deploy the systems.
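    The self-service selection described above can be sketched in a few lines of Python. This is an illustrative sketch only: the image library, its attributes, and the selection rule are hypothetical, not any particular vendor's API.

```python
# Illustrative sketch of self-service provisioning: pick a virtual image
# from an IT-curated library that meets an application's requirements.
# The library entries and selection rule are hypothetical.

IMAGE_LIBRARY = [
    {"name": "dev-small", "cpus": 2, "ram_gb": 8,  "os": "linux"},
    {"name": "dev-large", "cpus": 8, "ram_gb": 32, "os": "linux"},
    {"name": "win-test",  "cpus": 4, "ram_gb": 16, "os": "windows"},
]

def select_image(min_cpus, min_ram_gb, os):
    """Return the smallest library image meeting the requirements, or None."""
    candidates = [
        img for img in IMAGE_LIBRARY
        if img["cpus"] >= min_cpus
        and img["ram_gb"] >= min_ram_gb
        and img["os"] == os
    ]
    return min(candidates, key=lambda i: (i["cpus"], i["ram_gb"]), default=None)
```

    In a real deployment the library would be maintained by the IT department, with each image pre-configured to enforce its policies; developers would only pick and release.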

    Services Interface. To make the leap to cloud computing, you also need a services interface so administrators can programmatically provision servers based on a schedule or events (e.g., the start of an ETL job). Administrators use Web services interfaces to support auto-scaling, failover, and backups.

    With auto-scaling, a BI administrator uses a cloud services interface to automatically provision and release virtual BI servers during the course of a day to efficiently allocate processing power among servers to support various BI workloads. For example, at 2 a.m. in a typical BI environment, the system fires up an ETL server and database server to run nightly ETL jobs, while at 4 a.m. it releases the ETL server and provisions a BI server to process and burst daily reports. At 10 a.m. it provisions an additional BI server and database server to handle peak usage. Failovers and backups work much the same way.
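    The daily schedule described above can be expressed as a simple table that a provisioning script walks through. The sketch below is hypothetical: the server names and the provision/release logic stand in for a real cloud provider's web services interface.

```python
# Hypothetical schedule-driven auto-scaling sketch. Server names and the
# schedule are illustrative; a real script would call the cloud
# provider's web services to provision and release each server.

SCHEDULE = {
    2:  {"etl-server", "db-server"},                 # nightly ETL jobs
    4:  {"bi-server", "db-server"},                  # burst daily reports
    10: {"bi-server", "bi-server-2", "db-server"},   # peak usage
}

def plan_transition(running, desired):
    """Return (to_provision, to_release) to move from running to desired."""
    return desired - running, running - desired

def apply_schedule(hour, running):
    """Apply the most recent schedule entry at or before `hour`."""
    hours = sorted(h for h in SCHEDULE if h <= hour)
    if not hours:
        return running  # no entry yet; leave servers untouched
    desired = SCHEDULE[hours[-1]]
    provision, release = plan_transition(running, desired)
    # Here an administrator's script would invoke the provider's API,
    # e.g. client.provision(name) / client.release(name).
    return (running - release) | provision
```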

    Cloud Management Software. Cloud computing also requires management software to help IT administrators keep track of all the moving parts in a virtualized environment. Cloud management software enables IT administrators to define systems-level policies (e.g., security and usage), create and manage virtual images which enforce the policies, manage virtual server versions, monitor servers and performance, manage user roles and access, track usage, and manage chargebacks or accounting, among other things. There are a variety of vendors that offer cloud management software, including cloud data center providers, such as Amazon.com and Rackspace, and independent software vendors, such as Eucalyptus and RightScale.


    Another key characteristic of cloud computing (in particular, Software-as-a-Service) is that applications are multi-tenant, which means multiple users from different organizations run the same application code on the same hardware. This is different from a traditional hosting or outsourcing environment in which each customer owns or rents a dedicated set of hardware and software in the service provider's data center. The hosted model wastes compute resources since each customer is confined to its own machines even when other machines in the data center sit idle. In contrast, multi-tenancy makes much more efficient use of hardware and software resources, delivering economies of scale that make cloud computing an attractive business model to service providers, as long as they can attract enough customers.

    One problem with multi-tenancy is that applications must be designed from scratch to support it. Multi-tenancy creates virtual partitions within the application and database for each distinct customer. Customers usually configure the application to match their unique branding and workflow requirements. On the data side, customer data is either interleaved by row and separated using unique identifiers or partitioned into separate tables or database instances.
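    The row-interleaving approach can be illustrated with a small example using Python's built-in sqlite3 module. The table, column, and tenant names are made up; the point is that every query is scoped by a tenant identifier, giving each customer a virtual partition of a shared table.

```python
# Minimal sketch of row-interleaved multi-tenant storage using Python's
# built-in sqlite3. Table, column, and tenant names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (tenant_id TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("acme", "east", 100.0), ("acme", "west", 250.0),
     ("globex", "east", 75.0)],
)

def tenant_sales(tenant_id):
    """Scope every query by tenant_id so each customer sees only its own
    virtual partition of the shared table."""
    return conn.execute(
        "SELECT region, amount FROM sales WHERE tenant_id = ? ORDER BY region",
        (tenant_id,),
    ).fetchall()
```

    The alternative designs mentioned above (separate tables or separate database instances per customer) trade this storage efficiency for stronger isolation.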

    Legacy applications not designed for multi-tenancy have to fudge it. Either the service provider creates dedicated environments for each customer, which is highly inefficient (e.g., the old application service provider model), or it uses virtualization software to run parallel instances of each application (e.g., a virtual appliance). In some respects, the virtual appliance approach is more flexible than multi-tenancy because virtual appliances can be ported to run on almost any hardware. (See figure 3.)

    Figure 3. Application Architectures (click to expand)
    Traditional on-premise software (far left) tightly couples logic and data to hardware in a LAN environment. A hosted environment (second from left) gives each customer their own dedicated hardware and software resources in a third-party data center which they access via a virtual private network. A true multi-tenant environment (second from right) partitions a single application and database so different customers get their own unique views while sharing the same application, database, hardware, and network connection. A virtual appliance model (far right) enables legacy software not written for multi-tenancy to run parallel instances, essentially virtualizing multi-tenancy.

    SaaS BI vendors have long waged battles over whether their respective software is truly multi-tenant or not. The virtual appliance model gives legacy software vendors venturing into SaaS a more equal footing on which to compete.


    This blog defined cloud computing and discussed some of its more salient attributes. However, there are several ways to deploy the cloud, and these deployment options have significant implications on costs, security, and staffing. The next blog in this series will discuss the differences between public clouds, private clouds, and hybrid clouds and show how an organization might architect its BI environment to leverage public cloud offerings.

    By the way, I'm once again speaking at CFO Magazine's Corporate Performance Management Conference, which is being held September 11-12 in Dallas, Texas. I'll be delivering a presentation on Monday about the future of business intelligence, using my BI Delivery Framework 2020 as the basis for the presentation. On Tuesday afternoon, I'll be delivering a half-day seminar on performance dashboards. If you are interested in registering for the all-access pass, use the code LF1000 to get a $1,000 discount. Cool!

    Posted July 11, 2011 3:56 PM
    Permalink | 2 Comments |


    This is part three in a four-part series on cloud computing for BI professionals.

    There are no shortcuts in business intelligence (BI). And Software-as-a-Service (SaaS) BI vendors and some of their Cloud-based customers are finding this out the hard way.

    I'm a firm believer that most computing will eventually move to the Cloud, but I've been surprised that the adoption of SaaS BI services has been slower than expected. Most pureplay SaaS BI vendors today are small and struggling, and leading BI vendors no longer market their SaaS BI solutions to a significant degree (if at all). So the question is "Why?"

    Red Herrings. The two most commonly cited obstacles to SaaS BI adoption are security and data transfer rates. The security issue is mostly a red herring, in my opinion, except at organizations with strict compliance regulations. Data can be safer in the Cloud than in many corporate data centers. In terms of data transfer rates, a majority of organizations simply don't generate enough daily data to overwhelm a reasonable internet connection. And internet speeds are getting faster and cheaper all the time. Another red herring.

    The Missing Link

    I believe there is something deeper going on. There is a fundamental flaw in the SaaS BI equation. And I think I've found it.

    But first, it's important to recognize that there is a lot to like about the Cloud. There are numerous benefits to running your applications as a service rather than on premise. There is no hardware and software to buy, install, tune, and upgrade. Consequently, there are no IT people to hire, pay, and manage. As a result, software services drive down costs and speed delivery. What's not to like?

    Preparing Data. Unfortunately, this equation doesn't add up in the BI space. That's because the hard part about delivering BI applications is not what users see--the graphical report or dashboard--it's collecting, cleaning, normalizing, integrating, and aggregating data from various systems so it can be viewed in a clear, coherent way by business users.

    Preparing data is hard, tedious work, but it's the foundation of BI. Do it right, and you can ice your cake with sweet-tasting frosting. Do it wrong or not at all, and there is no cake to ice! Too many SaaS BI vendors have been peddling the icing and downplaying the need to bake the cake, and now they're suffering. The same thing is happening with visual analysis tools, such as QlikView. They are great at handling simple data sets, but give them dirty data from complex operational systems and they fall apart. Someone, somewhere has to do (and pay for) the dirty work of preparing data or else everyone goes hungry.

    Software Services or Professional Services?

    Let me take another slight digression: What's the difference between a SaaS BI vendor and a BI consultancy? Not much.

    Custom Data Marts. On one hand, you can argue that pureplay SaaS BI vendors, such as GoodData, Indicee, Birst, and PivotLink, offer software, which consultancies don't, and that the best offer true multi-tenant BI services that run in a virtualized environment. But on the other hand, SaaS BI vendors, just like BI consultancies, provide professional services to build custom data marts for their customers. Like consultants, they need to gather requirements, build a data model, extract and map source data, and build reports. This is a lot of work. If you peel back the covers on many SaaS BI deployments, they are really custom consulting jobs masquerading as a software service. But that's not the end of it.

    Operational Management. Once the development work is done, BI consultancies go home or move on to the next job, but SaaS BI vendors have to stick around and run the BI environment, just like an inhouse IT staff would. They have to schedule and execute jobs to extract and clean data and then transform and load it into the data mart. They have to manage change control and error processes, troubleshoot problems, and staff a help desk to answer any questions customers might have. And before they can upgrade their software, they need to test every customization that they've built for every customer (which happens to undermine one of the major benefits of Cloud-based services, which is rapid delivery of software upgrades.)

    Fixed Costs. Adding insult to injury, before SaaS BI vendors can begin collecting money, they have to build out and staff a highly secure and scalable data center that offers full backup/recovery, failover, and disaster recovery services. Customers have been trained to demand the highest level of IT platform and administration services possible from a Cloud or hosting vendor even though many would not pay for the same level of services in their own data centers.

    Subscription Pricing. Obviously, all of this involves a lot of work and is very expensive. So you would think that SaaS BI vendors command premium prices, right? Well, not really. In fact, mostly the opposite. Customers pay only for what they use on a monthly or annual basis and they can cancel their subscription at any time (although there may be exit fees). Compared to on-premise software where vendors get all their money upfront, SaaS BI vendors have to wait several years before they accrue a comparable sum. But, in the meantime, they have to finance an expensive technical and organizational infrastructure that requires large upfront capital outlays and ongoing expenditures. In short, the business model for SaaS BI just doesn't work.
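    A back-of-the-envelope calculation shows the cash-flow gap. The figures below are hypothetical: if an on-premise license brings in $300,000 upfront and the equivalent subscription bills $100,000 per year, the SaaS vendor waits three years to accrue the same sum, all while financing its data center.

```python
# Hypothetical figures: how long does subscription revenue take to
# match an equivalent upfront license fee?

def years_to_match(upfront_license, annual_subscription):
    """Whole years of subscription billing needed to reach the upfront sum."""
    assert annual_subscription > 0
    years, total = 0, 0.0
    while total < upfront_license:
        total += annual_subscription
        years += 1
    return years

print(years_to_match(300_000, 100_000))  # 3 years to match the upfront fee
```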

    Wrong Audience? SaaS BI vendors have backed themselves into this corner by touting their services as low cost, easy to use, and fast to deploy. They've had a receptive audience among the unwashed masses of small- and medium-sized businesses that have no or minimal IT budget and people and little knowledge of BI. They've also done well selling to department heads at large companies which have clamped down hard on IT budgets. So, SaaS BI vendors have done a good job of selling an information-rich vision to data-hungry business people who have few capital dollars, tight budgets, and minimal understanding of BI.

    Unfortunately, unlike on premises software vendors, SaaS BI vendors have to back up their claims. They can't sell a promise and then vacate the premises. They have to live daily with the expectations that they've created among their customers who demand low-cost, high-speed delivery of robust BI services. So SaaS BI vendors are stuck between a rock and a hard place: it costs more money to deliver SaaS BI solutions than customers seem to be willing to pay for them.

    Market Strategies - The Way Forward

    As I see it, SaaS BI vendors have five options to extricate themselves from this pickle:

    1) Consult. If SaaS BI vendors want to deliver a complete BI solution that solves real business problems, they should shift from selling software services to professional services, and compete head-on with BI consultancies. SaaS BI vendors would have several advantages here:
    -- SaaS BI vendors can not only develop custom solutions, they can run them. And they can do so in a cost-effective (but not inexpensive) way due to the economies of scale of a virtualized, hosted infrastructure.
    -- They can also develop solutions faster than BI consultancies because they can leverage prebuilt software, models, and metrics built for other customers (although veteran consultancies will also have at least prebuilt models and metrics to contribute to a project.)

    I haven't come across any SaaS BI vendor that is taking this approach overtly, although many are doing so in practice. Perhaps the closest is SAP BusinessObjects OnDemand.

    2) Simplify and Shift. Another approach is for SaaS BI vendors to strip out all the custom work from the equation by making the application as simple as possible, shifting the burden of uploading, modeling, and mapping data to the customer. In other words, the SaaS BI vendor does the easy stuff and the customer does the hard stuff.

    The challenge here is making the modeling and mapping tools both easy to use and suitably sophisticated. This is a devilish tradeoff and, in most cases, a SaaS BI vendor will side with simplicity rather than power and flexibility. This means that their customers will likely hit the wall with such tools once they want to do something complex. And if the application is really simple, then it is probably more cost effective to build it in Excel than in the Cloud.

    Of all the SaaS BI vendors, Indicee seems to be following this path most closely.

    3) Package and Configure. Another way to minimize the amount of custom development is to deliver packaged analytic applications that come with canned but configurable data mappings, data models, metrics, and reports. The mappings extract, transform, and load data from a specific source application (e.g., Salesforce.com) to a target data model with predefined dimensions, hierarchies, and metrics. Packaged analytic applications streamline development and accelerate deployment.

    The challenge with packaged analytic applications is that they only work if the customer has the same source application that the package supports and can live with the canned reports, dashboards, and metrics with some modification. Packages typically fall apart when customers want to customize rather than configure the application or they want to extract data from more than one source application to feed the canned data models and reports. And then the implementation becomes a custom consulting engagement. The key to making the packaged approach work is for vendors to build out a sizable portfolio of applications that meet the majority of customers' needs out of the box. This obviously takes time and long-term investment.

    PivotLink and GoodData seem to be following this approach, although GoodData claims it only packages back-end mappings to various Cloud-based applications, such as Salesforce.com, Microsoft Dynamics CRM Online, and SugarCRM. (And most of its packages only source data from a single Cloud-based application.) GoodData reportedly leaves the front-end fully customizable although they offer rich templates that embed metrics and reports for each source application. In essence, GoodData delivers a series of packaged operational reports for various Cloud-based applications.

    4) Go On Premise. Another option is to abandon the Cloud, either in part or in full, and deliver software as an on premise solution. Here, the SaaS BI vendor gets its money upfront and leaves the customer with the responsibility of managing its data and delivering a BI solution. However, if the vendor also maintains a SaaS BI service, it can use the cost differential between its on premise and Cloud-based service to educate customers about the true expense of building and maintaining a BI solution. This might push customers to purchase the SaaS BI service if they don't want the hassle of building a solution themselves.

    The challenge here is that the vendor needs to offer both Cloud services and on-premises software, which is a mixed business model that might be hard to sustain. The vendor still has to maintain a large-scale data center operation while it also has to provide maintenance and support for on-premise software. The vendor will need patient investors to achieve economies of scale to support both models.

    There is a chance that Birst might follow this course so it can better compete head on with what it considers its chief rival, QlikTech.

    5) Offer a Real Software Service. Another approach is to offer a software service, not a solution service, which is what most SaaS BI vendors deliver today. A software service takes a component of a BI solution and makes it available as a service via the Cloud or a hosted environment. We have already seen database, ETL, and data quality vendors put their software in the Cloud and provide subscription-based access to it. This includes companies such as Kognitio (database), SnapLogic (ETL), and Melissa Data (data quality). These vendors don't purport to deliver a complete BI solution, only a piece of a larger puzzle.

    Conclusion

    The only way to make money in the Cloud is to have a lot of customers. The only way to get a lot of customers quickly is to give everyone the same configurable application and avoid custom development work. (A configurable application lets users customize the GUI, create unique workflows, and extend the data model.) In the Cloud, economies of scale are everything. But BI is largely a custom development effort. Unfortunately, most business customers don't realize this and most SaaS BI vendors have done little to disabuse them of the notion. In addition, most SaaS BI vendors have underestimated the challenge of delivering robust BI services that address real business needs and are now struggling to find a sustainable business model that will deliver real profitability.

    Ultimately, the industry will figure out a way to make SaaS BI work for everyone involved. We may have to ratchet down our expectations on both sides of the equation. But there is too much value in running applications remotely in a virtualized environment for SaaS BI not to succeed in the long run.

    Posted July 1, 2011 9:43 AM
    Permalink | 3 Comments |


    I don't think I've ever seen a market consolidate as fast as the analytic platform market.

    By definition, an analytic platform is an integrated hardware and software data management system geared to query processing and analytics that offers dramatically higher price-performance than general purpose systems. After talking with numerous customers of these systems, I am convinced they represent game-changing technology. As such, major database vendors have been tripping over themselves to gain the upper hand in this multi-billion dollar market.

    Rapid Fire Acquisitions. Microsoft made the first move when it purchased Datallegro in July, 2008. But it's taken two years for Microsoft to port the technology to Windows and SQL Server so, ironically, it finds itself trailing the leaders. Last May, SAP acquired Sybase, largely for its mobile technology, but also for its Sybase IQ analytic platform, which has long been the leading column-store database on the market and has done especially well in financial services. And SAP is sparking tremendous interest within its installed base for HANA, an in-memory appliance designed to accelerate query performance of SAP BW and other analytic applications.

    Two months after SAP acquired Sybase, EMC snapped up massively parallel processing (MPP) database vendor Greenplum, and reportedly has done an excellent job executing new deals. Two months later, in September, 2010, IBM purchased the leading pureplay, Netezza, in an all-cash deal worth $1.8 billion that could be a boon to Netezza if IBM can clearly differentiate between its multiple data warehousing offerings and execute well in the field.

    And last month, Hewlett Packard, whose NeoView analytic platform died ingloriously last fall, scooped up Vertica, a market-leading columnar database with many interesting scalability and availability features. And finally, Teradata this week announced it was purchasing Aster Data, an MPP shared-nothing database with rich SQL MapReduce functions that can perform deep analytics on both structured and unstructured data.

    So, in the past nine months, the world's biggest high tech companies purchased five of the leading, pureplay analytic platforms. This rapid pace of consolidation is dizzying!

    Consolidation Drivers

    Fear and Loathing. Part of this consolidation frenzy is driven by fear. Namely, fear of being left out of the market. And perhaps fear of Oracle, whose own analytic platform, Exadata, has gathered significant market momentum, knocking unsuspecting rivals on their heels. Although pricey, Exadata not only fuels game-changing analytic performance, it now also supports transaction applications--a one-stop database engine that competitors may have difficulty derailing (unless Oracle shoots itself in the foot with uncompromising terms for licensing, maintenance, and proofs of concept.)

    Core Competencies. These analytic platform vendors are now carving out market niches where they can outshine the rest. For Oracle, it's a high-performance, hybrid analytic/transaction system; SAP touts its in-memory acceleration (HANA) and a mature columnar database that supports real-time analytics and complex event processing; EMC Greenplum targets complex analytics against petabytes of data; Aster Data focuses on analytic applications in which SQL MapReduce is an advantage; Teradata touts its mixed workload management capabilities and workload-specific analytic appliances; IBM Netezza focuses on simplicity, fast deployments, and quick ROI; Vertica trumpets its scalability, reliability, and availability now that other vendors have added columnar storage and processing capabilities; Microsoft is pitching its PDW along with a series of data mart appliances and a BI appliance.

    Pureplays Looking for Cover. The rush of acquisitions leaves a number of viable pureplays out in the cold. Without a big partner, these vendors will need to clearly articulate their positioning and work hard to gain beachheads within customer accounts. ParAccel, for example, is eyeing Fortune 100 companies with complex analytic requirements, targeting financial services where it says Sybase IQ is easy pickings. Dataupia is seeking cover in companies that have tens to hundreds of petabytes to query and store. Kognitio likes its chances with flexible cloud-based offerings that customers can bring inhouse if desired. InfoBright is targeting the open source MySQL market, while Sand Technology touts its columnar compression, data mart synchronization, and text parsing capabilities. Ingres is pursuing the open source data warehousing market, and its new Vectorwise technology makes it a formidable in-memory analytics processing platform.

    Despite the rapid consolidation of the analytic platforms market, there is still obviously lots of choice left for customers eager to cash in on the benefits of purpose-built analytical machines that deliver dramatically higher price-performance than database management systems of the past. Although the action was fast and furious in 2010, the race has only just begun. So, fasten your seat belts as players jockey for position in the sprint to the finish.

    Posted March 8, 2011 8:20 AM
    Permalink | 3 Comments |