Analysts and journalists appear to be far more interested in cloud analytics, and far more vocal about it, than the people who build and deliver analytic applications and capabilities for business use. The hype seems to outpace the adoption rate. But hype in IT circles is often the predecessor to adoption. So maybe it is time to ask what’s going on here.
The signals from the vendor community are mixed. Many business intelligence (BI) solution providers, some new and some long-time BI players, have entered cloud computing, including Teradata, IBM, Cloud9 Analytics, Cloudscale, In2Clouds, Vertica and others (more about this later). Yet Lucidera, an analytics-as-a-service pioneer, ceased operations in June of 2009 due to financial and funding challenges.
From the user community, the signals are even more uncertain and confusing. A Google search for “cloud analytics case study” brings up only one hit in the first 10 listings that is a true case study of user adoption. The remaining nine hits point to courses, vendor sites and blogs.
Despite the uncertainty, instinct tells me that there is something here. I’ve been around IT for a long time (since 1968) and seen many fads and silver bullets come and go. I know that analytics is certainly not a fad. And I’m confident that cloud computing is here to stay. It makes sense that the two converge. So let’s take a closer look and try to answer some of the questions that surround cloud analytics: What is the technology? What are the applications? What is the upside? What is the downside? Who are the vendors? And realizing that only fools, charlatans and predictive modelers forecast the future: What happens next?
What is the Technology?
Cloud computing is the popular and widely used term for virtualization of computing services. Cloud technology is all about services, and it fits into several categories:
- Software as a Service (SaaS) is an Internet-based model for deployment of software applications. SaaS allows on-demand use of applications without the need to license and install for every computer where the software is used. SaaS is differentiated from earlier client-server and application service provider (ASP) models by multi-tenant architecture where many customers simultaneously use a single instance of the software.
- Data as a Service (DaaS) is a relatively recent term that encompasses four somewhat different kinds of service models.
  - Most common among DaaS models is the “data marketplace” where many different kinds of data are available on a pay-per-use basis. Marketplace data can be combined with internal data to enrich data warehouses and add new dimensions to analytics.
  - Some providers of data marketplace services extend the model to include some data quality, standardization and correlation services. Most of these services revolve around address data and go beyond common address standardization routines to include features such as delivery route optimization.
  - A more recent and emerging DaaS model uses web services to create a developer-centric data hub. In this model, developers upload their own data to be hosted by the DaaS service provider. Developers can then build web services around that data.
  - Finally, the acronym DaaS is also used to mean data warehousing as a service, an architecture in which data and BI applications are hosted in a virtualized environment and deployed using a services model.
- Platform as a Service (PaaS) is an approach to virtualizing the hardware, operating systems, applications frameworks and technology stacks upon which applications are built and deployed. PaaS removes the cost and complexity of buying and managing the hardware and software layers needed to deploy applications. A PaaS environment supports the entire life cycle of building, delivering, operating and supporting web applications.
- Infrastructure as a Service (IaaS) is an architecture of virtualized hardware and operating systems. On the surface it sounds a lot like another name for PaaS, but there are some subtle differences. PaaS provides a developer environment as well as an operations environment but is limited to web applications. IaaS delivers only the operations environment but supports a broader range of applications. IaaS is, in fact, the foundation upon which PaaS environments are constructed.
- Analytics as a Service has not been tagged with the acronym AaaS, perhaps due to the inevitable vocalization that is sure to fail as a marketing buzzword. But analytics as a service is a concept that is gaining attention in the BI community. Analytic services are built upon SaaS and DaaS foundations to create analytic applications and OLAP engines as web-hosted and web-deployed applications.
One way to understand the distinctions among these virtualization concepts is to ask who uses each of the layers.
Collectively, these virtualization architectures form a technology stack as shown in Figure 1. Infrastructure services are the foundation upon which platform services are built. Platform services are, in turn, the basis for software and data services. SaaS and DaaS work together to provide the foundation for analytics as a service. The entire stack constitutes the “cloud” that is of interest when we talk about “BI in the cloud” or “analytics in the cloud.” Note that the top of the stack is unlabeled. I believe that there is more to come – something beyond analytics as a service – but we don’t yet know what that something will be.
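The layering described above can be sketched as a small dependency map. This is only an illustration: the layer names come from the text, while the dictionary layout and the `foundations` helper are assumptions made for the example, not part of any vendor's product.

```python
# Illustrative model of the service stack described above.
# Layer names come from the article; the data layout is an assumption.
STACK = {
    "IaaS": [],                                   # virtualized hardware and operating systems
    "PaaS": ["IaaS"],                             # platform services built on infrastructure
    "SaaS": ["PaaS"],                             # software services built on the platform
    "DaaS": ["PaaS"],                             # data services built on the platform
    "Analytics as a Service": ["SaaS", "DaaS"],   # built on software and data services
}


def foundations(layer, stack=STACK):
    """Return every layer the given layer ultimately rests on, lowest layer first."""
    ordered = []
    for dependency in stack[layer]:
        # Visit the dependency's own foundations before the dependency itself.
        for name in foundations(dependency, stack) + [dependency]:
            if name not in ordered:
                ordered.append(name)
    return ordered


if __name__ == "__main__":
    print(foundations("Analytics as a Service"))
    # prints ['IaaS', 'PaaS', 'SaaS', 'DaaS']
```

Adding whatever eventually sits at the unlabeled top of the stack would be a one-line change: a new key whose list names the layers it builds on.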
What are the Applications?
The range of possibilities for cloud analytics is at least as broad and diverse as that for in-house analytics, with the promise that the cloud may make it easier, faster and cheaper to deploy.
At one end of the spectrum we find SaaS used for analytic applications – mostly concentrated in the sales force analytics space. You might consider these to be “light duty” analytic applications because they don’t typically work with massive amounts of data and processor-intensive calculations. But they’re big in their own way – generally targeting mass deployment to a large and mobile sales organization.
At the opposite end of the continuum we find PaaS-IaaS used to bring cost-effective and scalable processing power for data mining and predictive analytics where compute-intensive processing and large volume data come together. The exceptionally large amounts of data that are not practical to retain in corporate databases become manageable with DaaS. The complex and cycle-demanding processes to build, test and execute predictive models become attainable with PaaS. And the models themselves become repeatable, reusable and shareable with SaaS.
Somewhere between these two extremes we find that DaaS brings opportunity to combine external data with enterprise data to create new kinds of information and to enrich your analytics. Analytics as a service brings virtualization to OLAP engines and specialty analytic applications.
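As a rough illustration of that kind of enrichment, the sketch below joins purchased marketplace attributes onto internal records. Everything in it is hypothetical: the record layouts, the postal-code key, the attribute names and the `enrich` function are assumptions for the example and do not reflect any particular DaaS provider's format or API.

```python
# Hypothetical internal sales records held by the enterprise.
internal_sales = [
    {"customer": "A-100", "postal_code": "98101", "revenue": 1200.0},
    {"customer": "B-200", "postal_code": "60601", "revenue": 450.0},
    {"customer": "C-300", "postal_code": "00000", "revenue": 75.0},
]

# Hypothetical marketplace data, as it might look once obtained from a DaaS provider.
marketplace_demographics = {
    "98101": {"median_income": 70000, "region": "West"},
    "60601": {"median_income": 65000, "region": "Midwest"},
}


def enrich(records, external, key):
    """Left-join external attributes onto internal records.

    Records with no match in the external data keep every external
    attribute, set to None, so downstream analysis sees a uniform shape.
    """
    attribute_names = sorted({name for attrs in external.values() for name in attrs})
    enriched = []
    for record in records:
        extra = external.get(record[key], {})
        merged = dict(record)  # copy, so the internal data is left untouched
        for name in attribute_names:
            merged[name] = extra.get(name)
        enriched.append(merged)
    return enriched


if __name__ == "__main__":
    for row in enrich(internal_sales, marketplace_demographics, "postal_code"):
        print(row)
```

The left-join shape is a deliberate choice here: internal records are never dropped just because the external source has no match for them.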
The analytic application possibilities are limited only by your imagination. Somewhere in the cloud you can find answers to very large data volumes, massive processor needs, high volume of users, high volume of transactions, highly mobile users, needs for data enrichment, needs for data standardization and much more.
What is the Upside?
As happens with any emerging technology, the claims of cloud computing benefits are abundant, and they cover the bases. You can have it better, faster and cheaper; and you can have it all now.
It isn’t my intent to be cynical or to sound skeptical. I’ve listed many of the common claims below. Each of them is possible with cloud computing, but none are guaranteed. It falls to you to sort out the IaaS-PaaS-SaaS-DaaS questions, to choose the right technologies and to use them well. Then you may realize promises of:
- Scalability with on-demand, ready-to-use platforms that adapt readily to growth.
- Elasticity with platforms that expand at times of peak load and contract when the workload shrinks.
- Agility of technical infrastructure that can be reconfigured and realigned as quickly and frequently as your needs change.
- Agility of projects with ready-to-use platforms, services and data.
- Infrastructure cost reduction with multi-tenant sharing of infrastructure expense, pay-per-use, and pay-as-you-go models.
- Device and location independence making applications web accessible and suited for mobile devices and mobile workers.
- Reliability through multiplicity, redundancy and fail-over of cloud infrastructure that reduces disaster recovery and business resumption complexity.
- Operational cost reduction especially for heavy-lifting applications such as very large data warehouses.
- Productivity gains as staff time and resources concentrate on business results instead of infrastructure management.
- Faster time-to-payback for analytic applications with the pay-as-you-go model.
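The elasticity and pay-per-use claims in the list above come down to simple arithmetic, sketched here. The demand profile and both rates are invented for illustration and are not drawn from any vendor's pricing; the point is only that the outcome depends on the shape of the workload.

```python
# All numbers below are hypothetical and chosen only to illustrate the arithmetic.
SPIKY_DEMAND = [2] * 20 + [10] * 4   # capacity units needed per hour: 20 quiet hours, 4 peak hours
FLAT_DEMAND = [10] * 24              # a workload that runs at peak all day
FIXED_RATE = 1.0                     # assumed cost per unit-hour of owned capacity
PAY_PER_USE_RATE = 1.5               # assumed (higher) cost per unit-hour of elastic capacity


def fixed_capacity_cost(hourly_demand, cost_per_unit_hour):
    """Cost when capacity is sized for the peak hour and held for every hour."""
    return max(hourly_demand) * len(hourly_demand) * cost_per_unit_hour


def pay_per_use_cost(hourly_demand, cost_per_unit_hour):
    """Cost when capacity expands and contracts with demand, hour by hour."""
    return sum(hourly_demand) * cost_per_unit_hour


if __name__ == "__main__":
    print(fixed_capacity_cost(SPIKY_DEMAND, FIXED_RATE))      # 240.0
    print(pay_per_use_cost(SPIKY_DEMAND, PAY_PER_USE_RATE))   # 120.0
    print(fixed_capacity_cost(FLAT_DEMAND, FIXED_RATE))       # 240.0
    print(pay_per_use_cost(FLAT_DEMAND, PAY_PER_USE_RATE))    # 360.0
```

Under these assumed numbers the spiky workload costs half as much on a pay-per-use basis, while the flat workload costs more, which is consistent with the caution that each benefit is possible but none is guaranteed.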
What is the Downside?
That is quite a list of benefits. If the potential is so great and the implementation is so easy, then why the slow adoption rate? Several recent surveys consistently show that the reluctance arises primarily from five areas of IT and infrastructure management concerns – each expressing a dimension of risk:
- Security tops the list of concerns. Are my business activities and performance metrics exposed on the Internet? Are they intrusion-proof and hacker-proof?
- Privacy is closely related to security and raises similar concerns. Is proprietary data at risk of becoming publicly exposed information? What about anonymity of my customers, suppliers and business partners?
- Compliance considerations bring yet more questions of risk. From healthcare providers to retailers who process credit cards, virtually every industry is subject to regulatory constraints that include requirements of data handling. Does the cloud increase my exposure? Does it add to the complexities and uncertainties of being in compliance?
- Control of the technical infrastructure is a long-standing responsibility of IT organizations. Is control diminished with cloud computing? Can the responsibility be fulfilled if the infrastructure is remote and virtual?
- Governance extends beyond control of infrastructure to questions about the consistency, cohesion and sustainability of applications. Can the IT infrastructure be bypassed? Will business units deploy their own applications in the cloud without enterprise-level integration? How will we manage an application portfolio? Are ITIL and COBIT cloud compatible? Will data integration suffer? Is the integration work of the past two decades – ERP, legacy system renewal, data architecture, data sharing, data warehousing, etc. – now at risk?
These are all valid questions, fair questions and questions that IT management should be asking. From these questions and concerns arises the concept of the “private cloud.” A private cloud is a distinct virtual space that comprises private and protected services that are deployed within a shared or public cloud. A virtual private cloud (VPC) provides secure services in much the same way that a virtual private network (VPN) provides secure connectivity. Is VPC a complete answer to the concerns? Certainly not, but it is a sign that the cloud is evolving and maturing. VPC is but a few months old and only the beginning.
Who are the Vendors?
Cloud computing is a big space that is getting lots of vendor attention. I’m sure to miss some, but my intent is to provide examples and not attempt an exhaustive list. I’ll only give brief mention to IaaS and PaaS services, as they are not cloud analytics. They are, rather, the foundation upon which cloud analytics are built. The big players among the IaaS-PaaS providers include Amazon, Google, EMC, Oracle and IBM.
The SaaS list is quite large and encompasses many different kinds of software and applications including financial management, risk management, CRM, sales automation, Internet marketing, exchange hosting, project management and much more. While some of these applications contain analytic components, none are the pure-play analytic applications that are the essence of analytics in the cloud.
DaaS brings us much closer to the target of cloud analytics. The DaaS providers offer services that enable business analytics. Noteworthy among DaaS providers are:
- Caspio Bridge, pioneering the developer hub model of data as a service.
- Jigsaw, providing a data marketplace for company and contact information.
- Kognitio, with the data warehousing as a service model.
- PostcodeAnywhere, offering address data services in a marketplace format.
- StrikeIron, providing a variety of data marketplace services.
- TheWebService, leading the charge of innovation in data hub services.
Ultimately, we’re interested in analytics as a service – the service providers who get to the heart of analytics in the cloud. These vendors range from long-time BI vendors with recent addition of cloud analytics to relatively new companies where cloud analytics is the mainstream of their business. Analytics as a service can be found at:
- Birst, offering a robust suite of analytic applications in the cloud including marketing, sales, operations, financial and HR analytics.
- Cloud9 Analytics, offering packaged sales analytic applications as a service.
- Cloudscale, providing the capability to push analytic models to the cloud for active monitoring of real-time data streams.
- Good Data, with CRM analytics, reporting and dashboards deployed in the cloud.
- In2Clouds, bringing data mining and predictive modeling to the cloud for sales, marketing, customer and risk analytics.
- OCO, with packaged analytic applications designed for rapid deployment in a virtual environment with a web-based interface.
- Panorama, whose PowerApps product brings an OLAP engine to the cloud.
- PivotLink, with a cloud-based pay-as-you-go model for reporting, data analysis, dashboards and collaborative analytics.
- QlikTech, a leading data visualization company that has recently established partnerships to deploy its QlikView product in the cloud.
- Sonoa, an established provider of cloud computing gateways that has recently entered the field of analytics and decision support with Sonoa Analytics.
- Teradata, with Teradata Enterprise Analytics Cloud virtualizing Teradata Express with Amazon’s PaaS environment.
- Vertica, deploying its columnar analytic database in several cloud computing environments including Amazon and Sun.
I have undoubtedly missed someone – not a reflection on the limits of their technology, but the limits of my knowledge. The size of this list, however, is a strong indicator of a vital and growing technology space. Lucidera may be gone, but the concept of SaaS and DaaS analytics lives on.
What Happens Next?
It is only in the very recent past that the term grid computing was new and interesting. In only a few short years, it has faded, giving way to virtualization and cloud computing – ideas that germinated with and evolved from grid computing.
Today the cloud is real, and it will be adopted as part of BI solutions. We’ll see hybrid implementations that integrate in-house, private cloud and public cloud solutions to balance the concerns of cost, scalability, security, privacy and local control. We’ll get all of the standard fare that comes with adoption of emerging technologies: case studies proclaiming success, doom-and-gloom stories decrying the technology for failing to live up to its promises, analysts describing best practices, and consultants cataloging mistakes to avoid. And somewhere amid all of the noise, we’ll actually deliver some value to business.
But adoption is only the beginning. The service-oriented architecture (SOA) foundation and the services nature of cloud computing bring new opportunity for integration and innovation. When business analytics, agile development methods, collaboration technology, social media, mash-ups, text analytics, data mining and predictive analytics converge, they will change the nature of business information and push business management to new horizons.
SOURCE: What’s Up with Cloud Analytics?
Recent articles by Dave Wells