Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.

About the author

Wayne has been a thought leader in the business intelligence field since the early 1990s. He has conducted numerous research studies and is a noted speaker, blogger, and consultant. He is the author of two widely read books: Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2005, 2010) and The Secrets of Analytical Leaders: Insights from Information Insiders (2012).

Wayne is currently director of BI Leadership Research, an education and research service run by TechTarget that provides objective, vendor-neutral content to business intelligence (BI) professionals worldwide. Wayne's consulting company, BI Leader Consulting, provides strategic planning, architectural reviews, internal workshops, and long-term mentoring to both user and vendor organizations. For many years, Wayne served as director of education and research at The Data Warehousing Institute (TDWI), where he oversaw the company's content and training programs and chaired its BI Executive Summit. He can be reached by email at weckerson@techtarget.com.

September 2011 Archives

I attended O'Reilly's Strata Summit and Conference on big data this week in New York City. It was a real eye opener. A culture shock, really.

I suspect many of you have experienced the enlightenment that comes upon returning home from spending several days or weeks in a foreign country. Armed with a new cultural context, you see your own beliefs, traditions, and cultural mores through a fresh lens. Often, we learn things about ourselves that we didn't know because the truth was too obvious to see.

That was the experience I had attending Strata this year and Hadoop World a year ago. Last year, after spending two days at Hadoop World surrounded by developers, I discovered how much of a data guy I am. This year, after two days at Strata surrounded by bright, young, hip open source people, I realize how much of an old, corporate guy I am. Maybe it's not too late to change??

If all you have is a hammer...

Two of the more enlightening presentations I attended were delivered by John Rauser, a software engineer at Amazon.com.

In his first presentation, John made a good case for using Hadoop and cloud services. (It was a great pitch for Amazon Web Services.) He argued that Hadoop is a godsend when you need to scale an application beyond one computer. His point is that Hadoop makes distributed computing easy. He also argued that Hadoop isn't just for big data but for any volume of data whose peak processing demands require a clustered environment. Finally, he explained the economic upside of using a cloud-based instance of Hadoop: you can dynamically provision for current workloads rather than peak usage.
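To make the "distributed computing made easy" point concrete, here is a minimal sketch in the style of Hadoop Streaming, which lets you write the map and reduce steps as ordinary scripts that read stdin and write stdout while Hadoop handles splitting the input, shuffling by key, and running the pieces across the cluster. The sales-by-region example and record layout are my own illustration, not something from John's talk.

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming-style job (illustrative only): total sales by region.
# Hadoop supplies the distribution: it splits the input across nodes, runs the
# mapper on each split, shuffles and sorts by key, then feeds the reducer.
import sys

def mapper(lines):
    # Input records look like "region,product,amount"; emit "region<TAB>amount".
    for line in lines:
        region, _product, amount = line.rstrip("\n").split(",")
        print(f"{region}\t{amount}")

def reducer(lines):
    # The shuffle delivers lines sorted by key, so one pass totals each region.
    current, total = None, 0.0
    for line in lines:
        region, amount = line.rstrip("\n").split("\t")
        if region != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = region, 0.0
        total += float(amount)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    # Run as "sales_mr.py map" for the map phase or "sales_mr.py reduce" for reduce.
    mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)
```

You can test the same script on a laptop with cat sales.csv | python sales_mr.py map | sort | python sales_mr.py reduce, then submit it unchanged to a cluster through the Hadoop Streaming jar. That portability is a large part of the "easy" argument.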

John is incredibly bright and articulate and clearly smitten with Hadoop and MapReduce. He and his fellow engineers at Amazon are using Hadoop/MapReduce to build (or rebuild) a lot of data-intensive applications that need greater horsepower at low cost. Although it's clear they are getting tremendous value out of Hadoop and MapReduce, I wonder how applicable Amazon's experience is to other companies.

John made a couple of assumptions while encouraging the audience to consider Hadoop as a distributed computing platform. His first assumption is that you are going to code applications rather than implement a software package. You don't need Hadoop if you buy an enterprise business application, most of which are designed to run in clustered environments already. Most companies I know are trying to reduce their dependence on custom applications because of the overhead of maintaining programmers and code. Second, John assumes that you have talented software programmers who have the skills to learn and build Hadoop/MapReduce applications. And he also assumes you have administrators who know how to operate and maintain a Hadoop cluster, which is still a relatively immature data processing environment. (Of course, this is a good argument for using AWS Elastic MapReduce services.)

So, the corollary to John's argument is that if your company likes to code software, employs lots of talented programmers, and either has an operations team versed in Hadoop or can afford to run Hadoop in the Cloud, then consider Hadoop when you need to scale an application cost effectively.

Dissing the DW

John was also pretty dismissive of data warehouses and the IT folks who administer them, but that's because he's a programmer. As a development platform, data warehouses aren't terrific, I'll admit. But then again, that's not what they are designed for. They are data repositories that support batch-oriented, SQL-based reporting applications and feed other data environments, including operational data stores that handle more run-the-business applications. Throwing the baby out with the bath water is a tad irresponsible. But I'll agree, we need to make data warehouses more analyst and developer friendly.

Data Scientist Characteristics

John's second presentation covered the qualities that make up a good data scientist, like himself. I thought his list was rather good. A data scientist needs to be 1) a good software engineer (i.e., programmer), 2) well versed in statistics and applied math, 3) a critical thinker who takes nothing at face value, and 4) someone who is perpetually curious and eager to learn new things.

This list is quite similar to the one I composed to define the characteristics of a good analytical modeler. During John's presentation, I tweeted that a data scientist is "an old-school SAS analyst who knows how to program." Most old-school SAS analysts are statisticians who are adept at extracting and preparing large data sets from a variety of internal and external sources, building models, and programming the output in SQL or C so the model can be inserted into a corporate database and run against all new and existing records. In essence, SAS analysts have been forced to augment their statistical knowledge with computer science skills to create and deploy their models.
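As a rough sketch of that deployment step, here is one way the scoring logic of a fitted model can be pushed into the database as SQL so it runs against every record. The coefficients, table, and column names below are hypothetical; in practice the model would come out of SAS, R, or another modeling tool.

```python
# Illustrative sketch: translate a fitted logistic regression into a SQL scoring
# expression so the model runs in the database against all records.
# Coefficients, table name, and column names are hypothetical.
COEFFICIENTS = {"intercept": -2.3, "recency_days": -0.01, "num_purchases": 0.25}

def score_sql(table="customers", score_col="churn_score"):
    terms = [str(COEFFICIENTS["intercept"])] + [
        f"{w} * {col}" for col, w in COEFFICIENTS.items() if col != "intercept"
    ]
    linear = " + ".join(terms)
    # Logistic link: probability = 1 / (1 + exp(-xb))
    return f"UPDATE {table} SET {score_col} = 1.0 / (1.0 + EXP(-({linear})))"

print(score_sql())
# UPDATE customers SET churn_score = 1.0 / (1.0 + EXP(-(-2.3 + -0.01 * recency_days + 0.25 * num_purchases)))
```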

However, David Smith took exception to my data scientist analogy. David is VP of marketing at Revolution Analytics, which sells a distribution of R, the popular open source programming language for creating analytical models. David doesn't think SAS analysts have the computer science or social interaction (ouch!) skills that data scientists need. He likened them to mainframe developers (not surprising, given that David is immersed in the open source movement). David thinks the best data scientists have formal training or experience in both statistics and computer science, folks like, well, R programmers!

Visualizing Models

David emphasized that the goal of a data scientist is not to produce a model or report but to publish it so others can learn from it. To that end, data scientists need to know how to visualize quantitative data. (David then explained how R is terrific for visualizing data.) I wholeheartedly agree. But additionally, a data scientist needs to know how to deploy a model in an operational process where it can augment or automate decision making at the point of transaction or interaction. Embedded models that automate decisions should be the holy grail of a data scientist.

On the whole, my time at Strata was well spent. It was intellectually invigorating and culturally illuminating. In many ways, Strata is the TED of data conferences. I'll be back again!


Posted September 22, 2011 1:28 PM


For all the talk about analytics these days, there has been little mention of one of the most powerful techniques for analyzing data: location intelligence.

It's been said that 80% of all transactions embed a location. A sale happens in a store; a call connects people in two places; a deposit happens in a branch; and so on. When we plot objects on a map, including business transactions and metrics, we can see critical patterns with a quick glance. And if we explore relationships among spatial objects imbued with business data, we can analyze data in novel ways that help us make smarter decisions more quickly.

For instance, a location intelligence system might enable a retail analyst working on a marketing campaign to identify the number of high-income families with children who live within a 15-minute drive of a store. An insurance company can assess its risk exposure from policy holders who live in a flood plain or within the path of a projected hurricane. A sales manager can visually track the performance of sales territories by products, channels, and other dimensions.

Geographic Information Systems. Location intelligence is not new. It originated with cartographers and mapmakers in the 19th and 20th centuries and went digital in the 1980s. Companies such as Esri, MapInfo, and Intergraph offer geographic information systems (GIS), which are designed to capture, store, manipulate, analyze, manage, and present all types of geographically referenced data. If this sounds similar to business intelligence, it is.

Unfortunately, GIS have evolved independently from BI systems. Even though both groups analyze and visualize data to help business users make smarter decisions, there has been little cross-pollination between the groups and little, if any, data exchange between systems. This is unfortunate since GIS analysts need business data to provide context to spatial objects they define, and BI users benefit tremendously from spatial views of business data.

Convergence of GIS and BI

However, many people now recognize the value of converging GIS and BI systems. This is partly due to the rise in popularity of Google Maps, Google Earth, global positioning systems, and spatially-aware mobile applications that leverage location as a key enabling feature. These consumer applications are cultivating a new generation of users who expect spatial data to be a key component of any information delivery system. And commercial organizations are jumping on board, led by industries that have been early adopters of GIS, including utilities, public safety, oil and gas, transportation, insurance, government, and retail.

The range of spatially-enabled BI applications is endless, and the applications are powerful. "When you put location intelligence in front of someone who has never seen it before, it's like a Bic lighter to a caveman," says Steve Trammel, head of corporate alliances and IT marketing at ESRI.

Imagine this: an operations manager at an oil refinery will soon be able to walk around a facility and view alerts based on his proximity to under-performing processing units. His mobile device shows a map that depicts the operating performance of all processing units based on his current location. This enables him to view and troubleshoot problems first-hand rather than being tethered to a remote control room. (See figure 1.)

Figure 1. Mobile Location Intelligence.

A spatially-aware mobile BI application configured by Transpara for an oil refinery in Europe. Transpara is a mobile BI vendor that recently announced integration with Google Maps.

GIS Features. Unlike BI systems, GIS specialize in storing and manipulating spatial data, which consists of points, lines, and polygons. A line connects two points, and a polygon is a closed shape defined by three or more points. Each point or object can be imbued with various properties or rules that govern its behavior. For example, a road (i.e., a line) has a surface condition and a speed limit, and the only points that can be located in the middle of the road are traffic lights. In many ways, a GIS is like computer-aided design (CAD) software for spatial applications.
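Here is a quick sketch of those three primitives, using the open source Shapely library (my choice purely for illustration; GIS products have their own spatial engines). Business properties, such as a road's speed limit, ride along as ordinary attributes tied to the geometry.

```python
# Points, lines, and polygons, with illustrative business properties attached.
from shapely.geometry import Point, LineString, Polygon

store = Point(-122.41, 37.77)                                 # a single location
road = LineString([(-122.41, 37.77), (-122.39, 37.79)])       # connects two points
flood_plain = Polygon([(-122.42, 37.76), (-122.40, 37.76),
                       (-122.40, 37.78), (-122.42, 37.78)])   # three or more points, closed

road_attributes = {"surface": "asphalt", "speed_limit_mph": 35}  # properties carried with the line

print(flood_plain.contains(store))   # is the store inside the flood plain? -> True here
print(road.length)                   # length in coordinate units (degrees in this example)
```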

Most spatial data is represented as a series of X/Y coordinates that can be plotted on a map. The most common coordinate system is latitude and longitude, which enables mapmakers to plot objects on geographical maps. But GIS developers can create maps of just about anything, from the inside of a submarine or office building to a geothermal well or cityscape. Spatial engines can then run complex calculations against coordinate data to determine relationships among spatial objects, such as the driving distance between two cities or the shadows that a proposed skyscraper would cast on surrounding buildings.
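For instance, the great-circle distance between two latitude/longitude points comes down to the haversine formula, the kind of calculation a spatial engine performs constantly (alongside far more elaborate ones, such as drive-time analysis). A minimal sketch:

```python
# Great-circle (haversine) distance between two lat/lon points, in kilometers.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))   # Earth's mean radius is about 6,371 km

# Boston to New York City, roughly 300 km as the crow flies
print(round(haversine_km(42.36, -71.06, 40.71, -74.01)))
```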

Approaches for Integrating GIS and BI

There are two general options for integrating GIS and BI systems: 1) integrate business data within GIS systems or 2) integrate GIS functionality within BI systems. GIS administrators already do the former when creating maps, but their applications are very specialized. Moreover, most companies only purchase a handful of GIS licenses, which are expensive, and the tools are too complex for general business users.

The more promising approach, then, is to integrate GIS functionality into BI tools, which have a broader audience. There are several ways to do this, and they vary greatly in the level of GIS functionality they support.

  • BI Map Templates. Most BI tools come with several standard map images, such as a global view with country boundaries or a North American view with state boundaries. A report designer can place a map in a report, link it to a standard "geography" dimension in the data (e.g., a "state" field), and assign a metric to govern the shading of boundaries. For example, a report might contain a color-coded map of the U.S. that shows sales by state. This is the most elementary form of GIS-BI integration since these out-of-the-box map templates are not interactive.
  • BI Mashups. An increasingly popular approach is to integrate a BI tool with a GIS Web service, such as those provided by Google or Microsoft Bing. Here the BI tool integrates static and interactive maps via a Web service (e.g., REST API) or a software development toolkit. An AJAX or other Web client renders the map and any geocoded KPIs or objects in the data. (Geocoding assigns a latitude and longitude to a data object.) End users can then pan and zoom on the maps as well as hover over map features to view their properties and click to view underlying data. (See figure 1 above.) This approach requires a developer to write JavaScript or other code.
  • GIS Mashups. GIS mashups are similar to BI mashups but go a step further because they integrate with a full-featured GIS server, either on premise or via a Web service. Here, a BI tool embeds a special GIS connector that integrates with a mapping server and gives the report developer a point-and-click interface for integrating interactive maps with reports and dashboards. In this approach, the end user gains additional functionality, such as the ability to interact with custom maps created by inhouse GIS specialists and to "lasso" features on a map and use those selections to query or filter other objects in a report or dashboard. Some vendors, such as Information Builders and MicroStrategy, built custom interfaces to GIS products, while other vendors, such as IBM Cognos and SAP BusinessObjects, embed third-party software connectors (e.g., SpotOn and APOS, respectively).
  • GIS-enabled Databases. Although GIS function like object-relational databases, they store data in relational format. Thus, there is no reason companies can't store spatial data in a data warehouse or data mart and make it available to all users and applications that need it. Many relational databases, such as Oracle, IBM DB2, Netezza, and Teradata, support spatial data types and SQL extensions for querying spatial data. Here, both BI systems and GIS can access the same spatial data set, providing economies of scale, greater data consistency, and broader adoption of location intelligence functionality. (A sketch of what such a spatial query might look like follows this list.) However, you will still need a map server for spatial presentation.
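To make the last option concrete, here is a hedged sketch of the kind of question a spatially enabled warehouse can answer. The table and column names are hypothetical, and the OGC-style ST_Distance function is shown for illustration only; the exact spatial SQL syntax differs across Oracle, DB2, Netezza, and Teradata.

```python
# Illustrative only: count high-income households with children near a store,
# the retail example mentioned earlier. A simple radius stands in for drive time.
# Table names, columns, and the ST_Distance function are assumptions; spatial
# SQL dialects vary by database.

def households_near_store_sql(store_id: int, radius_meters: int = 5000) -> str:
    return f"""
        SELECT COUNT(*) AS households_in_range
        FROM   households h, stores s
        WHERE  s.store_id = {store_id}
          AND  h.income_band = 'high'
          AND  h.has_children = 1
          AND  ST_Distance(h.geom, s.geom) <= {radius_meters}
    """

print(households_near_store_sql(store_id=42))
# Both the BI tool and the GIS could issue this query against the same spatial
# tables, which is the economy-of-scale argument for a GIS-enabled warehouse.
```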

Recommendations

As visual analysis in all shapes and forms begins to permeate the world of BI, it's important to begin thinking about how to augment your reports and dashboards with location intelligence. Here are a few recommendations to get you started:

  1. Identify BI applications where location intelligence could accelerate user consumption of information and enhance their understanding of underlying trends and patterns.
  2. Explore GIS capabilities of your BI and data warehousing vendor to see if they can support the types of spatial applications you have in mind.
  3. Identify GIS applications that already exist in your organization and get to know the people who run them.
  4. Investigate Web-based mapping services from GIS vendors as well as Google and Bing since this obviates the need for an inhouse GIS.
  5. Start simply by using existing geography fields in your data (e.g., state, county, and zip) to shade the respective boundaries in a baseline map based on aggregated metric data.
  6. Combine spatial and business data in a single location, preferably your data warehouse, so you can deliver spatially-enabled insights to all business users.
  7. Geocode business data, including customer records, metrics, and other objects, that you might want to display on a map (see the sketch below).
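For that last step, a simple batch job that passes addresses through a geocoding service is often enough to get started. The sketch below uses the open source geopy library with OpenStreetMap's Nominatim service as one illustrative option; a production deployment would typically use a licensed geocoding API and respect the service's usage limits.

```python
# Illustrative sketch: geocode customer records so they can be plotted on a map.
# Uses geopy's Nominatim (OpenStreetMap) geocoder as one example service; this
# makes live calls to a public service, so mind its usage policy.
from geopy.geocoders import Nominatim

geocoder = Nominatim(user_agent="bi-location-demo")  # identify your application

customers = [
    {"id": 1, "address": "1600 Pennsylvania Ave NW, Washington, DC"},
    {"id": 2, "address": "350 Fifth Ave, New York, NY"},
]

for record in customers:
    location = geocoder.geocode(record["address"])
    if location:  # attach latitude/longitude to the business record
        record["lat"], record["lon"] = location.latitude, location.longitude

print(customers)
```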

Location intelligence is not new but it should be a key element in any analytics strategy. Adding location intelligence to BI applications not only makes them visually rich, but surfaces patterns and trends not easily discerned in tables and charts.


Posted September 19, 2011 2:51 PM

The first part of this blog examines the pros and cons of centralized, decentralized, and federated BI architectures using Intuit as a case study. The second part looks at organizational models and how they influence BI architectures using Harley-Davidson and Dell as case studies. The third part shows different ways that corporate and divisional groups can divide responsibility for BI architectural components in the context of various organizational models.

PART I

The Federated Option

"Business needs trump architectural purity every time." Douglas Hackney

For years, business intelligence (BI) experts have diagrammed data warehousing architectures in which a single set of enterprise data is extracted and transformed from various operational systems and loaded into a staging area or data warehouse that feeds downstream data marts. This classic architecture is designed to deliver a single version of high-quality enterprise data while tailoring information views to individual departments and workgroups. It also ensures an efficient and scalable method for managing the flow of operational data to support reporting and analysis applications.

That's the theory, anyway. Too bad reality doesn't work that way.

Federated Reality. Most organizations have a hodgepodge of insight systems and non-aligned reporting and analysis applications that make achieving the veritable single version of truth a quest that only Don Quixote could appreciate. Let's face it, our BI architectures are a mess. But this is largely because our organizations are in a constant state of flux. If an ambitious, tireless, and lucky BI team achieved architectural nirvana, their edifice wouldn't last more than a few months. Mergers, acquisitions, reorganizations, new regulations, new competitors, and new executives with new visions and strategies all conspire to undermine a BI manager's best laid architectural blueprints.

Thus, the BI reality in which most of us live is federated. We have lots of overlapping and competing sources of information and insight, most of which have valid reasons for existence. So rather than trying to roll a boulder up a hill, it's best to accept a federated reality and learn how to manage it. And in the process, you'll undoubtedly discover that federation--in most of its incarnations--provides the best way to balance the need for enterprise standards with local control and deliver both efficient and effective BI solutions.

Understanding the Options - Evolution of Intuit's BI Architecture

Federation takes the middle ground between centralized and decentralized BI approaches. Ideally, it provides the best of both worlds, while minimizing the downsides. Table 1 describes the pros and cons of centralized and decentralized approaches to BI. Not surprisingly, the strengths of one are the weaknesses of the other, and vice versa.

Table 1. Centralized versus Decentralized BI

Founded in 1983, Intuit is a multi-billion dollar software maker of popular financial applications, including Quicken, TurboTax, and QuickBooks. Intuit's BI evolution helps to illuminate the major characteristics of the two approaches and explains how they ended up with a federated BI architecture. (See figure 1.)

Decentralized Approach. In the late 1990s, after a series of acquisitions, Intuit had a decentralized BI environment. Essentially, every division--most of which had been independent companies--handled its own BI requirements without corporate involvement. Divisions with BI experts built their own applications to meet local needs. Deployments were generally quick, affordable, and highly tailored to end-user requirements. As is true with any decentralized environment, the level of BI expertise in each division varied significantly. Some divisions had advanced BI implementations while others had virtually nothing. In addition, there was considerable redundancy in BI software, staff, systems, and applications across the business units, and no two systems defined key metrics and dimensions in the same way.

Centralized Approach. To obtain promised financial synergies from its acquisitions and create badly needed enterprise views of information, Intuit in 2000 implemented a classic, top-down, centralized data warehousing architecture. Here, a newly formed corporate BI team built and managed an enterprise data warehouse (EDW) to house all corporate and divisional data. Divisions rewrote their reports to run against the unified metrics and dimensions in the data warehouse. The goal was to reduce costs through economies of scale, standardize data definitions, and deliver enterprise views of data.

Although Intuit realized these gains, the corporate BI team quickly became overwhelmed with project requests, much to the chagrin of the divisions, which began to grumble about the backlog. When the divisions began to look elsewhere to meet their information requirements, the corporate BI team recognized it needed to reinvent itself and its BI architecture.

Figure 1. Intuit's BI Architectural Evolution
Courtesy Intuit

Federated Approach. To forestall a grassroots insurrection, Intuit adopted a federated architecture that blends the best of top-down and bottom-up approaches. To uncork its project backlog, the corporate BI team decided to open up its data warehousing environment to the divisions. It gave each division a dedicated partition in the EDW to develop its own data marts and reports. It also taught the divisions how to use the corporate ETL tool and allowed them to blend local data with corporate data inside their EDW partitions. For divisions that had limited or no BI expertise, the corporate BI team continued to build custom data marts and reports as before. (See figure 2.)

Figure 2. Intuit's Federated Architecture

Benefits. The federated approach alleviated Intuit's project backlog and gave divisions more control over their BI destiny. They could now develop custom-tailored applications and reports without having to wait in line. For its part, the corporate BI team achieved economies of scale by maintaining a single platform to run all BI activity and maintained tight control over enterprise data elements. Today, the corporate BI team no longer tries to be the only source of corporate data. Rather, it focuses on capturing and managing data that is shared between two or more divisions and on developing BI applications when asked.

Table 2. Federated BI

Challenges. The main challenge with federation is keeping everyone who is involved in the BI program, either formally or informally, on the same page. This requires a lot of communication and consensus building, or "social architecture" as Intuit calls it. The corporate BI team needs to relinquish some of its control back to the business units--which is a scary thing for a corporate group to do--while business units need to adhere to basic operating standards when using the enterprise platform. Ultimately, to succeed, federation requires a robust BI Center of Excellence with a well-defined charter, a clear roadmap, strong business-led governance, an active education program, and frequent inhouse meetings and conferences. (See Table 2.)

Ultimately, managing a federated environment is like walking a tightrope. You have to balance yourself to keep from swinging too far to the left or right and falling off the rope. To succeed, you have to make continuous adjustments, ensuring that everyone on the extended BI team is moving in the same direction. Without sufficient communication and encouragement, a federated approach courts catastrophe.


PART II
Organizational Models That Impact BI

"Architecture must conform with corporate culture and evolve with the prevailing winds."

In Part I of this series, we discussed the strengths and weaknesses of different architectural options, including federation. And we used Intuit as an example of an organization that evolved through decentralized and centralized architectural models until it settled on federation as a means to marry the best of both worlds.

It's important to recognize that the real problem that federation addresses is not architectural; it's organizational. There is natural tension between local and central organizing forces within a company. Every organization needs to give local groups some degree of autonomy to serve customers effectively. Yet it also needs a central, guiding force to achieve economies of scale (efficiency), present a unified face to the customer (effectiveness), and deliver enterprise views of corporate performance. BI, as an internal service, needs to reflect a company's organizational structure. But knowing where to draw the line architecturally is challenging, especially as a company's business and organizational strategy evolves.

Organizational Models

At a high level, there are three basic organizational models that define the relationship between corporate and divisional groups: Conglomerate, Cooperative, and Centralized. Although this article treats each as a unique entity, the models represent points along a spectrum of potential organizational structures. The key to each model's identity and defining characteristics, however, stems from a company's product strategy. (See table 3.)

Conglomerate Model. A conglomerate has a highly diversified product portfolio that spans multiple industries. Here, each division sells radically different products to very different customers. General Electric, for example, is a conglomerate. It sells everything from financial services to household appliances to wind turbines. A conglomerate typically gives business unit executives complete autonomy to run their operations as they see fit using their own money and people. There is little shared data, and each business unit has its own BI team and technology and exchanges little, if any, data with other business units.

Table 3. Organizational Models

Centralized Model. The centralized model falls at the other end of the organizational spectrum. Here, an organization sells one type of product to anyone, everywhere. As a result, corporate plays a strong hand in defining product and marketing strategies while business units primarily execute sales in their regions. In a centralized model, IT and BI are shared services that run centrally at corporate. There is one BI team and enterprise data warehouse that supports all business units and users.

Dell, for example, sells personal computers and servers on a global basis. Although its regional business units had significant autonomy to meet customer needs during the company's rapid expansion, Dell now has retrenched. Without dampening its entrepreneurial zeal, Dell has strengthened its centralized shared services, including BI, to streamline redundancies, reduce overhead costs, improve cross-selling opportunities, and deliver more timely, accurate, and comprehensive views of enterprise performance to top executives.

Cooperative Model. The Cooperative model--which is the most pervasive of the three--falls somewhere between the Conglomerate and Centralized models on the organizational spectrum. Here, business units sell similar, but distinct, products to a broad set of overlapping customers. Consequently, there are significant cross-selling opportunities, which require business units to share data and work cooperatively for the greater good of the organization. For example, Intuit, with its various financial software products, can increase revenues by cross-selling its various consumer and business products across divisions. Most financial services firms, which offer banking, insurance, and brokerage products, also employ a Cooperative model.

Although most divisions in a Cooperative model maintain BI teams, corporate headquarters also fields a sizable corporate BI group. Given the distribution of BI teams and technology across divisions, the Cooperative model by default requires a federated approach to BI. The challenge here is figuring out how to divide the BI stack between corporate and divisional units. There are many places to draw the line. Often, the division of responsibility is not clear cut and changes over time as a company evolves its business strategy to meet new challenges.

Harley-Davidson. For example, Harley-Davidson Motor Co., an iconic, U.S.-based manufacturer of motorcycles, decided to broaden its business strategy after the recent recession to focus more on global markets, e-commerce, and custom-made products. Consequently, the BI team created a new strategy to meet these emerging business requirements that emphasizes the need for greater agility and flexibility.

Currently, Harley-Davidson has well-defined, centralized processes for sourcing, validating and delivering data. Any application or project that requires data must be approved by the Integration Competency Center (ICC), which generally also performs the data integration work. The ICC's job is to ensure clean, consistent, integrated data throughout the company.

Currently, the ICC handles all business requests the same way, regardless of the nature or urgency of the request. In the future, to go as fast as the business wants, the ICC will break free from its "one size fits all" methodology and take a more customized approach to business requests. "We need to make sure our processes don't get in the way of the business," said Jim Keene, systems manager of global information services.

Dell. While Harley-Davidson is loosening up, Dell is tightening up. After experiencing rapid growth during the 1990s and 2000s through an entrepreneurial business model that gave regional business units considerable autonomy, the company now wants to achieve both fast growth and economies of scale by strengthening its central services, including BI. In the mid-2000s, Dell had hundreds of departmental systems, thousands of BI developers, and tens of thousands of reports that made it difficult to gain a global view of customers and sales.

As a result, Dell kicked off an Enterprise BI 2.0 initiative chartered to consolidate and integrate independent data marts and reports into a centralized data warehouse and data management framework. It also created a BI Competency Center that oversees centralized BI reporting teams that handle report design and access, metric definitions, and governance processes for different business units. By eliminating redundancies, the EBI program has delivered a 961% return on investment, improved data quality and consistency, and enhanced self-service BI capabilities.

While Harley-Davidson and Dell represent two ends of the spectrum--a centralized BI team that is rethinking its centralized processes and a decentralized BI environment that is consolidating BI activity and staff--most organizations are somewhere in the middle. Part III of this article will map our three organizational models to a BI stack and show the various ways that a BI team can allocate BI architectural elements between corporate and divisional entities.

Part III
The Dividing Line

"If it can't be my design, tell me where do we draw the line." Poets of the Fall

In Part II of this article, we discussed the impact of organizational models on BI development. In particular, we described the Conglomerate, Cooperative, and Centralized models and showed how two companies (Harley-Davidson and Dell) have migrated across this spectrum and how those shifts affected their BI architectures and organizations.

As any veteran BI professional knows, BI architectures come in all shapes and sizes. There is no one way to build a data warehouse and BI environment. Many organizations try various approaches until they find one that works, and then they evolve that architecture to meet new business demands. Along the way, each BI team needs to allocate responsibility for various parts of a BI architecture between corporate and business unit teams. Finding the right place to draw the proverbial architectural line is challenging.

The BI Stack. Figure 3 shows a typical BI stack with master data flowing into a data warehouse along with source data via an ETL tool. Data architects create business rules that are manifested in a logical model for departmental marts and in business objects within a BI tool. BI developers, who write code, and assemblers, who stitch together predefined information objects, create reports and dashboards for business users. Typically, most databases and servers that power operational and analytical systems run in a corporate data center.

Using this high-level architectural model, we can study the impact of the three organizational models described in Part II of this article on BI architectures.

Figure 3. Typical BI Stack

Conglomerate - Shared Data Center. In a Conglomerate model, business units have almost complete autonomy to design and manage their own operations. Consequently, business units also typically own the entire BI stack, including the data sources, which are operational systems unique to the business unit. Business units populate their own data warehouses and marts using their own ETL tools and business rules. They purchase their own BI tools, hire their own BI developers, and develop their own reports. The only thing that corporate manages is a data center which houses business unit machines and delivers economies of scale in data processing. (See figure 4.)

Figure 4. Conglomerate BI

Cooperative BI - Virtual Enterprise. In a Cooperative model, where business units sell similar but distinct products, the units must work synergistically to optimize sales across an overlapping customer base. Here, there is a range of potential BI architectures based on an organization's starting point. In the Virtual Enterprise model, an organization starts to move from a Conglomerate business model toward a more Centralized model to develop an integrated view of customers for cross-selling and upselling purposes and to maintain a single face to customers who purchase products from multiple business units.

In a Virtual Enterprise, business units still control their own operational systems, data warehouses, data marts, and BI tools and employ their own BI staff. Corporate hasn't yet delivered enterprise ERP applications but is thinking about it. Its first step towards centralization is to identify and match mutual customers shared by its business units. To do this, the nascent corporate BI team develops a master data management (MDM) system, which generates a standard record or ID for each customer that business units can use in sales and service applications. (See figure 5.)

Figure 5. Cooperative BI - Virtual Enterprise

The corporate group also creates a fledgling enterprise data warehouse to deliver an enterprise view of customers, products, and processes common across all business units. This enterprise data warehouse is really a data mart of distributed data warehouses. That is, it sources data from the business unit data warehouses, not directly from operational systems. This can be a persistent data store populated with an ETL tool or a set of virtual views populated on the fly using data virtualization software. A persistent store is ideal for non-volatile data (i.e., dimensions that don't change much) or when enterprise views require complex aggregations, transformations, multi-table joins, or large volumes of data. A virtual data store is ideal for delivering enterprise views quickly at low cost and for building prototypes and short-lived applications.
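One way to see the difference: the persistent store is materialized by an ETL-style step and refreshed on a schedule, while the virtual view is computed each time it is queried. Here is a minimal sketch using SQLite purely for illustration; the business unit tables and column names are hypothetical.

```python
# Persistent store vs. virtual view, illustrated with SQLite (hypothetical tables).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bu1_sales (customer_id INT, amount REAL);
    CREATE TABLE bu2_sales (customer_id INT, amount REAL);
    INSERT INTO bu1_sales VALUES (1, 100.0), (2, 50.0);
    INSERT INTO bu2_sales VALUES (1, 75.0);

    -- Virtual: computed on the fly each time it is queried (no copy of the data).
    CREATE VIEW enterprise_sales_v AS
        SELECT customer_id, amount FROM bu1_sales
        UNION ALL
        SELECT customer_id, amount FROM bu2_sales;

    -- Persistent: materialized by an ETL-style step and refreshed on a schedule.
    CREATE TABLE enterprise_sales AS SELECT * FROM enterprise_sales_v;
""")

print(conn.execute(
    "SELECT customer_id, SUM(amount) FROM enterprise_sales "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall())   # [(1, 175.0), (2, 50.0)]
```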

Cooperative BI - Shared BI Platform. The next step along the spectrum is a Shared BI Platform. Here, corporate expands its appetite for data processing. It replaces business unit operational applications with enterprise resource planning (ERP) applications (e.g., finance, human resources, sales, service, marketing, manufacturing, etc.) to create a more uniform operating environment. It also fleshes out its BI environment, creating a bona fide enterprise data warehouse that pulls data directly from various source systems, including the new ERP applications, instead of from departmental data warehouses. This reduces redundant extracts and ensures greater information consistency. (See figure 6.)

Figure 6. Cooperative BI - Shared Platform

Meanwhile, business units still generate localized data and require custom views of information. While they may still have budget and license to run their own BI environments, they increasingly recognize that they can save time and money by leveraging the corporate BI platform. To meet them halfway, the corporate BI team forms a BI Center of Excellence and teaches the business units how to use the corporate ETL tool to create virtual data marts inside the enterprise data warehouse. The business units can upload local data to these virtual marts, giving them both enterprise and local views of data without having to design, build, and maintain their own data management systems.

In addition, the corporate BI team builds a universal semantic layer of shared data objects that it makes available within the corporate BI tool. Although business units may still use their own BI tools, they increasingly recognize the value of building new reports with the corporate BI tools because they provide access to the standard business objects and definitions, which the business units are required or strongly encouraged to use. The business units still hire and manage their own BI developers and assemblers, but they now have dotted-line responsibility to the corporate BI team and are part of the BI Center of Excellence. This is the architectural approach used by Intuit. (See Part I.)

Centralized BI - Shared Service. Some organizations, like Dell, don't stop with a Shared Platform model; they continue to centralize BI operations to improve information integrity and consistency and to squeeze all redundancies and costs out of the BI pipeline. Here, the corporate BI team manages the entire BI stack and creates tailored reports for each business unit based on its requirements. For example, Dell's EBI 2.0 program (see Part II) cut the number of report developers in half and reassigned them to centralized reporting teams under the direction of a BI Competency Center, where they develop custom reports for specific business units. The challenge here, as Harley-Davidson discovered, is to keep the corporate BI bureaucracy from getting too large and lumbering and to ensure it remains responsive to business unit requirements. This is a tall task, especially in a fast-moving company whose business model, products, and customers change rapidly.

Summary

Federation is the most pervasive BI architectural model, largely because most organizations cycle between centralized and decentralized organizational models. A BI architecture, by default, needs to mirror organizational structures to work effectively. Contrary to popular opinion, a BI architecture is a dynamic environment, not a blueprint written in stone. BI managers must define an architecture based on prevailing corporate strategies and then be ready to deviate from the plan when the business changes due to an unanticipated circumstance, such as a merger, acquisition, or new CEO.

Federation also does the best job of balancing the twin needs for enterprise standards and local control. It provides enough uniform data and systems to keep the BI environment from splintering into a thousand pieces, preserving an enterprise view critical to top executives. But it also gives business units enough autonomy to deploy applications they need without delay or IT intervention. Along the way, it minimizes BI overhead and redundancy, saving costs through economies of scale.


Posted September 16, 2011 12:03 PM

This is the fourth in a four-part series on cloud computing for business intelligence (BI) professionals.

Business intelligence in the Cloud is inevitable. In fact, it's already happening. Although Cloud BI hasn't slowed the growth of on-premises BI software, an increasing number of organizations are using Cloud-based BI services and many more are waiting until the time is right.

By Cloud BI, I primarily mean reports and dashboards that run in a multi-tenant, hosted environment and which users access via a Web browser. The reports and applications can be packaged (i.e., Software-as-a-Service or SaaS) or custom-built by the service provider or customer using Platform-as-a-Service (PaaS) or other tools. (See Part II of this series for more detailed explanations of SaaS and PaaS.)

Benefits. The Cloud offers numerous benefits for organizations that want to run reports and dashboards. There is no hardware and software to buy, install, tune, and upgrade. Consequently, there are no IT people to hire, pay, and manage. Applications upgrade automagically and can be scaled seamlessly. Customers pay a monthly subscription based on usage rather than an upfront licensing fee. Essentially, the Cloud speeds delivery and drives down costs. What's not to like?

Impediments. But there are concerns. One of the biggest impediments to Cloud BI today is security, or at least the perception that data in the Cloud is less secure than data housed in a corporate data center. In reality, data is actually safer in the Cloud than in many corporate data centers. The real issue is not security, it's "control." Executives today simply feel safer if their data is housed in a corporate data center. (Ironically, most companies have already outsourced sensitive data, like payroll and sales, to third party providers.) To be fair, some companies, especially those in financial services, must comply with regulations that currently require them to keep data on premise.

E-Commerce. Interestingly, many industry experts raised the same security bogeyman in regards to e-commerce. In the late 1990s, many experts said, "Consumers will never type their credit card number into a Web browser and ship it off to an unknown destination via the public internet because it could be stolen." Of course, we know how this story played out. In 2009, more than 154 million people in the U.S. bought something online, and online sales are growing four times faster than retail sales in general, according to Forrester Research. When it comes to security, convenience trumps fear, especially when fear isn't grounded in reality. The same appears to be happening with the Cloud.

The biggest challenge with running BI in the Cloud involves packaging custom BI development into a cost-effective, online service. By its nature, BI involves creating custom applications that integrate data from unique combinations of data sources. Cloud BI vendors are still figuring out how to deliver BI services without losing their shirts or turning into a custom development shop. (See Part III of this series.)

Finally, some Cloud BI vendors only deliver interactive reports and dashboards (e.g., SAP, Indicee, and GoodData), while only a few offer more in-depth analysis using online analytical processing (OLAP) (e.g., Birst) or pivot-table functionality against big data (e.g., PivotLink). However, for most organizations getting started with BI, reporting and dashboarding functionality is more than sufficient to satisfy their information appetites.

Gaining Traction

Despite these obstacles, Cloud BI is gaining ground, according to a recent survey by the BI Leadership Forum, a global network of BI directors and other BI professionals. (See www.bileadership.com.) More than one-third of organizations are currently using the Cloud for some part of their BI program, according to the survey. (See figure 1.)

Figure 1. Are you using the Cloud for any part of your BI program?
Source: BI Leadership Forum, June, 2011. Based on 112 responses. www.bileadership.com.

Organizations that have embraced the Cloud point to "speed of deployment" (30%) and "reduced maintenance" (30%) as the top motivating factors, followed by "flexibility" (19%) and "cost" (11%). (See figure 2.)

Figure 2. Motivating Factors

Momentum. So far, Cloud BI users are happy campers. Almost two-thirds (65%) said they plan to increase their usage of Cloud BI in the next 12 months. Only 3% said they would decrease usage, while 16% will keep their implementation the same and another 16% weren't sure. (See figure 3.)

Among respondents who are not using Cloud BI, 16% said they plan to implement Cloud BI in the next 12 months and 32% were not sure. So Cloud BI has momentum. However, it may take five to 10 years for Cloud BI to reach the tipping point where it becomes a mainstream component of every BI program. Given Cloud BI's benefits, this trajectory is inevitable.

Figure 3. Future Usage

Small Companies Lead the Way

A closer look at the data confirms what many pundits have said about the target market for Cloud BI software: it's currently ideal for small companies with few IT resources, limited capital to spend on servers and software, and minimal to no BI expertise. Almost half of small companies under $100M in annual revenues (46%) use Cloud BI in some shape or form. In contrast, large companies with over $1B in annual revenues are considerably less likely to adopt the cloud (29%), while medium-sized companies with between $100M and $1B in annual revenues lag further behind, with less than one-fifth using BI in the Cloud (18%). (See figure 4.)

Figure 4. Cloud BI Deployment by Company Size

Small Companies. For small businesses without legacy BI applications, Cloud BI services are a godsend. The economics and convenience are compelling. Instead of passing around spreadsheets, small companies can implement a Cloud BI service to standardize reports and dashboards and make them available to all employees anywhere via a Web browser.

"What's refreshing for me is that I can go in at any time of day and [run a] report on any metric in our organization, such as item received delivered, inspected at the category, personnel, or employee level and track it by any time period," says Wayne Deer, vice president of operations, at Gazelle, an electronics recycler, which uses GoodData's Cloud BI service.

Large Companies. Interestingly, large companies are the next most prevalent users of Cloud BI services. Often, it's a department head who wants to build a BI application quickly without getting corporate IT involved. Like small companies, departments at larger companies often have limited budgets and BI expertise and most don't want the headaches and expense of having to maintain servers and software.

But enterprise BI managers have assessed the potential of Cloud BI and like what they see. Unfortunately, many are hamstrung by legacy BI implementations. "I see us moving very slowly with adoption because of installed base and switching costs," wrote Darrell Piatt, Director and Chief Architect at a large professional services firm based in Virginia, in a BI Leadership Forum discussion thread. "When and if we decide to replace our BI infrastructure, Cloud BI offerings will be seriously considered."

Mid-size Companies. According to figure 4, medium-sized companies are the least likely to adopt Cloud BI services. The reason is that most have already implemented an enterprise IT platform, usually Microsoft SQL Server, which bundles BI tools and applications for free. If the organization has assigned an analyst or IT administrator to build and maintain enterprise reports and dashboards using the platform, it likely has little bandwidth, incentive, or capital to change course and introduce an alternative BI stack, unless it is having difficulty meeting user requirements.

Cloud BI Vendor Perspective

Given that they provide a service that makes them an extension of an organization's IT team, Cloud BI vendors have a good handle on who their customers are and what they are doing with their services.

For example, Sam Boonin, vice president of products and marketing at GoodData, says that his company's customers fall into three camps: 1) fast-growing technology companies that run all their applications in the Cloud, 2) departments in larger companies, many of which have implemented Salesforce.com and are comfortable with the SaaS model, and 3) SaaS vendors who OEM their product.

Boonin also said that 90% of GoodData's customers start by using one of its packaged applications, which generate reports against a single SaaS-based, front-office application, such as Salesforce.com, ZenDesk, and Google Analytics, or an on-premise package, such as Microsoft Dynamics CRM. (GoodData currently offers 20 packaged applications.) These applications, which deploy in hours, start at $1,000 a month and are often bundled into third party, SaaS products.

Many customers then extend their packaged SaaS BI application by customizing data models and adding other data sources. GoodData configures or customizes the application to the customer's specifications. The process takes roughly six weeks and typically raises the monthly subscription price to between $3,000 and $10,000 a month. "For companies used to spending $500,000 on BI and getting virtually nothing, they see us as a godsend," says Boonin.

Today, almost half (43%) of GoodData's customers generate reports that run against multiple applications. These customers generate the majority of GoodData's revenues. Currently, GoodData has about 100 direct customers and 3,000 indirect customers through OEMs. This is comparable to other pureplay Cloud BI vendors. "Business is good," says Boonin. "We have a $25 billion failed market to disrupt."

Conclusion

Some experts see dark clouds in the Cloud BI market. While the ramp-up of Cloud BI services hasn't been as fast as some anticipated, it's clearly catching on. The BI market poses unique challenges compared to other SaaS market segments that automate operational business processes. That's because BI applications are generally custom built and require companies to integrate data from multiple sources. Cloud BI vendors have taken different approaches to "servitize" a custom application, which is something of a contradiction in terms. Not all have succeeded, but those still in the market are making headway. The value of Cloud computing is high, and the BI industry will eventually find a way to succeed with it.


Posted September 9, 2011 9:34 AM