Blog: Wayne Eckerson

Wayne Eckerson

Welcome to Wayne's World, my blog that illuminates the latest thinking about how to deliver insights from business data and celebrates out-of-the-box thinkers and doers in the business intelligence (BI), performance management and data warehousing (DW) fields. Tune in here if you want to keep abreast of the latest trends, techniques, and technologies in this dynamic industry.

About the author

Wayne has been a thought leader in the business intelligence field since the early 1990s. He has conducted numerous research studies and is a noted speaker, blogger, and consultant. He is the author of two widely read books: Performance Dashboards: Measuring, Monitoring, and Managing Your Business (2005, 2010) and The Secrets of Analytical Leaders: Insights from Information Insiders (2012).

Wayne is currently director of BI Leadership Research, an education and research service run by TechTarget that provides objective, vendor neutral content to business intelligence (BI) professionals worldwide. Wayne’s consulting company, BI Leader Consulting, provides strategic planning, architectural reviews, internal workshops, and long-term mentoring to both user and vendor organizations. For many years, Wayne served as director of education and research at The Data Warehousing Institute (TDWI) where he oversaw the company’s content and training programs and chaired its BI Executive Summit. He can be reached by email at weckerson@techtarget.com.

Recently in Business Intelligence Category

I am immersing myself in the world of location intelligence this week, attending ESRI's annual user conference in San Diego. Here's an eye opener: there are 15,000 people attending this event!

Location intelligence, also known as spatial analytics, creates maps that enable users to view the relationship of objects in space and perform a variety of spatial calculations, such as "How long will it take to drive from Detroit to Cleveland?", "What percentage of high-income customers are located within a 15-minute drive of this store?", or "What's my risk exposure to a hurricane that plows through Dade County, Florida?"

Parallel Worlds

Like business intelligence, location intelligence supports analysis and decision making. But for the past 20 years, these two data-centric disciplines have forged independent but parallel paths. Only now are they beginning to converge.

After my presentation here, one attendee asked, "Why hasn't location intelligence taken off in the business intelligence community?" My first response was that a majority of BI shops have been consumed trying to get adoption for basic reporting and analysis applications and only now are ready to incorporate new capabilities, such as location intelligence, predictive analytics, and unstructured data.

But later I realized that the BI community has already embraced location intelligence, at least the mapping part of it. During the past 10 years, most BI professionals have spent significant time learning how to display the shape and content of data in visual form, using charts and graphics, including maps. Meanwhile, BI vendors have invested heavily in beefing up the visualization capabilities of their tools and adding new charting components, including maps. To BI professionals, maps are now an integral charting component of any BI portfolio.

Spatial Analysis Via GIS

But location intelligence is more than just a map with dots on it. Location intelligence is a full-fledged analytical system. These so-called geographical information systems, or GIS, specialize in storing and manipulating spatial data, which consists of points, lines, and polygons plotted as coordinates in space. Each spatial object can be imbued with various properties or rules that govern its behavior.

For example, a road (i.e., a line) has a surface condition and a speed limit, and the only points that can be located in the middle of the road are traffic lights. Spatial engines can then run complex calculations against coordinate data to determine relationships among spatial objects, such as the driving distance between two cities, the shadows that a proposed skyscraper casts on surrounding buildings, or RFID-tagged products that move beyond a specific area (i.e., geofencing). In essence, a GIS is an object-oriented analytical system that models things in space.
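To make the geofencing idea concrete, here is a minimal sketch of the kind of test a spatial engine might run: a ray-casting check of whether a point falls inside a polygon. It treats coordinates as flat x/y values, which is an assumption; a real GIS also handles map projections and much more. The yard polygon, the tag names, and the function name are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), treated as planar coordinates


def inside_geofence(point: Point, fence: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a ray running right from the point."""
    x, y = point
    inside = False
    n = len(fence)
    for i in range(n):
        x1, y1 = fence[i]
        x2, y2 = fence[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside


# Hypothetical rectangular yard and two RFID tag readings
yard = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
tag_positions = {"pallet-17": (1.5, 2.0), "pallet-42": (5.2, 1.0)}

# Tags that have moved beyond the fenced area
escaped = [tag for tag, pos in tag_positions.items() if not inside_geofence(pos, yard)]
```

Here `escaped` holds only the tag whose reading lies outside the yard. The same test works for any simple polygon, not just rectangles.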

So without access to a GIS system, analytically-driven organizations miss valuable insights. Until recently, most spatial analysis was conducted by a handful of GIS specialists working in the bowels of a company who imported business data into a GIS system to create spatial models. But now, spatial insights can be delivered to all users via GIS-enabled applications, including BI, ERP, and CRM. And some location intelligence providers, like ESRI, can publish GIS applications to the cloud, allowing users to access interactive maps and spatial applications via Web browsers.

Integration with Business Intelligence

In the BI world, the first step towards converging location and business intelligence is plotting business metrics on a map. Like other types of visualizations, maps bring data to life and make it easier for business users to identify the significant trends and issues contained in most reports and dashboards. But location intelligence goes beyond basic geographical displays; it delivers interactive spatial models that correlate business data on a three-dimensional surface.

For example, BI users might use interactive maps to sift through hundreds of variables to optimize the siting of new stores, dealerships, branch offices, factories, drill heads, pipelines, and so on. Or they could use maps to view how the buying habits and demographics of customers located around stores have changed over time. Facilities managers could use interactive maps to plot the optimal evacuation routes from any point in an office building or estimate the physical and financial impact of a nearby bomb explosion. Insurance agents could use GIS-enabled BI tools to simulate what they would have to pay policy holders based on the wind speed and path of an oncoming hurricane. (See figure 1.)

Figure 1. A Dashboard With Embedded Interactive Map
This dashboard embeds an interactive map that lets users model the risk exposure of simulated hurricanes. Or it can integrate a real-time weather feed and display the physical and financial impact of an actual hurricane. Image courtesy of ESRI.

Finally, the explosion of mobile devices, such as smartphones and tablet computers, places a premium on integrating business and location intelligence. For example, mobile dashboards can notify plant managers about the status of poorly performing machines as they walk a factory floor or alert store managers about stock-outs as they move through the aisles. Mobile dashboards can deliver executives and salespeople a 360-degree view of a customer as they approach a customer site. The use cases are endless, and organizations will discover new ones once they GIS-enable their BI applications.

Integration Options

Integrating BI and GIS applications is not as hard as it once was. GIS vendors, like ESRI, now offer rich, REST-based Web services APIs that developers can use to integrate with other applications. Some also offer cloud-based GIS services, so you don't even need to own a GIS system to create spatial applications.

As a result, BI vendors are integrating greater GIS functionality into their applications. A decade ago, BI vendors delivered static, graphical maps that customers could overlay with dots. Many now embed GIS shape files that enable BI users to plot business data on standard baseline maps that support basic GIS functionality, such as zoom, hover, drill, and synchronized filtering. And a few interface directly with GIS systems, allowing BI report authors to easily add custom maps and more sophisticated GIS functionality to reports and dashboards without having to write code.

Summary. As BI shops seek to infuse reports and dashboards with better visualization and more analytics, it's imperative that they explore the rich opportunities afforded by location intelligence. GIS-enabling a BI tool is a simple way to add more robust analytical capabilities to run-of-the-mill reports and dashboards.


Posted July 25, 2012 8:30 AM

In the quest to deliver business intelligence (BI) solutions, we often get wrapped up in the technology and lose sight of the end game, which is to help the business make better decisions.

Lately, I've been reading about decision making. One book, "Decide and Deliver: 5 Steps to Breakthrough Performance in Your Organization," is chock-full of practical advice about improving decision making in organizations. Written by a trio of Bain consultants, the book offers some great anecdotes and useful tools for improving your company's decision effectiveness.

One key point is that decision speed and execution are as important as decision quality. The authors quote Bill Graber, a long-time General Electric executive, who describes the source of GE's extraordinary performance during the 1980s and 1990s. "There is this myth that we made better decisions than our competitors. That's just not true. Our decisions probably weren't any better than many other companies. What GE did do was make decisions a lot faster than anybody else, and once we made a decision, we followed through to make sure we delivered the results we were expecting."

The authors say there are four elements to decision effectiveness:


  1. Quality. Good decisions are based on facts, not opinions, and take into account risk. There is also healthy debate among valid alternatives.

  2. Speed. Good decisions are made at the right speed. If made too slowly, the competition gains an advantage. If made too fast, valid alternatives are not explored and wrong decisions are made.

  3. Yield. Yield refers to an organization's ability to execute decisions. Decisions that don't trickle down to the people who need to carry out the actions don't succeed.

  4. Effort. Effort refers to the "time, trouble, expense, and sheer emotional energy" required to make or execute a decision. Too much effort slows decision making; too little effort results in poor-quality decisions.

The authors created a survey instrument to help organizations quantify their capabilities along these four dimensions. They also created a formula to determine an organization's overall decision effectiveness: Quality x Speed x Yield - Effort.
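As a rough illustration of how that formula behaves, the sketch below plugs in made-up scores. The 1-to-4 scale and the sample numbers are assumptions for illustration only; they do not come from the authors' survey instrument.

```python
def decision_effectiveness(quality: float, speed: float, yield_: float, effort: float) -> float:
    """Quality x Speed x Yield - Effort, as stated in the formula above."""
    return quality * speed * yield_ - effort


# Hypothetical scores on an assumed 1-to-4 scale
baseline = decision_effectiveness(quality=3, speed=2, yield_=3, effort=4)  # 3*2*3 - 4 = 14
faster = decision_effectiveness(quality=3, speed=3, yield_=3, effort=4)    # 3*3*3 - 4 = 23
```

Because the first three terms multiply, raising speed by a single point lifts the score from 14 to 23, which fits the book's point that speed and execution matter as much as quality.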

As BI professionals, we need to strive to deliver the "right information to the right people in the right format" - a critical success factor touted in the book. Thus, it's imperative we embrace agile development techniques and operationalize our data flows to ensure timely delivery of information to decision makers so they can not only make the right decision, but do so quicker than the competition.


Posted January 3, 2012 8:37 AM

Despite all the talk about self-service business intelligence (BI), no one has really delivered a tool that makes it easy for casual users to perform true ad hoc queries. Until now.

Most so-called self-service BI tools are better suited to super users, those tech-savvy business people who gravitate to information technology and become the "go to" people in their departments who create ad hoc reports or analyses for themselves and their colleagues. These tools include semantic layers, lists of data objects that users can drag and drop onto a query panel to generate custom queries, and mashboards, which enable users to drag predefined report parts (e.g., charts, tables, controls) from a widget library onto a dashboard canvas.

Visual discovery tools also provide some measure of ad hoc capabilities to casual users within the context of a published performance dashboard. Performance dashboards enable casual users to drill down on top-level metrics and do some root cause analysis, but not perform unfettered exploration. A well-designed performance dashboard is a sandbox that is large enough to answer 60% to 80% of casual users' questions, yet small enough that they don't get lost.

True Ad Hoc for Casual Users

What about the 20% to 40% of the time that casual users want to explore data without guardrails? Today's self-service BI tools are just too difficult for them to use. It's not that executives, managers, and front-line workers are dumb; they simply don't have time to learn how to use self-service BI tools, or they forget how to use them because they use the functionality so infrequently. It's a lot easier for a casual user to ask Fred, the super user, to do the analysis for them.

For the past several years, I've argued that search technology is the only way to empower casual users with self-service. My BI Framework 2020 holds a spot for such technology, depicted in the wedge that sits between Business Intelligence and Content Intelligence. (See figure 1.) This technology consists of a Google-like tool that enables casual users to type words into a search box and generate ad hoc queries and reports.

Figure 1. BI Framework 2020
The intersection of Business Intelligence and Content Intelligence gives casual users the ability to explore unstructured and structured content through a keyword search interface.

Several vendors offer products with these capabilities, including Endeca (Latitude), SAP BusinessObjects (Explorer), and Information Builders (Magnify). Except for SAP BusinessObjects Explorer, which is less a search engine than a visual slicer/dicer of semantic layer fields and data, none has really caught on, which has puzzled me. My sense is that the market just isn't ready for this capability.

EasyAsk: Natural Language Meets Voice Recognition

I recently met with EasyAsk, which has long offered a BI search tool that uses natural language processing (NLP) to understand the intent and meaning of words that users type into a keyword search box. Their use of NLP is fairly unusual. Their tool is the search engine for Lands' End and about 250 other online retailers, but they also offer the tool in the BI space. They believe market events are converging to finally make BI search a reality, and I think they're right.

First, IBM Watson, which trumped World Jeopardy champions this year, has demonstrated the power of NLP to correctly interpret complex human language. Second, the advent of Apple's Siri for the iPhone 4S is teaching a generation of consumers about the power of voice recognition as a search engine interface. EasyAsk has long boasted an interface to voice recognition software and is currently building an iPhone application as well.

"Siri is teaching people how to search the right way," says Craig Bassin, CEO of EasyAsk. Instead of typing a single word and getting thousands of results, which is how most search engines operate, Siri encourages people to submit requests using English phrases and sentences. "The user interface of the future will be a big search box in the middle of the screen or a voice recognition system," says Bassin.

This sentiment was echoed by Tim Leonard, chief technology officer of US Xpress, a nationwide shipping company, in a recent conversation that I had with him. "BI is going mobile and soon our users will submit queries verbally, by speaking a request into their phones."

Another factor that will help push BI search to the forefront is the NoSQL movement, which uses search-like storage indexes to bring together structured and unstructured data in one system designed for information processing, reporting, and analysis. These systems are raising expectations that users should be able to query both types of data within a single interface. EasyAsk, which is designed to run against SQL databases and product catalogs, now boasts a connector to Hadoop that bridges the worlds of structured and unstructured data.

Summary

BI search is inevitable. It's the only way to give casual users ad hoc query and exploration capabilities. As voice recognition, natural language processing, and NoSQL databases go mainstream, BI search will surely join them.


Posted December 16, 2011 11:19 AM

The previous two articles in this series covered the organizational and technical factors required to succeed with advanced analytics. But as with most things in life, the hardest part is getting started. This final article shows how to kickstart an analytics practice and rev it into high gear.

The problem with selling an analytics practice is that most business executives who would support and fund the initiative haven't heard of the term. Some will think it's another IT boondoggle in the making and will politely deny or put off your request. You're caught in the chicken-or-egg riddle: it's hard to sell the value of analytics until you've shown tangible results. But you can't deliver tangible results until an executive buys into the program.

Of course, you may be fortunate to have enlightened executives who intuitively understand the value of analytics and are coming to you to build a practice. That's a nice fairy tale. Even with enlightened executives, you still need to prove the value of the technology and, more importantly, your ability to harness it. Even in a best-case scenario, you get one chance to prove yourself.

So, here are ten steps you can take to jumpstart an analytics practice, whether you are working at the grassroots level or at the behest of an eager senior executive.

1. Find an Analyst. This seems too obvious to state, but it's hard to do in practice. Good analysts are hard to come by. They combine a unique knowledge of business process, data, and analytical tools. As people, they are critical thinkers who are inquisitive, doggedly persistent, and passionate about what they do. Many analysts have M.B.A. degrees or trained as social scientists, statisticians, or Six Sigma practitioners. Occasionally, you'll be able to elevate a precocious data analyst or BI report developer into the role.

2. Find an Executive. Good sponsors are almost as rare as good analysts. A good sponsor is someone who is willing to test long-held assumptions using data. For instance, event companies mail their brochures 12 weeks before every conference. Why? No one knows; it's the way it's always been done. But maybe they could get a bigger lift from their marketing investments if they mailed the brochures 11 or 13 weeks out, or shifted some of their marketing spend from direct mail to email and social media channels. A good sponsor is willing to test such assumptions.

3. Focus Your Efforts. If you've piqued an executive's interest, then explain what resources you need, if any, to conduct a test. But don't ask for much, because you don't need much to get going. Ideally, you should be able to make do with people and tools you have in-house. A good analyst can work miracles with Excel and SQL, and there are many open source data mining packages on the market today as well as low-cost statistical add-ins to Excel and BI tools. Select a project that is interesting enough to be valuable to the company, but small enough to minimize risk.

4. Talk Profits. It's very important to remember that your business sponsor won't trust your computer model. They will go with their gut instinct rather than rely on a mathematical model to make a major decision. They will only trust the model if it either shows tangible lift (i.e., more revenues or profits) or validates their own experience and knowledge. For example, the head of marketing for an online retailer will trust a market basket model if he realizes that the model has detected purchasing habits of corporate procurement officers who buy office items for new hires.

5. Act on Results. There is no point creating analytical models if the business doesn't act on them. There are many ways to make models actionable. You can present the results to executives whose go-to-market strategies might be shaped by the findings. Or you can embed the models in a weekly churn report distributed to salespeople that indicates which customers are likely to attrite in the near future. (See figure 1.) Or you can embed models in operational applications so they are triggered by new events (e.g., a customer transaction) and automatically spit out recommendations (e.g., cross-sell offers).

Figure 1. An Actionable Report

6. Make it Useful. The models not only should be actionable, they should be proactive. The worst thing you can do is tell a salesperson something they already know. For instance, if the model says, "This customer is likely to churn because they haven't purchased anything in 90 days," a salesperson is likely to say, "Duh, tell me something I don't already know." A better model would be one that detects patterns not immediately obvious to the salesperson. For example, "This customer makes frequent purchases but their overall monthly expenditures have dropped ten percent since the beginning of the year."

7. Consolidate Data. Too often, analysts play the role of IT manager by accessing, moving, and transforming data before they begin to analyze it. Although the DW team will never be able to identify and consolidate all the data that analysts might need, it can always do a better job understanding their requirements and making the right data available at the right level of granularity. This might require purchasing demographic data and creating the specialized wide, flat tables preferred by modelers. It might also mean supporting specialized analytical functions inside the database that let the modelers profile, prepare, and model data.

8. Unlock Your Data. Unfortunately, most IT managers don't provide analysts ready access to corporate data for fear that their SQL queries will grind an operational system or data warehouse to a halt. To balance access and performance, IT managers should create an analytical sandbox that enables modelers to upload their own data and mix it with corporate data in the warehouse. These sandboxes can be virtual table partitions inside the data warehouse or dedicated analytical machines that contain a replica of corporate data or an entirely new data set. In either case, the modelers get free and open access to data and IT managers get to worry less about resource contention.

9. Govern Your Data. Because analysts are so versatile with data, they often get pulled in multiple directions. The lowest value-added activity they perform is creating ad hoc queries for business colleagues. This type of work is better left to super users in each department. But to prevent super users from generating thousands of duplicate or conflicting reports, the BI team needs to establish a report governance committee that evaluates requests for new reports, maps them to an existing inventory, and decides which ones to build or roll into existing report structures. Ideally, the report governance committee is composed of super users who are already creating most of the reports users use.

10. Centralize Analysts. It's imperative that analysts feel part of a team and not isolated in some departmental silo. An Analytics Center of Excellence can help build camaraderie among analysts, cross-train them in different disciplines and business processes, and mentor new analysts. A director of analytics needs to prioritize analytics projects, cultivate an analytics mindset in the corporation, and maintain a close alliance with the data warehousing team. In fact, it's best if the director of analytics also has responsibility for the data warehouse. Ideally, 80% to 90% of analysts are embedded in the departments where they work side by side with business users and the rest reside at corporate headquarters where they focus on cross-departmental initiatives.
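The "tell me something I don't already know" signal from step 6 can be approximated with a simple rule, sketched below. This is a threshold check standing in for a real predictive model, not a model itself. The customer names, the spend figures, and the assumption that each list starts in January are all hypothetical.

```python
from typing import Dict, List


def spend_decline_alerts(monthly_spend: Dict[str, List[float]],
                         min_drop: float = 0.10) -> List[str]:
    """Flag customers who still buy every month but whose latest monthly spend
    is at least `min_drop` below the first month of the year."""
    alerts = []
    for customer, spend in monthly_spend.items():
        if len(spend) < 2 or spend[0] <= 0:
            continue  # not enough history to compare
        still_active = all(amount > 0 for amount in spend)
        drop = (spend[0] - spend[-1]) / spend[0]
        # Lapsed customers are skipped: the salesperson already knows about them.
        if still_active and drop >= min_drop:
            alerts.append(
                f"{customer}: buys every month, but monthly spend is down {drop:.0%} since January"
            )
    return alerts


# Hypothetical monthly spend, January through April
sample = {
    "Acme Corp": [1000.0, 980.0, 940.0, 880.0],  # active, spend sliding
    "Globex": [500.0, 510.0, 495.0, 505.0],      # steady
    "Initech": [700.0, 0.0, 0.0, 0.0],           # obvious lapse
}
alerts = spend_decline_alerts(sample)
```

Only the first customer is flagged. The obvious lapse is deliberately left out, since that is the "Duh" case a salesperson can already see.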

Summary

Although some of the steps defined above are clearly for novices, even analytics teams that are more advanced still struggle with many of the items. To succeed with analytics ultimately requires a receptive culture, top-notch people (i.e., analysts), comprehensive and clean data, and the proper tools. Success will not come quickly but takes a sustained effort. But the payoff, when it comes, is usually substantial.


Posted November 21, 2011 7:34 AM

The prior article in this series discussed the human side of analytics. It explained how companies need to have the right culture, people, and organization to succeed with analytics. The flip side is the "hard stuff"--the architecture, platforms, tools, and data--that makes analytics possible. Although analytical technology gets the lion's share of attention in the trade press--perhaps more than it deserves for the value it delivers--it nonetheless forms the bedrock of all analytical initiatives. This article examines the architecture, platforms, tools, and data needed to deliver robust analytical solutions.

Architecture

The term "analytical architecture" is an oxymoron. In most organizations, business analysts are left to their own devices to access, integrate, and analyze data. By necessity, they create their own data sets and reports outside the purview and approval of corporate IT. By definition, there is no analytical architecture in most organizations--just a hodge-podge of analytical silos and spreadmarts, each with conflicting business rules and data definitions.

Analytical sandboxes. Fortunately, with the advent of specialized analytical platforms (discussed below), BI architects have more options for bringing business analysts into the corporate BI fold. They can use these high-powered database platforms to create analytical sandboxes for the explicit use of business analysts. These sandboxes, when designed properly, give analysts the flexibility they need to access corporate data at a granular level, combine it with data that they've sourced themselves, and conduct analyses to answer pressing business questions. With analytical sandboxes, BI teams can transform business analysts from data pariahs to full-fledged members of the BI community.

There are four types of analytical sandboxes:


  • Staging Sandbox. This is a staging area for a data warehouse that contains raw, non-integrated data from multiple source systems. Analysts generally prefer to query a staging area that contains all the raw data rather than query each source system individually. Hadoop is a staging area for large volumes of unstructured data that a growing number of companies are adding to their BI ecosystems.

  • Virtual Sandbox. A virtual sandbox is a set of tables inside a data warehouse assigned to individual analysts. Analysts can upload data into the sandbox and combine it with data from the data warehouse, giving them one place to go to do all their analyses. The BI team needs to carefully allocate compute resources so analysts have enough horsepower to run ad hoc queries without interfering with other workloads running on the data warehouse.

  • Free-Standing Sandbox. A free-standing sandbox is a separate database server that sits alongside a data warehouse and contains its own data. It's often used to offload complex, ad hoc queries from an enterprise data warehouse and give business analysts their own space to play. In some cases, these sandboxes contain a replica of data in the data warehouse, while in others, they support entirely new data sets that don't fit in a data warehouse or run faster on an analytical platform.

  • In-Memory BI Sandbox. Some desktop BI tools maintain a local data store, either in memory or on disk, to support interactive dashboards and queries. Analysts love these types of sandboxes because they connect to virtually any data source and enable analysts to model data, apply filters, and visually interact with the data without IT intervention.
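As a small illustration of the virtual sandbox idea, the sketch below joins an analyst-uploaded table to a corporate table in one query. An in-memory SQLite database stands in for the data warehouse, and the table names, column names, and rows are hypothetical. A production warehouse would add schemas, permissions, and workload management, none of which is shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse

# Corporate data managed by the BI team (hypothetical table and rows)
conn.execute("CREATE TABLE dw_sales (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO dw_sales VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 45.0), (3, 300.0)],
)

# Sandbox table holding data the analyst sourced and uploaded
conn.execute("CREATE TABLE sandbox_campaign (customer_id INTEGER, segment TEXT)")
conn.executemany(
    "INSERT INTO sandbox_campaign VALUES (?, ?)",
    [(1, "email"), (3, "direct mail")],
)

# One query mixes the analyst's data with corporate data
rows = conn.execute(
    """
    SELECT s.segment, SUM(d.amount) AS total_sales
    FROM sandbox_campaign AS s
    JOIN dw_sales AS d ON d.customer_id = s.customer_id
    GROUP BY s.segment
    ORDER BY s.segment
    """
).fetchall()
conn.close()
```

The analyst gets one place to run the whole analysis, with no need to export warehouse data to a desktop tool first.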

Next-Generation BI Architecture. Figure 1 depicts a BI architecture with the four analytical sandboxes colored in green. The top half of the diagram represents a classic top-down, data warehousing architecture that primarily delivers interactive reports and dashboards to casual users (although the streaming/complex event processing (CEP) engine is new). The bottom half of the diagram depicts a bottom-up analytical architecture with analytical sandboxes along with new types of data sources. This next-generation BI architecture better accommodates the needs of business analysts and data scientists, making them full-fledged members of the corporate BI ecosystem.

Figure 1. The New BI Architecture

The next-generation BI architecture is more analytical, giving power users greater options to access and mix corporate data with their own data via various types of analytical sandboxes. It also brings unstructured and semi-structured data fully into the mix using Hadoop and nonrelational databases.

Analytical Platforms

Since the beginning of the data warehousing movement in the early 1990s, organizations have used general-purpose data management systems to implement data warehouses and, occasionally, multidimensional databases (i.e., "cubes") to support subject-specific data marts, especially for financial analytics. General-purpose data management systems were designed for transaction processing (i.e., rapid, secure, synchronized updates against small data sets) and only later modified to handle analytical processing (i.e., complex queries against large data sets). In contrast, analytical platforms focus entirely on analytical processing at the expense of transaction processing.

The analytical platform movement. In 2002, Netezza (now owned by IBM) introduced a specialized analytical appliance, a tightly integrated, hardware-software database management system designed explicitly to run ad hoc queries against large volumes of data at blindingly fast speeds. Netezza's success spawned a host of competitors, and there are now more than two dozen players in the market. (See Table 1.)

Table 1. Types of Analytical Platforms

Today, the technology behind analytical platforms is diverse: appliances, columnar databases, in-memory databases, massively parallel processing (MPP) databases, file-based systems, nonrelational databases, and analytical services. What they all have in common, however, is that they provide significant improvements in price-performance, availability, load times, and manageability compared with general-purpose relational database management systems. Every analytical platform customer I've interviewed has cited order-of-magnitude performance gains that most didn't believe at first.

Moreover, many of these analytical platforms contain built-in analytical functions that make life easier for business analysts. These functions range from fuzzy matching algorithms and text analytics to data preparation and data mining functions. By putting functions in the database, analysts no longer have to craft complex, custom SQL or offboard data to analytical workstations, which limits the amount of data they can analyze and model.

Companies use analytical platforms to support free-standing sandboxes (described above) or as replacements for data warehouses running on MySQL and SQL Server, and occasionally major OLTP databases from Oracle and IBM. They also improve query performance for ad hoc analytical tools, especially those that connect directly to databases to run queries (versus those that download data to a local cache).

Analytical Tools

In 2010, vendors turned their attention to meeting the needs of power users after ten years of enhancing reporting and dashboard solutions for casual users. As a result, the number of analytical tools on the market has exploded.

Analytical tools come in all shapes and sizes. Analysts generally need one of every type of tool. Just as you wouldn't hire a carpenter to build an addition to your house with just one tool, you don't want to restrict an analyst to just one analytical tool. Like a carpenter, an analyst needs a different tool for every type of job they do. For instance, a typical analyst might need the following tools:

Excel to extract data from various sources, including local files, create reports, and share them with others via a corporate portal or server (managed Excel).
BI Search tools to issue ad hoc queries against a BI tool's metadata.
Planning tools (including Excel) to create strategic and tactical plans, each containing multiple scenarios.
Mashboards and ad hoc reporting tools to create ad hoc dashboards and reports on behalf of departmental colleagues.
Visual discovery tools to explore data in one or more sources of data and create interactive dashboards on behalf of departmental colleagues.
Multidimensional OLAP (MOLAP) tools to explore small and medium sets of data dimensionally at the speed of thought and run complex dimensional calculations.
Relational OLAP tools to explore large sets of data dimensionally and run complex calculations.
Text analytics tools to parse text data and put it in a relational structure for analysis.
Data mining tools to create descriptive and predictive models.
Hadoop and MapReduce to process large volumes of unstructured and semi-structured data in a parallel environment.
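
For readers unfamiliar with the MapReduce model mentioned in the last item, the sketch below shows its three phases (map, shuffle, reduce) on a toy word count. It is plain, single-process Python and does not use Hadoop; the function names and the sample log lines are made up for illustration. In a real cluster the map and reduce calls would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


# Hypothetical semi-structured log lines.
counts = word_count(["error disk full", "warning disk slow", "error network down"])
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the work can be split across nodes, which is what makes the approach suit large volumes of loosely structured data.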

Figure 2. Types of Analytical Tools

Figure 2 plots these tools on a graph where the x axis represents calculation complexity and the y axis represents data volumes. Ad hoc analytical tools for casual users (or more realistically super users) are clustered in the bottom left corner of the graph, while ad hoc tools for power users are clustered slightly above and to the right. Planning and scenario modeling tools cluster further to the right, offering slightly more calculation complexity against small volumes of data. High-powered analytical tools, which generally rely on machine learning algorithms and specialized analytical databases, cluster in the upper right quadrant.

Data

Business analysts function like one-man IT shops. They must access, integrate, clean and analyze data, and then present it to other users. Figure 3 depicts the typical workflow of a business analyst. If an organization doesn't have a mature data warehouse that contains cross-functional data at a granular level, its analysts often spend an inordinate amount of time sourcing, cleaning, and integrating data (steps 1 and 2 in the analyst workflow). They then create a multiplicity of analytical silos (step 5) when they publish data, much to the chagrin of the IT department.

Figure 3. Analyst Workflow

In the absence of a data warehouse that contains all the data they need, business analysts must function as one-man IT shops where they spend an inordinate amount of time iterating between collecting, integrating, and analyzing data. They run into trouble when they distribute their hand-crafted data sets broadly.

Data Warehouse. The most important way that organizations can improve the productivity and effectiveness of business analysts is to maintain a robust data warehousing environment that contains most of the data that analysts need to perform their work. This can take many years. In a fast-moving market where the company adds new products and features continuously, the data warehouse may never catch up. Nonetheless, it's important for organizations to continuously add new subject areas to the data warehouse; otherwise, business analysts have to spend hours or days gathering and integrating this data themselves.

Atomic Data. The data warehouse also needs to house atomic data, or data at the lowest level of transactional detail, not summary data. Analysts generally want the raw data because they can repurpose it in many different ways depending on the nature of the business questions they're addressing. This is the reason that highly skilled analysts like to access data directly from source systems or a data warehouse staging area. At the same time, less skilled analysts appreciate the heavy lifting done by the IT group to clean and integrate disparate data sets using common metrics, dimensions, and attributes. This base level of data standardization expedites their work.
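
A small sketch may help show why atomic data is easier to repurpose than summary data. The transactions, field names, and the `total_by` helper below are all hypothetical, chosen only to illustrate the point.

```python
from collections import defaultdict

# Hypothetical atomic data: one row per transaction.
transactions = [
    {"region": "East", "product": "A", "amount": 100},
    {"region": "East", "product": "B", "amount": 300},
    {"region": "West", "product": "A", "amount": 50},
    {"region": "West", "product": "B", "amount": 150},
]


def total_by(rows, key):
    """Sum the amount column, grouped by whichever attribute the question needs."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# The same atomic rows answer two different business questions.
by_region = total_by(transactions, "region")
by_product = total_by(transactions, "product")
print(by_region, by_product)
```

If the warehouse stored only the regional summary, the product breakdown could not be recovered from it, which is why analysts ask for the transaction-level detail.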

Once a BI team integrates a sufficient number of subject areas in a data warehouse at an atomic level of data, business analysts can have a field day. Instead of downloading data to an analytical workstation, which limits the amount of data they can analyze and process, they can now run calculations and models against the entire data warehouse using analytical functions that are built into the database or that they've created using database development toolkits. This improves the accuracy of their analyses and models and saves them considerable time.

Summary

The technical side of analytics is daunting. There are many moving parts that all have to work together. However, the most important part of the technical equation is the data. The old adage holds true: "garbage in, garbage out." Analysts can't deliver accurate insights if they don't have access to good quality data. And it's a waste of their time to spend days trying to prepare the data for analysis. A good analytics program is built on a solid data warehousing foundation that embeds analytical sandboxes tailored to the requirements of individual analysts.


Posted November 15, 2011 7:44 AM