Originally published February 10, 2009
How multifaceted is the use of business intelligence (BI) in your company? Most companies have developed enterprise reporting, effective executive dashboards, and even customer or supplier scorecards. These are examples of “structured analytics” – analytics based on a structured or official way of looking at a business and its operations. The structure is based upon the semantic consistency and hierarchical taxonomy of conformed dimensions and is often implemented in an OLAP environment for use in drill-down, roll-up and other point-and-click data navigation.
The strength of structured analytics is the consistent application of key performance indicators (KPIs) and other important metrics, which lets a business understand its results, both current and as trends over time, along with the factors underlying them. Structured analytics are well-suited for executives and operational managers who manage business operations and need to act quickly to resolve problems and turn information into action.
However, structured analytics are not well-suited for “out-of-the-box” thinking. It is important to break away from the blind spots that can occur with a complete reliance on the structure of your analytics and use unstructured analytics to find new insights into your business. There are two main types of unstructured analytics: statistical analysis and data discovery.
Many companies have gone beyond structured analytics and utilize statistical analysis, often called data mining, for actuarial analysis of insurance and financial products, behavioral analysis of customers, quality analysis of suppliers and many other uses. Statistical analysis applies mathematical models to find statistical correlations between elements of business data. The purpose is to develop a model that predicts an outcome with a measurable degree of certainty so the model may be applied to help create more of a desirable outcome or less of an undesirable one.
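As a minimal sketch of what "finding a statistical correlation and developing a predictive model" can mean in practice, the Python below computes a Pearson correlation and fits a least-squares line by hand. The data values and the names `pearson_r`, `fit_line`, `contacts` and `products_held` are hypothetical, chosen only for illustration; real statistical analysis tools offer far richer models and diagnostics.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical data: sales contacts per customer vs. products that customer holds.
contacts = [1, 2, 3, 4, 5]
products_held = [2, 4, 5, 4, 5]

r = pearson_r(contacts, products_held)
slope, intercept = fit_line(contacts, products_held)

# r squared is one simple measure of how much of the variation the line explains.
print(f"correlation r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"predicted products held at 6 contacts: {slope * 6 + intercept:.2f}")
```

A correlation of this kind only suggests a relationship; deciding whether the model suits the business situation still calls for the statistical judgment discussed in the following paragraphs.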
The strength of statistical analysis is in its ability to use the power of mathematics to create models that can help increase revenues by optimizing business actions. Potential applications are cross-selling and up-selling products, identifying high-value customers, maximizing supplier quality and so forth. There is one problem, however: to use statistical analysis well, users need some background in mathematics or statistics. Statistical analysis tools implement algorithms that can be run without this background, but an understanding of linear programming, distribution analysis, time-series analysis and other concepts helps ensure that the model developed is accurate for the business situation being analyzed.
Because of this skills requirement, the use of statistical analysis in business is limited to specialized areas. Developing a model also takes time: data must be gathered, correlations between data elements considered and weighed, and a statistical distribution selected. Nonetheless, statistical analysis is an important capability that should be put to wider use. There is also a method of unstructured analytics that does not require as rigorous a skill set: data discovery.
Data discovery tools promote ease of data analysis and end-user self-service by not adhering to the traditional BI model of ETL, quality control, data structuring and OLAP delivery of structured analytics. Instead, these tools automate data loading and table joining, typically based on matching column names, to establish de facto relationships between data elements and sets of values. The objective of these tools is to provide easy access to, and manipulation of, data.
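The idea of joining tables automatically on matching column names can be sketched in a few lines of Python. This is an illustrative assumption about how such a join might work, not a description of any particular product; the function names `shared_columns` and `auto_join` and the sample tables are hypothetical.

```python
def shared_columns(left, right):
    """Column names present in both tables (each table is a list of dicts)."""
    if not left or not right:
        return []
    return sorted(set(left[0]) & set(right[0]))


def auto_join(left, right):
    """Inner-join two tables on every column name they have in common."""
    keys = shared_columns(left, right)
    if not keys:
        raise ValueError("no matching column names to join on")
    # Index the right-hand table by its values in the shared columns.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


# Hypothetical tables; "customer_id" is the only column name they share.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 1, "amount": 100.0},
    {"order_id": 12, "customer_id": 3, "amount": 75.0},
]

result = auto_join(customers, orders)
for row in result:
    print(row)
```

In this sketch the order for customer 3 and the customer with no orders both drop out of the result without any warning. A de facto relationship inferred from column names is convenient, but it carries none of the checks a designed data model would apply.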
It is this ease of data access that makes data discovery tools powerful. Rather than relying on ETL, data discovery technologies usually sit on top of application databases, database views, and users’ personal Excel and Access data stores. Further, data discovery technologies facilitate the visualization and analysis of data by providing a user-friendly interface for data manipulation.
While sitting on top of production data does facilitate data access, data discovery tools typically bypass the data quality, control and audit safeguards built into well-designed BI environments. Problematic data can skew metrics and the decisions made using them, which is why data quality is important to successful business intelligence. For this reason, I find data discovery technology insufficient for a full-fledged BI solution when used as a stand-alone technology on top of production data. As a complementary technology sitting on top of a well-designed and integrated operational data store, however, it avoids these data quality, control and audit issues and becomes a powerful technology for data discovery (hence, the name), complementing structured analytics and statistical analysis capabilities.
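A small worked example shows how a single problematic record can skew a metric. The order amounts below are hypothetical, as is the plausibility limit of 10,000; the last raw value stands in for a data-entry error of the kind a quality-controlled environment would be expected to catch.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical order amounts as they should have been recorded.
clean = [2400.0, 2600.0, 2500.0, 2500.0]

# The same orders as read straight from production, with one amount
# mis-keyed (2500 entered as 250000).
raw = clean[:3] + [250000.0]

# A simple range check, with an assumed upper limit, flags the suspect value.
MAX_PLAUSIBLE = 10000.0
suspect = [v for v in raw if not 0 < v <= MAX_PLAUSIBLE]

print(f"average order, clean data: {average(clean):.2f}")
print(f"average order, raw data:   {average(raw):.2f}")
print(f"values failing the range check: {suspect}")
```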
Data discovery technology is just that: it allows exploratory, sandbox manipulation and investigation of data. Unconstrained by a defined structure or intent to develop a statistical model, data discovery allows a user to observe data values and relationships chosen by the user’s curiosity or interest. Discovering a new understanding requires a user knowledgeable about the business and its data, which means that data discovery tools are best suited to power users or in-depth analysts.
Data discovery observations are often not apparent in structured analytics or statistical analysis. They can evolve into additional metrics that become part of the structured analytics or new data correlations that improve a statistical analysis. In any case, data discovery is a powerful component in a company’s overall BI capabilities, one that complements the capabilities provided by structured analytics and statistical analysis.
Recent articles by Richard Skriletz