Originally published November 28, 2007
Operational analytics is an important and growing field of business intelligence (BI). As is typical with any evolving technology, new and sometimes confusing buzzwords are suddenly introduced in technical articles and vendor marketing campaigns. Some of these buzzwords add valuable new ideas, while others are simply a repackaging of old ones. In this article, I explore operational analytics buzzwords, such as complex event processing (CEP) and stream analytics, and review their relationship to the familiar world of business intelligence.
The motivation for the article came from an excellent presentation by Hamid Pirahesh of IBM Research at a recent IBM conference in Las Vegas. The presentation, “Forces Defining the Next Generation of Business Intelligence and Analytics,” discussed many of the operational analytics concepts outlined in this article, and I borrow several of Hamid’s examples to demonstrate the potential business value of this aspect of business intelligence.
Analytics are used to track and analyze a company’s business operations and to answer three key questions: what has happened, what could happen, and what is happening now.
Analytics produced by traditional business intelligence applications and their underlying data warehouses help businesses understand what has happened in the past. There are three types of traditional analytics: strategic, tactical and operational. Strategic and tactical analytics are used for both long-term and short-term business decision making and action taking. If the latency of the historical data in the data warehouse is low enough, operational analytics can be generated from the data for aiding daily and intra-day business decisions and actions.
A data warehouse can also be used in conjunction with advanced analytical and data mining technologies to look for business trends and patterns in historical data, and the resulting predictive analytics are used to forecast what could happen in the future. Examples include detecting patterns that reflect fraudulent actions or the propensity of a person to purchase a product.
Certain business situations require operational analytics to track business operations as close to real time as possible. However, the latency introduced by moving data into a data warehouse is usually too high to answer real-time questions or to take real-time actions. One solution is to embed the analytical processing in the business process workflows and to analyze the operational processing as it occurs. The embedded processing may use the results (customer value scores or fraud detection models, for example) from traditional and predictive analytical processing to aid in determining what recommendations should be made, or actions should be taken, during the operational processing.
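As a minimal sketch of embedded analytics, the Python fragment below shows an operational step that consults a precomputed fraud score while the transaction is being processed. The score table, the threshold, the default score and the names `Payment` and `process_payment` are all hypothetical and chosen only for illustration; in practice the scores would come from the predictive models described above.

```python
from dataclasses import dataclass

# Hypothetical scores, standing in for the output of predictive models
# run against the data warehouse. Not taken from any real system.
FRAUD_SCORES = {"cust-1": 0.05, "cust-2": 0.92}
FRAUD_THRESHOLD = 0.8   # illustrative cutoff
DEFAULT_SCORE = 0.5     # assumed neutral score for unknown customers


@dataclass
class Payment:
    customer_id: str
    amount: float


def process_payment(payment: Payment) -> str:
    """Operational step with an analytic check embedded in the workflow."""
    score = FRAUD_SCORES.get(payment.customer_id, DEFAULT_SCORE)
    if score >= FRAUD_THRESHOLD:
        # The decision is made during processing, not after loading a warehouse.
        return "hold_for_review"
    return "approve"
```

The point of the design is that the expensive analysis happens ahead of time, and the operational workflow only performs a cheap lookup at decision time.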
Embedded operational analytics help applications and business users take close to real-time action. There is, however, yet another class of applications where even close to real-time analytics are not sufficient. An application such as algorithmic trading requires split-second actions to be taken based on analyzing hundreds of thousands of trading events per second. This style of analytical processing involves what is known as stream analytics because the events are analyzed as they stream across networks between devices and across systems. Stream processing requires long-running continuous queries as opposed to the one-time queries used by traditional business intelligence applications. Business applications that can exploit stream analytics include fraud detection such as insider trading, fleet management, IT alert analysis and RFID tracking analysis for manufacturing, supply chain optimization and pharmaceutical product tracking.
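The difference between a one-time query and a continuous query can be sketched in a few lines of Python. The generator below is only an illustration, not how any particular stream engine works: it assumes trades arrive in timestamp order as `(timestamp, price)` pairs, and the name `moving_average` and the window semantics are assumptions made for this example. Instead of running once against stored data, it emits a fresh result every time a new event arrives.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Trade = Tuple[float, float]  # (timestamp in seconds, price)


def moving_average(trades: Iterable[Trade],
                   window_seconds: float) -> Iterator[Tuple[float, float]]:
    """Continuous query: average price over a trailing time window.

    Yields (timestamp, average) once per arriving trade.
    Assumes trades arrive in non-decreasing timestamp order.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    window: Deque[Trade] = deque()
    total = 0.0
    for ts, price in trades:
        window.append((ts, price))
        total += price
        # Evict trades that have fallen out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_price = window.popleft()
            total -= old_price
        yield ts, total / len(window)
```

Because the query holds only the events inside the window, its memory use depends on the window size rather than on the total length of the stream, which in principle never ends.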
Stream analytics requires massive amounts of event processing, heavy use of parallel computing, and new database approaches and languages. The underlying technology that supports stream analytics is known as complex event processing (CEP).
The Wikipedia entry for CEP describes it as:
“A technology used for building and managing event-driven information systems. CEP is primarily an event processing concept that deals with the task of processing multiple events from an event cloud with the goal of identifying the meaningful events within the event cloud. Event stream processing is a related technology that focuses on processing streams of related data.”
An event cloud is a storage abstraction in which processing is initiated by the arrival of a new event in the cloud. A set of rules looks for patterns involving this event in relation to previously stored events (from the cloud) and reacts by creating one or more new (possibly composite) events.
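The event cloud idea can be illustrated with a small Python sketch. Everything here is hypothetical: the `Event` and `EventCloud` classes, the rule, and the 60-second threshold are invented for the example and do not describe any vendor's engine. The arrival of an event triggers the rules, each rule compares the new event with previously stored ones, and a match produces a new composite event that is itself published to the cloud.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Event:
    kind: str
    account: str
    timestamp: float  # seconds


Rule = Callable[[Event, Sequence[Event]], Optional[Event]]


class EventCloud:
    """Stores events and runs every rule when a new event arrives."""

    def __init__(self, rules: List[Rule]) -> None:
        self.events: List[Event] = []
        self.rules = rules

    def publish(self, event: Event) -> None:
        earlier = list(self.events)   # events stored before this one
        self.events.append(event)
        for rule in self.rules:
            derived = rule(event, earlier)
            if derived is not None:
                self.publish(derived)  # composite events enter the cloud too


def withdrawal_after_password_change(new: Event,
                                     earlier: Sequence[Event]) -> Optional[Event]:
    """Hypothetical rule: a withdrawal within 60 seconds of a password
    change on the same account yields a composite event."""
    if new.kind != "withdrawal":
        return None
    for old in earlier:
        if (old.kind == "password_change"
                and old.account == new.account
                and 0 <= new.timestamp - old.timestamp <= 60):
            return Event("suspicious_activity", new.account, new.timestamp)
    return None
```

A short usage example: publishing a `password_change` for an account followed by a `withdrawal` 30 seconds later leaves three events in the cloud, the third being the derived `suspicious_activity` event.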
Event stream processing (ESP) is concerned with processing and analyzing a series of related events in an event cloud. The Wikipedia entry for ESP notes that:
“ESP technologies include event visualization, event databases, event-driven middleware, and event processing languages. ESP deals with the task of processing multiple streams of event data with the goal of identifying the meaningful events within those streams, employing techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes. ESP enables applications such as algorithmic trading in financial services, RFID event processing applications, fraud detection, process monitoring, and location-based services in telecommunications.”
As the Wikipedia entry points out, ESP involves not only querying and analyzing the events flowing through the system, but also using predictive techniques to look for relationships and patterns in the event streams. ESP and stream analytics require continuous queries over time-varying data streams, and traditional database systems and BI applications are ill-equipped to handle this type of processing. Many aspects of data management and BI processing therefore need to change to accommodate stream analytics.
The issue here is that many ESP middleware vendors are developing stream-based analytical solutions that are completely disconnected from the traditional BI processing and data warehousing environment. This is somewhat analogous to the way business activity monitoring (BAM) solutions were developed independently from the existing business intelligence environment. BAM failed, and the solution was instead to build embedded BI applications that supported the concept of BAM but were also integrated with traditional BI processing.
One company that is investing substantial resources and development effort in CEP (and ESP) is IBM. This is not surprising, given that CEP requires significant amounts of computing power. IBM has several research and commercial projects and products related to CEP. Examples include System S (a distributed stream mining system), IBM Active Middleware Technology, or AMiT (a lightweight CEP engine), IBM ObjectGrid (an infrastructure that supports extreme transaction processing and CEP) and IBM WebSphere RFID Information Center. Key customers involved in these efforts include Amazon and Google.
IBM also has a relationship with StreamBase, which was founded by database luminary Dr. Mike Stonebraker. StreamBase commercializes stream technology conceived as part of the Aurora project at MIT, Brown University, and Brandeis University. The objective of StreamBase is to integrate real-time stream processing with historical processing. Other academic research in this area includes STREAM (Stanford University), Telegraph (UC Berkeley) and SASE (UC Berkeley and UMass Amherst).
StreamBase StreamSQL (and other SQL derivatives) allows users to mix access to non-persistent event streams with access to stored data in a database (an event store or data warehouse, for example). It also extends SQL to allow updates to stored tables from data in a stream. Thus, any part of the stream can be stored for further use (in an event store, for example).
Operational analytics offers significant benefits for optimizing daily business operations. For certain types of critical business applications, the market is moving toward real-time processing involving stream analytics and complex event processing. Not all companies will require this style of extreme processing.
An increasing number of buzzwords and technology terms are being introduced to describe the various types of operational analytics. My approach is to describe analytical processing as a branch of business intelligence that delivers three types of analytics: strategic, tactical and operational. All three types may be reactive or predictive in nature. In addition, there are three types of operational analytics: traditional, embedded and stream. Stream analytics supports real-time processing and employs key technologies such as complex event processing (CEP) and event stream processing (ESP).
Recent articles by Colin White