

Database Technology for the Web: Part 2 – Event Analytics

Originally published July 28, 2009

In this month’s newsletter, I want to continue my review of new database technologies for the Web by looking at event analytics. Like MapReduce, which was discussed in my last article, the technologies I will be reviewing may be used in multiple application-processing environments, but the Web is ideally suited to this style of processing.

Event analytics is a complex topic that may involve many different types of technologies and products. It is, therefore, very difficult to discuss this field in depth in a few hundred words. My objective in this article is to provide an overview of event analytics and to show how this type of processing fits into an operational business intelligence (BI) environment. To simplify the discussion, I will by necessity generalize some concepts and technology definitions.

What is an Event?

In her excellent paper “Event-Driven Architecture Overview,”1 Brenda Michelson of the Patricia Seybold Group states that, “An event is a notable thing that happens inside or outside your business. An event (business or system) may signify a problem or impending problem, an opportunity, a threshold, or a deviation.”

A business event could be a user action, such as clicking a mouse button to enter information to order a new product. A system event could be an indication that a server or disk has failed, or that a sensor has recorded a change in state. Business and system events generate messages specifying the type of event that has occurred and providing event data that documents information about the event.
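As a rough illustration of the last point, an event message can be modeled as a small record that carries the type of event plus the data documenting it. The sketch below is only an assumption for illustration: the class name `EventMessage` and its fields (`event_type`, `category`, `source`, `occurred_at`, `data`) are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)  # frozen: an event describes something that has already happened
class EventMessage:
    """Hypothetical event message: the type of event plus data documenting it."""
    event_type: str                 # what happened, e.g. "order.submitted"
    category: str                   # "business" or "system"
    source: str                     # system that emitted the event
    occurred_at: datetime           # when the event happened
    data: Dict[str, Any] = field(default_factory=dict)  # event-specific details


# A business event: a user submits a product order.
order_event = EventMessage(
    event_type="order.submitted",
    category="business",
    source="web-store",
    occurred_at=datetime(2009, 7, 28, 9, 30, tzinfo=timezone.utc),
    data={"order_id": "A-1001", "product": "widget", "quantity": 2},
)

# A system event: a disk reports a failure.
disk_event = EventMessage(
    event_type="disk.failed",
    category="system",
    source="server-17",
    occurred_at=datetime(2009, 7, 28, 9, 31, tzinfo=timezone.utc),
    data={"device": "/dev/sdb"},
)
```

Both kinds of event share one shape here, which is what allows the same downstream tooling to process them.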

The data contained in an event message can be processed using the same techniques and technologies that process other types of data; but as we will discuss later in this article, there are also new technologies and products appearing on the market that enable organizations to analyze event data in ways that improve upon existing approaches.

Why Analyze Events?

To explain the business value of analyzing events, we first need to examine the role of events in existing business intelligence projects.

Data in operational systems is created and modified by business processes that are initiated by business events (see Figure 1) such as a product order, airline reservation, bank ATM withdrawal, stock trade, credit card transaction, etc. These business processes may in turn lead to the creation of other events that cause changes to resources in the same or different systems. The processing of a product order, for example, may lead to changes in order entry, inventory control, billing and shipping systems.
 

Figure 1: Traditional BI Processing Environment

Data in operational systems is captured, transformed and consolidated into a data warehouse for analysis by business intelligence applications. The data integration applications that capture operational data for use in a data warehouse may take a periodic snapshot of the source data (once a week, for example) or may capture every data change that occurs in the source systems. The actual method used will depend on the business requirements of the BI application.

The output from BI processing against a data warehouse consists of data analytics that reflect the status of business operations at specific moments in time. The time periods that can be analyzed are dependent on the frequency at which the data warehouse is updated. A weekly analysis of bank balances would require the data warehouse to be updated at least once a week. In the intra-day world of operational BI, more timely analyses and reports can be obtained by updating the data warehouse at short intervals.

It is important to note that a data warehouse records the data changes caused by business events, but not the actual business events themselves. A traditional business intelligence system is not directly aware of the events that flow through operational systems. This lack of knowledge means that it is difficult for a BI system to track and monitor the business performance of the flow of events that drive business processes and their underlying activities.

In the case of a product order, for example, it is not possible for a business intelligence system to track all the activities involved in processing the order as it flows from order entry to shipping and billing. A traditional BI system can analyze specific data points in a business process, but cannot track the complete business process workflow unless it sees all of the events involved in the flow. Also, any system errors that occur during the processing of the product order will not be seen. If the order was entered from a web retail store, the fact that the customer may have had difficulty entering the order due to technical problems or credit card validation issues would not be seen by the BI system.

Companies are, however, beginning to realize the value of capturing and analyzing events that flow through their operational systems. Many telecom companies, for example, now capture raw call detail records (CDRs) into event data stores for analysis. The CDR event data store can be used to monitor network traffic for computer switching or cell tower issues, discover billing errors by correlating the CDRs against billing information, and so forth.
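To make the billing example concrete, here is a minimal sketch of correlating CDRs against billing information. It assumes a heavily simplified record layout, a hypothetical `(subscriber_id, call_seconds)` pair, and a hypothetical billing summary keyed by subscriber; real CDRs carry far more detail.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical, simplified call detail record: (subscriber_id, call_seconds).
CallRecord = Tuple[str, int]


def billed_vs_actual(
    cdrs: Iterable[CallRecord],
    billed_seconds: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (subscriber, actual_seconds, billed_seconds) wherever the two disagree."""
    # Total the call time actually recorded for each subscriber.
    actual: Dict[str, int] = defaultdict(int)
    for subscriber, seconds in cdrs:
        actual[subscriber] += seconds

    # Compare against what the billing system charged for.
    mismatches = []
    for subscriber in sorted(set(actual) | set(billed_seconds)):
        a = actual.get(subscriber, 0)
        b = billed_seconds.get(subscriber, 0)
        if a != b:
            mismatches.append((subscriber, a, b))
    return mismatches


cdrs = [("alice", 120), ("bob", 60), ("alice", 30), ("carol", 45)]
billing = {"alice": 150, "bob": 90}
mismatches = billed_vs_actual(cdrs, billing)
print(mismatches)  # [('bob', 60, 90), ('carol', 45, 0)]
```

Here one subscriber was billed for more time than the CDRs show, and another made calls that were never billed.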

In some cases, event data may be captured directly into a data warehouse instead of an event data store. One important characteristic of event data is that once it has been created it is not modified. From a data warehousing perspective, this lends a slightly different meaning to the term historical data. Event data is historical data in the sense that it ages, but we rarely have to be concerned about updates to the event data because once the event data is created it is always current!

But it is not always practical or even necessary to store all the detailed event data in a data warehouse or an event data store. The sheer data volume of the events and their low level of granularity may make this impractical. Event pre-processing applications that filter and consolidate notable events before they are loaded into a data warehouse or event data store can help mitigate some of these problems, but it is important to realize that pre-processing the event stream may remove useful information. Last month’s newsletter article discussed how MapReduce could be used to filter web log data for loading into a data warehouse. The need, however, to make fast decisions and take rapid actions (to detect fraud, for example) may still rule out storing detailed event data prior to analysis.
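The filter-and-consolidate step can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any product: the raw event layout `(minute, page, http_status)` is hypothetical, and treating failed requests (status 400 and above) as the "notable" events is an arbitrary choice made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical raw web event: (minute, page, http_status).
RawEvent = Tuple[int, str, int]


def preprocess(events: Iterable[RawEvent]) -> List[Dict[str, object]]:
    """Filter out non-notable events, then consolidate the rest per (minute, page)."""
    # Filter: keep only failed requests, assumed here to be the notable events.
    notable = ((minute, page) for minute, page, status in events if status >= 400)
    # Consolidate: one summary row per (minute, page) instead of one row per event.
    counts = Counter(notable)
    return [
        {"minute": minute, "page": page, "errors": n}
        for (minute, page), n in sorted(counts.items())
    ]


raw = [
    (0, "/checkout", 200),
    (0, "/checkout", 500),
    (0, "/checkout", 502),
    (1, "/cart", 404),
    (1, "/home", 200),
]
rows = preprocess(raw)
```

Five raw events become two summary rows. The summary no longer records which status codes occurred, a small instance of pre-processing removing potentially useful information.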

Another reason for not storing event information in a data warehouse is that once the event analyses have been done, the detailed data may no longer be required. The results of the analyses can, of course, be stored in the data warehouse for future use. An example here involves companies that sell and provide information services to website advertisers on ad placement on websites. The placement information and advice is produced by regularly analyzing huge volumes of web interactions over fixed periods of time. Once a web event analysis for a specific time period is complete, the detailed web event data is no longer required.

Analyzing Event Data In Flight

For certain kinds of analysis, the paradigm shift taking place in the business intelligence industry is to create event analytics by analyzing events in flight as they flow through operational systems, rather than capturing and storing the data first in a data warehouse before analyzing it (see Figure 2). This approach uses an analyze-and-store paradigm where the events are analyzed in flight and the results stored in a data warehouse, displayed on an operational dashboard, and/or routed to a downstream application.
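The analyze-and-store idea can be sketched as a loop that updates running results as each event arrives and periodically hands those results, never the raw events, to one or more destinations. Everything named here is an assumption for illustration: the `(event_type, amount)` event layout, the `analyze_and_store` function, and the in-memory list and dictionary standing in for a data warehouse and a dashboard.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical event: (event_type, amount).
Event = Tuple[str, float]
Sink = Callable[[Dict[str, float]], None]


def analyze_and_store(events: Iterable[Event], sinks: List[Sink], every: int = 3) -> None:
    """Update running totals per event type in flight; publish results, not raw events."""
    totals: Dict[str, float] = {}
    for count, (event_type, amount) in enumerate(events, start=1):
        totals[event_type] = totals.get(event_type, 0.0) + amount
        if count % every == 0:          # publish a result after every `every` events
            for sink in sinks:
                sink(dict(totals))      # copy so each sink receives a stable snapshot


warehouse: List[Dict[str, float]] = []   # stand-in for a data warehouse table
dashboard: Dict[str, float] = {}         # stand-in for an operational dashboard

stream = iter([
    ("order", 20.0), ("refund", 5.0), ("order", 30.0),
    ("order", 10.0), ("refund", 2.5), ("order", 40.0),
])
analyze_and_store(stream, sinks=[warehouse.append, dashboard.update])
```

After the stream is consumed, the warehouse stand-in holds two result snapshots and the dashboard stand-in holds the latest totals. The six raw events themselves are not retained anywhere.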


Figure 2: Processing and Analyzing Operational Events

There are two main scenarios to consider when analyzing in-flight event data: simple event processing (SEP) and complex event processing (CEP). SEP is used to process and analyze one or more streams of events associated with a specific operational business process, such as order entry. SEP is used to produce analytics showing what is happening now in the operational process. It is suited to real-time or close to real-time operational BI.
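As a small example of the kind of analytic SEP might produce, the sketch below measures how long orders take to move from one step of a single order process to the next. It assumes a hypothetical event layout of `(order_id, step, timestamp_in_seconds)` and assumes events arrive in time order; the step names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical order-process event: (order_id, step, timestamp_in_seconds).
StepEvent = Tuple[str, str, int]


def step_durations(events: Iterable[StepEvent]) -> Dict[str, List[int]]:
    """For one process stream, collect the seconds taken between consecutive steps."""
    last_seen: Dict[str, Tuple[str, int]] = {}           # order_id -> (step, timestamp)
    durations: Dict[str, List[int]] = defaultdict(list)  # "stepA->stepB" -> [seconds, ...]
    for order_id, step, ts in events:
        if order_id in last_seen:
            prev_step, prev_ts = last_seen[order_id]
            durations[f"{prev_step}->{step}"].append(ts - prev_ts)
        last_seen[order_id] = (step, ts)
    return dict(durations)


stream = [
    ("A1", "entered", 0), ("A2", "entered", 5),
    ("A1", "billed", 40), ("A2", "billed", 125),
    ("A1", "shipped", 100),
]
result = step_durations(stream)
averages = {transition: sum(v) / len(v) for transition, v in result.items()}
print(averages)  # {'entered->billed': 80.0, 'billed->shipped': 60.0}
```

A transition whose average is unusually high points at a bottleneck in the process, which a data warehouse that holds only the resulting data changes could not easily reveal.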

CEP supports more complex analyses. It can process and analyze multiple event streams associated with multiple business processes. These event streams are often called an event cloud. CEP can correlate event streams in an event cloud looking for patterns that may not be immediately apparent to the observer. It could, for example, examine how buying patterns are affected by news reports, advertising campaigns, product reviews, etc. Like SEP, CEP can be used to produce event analytics showing what is happening now; but in addition to analyzing the current situation, it can also be used to predict possible future outcomes.        
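The correlation idea can be illustrated with a toy example that merges two event streams in time order and counts the orders that follow an advertising campaign for the same product within a fixed time window. This is a sketch under stated assumptions, not how any CEP engine is implemented: the two `(timestamp, product)` streams, the function name and the window size are all hypothetical, and each stream is assumed to be sorted by timestamp already.

```python
import heapq
from typing import Dict, Iterable, Tuple

# Hypothetical streams, each sorted by timestamp (seconds): (timestamp, product).
StreamEvent = Tuple[int, str]


def orders_following_campaigns(
    campaigns: Iterable[StreamEvent],
    orders: Iterable[StreamEvent],
    window: int,
) -> Dict[Tuple[int, str], int]:
    """Count orders placed within `window` seconds after the latest campaign for a product."""
    latest_campaign: Dict[str, int] = {}        # product -> timestamp of latest campaign
    counts: Dict[Tuple[int, str], int] = {}     # (campaign timestamp, product) -> orders
    # Merge both streams into one time-ordered flow; campaigns sort first on ties.
    merged = heapq.merge(
        ((ts, 0, "campaign", product) for ts, product in campaigns),
        ((ts, 1, "order", product) for ts, product in orders),
    )
    for ts, _, kind, product in merged:
        if kind == "campaign":
            latest_campaign[product] = ts
            counts[(ts, product)] = 0
        elif product in latest_campaign and ts - latest_campaign[product] <= window:
            counts[(latest_campaign[product], product)] += 1
    return counts


campaigns = [(100, "widget"), (500, "gadget")]
orders = [
    (50, "widget"), (120, "widget"), (150, "gadget"),
    (190, "widget"), (350, "widget"), (510, "gadget"),
]
counts = orders_following_campaigns(campaigns, orders, window=200)
print(counts)  # {(100, 'widget'): 2, (500, 'gadget'): 1}
```

Orders placed before a campaign, or long after it, are not attributed to it. A real engine would apply far richer pattern matching across many more streams.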

Simple Event Processing

Many application middleware products support SEP, including application servers, enterprise service bus (ESB) products and business process management (BPM) tools. These products provide the ability to monitor and analyze events pertaining to a specific process as they flow through the middleware. Some vendors and analysts call this style of processing business activity monitoring (BAM). Some also call the analytics produced process analytics.

Another way of producing these event analytics is to embed BI processing into the business process application or workflow. This can be achieved using in-line code or by defining the BI processing as a service and then calling the service from the operational application or workflow.

SEP is useful for monitoring and analyzing the business performance of particular operational processes to gain a better understanding of the factors that affect performance and to identify bottlenecks and high-cost activities that degrade performance and impact service levels.

Complex Event Processing

Several companies specialize in products for querying, correlating and analyzing event streams. These products process multiple event streams using parallel computing and a provided CEP engine. They can also correlate event streams with data retrieved using traditional approaches (from a data warehouse, for example). A variety of data manipulation languages are used; a common one is StreamSQL, which extends SQL with capabilities to query continuous streams of data using features such as time windows. The event analytics produced by these products are sometimes called continuous analytics.
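The notion of a time window can be illustrated without any stream query language. The sketch below is plain Python, not StreamSQL, and only mimics the idea: it keeps the events from the last `window` seconds and emits an updated average each time a new event arrives. The `(timestamp_seconds, value)` event layout and the function name are assumptions made for the example.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical stream event: (timestamp_seconds, value).
Tick = Tuple[int, float]


def sliding_average(ticks: Iterable[Tick], window: int) -> Iterator[Tuple[int, float]]:
    """Yield (timestamp, average of values from the last `window` seconds) per event."""
    recent: Deque[Tick] = deque()   # events currently inside the window
    total = 0.0
    for ts, value in ticks:
        recent.append((ts, value))
        total += value
        # Evict events that have fallen out of the window (ts - window, ts].
        while recent[0][0] <= ts - window:
            _, old_value = recent.popleft()
            total -= old_value
        yield ts, total / len(recent)


ticks = [(0, 10.0), (4, 20.0), (9, 30.0), (12, 40.0)]
averages = list(sliding_average(ticks, window=10))
print(averages)  # [(0, 10.0), (4, 15.0), (9, 20.0), (12, 30.0)]
```

The query never finishes in the usual sense. It produces a new result for every arriving event, which is the sense in which the resulting analytics are continuous.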

CEP is used by projects that require real-time monitoring and analysis of continuous event streams and event clouds. Examples include capital markets, RFID and sensor analysis, intelligence agencies, network and DP center performance and availability, and web computing (social computing analysis, web advertising, online gaming, etc.).

Analyzing Web Events

Web events record activity on the Internet, intranets and extranets. These events may reflect interactivity on websites, instant messages, emails, RSS feeds, etc. Web events are not necessarily different from other types of events. A web event could represent an order for a new product, which would then be processed in a similar fashion to that of a transaction event for an order from a traditional retail store. It is, however, convenient to separate web events into their own category because they reflect a more interactive environment and represent a new opportunity for event analytics to drive and optimize business performance. I will take a more detailed look at the topic of web events and web analytics in a future article entitled “Web Analytics: From Google to BAM and CEP.”

Summary

We can see from the above discussion that event analytics can be produced using traditional data warehousing techniques and/or by analyzing in-flight events and event streams. These analytics can be used for a variety of different types of applications, including web computing. For the web environment, event analytics extend the basic traffic analyses provided by traditional web analytics solutions with the ability to continuously analyze multiple event streams in real time to determine operational trends and patterns and to support predictive as well as reactive analytics.

I would like to thank Judith Davis, Claudia Imhoff and Neil Raden for their valuable input on this article.  

End Notes:
  1. Brenda Michelson, “Event-Driven Architecture Overview,” Patricia Seybold Group, February 2006.



  • Colin White

    Colin White is the founder of BI Research and president of DataBase Associates Inc. As an analyst, educator and writer, he is well known for his in-depth knowledge of data management, information integration, and business intelligence technologies and how they can be used for building the smart and agile business. With many years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles and papers on deploying new and evolving information technologies for business benefit and is a regular contributor to several leading print- and web-based industry journals. For ten years he was the conference chair of the Shared Insights Portals, Content Management, and Collaboration conference. He was also the conference director of the DB/EXPO trade show and conference.



Comments


Posted July 29, 2009 by Prashanth Maruthur Chakrapani

Colin, yet another great article. A great introduction to 'Event Analytics'. I was ignorant of this concept and the article made it very simple to understand. Looking forward to your articles in the future.
