In this month’s newsletter, I want to continue my review of new database technologies for the Web by looking at event analytics. Like MapReduce, which was discussed in my last article, the technologies I will be reviewing may be used in multiple application-processing environments, but the Web is ideally suited to this style of processing.
Event analytics is a complex topic that may involve many different types of technologies and products. It is, therefore, very difficult to discuss this field in depth in a few hundred words. My objective in this article is to provide an overview of event analytics and to show how this type of processing fits into an operational business intelligence (BI) environment. By necessity, to simplify the discussion, I will in certain cases generalize concepts and technology definitions.
What is an Event?
In her excellent paper “Event-Driven Architecture Overview,”[1] Brenda Michelson of the Patricia Seybold Group states that, “An event is a notable thing that happens inside or outside your business. An event (business or system) may signify a problem or impending problem, an opportunity, a threshold, or a deviation.”
A business event could be a user action, such as clicking a mouse button to enter information to order a new product. A system event could be an indication that a server or disk has failed, or that a sensor has recorded a change in state. Business and system events generate messages specifying the type of event that has occurred and providing event data that documents information about the event.
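As a rough illustration of what such an event message might carry, here is a minimal Python sketch. The class name, field names and sample values are all hypothetical; they are assumptions made for this example and are not taken from any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class EventMessage:
    """Hypothetical event message: the event type plus data documenting it."""
    category: str          # "business" or "system"
    event_type: str        # e.g. "order_placed", "disk_failed"
    occurred_at: datetime  # when the event happened
    data: dict[str, Any] = field(default_factory=dict)


# A business event: a user orders a product (sample values are made up).
order_event = EventMessage(
    category="business",
    event_type="order_placed",
    occurred_at=datetime(2009, 7, 1, 9, 30, tzinfo=timezone.utc),
    data={"order_id": "A-100", "product": "widget", "quantity": 2},
)

# A system event: a disk fails on a server (sample values are made up).
disk_event = EventMessage(
    category="system",
    event_type="disk_failed",
    occurred_at=datetime(2009, 7, 1, 9, 31, tzinfo=timezone.utc),
    data={"server": "db-01", "disk": "sda"},
)
```

The class is declared frozen so that its fields cannot be reassigned after creation, which fits the idea that an event is a record of something that has already happened.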
The data contained in an event message can be processed using the same techniques and technologies that process other types of data; but as we will discuss later in this article, there are also new technologies and products appearing on the market that enable organizations to analyze event data in ways that improve upon existing approaches.
Why Analyze Events?
To explain the business value of analyzing events, we first need to examine the role of events in existing business intelligence projects.
Data in operational systems is created and modified by business processes that are initiated by business events (see Figure 1) such as a product order, airline reservation, bank ATM withdrawal, stock trade, credit card transaction, etc. These business processes may in turn lead to the creation of other events that cause changes to resources in the same or different systems. The processing of a product order, for example, may lead to changes in order entry, inventory control, billing and shipping systems.
Figure 1: Traditional BI Processing Environment
Data in operational systems is captured, transformed and consolidated into a data warehouse for analysis by business intelligence applications. The data integration applications that capture operational data for use in a data warehouse may take a periodic snapshot of the source data (once a week, for example) or may capture every data change that occurs in the source systems. The actual method used will depend on the business requirements of the BI application.
The output from BI processing against a data warehouse consists of data analytics that reflect the status of business operations at specific moments in time. The time periods that can be analyzed are dependent on the frequency at which the data warehouse is updated. A weekly analysis of bank balances would require the data warehouse to be updated at least once a week. In the intra-day world of operational BI, more timely analyses and reports can be obtained by updating the data warehouse at short intervals.
It is important to note that a data warehouse records the data changes caused by business events, but not the actual business events themselves. A traditional business intelligence system is not directly aware of the events that flow through operational systems. This lack of knowledge means that it is difficult for a BI system to track and monitor the business performance of the flow of events that drive business processes and their underlying activities.
In the case of a product order, for example, it is not possible for a business intelligence system to track all the activities involved in processing the order as it flows from order entry to shipping and billing. A traditional BI system can analyze specific data points in a business process, but cannot track the complete business process workflow unless it sees all of the events involved in the flow. Also, any system errors that occur during the processing of the product order will not be seen. If the order was entered from a web retail store, the fact that the customer may have had difficulty entering the order due to technical problems or credit card validation issues would not be seen by the BI system.
Companies are, however, beginning to realize the value of capturing and analyzing events that flow through their operational systems. Many telecom companies, for example, now capture raw call detail records (CDRs) into event data stores for analysis. The CDR event data store can be used to monitor network traffic for computer switching or cell tower issues, discover billing errors by correlating the CDRs against billing information, and so forth.
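To make the idea of correlating CDRs against billing information a little more concrete, the following Python sketch compares a handful of call records with what was billed. The record layout, the sample data and the function name are assumptions made purely for illustration; real CDR formats and billing systems are far richer.

```python
# Hypothetical call detail records: (call_id, subscriber, duration_seconds)
cdrs = [
    ("c1", "alice", 120),
    ("c2", "alice", 60),
    ("c3", "bob", 300),
]

# Hypothetical billing information: call_id -> billed seconds
billed = {"c1": 120, "c2": 90}


def find_billing_discrepancies(cdrs, billed):
    """Return (call_id, issue) pairs where billing disagrees with the CDRs."""
    issues = []
    for call_id, _subscriber, duration in cdrs:
        if call_id not in billed:
            issues.append((call_id, "not billed"))
        elif billed[call_id] != duration:
            issues.append((call_id, "duration mismatch"))
    return issues


discrepancies = find_billing_discrepancies(cdrs, billed)
```

With this sample data, call c2 is flagged because the billed duration differs from the recorded one, and call c3 is flagged because it was never billed.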
In some cases, event data may be captured directly into a data warehouse instead of an event data store. One important characteristic of event data is that once it has been created it is not modified. From a data warehousing perspective, this lends a slightly different meaning to the term historical data. Event data is historical data in the sense that it ages, but we rarely have to be concerned about updates to the event data because once the event data is created it is always current!
But it is not always practical or even necessary to store all the detailed event data in a data warehouse or an event data store. The sheer data volume of the events and their low level of granularity may make this impractical. Event pre-processing applications that filter and consolidate notable events before they are loaded into a data warehouse or event data store can help mitigate some of these problems, but it is important to realize that pre-processing the event stream may remove useful information.
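The trade-off can be seen in a small Python sketch of a pre-processing step that filters and then consolidates events. The event layout and the rule for what counts as notable are assumptions chosen for this example only.

```python
from collections import Counter

# Hypothetical web log events: (page, http_status)
raw_events = [
    ("/home", 200),
    ("/checkout", 500),
    ("/home", 200),
    ("/checkout", 200),
    ("/checkout", 500),
]


def is_notable(event):
    """Assumed filter rule: keep only server errors (status 500 and above)."""
    _page, status = event
    return status >= 500


def consolidate(events):
    """Collapse individual events into one count per page."""
    return Counter(page for page, _status in events)


notable = [event for event in raw_events if is_notable(event)]
summary = consolidate(notable)
```

Five raw events are reduced to a single count of two errors for /checkout, which is much cheaper to load and store. The summary, however, no longer records how many requests succeeded, so an error rate cannot be derived from it later. That is the kind of useful information that pre-processing can remove.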
Last month’s newsletter article discussed how MapReduce could be used to filter web log data for loading into a data warehouse. The need, however, to make fast decisions and take rapid actions (to detect fraud, for example) may still rule out storing detailed event data prior to analysis.
Another reason for not storing event information in a data warehouse is that once the event analyses have been done, the detailed data may no longer be required. The results of the analyses can, of course, be stored in the data warehouse for future use. An example here involves companies that sell information services to website advertisers about ad placement on websites. The placement information and advice is produced by regularly analyzing huge volumes of web interactions over fixed periods of time. Once a web event analysis for a specific time period is complete, the detailed web event data is no longer required.
Analyzing Event Data In Flight
For certain kinds of analysis, the paradigm shift taking place in the business intelligence industry is to create event analytics by analyzing events in flight as they flow through operational systems, rather than capturing and storing the data first in a data warehouse before analyzing it (see Figure 2). This approach uses an analyze and store paradigm where the events are analyzed in flight and the results stored in a data warehouse, displayed on an operational dashboard, and/or routed to a downstream application.
Figure 2: Processing and Analyzing Operational Events
There are two main scenarios to consider when analyzing in-flight event data: simple event processing (SEP) and complex event processing (CEP). SEP is used to process and analyze one or more streams of events associated with a specific operational business process, such as an order entry process. SEP is used to produce analytics showing what is happening now in the operational process. It is suited to real-time or close to real-time operational BI.
CEP supports more complex analyses. It can process and analyze multiple event streams associated with multiple business processes. These event streams are often called an event cloud. CEP can correlate event streams in an event cloud looking for patterns that may not be immediately apparent to the observer. It could, for example, examine how buying patterns are affected by news reports, advertising campaigns, product reviews, etc. Like SEP, CEP can be used to produce event analytics showing what is happening now; but in addition to analyzing the current situation, it can also be used to predict possible future outcomes.
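A very simplified Python sketch of correlating two event streams follows. It pairs purchases with advertising events for the same product that occurred shortly beforehand. The stream layouts, the one-hour window and the function name are assumptions for illustration; an actual CEP engine would evaluate such a pattern continuously over unbounded streams rather than over two small lists.

```python
from datetime import datetime, timedelta

# Two hypothetical event streams, each as (timestamp, product)
ad_events = [
    (datetime(2009, 7, 1, 9, 0), "widget"),
]
purchase_events = [
    (datetime(2009, 7, 1, 8, 50), "widget"),  # before the ad was shown
    (datetime(2009, 7, 1, 9, 20), "widget"),  # within the window
    (datetime(2009, 7, 1, 9, 30), "gadget"),  # different product
    (datetime(2009, 7, 1, 11, 0), "widget"),  # after the window has passed
]


def purchases_following_ads(ads, purchases, window=timedelta(hours=1)):
    """Pair each purchase with any ad for the same product shown within
    `window` before the purchase."""
    matches = []
    for bought_at, product in purchases:
        for shown_at, advertised in ads:
            elapsed = bought_at - shown_at
            if advertised == product and timedelta(0) <= elapsed <= window:
                matches.append((product, shown_at, bought_at))
    return matches


correlated = purchases_following_ads(ad_events, purchase_events)
```

Only the 9:20 widget purchase is matched to the 9:00 ad. A match of this kind shows that two events occurred close together in time, not that one caused the other.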
Simple Event Processing
Many application middleware products support SEP, including application servers, enterprise service bus (ESB) products and business process management (BPM) tools. These products provide the ability to monitor and analyze events pertaining to a specific process as they flow through the middleware. Some vendors and analysts call this style of processing business activity monitoring (BAM). Some also call the resulting analytics process analytics.
Another way of producing these event analytics is to embed BI processing into the business process application or workflow. This can be achieved using in-line code or by defining the BI processing as a service and then calling the service from the operational application or workflow.
SEP is useful for monitoring and analyzing the business performance of particular operational processes to gain a better understanding of the factors that affect performance and to identify bottlenecks and high-cost activities that degrade performance and impact service levels.
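As an illustration of this style of monitoring, the Python sketch below keeps running statistics for the steps of a single process as events arrive and reports the slowest step. The class, the step names and the timings are all hypothetical; this is a sketch of the general idea, not of how any particular BAM product works.

```python
from collections import defaultdict


class ProcessMonitor:
    """Running per-step statistics for one business process."""

    def __init__(self):
        self._count = defaultdict(int)
        self._total_seconds = defaultdict(float)

    def observe(self, step, seconds):
        """Update the running totals as each event arrives.

        Only the totals are kept; the individual events are not stored.
        """
        self._count[step] += 1
        self._total_seconds[step] += seconds

    def average_seconds(self, step):
        """Average time spent in a step that has been observed at least once."""
        return self._total_seconds[step] / self._count[step]

    def slowest_step(self):
        """Return the observed step with the highest average time."""
        return max(self._count, key=self.average_seconds)


monitor = ProcessMonitor()
# Hypothetical order entry events: (step name, seconds taken)
for step, seconds in [("validate", 2.0), ("credit_check", 10.0),
                      ("validate", 4.0), ("credit_check", 14.0)]:
    monitor.observe(step, seconds)
```

Here the credit check averages 12 seconds against 3 seconds for validation, so it would be reported as the bottleneck. Because the statistics are updated event by event, the answer is available while the process is running.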
Complex Event Processing
There are several companies that specialize in products for querying, correlating and analyzing event streams. These products process multiple event streams using parallel computing and a provided CEP engine. They can also correlate event streams with data retrieved using traditional approaches (from a data warehouse, for example). A variety of different data manipulation languages are used; one common choice is StreamSQL, which extends SQL with capabilities to query continuous streams of data using features such as time windows. The event analytics produced by these products are sometimes called continuous analytics.
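The notion of a time window can be sketched in a few lines of Python. Note that this is not StreamSQL syntax; it is a plain Python approximation of a fixed, non-overlapping window (often called a tumbling window), and the function name and sample trades are assumptions for this example.

```python
def tumbling_counts(events, window_seconds):
    """Yield (window_start, count) each time a time window closes.

    `events` is an iterable of (timestamp_in_seconds, payload) pairs and is
    assumed to arrive in timestamp order. Windows with no events are skipped.
    """
    current_start = None
    count = 0
    for timestamp, _payload in events:
        window_start = (timestamp // window_seconds) * window_seconds
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The previous window is complete, so emit its result.
            yield current_start, count
            current_start, count = window_start, 0
        count += 1
    if current_start is not None:
        # Flush the last window once the input ends.
        yield current_start, count


# Hypothetical stream of trades: (seconds since start, symbol)
trades = [(1, "ABC"), (15, "XYZ"), (59, "ABC"), (60, "ABC"), (125, "XYZ")]
per_minute = list(tumbling_counts(trades, window_seconds=60))
```

With a 60-second window, the sample trades produce three results: three trades in the first minute and one in each of the next two. Because the function is a generator, each result is produced as soon as its window closes rather than after all the data has been collected, which is the essence of a continuous query.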
CEP is used by projects that require real-time monitoring and analysis of continuous event streams and event clouds. Examples include capital markets, RFID and sensor analysis, intelligence agencies, network and DP center performance and availability, and web computing (social computing analysis, web advertising, online gaming, etc.).
Analyzing Web Events
Web events record activity on the Internet, intranets and extranets. These events may reflect interactivity on websites, instant messages, emails, RSS feeds, etc. Web events are not necessarily different from other types of events. A web event could represent an order for a new product, which would then be processed in a similar fashion to that of a transaction event for an order from a traditional retail store. It is, however, convenient to separate web events into their own category because they reflect a more interactive environment and represent a new opportunity for event analytics to drive and optimize business performance. I will take a more detailed look at the topic of web events and web analytics in a future article entitled “Web Analytics: From Google to BAM and CEP.”
Summary
We can see from the above discussion that event analytics can be produced using traditional data warehousing techniques and/or by analyzing in-flight events and event streams. These analytics can be used for a variety of different types of applications, including web computing. For the web environment, event analytics extend the basic traffic analyses provided by traditional web analytics solutions with the ability to continuously analyze multiple event streams in real time to determine operational trends and patterns and to support predictive as well as reactive analytics.
I would like to thank Judith Davis, Claudia Imhoff and Neil Raden for their valuable input on this article.
End Notes:
1. Brenda Michelson, “Event-Driven Architecture Overview,” Patricia Seybold Group, February 2006.
SOURCE: Database Technology for the Web: Part 2 – Event Analytics
Colin White
Colin White is the founder of BI Research and president of DataBase Associates Inc. As an analyst, educator and writer, he is well known for his in-depth knowledge of data management, information integration, and business intelligence technologies and how they can be used for building the smart and agile business. With many years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles and papers on deploying new and evolving information technologies for business benefit and is a regular contributor to several leading print- and web-based industry journals. For ten years he was the conference chair of the Shared Insights Portals, Content Management, and Collaboration conference. He was also the conference director of the DB/EXPO trade show and conference.
Comments
Posted July 29, 2009 by Prashanth Maruthur Chakrapani
Colin, yet another great article. A great introduction to 'Event Analytics'. I was ignorant of this concept and the article made it very simple to understand. Looking forward to your articles in the future.