Using Unstructured Business Content in Business Intelligence
Published: April 25, 2007
Business content will have an increasing effect on business intelligence processing over the next few years.

While estimates differ on how much unstructured and semi-structured business content exists in organizations, there is no question that it far exceeds the amount of structured business data managed by today’s data warehousing systems. To date, there has been limited interest in the use of business content in business intelligence (BI) and data warehousing, but this is beginning to change. The rapid growth in business content generated by web and collaborative applications, coupled with technology improvements in areas such as search and text mining, has created a situation where business content can now be used to extend the analytical power and decision support capabilities of business intelligence.

In this article, I look at the different types of business content that exist in organizations and their value in the decision-making process. I also review different approaches to accessing and analyzing this content in a BI environment. For simplicity, I use the term business content to refer to unstructured and semi-structured business information, and the term business data to refer to structured business data.

Types of Business Content
There are five main types of business process in IT systems: business transaction, master data, business intelligence, business planning, and business collaboration. Most business data is created and managed by business transaction and master data processes. Business content, on the other hand, is created and used by all five types of business process.

Unstructured and semi-structured information varies widely in both format and content. Unstructured information, for example, includes rich media files containing audio and video, and text files containing electronic forms, reports, and web pages. All of these file types may contain useful business information. Audio recordings of conversations between customers and support center staff provide valuable insight into the efficiency of support staff and into customers’ views on products and services. Similarly, electronic forms used by support staff may also contain information about customer attitudes and viewpoints. Product review web logs (blogs) offer valuable feedback on the acceptance of new products in the market, whereas product and services websites contain competitive pricing on everything from books and DVDs to airline fares.

A review of semi-structured information shows that a high percentage of this type of information is in an XML format. Tags in XML files provide some semantic information about the contents of the file. There is also an increasing number of industry XML vocabularies, or metamodels, that add further semantics to XML documents. One example is XBRL for reporting financial information. XML is becoming the standard approach for exchanging information between systems and between companies.
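
To make the point about XML tags concrete, here is a minimal sketch in Python of how tag names and attributes can be read as structured fields. The report fragment and its element names (report, revenue, netIncome) are invented for illustration and are not taken from the XBRL taxonomy; the function name report_to_rows is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment. The element and attribute names are
# invented for this example and are NOT real XBRL vocabulary.
REPORT_XML = """
<report company="Example Corp" period="2006-Q4">
  <revenue currency="USD">1250000</revenue>
  <netIncome currency="USD">310000</netIncome>
</report>
"""


def report_to_rows(xml_text):
    """Turn each child element into a flat row:
    (company, period, measure, currency, value)."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for item in root:
        rows.append((
            root.get("company"),   # attribute on the enclosing tag
            root.get("period"),
            item.tag,              # the tag name acts as the measure name
            item.get("currency"),
            float(item.text),
        ))
    return rows


rows = report_to_rows(REPORT_XML)
for row in rows:
    print(row)
```

Each row has the shape of a record that could be loaded into a warehouse table. The tag supplies the column meaning that plain text would lack, which is why semi-structured content is generally easier to bring into a BI environment than unstructured content.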

Examples of applications that can analyze unstructured and semi-structured content and thus enhance BI processing include customer and market intelligence, pricing optimization, customer sentiment and complaint analysis, product safety and quality analysis, regulatory compliance, legal discovery, fraud detection, financial analysis, and IP protection.  

Processing Business Content
Information in business content can add value to the existing business data used by business intelligence applications and their underlying data warehouses. In some cases, the business content may be converted into structured business data and loaded into a data warehouse, while in other situations, the business content may be accessed dynamically and used in conjunction with the results obtained from BI processing.

There are six main approaches to accessing and using business content in a business intelligence environment:

  • BI search, where search technology gives users access to BI business content such as metadata, queries, and analyses, and to BI output such as reports and metrics. BI search can also supply connectors to a BI system for use by enterprise search. BI search is often provided as a component of a BI portal.

  • Enterprise search for locating and accessing corporate business content (including BI content) and business data. Compared with Internet search approaches, enterprise search adds techniques such as guided navigation, semantic search, and result clustering. Enterprise search is usually used in conjunction with an enterprise portal.

  • Business content analytics that are created by accessing and analyzing the contents of a search repository or the results of search operations. Often these analytics are delivered to business users in the form of a BI dashboard, or through a BI portal. 

  • Business content exploration that extracts metadata (facts, concepts, relationships) from business content. This extracted metadata may be used to build a business taxonomy, or by business content categorization tools, enterprise search tools, business content analytical applications, and data integration applications that capture and transform business content for loading into a data warehouse.

  • Business content federation techniques that use federated database queries to access and analyze business content and business data that may be maintained in multiple files and databases.

  • Business content integration applications that capture, transform, and integrate business content into the BI and data warehousing environment.

These six approaches are illustrated in Figure 1. (The difference between managed content and unmanaged content in the diagram is that managed content is usually maintained by a content management system and subject to governance procedures, whereas unmanaged content exists in standalone files and databases.) With all six approaches, it is important to understand the capture, transformation, and delivery techniques that are supported and used.

Figure 1
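
As a rough illustration of the business content integration approach (capture, transform, and load), the following Python sketch reduces free-text support notes to structured rows and loads them into a table that can be queried like any other business data. Everything in it is an assumption made for the example: the sample notes, the keyword lists, the table name note_sentiment, and the keyword-counting score, which is a crude stand-in for a real text mining tool. An in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical keyword lists: a crude stand-in for text mining.
NEGATIVE = {"broken", "late", "refund", "slow"}
POSITIVE = {"great", "fast", "helpful", "love"}

# Capture: invented sample notes as (note_id, product, free text).
NOTES = [
    (1, "router", "Delivery was late and the unit arrived broken."),
    (2, "router", "Support staff were helpful."),
    (3, "modem", "Great product, love it!"),
]


def transform(note_id, product, text):
    """Reduce one free-text note to a structured row.
    Score = positive keyword count minus negative keyword count."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return (note_id, product, score)


def load(notes):
    """Load the transformed rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE note_sentiment "
        "(note_id INTEGER, product TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO note_sentiment VALUES (?, ?, ?)",
        [transform(*note) for note in notes],
    )
    return conn


conn = load(NOTES)
# Delivery: the content is now ordinary structured data for BI queries.
summary = conn.execute(
    "SELECT product, SUM(score) FROM note_sentiment "
    "GROUP BY product ORDER BY product"
).fetchall()
print(summary)
```

The design choice worth noting is that the transformation step decides what survives: once a note is reduced to a product and a score, the original wording is no longer available to the BI application. That is the trade-off between converting content into business data and accessing it dynamically, as described at the start of this section.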

The Impact of Business Content on Business Intelligence
Business content will have an increasing effect on business intelligence processing over the next few years, and it is essential that business intelligence and data warehousing designers, architects, and specialists thoroughly understand how to use this type of information and exploit the significant benefits it offers to the business. It will also become increasingly important for BI staff to work closely with their counterparts who are responsible for building the systems that manage and process business content.

 


Colin White -

Colin is the Founder of BI Research. He is well known for his in-depth knowledge of leading-edge business intelligence and business integration technologies, and how they can be used to build a smart and agile business. With more than 35 years of IT experience, he has consulted for dozens of companies throughout the world and is a frequent speaker at leading IT events. Colin has written numerous articles on business intelligence and enterprise business integration. Colin has an expert channel and blog on the B-Eye-Network and can be reached at cwhite@bi-research.com.

