The text-analytics market grew rapidly in 2010, reaching an estimated global value of $835 million. Growth was driven by the technology's central role in social-media analysis and by text analytics' contribution to advanced, semantic search and search-based applications.
I have created annual text-analytics market-size estimates for several years, in advance of each spring's Text Analytics Summit. My 2010 estimate has two major components. It projects a 2009 figure of $425 million forward by 26% to $535 million, covering the core text-analytics market plus a valuation of text analytics’ contribution to information acquisition and management and enterprise applications. That 26% growth rate was nearly double the 13.4% rate that Gartner saw in 2010 for the larger business intelligence (BI), analytics, and performance management software market.
I created my first text-analytics market estimate in April 2008, calculating a $250 million global text-analytics market for 2007 and projecting 25% year-on-year growth. A year later, I revised that growth figure upward to 40% in computing a $350 million estimate for 2008. My 2009 estimate of $425 million indicated continued strong growth. All figures reflect the value of software licenses, service subscriptions, and vendor-provided technical support and professional services.
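As a quick check on the arithmetic behind these figures, here is a minimal Python sketch that recomputes the growth rates from the estimates quoted above. The variable and function names are illustrative, and the 2009-over-2008 growth rate is derived from the two estimates rather than quoted in the article.

```python
# Market-size estimates quoted in the article, in millions of US dollars.
estimates = {2007: 250, 2008: 350, 2009: 425}


def growth_pct(previous, current):
    """Year-on-year growth from `previous` to `current`, in percent."""
    return (current - previous) / previous * 100


# Growth implied by each pair of consecutive yearly estimates.
yoy = {
    year: round(growth_pct(estimates[year - 1], estimates[year]), 1)
    for year in sorted(estimates)
    if year - 1 in estimates
}

# Project the 2009 figure forward by the article's 26% growth rate.
projected_2010 = estimates[2009] * 1.26

# Compare that 26% rate with the 13.4% Gartner figure for the broader market.
gartner_ratio = 26 / 13.4

print(yoy)                       # growth implied by 2007->2008 and 2008->2009
print(round(projected_2010, 1))  # close to the $535 million component
print(round(gartner_ratio, 2))   # a little under 2, i.e. "nearly double"
```

The 2008 result matches the revised 40% figure, and projecting $425 million forward by 26% lands on roughly $535 million.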
My $835 million 2010 estimate involves a more sophisticated analysis than I had pursued in previous years, partitioning the overall text-analytics market into five categories:
- Core text analytics including natural language processing (NLP), information extraction, sentiment analysis, text-BI, and other semantic applications: tools and solutions, whether installed or delivered as a service. (NLP is the use of statistical and linguistic methods, augmented by machine learning, to identify and extract information from unstructured sources.) Leading technology and solution providers include Attensity, Basis Technology, Clarabridge, Daedalus, Expert System, GATE, IBM, Lexalytics, Linguamatics, OpenAmplify, Open Text, Python NLTK, SAP, SAS, SRA, and TEMIS.
- Information acquisition, by which I mean Web crawling and page retrieval for analytical purposes, along with basic information extraction and data cleansing and integration. Vendors include 80legs, Connotate, Informatica, ISYS Search, Kapow Software, Oracle, and SAP. I do not include in this category Web crawling for search-engine indexing.
- Semantic enterprise information management (EIM) and publishing applications that facilitate "smart content." Vendors include, most notably, EMC, IBM, JustSystems, MarkLogic, Open Text, and SchemaLogic.
- Enterprise applications of text analytics tap text/content sources to support business functions such as customer relationship management (CRM), market research, enterprise feedback management (EFM)/surveys, and competitive intelligence. Companies in this space will often license text technologies from third parties, including about half the companies in this solution-provider list: Allegiance, Cambridge Technologies, Crimson Hexagon, Cvent, Digimind, J.D. Power & Associates, Medallia, Mindshare, Radian6, Satmetrix, Scout Labs/Lithium, Sysomos, Verint, and Vovici.
- Search and search-based applications make use of extracted information to enhance information access, whether delivered via smarter search or in business-process-integrated applications. Providers include Attivio, Autonomy, Cataphora, Content Analyst, Dow Jones/Factiva, Endeca, Exalead, FirstRain, Google, IBM, Lixto, Microsoft, NetBase, Peer39, Recommind, Reed Elsevier, and Thomson Reuters.
(Disclosure: Open Text is a consulting client. In the last year, SAP paid me to write a white paper and Lexalytics and IBM were paid sponsors of conferences I run.)
I anticipate 2011 text-analytics market growth in the 25-40% range, again paced by the desire to automate analysis of online and social media, as recently reported by my colleague, Doug Henschen, in "Social Media Shapes Up As Next Analytic Frontier." Larger BI-suite vendors, notably IBM Cognos and SAP BusinessObjects, are embedding text analytics in their broad-market BI solutions, which will boost text-analytics uptake.
I'll leave you with one additional thought: It is a truism that 80 percent of business-relevant information originates in "unstructured" form, primarily text, yet a 2010 text-analytics market of $835 million represents only 8% of the $10.5 billion BI, analytics, and performance management software market seen by Gartner.
I don’t expect text's 8% share to become 80% any time soon: text is hard to analyze well, which limits ROI and therefore investment. But I do see that businesses are willing to spend on analytical technologies that support better decision making. Text analytics does just that, and its future is very promising.