Blog: Seth Grimes http://www.b-eye-network.com/blogs/grimes/ Welcome to my BeyeNETWORK Blog, which will focus on text analytics and other matters related to making sense of unstructured information sources in support of better enterprise decision making. Copyright 2012 Wed, 16 May 2012 08:29:36 -0700 http://www.movabletype.org/?v=4.261 http://blogs.law.harvard.edu/tech/rss
Please check out the Text Analytics Summit, Boston, June 12-13
The next Text Analytics Summit is coming up in four weeks. The June 12-13 conference will be the 8th annual Boston summit, the 8th Boston summit I've been privileged to chair. Will you join us?

The summit series was the first business-focused conference dedicated to BI on text, to techniques that turn text into data in the service of diverse applications. It remains the best, a testimony to outstanding speakers, great networking opportunities, and the unparalleled importance text plays in the Social, Big Data era.

As chair, I can extend to you a special $300 registration discount, via the code SG12. Use it and hear speakers on customer experience, marketing, e-discovery, financial services, and social-media analytics -- from organizations that include American Express, eBay, Fidelity Investments, Maritz Research, Monster.com, NASA, and Walt Disney. Visit www.textanalyticsnews.com for information, and follow the Registration link to register now.

Whether you're a veteran user or just getting started with text analytics, I hope you'll join us next month in Boston!
http://www.b-eye-network.com/blogs/grimes/archives/2012/05/please_check_ou.php BI Wed, 16 May 2012 08:29:36 -0700
Social Sentiment Matters!
Social sentiment matters -- customer opinions, attitudes, and emotions -- rants and raves that affect corporate reputation, provide valuable market and brand insights, and help you understand and engage with customers.

Yet there are too many low-grade tools out there. Sentiment analysis done right is about much, much more than simply scoring tweets and reviews. Sentiment analysis done right discovers business value in customer, consumer, and constituent content and behaviors, whether online, on-social, or in enterprise feedback.

The Sentiment Analysis Symposium, May 8 in New York, is the place to learn more.  This is an authoritative conference that brings together experts and practitioners from research and industry. You'll have an unmatched opportunity to learn about state-of-the-art technologies, how they are applied, return on investment, and how to choose from among the many available options.

Options help you understand social conversations and also direct and indirect feedback (such as surveys, contact-center notes, and warranty and insurance claims), online news, presentations, even scientific papers: Any information source that captures subjective information.

Advanced analyses monitor and measure sentiment and often much more, linking sentiment to demographics, customer profiles, behaviors, and transactional records. They help business analysts (and marketers, market researchers, customer service and support staff, product managers, and other users) get at root causes.  These are the explanations of behaviors captured in transaction and tracking records. Sentiment analysis means better targeted marketing, faster detection of opportunities and threats, brand-reputation protection, and the ultimate aim, profit.

Social Media revolve around feelings, attitudes, and emotions. Facebook and Twitter are major sources of sentiment (and also of complementary social connectedness data). Facebook and Twitter accounts have profile data attached to them, but nothing that matches the detailed, usably-structured information you can find on LinkedIn. Google is the ultimate information-access engine, capable of bringing together information from a huge variety of disparate sources, including sentiment information such as product, restaurant, and hotel ratings, although when corporations wish to find, mine, and exploit sentiment they need to turn to deeper BI and analytics tools.

There's no one-size-fits-all sentiment solution, not Google or one of the several as-a-service solutions out there or any of the capable analysis workbenches or social-media analytics tools. Instead, there's a whole spectrum of sentiment sources and analysis possibilities.

These are a sampling of the topics that will be covered at the May 8 Sentiment Analysis Symposium. You will meet and learn from experts, strategists, practitioners, researchers, and solution providers - experienced and new users and those evaluating solutions. A sample of speakers for the event includes the American Red Cross, Fidelity Investments, Thomson Reuters, American Express, Kraft Foods, and the Wall Street Journal.

For a crash course on technology concepts, you should also attend the May 7, half-day Practical Sentiment Analysis tutorial, taught by Prof. Bing Liu. (Check out this profile of Bing that appeared in the January 27, 2012 New York Times.)

To register please visit sentimentsymposium.com/registration.html today. See you there!

http://www.b-eye-network.com/blogs/grimes/archives/2012/04/social_sentimen.php text analytics Mon, 16 Apr 2012 05:45:28 -0700
The Science of Sentiment: An Interview with Seth Grimes
(Originally appeared at http://www.greenbookblog.org/2011/10/23/the-science-of-sentiment-an-interview-with-seth-grimes/)

Seth is an analytics strategist with Washington, DC-based Alta Plana Corporation. He is contributing editor at TechWeb's InformationWeek, founding chair of the Sentiment Analysis Symposium and the Text Analytics Summit, and text analytics channel expert for TechTarget's BeyeNETWORK.com. He is the leading industry analyst covering text analytics. Seth consults, writes, and speaks on business intelligence, data management and analysis systems, text mining, visualization, and related topics.

One of my favorite perks about this whole blogger thing is meeting some amazing people, and every once in a while even getting to know people I consider to be my "heroes". Seth Grimes is one of those folks. Since I began exploring the possibilities of social media analysis, text analytics, "Big Data", etc., over and over again I would run into some piece of genius written by Seth. In many ways our current careers run parallel in terms of overall positioning and strategies, but Seth has achieved a level of reach, influence, credibility, and thought leadership that I can only aspire to. He's just that good.

Seth invited me to attend his upcoming Sentiment Symposium as a guest blogger, but I just couldn't fit it into my schedule. Instead, I suggested that we do an email interview to talk a bit about the symposium, but also about his views on where market research fits (or doesn't fit) in the new Business Intelligence paradigm taking shape right now. I think his take is important for market research to hear and I think you'll find a lot of value in what he has to say.

Seth will also be a guest on Radio NewMR on Tuesday, October 25th.

We conducted this interview via email over the past week. Enjoy!

LM: Thanks for agreeing to chat with me Seth! As a long time fan,  it's a real honor. First off, you're deeply involved in text  analytics and sentiment analysis.  How and, more important, why?  And what's in it for market researchers?

SG: My thanks to you Lenny for inviting me.  I'm always honored when people value what I have to say!

How is easy: I'm a consultant and industry analyst.  I help user organizations and solution providers with analytics strategy.  This work involves business intelligence and text analytics and their application to meet business challenges.

Why?  Personally I'm fascinated with language and use of automated technologies -- natural language processing (NLP) and computational linguistics -- to help machines get at meaning and discover patterns.

What's in it for market researchers?  First off, the technology will help you automate analysis of free-text survey responses, verbatims.  There's huge potential ROI just in that step.  I know of one organization that, via use of text analytics software, was able to reduce processing of periodic surveys from one person-week to half a day's work.  But beyond surveys, the technologies allow you to turn the Net -- online and social sources -- into one huge focus group and to draw insights in near real-time.

LM:  You're involved in a number of conferences including the upcoming  Sentiment Analysis Symposium. Can you tell me a bit about these  events and what your goal is in having them?

SG: The conferences are a natural for me, an outgrowth of the writing, speaking, and consulting I've been doing for years.  So, we have -

The Sentiment Analysis Symposium, coming up November 8-9 in San Francisco, and the Text Analytics Summit -- where folks in market research, marketing, customer experience, and financial services should be in order to best exploit attitudes and emotions in online and enterprise sources, and where they should be heading. And I'm founding chair of the summit, which started in 2005 with similar goals, though covering a broader area.

The conferences are at the intersection of technology and business, about discovering insights in content that contribute to better decision making. They're about learning and making connections.

LM: Over the past year my career has followed a similar path  (consulting, writing, speaking leading into event organizing), so I  can relate to the trials and joys of putting these things together! I've found an intense interest in topics related to emerging business  intelligence technology from a fairly small segment of the marketing  and insights communities, and a lot of resistance to embracing these  new approaches from the rest. Has your experience been similar, or are you finding  growing interest from a broader audience? If so, what is fueling the  change?

SG: The audience is growing, as folks understand the technologies' potential and as they learn how leading-edge organizations are benefiting from them. I try to cross-pollinate where I can, to evangelize analytical technologies in business domains that could profit by adopting them, and to bring business concerns to technology companies.  There's significant need.

Text and content analytics appear first on my agenda, as means of discovering and exploiting the business value of content.  I'm also a booster of integrated analytics.  The aim is to link content-sourced information with data from transactional and operational systems and -- given new, renewed interest in location intelligence and in the "when" of clickstreams, transactions, and behavior tracking -- geospatial and temporal analyses, joined via semantics.

OK, that's a load of techno-speak, so I'll restate by saying this stuff is going to be -- getting to be -- huge.  It will surface in augmented reality and other consumer-facing applications, with smart content and advertising delivery, sensitive to context and situation, critical tools for business competitiveness.  Yet the mainstream BI and market-research worlds are only starting to clue in. Resistance? More, I'd say, a lack of vision in some cases and of time to consider the possibilities in others.

LM: There is certainly a lot of energy being applied to developing  new tools in this space; what is your take on the current "state of  the industry"? How close are we to fulfilling the potential of these technologies?

SG: I've been using a photo of Alan Turing in recent presentations.  Turing's 1930s work defined computability, and he was also a  marathoner who almost qualified for the 1948 British Olympic Team.  I use a photo of Turing running, and only once has someone in my audience recognized him.  I show Turing as a runner because we're engaged, with text analytics and sentiment analysis, on a course toward machine comprehension of human language and, complementing understanding, machine language generation, toward machines that can pass a modernized Turing Test.  The race is on, but we're still a long way from the finish line.

LM: What recent developments in the field are you most excited about,  and which company do you think is closest to "getting it right" in  terms of the practical application of these technologies?

SG: What's cool?  Beyond-polarity sentiment technologies, which detect mood and emotion, not just in text but also in speech.  Image and video analytics: Information extraction from even more sources. Identity resolution: What's someone's demographic and psychographic profile?  Question answering -- that is, semantically-infused information access that goes way beyond search -- the kind of stuff we're seeing in IBM Watson, Wolfram Alpha, and Apple's Siri.

Cool stuff, but in terms of meeting basic, right-now business needs, there are actually a fair number of companies getting it right.  I won't answer in print, but folks should get in touch or attend the conference to learn about them!

LM: LOL, fair enough! On that note, who are you most excited about hearing at the upcoming Symposium and why? 

SG: I curated the program to appeal to a business and technology common ground.  It's designed for people working in customer experience and CRM, marketing and market research, competitive intelligence, financial services, and so on, and not just myself.  You should check out the agenda, which is online; it's all great (although I admit to bias)!

But actually, what I'm really, really looking forward to is just chatting with people -- speakers and attendees -- during the breaks and the pre- and post-conference receptions.  Frankly, I learn the most in those informal, unscripted conversations.

LM: A lot of media coverage has been given to the idea of "Big Data", and I certainly see what appears to be a fairly rapid wave of consolidation, new entrants, and repositioning from the big tech firms taking place. It seems as if the focus of all that activity is to make a play for data synthesis/convergence to support the "Big Data" idea. What role are text analytics and sentiment analysis going to play in bringing this brave new world to life?

SG: Yeah, Big Data, this season's buzzword.  It's marketing speak, and we're already seeing backlash, that the challenge most often isn't volume, it's complexity and data integration.

Much of that complexity is created by the desire to bring text-sourced information -- facts and opinions -- into the analytical mix.  You need "natural language" to explain what you're seeing in the numbers... hence our conversation now.

LM: OK, last question Seth. What changes do you expect to see in the next 5 years in the market research space as a result of the advances in text analytics, sentiment analysis,  and "big data" integration/analysis? How  does the traditional survey/focus group paradigm fit into that future?

SG: I heard a speaker say, earlier this year, that with a "culture of listening," there is "no need for surveys." I posted a photo of his slide to Twitter -- the gentleman is director, consumer services at a large CPG company -- which ignited a Twitter exchange with consensus: No, you need surveys. For customer-experience initiatives, for market research, you can't learn everything you need to know without systematically asking a set of directed questions to a known set of respondents.  Text analytics, sentiment analysis: These technologies will help you do better surveys, with larger numbers of respondents, even flash surveys (let's call them) that can be turned around really quickly.

Focus groups, on the other hand, are slow, expensive and subjective.  As I see it, they are very replaceable by online/social-media monitoring.  Bye-bye.

We'll see even further linking of survey- and social-sourced insights with behavioral and psychographic profiles inferred from "big data" clickstream, location, service utilization, transactional, and other tracking data and mined from content.  This triangulation -- ensemble methods that coordinate and combine multiple models and approaches -- is the way to go. 

LM: You're preaching to the choir my friend; I couldn't agree more that the future is about the synthesis of multiple data streams. Thanks so much for the great conversation Seth and good luck with all of your efforts! 

http://www.b-eye-network.com/blogs/grimes/archives/2011/10/the_science_of.php Thu, 27 Oct 2011 05:48:54 -0700
Find value in online/social text and sentiment: free report, conferences
- My free report, "Text/Content Analytics 2011: User Perspectives on Solutions and Providers," is out. Are you looking for business value in "unstructured" social, online, or enterprise sources? My report will provide background information and "wisdom of the crowds" guidance you can use. Download the report free via altaplana.com/TA2011.

- For a deeper dive into customer/market attitudes and opinions, check out the Sentiment Analysis Symposium, November 9 in San Francisco, sentimentsymposium.com. We have speakers lined up from Zynga, HP, Amazon, TripAdvisor, the Red Cross, and more. They'll talk about the role sentiment plays in customer experience, marketing, market research, quality, and other applications.  It'll be a great day for learning and networking!  (BeyeNETWORK & TechTarget community members should use the registration code BEYE for $100 off.)

If you're new to sentiment analysis, the optional, half-day Practical Sentiment Analysis tutorial, http://sentimentsymposium.com/tutorial.html, is for you. The tutorial precedes the symposium on Tuesday afternoon, November 8.

- If you'd like a broader view of the text analytics market -- technology, solutions, and applications -- the Text Analytics Summit is for you. This year, 2011, we're going west, with a San Jose summit November 10-11.

For a really rich experience in text, content, and sentiment analysis, why not join me at both conferences?

Please get in touch with any questions.

Seth, @sethgrimes

http://www.b-eye-network.com/blogs/grimes/archives/2011/10/find_value_in_o.php BI Tue, 04 Oct 2011 07:52:49 -0700
Anderson Analytics Eyes Text Analytics Software Space with OdinText
Data mining and market research consultancy Anderson Analytics is beta testing a new text analytics software platform with two Fortune 500 clients. The platform, currently known as OdinText, is being developed specifically for use by market researchers. It is expected to be offered in a SaaS model and may be commercially available as early as summer 2011.

Tom is a text analytics early adopter and long-time proponent. He was one of the first to apply natural language processing (NLP) techniques for ad-hoc consumer research although he chose to focus on professional services in the years when market-research oriented solutions such as Buzz Metrics and Cymfony (later acquired by Nielsen and TNS, respectively) first emerged. Tom is also behind a Next Generation Market Research movement and says that Anderson Analytics' solution is based on the firm's years of research and experience.

Tom has responded to a few questions.

Seth: That Anderson Analytics has been quietly working on developing a text analytics software product is a welcome surprise. How did that come about?

Tom: We've always done a little internal development to fill gaps. I've been relatively open about my opinions on the state of text analytics software in general, and that there are no perfect tools out there. It's more about selecting the right tool or combination of tools for the right job and then knowing how to use them. I realized as early as late 2005 that the software out there really isn't developed with the analyst in mind, and that developing something seemed to make sense. Text Analytics has changed a lot since 2005, and it will continue to do so. So I doubt it comes as a surprise to anyone following this field that we're now in development of something more elaborate. Our feedback so far has been very positive.

Seth: There are many text tools on the market, so why now?

Tom: Well, I've spoken to market research directors at several Fortune 500 companies. Interestingly, most of them had similar experiences and opinions in regard to text analytics that I had. Many had tried, or requested proofs of concept from, large vendors in the text analytics industry and had been underwhelmed by what they saw, especially considering the price tag. It was clear to me that there was still a lot of room for something created by those who understand both text analytics and the needs of market research professionals.

Seth: Will this be a stand-alone, do-it-yourself tool or part of a larger service offering?

Tom: Probably both. Initially, an 'OdinText Lite' is intended as a DIY option, but I also think text analytics can add value to some of the other offerings of full service research firms. We also envision slightly different modifications depending on intended use. This in my opinion is one of the major failings of what some of the other large vendors are offering: tools that supposedly can handle any type of text regardless of source. If you build something for everything, then how accurate and useful can it possibly be for a domain expert? In this case the intended expert is the customer intelligence expert, not so much the PR or advertising executive.

Seth: I heard you've already been approached by one large agency regarding some sort of investment or partnership?

Tom: Well yes. However, Anderson Analytics is well positioned to get a quality product into the hands of our customers. That said I do try to keep an open mind if someone brings something additional to the table which can add value.

Seth: Thanks, Tom.

Tom: We're on the Web at OdinText.com if readers would like to get in touch.

http://www.b-eye-network.com/blogs/grimes/archives/2010/12/anderson_analyt.php text analytics Thu, 16 Dec 2010 05:24:49 -0700
Content Analytics in Five Easy Pieces
The Why of content analytics is clear: Quantify text and other "unstructured" sources and you can improve content delivery and findability, optimize storage, facilitate reuse, and extract business information. Why is easy; How is more complicated. A variety of semantic and analytical technologies come into play although different approaches best suit different business challenges, information types, and analysis styles.

We will explore the Why and How of content analytics at next week's Smart Content conference. Whether or not you can attend, five articles I've recently published will help you come up to speed. They're accessible -- not technical -- Five Easy Pieces if you will, with a great deal of richness, to be explored, beneath their surfaces.

Check out --

We'll hear more about success stories in diverse business domains at the conference; we'll explore all things content analytics. But even if you can't join us, keep a watch on the topic. "Unstructured" content, from online, social, and enterprise sources, is a next big thing for BI and content management both as boundaries expand to create a world of semantically unified business information.

http://www.b-eye-network.com/blogs/grimes/archives/2010/10/content_analyti.php Wed, 13 Oct 2010 10:26:49 -0700
Semantics and Analytics Unlock Value in Social and Online Content
Facebook, LinkedIn, Trip Advisor, and Twitter -- social media -- are almost incidental, replaceable tomorrow if another platform proves more attractive, powerful, and agile. (Think AOL and MySpace.) It's content that is king, the message delivered via the blog/e-mail/news/forum medium, generated by corporations and individual producers, traveling a two-way street between them and information-consumer audiences, who in turn comment, repost, and remix at will. And it's Smart Content, the focus of a conference I'm organizing, that allows producers and consumers alike to find the greatest profit, however measured, in online and enterprise content.

The information governance concept, beloved by corporations and consultants, barely applies. It's a challenge creating standards and maintaining content-production rules, more a drag than a benefit, given the highly competitive, fast changing, almost chaotic content marketplace. It's semantic and analytical technologies, which help you find and exploit patterns relevant to your goals, whether expanding readership or automating sense-making, that allow content producers and consumers to keep up, to create findable, flexible, and reusable content and to generate business-linked insights.

Smart Content, the conference, is really just a next step in the BI/analytics/applications market education and match-making I've been doing for years. The opportunity is huge -- business and technical -- a consequence of the value content analytics can bring to news and social media and Web and enterprise content.

We'll cover a spectrum of approaches -- as applied in media & publishing, advertising & on-line commerce, marketing and PR, finance, research, and the Semantic Web -- enhancing the value of news and social media and Web and enterprise content -- with links to enterprise information management, content strategy, BI, text analytics, and search.

We'll start with a Visionaries Panel with Dries Buytaert, CTO at Acquia and creator of Drupal, Natasha Fogel, EVP at Edelman StrategyOne, and Mark Stefik from XEROX PARC, followed by Jeff Fried of Microsoft explaining What Business Innovators Need to Know about Content Analytics.

We'll have talks by Rachel Lovinger, content strategy lead at Razorfish, Darrell W. Gunter, EVP/CMO at Collexis, and Randall Snare & Elizabeth McGuane of iQ Content, Dublin -- preceded by a series of lightning talks that will help attendees learn about the gamut of innovative smart-content solutions -- and followed by five Application Spotlight talks. Then stick around for a networking reception.

Smart Content will take place Tuesday, October 19 at the Executive Conference Center at 48th & Broadway in Manhattan. Learn more, and register today, at smartcontentconference.com. Register by September 10 for a $200 early-bird discount.

As my colleague Laurel Earhart, Smart Content marketing director, puts it, Smart Content is designed for decision makers, implementers, solution providers, and also investors. We're expecting great things to happen!

Lastly, I'm quite happy and appreciative to have TechTarget and the BeyeNETWORK as a Smart Content media sponsor, and the support of other prominent media and solution providers in the content management and analytics space. I'd love to have TechTarget readers and community members join us for what is sure to be an excellent program, due of course to the quality and expertise of the Smart Content speakers.

Please visit smartcontentconference.com for more information and to register. (Early-bird rates run through September 10.) Thanks!

http://www.b-eye-network.com/blogs/grimes/archives/2010/09/semantics_and_a.php BI Wed, 08 Sep 2010 08:58:51 -0700
Sentiment analysis, text analytics
Just a final note about the Sentiment Analysis Symposium, April 13 in New York. The symposium is a business-focused conference designed to educate users -- current and prospective -- on sentiment solutions for social media, public relations, customer experience, financial markets, and other applications.

I hope you can attend. We have a great program lined up. I'm especially looking forward to the lightning talks, a series of quick demo-presentations, and to "Selecting a Social Media Analysis Platform/Provider" with moderator Suresh Vittal from Forrester Research and social-analytics gurus Nathan Gilliatt and Marshall Sponder. Note that registration is 50% off for academics, government, and non-profits.

And this is a first note about this year's Text Analytics Summit, slated for May 25-26 in Boston. I'm program chair once again and will present an introductory workshop the afternoon of May 24. Please consider joining us!

Do get in touch about the conferences or anything else related to BI, text analytics, or sentiment analysis -- my coordinates are on-line -- or follow me on Twitter for updates.

Seth

http://www.b-eye-network.com/blogs/grimes/archives/2010/04/sentiment_analy_1.php business intelligence Thu, 08 Apr 2010 09:02:03 -0700
Examining the Big Questions Facing the Text Analytics Industry
IN-DEPTH: Alta Plana's Seth Grimes on how text analytics is expected to shape up in 2010
 
No single solution provider dominates text analytics. According to Seth Grimes, president, Alta Plana, no single provider dominates any significant text-analytics market segment.
 
"This is good news for current and prospective users," recently wrote Grimes, industry expert and Conference Chair at the upcoming 6th Annual Text Analytics Summit.
 
In order to know more about the latest trends and issues, Text Analytics News' Ritesh Gupta recently spoke to Grimes. Excerpts: 
 
Publishers, media portals, social-network and forum sites: they all realise that intelligent content tagging and conceptual search and semantic integration -- capabilities supported by text analytics and related semantic technologies -- are key to information findability, to a rich and satisfying user experience.  Last year, you told me the use of these technologies is on a fast track, a major growth area for text analytics and semantics. How do you assess the situation as of today?
 
Seth Grimes:
There's strong uptake on the publishing side, where organisations seek to make their information more findable and usable (and profitable), and even stronger uptake on the consumer side, where organisations analyse and integrate content-extracted information for a gamut of business needs.
 
One of the most interesting developments on the publishing side is the emergence of a wide range of APIs, application programming interfaces, that allow functions such as tagging, topic classification, and content enrichment (with semantically associated information) to be included in publishing processes. 
 
And on the information-consumption side, yes, there's semantic search and also semantically supported content integration that allows real-time information aggregation, essentially information aggregation and analysis dashboards that range from "listening platforms" to interfaces for BI-style analyses of text-sourced data.

... continued at http://social.textanalyticsnews.com/news/examining-big-questions-facing-text-analytics-industry
http://www.b-eye-network.com/blogs/grimes/archives/2010/03/examining_the_b.php unstructured information Mon, 15 Mar 2010 06:19:51 -0700
Sentiment Analysis 2010

For a forthcoming BeyeNETWORK article, Perspectives on Text Analytics in 2010, I asked solution-provider executives about the top challenges and opportunities they foresee for the coming year.  Eight out of ten responses cited social media and sentiment analysis: solutions that harvest opinions, attitudes, mood, and other subjective information from news, social media, surveys, and other forms of enterprise feedback.

Claire Thomas, text analytics lead at SAP, calls sentiment analysis "broadly applicable to various industries and initiatives." She captures the rationale for the upcoming Sentiment Analysis Symposium, a conference I am organizing, slated for April 13 in New York. 

Claire and SAP have not endorsed the symposium.  I'm quoting Claire and other industry leaders simply as evidence that a practical, solutions-focused sentiment-analysis forum, bridging technology and business concerns, is sorely needed.

Behind the need:

IBM SPSS Vice President Olivier Jouve notes that "Twitter, Facebook and other Web 2.0 media are the new critical sources for marketing" and a range of other enterprise functions.  These functions include what's often now called "social CRM," customer relationship management that taps social sources.  I predict that category will be short-lived, just as early-2000s "e-commerce" is now just one facet of modern, comprehensive business solutions.

The fact is, "more companies are looking at 360-degree views of customer feedback, and social media is a critical early warning (before a customer buys) and customer support (when a customer is having issues) indicator of customer experience," as Clarabridge CEO Sid Banerjee puts it.

I'd guess we can all agree with these statements, but how should organizations handle these social sources alongside traditional information sources?

According to Lexalytics CEO Jeff Catlin, in 2010, "sentiment will complete its transition to a 'checklist' feature that everyone who works in this space will have to provide.  All of the vendors (big and small) will claim to have sentiment."  So buyers face the problem of evaluating sometimes overstated claims in order to choose an appropriate solution.

What's appropriate?  Many user organizations can get by -- for the present -- with what Attensity CTO Ian Hersey characterizes as "social media aggregation and lightweight analytics (e.g., buzz analysis, media monitoring)." 

The Sentiment Analysis Symposium will be for organizations with needs that range from focused to sophisticated, for users who, in Ian's words, want "to incorporate the social media into the same analytical models as they use for their internal data and, more important, plug that social media into business processes."

These enterprise-scale users require solutions that, in Olivier Jouve's words, handle sources that are "voluminous, cryptic, multi-lingual and deeply interconnected" via "sophisticated data collection mechanisms, advanced multi-lingual analysis and the infrastructure to manage daily terabytes of data."

This being the BeyeNETWORK's text-analytics channel, I've quoted text-analytics industry leaders, although I expect media-monitoring, listening-platform, and brand/reputation management users, agencies, and solution providers will be well represented at the Sentiment Analysis Symposium.

So check out the event online, and follow @SentimentSymp or me, @SethGrimes, on Twitter for updates.  By the way, you have until February 3 to submit a speaking proposal.  And do send me your questions and comments.

 

]]>
http://www.b-eye-network.com/blogs/grimes/archives/2010/01/sentiment_analy.php http://www.b-eye-network.com/blogs/grimes/archives/2010/01/sentiment_analy.php Tue, 26 Jan 2010 08:55:47 -0700
Text Data Quality: Mistakes and More Text Data Quality.  It's titled Text Data Quality: Mistakes and More.  Stay tuned: The BeyeNETWORK will post another article on text data quality, this one focusing on sources, in early December.

Seth
]]>
http://www.b-eye-network.com/blogs/grimes/archives/2009/11/text_data_quali.php http://www.b-eye-network.com/blogs/grimes/archives/2009/11/text_data_quali.php text analytics Wed, 25 Nov 2009 08:37:36 -0700
Text Analytics New & Noteworthy, Fall 2009 Covering text analytics software, market, conference, and other news and developments to help readers better understand advances in Knowledge Discovery in Text...

Software

Attensity has released a new version of its Voice of the Customer product, Analyze for VOC Version 5.2. According to the company, the release includes accuracy enhancements, a new sentiment scoring feature, a RESTful real-time integration architecture, and analysis enhancements such as normalized time-series charts and calculated values. Version 5.2 also introduces new out-of-the-box reports for sentiment, Net Promoter Score (NPS) issues, customer churn, and competition.

Attensity has also released E-Service Version 6.1, an enhanced version of the company's application suite for customer service and support organizations.

In September, Clarabridge launched Clarabridge Social Media Analysis (SMA), which the company characterizes as "the industry's first advanced text analytics software that allows companies to integrate social media content into their existing internal enterprise feedback to create more useful customer analysis." The solution draws on Alterian Techrigy's warehouse of social media content, with data from blogs, Facebook, Twitter, YouTube, MySpace, and other social media sites.

Linguamatics released I2E 3.1 in October. According to the company, key enhancements include support for enterprise deployment, NLP-based querying of a greater choice of document types, a new I2E Chemistry option with substructure and structure similarity search powered by ChemAxon, extended results reporting, flexible hyperlinking from extracted entities to web resources such as gene identifiers, glossaries of biomedical terms, and chemical structure visualization.

Orchestr8, a developer of semantic tagging and text mining software, in September announced a new technology to complement their content analysis service, AlchemyAPI. Visual Constraints is designed for extraction of structured data (product info, pricing, descriptions, etc.) from Web pages. A new, October AlchemyAPI release extends extraction capabilities to quotations and named-entity coreferences.

Resources

The UK National Centre for Text Mining has posted presentation slides from the October workshop on Text Mining for Scholarly Communications and Repositories.

An e-book by Graham Wilcock of the University of Helsinki, "Introduction to Linguistic Annotation and Text Analytics," according to publisher Morgan & Claypool, "provides a basic introduction to both fields, and aims to show that good linguistic annotations are the essential foundation for good text analytics."

Conferences

Text Analysis Conference (TAC 2009) workshops will be held November 16-17, 2009 at the National Institute of Standards and Technology in Gaithersburg, Maryland, co-located with the Text REtrieval Conference (TREC), November 17-20, 2009.

A May 1, 2010 workshop on text mining will be held in Columbus, Ohio in conjunction with the 2010 SIAM International Conference on Data Mining (SDM 2010). The workshop is devoted to techniques of machine learning in conjunction with natural language processing, information extraction and algebraic/mathematical approaches to computational information retrieval.

The 1st Information Retrieval Facility Conference is slated for May 31, 2010 in Vienna, followed by the 3rd IRF Symposium, June 1-4, 2010. The conference aims to provide a forum for researchers in information retrieval, Semantic Web technologies for IR, natural language processing for IR, and large-scale or distributed computing for those areas. The symposium will especially focus on methodology and evaluation in patent searching and retrieval.

The North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT) 2010 conference will take place June 1-6, 2010 in Los Angeles, California. The conference covers a broad spectrum of disciplines working towards enabling intelligent systems to interact with humans using natural language, and towards enhancing human-human communication through services such as speech recognition, automatic translation, information retrieval, text summarization, and information extraction.

The 48th annual meeting of the Association for Computational Linguistics (ACL 2010) will be held in Uppsala, July 11-16, 2010. The ACL workshops will be held July 15-16.

The 33rd annual ACM SIGIR Conference is slated for 18-23 July 2010, Geneva, Switzerland. SIGIR is the major international forum for the presentation of new research results and for the demonstration of new systems and techniques in the broad field of information retrieval (IR).

The 23rd International Conference on Computational Linguistics (COLING 2010) will be held in Beijing, August 23-27, 2010. There will be pre-conference workshops on August 21-22 and post-conference workshops on August 28.

]]>
http://www.b-eye-network.com/blogs/grimes/archives/2009/11/text_analytics_7.php http://www.b-eye-network.com/blogs/grimes/archives/2009/11/text_analytics_7.php text analytics Sat, 14 Nov 2009 11:11:33 -0700
Text Analytics as a Meeting Point for Search and BI I recently received an inquiry from a student at a European management school who is writing a thesis about the relationship between search technology and business intelligence. She sees the two technologies as having a meeting point at text analytics and asked to pose a few questions on the topic. Many folks share her interest so my BeyeNETWORK blog seemed like a great place to share my responses. Here goes!

Management student> I have been struggling to differentiate some terms and to understand them more clearly. Therefore, my questions are related to that confusion. I would also like to hear your opinion on these two technologies (BI as software and enterprise search) and their uses of text analytics.

MS> What is the difference between text analytics and text mining? Is it related to structured vs. unstructured data? Or is text mining a subset of text analytics?

Seth> There isn't a significant difference. I find that text mining is used in areas that have applied the technology longer and that apply data mining. Examples include life sciences and intelligence (e.g., counter-terrorism). Text analytics is more often used in business.

MS> Is content analysis the same as text analysis (if we look at textual documents, not rich data)?

Seth> To me, "content" generally indicates managed information that is typically found in a repository and that is often published on the Web. In this sense, e-mail and IM messages, survey responses, contact center notes and transcripts, and other forms of text generated during business operations are not content. By that definition, content analysis that concerns text is a subset of text analysis.

But "content" does also cover video, audio, and other media, as you note. Content analysis would include these forms, whereas text analysis wouldn't, as you understand, except for work with their textual tags.

MS> Is there a difference between text analytics done by search technology and BI applications?

Seth> Text analytics that backs up search is meant to support information retrieval: indexing, summarizing, and ranking documents in response to a search query. TA enables semantic indexing by topics and themes and relationships in order to go beyond indexing based solely on keywords. TA in support of search can also enable smarter, and natural-language, query processing. The example I'll give is that you can enter "map oslo" in Google and get a map of Oslo, because Google is doing a combination of named entity recognition for the geographic area, Oslo, and pattern matching that understands that "map " is a request for a map.
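To make the "map oslo" example concrete, here is a minimal sketch of the two steps described above: a pattern match that spots the "map" request and a lookup that recognizes the place name. It is an illustration only, not how Google or any search engine actually works. The `GAZETTEER` dictionary, the `MAP_REQUEST` pattern, and the `interpret` function are hypothetical names invented for this sketch, and the tiny dictionary stands in for a real named entity recognition component.

```python
import re

# Hypothetical mini-gazetteer standing in for real named entity recognition.
GAZETTEER = {"oslo": "Oslo", "new york": "New York", "london": "London"}

# Pattern match: the word "map" followed by the remainder of the query.
MAP_REQUEST = re.compile(r"^\s*map\s+(?P<rest>.+?)\s*$", re.IGNORECASE)


def interpret(query):
    """Return ('map', place) for a recognized map request, else ('search', query)."""
    match = MAP_REQUEST.match(query)
    if match:
        # Entity recognition step: is the remainder a known geographic name?
        place = GAZETTEER.get(match.group("rest").lower())
        if place:
            return ("map", place)
    # Anything else falls back to ordinary keyword search.
    return ("search", query)


print(interpret("map oslo"))
print(interpret("map of the stars"))
```

Only queries that pass both steps are treated as map requests. A query such as "map of the stars" matches the pattern but names no known place, so in this sketch it falls through to ordinary search.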

TA in BI (outside use of search for BI) is different. A complete definition of BI includes treatment of information in textual and other forms, in databases, repositories, and on the Web. Search is a BI tool, and so is information extraction (a text analytics technique; information = entities, facts, topics, themes, etc.) into structured databases -- some see IE from text as equivalent to ETL for traditional databases -- and also analysis in the sense of data mining of text-extracted information. So when, for instance, you visualize a relationship network that includes people, companies, etc., based on text-extracted named entities and links (relationships, events, etc.), that's TA at work for BI.

MS> What are the fields that use text analytics the most? (Any industries in particular?)

Seth> Life sciences and intelligence (including counter-terrorism) were the earliest use cases with serious work going back to the late '90s and they're still very strong domains for TA. But now we're seeing use in a spectrum of business applications as well.

Seth> Let me refer you for this question and the next to a report I recently published, which you can download for free at http://altaplana.com/TA2009 .

MS> How would you describe the text analytics market?

[Seth> In my paper, I estimate a 2008 diversified, global market for text-analytics software and vendor-provided professional services at $350 million, representing 40% growth from 2007. I foresee sustained growth rates of up to 25% for 2009.]
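For readers who want the arithmetic behind those percentages, here is a rough sketch. It uses only the figures quoted above; the variable names are mine, and the 2009 number is simply the ceiling implied by "up to 25%", not a forecast taken from the paper.

```python
# Figures quoted in the answer above, in millions of US dollars.
market_2008 = 350.0
growth_2007_to_2008 = 0.40  # "40% growth from 2007"
max_growth_2009 = 0.25      # "up to 25% for 2009"

# Work backward to the 2007 market implied by 40% growth.
implied_2007 = market_2008 / (1 + growth_2007_to_2008)

# Upper bound for 2009 if growth reaches the full 25%.
ceiling_2009 = market_2008 * (1 + max_growth_2009)

print(round(implied_2007), ceiling_2009)
```

That works out to an implied 2007 market of about $250 million and a 2009 ceiling of $437.5 million.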

MS> There is a lot of talk about eDiscovery where text analytics plays a crucial role, but it is also one of the main markets for search technology. Are these two technologies (is it ok to call text analytics a technology?) coming together?

Seth> I believe that in e-discovery, the principal application of TA is (still) in support of search in the sense that I wrote about above, creating richer indexes that allow legal researchers (litigants) to respond faster and more comprehensively to discovery mandates. TA is only starting to be used by legal professionals for investigatory purposes, for what you could call "making the case." Compliance and fraud investigations, and risk management, are starting points in this type of use. But I don't think the technology is being used systematically by litigators yet. I do think we'll see a lot more of this investigatory type of use.

I hope you've found our Q&A useful! As always, if you have questions or comments, do get in touch.

]]>
http://www.b-eye-network.com/blogs/grimes/archives/2009/07/text_analytics_6.php http://www.b-eye-network.com/blogs/grimes/archives/2009/07/text_analytics_6.php text analytics Thu, 30 Jul 2009 13:13:47 -0700
Text Analytics New & Noteworthy, July 2009 Covering text analytics software, market, conference, and other news and developments to help KDnuggets readers better understand advances in Knowledge Discovery in Text...

Software

Orchestr8 released on June 18 a significant upgrade to its AlchemyAPI content analysis online service. According to the company, the update includes expanded language coverage (adding Portuguese and Swedish), enhanced text categorization, and integration with Linked Data standards. "AlchemyAPI is a web-based service that enriches a publisher's content through automated tagging, categorization, and semantic analysis available as both a free online API and commercial subscription service."

Attensity Group announced on July 8 the availability of its new, hosted Survey Advantage service at a $5,000 per month point of entry. Attensity Survey Advantage "enables departments within large organizations and government agencies to measure, chart and understand customer sentiment and top issues expressed in customer feedback surveys."

Book

Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit by Steven Bird, Ewan Klein, and Edward Loper was published in June 2009. "This book offers a highly accessible introduction to Natural Language Processing, the field that underpins a variety of language technologies ranging from predictive text and email filtering to automatic summarization and translation. You'll learn how to write Python programs to analyze the structure and meaning of texts, drawing on techniques from the fields of linguistics and artificial intelligence." Visit O'Reilly for information.

Conferences

A joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing will be held August 2-7, 2009 in Singapore. ACL-IJCNLP 2009 will cover a broad spectrum of technical areas related to natural language and computation.

The 2009 conference of the German Society for Computational Linguistics and Language Technology (GSCL) will include a workshop on the Unstructured Information Management Architecture (UIMA), September 30, 2009, in Potsdam, Germany. "Participants are invited to present applications realized using UIMA, general experiences using UIMA as a platform for natural language processing, as well as technical papers on particular aspects of the UIMA framework. Alternatives to and comparisons of other frameworks - e.g. GATE, LingPipe, etc. - with UIMA are of interest, too."

The third IEEE International Conference on Semantic Computing is slated to be held September 14-16, 2009 in Berkeley, California. ICSC 2009 is "an international forum for researchers and practitioners to present research that advances the state of the art and practice of Semantic Computing, as well as identifying the emerging research topics and defining the future of the field."

Recent Advances in Natural Language Processing (RANLP 2009) is slated for September 14-16, 2009 in Borovets, Bulgaria, preceded by September 12-13 tutorials and followed by associated workshops September 17-18.

The ACM Eighteenth Conference on Information and Knowledge Management (CIKM 2009) will take place in Hong Kong, November 2-6, 2009. The conference is sponsored by ACM SIGIR and SIGWEB.

Language and Technology Conference 2009: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC 2009) will take place November 6-8 in Poznan, Poland. "Human Language Technologies (HLT) continue to be a challenge for computer science, linguistics and related fields as these areas become an ever more essential element of our everyday technological environment... [creating] a favorable climate for the intensive exchange of novel ideas, concepts and solutions across initially distant disciplines."

Text Analysis Conference (TAC 2009) workshops will be held November 16-17, 2009 at the National Institute of Standards and Technology in Gaithersburg, Maryland, co-located with the Text REtrieval Conference (TREC), November 17-20, 2009.

Mining User-Generated Content for Security (MINUCS 2009) will take place December 9, 2009, in Venice, Italy, colocated with the First International Conference on User Centric Media (UCMedia 2009) in Venice, 9-11 December 2009. "The aim of this workshop is to bring together researchers from academia and industry who develop technologies for mining open-source user-generated textual data on the Web, as well as end-users interested in exploiting such technologies for knowledge discovery. The emphasis is placed on large-scale text mining systems..."

]]>
http://www.b-eye-network.com/blogs/grimes/archives/2009/07/text_analytics_5.php http://www.b-eye-network.com/blogs/grimes/archives/2009/07/text_analytics_5.php text analytics Thu, 30 Jul 2009 07:10:47 -0700
Talking Text Mining, Web 3.0 & the Semantic Web
I'm thrilled that a couple of attendees have blogged my talk, in particular, my June 18, 2009 London presentation.  John Welsh posted "Seth Grimes on the semantic web - but is B2B media ready to benefit?" and Peter Thomas's write-up was titled "Literary calculus?"  (I don't know why these two Brits are so enamored of question marks.) 

As an aside -- Peter had earlier posted a blog article, "A first for me...," noting that he "received my invitation to the event through Seth himself after having made contact with him on twitter.com."  I'm the first person Peter has met "IRL" -- you can surely guess what those letters stand for -- following our Twitter contact.  I similarly met analyst Merv Adrian that way.  We became Twitter friends through mutual real-life contacts.  Then one May morning he tweeted that he was in New York City for a meeting, near Penn Station.  It so happens that I was in New York to attend a different meeting, and I was staying near Penn Station.  Forty-five minutes later, Merv and I were sitting down to breakfast.  (For you foodies: We met at the Tick-Tock Diner at the corner of 34th St. and 8th Ave and both had the excellent corned-beef hash.)

Back to Text Mining, Web 3.0 & the Semantic Web -- If you'd like to listen to a recording of my Nstein webinar, please visit http://www.nstein.com/en/ondemand_webinars.php .  Let me know what you think!
]]>
http://www.b-eye-network.com/blogs/grimes/archives/2009/06/talking_text_mi.php http://www.b-eye-network.com/blogs/grimes/archives/2009/06/talking_text_mi.php Semantic Web Fri, 26 Jun 2009 15:13:20 -0700