Blog: James Taylor

James Taylor

I will use this blog to discuss business challenges and how technologies like analytics, optimization and business rules can meet those challenges.

About the author

James is the CEO of Decision Management Solutions and works with clients to automate and improve the decisions underpinning their business. James is the leading expert in decision management and a passionate advocate of decisioning technologies: business rules, predictive analytics and data mining. James helps companies develop smarter and more agile processes and systems and has more than 20 years of experience developing software and solutions for clients. He has led decision management efforts for leading companies in insurance, banking, health management and telecommunications. James is a regular keynote speaker and trainer and he wrote Smart (Enough) Systems (Prentice Hall, 2007) with Neil Raden. James is a faculty member of the International Institute for Analytics.

I recently got the survey results from the annual data mining survey that Karl Rexer of Rexer Analytics runs. You can get the summary here or the full results from Karl but here are my thoughts:

  • Data mining is everywhere. The most cited areas are CRM / Marketing and Financial Services with a big lead over Retail and Telecom. Healthcare did poorly, no surprise.
  • The departments data miners most frequently work in are Marketing & Sales, Research & Development, and Risk.
  • Data miners' most commonly used algorithms are regression, decision trees, and cluster analysis - way ahead of the others. Text mining was back in the pack, which is interesting given the number of text mining presentations we saw at Predictive Analytics World.
  • Half of data miners say their results are helping to drive operational processes. This is encouraging as I think this is by far the most effective way to use predictive analytics.
  • Batch scoring, with the results stored in the database, came top of deployment approaches at 30%, followed by interactive real-time scoring at 21%, with 16% putting the model into a larger software project.
  • 60% of respondents say the results of their modeling are deployed always or most of the time. This is still not good enough - the other 40% are not seeing their models deployed reliably.
  • The top challenges facing data miners are dirty data, explaining data mining to others, and difficult access to data. However, in 2009 fewer data miners listed data quality and data access as challenges than in the previous year. 34% also have problems with IT.
  • Open-source tools Weka and R made substantial movement up data miners' tool rankings this year, and are now used by large numbers of both academic and for-profit data miners.
There's lots more in the survey so go get it and read it.

Posted March 26, 2010 2:03 PM

1 Comment

Good points, James.

We're planning the fourth annual data miner survey now, and we're going to probe more to see what people say they're really doing with text mining.

The 48-slide summary deck is free - people should email me if they want a copy: krexer@RexerAnalytics.com.

-- Karl
