Selling a Data Mining Project to Management

Originally published December 20, 2011

This article discusses a strategy and related tactics for selling a “new” data mining project to company management. It lays out a general framework for what I call “metrics-based selling.” It addresses the fundamental issues of demonstrating a well-thought-out business case, quantifying risk (as variance around an expected outcome), and developing the broad-based support necessary to ensure a positive outcome.

There are two types of issues in selling any project to management: rewards and risks. Each can be quantitative (directly measurable) or qualitative (strategic or positional). Both supporters and opponents of the project exploit the uncertainty in quantifying risks and rewards to achieve their own ends. In that uncertainty lies the doom of many projects. The challenge, then, is to convince the stakeholders – those who will benefit from the project, those who must participate in the project (by choice or otherwise), and those who will be relied upon for advice and insight regarding the project – that the rewards exceed the risks by the corporate hurdle rate (expected return) for projects.

The Approach

The strategy for addressing these issues is to develop relationships with all of the stakeholders early in the process, working both to address objections and to encourage supporters. The vehicle for accomplishing this is a metrics-based approach to project planning. At its core, the metrics-based approach engages each stakeholder in the portion of the process pertinent to them, explaining how their input will be used in the plan and how the various elements will be assembled. The premise is that if they agree with the assumptions and the method for combining them, they must agree with the conclusion. This is a powerful but incomplete argument. It does not address the human component.

Before beginning a project, attempt to reduce the uncertainty in the elements where it is greatest. In a data mining project, this is often some estimate of the “benefits” that will be achieved by applying the technology. These “benefits” are often technical in nature: improvement in an ROC curve, increased click-through rate on a coupon, or increased duration of a website visit. The best way to estimate them is to run a small “skunk works” project using your company’s data and benchmark the proposed approach against the current one.

Failing that, draw on your contacts in other companies to convey their experience. If you are dealing with a new vendor, ask for references in your (or a closely related) industry, and call them. These estimates can be used to do a “back-of-the-envelope” calculation of the benefits. If the benefits at this point do not meet the corporate return threshold for projects, the project may not be viable. Sometimes it is better to cut your losses early and move on, no matter how “cool” the technology may appear on the surface.

With a belief that the project is viable, the next step is to lay out how the project will impact the entire organization. Initially, there may be multiple options for how the impact plays out. For example, a data mining project to improve the ROC curve for a direct mail piece could be used to decrease the number of pieces mailed while achieving the same response rate, or to increase the absolute number of responses while mailing to the same number of individuals. Both of these should be evaluated in conjunction with the stakeholders who will be impacted – the printing and mailing department, and the response processing department. The critical question to ask each is the degree to which an additional level of activity would impact their costs.
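The two options for the direct mail example can be compared with a few lines of arithmetic. The sketch below is illustrative only: the decile tables, the pieces-per-decile figure, and the baseline mailing depth are all made-up assumptions, and "same response" is interpreted here as the same absolute number of responses. In practice these numbers would come from the skunk works benchmark.

```python
# All figures below are hypothetical assumptions for illustration.
PIECES_PER_DECILE = 10_000

# Responders found in each score decile, best-scored decile first.
current_model = [300, 220, 160, 120, 80, 50, 30, 20, 12, 8]
improved_model = [420, 250, 140, 80, 50, 25, 15, 10, 6, 4]


def responses_at_depth(deciles, n_deciles):
    """Total responders reached when mailing the top n_deciles."""
    return sum(deciles[:n_deciles])


def deciles_needed(deciles, target_responses):
    """Smallest number of deciles that reaches the target, or None."""
    total = 0
    for depth, responders in enumerate(deciles, start=1):
        total += responders
        if total >= target_responses:
            return depth
    return None


baseline_depth = 4  # assumed current practice: mail the top four deciles
baseline_responses = responses_at_depth(current_model, baseline_depth)

# Option A: hold responses constant and mail fewer pieces.
depth_a = deciles_needed(improved_model, baseline_responses)
pieces_saved = (baseline_depth - depth_a) * PIECES_PER_DECILE

# Option B: hold pieces constant and collect more responses.
responses_b = responses_at_depth(improved_model, baseline_depth)
extra_responses = responses_b - baseline_responses

print("Option A: pieces saved =", pieces_saved)
print("Option B: extra responses =", extra_responses)
```

Option A gives the printing and mailing department a number to price, and Option B gives the response processing department one. Which option is better depends on the cost each department attaches to its number.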

The diagram in Figure 1 shows one possible set of linked metrics. In this simplified diagram, an improved ROC curve impacts the number of pieces mailed, the response rates, and IT resources. The “IT Resources” box should be expanded. Generating the improved ROC curve could require the purchase of additional runtime software, which in turn may require additional equipment and data storage. This is a great opportunity to get the IT department on board with the project. Sometimes trade-offs between vendors may result in a substantially lower requirement for IT resources, and any hit to the improved ROC curve may be worth it.

Alternatively, it may be that other projects within the analytics department could be run less frequently without any material impact. The ROC curve could be used to increase response rate while keeping mailed pieces constant. In this event, printing costs, mailing costs, and equipment utilization would be unaffected. However, response rate would increase. A higher response rate will increase call center activity or result in additional sales, with a direct impact on the necessary inventory levels.

Depending on the expected increases, this could mean substantial additional costs if more equipment or storage is required. It may be that warehouse space is currently underutilized, and an increase in response rate could have a positive impact on the warehouse operation. Regardless, it will impact cash flow required to acquire additional inventory. This is an excellent opportunity to get supply chain staff involved.

Figure 1: Simplified Metric Chain

Working out how the various metrics impact the organization is a great opportunity to start conversations with other departments. Out of those discussions need to come cost curves that show the relationship between a change in the metric and the corresponding change in the unit costs. For some metrics, the unit cost will be a constant. For others, it may be a step function if additional capacity must be added in discrete increments. If capacity must be added, some estimate of the time frame required should also be included.
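A step-function cost curve of the kind described above can be sketched as follows. This is a minimal sketch, not a model of any real warehouse: the variable cost per unit, the size of a capacity block, and the cost of adding a block are hypothetical parameters that a department expert would supply.

```python
import math


def storage_cost(units, unit_variable_cost=2.0,
                 capacity_per_step=1000, cost_per_step=5000.0):
    """Total cost: a constant variable cost per unit plus a fixed cost
    for every block of capacity that must be in place (all assumed)."""
    steps = math.ceil(units / capacity_per_step) if units > 0 else 0
    return units * unit_variable_cost + steps * cost_per_step


def cost_per_unit(units, **params):
    """Average cost per unit at a given volume (units must be > 0)."""
    return storage_cost(units, **params) / units


for units in (500, 1000, 1001, 2000):
    print(units, round(cost_per_unit(units), 2))
```

With these assumed parameters, the cost per unit falls as volume approaches the capacity of a block, jumps when one more unit forces a new block to be added, and then falls again as the new block fills.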

Another very important element in discussing each metric is to have the experts define their range of expectations: the expected cost per unit, the best-case cost, and the worst-case cost. Figure 2 illustrates one possible set of curves. Per-unit costs typically decrease as the number of inventory units increases until expansion is required. The new costs must then be amortized, and eventually the cost per unit is reduced again.

Figure 2: Storage Costs per Unit as a Function of Inventory Units

This level of detail is helpful for understanding the “sweet spot” in terms of how to use the improved ROC curves. (Note: Even these new curves have uncertainty bands associated with them.) Often, it is best to replace the worst-case, expected, and best-case curves of Figure 2 with linear approximations that share the same slope and differ only in their constants. This kind of model can be constructed in Excel, with various add-ons used to establish an optimum as well as the variance around it.
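The same kind of model can also be sketched outside of Excel. The example below uses Python's standard library rather than a spreadsheet add-on, and it is only one possible reading of the approach: the shared slope, the three constants, and the choice of a triangular distribution over the constant are all assumptions made for illustration.

```python
import random
import statistics

# Hypothetical linear approximation: one shared slope, three constants.
SLOPE = -0.002  # assumed change in cost per unit for each extra unit
INTERCEPTS = {"best": 6.0, "expected": 7.5, "worst": 10.0}  # assumed


def unit_cost(units, scenario):
    """Cost per unit under one of the three linear scenarios."""
    return INTERCEPTS[scenario] + SLOPE * units


def simulate_total_cost(units, n_draws=10_000, seed=0):
    """Draw the constant from a triangular distribution spanning the
    best and worst cases (mode at the expected case) and return the
    mean and standard deviation of the resulting total cost."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        intercept = rng.triangular(
            INTERCEPTS["best"], INTERCEPTS["worst"], INTERCEPTS["expected"]
        )
        totals.append(units * (intercept + SLOPE * units))
    return statistics.mean(totals), statistics.stdev(totals)


mean_cost, sd_cost = simulate_total_cost(1000)
print("mean total cost:", round(mean_cost))
print("standard deviation:", round(sd_cost))
```

The standard deviation is the "variance around an expected outcome" that management will want to see next to the expected figure. Repeating the simulation over a range of volumes is one simple way to look for the sweet spot.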

The Results

When you are done, you will be able to demonstrate to management that:
  1. You have thoroughly understood the impact of this technology on the corporation.

  2. You have buy-in for the costs / benefits / risks associated with the impact of this system.

  3. You have been able to quantify the range of outcomes in terms of risk and reward.

  4. You have made tradeoffs in coming to these conclusions that address the concerns of the stakeholders.

  5. Most importantly, you have demonstrated your management savvy in making this proposal to improve the profitability of the corporation.

Yes, this is a lot of work. The first time is always the hardest. However, the result is reusable. The next time a project opportunity arises, you will have all of the tools in place to evaluate the outcomes. This approach facilitates translating technical metrics into a business case and builds the necessary relationships throughout the organization to ensure broad-based support. Done well, and provided that the rewards meet the corporate return requirements, this is an effective way to plan and sell a data mining project.

  • Casey Klimasauskas
    Casey spent 15+ years in data mining, starting as a co-founder and key member of the NeuralWare staff in 1987. He has also worked as a sales account manager. This article is a fusion of those experiences. Casey can be contacted at: Casey@Klimasauskas.com.

