Monday, August 24, 2015

Predictive Lead Scoring using R (Part 2/2)

Introduction:

In part 1 of this blog series we saw what Predictive Lead Scoring is, and we introduced R, a widely used free software environment for statistical computing and graphics, including predictive modeling. In this concluding part 2 we look at the “Predictive Lead Scoring Life Cycle” and at how the rich collection of libraries in R can be harnessed to optimize the Predictive Lead Scoring process.


Predictive Lead Scoring Life Cycle

The following infographic illustrates the “Predictive Lead Scoring Life Cycle” as we understand it.

Phase 1:

The various disparate sources of customer and lead data, such as ERP and CRM systems, are identified.

Phase 2:

Marketing Automation tools extract lead data from the ERP/CRM systems and enrich it with “Market Intelligence”, viz. inputs from websites and social media.

Phase 3:

Data Integration tools are used to standardize and integrate data from disparate sources like ERP, CRM and Marketing Automation tools.
The integrated lead/customer data includes attributes such as:
Demographic/Explicit
  • Geographical location (country, politics, economy, etc.)
  • Industry
  • Company financials
  • Job title
Implicit
  • Website visits
  • Content downloads
  • Webinar attendance
  • Form completions, etc.
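
As a rough illustration of this integration step, the sketch below joins a CRM extract with a Marketing Automation extract in base R. The data frames, the column names (lead_id, industry, website_visits and so on) and the values are all made up for the example; a real pipeline would read them from the actual source systems.

```r
# Hypothetical CRM extract: explicit (demographic) attributes
crm <- data.frame(
  lead_id   = c(1, 2, 3),
  industry  = c("Retail", "Finance", "Retail"),
  job_title = c("CTO", "Analyst", "VP Sales"),
  stringsAsFactors = FALSE
)

# Hypothetical Marketing Automation extract: implicit (behavioural) attributes
web <- data.frame(
  lead_id        = c(1, 2, 4),
  website_visits = c(12, 3, 7),
  downloads      = c(4, 0, 1)
)

# Left join on lead_id so that every CRM lead is kept
leads <- merge(crm, web, by = "lead_id", all.x = TRUE)

# Leads with no recorded web activity come back as NA; treat that as zero activity
leads$website_visits[is.na(leads$website_visits)] <- 0
leads$downloads[is.na(leads$downloads)] <- 0

print(leads)
```

Treating missing activity as zero is a simplifying assumption; depending on the source systems, a missing value may instead mean that the lead was never tracked.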

Phase 4:

Predictive Lead Scoring applications use various machine learning algorithms, such as logistic regression, recursive partitioning trees, neural networks and random forests, first to identify the predictive attributes of the leads and then to assign each lead a propensity (probability) of converting into a Prospect.
Here we would like to highlight 2 distinct advantages of “Predictive” Lead Scoring.
  • Predictive Lead Scores are based on the statistical relationship between numerous attributes (customers’ behaviour) and outcomes (very hot, hot, warm or cold lead).
  • Predictive analytic algorithms discover data associations which may not be immediately obvious even to experienced sales people.
Both of these improve the accuracy of the lead scores. The Sales team can then concentrate its efforts and resources on the “Hot Leads” rather than the warm/cold leads, and more accurate scores should lead to higher conversion rates.
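
The scoring step in this phase can be sketched with logistic regression in base R. Everything below is illustrative: the data is simulated, the column names and the coefficients behind the simulated outcome are assumptions, and the cut-offs for the Cold/Warm/Hot/Very Hot bands are arbitrary choices rather than recommended thresholds.

```r
set.seed(42)
n <- 500

# Simulated behavioural attributes (hypothetical column names)
leads <- data.frame(
  website_visits = rpois(n, lambda = 5),
  downloads      = rpois(n, lambda = 1),
  webinar        = rbinom(n, 1, 0.3)
)

# Simulated outcome: the relationship below is assumed purely for the demo
log_odds <- -3 + 0.3 * leads$website_visits +
  0.6 * leads$downloads + 1.0 * leads$webinar
leads$converted <- rbinom(n, 1, plogis(log_odds))

# Logistic regression: glm() with a binomial family
fit <- glm(converted ~ website_visits + downloads + webinar,
           data = leads, family = binomial)

# Propensity to convert, as a probability between 0 and 1
leads$score <- predict(fit, newdata = leads, type = "response")

# Bucket the probabilities into bands (cut-offs are arbitrary assumptions)
leads$band <- cut(leads$score,
                  breaks = c(0, 0.25, 0.5, 0.75, 1),
                  labels = c("Cold", "Warm", "Hot", "Very Hot"),
                  include.lowest = TRUE)

# Highest-scoring leads first
print(head(leads[order(-leads$score), ]))
```

In practice the model would be fitted on historical leads with known outcomes and then applied to new leads, and it would be evaluated on held-out data before the scores are trusted.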

Phase 5:

Insights gained are presented to the top decision makers in the form of BI dashboards.
Moreover, the insights gained are also ploughed back into the CRM system, further enhancing the customer profiles.
This feedback loop results in increasingly precise lead score predictions over successive sales cycles, in a cost-efficient manner.
The R statistical programming language provides a rich set of libraries for Predictive Analytics, including the following, which are used in Predictive Lead Scoring:
  1. glm (with family = binomial), i.e. logistic regression, is used for basic predictive lead scoring. Note that lm fits linear regression and is not suited to a yes/no conversion outcome.
  2. rpart, or recursive partitioning trees, including converting trees to rules
  3. randomForest to identify predictive attributes of leads (importance, varImpPlot)
  4. The party package offers conditional inference trees, unbiased random forest variable importance and model-based trees.
  5. neuralnet, or Neural Networks
  6. bnlearn, or Bayesian Networks, to understand dependencies (and, under suitable assumptions, causal relationships) between scoring parameters
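
As one example from the list above, here is a minimal sketch of a recursive partitioning tree with rpart. It assumes the rpart package is installed (it ships with most standard R distributions as a recommended package). The simulated data, the column names and the strength of the simulated relationship are all assumptions made for illustration.

```r
library(rpart)

set.seed(1)
n <- 500

# Simulated leads (hypothetical column names)
leads <- data.frame(
  website_visits = rpois(n, lambda = 5),
  downloads      = rpois(n, lambda = 1)
)

# Simulated outcome, driven mainly by website visits (assumption for the demo)
p <- plogis(-4 + 0.8 * leads$website_visits)
leads$converted <- factor(ifelse(rbinom(n, 1, p) == 1, "yes", "no"),
                          levels = c("no", "yes"))

# Classification tree on the behavioural attributes
tree <- rpart(converted ~ website_visits + downloads,
              data = leads, method = "class")

# Class probabilities per lead: one column per outcome level
prob <- predict(tree, newdata = leads, type = "prob")

# Which attributes the tree relied on, and the fitted splits
print(tree$variable.importance)
print(tree)
```

The printed tree reads as a set of if/then splits, which is what makes it straightforward to turn a tree into scoring rules that a sales team can follow.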
If you would like to harness the power of “Predictive Lead Scoring using R” to improve your sales lead conversion rates, please feel free to get in touch with me.