Innovative Methodology in Market Research
Over the past 70 years, Shapley Value-based approaches have been adopted in many industries, including market research. This blog post describes the three most popular applications of the concept that Big Village uses to meet clients’ needs: Shapley Value Attribution, Shapley Value Regression, and Shapley Value Line Optimization.
Shapley Value (SV) is a central concept in cooperative game theory. It is named after Lloyd Shapley, an American mathematician and economist, who introduced it in 1951. Shapley went on to share the 2012 Nobel Memorial Prize in Economic Sciences with Alvin Roth for work on stable allocations and market design.
What is SV? Imagine the following situation: a team plays a cooperative game and obtains a certain overall gain from this cooperation. Some players may contribute more to the gain than others. SV quantifies each player’s strength within the team and assesses each player’s individual contribution to the gain. It assigns each player a value equal to that player’s average marginal contribution across all possible combinations, or coalitions, of players within the team. From a game-theoretic perspective, a player can be any entity in some context, such as athletes on a football team, allied countries in a war, or a set of SKUs in a product line. SV provides a robust and reliable model in all these cases.
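To make the definition concrete, here is a minimal sketch of an exact Shapley Value computation for a small game. The function name `shapley_values` and the three-player payoff table are hypothetical and chosen only for illustration; the calculation enumerates every coalition, so it is practical only for a handful of players.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small cooperative game.

    `value` maps a frozenset of players to that coalition's payoff.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Probability that p joins right after a given coalition of size k
            # when players arrive in a uniformly random order.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p to this coalition.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical payoffs for a three-player team (illustrative numbers only).
payoffs = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 30,
    frozenset("AB"): 40, frozenset("AC"): 50, frozenset("BC"): 60,
    frozenset("ABC"): 90,
}
sv = shapley_values(["A", "B", "C"], payoffs.__getitem__)
print(sv)
```

In this toy game the three values add up to the payoff of the full team (90), which is the "fair division" property that makes SV attractive: the whole gain is distributed, and nothing is counted twice.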
Since its introduction, SV has gained enormous popularity, has been adapted to different contexts, and has been applied in many industries. Let us review a few interesting ways to use SV in marketing and market research.
Shapley Value Attribution
Traditional approaches to marketing attribution are all based on predefined sets of rules. That means researchers have to decide upfront how they want the credit for conversions to be divided between channels. In the last few years, new data-driven techniques have been adopted for this class of problems, and Shapley Value is one of them. The technique is now widely used, for example by Google Analytics.
In the context of attribution, campaign channels are the players of the game, and they form coalitions by interacting with accounts throughout each buyer journey. Shapley Value provides a robust way to measure channel influence and fairly divide the credit for conversions between the channels. With SV, analysts can examine how channels interact within buyers’ journeys, assess each channel’s individual contribution to the total payoff, and use the results to optimize marketing investment and improve sales results.
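The idea can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not a description of any vendor’s implementation: the channel names and conversion counts are made up, and the value of a coalition is assumed to be the number of conversions from journeys that touched only channels in that coalition (one common choice among several).

```python
from itertools import permutations

# Hypothetical data: conversions from journeys that touched exactly this
# set of channels (illustrative numbers, not real results).
conversions = {
    frozenset({"search"}): 30,
    frozenset({"email"}): 10,
    frozenset({"social"}): 5,
    frozenset({"search", "email"}): 25,
    frozenset({"search", "social"}): 15,
    frozenset({"email", "social"}): 5,
    frozenset({"search", "email", "social"}): 10,
}
channels = ["search", "email", "social"]


def coalition_value(coalition):
    """Assumed payoff: conversions from journeys using only these channels."""
    return sum(n for touched, n in conversions.items() if touched <= coalition)


def attribute(channels, value):
    """Average each channel's marginal contribution over all channel orderings."""
    credit = {ch: 0.0 for ch in channels}
    orderings = list(permutations(channels))
    for order in orderings:
        seen = frozenset()
        for ch in order:
            # Extra conversions unlocked when ch joins the channels seen so far.
            credit[ch] += value(seen | {ch}) - value(seen)
            seen = seen | {ch}
    return {ch: total / len(orderings) for ch, total in credit.items()}


credit = attribute(channels, coalition_value)
for ch, share in credit.items():
    print(f"{ch}: {share:.2f}")
```

The credits add up to the 100 conversions in the toy data, so every conversion is assigned to some channel. Enumerating all orderings is feasible only for a small number of channels; larger problems typically rely on sampling or other approximations.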
Shapley Value Regression for Key Drivers Analysis or Derived Importance
Traditional regressions might not be fully reliable for a Key Drivers Analysis, especially when predictors (independent variables) are highly correlated. This problem is known as multicollinearity. When it is severe, multicollinearity results in imprecise and unstable coefficients, so the relative importance of the predictors cannot be gauged accurately. Statisticians have developed a number of procedures to address the effects of multicollinearity, and Shapley Value Regression is one of them.
This method uses Shapley’s “fair allocation” of the predictors’ impact on the dependent variable. A regression is run for every possible combination of predictors against the final outcome (dependent variable), and the R-squared statistic or a generalized fit metric is computed for each regression. The Shapley Value of a predictor is its weighted average contribution to the regression R-squared across all combinations, and it can be interpreted as the driver strength, or importance measure, of that predictor.
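A minimal sketch of this procedure follows. It is an illustration under assumptions, not a production implementation: the data are synthetic, the helper names (`solve`, `covariance`, `r_squared`, `shapley_importance`) are hypothetical, and the code uses only the standard library to stay self-contained, whereas in practice a statistics or linear-algebra package would fit the regressions. The number of regressions doubles with each added predictor, so full enumeration suits only modest numbers of drivers.

```python
import random
from itertools import combinations
from math import factorial


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def covariance(columns):
    """Covariance matrix of a list of equally long data columns."""
    n = len(columns[0])
    centered = [[v - sum(col) / n for v in col] for col in columns]
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
            for ci in centered]


def r_squared(cov, cols, target):
    """R-squared of an OLS fit (with intercept) of `target` on `cols`."""
    if not cols:
        return 0.0
    sxx = [[cov[i][j] for j in cols] for i in cols]
    sxy = [cov[i][target] for i in cols]
    beta = solve(sxx, sxy)
    return sum(b * s for b, s in zip(beta, sxy)) / cov[target][target]


def shapley_importance(cov, predictors, target):
    """Shapley share of the full-model R-squared for each predictor."""
    n = len(predictors)
    importance = {}
    for p in predictors:
        others = [q for q in predictors if q != p]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (r_squared(cov, subset + (p,), target)
                        - r_squared(cov, subset, target))
                total += weight * gain
        importance[p] = total
    return importance


# Synthetic data: x2 is deliberately almost a copy of x1 (multicollinearity).
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(400)]
x2 = [a + 0.3 * rng.gauss(0, 1) for a in x1]
x3 = [rng.gauss(0, 1) for _ in range(400)]
y = [a + b + 0.5 * c + rng.gauss(0, 1) for a, b, c in zip(x1, x2, x3)]

cov = covariance([x1, x2, x3, y])
importance = shapley_importance(cov, [0, 1, 2], 3)
full_r2 = r_squared(cov, (0, 1, 2), 3)
print(importance, full_r2)
```

The importances sum to the R-squared of the full model, so each one can be read as a share of explained variance. Because the two correlated predictors are evaluated in every combination, their shared contribution is divided between them rather than landing arbitrarily on one coefficient.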
Shapley Value Line Optimization
For many years, researchers have used TURF (Total Unduplicated Reach and Frequency) analysis for Line Optimization problems. SV is a modern, and often superior, alternative to TURF. Researchers know that there tends to be substantial overlap in the appeal of different product variants. As a result, multiple potential product line combinations can have identical or similar TURF reach, even before accounting for sampling error. Such ties limit the utility of TURF as a decision tool.
As discussed above, SV represents the worth of each player over all possible combinations of players in a cooperative game. For a Product Line Optimization, we estimate the SV of each product and then select the products with the highest SVs for the optimal line. This works like a regularization of the optimal solution: we are not just making sure that a combination of products has the optimal reach, we are also ensuring that each product in the combination is strong individually. With the SV approach, a product is considered “strong” if it reaches a lot of consumers and also penetrates smaller consideration sets (that is, it is liked by “picky” consumers who like only a small number of the possible product variants).
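The selection step can be sketched as follows, assuming that the value of a product line is its reach, meaning the number of respondents who like at least one variant in the line. Under that assumption the SV has a simple closed form: each reached respondent counts as one unit, split equally among the variants that respondent likes. The survey data and function names below are hypothetical.

```python
# Hypothetical survey: for each respondent, the set of variants they would buy.
respondents = [
    {"vanilla", "chocolate"},
    {"vanilla"},
    {"chocolate", "mint", "vanilla"},
    {"mint"},
    {"chocolate"},
    {"vanilla"},
    set(),            # likes nothing, so no line can reach this respondent
    {"pistachio"},
]


def shapley_by_product(respondents):
    """SV of each variant in the reach game (assumed payoff = reach).

    A respondent who likes k variants gives 1/k to each of them, so
    variants liked by "picky" respondents earn more credit per person.
    """
    scores = {}
    for liked in respondents:
        for product in liked:
            scores[product] = scores.get(product, 0.0) + 1.0 / len(liked)
    return scores


def best_line(respondents, size):
    """Pick the `size` variants with the highest SV (ties broken by name)."""
    scores = shapley_by_product(respondents)
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:size]


def reach(respondents, line):
    """Number of respondents who like at least one variant in the line."""
    return sum(1 for liked in respondents if liked & set(line))


scores = shapley_by_product(respondents)
line = best_line(respondents, 2)
print(scores)
print(line, reach(respondents, line))
```

In this toy data, the lines vanilla plus chocolate, vanilla plus mint, and vanilla plus pistachio all reach five respondents, so reach alone cannot separate them, while the SV ranking picks vanilla and chocolate. The calculation is a single pass over the respondents, which is consistent with the approach scaling to large numbers of variants.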
SV provides a more reliable and robust solution in many real marketing situations. Because it identifies the product variants that perform best across all possible combinations of products or product lines, the SV optimization is more consistent from both a statistical and a practical perspective. Because SV selects individually strong items for optimal lines, it avoids filling the line with niche products, which TURF optimizations cannot guarantee. If an optimal line was generated using SV, every subset of variants in that line is also optimal, which can be beneficial when not all products in the line are stocked on the shelf in a particular store. Variants in a line designed using the SV approach should also be the best response to potential changes in competitive lines. Additionally, SV Line Optimization is very efficient numerically, allowing optimization of hundreds (and, if needed, even thousands) of variants. SV computations can be built into simulators, so researchers can perform optimizations in real time.
At Big Village, we are constantly updating and growing our analytical portfolio to meet various research needs. We are ready to support all three SV applications listed above to provide our clients with reliable and actionable solutions.
Written by Faina Shmulyian, Vice President, Data Science, and Mike Miller, Vice President at Big Village Insights.