What’s the Significance of Bayesian Statistics?

A HISTORICAL PERSPECTIVE ON PREDICTING CAUSE AND EFFECT THROUGH AN EDUCATED GUESS AND SUBSEQUENT INFORMATION BRINGS NEW VALUE

A client recently asked for a better description of the Bayesian approach that we frequently reference when describing our modeling engine. After an animated whiteboard discussion, I offered a number of online resources and promised to explain this further in a blog entry, using a simple example from daily life. To make the concept easier to follow, let’s begin with a quick history of the Bayesian approach. Then I’ll demonstrate how an educated guess and subsequent information combine to make the best possible estimate of a cause and effect relationship. The implications for marketing response modeling are endless.

THE HISTORY OF BAYESIAN INFERENCE

Thomas Bayes was an 18th-century English minister who studied probability and statistical theory as a hobby. After he passed away, a friend found and published his manuscripts. The French mathematician Pierre-Simon Laplace later took up the same idea, refined it, and formalized it in the early nineteenth century.

The Bayesian idea was based on inverse probability. Some interpret this as looking at statistical inference somewhat in reverse. Traditional probability analysis predicts future events based on observed information viewed through a series of rules. Inverse probability, on the other hand, determines causes from observed events to explain why changes occur. Bayes starts with an educated guess (Priors) about cause and effect. As we introduce information (Observed Data) to the problem, the working hypothesis of the cause and effect relationship (Posterior Distribution) is adjusted and refined.
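For readers who like to see the rule written out, the relationship between the three terms above can be sketched in standard notation. The symbols H (the hypothesis about cause and effect) and D (the Observed Data) are labels chosen here for illustration and do not appear in the original discussion.

```latex
\underbrace{P(H \mid D)}_{\text{Posterior}}
  = \frac{\overbrace{P(D \mid H)}^{\text{Likelihood of the Observed Data}}
          \;\overbrace{P(H)}^{\text{Prior}}}{P(D)},
\qquad
P(D) = \sum_{H'} P(D \mid H')\, P(H')
```

The denominator only rescales the result so that the posterior probabilities add up to 100%.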

AN EXAMPLE

I recently found an illustration of Bayes’ Rule on Wikipedia. Suppose that you move to a new town and the weather on any given day is easily classified as either Sunny or Rainy.

Based on the wisdom of your new neighbor, who has lived in the area for years, we have a starting point: Probability (tomorrow is Sunny | given today is Rainy) = 50%

Since tomorrow has to be either Sunny or Rainy, it follows that: Probability (tomorrow is Rainy | given today is Rainy) = 50% = 100% – 50%

The same wise neighbor can attest that: Probability (tomorrow is Rainy | given today is Sunny) = 10%

Therefore, it follows that: Probability (tomorrow is Sunny | given today is Sunny) = 90% = 100% – 10%

We might ask several questions whose answers follow:

Q1: If the weather is Sunny today, then what is the likely weather tomorrow?

A1: Since we do not know what is going to happen for sure, the best guess is the wisdom of the neighbor who says there is a 90% chance that it will be Sunny and 10% chance that it will be Rainy.

Q2: What about two days from today?

A2: Starting from the same predictions of 90% Sunny and 10% Rainy for tomorrow, there are two ways to get a Sunny day two days from now. Tomorrow can be Sunny and the day after Sunny as well; the chance of this happening is 90% x 90%. Alternatively, tomorrow can be Rainy and the day after Sunny; the chance of this happening is 10% x 50%.

Therefore, the probability that the weather will be Sunny in two days is: Probability (Sunny two days from now) = 90% x 90% + 10% x 50% = 81% + 5% = 86%

Similarly, the probability that it will be Rainy is: Probability (Rainy two days from now) = 10% x 50% + 90% x 10% = 5% + 9% = 14%
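The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the names `TRANSITIONS`, `step`, and `forecast_after` are made up for this illustration, and the only numbers used are the neighbor's four probabilities.

```python
# Transition probabilities taken from the neighbor's rules of thumb.
# Outer key: today's weather. Inner key: tomorrow's weather.
TRANSITIONS = {
    "Sunny": {"Sunny": 0.9, "Rainy": 0.1},
    "Rainy": {"Sunny": 0.5, "Rainy": 0.5},
}


def step(forecast):
    """Advance a forecast (a probability for each state) by one day."""
    return {
        tomorrow: sum(
            forecast[today] * TRANSITIONS[today][tomorrow] for today in TRANSITIONS
        )
        for tomorrow in TRANSITIONS
    }


def forecast_after(days, today):
    """Forecast `days` days ahead, given that today's weather is known."""
    forecast = {state: 1.0 if state == today else 0.0 for state in TRANSITIONS}
    for _ in range(days):
        forecast = step(forecast)
    return forecast


two_days = forecast_after(2, "Sunny")
print({state: round(p, 2) for state, p in two_days.items()})
# prints {'Sunny': 0.86, 'Rainy': 0.14}
```

Each call to `step` weighs every possible weather state today by the chance of moving from it to each state tomorrow, which is the same "90% x 90% + 10% x 50%" sum written out by hand above.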

If we keep forecasting the weather like this as new data is introduced over a large number of days (more than 30), the forecasts will reach equilibrium with the following probabilities: Probability (Sunny) = 83.3% versus Probability (Rainy) = 16.7%

Equilibrium occurs when the forecast is consistent. You can check for equilibrium by changing the starting point with the weather today being either Sunny or Rainy. If you get the same forecast, you have reached equilibrium. The key is that we took the wisdom of the neighbor (Priors); integrated it with the information introduced across a large number of days (Observed Data); and, using the iterative processing of the new data, explained the probabilities of Sunny versus Rainy (Posterior Distribution).
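The equilibrium check described above, running the forecast from both starting points and comparing the results, can be sketched as follows. The function and constant names are hypothetical, and 30 days is used only because it matches the "more than 30" rule of thumb in the text.

```python
# The neighbor's two rules of thumb, expressed as the chance of a Sunny tomorrow.
P_SUNNY_AFTER_SUNNY = 0.9
P_SUNNY_AFTER_RAINY = 0.5


def sunny_probability(days, start_sunny):
    """Probability of Sunny weather `days` days ahead, given today's weather."""
    p_sunny = 1.0 if start_sunny else 0.0
    for _ in range(days):
        # Tomorrow is Sunny either after a Sunny day or after a Rainy day.
        p_sunny = (
            p_sunny * P_SUNNY_AFTER_SUNNY + (1.0 - p_sunny) * P_SUNNY_AFTER_RAINY
        )
    return p_sunny


from_sunny = sunny_probability(30, start_sunny=True)
from_rainy = sunny_probability(30, start_sunny=False)
print(f"Starting Sunny: {from_sunny:.3f}, starting Rainy: {from_rainy:.3f}")
# prints Starting Sunny: 0.833, starting Rainy: 0.833
```

Both starting points settle on the same value, 5/6 or about 83.3%, which is also what you get by solving p = 0.9 x p + 0.5 x (1 – p) directly.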

We can do the same thing with customer responses to marketing stimuli. Instead of predicting the probability of rain based on yesterday’s weather, we can predict the behavior of shoppers due to the presence of a new product gaining distribution, a price discount on premium brands in the portfolio, a display that takes advantage of occasion-based marketing, or a media campaign that started last week.

RESISTANCE TO BAYES’ IDEA

Unfortunately, the mainstream statistical community has been resistant to this solution. Another statistics giant named Sir R.A. Fisher led the charge against what he deemed the Bayesian guessing game and reliance on that wise old neighbor to start the analysis. Fisher’s approach depended on designing and analyzing controlled experiments. Using large and random samples, we could explain significance through confidence intervals in the context of the Gaussian Bell Curve. The reliance on sampling is what has led statisticians to describe Fisher’s ideas as the Frequentist approach, and it has dominated college statistics textbooks from 1920 to 2010. For introductory statistics students in current undergraduate programs, the Frequentist approach is usually the only perspective taught.

Regrettably, market response analytics are simply not as easy as Fisher explains. The first Frequentist problem is the uncertainty of the real world. It is nearly impossible to create controlled experiments that resemble the actual marketplace. Worse, most market research relies on data that the reviewer can interactively acquire or passively observe. This means there is no consistency or control. The next Frequentist problem is what 2005 Nobel laureate Thomas Schelling presented as he addressed the complexity of human choices. He pointed out that our decisions often affect the actions of others, and subsequently affect our later decisions and behaviors. Controlled experiments using independent variables fail to account for this type of feedback loop. As a side note, Schelling is one of the godfathers of Agent-Based Modelling that has gained significant exposure for solving marketing analytics problems.

RESURGENCE OF BAYESIAN INFERENCE

Despite the problems that uncertainty and complexity pose for controlled experiments, Bayesian modelling remained impractical for a long time: computing the conditional probabilities for anything more complicated than the Sunny versus Rainy example above required too many resources. That is, until Gelfand and Smith in 1990 brought a numerical technique called Markov Chain Monte Carlo (MCMC) into mainstream statistics. MCMC simulates the posterior distribution of the modelling parameters, drawing values in proportion to their posterior probability. MCMC is the “iterative processing of a large number of days” in the Sunny versus Rainy example. In highly complex regression-based models with numerous variables, both the mean and variance of the cause and effect could now be computed. The concept was further refined as different sampling techniques were introduced to “drive” the MCMC. Since 1997, various statistical software packages have incorporated these techniques, and marketing analytics forever changed for Middlegame.
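To give a feel for what MCMC does, here is a minimal sketch of one of its simplest variants, a random-walk Metropolis sampler. It is not the sampler used in our modeling engine, and it is not the specific method Gelfand and Smith described. The data (7 responses from 20 shoppers) is invented purely for illustration, and the prior is assumed to be flat.

```python
import math
import random

# Hypothetical data, invented for illustration only:
# 7 of 20 shoppers responded to a promotion.
RESPONSES, SHOPPERS = 7, 20


def log_posterior(rate):
    """Unnormalized log posterior: flat prior on (0, 1) times a binomial likelihood."""
    if not 0.0 < rate < 1.0:
        return -math.inf
    return RESPONSES * math.log(rate) + (SHOPPERS - RESPONSES) * math.log(1.0 - rate)


def metropolis(n_samples, step_size=0.1, burn_in=1000, seed=0):
    """Draw samples of the response rate with a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    current = 0.5  # arbitrary starting guess
    current_lp = log_posterior(current)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_posterior(proposal)
        # Accept the move with probability min(1, posterior ratio).
        accept_probability = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_probability:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:  # discard the early draws while the chain settles
            samples.append(current)
    return samples


samples = metropolis(20000)
posterior_mean = sum(samples) / len(samples)
print(f"Estimated response rate: {posterior_mean:.3f}")
```

For this toy problem the exact posterior is known (a Beta distribution with mean 8/22, about 0.364), so the simulated mean can be compared against it. The value of MCMC is that the same recipe still works when no such formula exists.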

The famous economist John Maynard Keynes originally postulated the idea of “vaguely right versus precisely wrong.” Len Lodish, Samuel R. Harrell Emeritus Professor at the Wharton School, introduced Middlegame to Keynes’ concept when he challenged the true impact of advertising in 1986. This is the real issue in marketing response analytics. Bayes’ Rule lets us tackle the problem from the vaguely right perspective and avoid being precisely wrong. As more data is introduced through different products, geographies and time periods, the Bayesian answers become less wrong in the face of uncertainty and complexity. More importantly, we avoid the false certainties that the Frequentist approach offers us with confidence intervals based on extrapolated past data.

To find out more about how we use Bayesian statistics to help marketers convert shopper response analytics into action, including examples of these results, visit our website at www.middlegame.ie. To gain a better understanding of the wide applications of Bayesian Statistics in marketing analytics, take a look at Dr. Greg Allenby’s video at https://www.youtube.com/watch?v=IfzyMtQ8Ngk. Winner of the 2012 Charles Coolidge Parlin Marketing Research Award, Allenby is the Helen C. Kurtz Professor of Marketing for the Fisher College of Business at The Ohio State University. Along with other colleagues including Peter Rossi, he has led the Bayesian charge for marketing analytics. We pay attention to everything these two publish.

Middlegame is the only ROMI consultancy of its kind that offers a holistic view of the implications of resource allocation and investment in the marketplace. Our approach to scenario-planning differs from other marketing analytics providers by addressing the anticipated outcome for every SKU (your portfolio and your competitors) in every channel. Similar to the pieces in chess, each stakeholder can now evaluate the trade-offs of potential choices and collectively apply them to create win-win results.
