Deterministic vs. Probabilistic Data Tracking: Which is More Effective?

Some time ago, we explained the difference between first-, second-, and third-party data and how each can be used for mobile advertising purposes. Today, we look at a different way of splitting data into categories: deterministic and probabilistic. The two classifications are independent of each other, so third-party data, for example, can be either deterministic or probabilistic.

Google’s mobile-friendly update, which gives search result preference to sites that are mobile-friendly, was a clear signal to businesses that users are accessing websites through their mobile devices at a steadily increasing rate. That update reportedly affected 11% of all search results. But the effects of the switch to mobile devices extend far beyond search engine results, and one of those effects is in the process of evolving from a curse into a blessing: cross-device tracking. While brands have been aware that consumers often access a site from several different devices during the buying process, they’ve so far had little success in tracking that behavior, which makes it hard to decide where to spend advertising dollars. In other words, a brand could gather a lot of information about a customer during the checkout process but would have no visibility into any previous research the customer performed using other devices. And that meant missed opportunities for engagement. Fortunately, thanks to the increased accuracy of probabilistic data tracking, this is now changing.

The Gold Standard: Deterministic Data Tracking

Deterministic data tracking has long been considered the most accurate way of identifying consumers. Deterministic data is often, though not always, first-party data. “Deterministic” refers to the analysis of data that is known to be true. For example, when a customer makes an online purchase and inputs information such as name, address, zip code, phone number, credit card number, etc., that’s deterministic data. The next time that consumer logs in with the user ID and password associated with that identifying information, the brand can know with a high degree of certainty who that person is.
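To make the idea concrete, here is a minimal Python sketch of deterministic matching: devices are linked only when they were seen with the same login. The event list, the device names, and the helper functions are hypothetical and exist only for illustration; a production system would have its own identity and privacy rules.

```python
import hashlib
from collections import defaultdict


def hash_login(email: str) -> str:
    """Normalize a login email and hash it so the raw address is not stored."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def link_devices_by_login(events):
    """Group device IDs that were seen with the same hashed login."""
    devices_by_user = defaultdict(set)
    for device_id, email in events:
        devices_by_user[hash_login(email)].add(device_id)
    return dict(devices_by_user)


# Hypothetical login events: (device_id, login email)
events = [
    ("phone-1", "Ana@example.com"),
    ("laptop-7", "ana@example.com "),
    ("tablet-3", "ben@example.com"),
]

linked = link_devices_by_login(events)
for user_hash, devices in linked.items():
    print(user_hash[:8], sorted(devices))
```

Because the link relies on an exact identifier, the phone and the laptop are tied to the same person without any guesswork, but only after that person has logged in on both.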

The New Benchmark: Probabilistic Data Tracking

Probabilistic data tracking, by definition, involves either unknowns or such a wide array of knowns that deterministic models lose their accuracy. Weather forecasting is a common example of probabilistic analysis. When a meteorologist forecasts a 60% chance of rain, it means that, in the past, when the same conditions existed, it rained 60% of the time. However, even when the primary factors that determine weather (things like temperature, cloud cover, wind speed and direction, humidity, etc.) are the same, secondary factors like the amount of dust in the air, the presence or strength of an El Niño, and any environmental factors that changed while the data was being collected (deforestation, for example) can influence what actually happens at any particular time. That’s the real difference between deterministic and probabilistic analysis. With deterministic data, the answer is the same every time: 2 + 2 = 4. With probabilistic data, the results can differ based on a number of factors that are either unknown or not included in the calculation: 2 + x = y. If the value of x changes, the value of y changes. In other words, probabilistic data is data that has to be inferred with the help of statistical analysis.
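The weather example boils down to counting how often something happened under similar conditions. A small Python sketch, using a made-up record of ten past days (the data and the function name are assumptions for illustration):

```python
def empirical_probability(outcomes):
    """Share of past observations in which the event occurred."""
    if not outcomes:
        raise ValueError("need at least one observation")
    return sum(outcomes) / len(outcomes)


# Hypothetical record of 10 past days with the same primary conditions.
# True means it rained on that day.
past_days = [True, True, False, True, False, True, True, False, True, False]

chance_of_rain = empirical_probability(past_days)
print(f"{chance_of_rain:.0%}")  # prints 60%
```

The forecast says nothing certain about tomorrow; it only summarizes how similar days turned out in the past.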

What Does That Have To Do With Programmatic Buying In Mobile Advertising?

Programmatic buying platforms can now gather non-permanent data such as cookies (on mobile web), device IDs, GPS locations, and operating systems, and use big data algorithms to make predictions about the identity and, more importantly, the intent of the user. That’s important because it allows brands and advertisers to draw conclusions about the identity of consumers before they log in to make a purchase. In other words, if a consumer buys a new computer on a desktop but started the search on a mobile device, the brand will be able to capture that data and use the insight it provides to make future programmatic buys.
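As a rough illustration of how such a prediction might work, the Python sketch below scores how likely two devices are to belong to the same person. Everything in it is an assumption made for the example: the signals (a shared IP address, a city derived from location data, overlapping hours of activity), the weights, and the 0.7 threshold. Real systems learn these from large labeled data sets rather than hard-coding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    device_id: str
    ip_address: str
    city: str
    active_hours: frozenset  # hours of the day (0-23) when the device is active


# Illustrative weights only; not taken from any real product.
WEIGHTS = {"ip": 0.5, "city": 0.2, "hours": 0.3}


def match_score(a: DeviceProfile, b: DeviceProfile) -> float:
    """Return a score between 0 and 1; higher means more likely the same user."""
    score = 0.0
    if a.ip_address == b.ip_address:
        score += WEIGHTS["ip"]
    if a.city == b.city:
        score += WEIGHTS["city"]
    union = a.active_hours | b.active_hours
    if union:
        # Jaccard overlap of the hours in which both devices are active
        overlap = len(a.active_hours & b.active_hours) / len(union)
        score += WEIGHTS["hours"] * overlap
    return score


def same_user(a: DeviceProfile, b: DeviceProfile, threshold: float = 0.7) -> bool:
    """Treat two devices as one user when the score reaches the threshold."""
    return match_score(a, b) >= threshold


phone = DeviceProfile("phone-1", "203.0.113.5", "Berlin", frozenset({7, 8, 20, 21}))
desktop = DeviceProfile("desktop-9", "203.0.113.5", "Berlin", frozenset({20, 21, 22}))
stranger = DeviceProfile("phone-2", "198.51.100.8", "Berlin", frozenset({12, 13}))

print(same_user(phone, desktop))   # True
print(same_user(phone, stranger))  # False
```

Unlike a login match, the result is a likelihood: raising the threshold yields fewer but more reliable matches, and lowering it does the opposite.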

What Are The Drawbacks To Probabilistic Data Tracking?

Traditionally, accuracy has been the primary concern. The assumption has always been that deterministic data like login information provides more certain identification than the types of data that are used in probabilistic analysis. However, a recent Nielsen test put the accuracy of identification using probabilistic data at 90% and up. Accuracy is therefore no longer a significant drawback, except in cases where 100% accuracy is truly needed. Affiliate marketing programs are a good example: 100% accuracy in cross-device tracking is necessary for properly crediting sales. For programmatic buying, however, the accuracy is close enough to that of deterministic data analysis, and probabilistic tracking gives advertisers the flexibility to include data from consumers who are too early in the buying process to have logged in. A login, after all, is the only way to capture truly deterministic data.
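One common way to arrive at an accuracy figure like this is to compare predicted device pairs against a set of pairs confirmed by logins. The Python sketch below shows that arithmetic with invented data; it is not a description of how Nielsen ran its test, and the function name and device labels are hypothetical.

```python
def pair_precision(predicted_pairs, confirmed_pairs):
    """Share of predicted device pairs that the login-confirmed set agrees with."""
    predicted = {frozenset(pair) for pair in predicted_pairs}
    confirmed = {frozenset(pair) for pair in confirmed_pairs}
    if not predicted:
        raise ValueError("need at least one predicted pair")
    return len(predicted & confirmed) / len(predicted)


# Invented example: 10 predicted pairs, 9 of which are confirmed by logins.
confirmed_pairs = [(f"phone-{i}", f"desktop-{i}") for i in range(9)]
predicted_pairs = confirmed_pairs + [("phone-9", "desktop-0")]

precision = pair_precision(predicted_pairs, confirmed_pairs)
print(f"{precision:.0%}")  # prints 90%
```

A figure computed this way only covers users who can be verified through a login, which is worth keeping in mind when comparing numbers from different vendors.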

How Important Is Cross-Device Tracking?

Very – and it’s only going to get more important as more traffic shifts to mobile. Consider Facebook, an undisputed leader in cross-device tracking: the social network currently tracks one billion users who log in through multiple devices. Google and Twitter add hundreds of millions of additional users who switch between different devices. In fact, as the trend toward using multiple devices grows, the effectiveness of single-device tracking will continue to decline.

Are you combining both types of data for your user acquisition campaigns? Let us know in the comments!