The Death of App Attribution

The problem with traditional attribution techniques is that they are either probabilistic (meaning there’s a chance the data is wrong) or siloed inside a single platform (web or app). A persona graph provides the best of both worlds.

Imagine the game of Concentration (for those who haven’t played this in a few years, it’s the one where you flip two random cards over, hoping to find a match). The chances of discovering a pair on your first turn are extremely low, but over time (and time is the critical element here), you learn where everything is. Eventually, assuming you have a good memory, you’re uncovering matches on almost every round.

Now, let’s take the metaphor one step further: instead of you flipping cards to learn where they are, imagine a hypothetical situation where you get to join a game in progress, where every card on the table has already been turned face up by other players before your first turn. It wouldn’t be much of a game, but you’d be guaranteed to find a match every time.

That’s the concept behind a persona graph: by sharing matches between anonymous data points, everyone wins. Like a Concentration game where all the cards have already been flipped before your first turn, a persona graph allows you to accurately match users that YOU haven’t seen before, but someone else in the network has.

The elephants in the room: privacy, security, and confidentiality.

For a persona graph to survive, there are a couple of critical things that must be guaranteed: 1) privacy and security of user data, and 2) confidentiality.

User privacy and data security. A persona graph makes it possible to recognize a given user in different places, but it does not tell you anything about WHO that user is. If the user wants you to know that information, then you already have it in your own system — the persona graph simply closes the loop by telling you that you’re seeing an existing customer in a new place. And like cookies or device IDs, the user can reset their connection to the persona graph on demand.

In other words, the persona graph must take the same approach to privacy as the postal service. Our letter carriers need to know our physical location in order to deliver mail, but they’re only concerned with the address, not the addressee. We trust that they won’t open our letters and won’t sell information about what we buy to the highest bidder.

At Branch, we feel so strongly about user privacy that we have made a number of public commitments about it. The short version can be expressed as three points in plain English: 1) we proactively limit the data we collect to only what is absolutely necessary to power the service that we deliver to our customers, 2) we will only ever provide our customers with data about end-user activity that happens on their own apps or websites, and 3) we do not rent or sell end-user personal data, period (not as targeting audiences to other Branch customers, not via cookie-syncing side deals with identity companies, not via an “independent” subsidiary — we just don’t do it).

In addition, we rigorously and proactively follow best practices to purge sensitive data and protect our platform against bad actors.

Confidentiality. The only data that is available via a persona graph is knowledge of the connection itself. Not where or how the connection was made, or by which company’s end user. A persona graph must guarantee that it will never allow Pepsi to purchase a list of Coke’s customers.

Said another way, the Swiss have stayed out of every war in Europe for over 200 years, because everyone recognizes that they are (and always will be) neutral. A persona graph must maintain the same unimpeachable reputation.

A peek inside the Branch persona graph

When we set out to build Branch in 2014, there was already a well-established industry of mobile attribution providers. All of them were competing with each other for the low-hanging fruit of measuring ad-driven app installs. If you work in the mobile industry, you’re likely familiar with their names already (Branch acquired the attribution business of one last year).

Even though the Branch platform might resemble a traditional attribution provider on the surface, the engine underneath is something fundamentally, radically different.

We decided to take a different approach: we realized the app install ad was a bubble that would eventually deflate, and we also knew that seamless user experiences would become increasingly important as marketers began to care about other channels and conversion events again. So we started by solving the more difficult technical problems that everyone else was ignoring (this is the story we told two years ago in Deep Linking is Not Enough).

The result: through solving the cross-platform user experience problem at scale, for many of the best-known brands in the world, we created a persona graph that allows Branch to provide an attribution solution that is both more accurate and more reliable than anything else available.

Here’s how it works today:

Step 1: Collect deterministic IDs

Believe it or not, this is actually the relatively easy part. User activity occurs in fragments across platforms, and the goal is to have a deterministic ID for each of them. Since Branch’s customers invest most of their marketing resources into websites and mobile apps, these are the platforms where we’ve focused the majority of our effort so far. But the same principle applies anywhere.

To create deterministic IDs on the web, we use a JavaScript SDK to set first-party cookies. Inside apps, we offer native SDKs to leverage device IDs.

We’ve also built SDKs for desktop apps on macOS and Windows, and custom OTT (Over The Top) device integrations. We will continue adding support for new platforms as customers request them.

Step 2: Create persona matches

Once we have an ID for an identity fragment, we use a layered system of cross-platform matching techniques to tie it back to a persona record on the persona graph. Here are a few examples:

  • Deep links. When a user clicks a link to go from one place to another, that is an ideal time to make a connection. This is our primary method for matching fragments that exist on the same device (e.g., Safari, Facebook browser, native apps), and one of the most reliable because it’s driven by the user’s own activity.
  • User IDs. When a user logs into an account, they’re providing a unique ID that can then be matched if the same user signs in later in another place. We only use this signal to a limited extent today, because there are a number of tricky problems related to shared devices, but we’re actively working on solutions and see a lot of promise in this method. As a side note, this is the only matching method we’ve seen competitors use when they talk about “people-based attribution.” Given the shared device challenges mentioned above, or the fact that (depending on the vertical) the vast majority of visitors never log in, this is certainly an area to question if you’re currently working with one of them.
  • Google Play referrer. Google passes a limited amount of data through the Play Store during the first install. Branch uses this one-time connection to create a permanent match back to the persona graph.
  • Fingerprinting. This is one cross-platform matching method we don’t use to build the persona graph, but it deserves a mention because it is so commonplace in the attribution industry. Branch sometimes has to fall back on fingerprinting when the persona graph can’t provide a stronger pre-existing match, so we’ve invested in an IPv6-based engine that greatly increases accuracy over traditional mobile attribution providers that still rely exclusively on IPv4.
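
One way to picture the deterministic matching methods above is as a union-find (disjoint-set) structure: each ID fragment is a node, and each observed match merges two sets into a single persona. The sketch below is an illustration under that assumption only. The class name `PersonaGraph`, its methods, and the fragment labels are hypothetical and are not a description of Branch's actual implementation.

```python
class PersonaGraph:
    """Toy persona graph: a union-find over ID fragments (hypothetical model)."""

    def __init__(self):
        # Maps each fragment to its parent; a root fragment is its own parent.
        self._parent = {}

    def _find(self, fragment):
        # Register unseen fragments as their own single-member persona.
        self._parent.setdefault(fragment, fragment)
        root = fragment
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[fragment] != root:
            self._parent[fragment], fragment = root, self._parent[fragment]
        return root

    def link(self, a, b):
        """Record a deterministic match (e.g. a deep link click) between two fragments."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_persona(self, a, b):
        """True if both fragments have been tied to the same persona."""
        return self._find(a) == self._find(b)


graph = PersonaGraph()
graph.link("cookie:safari-123", "device:idfa-abc")    # user clicked a deep link in Safari
graph.link("device:idfa-abc", "cookie:fb-browser-9")  # later click from the Facebook browser
print(graph.same_persona("cookie:safari-123", "cookie:fb-browser-9"))  # True
```

Note that the two cookies were never observed together. They end up in the same persona only because each was matched to the same device ID, which is the transitive effect the layered matching relies on.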

Because of Branch’s massive, worldwide scale, we can also use machine learning to uncover connections between different personas that likely belong to the same user but have not yet been deterministically merged. We call these “probabilistic matches” because they are not 100% guaranteed, but they are still useful when combined with the high degree of confidence that we get from observing other deterministic patterns.

Here’s how probabilistic matching compares to fingerprinting:

Fingerprinting. Fingerprinting has to happen in real time. In other words, it requires a guess to be made based solely on whatever data is available at the exact moment a user does something. That user might be sitting alone at home (high accuracy situation), or they might be sharing public wifi with several thousand other people while walking around a shopping mall (very low accuracy situation). With fingerprinting, the system has only two choices: 1) it can take a gamble and make the match, or 2) it can throw away the match and say no attribution happened. All of the fancy “dynamic fingerprinting” systems offered by traditional mobile attribution providers are really just trying to decide when to choose option 2.

Probabilistic matching. Because the persona graph is persistent, Branch can afford to be patient. We don’t have to play roulette in real time when the conversion event occurs; instead, we’re able to preemptively store “prob-matches” when the system detects no ambiguity (e.g., when the user is alone at home) to use later (e.g., when the user is inside a crowded shopping mall). For example, the algorithm might create a prob-match if it notices that persona A and persona B have matching fingerprints, were both active on the same IP within 60 seconds of each other, and no other activity occurred from that IP within the last day.
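
The example rule in the previous paragraph can be written down as a small predicate. This is a minimal sketch of that one rule, not the production algorithm: the `Event` record, its field names, and the function name are assumptions made for illustration, and the 60-second and one-day windows are taken directly from the example above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    persona: str       # persona the event was attributed to
    fingerprint: str   # device fingerprint observed for the event
    ip: str            # IP address the event came from
    timestamp: float   # seconds since the epoch


MATCH_WINDOW_S = 60             # both personas active within 60 seconds
QUIET_PERIOD_S = 24 * 60 * 60   # no other activity on the IP within the last day


def should_create_prob_match(a: Event, b: Event, ip_history: Iterable[Event]) -> bool:
    """Apply the example rule: same fingerprint, same IP, close in time, quiet IP."""
    if a.persona == b.persona:
        return False  # nothing to link
    if a.fingerprint != b.fingerprint or a.ip != b.ip:
        return False
    if abs(a.timestamp - b.timestamp) > MATCH_WINDOW_S:
        return False
    latest = max(a.timestamp, b.timestamp)
    for event in ip_history:
        # Ignore other IPs and the two personas under consideration.
        if event.ip != a.ip or event.persona in (a.persona, b.persona):
            continue
        if latest - QUIET_PERIOD_S <= event.timestamp <= latest:
            return False  # someone else used this IP recently, so the match is ambiguous
    return True
```

Because the check runs when the signals are unambiguous rather than at conversion time, a positive result can be stored and reused later, when a real-time fingerprint would be unreliable.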

When making these prob-matches between different personas, our system records a “confidence level.” This allows us to move linked personas in and out of consideration depending on the use case. For example, a “match guaranteed” deep link used for auto-login would obviously require a confidence level of 100%, but the industry expects ad installs to be matched with a confidence level usually between 50–85% (the persona graph allows Branch to hit the top end of this range without being forced to accept lower-confidence matches).
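
As a rough illustration of how a stored confidence level could move linked personas in and out of consideration, the sketch below filters a persona's links by a per-use-case threshold. The use-case names, the function name, and the data layout are hypothetical. The 100% requirement for auto-login comes from the paragraph above, while 0.85 for ad installs is simply the top of the quoted industry range, chosen here as an assumption.

```python
# Minimum confidence required per use case (illustrative values, see lead-in).
REQUIRED_CONFIDENCE = {
    "auto_login": 1.00,   # "match guaranteed" deep links
    "ad_install": 0.85,   # assumed: top of the 50-85% range
}


def usable_links(links, use_case):
    """Return the linked personas whose confidence meets the use case's threshold.

    `links` is a list of (persona_id, confidence) pairs with confidence in [0, 1].
    """
    threshold = REQUIRED_CONFIDENCE[use_case]
    return [persona for persona, confidence in links if confidence >= threshold]


links = [("persona-B", 1.0), ("persona-C", 0.9), ("persona-D", 0.6)]
print(usable_links(links, "auto_login"))  # ['persona-B']
print(usable_links(links, "ad_install"))  # ['persona-B', 'persona-C']
```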

Today, Branch dynamically sets the confidence level required for each use case, but this is a configuration we could expose directly to our customers in the future.

Step 3: Scale the network

It’s impossible to just “build a persona graph” because — in the beginning — there is no reason for anyone to sign up.

Why? The value of a persona graph increases for everyone as more companies contribute to it. That means the benefit of joining an existing persona graph is enormous, but there is very little incentive to be one of the first participants in a brand new one: it would be like giving up that already-flipped Concentration game for a new one where you’re playing all by yourself.

Because Branch started out by solving cross-platform user experiences, our persona graph scaled as a natural side-effect of other products that provide independent value at the same time. This approach allowed the Branch persona graph (which now covers over 50,000 companies) to reach critical mass. However, while basic deep linking was a hard problem to solve back in 2014, it is now well on the way to commoditization. Today, it would be almost impossible to get a persona graph off the ground using basic deep links, let alone ever reach a similar level of coverage.

Step 4: Use the match data

What can Branch do with these cross-platform/cross-channel/cross-device personas? Here are a few examples:

Solve attribution ambiguities. This is the obvious one, of course. The persona graph makes it possible to correctly attribute the complicated user journeys we’ve been discussing, such as when you and the other Starbucks customer were both using the same shopping app, and traditional fingerprint-based attribution methods couldn’t tell the difference.

Provide data for true multi-touch reporting. Using multi-touch modeling to better understand user activity is the Promised Land of attribution: every marketer wants it, and everyone has a different idea of what it should be. But there’s one thing everyone should agree on: multi-touch attribution is only as good as the data you feed it, and bad data compounds the problem.

The persona graph allows Branch to consolidate data from across channels and platforms. Legacy mobile attribution providers completely miss this data, which means their “multi-touch attribution” is really just “multi-ad app install attribution.”

Protect user privacy. Fingerprinting has long been a necessary evil for mobile attribution, but inaccurate measurement isn’t the only cost — when fingerprinting matches the wrong user, this also introduces user privacy issues because it means the system believes it is dealing with someone else. The persona graph allows Branch to dramatically reduce the risk of incorrect matching (we even offer a “match guaranteed” flag to enforce it), better protecting the privacy of end users.

Go beyond measurement. Attribution is only possible if the conversion happens in the first place. The persona graph allows Branch to provide the seamless cross-platform user experiences that make this more likely, improving the performance of all your marketing efforts.

For example, if a user lands on your website, even though they already have your app installed, Branch can use the persona graph to detect this and show that user the option to seamlessly switch over to the same content inside your app, where they’re much more likely to complete a purchase.

Comparing persona graph attribution with previous-generation alternatives

To wrap up, let’s revisit the three core tasks of an attribution system, and compare the capabilities of a persona graph-based platform with the traditional alternatives.

1. Capture interactions

Mobile attribution providers started with ads, and have struggled ever since to retrofit their systems in a way that accommodates other channels.

A persona graph supports ads, and also email, web, social, search, offline, and more.

2. Count conversions

Mobile attribution providers are optimized to capture app install events, and aren’t set up to handle non-install conversions that happen on other platforms. Many of them are now rushing to figure out how to perform basic web measurement, a problem that was solved years before apps entered the picture.

A persona graph can attribute app installs, and also captures other down-funnel conversions on websites, desktop apps, OTT devices, and more.

3. Link conversions back to interactions that drove them

As described in part 2, mobile attribution providers have two matching methods available: they default to device IDs, and fall back on fingerprinting.

A persona graph-powered system can also use device IDs for single-platform user journeys (app-to-app), and has device ID <> web cookie pairs for cross-platform (web-to-app) user journeys. It may occasionally have to fall back on fingerprinting when a matched ID pair is not yet available, but this is a far less frequent situation.
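
The matching order described above (device ID first, then a stored device ID and web cookie pair, then fingerprinting as a last resort) can be sketched as a simple fallback chain. All names and the dictionary-based data layout below are hypothetical, chosen only to make the ordering concrete.

```python
from typing import Optional


def find_click(device_id: str,
               clicks_by_device_id: dict,
               cookie_for_device: dict,
               clicks_by_cookie: dict,
               fingerprint_guess: Optional[str] = None):
    """Return (click, method), trying the strongest available signal first."""
    # 1. Single-platform journey (app-to-app): the device ID matches directly.
    if device_id in clicks_by_device_id:
        return clicks_by_device_id[device_id], "device_id"
    # 2. Cross-platform journey (web-to-app): use a stored device ID / cookie pair.
    cookie = cookie_for_device.get(device_id)
    if cookie is not None and cookie in clicks_by_cookie:
        return clicks_by_cookie[cookie], "id_pair"
    # 3. Last resort: accept a fingerprint-based guess if one was supplied.
    if fingerprint_guess is not None:
        return fingerprint_guess, "fingerprint"
    return None, "unattributed"


click, method = find_click(
    device_id="idfa-abc",
    clicks_by_device_id={},                         # no in-app ad click recorded
    cookie_for_device={"idfa-abc": "cookie-123"},   # pair already known to the graph
    clicks_by_cookie={"cookie-123": "email-campaign-click"},
)
print(click, method)  # email-campaign-click id_pair
```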

What comes next

Fragmentation in the digital ecosystem is a hornet’s nest that can’t be un-kicked, and the challenge of attribution between web and app is just the beginning. It’s going to get worse (just imagine what it will be like when you need to attribute between your toaster and your car!).

Attribution based on a persona graph makes it possible to handle this fragmentation, and a persona graph built on user-driven link activity is even more powerful because it leads to a virtuous circle: links are the common thread of digital marketing, which means they’ll always be the natural choice for every channel, platform, and device. These links help build the persona graph, and the result is increased ROI, comprehensive measurement everywhere, and more reliable links.

No platform-specific attribution solution is even in the same league.

At Branch, we see attribution as one part of a holistic solution that provides far more than app install measurement. Our true mission is to solve the problem of content discovery in the modern digital ecosystem. Deep linking was one critical part of this mission. Fixing attribution is another. But the real win is yet to come…stay tuned!
