6 things you want to ask about data clean rooms explained

By Pete Danks, Vice-president of product

May 30, 2023 | 7 min read

With much buzz around ‘clean room’ tech enabling a surge of privacy-compliant data enrichment, Pete Danks, vice-president of product at Magnite, offers an explainer for beginners.

Clean rooms and Privacy Enhancing Technologies (PETs) provide methods for sharing, analyzing and activating data in environments designed to safeguard it. Advertisers and media owners are adopting these solutions to collaboratively improve insights and run better-informed and, most importantly, more effective campaigns. As the industry establishes clean room standards and best practices (a process that began only recently), advertisers, agencies and media owners are starting to ask more questions. We’ve answered six of them.

1. What is hashing and how does it work?

Hashing is a set of techniques that disguise an identifier in a way that preserves its uniqueness while making it practically impossible to remove that disguise (without already knowing the original identifier). For example, audience IDs present in a publisher’s ad request can be uniquely hashed before they are passed to the DSP in a bid request. A number of popular email-based IDs use hashing (e.g. UID2.0 and LiveIntent) or pseudonymization (e.g. LiveRamp's RampID) to pass user data through the bidstream in a way that minimizes the risk of data leakage. These hashed email-based identifiers can enable publisher monetization that is less reliant on the future whims of browsers and OEMs, giving publishers more control over their data.
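To make the idea concrete, here is a minimal Python sketch of hashing an email-based identifier. It assumes SHA-256 and a simple normalization step (trim and lowercase); the function names and example addresses are illustrative only, and real ID solutions define their own normalization and hashing rules.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent spellings hash identically."""
    return email.strip().lower()


def hash_identifier(identifier: str) -> str:
    """Return the SHA-256 digest of an identifier as a hex string."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


# Hypothetical example addresses.
a = hash_identifier(normalize_email("Jane.Doe@example.com "))
b = hash_identifier(normalize_email("jane.doe@example.com"))
c = hash_identifier(normalize_email("john.roe@example.com"))

print(a == b)  # True: the same address always produces the same hash
print(a == c)  # False: a different address produces a different hash
print(len(a))  # 64: the digest length does not depend on the input
```

The hash stays stable for the same person, which is what allows matching, but it cannot be turned back into the email address by inspection alone.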

However, there are some drawbacks to hashing techniques. For example, their use for matching across partners requires that all partners know a shared key (usually known as a salt). While a hashed identifier cannot be revealed directly even when the key is known, any partner that possesses the key can uncover whether that hashed identifier appears in their own data, which – if it does – reveals the identifier itself. A related drawback is that a hashed identifier can only be used to check for an exact match against another identifier that was hashed using the exact same key.
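Both drawbacks can be illustrated with a short sketch. It assumes the shared key is applied with HMAC-SHA-256, which is one common way to key a hash; the key values, partner lists and function name are hypothetical.

```python
import hashlib
import hmac


def keyed_hash(identifier: str, key: bytes) -> str:
    """Hash a normalized identifier under a key using HMAC-SHA-256."""
    message = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


SHARED_KEY = b"example-shared-key"  # hypothetical key known to both partners
OTHER_KEY = b"a-different-key"      # hypothetical key used by mistake

publisher_emails = ["ana@example.com", "ben@example.com", "cy@example.com"]
advertiser_emails = ["ben@example.com", "cy@example.com", "dee@example.com"]

# The publisher shares only hashes.
publisher_hashes = {keyed_hash(e, SHARED_KEY) for e in publisher_emails}

# Drawback 1: a partner holding the key can hash its own list and look up
# which of its known identifiers appear in the publisher's data.
advertiser_lookup = {keyed_hash(e, SHARED_KEY): e for e in advertiser_emails}
matched = sorted(
    advertiser_lookup[h] for h in publisher_hashes if h in advertiser_lookup
)
print(matched)

# Drawback 2: hashes made with a different key never match, even when the
# underlying identifiers are identical.
mismatched = publisher_hashes & {keyed_hash(e, OTHER_KEY) for e in advertiser_emails}
print(len(mismatched))
```

The advertiser recovers the two addresses it already held, and learns nothing about the publisher's third address, which is exactly the exact-match behavior described above.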

New cryptographic techniques such as multi-party computation (MPC) and homomorphic encryption offer more flexible options for matching data without revealing it. For example, MPC allows each data contributor to use their own private key (rather than a shared key) to encrypt their data, while still allowing SSPs to match identifiers (even if they were encrypted using distinct keys).
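One classical way to match identifiers that were encrypted under distinct private keys is commutative encryption, as used in Diffie-Hellman-style private set intersection. The sketch below is a toy illustration of that single idea, not a description of any vendor's MPC protocol: the modulus is far too small for real use, and all names and example data are assumptions.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized and NOT secure for real data


def to_group(identifier: str) -> int:
    """Map a normalized identifier to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_private_key() -> int:
    """Pick a random exponent that is invertible modulo P - 1."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def encrypt(value: int, private_key: int) -> int:
    """Exponentiation commutes: encrypt(encrypt(x, a), b) == encrypt(encrypt(x, b), a)."""
    return pow(value, private_key, P)


publisher_ids = ["ana@example.com", "ben@example.com", "cy@example.com"]
advertiser_ids = ["ben@example.com", "cy@example.com", "dee@example.com"]

pub_key = new_private_key()  # never leaves the publisher
adv_key = new_private_key()  # never leaves the advertiser

# Step 1: each party encrypts its own identifiers with its own key.
pub_once = [encrypt(to_group(i), pub_key) for i in publisher_ids]
adv_once = [encrypt(to_group(i), adv_key) for i in advertiser_ids]

# Step 2: the lists are exchanged and each party adds its key to the other's list.
pub_twice = {encrypt(v, adv_key) for v in pub_once}  # done by the advertiser
adv_twice = [encrypt(v, pub_key) for v in adv_once]  # done by the publisher

# Step 3: doubly encrypted values are equal exactly when the identifiers are.
matched = [i for i, v in zip(advertiser_ids, adv_twice) if v in pub_twice]
print(matched)
```

Neither party ever sees the other's key or raw identifiers, yet the doubly encrypted values line up for the overlapping users. Production systems use much larger groups and additional safeguards.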

2. Is data matching safe?

Yes. When two or more parties share data, they agree on access, availability and usage upfront, taking privacy law requirements into account, and the clean room provider then enforces those agreements. Clean rooms and PETs that use cloud-based distributed storage allow each party involved to control their own data, enabling purpose-limited collaboration while protecting the data. Because hashed user data can be stored without leaving the environment, the parties involved can abide by privacy laws while gaining more value from the data through enrichment.

3. What happens when cookies go away?

Privacy-preserving data-matching solutions will become even more valuable. Data matching allows buyers and sellers to build out a scaled method of matching first-party data while maintaining the ability to accurately target and measure audiences in a controlled environment. The swathe of new advertising identifiers aimed at replacing third-party cookies could also be used with data matching to provide greater scale and accuracy for targeting and measurement through improved match rates. With current clean room match rates averaging around 50%, interoperability between clean room providers and identifiers is a key focus.

4. What does it mean for publishers and advertisers?

At the moment, most publishers and advertisers are leveraging clean rooms for insights, combining datasets to understand audience behavior and overlap to inform campaign planning. For example, a fitness brand might know nothing about its customers except basic transactional data and that they like to keep fit. Matching those profiles with a publisher’s behavioral data provides enrichment, revealing what customers are interested in beyond fitness, which can give the brand a better idea of what kind of content it should advertise against.

The next step would be to utilize an activation layer where they can execute campaigns based on those matched users, whose data has been hashed and encrypted. While this is something the likes of Google, Amazon, and Meta offer, those walled garden-based clean rooms were primarily built to advertise across their own media with no post-campaign user-level data sent back to the advertisers. Independent vendors can help activate data in programmatic environments outside of walled gardens, which will also help advertisers grow their first-party data sets.

5. What does activation look like in the bid request?

In order to activate matched datasets, publishers and brands need to either leverage existing identifiers in the bidstream or pass their own IDs in the ad request. By hashing the publisher or brand ID, we are able to limit the extent to which this personal data is revealed in the bidstream.
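As a rough illustration, the sketch below attaches a hashed publisher ID to a bid request so that the raw ID never enters the bidstream. The field layout (`user.eids`, `source`, `uids`) is loosely modeled on OpenRTB extended identifiers and is an assumption here, as are the function names, the source domain and the sample ID; it does not describe any specific platform's implementation.

```python
import copy
import hashlib
import json


def hash_id(raw_id: str) -> str:
    """Return the SHA-256 hex digest of a raw publisher or brand ID."""
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()


def attach_hashed_id(bid_request: dict, raw_id: str, source: str) -> dict:
    """Return a copy of the bid request carrying only the hashed ID."""
    request = copy.deepcopy(bid_request)  # leave the caller's object untouched
    eids = request.setdefault("user", {}).setdefault("eids", [])
    eids.append({"source": source, "uids": [{"id": hash_id(raw_id)}]})
    return request


# Hypothetical ad request data.
raw_publisher_id = "reader-12345"
base_request = {"id": "req-1", "imp": [{"id": "1"}]}

outgoing = attach_hashed_id(base_request, raw_publisher_id, "publisher.example")
print(json.dumps(outgoing, indent=2))
```

A buyer holding the same matched, hashed IDs can recognize the user in the bid request, while anyone else in the bidstream sees only an opaque string.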

6. Why isn’t everyone using clean rooms?

A lack of resources, difficulty in proving ROI, a lack of interoperability, and the continued ease of using third-party cookies are reducing the urgency to adopt clean rooms, though standardization and interoperability across clean room providers should help adoption. For instance, IAB Tech Lab’s ‘Open Private Join and Activation’ (OPJA) is creating a standard way for data clean room providers to enable their clients to match audiences between data sets for ad targeting, maximizing scale while respecting privacy regulations.

The future of data matching

While clean rooms are a starting place for an ad ecosystem without third-party cookies, the future of data sharing will be encrypting data where it sits, but activating it where it’s required. Those who are using clean rooms are seeing innovation around audience modeling (e.g. lookalikes), insights, and attribution, but activating such data is critical. This is where PETs with a built-in activation path will benefit publishers, giving them a way to attach matched data to an ad opportunity and present it to a buyer in a privacy-forward manner.
