Top 4 data clean room predictions 2023
As 2023 gets into full swing, it’s a time for looking forwards and thinking about the key trends we’re witnessing in the area of data clean rooms, and how this technology is likely to evolve. 2022 was a landmark year for clean rooms, but it’s fair to say things were mostly driven by early innovators as opposed to the mass-market. The stage appears to be set for that to change, and for clean rooms to become a de facto part of the marketer toolkit, so let’s jump into some predictions for what’s ahead.
1. Data clean rooms everywhere, whether we know it or not
The first part of this shouldn't come as a huge surprise: this year we will see huge growth in the adoption of clean room technology. Besides that obvious (and biased) statement of their emergent ubiquity, there's an important sub-trend driving it: 2022 saw rapid maturation and increased availability of the core, underlying clean room technology. This is going to have a big impact in the year ahead.
Cloud hyperscalers like Amazon have now joined the fray with AWS Clean Rooms, alongside database companies like Snowflake and Databricks. This means that the technical capability to perform a secure, double-blinded join is now accessible to engineers and product builders familiar with these stacks. As Bob Walczak rightly pointed out in his own piece for StreetFight, this "lowers the barrier to entry for testing."
As a result, we shall begin to see clean room powered data flows emerging in most data-driven applications in 2023, across myriad industries from advertising to healthcare. I place emphasis on the word “see” as it is likely that many of these will operate as invisible, albeit critical, privacy-preserving workflows, keeping data signals moving between media companies, advertisers, agencies, and different business applications.
The obvious by-product of a whale like AWS moving so prominently into data clean rooms is that Google will almost certainly have to follow suit and launch their own capability within Google Cloud. As mentioned above, those looking to establish basic clean room capability as part of an application running on a given cloud or database are quite likely to have engineering teams who will start using those features to build out some kind of DIY clean room. It will be a good thing for adoption and exposure of clean room technology within the enterprise.
This is already happening on Snowflake of course, but the point is that we’ll see it in more places this year. What won’t change are the issues that typically accompany these homegrown solutions: limited scalability, a lack of interoperability, and poor accessibility for non-technical users.
2. The rise of orchestration
Lest we forget amongst all of these shiny new toys, data clean rooms are a technical means of achieving what are usually commercial goals. Data collaboration with another company or companies only happens because there is a business imperative to do so, and the odds of finding consistent consensus on the underlying storage and compute between clean room participants are low.
Each of the aforementioned cloud companies is building their capability to work within their own cloud, not across them. This leaves the door wide open for cross-clean room intelligence and technical automation; the productivity & operations layer on top. Ciaran O’Kane and the team at Exchangewire wrote a great article on this, which I highly recommend reading.
Clean room software players (such as ourselves here at Habu) will double down on interoperability between clouds, automation of the underlying code and cloud clean room primitives, and in further building out the business layer to allow non-technical users to execute secure data collaboration at scale.
The ability to create clean rooms that automatically leverage the requisite features and compute paradigms of each big data stack, without an engineer or a business user actually needing to care about the detail, will be referred to as “clean room orchestration”.
Orchestration capabilities will quickly become one of the most critical elements of assessment within independent / software clean room vendors as things get increasingly technologically fragmented, complex, and decentralised under the hood.
3. Walled garden clean rooms will (finally) go mainstream
Next, I predict we are about to see a big wave of new product developments and enhancements on the walled garden clean rooms such as Google's ADH, Amazon's AMC and Meta's Advanced Analytics. 2023 is also going to be the year for other ad giants like Criteo, Snap and TikTok to launch their own equivalent solutions in order to compete.
With the exception of perhaps Amazon, the pace of development and adoption of these environments has actually been pretty lacklustre since their launch, but this looks set to change this year. With AMC developing so fast, and Amazon publicly hanging their hat on the importance of their clean room as a differentiator, there is a forcing function for others to keep pace.
This next wave of innovation is going to be something that actually drives mass-market adoption on the advertiser side. There will be no more dipping of toes. These are going to be "must use" environments which begin to go way beyond SQL analyses and start to bridge more readily into practices like custom modelling, audience activation and 3rd party application integration.
In particular I expect to see significant enhancements made to Google ADH, some of which will be driven by the next prediction.
4. Google PAIR will be widely adopted & push others forward
Announced in Q4 of last year and slated for public release in Q1 '23, there has been plenty of interest in Google's Publisher Advertiser Identity Reconciliation (PAIR) offering already. When this goes into open beta or GA, I expect there to be massive interest leading to widespread adoption. Note that Google requires those wanting to use PAIR to leverage an approved clean room, so this is likely to drive more imperative to have solution(s) in place.
From an advertiser perspective it makes so much sense to have access to and use a DV360 seat once you can perform highly secure ID matches with external publishers, all without Google holding a copy of your data or any of your customer IDs making it into the bidstream. This is going to be huge for those more risk-averse brands who have said “no” to audience targeting solutions like Customer Match or Custom Audiences because of privacy and data centralisation concerns.
As PAIR becomes a major outlet for clean room programmatic activation, it will increase the imperative and financial importance of having scaled first party data (1PD) on the publisher side to be able to power matches. PAIR IDs are also going to be hugely useful currency for advanced measurement, optimisation and yield management, as it seems likely they will make it into Ads Data Hub on the advertiser, agency and publisher side.
Outside of the Google stable, it seems logical that others will follow suit with similar encryption protocols for their own DSPs. The underlying technology is known as “commutative encryption”, and it will open the door for other distributed identity systems to exist in programmatic that don't require adtech companies or industry bodies to carry the bag as relates to regulation and potential fines. This last point has been holding back universal ID initiatives in Europe in particular. Note that IAB TechLab is already working to develop standards around commutative encryption which should be in play in 2023.
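To make the idea of commutative encryption more concrete, here is a minimal sketch in Python. It is not Google's PAIR protocol or any vendor's implementation; it only illustrates the property that makes a double-blind match possible: two parties can each encrypt with their own secret key, in either order, and arrive at the same value. The function names (`hash_to_group`, `new_key`, `encrypt`, `match_count`), the sample email addresses and the choice of a simple exponentiation cipher over a small prime are all assumptions made for illustration. A real deployment would use a far larger group (or elliptic curves) and a reviewed cryptographic library.

```python
import hashlib
import math
import secrets

# Demo-only modulus: 2**127 - 1 is a Mersenne prime. It is far too small for
# real-world use and is chosen here purely to keep the example readable.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map a normalised identifier (e.g. an email) to an integer in [2, P-1]."""
    digest = hashlib.sha256(identifier.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 2) + 2


def new_key() -> int:
    """Pick a random secret exponent that is coprime to P-1, so encryption is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def encrypt(value: int, key: int) -> int:
    """Exponentiation cipher: (x^a)^b == (x^b)^a mod P, hence 'commutative'."""
    return pow(value, key, P)


def match_count(advertiser_ids, publisher_ids) -> int:
    """Count overlapping IDs without either side revealing raw IDs or its key."""
    adv_key, pub_key = new_key(), new_key()

    # Each party encrypts its own list with its own key...
    adv_once = [encrypt(hash_to_group(i), adv_key) for i in advertiser_ids]
    pub_once = [encrypt(hash_to_group(i), pub_key) for i in publisher_ids]

    # ...then the lists are exchanged and encrypted again with the other key.
    adv_twice = {encrypt(v, pub_key) for v in adv_once}
    pub_twice = {encrypt(v, adv_key) for v in pub_once}

    # Doubly encrypted values are equal exactly when the underlying IDs are equal.
    return len(adv_twice & pub_twice)


if __name__ == "__main__":
    advertiser = ["alice@example.com", "bob@example.com", "carol@example.com"]
    publisher = ["bob@example.com", "carol@example.com", "dave@example.com"]
    print(match_count(advertiser, publisher))
```

In this sketch neither side ever sees the other's raw identifiers or secret key, only values that have already been encrypted, which is the property that lets a match happen without one party holding a copy of the other's customer data.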
Looking back over these words and topics, it’s remarkable to think that there’s so much coming in 2023 in the area of clean rooms alone. It goes to show how important an evolution this technology is for such a broad range of constituents in advertising, data and tech. I’m excited to see how the above themes shake out, and look forward to going on the journey with you all.
Have a different take on clean rooms? Interested in getting started on your collaborative data journey? You can reach me at email@example.com.
Content by The Drum Network member:
Collaborative Intelligence for Decentralized Data