Your cohorts are just ethnic affinity groups. Change my mind.
12 March 2021
(Update 28 Feb 2022: Berke and Calacci link)
(Update 9 May 2021: add another example)
The big question around Google FLoC is whether or not some of the FLoC cohorts, which are group identifiers applied by the browser and shared with all sites, will match up with membership in legally protected groups of people. Will cohorts turn out to be the web version of Facebook's old Ethnic Affinity Groups, also known as multicultural affinity groups?
2022 update: Browsing behavior correlates with race, but cohorts do not. ... We did not find with our t-closeness analysis that the likelihood of correlating racial background with cohorts, using the FLoC OT algorithm, was any greater than chance. (Privacy Limitations Of Interest-based Advertising On The Web: A Post-mortem Empirical Analysis Of Google's FLoC)
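For anyone wondering what a t-closeness check means in practice: roughly, compare the mix of a sensitive attribute inside each cohort to the mix across the whole population, and flag any cohort that drifts more than some threshold t away from it. Here is a minimal sketch of that idea, with made-up cohort IDs, labels, and threshold, not the paper's actual code.

```python
from collections import Counter, defaultdict

# Toy records of (cohort_id, sensitive_attribute). All values are made up.
records = [
    (1001, "A"), (1001, "A"), (1001, "A"), (1001, "B"),
    (2002, "A"), (2002, "B"), (2002, "B"), (2002, "B"),
    (3003, "A"), (3003, "B"), (3003, "A"), (3003, "B"),
]

def distribution(labels):
    """Map each label to its share of the list."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

overall = distribution([attr for _, attr in records])

by_cohort = defaultdict(list)
for cohort, attr in records:
    by_cohort[cohort].append(attr)

T = 0.2  # threshold; a real study would have to justify this number

for cohort, attrs in sorted(by_cohort.items()):
    cohort_dist = distribution(attrs)
    # Total variation distance between the cohort and the overall population.
    # (t-closeness is usually defined with Earth Mover's Distance; for
    # unordered categories with unit ground distance the two are the same.)
    labels = set(overall) | set(cohort_dist)
    dist = 0.5 * sum(abs(cohort_dist.get(l, 0) - overall.get(l, 0)) for l in labels)
    print(f"cohort {cohort}: distance from overall mix = {dist:.2f}",
          "(worth a closer look)" if dist > T else "(fine)")
```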
Facebook limited the ability of advertisers to exclude members of these groups in 2018 and made many of the groups unusable for targeting at all in 2020. But FLoC is a little different. It assigns numbers, not names, to cohorts, so the unsolved problem is how to tell which cohorts, if any, are actually ethnic affinity groups. One issue on GitHub asks,
If we do have an issue where racially specific targeting is incidentally created by the ML system what happens when advertisers target for or against it and who ends up responsible?
FLoC developers are planning to use sensitive-page classifiers to check which cohorts match up to sensitive groups of pages in web history. Unfortunately, checking page content is not going to give them protected group membership for the users. A simple US-based example is school and neighborhood patterns. A school that is mainly attended by members of a single ethnic group is going to have page content that's mostly the same as all the other schools in the district. The schools all have similar events and play the same sports, but serve different groups of students and parents. So, even though the content is non-sensitive, the cohort is. And local stores with similar merchandise in different neighborhoods are going to get different ethnic affinity groups, I mean cohorts, of visitors. Content in language A could be completely non-sensitive, and local content for region B could be completely non-sensitive, but the cohort of people who use language A in region B could be highly sensitive.
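To make the school example concrete, here is a toy sketch with invented data and an invented cohort function standing in for FLoC's clustering. Every page passes a content-based sensitivity check, because the pages are all about the same schedules and fundraisers, but the cohorts still sort people by neighborhood.

```python
import hashlib
from collections import defaultdict

# Toy users: (demographic_group, set of visited domains). All invented.
users = [
    ("group_x", {"north-high.example", "district-sports.example"}),
    ("group_x", {"north-high.example", "district-sports.example"}),
    ("group_y", {"south-high.example", "district-sports.example"}),
    ("group_y", {"south-high.example", "district-sports.example"}),
]

# A stand-in sensitive-topic taxonomy and a stand-in page classifier.
SENSITIVE_TOPICS = {"health", "religion", "ethnicity"}
PAGE_TOPICS = {
    "north-high.example": {"education", "sports"},
    "south-high.example": {"education", "sports"},
    "district-sports.example": {"sports"},
}

def page_is_sensitive(domain):
    """A content classifier only sees what is on the page itself."""
    return bool(PAGE_TOPICS[domain] & SENSITIVE_TOPICS)

def cohort_id(domains):
    """Stand-in for FLoC's clustering: a hash of the visited domains."""
    digest = hashlib.sha256(",".join(sorted(domains)).encode()).hexdigest()
    return int(digest, 16) % 10_000

# Every page passes the content check...
assert not any(page_is_sensitive(d) for _, domains in users for d in domains)

# ...but the cohorts still split users along group lines.
cohorts = defaultdict(list)
for group, domains in users:
    cohorts[cohort_id(domains)].append(group)

for cid, groups in cohorts.items():
    print(f"cohort {cid}: {groups}")
```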
So it might look like nobody will be able to tell which cohorts are really ethnic affinity groups until some independent data journalism site manages to do a study with a panel of opted-in users. This would be the kind of AI ethics research that is bad for career prospects at Google, but that independent organizations can often come up with the funding to do.
But one company doesn't have to wait for the study and resulting news story. Facebook has enough logged-in Google Chrome users that they could already know which FLoC cohorts match up to their old ethnic affinity groups. If a brand buys ads on the open web and relies on FLoC data, Facebook can see when the brand is doing crimes. This doesn't mean that Facebook will disclose the problem, since it gives them something to hold over the brand. No more making any stink about ad metrics or Facebook Groups IRL get-togethers. The extra risk for the advertisers means lower expected revenue for ad impressions tied to FLoC—because of uncertainties that are hard for anyone else to see.
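The join Facebook could do here is not fancy. A rough sketch, with invented user IDs, cohort numbers, and an arbitrary cutoff: take (user, cohort) observations from their own logged-in pages, join against the affinity group labels they already hold, and flag any cohort where a group shows up well above its base rate.

```python
from collections import Counter, defaultdict

# Invented data: FLoC cohorts observed for logged-in users on the platform's
# own pages, plus the affinity-group labels the platform already holds.
observed_cohorts = {"u1": 1001, "u2": 1001, "u3": 1001,
                    "u4": 2002, "u5": 2002, "u6": 2002}
group_labels = {"u1": "group_x", "u2": "group_x", "u3": "group_x",
                "u4": "group_y", "u5": "group_y", "u6": "group_x"}

base_rate = Counter(group_labels.values())
total_users = sum(base_rate.values())

members = defaultdict(list)
for user, cohort in observed_cohorts.items():
    members[cohort].append(group_labels[user])

LIFT_CUTOFF = 1.3  # arbitrary cutoff for this sketch

for cohort, groups in sorted(members.items()):
    counts = Counter(groups)
    for group, n in sorted(counts.items()):
        share = n / len(groups)
        lift = share / (base_rate[group] / total_users)
        if lift > LIFT_CUTOFF:
            print(f"cohort {cohort}: {group} over-represented (lift {lift:.1f}x)")
```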
Inspiration for the title for this post:
Your probabilistic ID is just fingerprinting. Change my mind.
— Stephanie Layser (@slayser8) Twitter, January 27, 2021
Bonus links
How Facebook got addicted to spreading misinformation