blog: Don Marti


Notes and links from my talk at RJI

13 October 2017

This is OFF MESSAGE. No Mozilla policy here. This is my personal blog.

(This is the text from my talk at the Reynolds Journalism Institute's Revenue Models that Work event, with some links added. Not exactly as delivered.)

Hi. I may be the token advertising optimist here.

Before we write off advertising, I just want to try to figure out the answer to: why can't Internet publishers make advertising work as well as publishers used to be able to make it work when they were breathing fumes from molten lead all day? Has the Internet really made something that much worse?

I have bought online advertising, written and edited for ad-supported sites, had root access to some of the servers of an adtech firm that you probably have cookies from right now, and have written an ad blocker. Now I work for Mozilla. I don't have any special knowledge on what exactly Mozilla intends to do about third-party cookies, or fingerprinting, or ad blocking, but I can share some of what I have learned about users' values, and some facts about the browser business that will inform those decisions for Mozilla and other browsers.

First of all, I want to cover how new privacy tools are breaking web advertising as we know it. But that's fine. People don't like web advertising as we know it.

So what don't they like?

A 2009 study at the University of Pennsylvania came up with the result that "most adult Americans do not want advertisers to tailor advertisements to their interests."

When the researchers explained how ad targeting works, the percentage went up.

We have known for quite a while that people have norms about how they share their personal information.

PageFair study

That Pennsylvania study isn't the only one. Just recently a company called PageFair did a survey on when people would choose to share their info on the web.

Research result: what percentage will consent to tracking for advertising? | PageFair

They surveyed 300 publishers, adtech people, brands, and various others, on whether users will consent to tracking under the GDPR and the ePrivacy Regulation.

Some examples:

The survey asked if users would allow for tracking on one site only, and for one brand only, in addition to “analytics partners”. 79% of respondents said they would click “No” to this limited consent request.

And what kind of tracking policy would people prefer in the browser by default? The European Parliament suggested that “Accept only first party tracking” should be the default. But only 20% of respondents said they would select this. Only 5% were willing to “accept all tracking”. 56% said they would select “Reject tracking unless strictly necessary for services I request”. The very large majority (81%) of respondents said they would not consent to having their behaviour tracked by companies other than the website they are visiting.

Users say that they really don't like being tracked. So, right about now is where you should be pointing out that what people say about what they want is often different from what they do.

It's hard to see exactly what people do about particular ads, but we can see some indirect evidence that what people do about creepy ads is consistent with what they say about privacy.

  • First, ad blockers didn't catch on until people started to see retargeting.

  • Second, companies indirectly reveal their user research in policies and design decisions.

Back in 1998, when Google was still "google.stanford.edu" I wrote an ad blocker. And there were a bunch of other pretty good ones in the late 1990s, too. WebWasher, AdSubtract, Internet Junkbuster. But none of that stuff caught on. That was back when most people were on dialup, and downloading a 468x60 banner ad was a big deal. That's before browsers came with pop-up blockers, so a pop-up was a whole new browser window and those could get annoying real fast.

But users didn't really get into ad blocking. What changed between then and now? Retargeting. People could see that the ad on one site had "followed them" from a previous site. That creeped them out.

Some Facebook research clearly led in the same direction.

As we should all know by now, Facebook enables an extremely fine level of micro-targeting.

Yes, you can target 15 people in Florida.

But how do the users feel about this?

We can't see Facebook's research. But we can see the result of it, in Facebook Advertising Policies. If you buy an ad on Facebook, you can target people based on all kinds of personal info, but you can't reveal that you did it.

Ads must not contain content that asserts or implies personal attributes. This includes direct or indirect assertions or implications about a person’s race, ethnic origin, religion, beliefs, age, sexual orientation or practices, gender identity, disability, medical condition (including physical or mental health), financial status, membership in a trade union, criminal record, or name.

So you can say "meet singles near you" but you can't say "other singles". You can offer depression counseling in an ad, but you can't say "treat your depression."

Facebook is constantly researching and tweaking their site, and, of course, trying to sell ads. If personalized targeting didn't creep people the hell out, then the ad policy wouldn't make you hide that you were doing it.


All right, so users don't want to be followed around.

Where does Mozilla come in?

Well, Mozilla is supposed to be all about data privacy for the user. We have these Data Privacy Principles

  1. No surprises Use and share information in a way that is transparent and benefits the user.

  2. User control Develop products and advocate for best practices that put users in control of their data and online experiences.

  3. Limited data Collect what we need, de-identify where we can and delete when no longer necessary.

  4. Sensible settings Design for a thoughtful balance of safety and user experience.

  5. Defense in depth Maintain multi-layered security controls and practices, many of which are publicly verifiable.

If you want a look at what Mozilla management is thinking about the tracking protection slash ad blocking problem, there's always Proposed Principles for Content Blocking by Denelle Dixon.

  • Content Neutrality: Content blocking software should focus on addressing potential user needs (such as on performance, security, and privacy) instead of blocking specific types of content (such as advertising).

  • Transparency & Control: The content blocking software should provide users with transparency and meaningful controls over the needs it is attempting to address.

  • Openness: Blocking should maintain a level playing field and should block under the same principles regardless of source of the content. Publishers and other content providers should be given ways to participate in an open Web ecosystem, instead of being placed in a permanent penalty box that closes off the Web to their products and services.

If we have all those great values though, why aren't we doing more to protect users from tracking?

Here's the problem from the browser point of view.

Firefox had a tracking protection feature in 2015.

Firefox had a proposed "Cookie Clearinghouse" that was going to happen with Stanford, back in 2013. Firefox developers were talking about third-party cookie blocking then, too.

Microsoft beat Mozilla to it. Microsoft Internet Explorer released Tracking Protection Lists in version 9, in 2011.

But the mainstream browsers have always been held back by two things.

First, browser developers have been cautious about not breaking sites. We know that users prefer not to be tracked from site to site, but we know that they get really mad when a site that used to work just stops working. There is a lot of code in a lot of browsers to handle stuff that no self-respecting web designer has done for decades. Remember the 1996 movie "Space Jam"? Check out the web site some time. It's a point of pride to keep all that 1996 web design working. And seriously, one of those old 1996 vintage pages might be the web-based control panel for somebody's emergency generator, or something. Yes, browsers consider the users' values on tracking, but priority one is not breaking stuff.

And that includes third-party resources that are not creepy ad trackers—stuff like shopping carts and comment forms and who knows what.

Besides not breaking sites, the other thing that keeps browsers from implementing users' values on tracking is that we know people like free stuff. For a long time, browsers didn't have enough good data, so they deferred to the adtech business when it talked about how sites make money. It looks obvious, right? Sites that release free stuff make money from ads, ads work a certain way, so if you interfere with how the ads work, then sites make less money, and users don't get the free stuff.

Mozilla backed down on third-party cookies in 2013, and again on tracking protection in 2015.

Microsoft backed down on Tracking Protection Lists.

Both times, after the adtech industry made a big fuss about it.

So what changed? Why is now different?

Well, that's an easy answer, right? Apple put Intelligent Tracking Prevention into their Safari browser, and now everybody else has to catch up.

Apple so far is putting their users ahead of the usual alarmed letters from the adtech people. Steven Sinofsky, former president of the Windows Division at Microsoft, commented on this in a tweet.

But that's not all of it.

You're going to see other browsers make moves that look like they're "following Safari" but really, browsers are not so much following each other as making similar decisions based on similar information.

When users share their values they say that they want control over their information.

When users see advertising that seems "creepy" we can see them take steps to avoid ads following them around.

Some people say, well, if users really want privacy, why don't they pay for privacy products? That's not how humans work. Users don't pay for privacy, because we don't pay other people to come into compliance with basic social norms. We don't pay our loud neighbors to quiet down.

Apple does lots of user research. I believe they're responding to what their users say.

Apple looks like a magic company that releases magic things that they make up out of their own heads. "Designed by Apple in California." This is a great show. It's part of their brand. I have a lot of respect for their ability to make things look simple.

But that doesn't mean that they just make stuff up.

Apple does a lot of user research. Every so often we get a little peek behind the curtain when there is discovery in a lawsuit. They do research on their own users, on Samsung's users, everybody.

Mozilla has user research, too.

For a long time, browser people thought that there was a conflict between giving the users something that complies with their tracking norms and giving them something that keeps them happy with the sites they want to use.

But now it turns out that we have some ways that we could act in accordance with user values that also produce measurably more satisfied users.

How badly does privacy protection break sites?

Mozilla's testing team has built, deployed to users, and tested nine different sets of cookie and tracking protection policies.

Lots of people thought there were only two kinds of settings: ones that break sites and protect users, and ones that leave sites working and leave users vulnerable.

It turns out that there is a configuration that gives both better values alignment and less breakage.

That's because a lot of the breakage is caused by third-party JavaScript.

We're learning that in a few important areas, even though Apple Safari is in the lead, Apple's Intelligent Tracking Prevention doesn't go far enough.

What users want

It turns out that when you do research with people who are not current users of ad blockers, and offer them choices of features, the popular choices are tracking blockers, malvertising protection, and blocking annoying ads such as auto-play videos. Among those users who aren't already using an ad blocker, the offer of an ad blocker wasn't as popular.

Yes, people want to see fewer annoying ads. And nobody likes malware. But people are also interested in protection from tracking. Some users even put tracking protection ahead of malvertising protection.

If you only ask about annoying ad formats you get a list of which ad formats are popular now but get on people's nerves. This is where Google is now. I have no doubt that they'll catch up. Everyone who’s ever moderated a comment section knows what the terrible ads are. And any publisher has the motivation to moderate and impose standards on the ads on their site. Finding which ads are the crappy ones is not the problem. The problem is that legit sites and crappy sites are in the same ad space market, competing for the same eyeballs. As a legit site, you have less market power to turn down an ad that does not meet your policies.

We are coming to an understanding of where users stand. In a lot of ways we're repeating the early development of spam filters, but in slow motion.

Today, a spam filter seems like a must-have feature for any email service. But MSN started talking about its spam filtering back when Sanford Wallace, the “Spam King,” was saying stuff like this.

I have to admit that some people hate me, but I have to tell you something about hate. If sending an electronic advertisement through email warrants hate, then my answer to those people is “Get a life. Don’t hate somebody for sending an advertisement through email.” There are people out there that also like us.

According to spammers, spam filtering was just Internet nerds complaining about something that regular users actually like. But the spam debate ended when big online services, starting with MSN, started talking about how they build for their real users instead of for Wallace’s hypothetical spam-loving users.

If you missed the email spam debate, don’t worry. Wallace’s talking points about spam filters constantly get recycled by the IAB and the DMA, every time a browser makes a move toward tracking protection. But now it’s not email spam that users supposedly crave. Today, they tell us that users really want those ads that follow them around.

So here's the problem. Users are clear about their values and preferences. Browsers must reflect user values and preferences. Browsers have enough of a critical mass of users demanding better protection from tracking that browsers are going to have to move or become irrelevant.

That's what the email providers did on spam. There were not enough pro-spam users to support an email service without a spam filter.

And there may not be enough pro-targeting users to support a browser without privacy tools.

As I said, I do not know exactly how Mozilla is going to handle this, but every browser is going to have to.

But I can make one safe prediction.

Browsers need users. Users prefer tracking protection. I'm going to make a really stupid, safe prediction here.

User adoption of tracking protection will not affect the amount of user data available, or affect any measurement of number of targeted ad impressions available in any way.

Every missing trackable user will be replaced by an adfraud bot.

Every missing piece of user data will be replaced by an "inferred" piece of data.

How much adfraud is there really?

There are people who will stand up and say that we have 2 percent fraud, or 85 percent. Of course it's different from campaign to campaign and some advertisers get burned worse than others.

You can see "IAS safe traffic" on fraud boards. Because video views are worth so much more, the smartest bots go there. We do know that when you look for adfraud seriously, you can find it. Just recently the Financial Times found a bunch.

The publisher has found display ads against inventory masquerading as FT.com on 10 separate ad exchanges and video ads on 15 exchanges, even though the FT doesn’t even sell video ads programmatically, with 300 accounts selling inventory purporting to be the FT’s. The scale of the fraud uncovered is vast — the equivalent of one month’s supply of bona fide FT.com video inventory was fraudulently appearing in a single day.

The FT warns advertisers after discovering high levels of domain spoofing

If you were trying to build an advertising business to facilitate fraud, you could not do much better than the current system.

That's because the current web advertising system is based on tracking users from high-value sites to low-value sites. Walt Mossberg recounts a dinner conversation with an advertiser:

[W]e were seated next to the head of this advertising company, who said to me something like, "Well, I really always liked AllThingsD and in your first week I think Recode’s produced some really interesting stuff." And I said, "Great, so you’re going to advertise there, right? Or place ads there." And he said, "Well, let me just tell you the truth. We’re going to place ads there for a little bit, we’re going to drop cookies, we’re going to figure out who your readers are, we’re going to find out what other websites they go to that are way cheaper than your website and then we’re gonna pull our ads from your website and move them there."

The current web advertising system is based on paying publishers less and charging brands more. Revenue share for legit publishers is at 30 to 40 percent according to the Association of National Advertisers. But all revenue split numbers are wrong because undetected fraud ends up in the ‘publisher’ share.
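To see why undetected fraud skews the split, here is a small arithmetic sketch. The function name and the fraud fraction are hypothetical, for illustration only. The only figure taken from the text is the reported 30 to 40 percent share.

```javascript
// Hypothetical illustration: none of the inputs here are measured figures.
// reportedPublisherShare: fraction of ad spend reported as reaching "publishers".
// undetectedFraudFraction: assumed fraction of that payout that actually goes
// to fraudulent inventory and is never detected.
function legitPublisherShare(reportedPublisherShare, undetectedFraudFraction) {
  for (const value of [reportedPublisherShare, undetectedFraudFraction]) {
    if (typeof value !== "number" || value < 0 || value > 1) {
      throw new RangeError("inputs must be fractions between 0 and 1");
    }
  }
  // Whatever fraud takes comes out of the share counted as "publisher" revenue.
  return reportedPublisherShare * (1 - undetectedFraudFraction);
}
```

If the reported share were 40 percent and half of those payouts went to undetected fraudulent inventory, legit publishers would really be getting 20 percent of the spend.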

When your model is based on data leakage, on catching valuable eyeballs on cheap sites, the inevitable overspray is fraud.

People aren't even paying attention to what could be the biggest form of adfraud.

Part of the conventional wisdom on adfraud is that you can beat it by tracking users all the way to a sale, and filter the bots out that way. After all, if they made a bot good enough to actually buy stuff it wouldn't be a problem for the client.

But the attribution models that connect impressions to sales are, well, they're hard enough to understand that most of the people who understand them are probably fraud hackers.

The dispute between Steelhouse and Criteo settled last year, so we didn't get to see how two real adtech companies might or might not have been hacking each other's attribution numbers.

But today we have another chance.

I used to work for Linux Journal, and we followed the SCO case pretty intently. There was even a dedicated news site just about the case, called Groklaw. If there's a case that needs a Groklaw for web advertising, it's Uber v. Fetch.

Unwanted ads on Breitbart lead to massive click fraud revelations, Uber claims | Ars Technica

This is the closest we have to a tool to help us understand attribution fraud. When the bad guys have the ability to make bogus ads claim credit for real sales, that's a much more powerful motivation for fraud than just making a bot that looks like a real user watching a video.

Legit publishers have a real incentive to find and control adfraud. Adtech intermediaries, not so much. That's because the core value of ad tech is to find the big money user at the cheapest possible site. If you create that kind of industry, you create the incentive for fraud bots who appear to be members of a valuable audience. You create incentives to produce fraudulent sites because all of a sudden, those kinds of sites have market value that they would not otherwise have had because of data leakage.

As browsers and sites implement user norms on tracking, they get fraud protection for free.

So where is the outrage on adfraud?

I thought I could write a script for a heist movie about adfraud.

At first I thought, this is awesome! Computer hacking, big corporations losing billions of dollars—should be a formula for an awesome heist movie, right?

Every heist movie has a bunch of scenes that introduce the characters, you know, getting the crew together. Forget it. All the parts of adfraud can be done independently and connected on the free market. It's all on a bunch of dumb-looking PHP web boards. There go a whole bunch of great scenes.

Hard-boiled detectives trying to catch the gang? More like over easy. The adtech industry "committed $1.5 million in funding" (and set up a 24-member committee!) to fight an eleven billion dollar problem. Adfraud isn't taking candy from a baby, it's taking candy from a dude whose job is giving away candy. More fraud means more money for adtech intermediaries.

Dramatic risk of getting caught? Not a chance of going to prison—the worst that happens is that some of the characters get their accounts or domains banned, and they have to make new ones. The adfraud movie's production designer is going to have to work awful hard to make that "Access denied" screen look cool enough to keep the audience awake.

So the movie idea is a no-go, but as people learn that today's web ads don't just leave the publisher with 30 percent but also feed fraud, we should see a flight to quality effect.

The technical decisions that enabled the Lumascape to rip off Walt Mossberg are the same decisions that facilitate fraud, are the same decisions that make users come looking for tracking protection.

I said I was an advertising optimist and here's why.

The tracking protection trend is splitting web advertising.

We have the existing high-tracking, high-fraud market and a new low-tracking opportunity.

Some users are getting better protected from cross-site tracking.

The bad news is that it will be harder to serve those users a lucrative ad enabled by third-party tracking data.

The good news is that those users can't be tracked from high-value to low-value sites. Those users start to become possible to tell apart from fraudbots.

For that subset of users, web advertising starts to shift from a hacking game to a reputation game.

In order to sell advertising you need to give the advertiser some credible information on who the audience is. Most browsers have been bad at protecting personal information about the user, so web advertising has become a game where a whole bunch of companies compete to covertly capture as much user info as they can.

But some browsers are getting better at implementing people’s preferences about sharing their information. The result, for those users, is a change in the rules of the game. Investment in taking people’s personal info is becoming less rewarding, as browsers compete to reflect people’s preferences.

And investments in building sites and brands that are trustworthy enough for people to want to share their information will tend to become more rewarding. This shift naturally leads to complaints from people who are used to winning the old game, but will probably be better for customers who want to use trustworthy brands and for people who want to earn money by making ad-supported news and cultural works.

There are people building a new web advertising system around user-permissioned information, and they've been doing it for a long time. But until now, nobody has really wanted to deal with them, because adtech has just been selling that information, taken from the user without permission. Tracking protection will be the motivation for forward-thinking brand people to catch the flight to quality and shift web ad spending from the hacking game to the reputation game.

Now that we have a better understanding of how user norms are aligned with the interests of independent browsers and with the interests of high-reputation sites, what's next?

Measure the tracking-protected audience

Legit sites are in a strong position to gather some important data that will shift web ads from a hacking game to a reputation game. Let's measure the tracking-protected audience.

Tracking protection is a powerful sign of a human audience. A legit site can report a tracking protection percentage for its audience, and any adtech intermediary who claims to offer advertisers the same audience, but delivers a suspiciously low tracking protection number, is clearly pushing a mismatched or bot-heavy audience and is going to have a harder time getting away with it.

Showing prospective advertisers your tracking protection data lets you reveal the tarnish on the adtech "Holy Grail"—the promise of high-value eyeballs on crappy sites.

Here is some JavaScript to make that measurement in a reliable way that detects all the major tracking protection tools.
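That script is not reproduced in this text. As a stand-in, here is a minimal sketch of the general idea, not the original code. It assumes a page can probe for tracking protection by requesting a third-party script that protection tools are expected to block, then count the share of protected visits. The probe URL, the function names, and the 0.5 tolerance are all hypothetical, and a single probe like this will not detect every protection tool.

```javascript
// Hypothetical sketch, not the script the talk links to.

// Browser side: try to load a script from a third-party URL that tracking
// protection is expected to block. `probeUrl` is a placeholder; a real
// deployment needs a URL that protection tools actually block.
// Resolves true (looks protected), false (probe loaded), or null (no DOM).
function probeTrackingProtection(probeUrl, timeoutMs = 3000) {
  return new Promise((resolve) => {
    if (typeof document === "undefined") {
      resolve(null); // not running in a browser, so nothing to measure
      return;
    }
    const script = document.createElement("script");
    // Assumption: a probe that never loads was blocked. A network failure
    // would be miscounted as protection by this simple version.
    const timer = setTimeout(() => resolve(true), timeoutMs);
    script.onload = () => { clearTimeout(timer); resolve(false); };
    script.onerror = () => { clearTimeout(timer); resolve(true); };
    script.src = probeUrl;
    document.head.appendChild(script);
  });
}

// Reporting side: fraction of measured visits that looked protected.
// Each visit is { isProtected: true | false | null }; null means unmeasured.
function trackingProtectionRate(visits) {
  const measured = visits.filter((v) => v.isProtected !== null);
  if (measured.length === 0) return null;
  const protectedCount = measured.filter((v) => v.isProtected).length;
  return protectedCount / measured.length;
}

// Flag an intermediary whose claimed "same audience" shows a much lower
// tracking protection rate than the site itself. The tolerance is arbitrary.
function looksMismatched(siteRate, intermediaryRate, tolerance = 0.5) {
  return intermediaryRate < siteRate * tolerance;
}
```

A site could record one probe result per page view, report the resulting rate to advertisers, and ask any intermediary claiming the same audience for its own number.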

You can't sell advertising without data on who the audience is. Much of that data will have to come from the tracking-protected audience. When quality sites share tracking protection data with advertisers, that helps expose the adfraud that intermediaries have no incentive to track down.

This is an opportunity for service journalism.

Users are already concerned and confused about web ads. That's an opportunity that some legit sites such as the Wall Street Journal and The New York Times are already taking advantage of. The more that someone learns about how web advertising works, the more that he or she is motivated to get protected.

But if you don't talk to your readers about tracking protection, who will?

A lot of people are getting caught up today in publisher-hostile schemes such as adblockers with paid whitelisting, or adblockers that come with malware or adware.

If you don't recommend a publisher-friendly protection tool or setting, they'll get a bad one from somewhere else.

I really like ads.

At the airport on the way here I saw that they just came out with a hardcover collection of the complete Kurt Vonnegut stories. A lot of those stories were paid for by Collier’s ads run in the 1950s, and we're still getting the positive externalities from that advertising today.

Advertising done right can be a growth spiral of economic growth, reputation building, and creation of cultural works. It’s one of the most powerful forces to produce news, entertainment goods, fiction. Let's fix it.