---

blog: Don Marti

---

Web ad bargain?

12 July 2018

Tim Peterson, on Digiday:

If an exchange or SSP declines to sign the agreement, it is limited to only selling non-personalized ads through DBM. Those generic ads generate less revenue for publishers than personalized ads that are targeted to specific audiences based on data collected about them. Some publishers that are heavily reliant on DBM have seen their revenues decline by 70-80 percent since GDPR took effect because they were limited to non-personalized ads, said another ad tech exec. That revenue drop has put pressure on exchanges and SSPs to sign Google’s consent agreement lest their publishers move their inventory to other platforms that can run DBM’s personalized ads on their sites, the second exec said.

(‘It’s impossible’: Google has asked ad tech firms to guarantee broad GDPR consent, assume liability - Digiday)

A lot of those "specific audiences" are, of course, adfraud bots. Fraud hackers are better at adtech than adtech firms are. So ads shown to bots, on shitty sites, are going for more than ads seen by humans on legit sites.

Meanwhile, tracking-resistant, personalization-averse readers are overrepresented in some customer categories. Web developers are a good example. (40% protected based on recent data from one popular site.)

Of course, today's web ad system is based on tracking the best possible prospect to the cheapest possible site, so it won't be easy to take advantage of this nice piece of market inefficiency. First step is figuring out how well protected the people you want to reach are.

More: blog.aloodo.org - Beware of averages: why you need a local tracking protection metric

Bug futures: business models

10 July 2018

Recent question about futures markets on software bugs: what's the business model?

As far as I can tell, there are several available models, just as there are multiple kinds of companies that can participate in any securities or commodities market.

Cushing, Oklahoma

Oracle operator: Read bug tracker state, write futures contract state, profit. This business would take an agreed-upon share of any contract in exchange for acting as a referee. The market won't work without the oracle operator, which is needed in order to assign the correct resolution to each contract, but it's possible that a single market could trade contracts resolved by multiple oracles.

Actively managed fund: Invest in many bug futures in order to incentivize a high-level outcome, such as support for a particular use case, platform, or performance target.

Bot fund: An actively managed fund that trades automatically, using open source metrics and other metadata.

Analytics provider: Report to clients on the quality of software projects, and the market-predicted likelihood that the projects will meet the client's maintenance and improvement requirements in the future.

Stake provider: A developer participant in a bug futures market must invest to acquire a position on the fixed side of a contract. The stake provider enables low-budget developers to profit from larger contracts, by lending or by investing alongside them.

Arbitrageur: Helps to re-focus development efforts by buying the fixed side of one contract and the unfixed side of another. For example, an arbitrageur might buy the fixed side of several user-facing contracts and the unfixed side of the contract on a deeper issue whose resolution will result in a fix for them.

Arbitrageurs could also connect bug futures to other kinds of markets, such as subscriptions, token systems, or bug bounties.
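As a rough illustration of the oracle operator role described above, here is a minimal sketch of resolving one contract. It is not taken from the Bugmark paper or any real market. The `BugFuture` class, the `resolve` function, the issue key, and the 2% oracle share are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class BugFuture:
    issue_id: str         # hypothetical bug tracker issue key
    fixed_stake: float    # paid in by the side betting the bug gets fixed
    unfixed_stake: float  # paid in by the side betting it stays unfixed


def resolve(contract: BugFuture, issue_is_fixed: bool, oracle_share: float = 0.02) -> dict:
    """Read the bug tracker state, then pay the pot (minus the oracle's share) to one side.

    The 2% default share is an assumption made up for this sketch.
    """
    pot = contract.fixed_stake + contract.unfixed_stake
    fee = pot * oracle_share
    winner = "fixed" if issue_is_fixed else "unfixed"
    return {"winner": winner, "payout": pot - fee, "oracle_fee": fee}


# Example: a developer holds the fixed side, and the tracker says the bug is fixed.
contract = BugFuture("PROJ-123", fixed_stake=20.0, unfixed_stake=80.0)
result = resolve(contract, issue_is_fixed=True)
print(result)
```

In this sketch a stake provider would simply be whoever supplies part of `fixed_stake`, and an arbitrageur would hold the fixed side of some contracts and the unfixed side of others.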

Previous items in the bug futures series:

Bugmark paper

A trading market to incentivize secure software: Malvika Rao, Georg Link, Don Marti, Andy Leak & Rich Bodo (PDF) (presented at WEIS 2018)

Corporate Prediction Markets: Evidence from Google, Ford, and Firm X (PDF) by Bo Cowgill and Eric Zitzewitz.

Despite theoretically adverse conditions, we find these markets are relatively efficient, and improve upon the forecasts of experts at all three firms by as much as a 25% reduction in mean squared error.

(This paper covers a related market type, not bug futures. However some of the material about interactions of market data and corporate management could also turn out to be relevant to bug futures markets.)

Creative Commons

Pipeline monument in Cushing, Oklahoma: photo by Roy Luck for Wikimedia Commons. This file is licensed under the Creative Commons Attribution 2.0 Generic license.

take the YouTube advertisers bowling

08 July 2018

What if there is a better way forward on the whole Safe Harbor controversy and Article 13?

Companies don't advertise on sites like YouTube, sites teeming with copyright infringers and nationalist extremists, because those companies are run by copyright infringers or nationalist extremists. Marketing decision-makers are incentivized to play a corrupt online advertising game that rewards them for supporting infringement and extremism.

So the trick here is to help people move marketing money out of bad things (negative externalities) and toward good things (positive externalities). We know that YouTube is a brand-unsafe shitshow because Google won't advertise its own end-user-facing products and services there without a whole extra layer of brand safety protection.

Big Internet companies are set up to insulate decision-makers from the consequences of their own online asshattery, anyway. The way to affect those big Internet companies is through their advertisers. So how about a tweak to Article 13? Instead of putting the consequences of infringement on the "online content sharing service provider," put it on the brand advertised. This should help in several ways.

  • Give legit services some flexibility. If your web site's business model is anything other than "get cheap eyeballs with other people's creative work" or "get cheap eyeballs by recommending divisive bullshit" then you don't have to change a thing.

  • Incentivize sites to pay for new creative work, by making works covered by an author or artist contract a more attractive place for paid advertising than "content" uploaded by random users.

  • Make it easier for marketers who want to do the right thing, by pointing out the risks of supporting bad people.

  • Move some of the risks of online advertising away from the public and toward the people who can make a difference.

How about it?

Nudgestock 2018 transcript

03 July 2018

(This is a cleaned-up and lightly edited version of my talk from Nudgestock 2018.)

First I have to give everybody a disclaimer. This is 100% off message. I work for Mozilla. I am NOT speaking for Mozilla here.

If you follow Rory, you have probably heard a lot about signaling in advertising, so I'm going to go over this material pretty quickly. Why does Homo economicus read magazine advertising but hang up on cold calls? To put it another way, why is every car commercial the same? You could shoot the "car driving down the windy road" commercial with any car. All that the car commercial tells you is: if it was a waste of your time to test drive our car then it would have been a waste of our money to make this little movie about it.

There's a whole literature of economics and math about signaling involving deceptive senders and honest senders. With this paper, Gardete and Bart show that when the sender wants to really get a message across, counter-intuitively the best thing for the sender to do is deprive themselves of some information about the receiver. If you're in the audience and you know what the sender knows about you, then you can't tell are they honestly expressing their intentions in the market, or are they just telling you what you want to hear? Anyone who used to read Computer Shopper magazine for the ads didn't read for specific information about all the parts that you might put into your computer. You read it to find out which manufacturers are adopting which standards so you don't buy a motherboard that won't support the video card that you might want to upgrade to next year.

There are three sets of papers in the signaling literature. There are papers that have pure math where you devise kind of a mathematical game of buyers and sellers and see how that game works out. And there are papers where you take users in an experimental setting. Ambler and Hollier took 540 users, showed them different versions of expensive looking and cheap looking advertising that conveys the same information. Finally you've got the kind of research that looks at spending across different product categories, and in this study they found that a product type's advertising to sales ratio really depends on how much extra user experience it takes to evaluate that product.

The feedback loop here is that when brands have signaling power, then that means market power for the publishers that carry their advertising, which means advertising rates tend to go up, which means the publishers can afford to make obviously expensive content. And when you attach advertising to obviously expensive content, that means more signaling power. It's kind of a loop that builds more and more value for the advertiser.

Some people compare this to the signaling that a bank does when they build this monstrous stone building to keep your money. Really, the stuff that a bank does, having a stone building doesn't do any more for keeping money in it than having a metal building or a concrete building, but it just shows that they've got this big stone building with their name on it so if they turned out to be deceptive it would be more costly for them to do it. That's the pure signaling model. But the other area that we can see when we compare this kind of classic signal-carrying advertising to online advertising, the kind of ads that are targeted to you based on who you are, is what's up with the norms enforcers?

Rory has his blue checkmark on Twitter which means he doesn't see Twitter ads. I'm less Internet Famous, so I still get the advertising on Twitter. A lot of the ads that I get are deceptive issue ads. This is one. A company that's getting sued for lead paint related issues is trying to convince residents of California that government inspectors are coming to their houses to declare them a nuisance. This is bogus and it's the kind of thing that if it appeared in the newspaper that everyone got to see then journalists and public interest lawyers, and everyone else who enforces the norms on how we communicate, would call it out. But in a targeted ad medium this kind of deceptive advertising can target me directly.

So let me show a little simulation here. What we're looking at is deceptive sellers making a sale. When a deceptive seller makes a sale that's a red line. When an honest seller makes a sale, that's a green line. The little blue squares are norms enforcers, and the only thing that makes a norms enforcer different in this game from a regular customer is when a deceptive seller contacts a norms enforcer the deceptive seller pays a higher price than they would have made in profit from a sale. So with honest sellers and deceptive sellers evolving and competing in this primordial soup of customers, what ends up happening to the deceptive sellers that try to do a broad reach and hit a bunch of different customers is, well you saw them, they hit the norms enforcers, the blue squares lit up. Advertisers who are deceptive and try to reach a bunch of different people end up getting squeezed out in this version of the game. An honest advertiser like this little square down here can reach over the whole board because they don't pay the penalty for reaching the norms enforcer.

So what does this really mean for the real web? On the World Wide Web, have we inadvertently built a game that gives an unfair advantage to deceptive sellers? If somebody can take advantage of all the user profiling information that's available out there, and say, "oh I believe that these people are rural, low-income, unlikely to be finance journalists, therefore I'm going to hit them with the predatory finance ads," does that cause users to pay less attention to the medium?

Online advertising effectiveness has declined since the launch of the first banner advertisement in 1994. That's certainly not news. This is a slide that appeared in Mary Meeker's famous Internet Trends presentation, and as you can see blue is percentage of ad spending, grey is percentage of people's time. So TV is 36% of the time, 36% of the money. Desktop web 18%, 20%, about right.

What's going on with print? Print is 9% of the money for 4% of the time. Now you might say this is just inertia, that this year people are finally just cutting back on spending money in print because of people spending less time on print and it'll eventually catch up. But I went back and plotted the same slide from the same presentation going back to 2011, and I've got time plotted across the bottom, money plotted on the y axis, and what do we see about print? Print is on a whole different trend line. Print is on a trend line of much more value to the advertiser per unit of time spent than these other ad media. My hypothesis is that targeting breaks signaling and this means an opportunity.
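To make the print comparison concrete, here is a quick check of the arithmetic using only the three pairs of percentages quoted above. The variable and function names are just for illustration, and the rest of the slide is not reproduced here.

```python
# (percent of people's time, percent of ad spending) as quoted in the talk
shares = {
    "TV": (36, 36),
    "Desktop web": (18, 20),
    "Print": (4, 9),
}


def money_per_time(time_pct, money_pct):
    """Share of ad money divided by share of time; 1.0 means the two match."""
    return money_pct / time_pct


ratios = {medium: money_per_time(t, m) for medium, (t, m) in shares.items()}

for medium, ratio in ratios.items():
    print(f"{medium}: {ratio:.2f}")
```

On these numbers TV comes out at 1.00, desktop web at about 1.11, and print at 2.25: more than twice the ad money per unit of time.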

Targeting means that when you see an ad coming in targeted to you it's more like a cold call. It doesn't carry credible information about the seller's intention in the market.

From the point of view of who has an incentive to support signal-carrying ad media instead, the people who have an interest in that signal for attention bargain, in that positive feedback loop, are of course the publishers, high reputation brands that want to be able to send that signal, writers, photographers, and editors, people who get paid by that publisher, and people who benefit from the positive externalities of those signal carrying ads that support news and cultural works.

So if the signaling model is such a big thing then why are there so many targeted ads still out there?

Nudges.

Let's have a look at, just to pick an example, the Facebook advertising policy. As you know, the Facebook advertising platform will let you micro target individuals extremely specifically. You can pick out seven people in Florida, you can pick out everyone who's looking for an apartment who doesn't have a certain ethnic affinity, that kind of thing. But the one thing you're not allowed to do with Facebook targeting is put anything in your ad that might indicate how you're targeting it. The policy says:

ads must not contain content that asserts or implies personal attributes

You can't say, I know you're male or female, I know your sexual orientation, I know what you do for a living. The ad copy has to be generic even if the targeting can be extremely specific. You can't even say other. You can't say meet other singles because that implies that the advertiser knows that the reader is single. Facebook will let you target people with depression but you can't reveal that you know that about them. Another good example is Target. They do targeting of individuals who they believe to be pregnant, but they'll pad out those ads for baby stuff with ads for other types of products so as not to creep everybody out.

Back to our shared interest in the signal for attention bargain. Pretty much everybody has an interest in that original positive feedback loop: getting higher reputation for brands, and getting reputation driven publishers that'll build high quality content for us. Writers and photographers have an interest in getting paid, and people who are shopping for goods are the ones who want the signal the most. All that stands on the opposite side is behavioral tricks to conceal targeting. Now I'm not going to frame this as a privacy issue. I know that there are privacy issues here but that is really not my department. Besides, Facebook just announced a dating site so they're going to breed privacy preferences out of their user base anyway.

Can the web as an advertising medium be redesigned to make it work better for carrying signal? We know from the existence of print that this type of signal carrying ad medium can exist. Print is an existence proof of signal carrying advertising. We also know that building that kind of an ad medium can't be that hard because print was built when people were breathing fumes from molten lead all day.

The prize for building a signal-carrying ad medium is all the cultural works that you get when somebody like Kurt Vonnegut can quit his job as manager of a car dealership and write for Collier's magazine full-time. This book is still on sale with the resulting stories. And of course local news. Democracy depends on the vital flow of information of public interest. Some people say that the problem with news and information on the web is that it's all been made free, and if people would just subscribe we could fix the system. But honestly if free was the problem, then Walter Cronkite would have destroyed the media business in 1962. It's a market design problem and a signaling problem, not just a problem of who has to pay for what.

And the web browsers got a bunch of things wrong in the 1990s. There are certain patterns of information flow that the browser facilitated, like third-party tracking, where browsers enable some companies to follow your activity from site to site, and data leakage. Things that just don't work according to the way that people expect. Most people don't want their activity on one site to follow them over to another site, and the original batch of web browsers got that terribly wrong. The good news is web browsers are getting it right, and web browsers are under tremendous pressure now to do so. As a product the web browser is pretty much complete and working and generic. The whole point of a web browser is it shows web sites the same as all the other web browsers do, so there's less and less reason for a user to want to switch web browsers. But everybody who is trying to get you to install a web browser needs for there to be a reason, so the opportunity for browsers is to align with those interests of users that the browser wasn't able to pick up on previously.

At Mozilla some user researchers recently did a study on users with no ad blocker installed and users within the first few weeks of installing an ad blocker. Anybody want to guess on the increased engagement? How much more time those ad blocker users spend with that same browser than the non ad blocker users? Anybody shout out a number. All right, 28%. From the point of view of the browser those kinds of numbers, moving user engagement in a way that helps that browser meet its goals, that's something that the browser can't ignore. So that means we're going from the old web game where everyone tries to win by collecting as much data on people as they can without their permission to a new game in which the browser, high reputation publishers, and high reputation brands are all aligned in trying to build enough trust to work on information that users choose to share.

I know when I say information that users choose to share you're going to think about all these GDPR dialogs and I know I've seen these too, and there are just tons of companies on these. To be honest, looking at some of these company names it looks like most of them were made up by guys from Florida who communicate primarily by finger guns. Users should not have to micromanage their consent for all this data collection activity any more than email users should have to go in and read their SMTP headers to filter spam. And really if you think about what brands are, it's offloading information about a product buying decision onto the reputation coprocessor in the user's brain. It's kind of like taking a computational task and instead of running it on the CPU in your data center where you have to pay the power and cooling bills for it, you offload it and run it on the GPU on the client. It'll run faster, it'll run better, and the audience is maintaining that reputation state.

The future is here, it's just not very evenly distributed, as William Gibson said. This picture is the cyberpunk of the 1990s. Today all of that stuff he's carrying, his video camera, his laptop, his scanner, all that stuff's on a phone and everybody has it.

Today, the privacy sensitive users, the ones who are already working based on sharing data with permission, they're out there. But they're in niches today. If you have a relationship with those people now, then now is an opportunity to connect with them, figure out how to build that signal carrying advertising game, and create a reputation based advertising model for the web. Thank you very much.

Worse is better, again?

02 July 2018

Are there parallels between the rise of Worse Is Better in software and the success of the "uncreative counterrevolution" in advertising? (for more on that second one: John Hegarty: Creativity is receding from marketing and data is to blame) The winning strategy in software is to sacrifice consistency and correctness for simplicity. (probably because of network effects, principal-agent problems, and market failures.) And it seems like advertising has similar trade-offs between

  • Signal

  • Measurability (How well can we measure this project's effect on sales?)

  • Message (Is it persuasive and on brand?)

Just as it's rational for software decision-makers to choose simplicity, it can be rational for marketing decision-makers to choose measurability over signal and message. (This is probably why there is a brand crisis going on: short-term CMOs are better off when they choose brand-unsafe tactics, sacrificing Message.)

As we're now figuring out how to use market-based tools to fix market failures in software, where can we use better market design to fix market failures in advertising? Maybe this is where it actually makes sense to use #blockchain: give people whose decisions can affect #brandEquity some kind of #skinInTheGame?

Against privacy defeatism: why browsers can still stop fingerprinting

How to get away with financial fraud

Google invests $22M in feature phone operating system KaiOS

Inside the investor revolt that’s trying to take down Mark Zuckerberg

Ryan Wallman: Marketers must loosen their grip on the creative process

Open source sustainability

K2’s Media Transparency Report Still Rocks The Ad Industry Two Years After Its Release

Mark Ritson: How ‘influencers’ made my arse a work of art

Ad fraud one of the most profitable criminal enterprises in the world, researcher says

Cover story: Adtech won’t fix ad fraud because it is too lucrative, say specialists

https://hackernoon.com/why-funding-open-source-is-hard-652b7055569d

Sir John Hegarty: Great advertising elevates brands to a part of culture

https://www.canvas8.com/blog/2018/ju/behavioural-science-insights-nudgestock-2018.html …

blood donation: no good deed goes unpunished

19 June 2018

I have been infected with the Ebola virus.

I have had sex with another man in the past year.

I am taking Coumadin®.

Actually, none of those three statements is true. And Facebook knows it.

The American Red Cross has given Facebook this highly personal information about me, by adding my contact info to an "American Red Cross Blood Donors" Facebook Custom Audience. If any of that stuff were true, I wouldn't have been allowed to give blood.

When I heard back from the American Red Cross about this personal data problem, they told me that they don't share my health information with Facebook.

That's not how it works. I'm listed in the Custom Audience as a blood donor. Anyway, too late. Facebook has the info now.

So, which of its promises about how it uses people's personal information is Facebook going to break next?

And is some creepy tech bro right now making a killer pitch to Paul Graham about a business plan to "disrupt" the health insurance market using blood donor information?

I should not have to care about this, and I don't have time to. I don't even have time to attempt a funny remark about the whole Facebook board member Peter Thiel craving blood thing.

Helping people move ad budgets away from evil stuff

17 June 2018

Hugo-award-winning author Charles Stross said that a corporation is some kind of sociopathic hive organism, but as far as I can tell a corporation is really more like a monkey troop cosplaying a sociopathic hive organism.

This is important to remember because, among other reasons, it turns out that the money that a corporation spends to support democracy and creative work comes from the same advertising budget as the money it spends on random white power trolls and actual no-shit Nazis. The challenge for customers is to help people at corporations who want to do the right thing with the advertising budget, but need to be able to justify it in terms that won't break character (since they have agreed to pretend to be part of a sociopathic hive organism that only cares about its stock price).

So here is a quick follow-up to my earlier post about denying permission for some kinds of ad targeting.

TechCrunch reports that "Facebook Custom Audiences," the system where advertisers upload contact lists to Facebook in order to target the people on those lists with ads, will soon require permission from the people on the list. Check it out: Introducing New Requirements for Custom Audience Targeting | Facebook Business. On July 2, Facebook's own rules will extend a subset of Europe-like protection to everyone with a Facebook account. Beaujolais!

So this is a great opportunity to help people who work for corporations and want to do the right thing. Denying permission to share your info with Facebook can move the advertising money that they spend to reach you away from evil stuff and towards sites that make something good. Here's a permission withdrawal letter to cut and paste. Pull requests welcome.

simulating a market with honest and deceptive advertisers

11 June 2018

At Nudgestock 2018 I mentioned the signaling literature that provides background for understanding the targeted advertising problem. Besides being behind paywalls, a lot of this material is written in math that takes a while to figure out. For example, it's worth working through this Gardete and Bart paper to understand a situation in which the audience is making the right move to ignore a targeted message, but it can take a while.

Are people rational to ignore or block targeted advertising in some media, because those media are set up to give an incentive to deceptive sellers? Here's a simulation of an ad market in which that might be the case. Of course, this does not show that in all advertising markets, better targeting leads to an advantage for deceptive sellers. But it is a demonstration that it is possible to design a set of rules for an advertising market that gives an advantage to deceptive sellers.

What are we looking at? Think of it as a culture medium where we can grow and evolve a population of single-celled advertisers.

The x and y coordinates are some arbitrary characteristic of offers made to customers. Customers, invisible, are scattered randomly all over the map. If a customer gets an offer for a product that is close enough to its preferences, it will buy.

Advertisers (yellow to orange squares) get to place ads that reach customers within a certain radius. The advertiser has a price that it will bid for an ad impression, and a maximum distance at which it will bid for an impression. These are assigned randomly when we populate the initial set of advertisers.

High-bidding advertisers are more orange, and lower-bidding advertisers are more pale yellow.

An advertiser is either deceptive, in which case it makes a slightly higher profit per sale, or honest. When an honest advertiser makes a sale, we draw a green line from the advertiser to the customer. When a deceptive advertiser makes a sale, we draw a red line. The lines appear to fade out because we draw a black line every time there is an ad impression that does not result in a sale.

So why don't the honest advertisers die out? One more factor: the norms enforcers. You can think of these as product reviewers or regulators. If a deceptive advertiser wins an ad impression to a norms enforcer, then the deceptive advertiser pays a cost, greater than the profit from a sale. Think of it as having to register a new domain and get a new logo. Honest advertisers can make normal sales to the norms enforcers, which are shown as blue squares. An ad impression that results in an "enforcement penalty" is shown as a blue line.

So, out of those relatively simple rules (two kinds of advertisers and two kinds of customers) we can see several main strategies arise. Your run of the simulation is unique, and you can also visit the big version.

What I'm seeing on mine is some clusters of finely targeted deceptive advertisers, in areas with relatively few norms enforcers, and some low-bidding honest advertisers with a relatively broad targeting radius. Again, I don't think that this necessarily corresponds to any real-world advertising market, but it is interesting to figure out when and how an advertising market can give an advantage to deceptive sellers, and what kinds of protections on the customer side can change the game.
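For readers who want to poke at the rules without the graphics, here is a minimal sketch of one round of the game as described above. It is not the code behind the simulation on this page. The profit, penalty, and buy-distance constants, the "highest bid wins" auction rule, and all class and function names are assumptions made up for this example, and the evolution step (advertisers reproducing and dying out) is left out.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Advertiser:
    x: float
    y: float
    bid: float       # price it will bid for an ad impression
    radius: float    # maximum distance at which it will bid
    deceptive: bool
    balance: float = 0.0


@dataclass
class Customer:
    x: float
    y: float
    enforcer: bool   # True for a norms enforcer


# Illustrative constants (assumptions, not values from the real simulation).
HONEST_PROFIT = 1.0
DECEPTIVE_PROFIT = 1.2   # "slightly higher profit per sale"
PENALTY = 3.0            # greater than the profit from a sale
BUY_DISTANCE = 0.1       # how close an offer must be for a customer to buy


def run_round(advertisers, customers):
    """Offer one ad impression per customer and settle each result."""
    events = []
    for c in customers:
        bidders = [a for a in advertisers
                   if math.hypot(a.x - c.x, a.y - c.y) <= a.radius]
        if not bidders:
            continue
        winner = max(bidders, key=lambda a: a.bid)  # assumed: highest bid wins
        winner.balance -= winner.bid                # the winner pays its bid
        if c.enforcer and winner.deceptive:
            winner.balance -= PENALTY               # blue line
            events.append("penalty")
        elif math.hypot(winner.x - c.x, winner.y - c.y) <= BUY_DISTANCE:
            winner.balance += DECEPTIVE_PROFIT if winner.deceptive else HONEST_PROFIT
            events.append("deceptive sale" if winner.deceptive else "honest sale")
        else:
            events.append("no sale")                # black line
    return events


def random_market(n_advertisers=20, n_customers=500, enforcer_rate=0.05, seed=0):
    """Scatter advertisers and customers randomly over a unit square."""
    rng = random.Random(seed)
    advertisers = [Advertiser(rng.random(), rng.random(),
                              bid=rng.uniform(0.01, 0.5),
                              radius=rng.uniform(0.02, 0.5),
                              deceptive=rng.random() < 0.5)
                   for _ in range(n_advertisers)]
    customers = [Customer(rng.random(), rng.random(), rng.random() < enforcer_rate)
                 for _ in range(n_customers)]
    return advertisers, customers


advertisers, customers = random_market()
events = run_round(advertisers, customers)
print({kind: events.count(kind) for kind in set(events)})
```

Changing `enforcer_rate` or `PENALTY` is one way to explore what kinds of customer-side protections change the game.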

How The California Consumer Privacy Act Stacks Up Against GDPR

The biggest lies that the martech and adtech worlds tell themselves

‘Personalization diminished’: In the GDPR era, contextual targeting is making a comeback

How media companies lost the advertising business

Ben Miroglio, David Zeber, Jofish Kaye, and Rebecca Weiss. 2018. The Effect of Ad Blocking on User Engagement with the Web. In WWW 2018: The 2018 Web Conference, April 23–27, 2018, Lyon, France. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3178876.3186162

When can deceptive sellers outbid honest sellers for ad impressions?

Google Will Enjoy Major GDPR Data Advantages, Even After Joining IAB Europe’s Industry Framework

https://www.canvas8.com/content/2018/06/07/don-marti-nudgestock.html …

Data protection laws are shining a needed light on a secretive industry | Bruce Schneier

How startups die from their addiction to paid marketing

Opinion: Europe's Strict New Privacy Rules Are Scary but Right

Announcing a new journalism entrepreneurship boot camp: Let’s “reboot the media” together

Intelligent Tracking Prevention 2.0

The alt-right has discovered an oasis for white-supremacy messages in Disqus, the online commenting system.

Teens Are Abandoning Facebook. For Real This Time.

Salesforce CEO Marc Benioff Calls for a National Privacy Law

Nudgestock 2018 notes and links

09 June 2018

Thanks for coming to my Nudgestock 2018 talk. First, as promised, some links to the signaling literature. I don't know of a full bibliography for this material, and a lot of it appears to be paywalled. A good way to get into it is to start with this widely cited paper by Phillip Nelson: Advertising as Information | Journal of Political Economy: Vol 82, No 4 and work forward.

Gardete and Bart "We find that when the sender’s motives are transparent to the receiver, communication can only be influential if the sender is not well informed about the receiver’s preferences. The sender prefers an interior level of information quality, while the receiver prefers complete privacy unless disclosure is necessary to induce communication." Tailored Cheap Talk | Stanford Graduate School of Business The Gardete and Bart paper makes sense if you ever read Computer Shopper for the ads. You want to get an idea of each manufacturer's support for each hardware standard, so that you can buy parts today that will keep their value in the parts market of the near future. You don't want an ad that targets you based on what you already have.

Kihlstrom and Riordan "A great deal of advertising appears to convey no direct credible information about product qualities. Nevertheless such advertising may indirectly signal quality if there exist market mechanisms that produce a positive relationship between product quality and advertising expenditures." Advertising as a Signal

Ambler and Hollier "High perceived advertising expense enhances an advertisement's persuasiveness significantly, but largely indirectly, by strengthening perceptions of brand quality." The Waste in Advertising Is the Part That Works | the Journal of Advertising Research

Davis, Kay, and Star "It is not so much the claims made by advertisers that are helpful but the fact that they are willing to spend extravagant amounts of money." Is advertising rational- Business Strategy Review - Wiley Online Library

New research on the effect of ad blocking on user engagement. No paywall. Ben Miroglio, David Zeber, Jofish Kaye, and Rebecca Weiss. 2018. The Effect of Ad Blocking on User Engagement with the Web. In WWW 2018: The 2018 Web Conference, April 23–27, 2018, Lyon, France. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3178876.3186162 (PDF)

Here's that simulation of unicellular advertisers that I showed on screen, and more on the norms enforcer situation, which IMHO is different from pure signaling.

For those of you who are verified on Twitter, so haven't seen what I'm talking about with the deceptive ads there, I have started collecting some: dmarti/deceptive-ads

I mentioned the alignment of interest between high-reputation brands and high-reputation publishers. More on the publisher side is in a series of guest posts for Digital Content Next, which represents large media companies that stand to benefit from reputation-based advertising: Don Marti, Author at Digital Content Next Also more from the publisher point of view in Notes and links from my talk at the Reynolds Journalism Institute.

If you're interested in the post-creepy advertising movement, here are some people to follow on Twitter.

What's next? The web advertising mess isn't a snarled-up mess of collective action problems. It's a complex set of problems that interact in a way that creates some big opportunities for the right projects. Work together to fix web ads? Let's not.

Evil stuff on the Internet and following the money

05 June 2018

Rule number one of dealing with the big Internet companies is: never complain to them about all the evil stuff they support. It's a waste of time and carpal tunnels. All of the major Internet companies have software, processes, and, most important, contract moderators, to attenuate complaints. After all, if Big Company employees came in to work and saw real user screenshots of the beheading videos, or the child abuse channel, or the ethnic cleansing memes, then that would harsh their mellow and severely interfere with their ability to, as they say in California, bro down and crush code.

Fortunately, we have better options than engaging with a process that's designed to mute a complaint. Follow the money.

Your average Internet ad does not come from some ominous all-seeing data-driven Panopticon. It's probably placed by some marketing person looking at an ad dashboard screen that's just as confusing to them as the ad placement is confusing to you.

So I'm borrowing the technique that "Spocko" started for talk radio, and Sleeping Giants scaled up for ads on extremist sites.

  • Contact a brand's marketing decision makers directly.

  • Briefly make a specific request.

  • Put your request in terms that make not granting it riskier and more time-consuming.

This should be pretty well known by now. What's new is a change in European privacy regulations. The famous European GDPR applies not just to Europeans, but to natural persons. So I'm going to test the idea that if I ask for something specific and easy to do, it will be easier for people to just do it, instead of having to figure out that (1) they have a different policy for people who they won't honor GDPR requests from and (2) they can safely assign me to the non-GDPR group and ignore me.

My simple request is not to include me in a Facebook Custom Audience. I can find the brands that are doing this by downloading ad data from Facebook, and here's a letter-making web thingy that I can use. Try it if you like. I'll follow up with how it's going.

Ron Estes, US Congress

02 June 2018

If Ron Estes, running for US Congress, was a candidate with the same name as a well-known Democratic Party politician, clearly the right-wing pranksters of the USA would give him a bunch of inbound links just for lulz, and to force the better-known politician to spend money on SEO of his own.

But he's not, so people will probably just tweet about the election and stuff.

Opting into European mode

02 June 2018

Trans Europa Express was covered on ghacks.net. This is an experimental Firefox extension that tries to get web sites to give you European-level privacy rights, even if the site classifies you as non-European.

Since the version they mentioned, I have updated it with a few new features.

Anyway, check it out. Seems to have actual users now, so I've got that going for me. But lots of secret European mode switches still remain unactivated. If you see one, please make a new issue.

Happy GDPR day. Here's some sensitive data about me.

25 May 2018

I know I haven't posted for a while, but I can't skip GDPR Day. You don't see a lot of personal info from me here on this blog. But just for once, I'm going to share something.

I'm a blood donor.

This doesn't seem like a lot of information. People sign up for blood drives all the time. But the serious privacy problem here is that when I give blood, they also test me for a lot of diseases, many of which could have a big impact on my life and how much of certain kinds of healthcare products and services I'm likely to need. The fact that I'm a blood donor might also help people infer something about my sex life but the health data is TMI already.

And I have some bad news. I recently got the ad info from my Facebook account and there it is, in the file advertisers_who_uploaded_a_contact_list_with_your_information.html. American Red Cross Blood Donors. Yes, it looks like the people I chose to trust with some of my most sensitive personal info have given it to the least trusted company on the Internet.

In today's marketing scene, the fact that my blood donor information leaked to Facebook isn't too surprising. The Red Cross clearly has some marketing people, and targeting the existing contact list on Facebook is just one of the things that marketing people do without thinking about it too much. Not thinking about privacy concerns is a problem for Marketing as a career field long-term. If everyone thinks of Marketing as the Department of Creepy Stuff it's going to be harder to recruit creative people.

So, wait a minute. Why am I concerned that Facebook has positive health info on me? Doesn't that help maintain my status in the data-driven economy? What's the downside? (Obvious joke about healthy-blood-craving Facebook board member Peter Thiel redacted—you're welcome.)

The problem is that my control over my personal data isn't just a problem for me. As Prof. Arvind Narayanan said (video), Poor privacy harms society as a whole. Can I trust Facebook to use my blood info just to target me for the Red Cross, and not to sort people by health for other purposes? Of course not. Facebook has crossed every creepy line that they have promised not to. To be fair, that's not just a Facebook thing. Tech bros do risky and mean things all the time without really thinking them through, and even when they do set appropriate defaults they half-ass the implementation and shit happens.

Will blood donor status get you better deals, or apartments, or jobs, in the future? I don't know. I do know that the Red Cross made a big point about confidentiality when they got me signed up. I'm waiting for a reply from the Red Cross privacy officer about this, and will post an update.

Anyway, happy GDPR Day, and, in case you missed it, Salesforce CEO Marc Benioff Calls for a National Privacy Law.

Can markets for intent data even be a thing?

12 May 2018

Doc Searls is optimistic that surveillance marketing is going away, but what's going to replace it? One idea that keeps coming up is the suggestion that prospective buyers should be able to sell purchase intent data to vendors directly. This seems to be appealing because it means that the Marketing department will still get to have Big Data and stuff, but I'm still trying to figure out how voluntary transactions in intent data could even be a thing.

Here's an example. It's the week before Thanksgiving, and I'm shopping for a kitchen stove. Here are two possible pieces of intent information that I could sell.

  • "I'm cutting through the store on the way to buy something else. If a stove is on sale, I might buy it, but only if it's a bargain, because who needs the hassle of handling a stove delivery the week before Thanksgiving?"

  • "My old stove is shot, and I need one right away because I have already invited people over. Shut up and take my money."

On a future intent trading platform, what's my incentive to reveal which intent is the true one?

If I'm a bargain hunter, I'm willing to sell my intent information, because it would tend to get me a lower price. But in that case, why would any store want to buy the information?

If I need the product now, I would only sell the information for a price higher than the expected difference between the price I would pay and the price a bargain hunter would pay. But if the information isn't worth more than the price difference, why would the store want to buy it?
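The urgent-buyer case can be written out as a quick sketch. The prices and the `extra_value_to_store` parameter are hypothetical, added only to show where the argument would have to break for a sale to happen.

```python
def intent_sale_possible(urgent_price, bargain_price, extra_value_to_store=0.0):
    """Return True if some payment leaves both the urgent buyer and the store better off."""
    price_gap = urgent_price - bargain_price
    # By revealing urgency, the buyer gives up the chance at the bargain price,
    # so they need to be paid more than the gap.
    buyer_min_ask = price_gap
    # The store gains at most the gap, plus any other (hypothetical) value of the data.
    store_max_bid = price_gap + extra_value_to_store
    return store_max_bid > buyer_min_ask


# Hypothetical stove prices: 900 if the store knows I'm desperate, 700 on sale.
print(intent_sale_possible(900.0, 700.0))        # False: the data is worth only the price gap
print(intent_sale_possible(900.0, 700.0, 25.0))  # True: the store values the data for some other reason
```

In this toy model a deal only works when the store gets something out of the data besides the price difference.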

So how can a market for purchase intent data happen?

Or is the idea of selling access to purchase intent only feasible if the intent data is taken from the "data subject" without permission?

Anyway, I can see how search advertising and signal-based advertising can assume a more important role as surveillance marketing becomes less important, but I'm not sure about markets for purchase intent. Maybe user data sharing will be not so much a stand-alone thing but a role for trustworthy news and cultural sites, as people choose to share data as part of commenting and survey completion, and that data, in aggregated form, becomes part of a site's audience profile.

Unlocking the hidden European mode in web ads

06 May 2018

It would make me really happy to be able to yellow-list Google web ads in Privacy Badger. (Yellow-listed domains are not blocked, but have their cookies restricted in order to cut back on cross-site tracking.) That's because a lot of news and cultural sites use DoubleClick for Publishers and other Google services to deliver legit, context-based advertising. Unfortunately, as far as I can tell, Google mixes in-context ads with crappy, spam-like, targeted stuff. What I want is something like Doc Searls style ads: Just give me ads not based on tracking me.

Until now, there has been no such setting. There could have been, if Do Not Track (DNT) had turned out to be a thing, but no. But there is some good news. Instead of one easy-to-use DNT, sites are starting to give us harder-to-find, but still usable, settings, in order to enable GDPR-compliant ads for Europe. Here's Google's: Ads personalization settings in Google’s publisher ad tags - DoubleClick for Publishers Help.

Wait a minute? Google respects DNT now?

Sort of. GDPR-compliant terms written by Google aren't exactly the same as EFF's privacy-friendly Do Not Track (DNT) Policy, but close enough. (All these different tracking policies are reminding me of open source licenses for some reason.) The catch is that as an end user, you can't just turn on Google's European mode. You have to do some JavaScript. I think I figured out how to do this in a simple browser extension to unlock secret European status.

Google doesn't appear to have their European mode activated yet, so I added a do-nothing "European mode" to the Aloodo project, for testing. I'm not able to yellow-list Google yet, but when GDPR takes effect later this month I'll test it some more.

In the meantime, I'll keep looking for other examples of hidden European mode, and see if I can figure out how to activate them.

GDPR and client-side tools

17 April 2018

Lots of GDPR advice out there. As far as I can tell it pretty much falls into three categories.

But what if there is another way?

  1. Start with the clean version. (Here's that link again: How to: GDPR, consent and data processing).

  2. Add microformats to label consent forms as consent forms, and appropriate links to the data usage policy to which the user is being asked to agree.

  3. Release a browser extension that will do the right thing with the consent forms, and submit automatically if the user is fine with the data usage request and policy, and appears to trust the site. Lots of options here, since the extension can keep track of known data usage policies and which sites the user appears to trust, based on their activity.

  4. Publish user research results from the browser extension. At this point the browsers can compete to do their own versions of step 3, in order to give their users a more trustworthy and less annoying experience.
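The decision in step 3 could look something like the sketch below. This is not code from any real extension (which would be written in JavaScript anyway). The class names, the visit-count trust rule, and the `TRUST_VISITS` threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentForm:
    site: str        # site presenting the labeled consent form
    policy_url: str  # data usage policy the form links to


@dataclass
class UserState:
    accepted_policies: set = field(default_factory=set)  # policies the user is fine with
    visit_counts: dict = field(default_factory=dict)     # site -> number of visits


TRUST_VISITS = 5  # hypothetical threshold for "appears to trust the site"


def should_auto_submit(form, user):
    """Submit without prompting only for a known policy on a site the user seems to trust."""
    policy_ok = form.policy_url in user.accepted_policies
    site_trusted = user.visit_counts.get(form.site, 0) >= TRUST_VISITS
    return policy_ok and site_trusted


user = UserState(
    accepted_policies={"https://policy.example/basic"},
    visit_counts={"news.example": 12, "random.example": 1},
)
print(should_auto_submit(ConsentForm("news.example", "https://policy.example/basic"), user))
print(should_auto_submit(ConsentForm("random.example", "https://policy.example/basic"), user))
```

Anything that fails either check would fall back to showing the form to the user as usual.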

Browsers need to differentiate in order to attract new users and keep existing users. Right now a good way to do that is in creating a safer-feeling, more trustworthy environment. The big opportunity is in seeing the overlap between that goal for the browser and the needs of brands to build reputation and the needs of high-reputation publishers to shift web advertising from a hacking game that adtech/adfraud wins now, to a reputation game where trusted sites can win.

When can deceptive sellers outbid honest sellers for ad impressions?

14 April 2018

Update 8 Jun 2018: simulation, Why digital advertising leaves people underwhelmed

Why does the Peak Advertising effect occur most in the most accurately targeted ad media? Why do people tend to filter out targeted ads, using habit power, technology, and regulation, while paying more attention to less finely targeted ad media?

One explanation is that buying ad space is an example of costly signaling. On this view, advertising is basically an exchange of signal for attention, and ads that don't pay their way with some kind of proof of spend are not worth paying attention to because they don't convey useful information about the seller's beliefs on how valuable the audience would find the product.

Another possible explanation is that targetable ad media are more suitable for deception, and that where advertisers bid for space in a medium, deceptive advertisers will tend to outbid the honest ones.

This seems counterintuitive, since we might suppose that the customer lifetime value of an honest seller's newly acquired customer could in many cases be greater than the profit from a quick score by a deceptive seller. But targeting doesn't just match ad impressions with prospective buyers. When used by a deceptive seller, it can also conceal an ad impression from potentially costly attention.

For honest direct marketers, the expected profit from reaching a buyer is positive, and the expected profit from reaching a non-buyer is zero. But the audience does not just contain buyers and non-buyers. People can also be divided into enforcers and non-enforcers. Enforcers can be anything from professional law enforcement people, to someone who takes apart a bogus product and makes a video about it, to just the writer of a bad online review. What enforcers have in common is that for a dishonest seller, the expected profit from reaching an enforcer is negative.

Some kinds of enforcer can impose costs even without buying. For example, a reader might send the publisher a screenshot containing a scam ad and get the advertiser added to an advertiser exclusion list. Other kinds of enforcer might only take action if they buy the product and find it to be a scam. A deceptive advertiser might incur costs when their ad is shown to either kind of enforcer.

For the honest advertiser, the expected profit from a single impression is:

probability of reaching a buyer × expected profit per sale

For the dishonest advertiser, the expected profit is:

probability of reaching a buyer × expected profit per sale − probability of reaching an enforcer × expected loss per enforcer

The expected loss per enforcer is typically high compared to the profit per sale. For example, a small number of contacts with review writers might require a seller to re-launch under a new name. In an ad impression market with both honest and deceptive sellers, where sellers can choose which impressions to bid on, an ad impression that a deceptive seller believes is unlikely to reach an enforcer has extra value to that deceptive advertiser but not to an honest advertiser. Deceptive sellers will tend to outbid honest ones for certain impressions.
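Here is a minimal sketch of the two formulas above. Every number is made up for illustration, including the assumption that the deceptive seller makes a higher margin per sale. None of it comes from real campaign data.

```python
def honest_value(p_buyer, profit_per_sale):
    """Expected profit per impression for an honest advertiser."""
    return p_buyer * profit_per_sale


def deceptive_value(p_buyer, profit_per_sale, p_enforcer, loss_per_enforcer):
    """Expected profit per impression for a deceptive advertiser."""
    return p_buyer * profit_per_sale - p_enforcer * loss_per_enforcer


# Hypothetical numbers, chosen only to illustrate the argument.
P_BUYER = 0.01           # chance that an impression reaches a buyer
HONEST_PROFIT = 20.0     # honest seller's expected profit per sale
DECEPTIVE_PROFIT = 30.0  # assumed higher margin on a bogus product
ENFORCER_LOSS = 1000.0   # expected loss when an enforcer sees the ad

# (label, probability that the impression reaches an enforcer)
IMPRESSIONS = [
    ("broad audience", 0.001),
    ("targeted away from enforcers", 0.00005),
]

for label, p_enforcer in IMPRESSIONS:
    honest = honest_value(P_BUYER, HONEST_PROFIT)
    deceptive = deceptive_value(P_BUYER, DECEPTIVE_PROFIT, p_enforcer, ENFORCER_LOSS)
    higher = "deceptive" if deceptive > honest else "honest"
    print(f"{label}: honest={honest:.2f} deceptive={deceptive:.2f} higher value: {higher}")
```

With these inputs the broad impression is a money-loser for the deceptive seller, but the impression that is unlikely to reach an enforcer is worth more to the deceptive seller than to the honest one.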

A member of the audience might be able to see targeting criteria, but not the advertiser's internal weighting of targeting criteria. (For example, a targeted ad platform might reveal to you that you are being targeted for an ad because your computer is running the latest release of the OS. What they won't tell you is that the seller is bidding on impressions to your OS version because they're selling a tainted nutritional supplement, and the lead testing department at the Ministry of Health is still on the old OS version.)

So, some ad impressions will tend to be purchased by deceptive sellers, but a low-information member of the audience can't tell which impressions those are. Is this an ad from an honest seller that might be reaching both me and enforcers, or is this an ad from a dishonest seller targeted to reach me but not enforcers? When you read a magazine that reaches a community of practice of which you're a member, you can be confident that product reviewers and editors are seeing the same ads you are. A web ad could be targeted to avoid experienced and better-connected members of the community of practice.

One possible explanation for the Peak Advertising effect is the interaction between deceptive sellers discovering how to use a new ad medium's targeting capabilities to avoid enforcers, and the audience discovering the fraction of deceptive sellers.

Related: Ban Targeted Advertising by David Dayen in The New Republic. (I'm not so much interested in whether or not targeted advertising should be banned as I am in the reasoning behind why people choose to protect themselves from it. The story of matching the exact right buyer to the exact right product is much less compelling for most purchase decisions than the buyer's story of finding an adequate product and avoiding deceptive sellers.)

working post-creepy ads, and stuff

13 April 2018

Post-creepy web ad sightings: What's next for web advertising after browser privacy improvements and regulatory changes make conventional adtech harder and harder?

The answer is probably something similar to what's already starting to pop up on niche sites. Here's a list of ad platforms that work more like print, less like spam: list of post-creepy web ad systems. Comments and suggestions welcome (mail me, or do a GitHub pull request from the link at the bottom.)

Fun with bug futures: we're in Mozilla's Internet Health Report. Previous items in that series:

ICYMI: Mozilla experiment aims to reduce bias in code reviews

Lots of GDPR and next-generation web ads stories in the past few weeks. A few must-read ones.

Publishers Haven't Realized Just How Big a Deal GDPR is My advice to you is rethink your approach to GDPR. This is your chance to be a part of the solution, rather than being part of the problem.

Brand Safety Is Not Driving Media Allocation Decisions in 2018/19

Mark Ritson: This is a critical point in marketers’ relationship with data privacy

What GDPR really means

A good question, from Twitter

19 March 2018

Good question on Twitter, but one that might take more than, what is it now, 280 characters? to answer.

Why do I pay attention to Internet advertising? Why not just block it and forget about it? By now, web ad revenue per user is so small that it only makes sense if you're running a platform with billions of users, so sites are busy figuring out other ways to get paid anyway.

To the generation that never had a print magazine subscription, advertising is just a subset of "creepy shit on the Internet." Who wants to do that for a living? According to Charlotte Rogers at Marketing Week, the lack of information out there explaining the diverse opportunities of a career in marketing puts the industry at a distinct disadvantage in the minds of young people. Marketing also has to contend with a perception problem among the younger generation that it is intrinsically linked with advertising, which Generation Z notoriously either distrust or dislike.

Like the man says, Where Did It All Go Wrong?

The answer is that I'm interested in Internet advertising for two reasons.

  • First, because I'm a Kurt Vonnegut fan and have worked for a magazine. Some kinds of advertising can have positive externalities. Vonnegut was able to quit his job at a car dealership, and write full time, because advertising paid for original fiction in Collier's magazine. How did advertising lose its ability to pay for news and cultural works? Can advertising reclaim that ability?

  • Second, because most of the economic role of advertising is in an area that Internet advertising hasn't been able to get a piece of. While Internet advertising plays a game of haha, look what I tricked you into clicking on for chump change, the real money is in signal-carrying advertising that helps build brand reputation. Is it possible to make Internet advertising into a medium that can get a piece of the action?

Maybe make that three reasons. As long as Internet advertising fails to pull its weight in either supporting news and cultural works or helping to send a credible economic signal for brands then the scams, malware and mental manipulation will only continue. More: World's last web advertising optimist tells all!

People's personal data: take it or ask for it?

09 March 2018

We know that advertising on the web has reached a low point of fraud, security risks, and lack of brand safety. And it's not making much money for publishers anyway. So a lot of people are talking about how to fix it, by building a new user data sharing system, in which individuals are in control of which data they choose to reveal to which companies.

Unlike today's surveillance marketing, people wouldn't be targeted for advertising based on data that someone figures out about them and that they might not choose to share.

A big win here will be that the new system would tend to lower the ROI on creepy marketing investments that have harmful side effects such as identity theft and facilitation of state-sponsored misinformation, and increase the ROI for funding ad-supported sites that people trust and choose to share personal information with.

A user-permissioned data sharing system is an excellent goal with the potential to help clean up a lot of the Internet's problems. But I have to be realistic about it. Adam Smith once wrote,

The pride of man makes him love to domineer, and nothing mortifies him so much as to be obliged to condescend to persuade his inferiors.

So the big question is still:

Why would buyers of user data choose to deal with users (or publishers who hold data with the user's permission) when they can just take the data from users, using existing surveillance marketing firms?

Some possible answers.

  • GDPR? Unfortunately, regulatory capture is still a thing even in Europe. Sometimes I wish that American privacy nerds would quit pretending that Europe is ruled by Galadriel or something.

  • brand safety problems? Maybe a little around the edges when a particularly bad video gets super viral. But platforms and adtech can easily hide brand-unsafe "dark" material from marketers, who can even spend time on Youtube and Facebook without ever developing a clue about how brand-unsafe they are for regular people. Even as news-gatherers get better at finding the worst stuff, platforms will always make hiding brand-unsafe content a high priority.

  • fraud concerns? Now we're getting somewhere. Fraud hackers are good at making realistic user data. Even "people-based" platforms mysteriously have more users in desirable geography/demography combinations than are actually there according to the census data. So, where can user-permissioned data be a fraud solution?

  • signaling? The brand equity math must be out there somewhere, but it's nowhere near as widely known as the direct response math that backs up the creepy stuff. Maybe some researcher at one of the big brand advertisers developed the math internally in the 1980s but it got shredded when the person retired. Big possible future win for the right behavioral economist at the right agency, but not in the short term.

  • improvements in client-side privacy? Another good one. Email spam filtering went from obscure nerdery to mainstream checklist feature quickly—because email services competed on it. Right now the web browser is a generic product, and browser makers need to differentiate. One promising angle is for the browser to help build a feeling of safety in the user by reducing user-perceived creepiness, and the browser's need to compete on this is aligned with the interests of trustworthy sites and with user-permissioned data sharing.

(And what's all this "we" stuff, anyway? Post-creepy advertising is an opportunity for individual publishers and brands to get out ahead, not a collective action problem.)

What I don't get about Marketing

27 February 2018

I want to try to figure out something I still don't understand about Marketing.

First, read this story by Sarah Vizard at Marketing Week: Why Google and Facebook should heed Unilever’s warnings.

All good points, right?

With the rise of fake news and revelations about how the Russians used social platforms to influence both the US election and EU referendum, the need for change is pressing, both for the platforms and for the advertisers that support them.

We know there's a brand equity crisis going on. Brand-unsafe placements are making mainstream brands increasingly indistinguishable from scams. So the story makes sense so far. But here's what I don't get.

For the call to action to work, Unilever really needs other brands to rally round but these have so far been few and far between.

Other brands? Why?

If brands are worth anything, they can at least help people tell one product apart from another.

Think Small VW ad

Saying that other brands need to participate in saving Unilever's brands from the three-ring shitshow of brand-unsafe advertising is like saying that Volkswagen really needs other brands to get into simple layouts and natural-sounding copy just because Volkswagen's agency did.

Not everybody has to make the same stuff and sell it the same way. Brands being different from each other is a good thing. (Right?)

generic food

Sometimes a problem on the Internet isn't a "let's all work together" kind of problem. Sometimes it's an opportunity for one brand to get out ahead of another.

What if every brand in a category kept on playing in the trash fire except one?

The tracker will always get through?

18 February 2018

(I work for Mozilla. None of this is secret. None of this is Mozilla policy. Not speaking for Mozilla here.)

A big objection to tracking protection is the idea that the tracker will always get through. Some people suggest that as browsers give users more ability to control how their personal information gets leaked across sites, things won't get better for users, because third-party tracking will just keep up. On this view, today's easy-to-block third-party cookies will be replaced by techniques such as passive fingerprinting where it's hard to tell if the browser is succeeding at protecting the user or not, and users will be stuck in the same place they are now, or worse.

I doubt this is the case because we're playing a more complex game than just trackers vs. users. The game has at least five sides, and some of the fastest-moving players with the best understanding of the game are the adfraud hackers. Right now adfraud is losing in some areas where they had been winning, and the resulting shift in adfraud is likely to shift the risks and rewards of tracking techniques.

Data center adfraud

Fraudbots, running in data centers, visit legit sites (with third-party ads and trackers) to pick up a realistic set of third-party cookies to make them look like high-value users. Then the bots visit dedicated fraudulent "cash out" sites (whose operators have the same third-party ads and trackers) to generate valuable ad impressions for those sites. If you wonder why so many sites made a big deal out of "pivot to video" but can't remember watching a video ad, this is why. Fraudbots are patient enough to get profiled as, say, a car buyer, and watch those big-money ads. And the money is good enough to motivate fraud hackers to make good bots, usually based on real browser code. When a fraudbot network gets caught and blocked from high-value ads, it gets recycled for lower and lower value forms of advertising. By the time you see traffic for sale on fraud boards, those bots are probably only getting past just enough third-party anti-fraud services to be worth running.

This version of adfraud has minimal impact on real users. Real users don't go to fraud sites, and fraudbots do their thing in data centers and don't touch users' systems. (Doesn't everyone do their Christmas shopping while chilling out in the cold aisle at an Amazon AWS data center? Seems legit to me.) The companies that pay for it are legit publishers, who not only have to serve pages to fraudbots (remember, a bot needs to visit enough legit sites to look like a real user) but also end up competing with adfraud for ad revenue. Adfraud has only really been a problem for legit publishers. The adtech business is fine with it, since they make more money from fraud than the fraud hackers do, and the advertisers are fine with it because fraud is priced in, so they pay the fraud-adjusted price even for real impressions.

What's new for adfraud

So what's changing? More fraudbots in data centers are getting caught, just because the adtech firms have mostly been shamed into filtering out the embarrassingly obvious traffic from IP addresses that everyone can tell probably don't have a human user on them. So where is fraud going now? More fraud is likely to move to a place where a bot can look more realistic but probably not stay up as long: your computer or mobile device. Expect adfraud concealed within web pages, as a payload for malware, and of course in lots and lots of cheesy native mobile apps. The Google Play Store has an ongoing problem with adfraud, which is content marketing gold for Check Point Software, if you like "shitty app did WHAT?" stories. Adfraud makes way more money than cryptocurrency mining, using less CPU and battery.

So the bad news is that you're going to have to reformat your uncle's computer a lot this year, because more client-side fraud is coming. Data center IPs don't get by the ad networks as well as they once did, so adfraud is getting personal. The good news is, hey, you know all that big, scary passive fingerprinting that's supposed to become the harder-to-beat replacement for the third-party cookie? Client-side fraud has to beat it in order to get paid, so they'll beat it. As a bonus, client-side bots are way better at attribution fraud (where a fraudulent ad gets credit for a real sale) than data center bots.

Users don't have to get protected from every possible tracking technique in order to shift the web advertising game from a hacking contest to a reputation contest. It often helps simply to shift the advertiser's ROI from negative-externality advertising below the ROI of positive-externality advertising.

Advertisers have two possible responses to adfraud: either try to out-hack it, or join the "flight to quality" and cut back on trying to follow big-money users to low-reputation sites in the first place. Hard-to-detect client-side bots, by making creepy fingerprinting techniques less trustworthy, tend to increase the uncertainty of the hacking option and make flight to quality relatively more attractive.

This is why we can't have nice brands.

17 February 2018

What if I told you that there was an Internet ad technology that...

  • can reach the same user on mobile and desktop

  • uses open-standard persistent identifiers for users

  • can connect users to their purchase history

  • reaches the users that the advertiser chooses, at the time the advertiser chooses

  • and doesn't depend on the Google/Facebook duopoly?

Don't go looking for it on the Lumascape.

I'm describing email spam.

Every feature that adtech is bragging on, or working toward? Email spam had it in the 1990s.

So why didn't brand advertisers jump all over spam? Why did they mostly leave it to low-reputation brands and scammers?

To be honest, it probably wasn't a decision decision in most cases, just corporate sloth. But staying away from spam was the right answer. In the email inbox, spam from a high-reputation brand doesn't look any different from spam that any fly-by-night operation can send. All spammers can do the same stuff:

They can sell to people...for a fraction of what marketing used to cost. And they can collect data on these consumers, track what they buy, what they love and hate about the experience, and market to them directly much more effectively.

Oh, wait. That one isn't about spam in the 1990s. That's about targeted advertising on social media sites today. The CEO of digital advertising's biggest trade group says most big marketers are screwed unless they completely change their business models.

It's the direct consumer relationships, and the use of consumer data, that is completely game-changing for the marketing world. And most big marketers, such as Procter & Gamble and Unilever, are not ready for this new reality, the IAB says.

But of course they're ready. The difference is that those established brand advertisers aren't any more ready than some guy who watched a YouTube video series on "growth hacking" and is ready to start buying targeted ads and drop-shipping.

The "new reality," the targeted advertising business that the IAB wants brands to join them in, is a place where you win based not on how much the audience trusts you, but on how well you can out-hack the competition. And like any information space organized by hacking skill, it's a hellscape of deceptive crap. Read The Strange Brands in Your Instagram Feed by Alexis C. Madrigal.

Some Instagram retailers are legit brands with employees and products. Others are simply middlemen for Chinese goods, built in bedrooms, and launched with no capital or inventory. All of them have been pulled into existence by the power of Instagram and Facebook ads combined with a suite of e-commerce tools based around Shopify.

Of course, not every brand that buys a social media ad or other targeted ad is crap.

But a social media ad is useless for telling crap brands from non-crap ones. It doesn't carry economic signal. There's no such thing as a free watch. (PDF)

Rory Sutherland writes, in Reducing activities to their core misses the point,

Many billions of pounds of advertising expenditure have been shifted from conventional media, most notably newspapers, and moved into digital media in a quest for targeted efficiency. If advertising simply works by the conveyance of messages, this would be a sensible thing to do. However, it is beginning to become apparent that not all, perhaps not even most, advertising works this way. It seems that a large part of advertising creates trust and conviction in its audience precisely because it is perceived to be costly.

If anyone knows that any seller can watch a few YouTube videos and do a certain activity, does that activity really help the audience distinguish a high-reputation seller from a low-reputation one?

And how does it affect a legit brand when its ads show up on the same medium with all the crappy ones? (Twitter has a solution that keeps its ads saleable: just don't show any ads to important people. I'm surprised they can get away with this, but given the mix of rip-off and real brand ads I keep seeing there, it seems to be working.)

Extremists and state-sponsored misinformation campaigns aren't "abusing" targeted advertising. They're just taking advantage of a system optimized for deception and using it normally.

Now, I don't want to blame targeted advertising for all of the problems of brand equity. When you put high-fructose corn syrup in your product, brand equity suffers. When you outsource or de-skill the customer support function, brand equity suffers. All the half-ass "looks good this quarter" stuff that established brands are doing is bad for brand equity. It just turns out that the kinds of advertising that you can do on the Internet today are all half-ass "looks good this quarter" stuff. If you want to send a credible economic signal, buy TV time or put a flagship store on some expensive real estate. The Internet's got nothing for you.

Failure to create signal-carrying ad units should be more of a concern for people who want to earn ad money on the Internet than it is. See Bob Hoffman's "refrigerator test." All that work that went into building the most complicated ad medium ever? It went into building an ad medium optimized for low-reputation advertisers. And that kind of ad medium tends to see rates go down over time. It doesn't hold value.

And the medium can't gain value until the users trust it, which means they have to trust the browser. In-browser tracking protection is going to have to enable the legit web advertising industry the same way that spam filters enable the legit email newsletter industry.

Here’s why the epidemic of malicious ads grew so much worse last year

Facebook and Google could lose $2B in ad revenue over ‘toxic content’

How I Cracked Facebook’s New Algorithm And Tortured My Friends

Wanted: Console Text Editor for Windows

Where Did All the Advertising Jobs Go?

Facebook patents tech to determine social class

The Mozilla Blog: A Perspective: Firefox Quantum’s Tracking Protection Gives Users The Right To Be Curious

Breaking up with Facebook: users confess they're spending less time

Survey: Facebook is the big tech company that people trust least

The Perils of Paid Content

EVERYONE ELSE IS DOING IT

Unilever pledges to cut ties with ‘platforms that create division’

Content recommendation services Outbrain and Taboola are no longer a guaranteed source of revenue for digital publishers

The House That Spied on Me

Why Facebook's Disclosure to the City of Seattle Doesn't Add Up

Debunking common blockchain-saving-advertising myths

SF tourist industry struggles to explain street misery to horrified visitors

How Facebook’s Political Unit Enables the Dark Art of Digital Propaganda

How Facebook Helped Ruin Cambodia's Democracy

Two visions of GDPR

13 February 2018

As far as I can tell, there are two sets of ambitious predictions about GDPR.

One is the VRM vision. Doc Searls writes, on ProjectVRM:

I am sure Google, Facebook and lesser purveyors of advertising online will find less icky ways to stay in business; but it is becoming clear that next May 25, when the GDPR goes into full effect, will be an extinction-level event for tracking-based advertising (aka adtech) as a business model.

Big impact? Not so fast. There's also a "business as usual" story, and that one, you'll find at Digital Advertising Consent.

Our complex ecosystem of companies must cooperate more closely than ever before to meet the transparency and consent requirements of European data protection law.

According to the adtech firms, well, maybe there will be more Bürokratie, more pointless dialogs that users have to click through, and one more line item, "GDPR compliance", to come out of the publisher's share, of course, but the second vision of GDPR is essentially just adtech/adfraud as usual. Upgrade to the new version of OpenRTB, and move along, nothing to see here.

Personally, I'm not buying either one of these GDPR visions. Because, just for fun and also because reasons, I run my own mail server.

And every little decision I have to make about how to configure the damn thing is based on playing a game with email spammers. Regulation is a part of my complete breakfast, but it's not the whole story.

The government doesn't give you freedom from spam. You have to take it for yourself, one filtering rule at a time. Or, do what most people do, and find a company that does it for you, but it has to be a company that you trust with your information.

A mail sender's decision to comply, or not comply, with some regulation is a bit of information. That feeds into the software that makes the final decision: inbox, spam folder, or reject. When a spam message complies with the regulations of some country, my mail server doesn't say, "Oh, wow, compliant! I can skip all the other checks and send this one straight to the inbox!" It uses the regulation compliance along with other information to make that decision.
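To make that concrete, here is a minimal sketch in Python of a score-based filter where compliance is just one input. Everything in it is an assumption for illustration: the signal names, the weights, and the thresholds are made up, and no real mail server is configured exactly this way.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # All fields are hypothetical signals, named for illustration only.
    complies_with_regulation: bool
    sender_on_blocklist: bool
    passes_sender_auth: bool
    content_spam_score: float  # 0.0 (clean) to 10.0 (spammy), made-up scale


def classify(msg: Message) -> str:
    """Combine several signals into one decision: inbox, spam folder, or reject."""
    score = msg.content_spam_score
    if msg.sender_on_blocklist:
        score += 5.0
    if not msg.passes_sender_auth:
        score += 2.0
    # Compliance is a small point in the sender's favor, not a free pass.
    if msg.complies_with_regulation:
        score -= 1.0

    if score >= 8.0:
        return "reject"
    if score >= 4.0:
        return "spam folder"
    return "inbox"
```

In this sketch a compliant message from a blocklisted sender still gets rejected, because the compliance credit is smaller than the penalty from the other checks.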

So whatever extra consent forms that surveillance marketers are required to send by GDPR? They're not the final decision on What The User Must See. They're just data, coming over the network.

Some of that data will be interpreted to mean that this request is an obvious mismatch with how the user chooses to share their info. The user might not even see those consent forms, or the browser might pop up a notification:

4 requests to do creepy shit, that's obviously against your preferences, already denied. Isn't this the best browser ever?

(No, I don't write copy for browser notifications. But you get the idea.)

Browsers that implement tracking protection might end up with a feature where they detect requests for permission to do things that the user has already said no to—by turning on tracking protection in the first place—and auto-deny them.
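A rough sketch of how that auto-deny step might work, in Python rather than real browser code. The request format, the purpose names, and both function names are hypothetical; no browser exposes this exact interface.

```python
def triage_consent_requests(requests, tracking_protection_on, denied_purposes):
    """Split incoming consent requests into (auto_denied, shown_to_user).

    `requests` is a list of dicts with a "purpose" key (a made-up format).
    `denied_purposes` is the set of purposes the user already said no to,
    for example by turning on tracking protection in the first place.
    """
    auto_denied, shown = [], []
    for req in requests:
        if tracking_protection_on and req["purpose"] in denied_purposes:
            auto_denied.append(req)
        else:
            shown.append(req)
    return auto_denied, shown


def notification(auto_denied):
    """Build the (much more polite) summary the user might see."""
    if not auto_denied:
        return ""
    return f"{len(auto_denied)} request(s) against your preferences, already denied."
```

For example, with tracking protection on and `{"cross-site-tracking"}` as the denied set, a page that sends three cross-site tracking requests and one newsletter signup request would leave the user with a single dialog to look at.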

Legit email senders had to learn "deliverability," the art and science of making legit mail look legit so that it can get past email spam filters. Legit advertisers will have to learn that users aren't identical and spherical, that users choose tools to implement their data sharing preferences, and that regulatory compliance is only part of the job.

Should web browsers adopt Google’s new selective ad blocking tech?

EVERYONE ELSE IS DOING IT

Content recommendation services Outbrain and Taboola are no longer a guaranteed source of revenue for digital publishers

Team A vs. Team B

11 February 2018

Let's run a technical challenge on the Internet. Team A vs. Team B.

Team A gets to work where they want, when they want. Team B has to work in an open-plan office, with people walking behind them, talking on the phone, doing all that annoying office stuff.

Members of Team A get paid for successful work within weeks or months. Members of Team B get a base salary that they have to spend on rent in an expensive location, but just might get paid extra for successful work in four years.

Team A will let anyone try to join, and those who aren't successful have to drop out quickly. Team B will only let members who are a "good cultural fit" join, and it takes a while to get rid of an unsuccessful member.

Team A can deploy unproven work for real-world testing, using infrastructure that they get for free on the Internet. Team B can only deploy their work when production-ready, on infrastructure they have to pay for.

If Team A breaks the rules, the penalty is that they have to spend a little money to register new domain names. If Team B breaks the rules, they risk lengthy regulatory and/or legal consequences.

Team A scores a win any time they can beat whoever is the weakest member of Team B at that time. Team B can only score a win when they can consistently defeat all of the most active members of Team A.

Team A is adfraud.

Why is so much marketing money being bet on Team B?

Fun with numbers

06 February 2018

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

Guess what? According to Emil Protalinski at VentureBeat, the browser wars are back on.

Google is doubling down on the user experience by focusing on ads and performance, an opportunity I’ve argued its competitors have completely missed.

Good point. Jonathan Mendez has some good background on that.

The IAB road blocked the W3C Do Not Track initiative in 2012 that was led by a cross functional group that most importantly included the browser makers. In hindsight this was the only real chance for the industry to solve consumer needs around data privacy and advertising technology. The IAB wanted self-regulation. In the end, DNT died as the IAB hoped.

As third-party tracking made the ad experience crappier and crappier, browser makers tried to play nice. Browser makers tried to work in the open and build consensus.

That didn't work, which shouldn't be a surprise. Imagine if email providers had decided to build consensus with spammers about spam filtering rules. The spammers would have been all like, "It replaces the principle of consumer choice with an arrogant 'Hotmail knows best' system." Any sensible email provider would ignore the spammers but listen to deliverability concerns from senders of legit opt-in newsletters. Spammers depend on sneaking around the user's intent to get their stuff through, so email providers that want to get and keep users should stay on the user's side. Fortunately for legit mail senders and recipients, that's what happened.

On the web, though, not so much.

But now Apple Safari has Intelligent Tracking Prevention. Industry consensus achieved? No way. Safari's developers put users first and, like the man said, if you're not first you're last.

And now Google is doing their own thing. Some positive parts about it, but by focusing on filtering annoying types of ad units they're closer to the Adblock Plus "Acceptable Ads" racket than to a real solution. So it's better to let Ben Williams at Adblock Plus explain that one. I still don't get how it is that so many otherwise capable people come up with "let's filter superficial annoyances and not fundamental issues" and "let's shake down legit publishers for cash" as solutions to the web advertising problem, though. Especially when $16 billion in adfraud is just sitting there. It's almost as if the Lumascape doesn't care about fraud because it's priced in so it comes out of the publisher's share anyway.

So with all the money going to fraud and the intermediaries that facilitate it, local digital news publishers are looking for money in other places and writing off ads. That's good news for the surviving web ad optimists (like me) because any time Management stops caring about something you get a big opportunity to do something transformative.

Small victories

The web advertising problem looks big, but I want to think positive about it.

  • billions of web users

  • visiting hundreds of web sites

  • with tens of third-party trackers per site.

That's trillions of opportunities for tiny victories against adfraud.
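The back-of-the-envelope arithmetic, as a quick Python check. The exact figures are assumptions: I'm taking the low end of "billions," "hundreds," and "tens."

```python
# Low-end readings of the three bullet points (assumed round numbers).
users = 10**9            # "billions" of web users: at least one billion
sites_per_user = 10**2   # "hundreds" of web sites visited
trackers_per_site = 10   # "tens" of third-party trackers per site

opportunities = users * sites_per_user * trackers_per_site
print(f"{opportunities:,}")  # 1,000,000,000,000, i.e. a trillion
```

Even at the low end that is a trillion, and any realistic mix of "billions" and "tens" only pushes it higher.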

Right now most browsers and most fraudbots are hard to tell apart. Both maintain a single "cookie jar" across trusted and untrusted sites, and both are subject to fingerprinting.

For fraudbots, cross-site trackability is a feature. A fraudbot can only produce valuable ad impressions on a fraud site if it is somehow trackable from a legit site.

For browsers, cross-site trackability is a bug, for two reasons.

  • Leaking activity from one context to another violates widely held user norms.

  • Because users enjoy ad-supported content, it is in the interest of users to reduce the fraction of ad budgets that go to fraud and intermediaries.

Browsers don't have to solve the whole web advertising problem to make a meaningful difference. As soon as a trustworthy site's real users look different enough from fraudbots, because fraudbots make themselves more trackable than users running tracking-protected browsers do, then low-reputation and fraud sites claiming to offer the same audience will have a harder and harder time trying to sell impressions to agencies that can see it's not the same people.

Of course, the browser market share numbers will still over-represent any undetected fraudbots and under-represent the "conscious chooser" users who choose to turn on extra tracking protection options. But that's an opportunity for creative ad agencies that can buy underpriced post-creepy ad impressions and stay away from overvalued or worthless bot impressions. I expect that data on who has legit users—made more accurate by including tracking protection measurements—will be proprietary to certain agencies and brands that are going after customer segments with high tracking protection adoption, at least for a while.

Now even YouTube serves ads with CPU-draining cryptocurrency miners http://arstechnica.com/information-technology/2018/01/now-even-youtube-serves-ads-with-cpu-draining-cryptocurrency-miners/ by @dangoodin001

Remarks delivered at the World Economic Forum

Improving privacy without breaking the web

Greater control with new features in your Ads Settings

PageFair’s long letter to the Article 29 Working Party

‘Never get high on your own supply’ – why social media bosses don’t use social media

Can you detect WebDriver sessions from inside a web page? https://hoosteeno.com/2018/01/23/can-you-detect-webdriver-sessions-from-inside-a-web-page/ via @wordpressdotcom

Making WebAssembly even faster: Firefox’s new streaming and tiering compiler

Newsonomics: Inside L.A.’s journalistic collapse

The State of Ad Fraud

The more Facebook examines itself, the more fault it finds

In-N-Out managers earn triple the industry average

Five loopholes in the GDPR

Why ads keep redirecting you to scammy sites and what we’re doing about it

https://digiday.com/media/local-digital-news-publishers-ignoring-display-revenue/

Website operators are in the dark about privacy violations by third-party scripts

Mark Zuckerberg's former mentor says 'parasitic' Facebook threatens our health and democracy

Craft Beer Is the Strangest, Happiest Economic Story in America

The 29 Stages Of A Twitterstorm In 2018

How Facebook Helped Ruin Cambodia's Democracy

How Facebook’s Political Unit Enables the Dark Art of Digital Propaganda

Firefox 57 delays requests to tracking domains

Direct ad buys are back in fashion as programmatic declines

‘Data arbitrage is as big a problem as media arbitrage’: Confessions of a media exec

Why publishers don’t name and shame vendors over ad fraud

News UK finds high levels of domain spoofing to the tune of $1 million a month in lost revenue • Digiday

The Finish Line in the Race to the Bottom

Something doesn’t ad up about America’s advertising market

Fraud filters don't work

Ad retargeters scramble to get consumer consent

More brand safety bullshit

20 January 2018

There's enough bullshit on the Internet already, but I'm afraid I'm going to quote some more. This time from Ilyse Liffreing at IBM.

The reality is none of us can say with certainty that anywhere in the world, we are [brand] safe. Look what just happened with YouTube. They are working on fixing it, but even Facebook and Google themselves have said there’s not much they can do about it. I mean, it’s hard. It’s not black and white. We are putting a lot of money in it, and pull back on channels where we have concerns. We’ve had good talks with the YouTube teams.

Bullshit.

One important part of this decision is black and white.

Either you give money to Nazis.

Or you don't give money to Nazis.

If Nazis are better at "programmatic" than the resting-and-vesting chill bros at the programmatic ad firms (and, face it, Nazis kick ass at programmatic), then the choice to spend ad money in a we're-kind-of-not-sure-if-this-goes-to-Nazis-or-not way is a choice that puts your brand on the wrong side of a black and white line.

There are plenty of Nazi-free places for brands to run ads. They might not be the cheapest. But I know which side of the line I buy from.

Remove all the tracking widgets? Maybe not.

16 January 2018

Good one from Mark Pilipczuk: Publisher Advice From a Buyer.

Remove all the tracking widgets from your site. That Facebook “Like” button only serves to exfiltrate your valuable data to an entity that doesn’t have your best interests at heart. If you’ve got a valuable audience, why would you want to help the ad tech industry which promises “I can find the same and bigger audience over here for $2 CPM, so don’t buy from the publisher?” Sticking your own head in the noose is never a good idea.

That advice makes sense for the Facebook "like button." That button is just a data shoplifter. The others, though? All those extra trackers come in as side effects of ad deals, and they're likely to be contractually required to make ads on the site saleable.

Yes, those trackers feed bots and data leakage, and yes, they're even terrible at fighting adfraud. Augustine Fou points out that Fraud filters don't work. "In some cases it's worse when filter is on."

So in an ideal world you would be able to pull all the third-party trackers, but as far as day-to-day operations go, user tracking is a Chesterton's Fence problem. What happens if a legit site unilaterally takes down the third-party trackers? All the targeted ad impressions that would have given that site a (small) payment end up going to bots.

So what can a site do? Understand that the real fix has to happen on the browser end, and nudge the users to either make their browsers less data-leaky, or switch to browsers that are leakage-resistant out of the box.

Start A/B testing some notifications to remind users to turn on tracking protection.

  • Can you get users who are already choosing "Do Not Track" to turn on real protection if you inform them that sites ignore their DNT choice?

  • If a user is running an ad blocker with a paid whitelisting scheme, can you inform them about it to get them to switch to a better tool, or at least add a second layer of protection that limits the damage that paid whitelisting can do?

  • When users visit privacy pages or opt out of a marketing program, are they also willing to check their browser privacy settings?

Every site's audience is different. It's hard to know in advance how users will respond to different calls to action to turn up their privacy and create a win-win for legit sites and legit brands. We do know that users are concerned and confused about web advertising, and the good news is that the JavaScript needed to collect data and administer nudges is as easy to add as yet another tracker.
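Here is a minimal sketch of the test-assignment side, written in Python for the server end rather than as in-page JavaScript. The variant names, the experiment label, the message copy, and the first-party `visitor_id` are all assumptions made up for illustration. The only real-world detail is the `DNT: 1` request header that browsers send when "Do Not Track" is on, and the sketch assumes the headers arrive in a plain dict keyed exactly as `"DNT"`.

```python
import hashlib
from typing import Optional

# Hypothetical test arms: no nudge, a DNT-specific nudge, a generic nudge.
VARIANTS = ["control", "dnt_ignored_notice", "check_privacy_settings"]


def assign_variant(visitor_id: str, experiment: str = "tp-nudge-1") -> str:
    """Deterministically bucket a first-party visitor ID into one variant."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).digest()
    return VARIANTS[int.from_bytes(digest[:4], "big") % len(VARIANTS)]


def choose_nudge(headers: dict, visitor_id: str) -> Optional[str]:
    """Return the notification text for this visitor, or None for no nudge."""
    variant = assign_variant(visitor_id)
    sends_dnt = headers.get("DNT") == "1"
    if variant == "dnt_ignored_notice" and sends_dnt:
        return "Many sites ignore Do Not Track. Want to turn on tracking protection?"
    if variant == "check_privacy_settings":
        return "Concerned about tracking? Check your browser privacy settings."
    return None
```

Hashing the experiment name together with the visitor ID keeps each visitor in the same arm on every page view, so you can compare how many people in each arm later show up with tracking protection turned on.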

More on what sites can do, that might be more effective than just removing trackers: What The Verge can do to help save web advertising

Easy question with too many wrong answers

13 January 2018

Content warning: Godwin's Law.

Here's a marketing question that should be easy.

How much of my brand's ad budget goes to Nazis?

Here's the right answer.

Zero.

And here's a guy who still seems to be having some trouble answering it: Dear Google (GOOG): Please stop using my advertising dollars to monetize hate speech.

If you're responsible for a brand and somewhere in the mysterious tubes of adtech your money is finding its way to Nazis, what is the right course of action?

One wrong answer is to write a "please help me" letter to a company that will just ignore it. That's just admitting to knowingly sending money to Nazis, which is clearly wrong.

Here's another wrong idea, from the upcoming IAB Annual Leadership Meeting session on "brand safety" (which is the nice, sanitary professional-sounding term for "trying not to sponsor Nazis, but not too hard").

Threats to brand safety arise internally and externally, in your control and out of your control—and the stakes have never been higher. Learn how to minimize brand safety risks and maximize odds of survival when your brand takes a hit (spoiler alert: overreacting is as bad as underreacting). Best Buy and Starcom share best practices based on real-world encounters with brand safety issues.

Really, people? Overreacting is as bad as underreacting? The IAB wants you to come to a deluxe conference about how it's fine to send a few bucks to Nazis here and there as long as it keeps their whole adtech/adfraud gravy train running on time.

I disagree. If Best Buy is fine with (indirectly of course) paying the occasional Nazi so that the IAB companies can keep sending them valuable eyeballs from the cheapest possible sites, then I can shop elsewhere.

Any nationalist extremist movement has its obvious supporters, who wear the outfits and get the tattoos and go march in the streets and all that stuff, and also the quiet supporters, who come up with the money and make nice with the powers that be. The supporters who can keep it deniable.

Can I, as a potential customer from the outside, tell the difference between quiet Nazi supporters and people who are just bad at online advertising and end up supporting Nazis by mistake? Of course not. Do I care? Of course not. If you're not willing to put the basic "don't pay Nazis to do Nazi stuff" rule ahead of a few ad clicks, I don't want your brand anyway. And I'll make sure to install and use the tracking protection tools that help keep my good data away from bad sites.

some more random links

31 December 2017

This one is timely, considering that an investment in "innovation" comes with a built-in short position in Bay Area real estate, and the short squeeze is on: Collaboration in 2018: Trends We’re Watching by Rowan Trollope

In 2018, we’ll see the rapid decline of “place-ism,” the discrimination against people who aren’t in a central office. Technology is making it easier not just to communicate with distant colleagues about work, but to have the personal interactions with them that are the foundation of trust, teamwork, and friendship.

Really, "place-ism" only works if you can afford to overpay the workers who are themselves overpaying for housing. And management can only afford to overpay the workers by giving in to the temptations of rent-seeking and deception. So the landlord makes the nerd pay too much, the manager has to pay the nerd too much, and you end up with, like the man said, "debts that no honest man can pay"?

File under "good examples to illustrate Betteridge's law of headlines": Now That The FCC Is Doing Away With Title II For Broadband, Will Verizon Give Back The Taxpayer Subsidies It Got Under Title II?

Open source business news: Docker, Inc is Dead. Easy to see this as a run-of-the-mill open source business failure story. But at another level, it's the story of how the existing open source incumbents used open practices to avoid having to bid against each other for an overfunded startup.

If "data is the new oil" where is the resource curse for data? Google Maps’s Moat, by Justin O’Beirne (related topic: once Google has the 3d models of buildings, they can build cool projects: Project Sunroof)

Have police departments even heard of Caller ID Spoofing or Swatting? Kansas Man Killed In ‘SWATting’ Attack

Next time I hear someone from a social site talking about how much they're doing about extremists and misinformation and such, I have to remember to ask: have you adjusted your revenue targets for political advertising down in order to reflect the bad shit you're not doing any more? How Facebook’s Political Unit Enables the Dark Art of Digital Propaganda

Or are you just encouraging the "dark social" users to hide it better?

ICYMI, great performance optimization: Firefox 57 delays requests to tracking domains

Boring: you're operating a 4500-pound death machine. Exciting: three Slack notifications and a new AR game! Yes, Smartphone Use Is Probably Behind the Spike in Driving Deaths. So Why Isn’t More Being Done to Curb It?

I love "nopoly controls entire industry so there is no point in it any more" stories: The Digital Advertising Duopoly

Good news on advertising. The Millennials are burned out on advertising—most of what they're exposed to now is just another variant of "creepy annoying shit on the Internet"—but the generation after the Millennials are going to have hella mega opportunities building the next Creative Revolution.

Another must-read for the diversity and inclusion department. 2017 Was the Year I Learned About My White Privilege by Max Boot.

Predictions for 2018

28 December 2017

Bitcoin to the moooon: The futures market is starting up, so here comes a bunch more day trader action. More important, think about all the bucket shops (I even saw an "invest in Bitcoin without owning Bitcoin" ad on public transit in London), legit financial firms, Libertarian true believers, and coins lost forever because of human error. Central bankers had better keep an eye on Bitcoin, though. Last recession we saw that printing money doesn't work as well as it used to, because it ends up in the hands of rich people who, instead of priming economic pumps with it, just drive up the prices of assets. I would predict "Entire Round of Quantitative Easing Gets Invested in Bitcoin Without Creating a Single New Job" but I'm saving that one for 2019. Central banks will need to innovate. Federal Reserve car crushers? Relieve medical debt by letting the UK operate NHS clinics at their consulates in the USA, and we trade them US green cards for visas that allow US citizens to get treated there? And—this is a brilliant quality of Bitcoin that I recognized too late—there is no bad news that could credibly hurt the value of a purely speculative asset.

The lesson for regular people here is not so much what to do with Bitcoin, but remember to keep putting some well-considered time into actions that you predict have unlikely but large and favorable outcomes. Must remember to do more of this.

High-profile Bitcoin kidnapping in the USA ends in tragedy: Kidnappers underestimate the amount of Bitcoin actually available to change hands, ask for more than the victim's family (or fans? a crowdsourced kidnapping of a celebrity is now a possibility) can raise in time. Huge news but not big enough to slow down something that the finance scene has already committed to.

Tech industry reputation problems hit open source. California Internet douchebags talk like a positive social movement but act like East Coast vampire squid—and people are finally not so much letting them define the terms of the conversation. The real Internet economy is moving to a three-class system: plutocrats, well-paid brogrammers with Aeron chairs, free snacks and good health insurance, and everyone else in the algorithmically-managed precariat. So far, people are more concerned about the big social and surveillance marketing companies, but open source has some of the same issues. Just as it was widely considered silly for people to call Facebook users "the Facebook community" in 2017, some of the "community" talk about open source will be questioned in 2018. Who's working for who, and who's vulnerable to the risks of doing work that someone else extracts the value of? College athletes are ahead of the open source scene on this one.

Adfraud becomes a significant problem for end users: Powerful botnets in data centers drove the pivot to video. Now that video adfraud is well-known, more of the fraud hackers will move to attribution fraud. This ties in to adtech consolidation, too. Google is better at beating simple to midrange fraud than the rest of the Lumascape, so the steady progress towards a two-logo Lumascape means fewer opportunities for bots in data centers.

Attribution fraud is nastier than servers-talking-to-servers fraud, since it usually depends on having fraudulent and legit client software on the same system—legit to be used for a human purchase, fraudulent to "serve the ad" that takes credit for it. Unlike botnets that can run in data centers, attribution fraud comes home with you. Yeech. Browsers and privacy tools will need to level up from blocking relatively simple Lumascape trackers to blocking cleverer, more aggressive attribution fraud scripts.

Wannabe fascists keep control of the US Congress, because your Marketing budget: "Dark" social campaigns (both ads and fake "organic" activity) are still a thing. In the USA, voter suppression and gerrymandering have been cleverly enough done that social manipulation can still make a difference, and it will.

In the long run, dark social will get filtered out by habits, technology, norms, and regulation—like junk fax and email spam before it—but we don't have a "long run" between now and November 2018. The only people who could make an impact on dark social now are the legit advertisers who don't want their brands associated with this stuff. And right now the expectations to advertise on the major social sites are stronger than anybody's ability to get an edgy, controversial "let's not SPONSOR ACTUAL F-----G NAZIS" plan through the 2018 marketing budget process.

Yes, the idea of not spending marketing money on supporting nationalist extremist forums is new and different now. What a year.

These Publishers Bought Millions Of Website Visits They Later Found Out Were Fraudulent

No boundaries for user identities: Web trackers exploit browser login managers

Best of 2017 #8: The World's Most Expensive Clown Show

My Internet Mea Culpa

2017 Was the Year I Learned About My White Privilege

With the people, not just of the people

When Will Facebook Take Hate Seriously?

Using Headless Mode in Firefox – Mozilla Hacks : the Web developer blog

Why Chuck E. Cheese’s Has a Corporate Policy About Destroying Its Mascot’s Head

Dozens of Companies Are Using Facebook to Exclude Older Workers From Job Ads

How Facebook’s Political Unit Enables the Dark Art of Digital Propaganda

Salary puzzle

24 December 2017

Short puzzle relevant to some diversity and inclusion threads that encourage people to share salary info. (I should tag this as "citation needed" because I don't remember where I heard it.)

Alice, Bob, Carlos, and Dave all want to know the average salary of the four, but none wants to reveal their individual salary. How can the four of them work together to determine the average? Answer below.

 

 

 

 

 

 

 

 

 

 

 

 

 

Answer

Alice generates a random number, adds it to her salary, and gives the sum to Bob.

Bob adds his salary and gives the sum to Carlos.

Carlos adds his salary and gives the sum to Dave.

Dave adds his salary and gives the sum to Alice.

Alice subtracts her original random number, divides by the number of participants, and announces the average. No participant had to share their real salary, but everyone now knows if they are paid above or below the average for the group.
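The steps above can be sketched in bash. This is only an illustration: everything runs in one process, the function name masked_average is made up for this example, and the first argument stands in for Alice, who holds the random mask.

```shell
#!/usr/bin/env bash
# Sketch of the masked-sum steps from the puzzle answer.
# In real life each addition happens privately, on a different person's desk.

# Usage: masked_average SALARY...   (the first salary is Alice's)
masked_average() {
    [ "$#" -gt 0 ] || return 1
    local count=$#
    # Alice picks a random mask. RANDOM yields 0..32767, so two draws are
    # combined to get a mask that is large next to a typical salary figure.
    local mask=$(( RANDOM * 32768 + RANDOM ))
    local running=$mask
    local salary
    # Each participant adds a salary to the running total and passes it on.
    # The total passed along always includes Alice's mask.
    for salary in "$@"; do
        running=$(( running + salary ))
    done
    # Back with Alice: remove the mask, divide by the head count.
    echo $(( (running - mask) / count ))
}

masked_average 50000 60000 70000 80000   # prints 65000
```

The division is integer division, which is close enough to tell whether you are above or below the group average.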

What we have, what we need

23 December 2017

Stuff the Internet needs: home fiber connections, symmetrical, flat rate, on neutral terms.

Stuff the Internet is going nuts over: cryptocurrencies.

Big problem with building fiber to the home: capital.

Big problem with cryptocurrencies: stability.

Two problems, one solution? Hard to make any kind of currency useful without something stable, with evidence-based value, to tie its value to. Fiat currencies are tied to something of value? Yes, people have to pay taxes in them. Hard to raise capital for "dumb pipe" Internet service because it's just worth about the same thing, month after month. So what if we could combine the hotness and capital-attractiveness of cryptocurrencies with the stability and actual usefulness of fiber?

quick question on tracking protection

18 December 2017

One quick question for anyone who still isn't convinced that tracking protection needs to be a high priority for web browsers in 2018. Web tracking isn't just about items from your online shopping cart following you to other sites. Users who are vulnerable to abusive practices for health or other reasons have tracking protection needs too.

Screenshot from the American Cancer Society site, showing 24 web trackers

Who has access to the data from each of the 24 third-party trackers that appear on the American Cancer Society's Find Cancer Treatment and Support page, and for what purposes can they use the data?

Forbidden words

17 December 2017

You know how the US government's Centers for Disease Control and Prevention is now forbidden from using certain words?

vulnerable
entitlement
diversity
transgender
fetus
evidence-based
science-based

(source: Washington Post)

Well, in order to help slow down the spread of political speech enforcement that is apparently stopping all of us cool innovator type people from saying the Things We Can't Say, here's a Git hook to make sure that every time you blog, you include at least one of the forbidden words.

If you blog without including one of the forbidden words, you're obviously internalizing censorship and need more freedom, which you can maybe get by getting out of California for a while. After all, a lot of people here seem to think that "innovation" is building more creepy surveillance as long as you call it "growth hacking" or writing apps to get members of the precariat to do the stuff that your Mom used to do for you.

You only have to include one forbidden word every time you commit a blog entry, not in every file. You only need forbidden words in blog entries, not in scripts or templates. You can always get around the forbidden word check with the --no-verify command-line option.
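For anyone who wants to see the shape of such a check, here is a minimal sketch. It is not the script linked from this post: the function names are made up, and it assumes the caller passes in only the blog entry files being committed.

```shell
#!/usr/bin/env bash
# Hypothetical pre-commit check, written for illustration only.

FORBIDDEN_WORDS='vulnerable|entitlement|diversity|transgender|fetus|evidence-based|science-based'

# Succeeds if at least one of the given files contains a forbidden word.
has_forbidden_word() {
    [ "$#" -gt 0 ] || return 1
    grep -E -i -q -- "$FORBIDDEN_WORDS" "$@"
}

# Succeeds if the commit may proceed: either no blog entries are being
# committed, or at least one of them contains a forbidden word.
check_entries() {
    if [ "$#" -eq 0 ]; then
        return 0    # only scripts or templates changed, nothing to check
    fi
    if has_forbidden_word "$@"; then
        return 0
    fi
    echo "No forbidden words found. Add one, or commit with --no-verify." >&2
    return 1
}
```

In .git/hooks/pre-commit you could feed it the staged entries with something like `check_entries $(git diff --cached --name-only --diff-filter=ACM -- 'entries/*.md')`, where the `entries/*.md` path is an assumption about how the blog is laid out. Note that this reads the working copies of the files, not the staged content, and breaks on file names with spaces.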

Suggestions and pull requests welcome. script on GitHub

Mindless link propagation

16 December 2017

Not much time to blog because work travel, but here is some of the stuff I would have been linking to if I were writing anything. I plan to get started again over the holiday break.

If you want just the linklog feed, it's here: linklog RSS feed

What can possibly go wrong?

Simler and Hanson on Our Hidden Motivations in Everyday Life

Universities spend millions on accessing results of publicly funded research

The “hater” is calling from inside the cap table

How our housing choices make adult friendships more difficult

Former Gawker employees are crowdfunding to relaunch a Gawker.com that’s owned by a nonprofit and funded by readers

‘Data arbitrage is as big a problem as media arbitrage’: Confessions of a media exec

The First Women in Tech Didn’t Leave—Men Pushed Them Out

“Phantom debt” schemers target millions of Americans. After thousands of phone calls, one target got his revenge.

The digital hippies want to integrate life and work – but not in a good way

The Rise of Rust in Dev/Ops

Breaking Cliques at Events

I Made My Shed the Top Rated Restaurant On TripAdvisor

Not Every Kid-Bond Matures

Are bug futures just high-tech piecework?

09 December 2017

Are bug futures just high-tech piecework, or worse, some kind of "gig economy" racket?

Just to catch up, bug futures, an experimental kind of agreement being developed by the Bugmark project, are futures contracts based on the status of bugs in a bug tracker.

For developers: visit Bugmark to find an open issue that matches your skills and interests. Buy a futures contract connected to that issue that will pay you when the issue is fixed. Work on the issue, in the open, then decide if you want to hold your contract until maturity, or sell it at a profit. Report an issue and pay to reward others to fix it

For users: Create a new issue on the project bug tracker, or select an existing one. Buy a futures contract on that issue that will cost you a known amount when the issue is fixed, or pay you to compensate you if the issue goes unfixed. Reduce your exposure to software risks by directly signaling the project participants about what issues are important to you. Invest in futures on an open source market

Bug futures also open up the possibility of incentivizing other kinds of work, such as clarifying and translating bug reports, triaging bugs, writing failing tests, or doing code reviews—and especially arbitrage of bugs from project to project.

Bug futures are different from open source bounty systems, which have been repeatedly tried but have so far failed to take off. The big problem with conventional open source bounty systems is that, as far as I can tell, they fail to incentivize cooperative work, and in a lot of situations might incentivize un-cooperative behavior. If I find a bug in a web application, and offer a bounty to fix it, the fix might require JavaScript and CSS work. A developer who fixes the JavaScript and gets stuck on the CSS might choose not to share partial work in order to contend for the entire bounty. Likewise, the developer who fixes the CSS part of the bug might get stuck on the JavaScript. Because of how bounties are structured, if the two wanted to split the bounty they would need to find, trust, and coordinate with each other. Meanwhile, if the bug was the subject of a futures contract, the JavaScript developer could write up a good commit message explaining how their partial work made progress toward a fix, and offer to sell their side of the contract. A CSS developer could take on the rest of the work by buying out that position.
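To put numbers on the JavaScript and CSS handoff, here is a toy calculation. Every figure is made up for illustration and none of it reflects Bugmark's actual contract terms: it assumes a position that pays 100 when the issue is fixed, bought cheaply while a fix looks unlikely and resold at a higher price once half the work is done.

```shell
#!/usr/bin/env bash
# All prices are hypothetical; this is not Bugmark's real pricing.

payout=100        # what the "fixed" side of the contract pays at maturity
js_buy_price=20   # JavaScript developer buys in while a fix looks unlikely
css_buy_price=60  # CSS developer buys that position after the JavaScript half lands

# Profit for a trader who enters a position at one price and exits at another.
profit() {
    local paid=$1 received=$2
    echo $(( received - paid ))
}

js_profit=$(profit "$js_buy_price" "$css_buy_price")   # sells the position on
css_profit=$(profit "$css_buy_price" "$payout")        # holds until the fix lands

echo "JavaScript developer: $js_profit"
echo "CSS developer: $css_profit"
```

Both developers come out ahead without having to find, trust, or coordinate with each other outside the market, which is the point of the comparison with bounties.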

Futures trading and risk shifts

But will bug futures tend to shift the risks of software development away from the "owners" of software (the owners don't have to be copyright holders, they could be those who benefit from network effects) and toward the workers who develop, maintain, and support it?

I don't know, but I think that the difference between bug futures and piecework is where you put the brains of the operation. In piecework and the gig economy, the matching of workers to tasks is done by management, either manually or in software. Workers can set the rate at which they work in conventional piecework, or accept and reject tasks offered to them in the gig economy, but only management can have a view of all available tasks.

Bug futures operate within a commons-based peer production environment, though. In an ideal peer production scene, all participants can see all available tasks, and select the most rewarding tasks. Somewhere in the economics literature there is probably a model of task selection in open source development, and if I knew where to find it I could put an impressive LaTeX equation right around here. Of course, open source still has all kinds of barriers that make matching of workers to tasks less than ideal, but it's a good goal to keep in mind.

If you do bug futures right, they interfere as little as possible with the peer production advantage—that it enables workers to match themselves to tasks. And the futures market adds the ability for people who are knowledgeable about the likelihood of completion of a task, usually those who can do the task, to profit from that knowledge.

Rather than paying a worker directly for performing a task, bug futures are about trading on the outcomes of tasks. When participating, you're not trading labor for money, you're trading on information you hold about the likelihood of successful completion of a task. As in conventional financial markets, information must be present on the edges, with the individual participants, in order for them to participate. If a feature is worth $1000 to me, and someone knows how to fix it in five minutes, bug futures could facilitate a trade that's profitable to both ends. If the market design is done right, then most of that value gets captured by the endpoints—the user and developer who know when to make the right trade.

The transaction costs of trading in information tend to be lower than the transaction costs of trading in labor, for a variety of reasons which you will probably believe in to different extents depending on your politics. What if we could replace some direct trading in labor with trading in the outcomes of that labor by trading information? Lower transaction costs, more gains from trade, more value created.

Bug futures series so far

three kinds of open source metrics

07 December 2017

Some random notes about open source metrics, related to work on CHAOSS, where Mozilla is a member and I'm on the Governing Board.

As far as I can tell, there are three kinds of open source metrics.

Impact metrics cover how much value the software creates. Possible good ones include count of projects dependent on this one, mentions of this project in job postings, books, papers, and conference talks, and, of course, sales of products that bundle this project.

Contributor reward metrics cover how the software is a positive experience for the people who contribute to it. Job postings are a contributor reward metric as well as an impact metric. Contributor retention metrics and positive results on contributor experience surveys are some other examples.

But impact metrics and contributor reward metrics tend to be harder to collect, or slower-moving, than other kinds of metrics, which I'll lump together as activity metrics. Activity metrics include most of the things you see on open source project dashboards, such as pull request counts, time to respond to bug reports, and many others. Other activity metrics can be the output of natural language processing on project discussions. An example of that is FOSS Heartbeat, which does sentiment analysis, but you could also do other kinds of metrics based on text.

IMHO, the most interesting questions in the open source metrics area are all about: how do you predict impact metrics and contributor reward metrics from activity metrics? Activity metrics are easy to automate, and make a nice-looking dashboard, but there are many activity metrics to choose from—so which ones should you look at?

Which activity metrics are correlated to any impact metrics?

Which activity metrics are correlated to any contributor reward metrics?

Those questions are key to deciding which of the activity metrics to pay attention to. I'm optimistic that we'll be seeing some interesting correlations soon.

Purple box claims another victim

02 December 2017

Linux Journal Ceases Publication. If you can stand it, let's have a look at the final damage.

LJ

40 trackers. Not bad, but not especially good either. That purple box of data leakage—third-party trackers that forced Linux Journal into an advertising race to the bottom against low-value and fraud sites—is not so deep as a well, nor so wide as a church door...but it's there. A magazine that was a going concern in print tried to make the move to the web and didn't survive.

Linux Journal is where I was working when I first started wondering why print ads tend to hold their value while web ads keep losing value. Unfortunately it's not enough for sites to just stop running trackers and make the purple box go away. But there are a few practical steps that Internet freedom lovers can take to stop the purple box from taking out your other favorite sites.

Asking sites to do something about surveillance marketing

18 November 2017

This might get the privacy activists mad at me, but as far as I can tell it's still counterproductive to ask a web site you visit to remove its third-party trackers.

Of course, third-party trackers are probably helping to support a political cause that most sites don't agree with, and, as Zeynep Tufekci says, "We're building a dystopia just to make people click on ads". This stuff needs to get fixed. So this is about productive next steps.

Right now, advertising on the site you're writing to probably isn't saleable without the creepy trackers. (User tracking as Chesterton's Fence) So what can privacy people productively ask sites for? Some good ones are:

  • Fix any "turn off your ad blocker" scripts to detect ad blockers only, and not falsely alert on privacy tools.

  • Remove links to the confusing and broken "YourAdChoices" site. Adtech opt-outs don't cover all trackers, and are much less effective than real privacy tools. (I have never had all the opt-outs work on that site, even from a fresh, pristine browser. Somehow I get the sense that the adtech firms don't exactly put their best people on it.)

  • Link to the privacy pages for the third parties the site uses. If the advertising on the site is set up so that this is hard to do, and users might see a tracker from an unknown domain, say so.

  • Fix up the privacy page to add links to appropriate privacy tools based on the user's browser. Better to have users on privacy tools than get enrolled in a paid whitelisting scheme.

  • If you maintain a privacy tool, offer to do a campaign with the site. Privacy tool users are high-quality human traffic. Free or discounted privacy tools might work as a subscription promotion. Where's the win-win?

Asking a site to walk away from money with no credible alternative is probably not going to work. Asking a site to consider next steps to get out of the current web advertising mess? That might.

More: What The Verge can do to help save web advertising

Time-saving tip for Firefox 57

13 November 2017

(updated 21 Nov 2017: made the words "even faster" a link to an article with graphs.)

Last time I recommended the Tracking Protection feature in Firefox 57, coming tomorrow. The fast browser is even faster when you block creepy trackers, which are basically untested combinations of third-party JavaScript.

But what about sites that mistakenly detect Tracking Protection as "an ad blocker" and give you grief about it? Do you have to turn Tracking Protection off?

So far I have found that the answer is usually no. I can usually use NJS to turn off JavaScript for that site instead. (After all, if a web developer can't tell an ad blocker from a tracking protection tool, I don't trust their JavaScript anyway.)

NJS will also deal with a lot of "growth hacking" tricks such as newsletter signup forms that appear in front of the main article. And it defaults to on, so that sites with JavaScript will work normally until I decide that they're better off without it.

Entering the Quantum Era—How Firefox got fast again and where it’s going to get faster by Lin Clark

How to turn Tracking Protection on

I'm taking a Bitcoin risk even though I don't hold Bitcoin. Please regulate me.

13 November 2017

In the country where I live, kidnapping for ransom is not a very common crime.

That's because picking up the ransom is too risky.

It's easy to kidnap someone, and easy to let the person go when the ransom is paid, but picking up the ransom exposes you. Wannabe kidnappers who are motivated by money tend to choose other crimes.

As the [family relationship redacted] of a [family member information redacted], I'm happy that kidnapping is difficult here. High transaction costs for some kinds of transaction are a good thing.

Now, here comes Bitcoin.

As we're already seeing with ransomware, harder-to-trace ransom drops are now a thing.

So, even though I don't actually hold Bitcoin, someone could grab my family member (low risk), demand that I exchange some of my conventional assets for Bitcoin (low risk) and send the Bitcoin as ransom (low risk). The balance between risk and reward for the crime of kidnapping for ransom has changed.

IMHO this is a bigger problem than any of the reasons that Charles Stross wants Bitcoin to die in a fire.

So what to do about it?

Move the risks where the profits are.

Make the Bitcoin business eat the costs of payments made under duress.

New rule: If I ever trade any assets for Bitcoin in order to comply with a threat, and then transfer the Bitcoin under duress (kidnapping, ransomware, whatever), then I can go back to whoever I gave the assets to with a copy of the police report on the incident and get my original assets (and any fees) back.

Yes, that makes it harder for regular people to trade assets for Bitcoin. Exchanges would have to hold the money for a while, check that I'm not under duress, and probably do all kinds of other pain-in-the-ass, possibly costly, work. But I'd rather have that than the alternative.

my Firefox 57 add-ons

11 November 2017

Firefox 57 is coming on Tuesday, and as you may have heard, add-ons must use the WebExtensions API. I have been running Firefox Nightly for a while, so add-on switching came for me early. Here is what I have come up with.

The basic set

Privacy Badger is not on here just because I'm using Firefox Tracking Protection. I like both.

Blogging, development and testing

  • blind-reviews. This is an experiment to help break your own habits of bias when reviewing code contributions. It hides the contributor name and email when you first see the code, and you can reveal it later. Right now it just does Bugzilla, but watch this space for an upcoming GitHub version. (more info)

  • Copy as Markdown. Not quite as full-featured as the old "Copy as HTML Link" but still a time-saver for blogging. Copy both the page title and URL, formatted as Markdown, for pasting into a blog.

  • Firefox Pioneer. Participate in Firefox user research. Studies have extremely strict and detailed privacy policies.

  • Test Pilot. Try new Firefox features. Tracking Protection was on Test Pilot for a while. Right now there is a new speech recognition one, an in-browser notepad, and more.

Advanced (for now) nerdery

  • Cookie AutoDelete. Similar to the old "Self-Destructing Cookies". Cleans up cookies after leaving a site. Useful but requires me to whitelist the sites where I want to stay logged in. More time-consuming than other privacy tools.

  • PrivacyPass. This is new. Privacy Pass interacts with supporting websites to introduce an anonymous user-authentication mechanism. In particular, Privacy Pass is suitable for cases where a user is required to complete some proof-of-work (e.g. solving an internet challenge) to authenticate to a service. Right now I don't use any sites that have it, but it could be a great way to distribute "tickets" for reading articles or leaving comments.

Note on ad blocking

If you run an ad blocker, the pre-57 add-ons check is a good time to make sure that you're not compromising your privacy by participating in a paid whitelisting scheme. As long as you have to go through your add-ons anyway, it's a great time to ditch AdBlock Plus or Adblock. They're taking advantage of users to shake down web sites.

What to use instead? For most people, either the built-in Firefox Tracking Protection or EFF's Privacy Badger will provide good protection. I would try one or both of those before a conventional ad blocker. If sites have a broken ad blocker detector that falsely identifies a tracking protection tool as an ad blocker, you can usually get around it by turning off JavaScript for that site with NJS.

If you still want to get rid of more ads and join the blocker vs. anti-blocker game (I don't), there's always uBlock Origin, which does not do paid whitelisting. (The project site has more info.) But try either the built-in tracking protection or Privacy Badger first.

New Firefox Quantum arrives November 14, 2017

Firefox Quantum 57 for developers

Welcome Planet Mozilla readers

10 November 2017

Welcome Planet Mozilla readers. (I finally figured out how to do a tagged feed for this blog, to go along with the full feed. So now you can get the items from the tagged feed on Planet Mozilla.)

The main feed has some items that aren't in the Mozilla feed.

Anyway, if you're coming to Austin, please mark your calendar now.

Two more links: I'm on Keybase and Mozillians. And @dmarti on Twitter.

World's last web advertising optimist tells all!

03 November 2017

It's getting hard to explain still taking web advertising seriously in 2017, so I had better write something down. To start with, what is web advertising exactly?

Doesn't sound good so far. Maybe I'm a fool to be the last advertising optimist on the web. (See, for example: me, running my mouth about how great advertising is, to an audience of web publishers looking to write it off and move on.)

From the point of view of users, web advertising has failed to hold up its end of the signal for attention bargain, and substituted nasty attempts at manipulation. No wonder people block it.

From the point of view of clients, web advertising has failed to meet the basic honesty standards that any third-rate print publication can. And every web advertising company is calling fraud an industry-wide problem, which is what business people say when they really don't care about fixing something.

From the point of view of publishers, web advertising has failed to show the proverbial money. It's stuck at a fraction of the value per user minute that print can pull in, which means that as print goes away, so does the ad money.

Web advertising has failed the audience, the advertisers, and the people who make ad-supported news and cultural works. Maybe I should go be a fan of something else, like securitizing bug trackers or something. Web advertising just is that annoying, creepy thing that browsers are competing to block in different, creative, ways. "[T]he online ad sector transitioned from a creative-led industry to a data and algorithms-led industry," wrote venture capitalist Adam Fisher, who is understandably proud of not investing in it.

Some new companies, such as Scroll, are all about making it easier for readers to buy out of seeing advertising. Advertising is to web sites as annoying "UNREGISTERED SHAREWARE" banners and dialogs are to computer software.

On Twitter, what does the "verified" blue checkmark get you? A ticket out of Twitter's world-classedly crappy advertising.

At least search advertising is working. Bob Hoffman calls it a "much better yellow pages." But any kind of brand-building, signal-carrying advertising, where most of the money is? Not there. Ever notice how much of the evidence for "data-driven" advertising is anecdotal?

Is anyone speaking up for web advertising? Not really. Where advertising still has a policy voice, it's a bunch of cut-and-paste anti-privacy advocacy that sounds like what you might get from eighth grade Libertarians, or from people who are so bad at math they assume that it's humanly possible to read and understand Terms of Service from 70 third-party trackers on one web page. The Interactive Advertising Bureau has become the voice of schemes that are a few pages of fine print away from malware and spam. By expanding to include members whose interests oppose those of legit publishers and advertisers, and defending every creepy user privacy violation scheme that the worst members come up with, an organization that could have been a voice for pro-advertising policy positions has made itself meaningless. Right now the IAB is about as relevant to web advertising policy as the Tetraethyl Lead Industry Association is relevant to transportation policy.

Bad news all the way around, right? But some of us have been somewhere like this before.

Remember the operating systems market in the late 1990s?

In 1998, Unix was on the way out.

All the right-thinking people were going Windows NT.

Yes, even Tim O'Reilly, who built version 1.0 of his company on Unix, had apparently written it off. The spring 1998 O’Reilly catalog had all Windows books on the cover, and the Unix stuff was in back. O’Reilly and Associates was promoting the company’s first and only shrink-wrap software, a web server for Windows NT.

And why not? Bickering Unix vendors were doing short-sighted stunts such as removing the compiler from the basic version, and charging hard-to-justify prices for workstations and servers that users could beat with a properly-configured PC. Who needed it?

We know what happened shortly after that. The Unix scene (did anyone ever make a "Lumascape"-like chart of the Unix vendors?) faded away and, with enough drama to make for good IT news coverage but not enough to interfere with successful efforts to fix the Year 2000 Problem, the Linux scene replaced it.

The good news is that people employed in the Unix scene were able to move, in most cases happily, to the Linux scene. (Which is big enough that it has become the OS for the "IoT", "SaaS" and "Cloud" businesses, and a majority of "mobile" by units, but not, of course, profits.) So maybe my experience living through the end of Unix is why I'm still a web advertising optimist. The economic niche for advertising hasn't gone away. Just as software had to get some important licensing and API decisions right in order to make the Linux boom happen, web advertising is so close to getting it right, too. Now that we know the basics...

  1. People have norms about data sharing. Browsers must reflect those norms or get replaced.

  2. People enjoy ad-supported news, cultural works, and services, and will tolerate ads that hold up their end of the bargain.

  3. People don't like to micromanage their attention and privacy, and expect companies they deal with to cover the costs of coming into compliance with norms.

...the next steps are coming together pretty quickly.

Forget iPhone X–Apple's Best Product Is Its Privacy Stance

Five Books to Make You Less Stupid About the Civil War

The Atlantic Made $0.004 From Russian Ads

Coders of the world, unite: can Silicon Valley workers curb the power of Big Tech?

Silicon Valley helped Russia sway the US election. So now what? | Emily Bell

Direct ad buys are back in fashion as programmatic declines

Why we need a 21st-century Martin Luther to challenge the church of tech

Firefox takes a bite out of the canvas ‘super cookie’

We need to think more about advertising

Three ways of re-creating Firefox Focus behavior on Firefox desktop

Need a super, super secure way to access The New York Times site? Now you can try it via a Tor Browser

Twitter urged firms to delete data during 2016 campaign

‘The art of buying crap’: The Guardian wants publishers to unite to clean up programmatic

The advertising industry has been living a lie

Consent to use personal data has no value unless one prevents all data leakage

Civil, the blockchain-based journalism marketplace, is building its first batch of publications

What Facebook Did to American Democracy

The Great Ad Tech Cleanup

How Silicon Valley’s Dirty Tricks Helped Stall Broadband Privacy in California

When the Facebook Traffic Goes Away

This new Twitter account hunts for bots that push political opinions

Publishers are struggling to monetize the ‘Trump bump’ as advertisers avoid controversial content

Med Men: where the parody lies

Always run a shell script from the directory it lives in

01 November 2017

Always run a shell script in the directory in which it appears, and change back to the directory you were in when you ran it even if it fails.

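trap popd EXIT          # go back to the starting directory on any exit
pushd "$PWD"            # remember where we were
cd "$(dirname "$0")"    # move to the directory holding this script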

Works for me in bash. The pushd command does a cd but saves the directory where you were on a stack, and popd pops the saved directory from the stack. The trap ... EXIT is a bash way to run something when the script exits, no matter how, and dirname "$0" is the directory name of the script.

(Taken from the deploy.sh script that rebuilds and deploys this blog, so if you can read this, it works.)

Fun with the spawn of Git and NoSQL

26 October 2017

Hey, kids, check out the latest progress on the Attaca version control system.

What's this? It's basically the spawn of Git and a NoSQL database. So why would anybody want to make that? For Science, of course. A lot of research produces huge data files, and people would like to have a resilient way to collaborate on them, using commands they already know—but have it scale horizontally across large numbers of nodes, NoSQL style.

Git has the advantage that a lot of people know it, but it doesn't really handle huge files that well. There are add-on solutions to make it work by connecting to another system for handling large files, but then you have to set up and trust two systems. And one of my favorite properties of Git is that any authorized user of a project can check the integrity of the entire project back to the beginning.

So what Attaca does is to consistently split huge files across a cluster, using cluster nodes that can be cheap VPSs, low-end servers with spinning disks, whatever. (In the test environment, nodes are just Linux containers.)

More: The architecture of Attaca, milestones, and current progress.

Next steps are to test it out with some scientific data (genomes, medical imaging, and so on), implement some more Git commands so that people can check files out and not just in, and build a (Raspberry Pi?) demo cluster.

See you in London

25 October 2017

Coming to Mozfest in London?

Please stop by our demo of Trading futures, fixing bugs: a live Smart Contracts installation.

What is it?

Bugmark is a market that connects people who want better software to the people who can build it.

In order to make open collaboration more effective, we are using simple market mechanisms to add incentives to do useful work.

Bugmark allows you to

  1. Put financial value directly in the hands of the people who can fix the software issues that are most important to you.

  2. Discover which issues really matter to your project's users.

  3. Work with open source practices and not against them.
    Solve part of a problem and still get paid, instead of contending to claim credit for a bounty payment.

Find an issue, fix it, and earn money

Visit Bugmark to find an open issue that matches your skills and interests. Buy a futures contract connected to that issue that will pay you when the issue is fixed. Work on the issue in the open, then decide if you want to hold your contract until maturity, or sell it at a profit.

Report an issue and pay to reward others to fix it

Create a new issue on the project bug tracker, or select an existing one. Buy a futures contract on that issue that will cost you a known amount when the issue is fixed, or pay you to compensate you if the issue goes unfixed. Reduce your exposure to software risks by directly signaling the project participants about what issues are important to you.

Invest in futures on an open source market

Development isn't the only task required to make a software project a success. You can trade futures to earn a profit from other vital tasks, such as clarifying and translating bug reports, triaging bugs, writing failing tests, or doing code reviews.

ICYMI: AdLeaks

25 October 2017

Looking for a way to get dedicated readers to un-block some of the ads on your site? One way could be to update and integrate the AdLeaks system:

Our ads contain code that encrypts an empty message with the AdLeaks public key and sends the ciphertext back to AdLeaks. This happens on all users' web browsers. A whistleblower's browser substitutes the ciphertext with encrypted parts of a disclosure. The protocol ensures that an adversary who can eavesdrop on the network communication cannot distinguish between the transmissions of regular browsers and those of whistleblowers' browsers.

More info in the paper: A Secure Submission System for Online Whistleblowing Platforms. (That link goes to the Arxiv Vanity version of the paper. Now that we can read more Science on our phones I'm expecting the rate of progress toward the Singularity to increase by quite a bit.)

Naturally sites would want to encourage whistleblowers (and others) to block the regular creepy ad trackers—but building post-creepy ads and hooking this up to them could be a way to encourage the dedicated readers to treat the high-reputation ads differently from the low-reputation ones.

Tofu, hogs, and brand-safe news

22 October 2017

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

The following is an interesting business model, so I'm going to tell it whether it's true or not. I once talked with a guy from rural China about the tofu business when he was there. Apparently, considering the price of soybeans and the price you can get for the tofu, you don't earn a profit just making and selling tofu. So why do it? Because it leaves you with a bunch of soybean waste, you feed that to pigs, and you make your real money in the hog business.

Which is sort of related to the problem that (all together now) hard news isn't brand-safe. It's hard to sell travel agency ads on a plane crash story, or real estate ads on a story about asbestos in the local elementary schools, or any kind of ads on a disturbing, but hard to look away from, political scene.

In the old-school newspaper business, the profitable ads can go in the lifestyle or travel sections, and subsidize the hard news operation. The hard news is the tofu and the brand-friendly sections are the hogs.

On the web, though, where you have a lot of readers coming in from social sites, they might be getting their brand-friendly content from somewhere else. Sites that are popular for their hard news are stuck with just the tofu.

This is one of the places where it's going to be interesting to watch the shift from unpermissioned user data collection to user data sharing by permission. As people get better control of how they share data with sites (whether that's through regulation, browsers scrambling for users, or both), how will a site's ability to deliver trustworthy hard news give it an advantage?

The browser may have to adapt to treat trustworthy and untrustworthy sites differently, in order to come up with a good balance of keeping sites working and implementing user norms on data sharing. Will news sites that publish hard news stories that are often visited, shared, and commented on, get a user data advantage that translates into ad saleability for their more brand-safe pages? Does better user data control mean getting the hog business back?

Open practices and tracking protection

19 October 2017

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

Browsers are going to have to change tracking protection defaults, just because the settings that help acquire and retain users are different from the current defaults that leave users fully trackable all the time. (Tracking protection is also an opportunity for open web players to differentiate themselves from mobile tracking devices.)

Before switching defaults, there are a bunch of opportunities to do collaboration and data collection in order to make the right choices and increase user satisfaction and trust (and retention). Interestingly enough, these tend to give an advantage to any browser that can attract a diverse, opinionated, values-driven user base.

So, as a followup on applying proposed principles for content blocking, here are some ways that a browser can prepare to make a move on tracking protection.

  • Build APIs that WebExtensions developers can use to change privacy-related behaviors. (WebExtension API for improved tracking protection, API for managing tracking protection, Implement browser.privacy.trackingProtection API). Use developer relations with the privacy tools scene.

  • Do innovation challenges and crowdsourcing for tracking protection tools. Use the results to expand the available APIs and built-in options.

  • Develop a variety of tracking protection methods, and ship them in a turned-off state so that motivated users can find the configuration and experiment with them, and to enable user research. Borrow approaches from other browsers (such as Apple Safari) where possible, and test them.

  • For example: avoid blocklist politics, and increase surveillance marketing uncertainty, by building Privacy-Badger-like tracker detection. Enable tracking protection without the policy implications of a top-down list. This is an opportunity for a crowdsourcing challenge: design better algorithms to detect trackers, and block them or scramble state.

  • Ship alternate experimental builds of the browser, with privacy settings turned on and/or add-ons pre-installed.

  • Communicate a lot about capabilities, values, and research. Spend time discussing what the browser can do if needed, and discussing the results of research on how users prefer to share their personal info.

  • Only communicate a little about future defaults. When asked about specifics, just say, "we'll let the user data help us make that decision." (Do spam filters share their filtering rules with spammers? Do search engines give their algorithms to SEO consultants?)

  • Build functionality to "learn" from the user's activity and suggest specific settings that differ from the defaults (in either direction). For example, suggest more protective settings to users who have shown an interest in privacy—especially users who have installed any add-on whose maintainers misrepresent it as a privacy tool.

  • Do research to help legit publishers and marketers learn more about adfraud and how it is enabled by the same kinds of cross-site tracking that users dislike. As marketers better understand the risk levels of different approaches to web advertising, make it a better choice to rely less on highly intrusive tracking and more on reputation-driven placements.

  • Provide documentation and tutorials to help web developers develop and test sites that will work in the presence of a variety of privacy settings. "Does it pass Privacy Badger" is a good start, but more QA tools are needed.
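To make the "Privacy-Badger-like tracker detection" item above a little more concrete, here is a minimal sketch of list-free detection: flag any third-party domain that shows up on several different first-party sites. The function name, the input format, the example domains, and the threshold of 3 are all assumptions for illustration. A real detector would also check whether the third party is actually doing something that looks like tracking, such as setting identifying cookies, and would not rely on counting appearances alone.

```shell
# find_trackers THRESHOLD
# Reads "first_party third_party" pairs on stdin (hypothetical log format)
# and prints each third party seen on at least THRESHOLD distinct sites.
find_trackers() {
  local threshold=$1
  # sort -u drops repeat visits, so each site counts once per third party.
  sort -u | awk -v t="$threshold" '
    { sites[$2]++ }
    END { for (d in sites) if (sites[d] >= t) print d }
  ' | sort
}

# Made-up observations: which third-party domain loaded on which site.
observations='news.example tracker.example
shop.example tracker.example
blog.example tracker.example
news.example cdn.example
news.example tracker.example'

flagged=$(printf '%s\n' "$observations" | find_trackers 3)
echo "$flagged"
```

Because the decision comes from what the browser observes, nobody has to maintain or argue over a central list, which is the point of the item above.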

If you do it right, you can force up the risks of future surveillance marketing just by increasing the uncertainty of future user trackability, and drive more marketing investment away from creepy projects and toward pro-web, reputation-driven projects.

Notes and links from my talk at RJI

13 October 2017

This is OFF MESSAGE. No Mozilla policy here. This is my personal blog.

(This is the text from my talk at the Reynolds Journalism Institute's Revenue Models that Work event, with some links added. Not exactly as delivered.)

Hi. I may be the token advertising optimist here.

Before we write off advertising, I just want to try to figure out the answer to: why can't Internet publishers make advertising work as well as publishers used to be able to make it work when they were breathing fumes from molten lead all day? Has the Internet really made something that much worse?

I have bought online advertising, written and edited for ad-supported sites, had root access to some of the servers of an adtech firm that you probably have cookies from right now, and have written an ad blocker. Now I work for Mozilla. I don't have any special knowledge on what exactly Mozilla intends to do about third-party cookies, or fingerprinting, or ad blocking, but I can share some of what I have learned about users' values, and some facts about the browser business that will inform those decisions for Mozilla and other browsers.

First of all, I want to cover how new privacy tools are breaking web advertising as we know it. But that's fine. People don't like web advertising as we know it.

So what don't they like?

A 2009 study at the University of Pennsylvania came up with the result that "most adult Americans do not want advertisers to tailor advertisements to their interests."

When the researchers explained how ad targeting works, the percentage went up.

We have known for quite a while that people have norms about how they share their personal information.

Pagefair study

That Pennsylvania study isn't the only one. Just recently a company called Pagefair did a survey on when people would choose to share their info on the web.

Research result: what percentage will consent to tracking for advertising? | PageFair

They surveyed 300 publishers, adtech people, brands, and various others, on whether users will consent to tracking under the GDPR and the ePrivacy Regulation.

Some examples:

The survey asked if users would allow for tracking on one site only, and for one brand only, in addition to “analytics partners”. 79% of respondents said they would click “No” to this limited consent request.

And what kind of tracking policy would people prefer in the browser by default? The European Parliament suggested that “Accept only first party tracking” should be the default. But only 20% of respondents said they would select this. Only 5% were willing to “accept all tracking”. 56% said they would select “Reject tracking unless strictly necessary for services I request”. The very large majority (81%) of respondents said they would not consent to having their behaviour tracked by companies other than the website they are visiting.

Users say that they really don't like being tracked. So, right about now is where you should be pointing out that what people say about what they want is often different from what they do.

It's hard to see exactly what people do about particular ads, but we can see some indirect evidence that what people do about creepy ads is consistent with what they say about privacy.

  • First, ad blockers didn't catch on until people started to see retargeting.

  • Second, companies indirectly reveal their user research in policies and design decisions.

Back in 1998, when Google was still "google.stanford.edu" I wrote an ad blocker. And there were a bunch of other pretty good ones in the late 1990s, too. WebWasher, AdSubtract, Internet Junkbuster. But none of that stuff caught on. That was back when most people were on dialup, and downloading a 468x60 banner ad was a big deal. That's before browsers came with pop-up blockers, so a pop-up was a whole new browser window and those could get annoying real fast.

But users didn't really get into ad blocking. What changed between then and now? Retargeting. People could see that the ad on one site had "followed them" from a previous site. That creeped them out.

Some Facebook research clearly led in the same direction.

As we should all know by now, Facebook enables an extremely fine level of micro-targeting.

Yes, you can target 15 people in Florida.

But how do the users feel about this?

We can't see Facebook's research. But we can see the result of it, in Facebook Advertising Policies. If you buy an ad on Facebook, you can target people based on all kinds of personal info, but you can't reveal that you did it.

Ads must not contain content that asserts or implies personal attributes. This includes direct or indirect assertions or implications about a person’s race, ethnic origin, religion, beliefs, age, sexual orientation or practices, gender identity, disability, medical condition (including physical or mental health), financial status, membership in a trade union, criminal record, or name.

So you can say "meet singles near you" but you can't say "other singles". You can offer depression counseling in an ad, but you can't say "treat your depression."

Facebook is constantly researching and tweaking their site, and, of course, trying to sell ads. If personalized targeting didn't creep people the hell out, then the ad policy wouldn't make you hide that you were doing it.

Mozilla

All right, so users don't want to be followed around.

Where does Mozilla come in?

Well, Mozilla is supposed to be all about data privacy for the user. We have these Data Privacy Principles:

  1. No surprises: Use and share information in a way that is transparent and benefits the user.

  2. User control: Develop products and advocate for best practices that put users in control of their data and online experiences.

  3. Limited data: Collect what we need, de-identify where we can and delete when no longer necessary.

  4. Sensible settings: Design for a thoughtful balance of safety and user experience.

  5. Defense in depth: Maintain multi-layered security controls and practices, many of which are publicly verifiable.

If you want a look at what Mozilla management is thinking about the tracking protection slash ad blocking problem, there's always Proposed Principles for Content Blocking by Denelle Dixon.

  • Content Neutrality: Content blocking software should focus on addressing potential user needs (such as on performance, security, and privacy) instead of blocking specific types of content (such as advertising).

  • Transparency & Control: The content blocking software should provide users with transparency and meaningful controls over the needs it is attempting to address.

  • Openness: Blocking should maintain a level playing field and should block under the same principles regardless of source of the content. Publishers and other content providers should be given ways to participate in an open Web ecosystem, instead of being placed in a permanent penalty box that closes off the Web to their products and services.

If we have all those great values though, why aren't we doing more to protect users from tracking?

Here's the problem from the browser point of view.

Firefox had a tracking protection feature in 2015.

Firefox had a proposed "Cookie Clearinghouse" that was going to happen with Stanford, back in 2013. Firefox developers were talking about third-party cookie blocking then, too.

Microsoft beat Mozilla to it. Microsoft Internet Explorer released Tracking Protection Lists in version 9, in 2011.

But the mainstream browsers have always been held back by two things.

First, browser developers have been cautious about not breaking sites. We know that users prefer not to be tracked from site to site, but we know that they get really mad when a site that used to work just stops working. There is a lot of code in a lot of browsers to handle stuff that no self-respecting web designer has done for decades. Remember the 1996 movie "Space Jam"? Check out the web site some time. It's a point of pride to keep all that 1996 web design working. And seriously, one of those old 1996 vintage pages might be the web-based control panel for somebody's emergency generator, or something. Yes, browsers consider the users' values on tracking, but priority one is not breaking stuff.

And that includes third-party resources that are not creepy ad trackers—stuff like shopping carts and comment forms and who knows what.

Besides not breaking sites, the other thing that keeps browsers from implementing users' values on tracking is that we know people like free stuff. For a long time, browsers didn't have enough good data, so they deferred to the adtech business when it talked about how sites make money. It looks obvious, right? Sites that release free stuff make money from ads, ads work a certain way, so if you interfere with how the ads work, then sites make less money, and users don't get the free stuff.

Mozilla backed down on third-party cookies in 2013, and again on tracking protection in 2015.

Microsoft backed down on Tracking Protection Lists.

Both times, after the adtech industry made a big fuss about it.

So what changed? Why is now different?

Well, that's an easy answer, right? Apple put Intelligent Tracking Prevention into their Safari browser, and now everybody else has to catch up.

Apple so far is putting their users ahead of the usual alarmed letters from the adtech people. Steven Sinofsky, former president of the Windows Division at Microsoft, tweeted,

But that's not all of it.

You're going to see other browsers make moves that look like they're "following Safari" but really, browsers are not so much following each other as making similar decisions based on similar information.

When users share their values they say that they want control over their information.

When users see advertising that seems "creepy" we can see them take steps to avoid ads following them around.

Some people say, well, if users really want privacy, why don't they pay for privacy products? That's not how humans work. Users don't pay for privacy, because we don't pay other people to come into compliance with basic social norms. We don't pay our loud neighbors to quiet down.

Apple does lots of user research. I believe they're responding to what their users say.

Apple looks like a magic company that releases magic things that they make up out of their own heads. "Designed by Apple in California." This is a great show. It's part of their brand. I have a lot of respect for their ability to make things look simple.

But that doesn't mean that they just make stuff up.

Apple does a lot of user research. Every so often we get a little peek behind the curtain when there is discovery in a lawsuit. They do research on their own users, on Samsung's users, everybody.

Mozilla has user research, too.

For a long time, browser people thought that there was a conflict between giving the users something that complies with their tracking norms and giving them something that keeps them happy with the sites they want to use.

But now it turns out that we have some ways that we could act in accordance with user values that also produce measurably more satisfied users.

How badly does privacy protection break sites?

Mozilla's testing team has built, deployed to users, and tested nine different sets of cookie and tracking protection policies.

Lots of people thought the only choices would be settings that break sites and protect users, or settings that leave sites working and leave users vulnerable.

It turns out that there is a configuration that gives both better values alignment and less breakage.

Because a lot of that breakage is caused by third-party JavaScript.

We're learning that in a few important areas, even though Apple Safari is in the lead, Apple's Intelligent Tracking Prevention doesn't go far enough.

What users want

It turns out that when you do research with people who are not current users of ad blockers, and offer them choices of features, the popular choices are tracking blockers, malvertising protection, and blocking annoying ads such as auto-play videos. Among those users who aren't already using an ad blocker, the offer of an ad blocker wasn't as popular.

Yes, people want to see fewer annoying ads. And nobody likes malware. But people are also interested in protection from tracking. Some users even put tracking protection ahead of malvertising protection.

If you only ask about annoying ad formats you get a list of which ad formats are popular now but get on people's nerves. This is where Google is now. I have no doubt that they'll catch up. Everyone who’s ever moderated a comment section knows what the terrible ads are. And any publisher has the motivation to moderate and impose standards on the ads on their site. Finding which ads are the crappy ones is not the problem. The problem is that legit sites and crappy sites are in the same ad space market, competing for the same eyeballs. As a legit site, you have less market power to turn down an ad that does not meet your policies.

We are coming to an understanding of where users stand. In a lot of ways we're repeating the early development of spam filters, but in slow motion.

Today, a spam filter seems like a must-have feature for any email service. But MSN started talking about its spam filtering back when Sanford Wallace, the “Spam King,” was saying stuff like this.

I have to admit that some people hate me, but I have to tell you something about hate. If sending an electronic advertisement through email warrants hate, then my answer to those people is “Get a life. Don’t hate somebody for sending an advertisement through email.” There are people out there that also like us.

According to spammers, spam filtering was just Internet nerds complaining about something that regular users actually like. But the spam debate ended when big online services, starting with MSN, started talking about how they build for their real users instead of for Wallace’s hypothetical spam-loving users.

If you missed the email spam debate, don’t worry. Wallace’s talking points about spam filters constantly get recycled by the IAB and the DMA, every time a browser makes a move toward tracking protection. But now it’s not email spam that users supposedly crave. Today, they tell us that users really want those ads that follow them around.

So here's the problem. Users are clear about their values and preferences. Browsers must reflect user values and preferences. Browsers have enough of a critical mass of users demanding better protection from tracking that browsers are going to have to move or become irrelevant.

That's what the email providers did on spam. There were not enough pro-spam users to support an email service without a spam filter.

And there may not be enough pro-targeting users to support a browser without privacy tools.

As I said, I do not know exactly how Mozilla is going to handle this, but every browser is going to have to.

But I can make one safe prediction.

Browsers need users. Users prefer tracking protection. I'm going to make a really stupid, safe prediction here.

User adoption of tracking protection will not affect the amount of user data available, or affect any measurement of number of targeted ad impressions available in any way.

Every missing trackable user will be replaced by an adfraud bot.

Every missing piece of user data will be replaced by an "inferred" piece of data.

How much adfraud is there really?

There are people who will stand up and say that we have 2 percent fraud, or 85 percent. Of course it's different from campaign to campaign and some advertisers get burned worse than others.

You can see "IAS safe traffic" on fraud boards. Because video views are worth so much more, the smartest bots go there. We do know that when you look for adfraud seriously, you can find it. Just recently the Financial Times found a bunch.

The publisher has found display ads against inventory masquerading as FT.com on 10 separate ad exchanges and video ads on 15 exchanges, even though the FT doesn’t even sell video ads programmatically, with 300 accounts selling inventory purporting to be the FT’s. The scale of the fraud uncovered is vast — the equivalent of one month’s supply of bona fide FT.com video inventory was fraudulently appearing in a single day.

The FT warns advertisers after discovering high levels of domain spoofing

If you were trying to build an advertising business to facilitate fraud, you could not do much better than the current system.

That's because the current web advertising system is based on tracking users from high-value sites to low-value sites. Walt Mossberg recounts a dinner conversation with an advertiser:

[W]e were seated next to the head of this advertising company, who said to me something like, "Well, I really always liked AllThingsD and in your first week I think Recode’s produced some really interesting stuff." And I said, "Great, so you’re going to advertise there, right? Or place ads there." And he said, "Well, let me just tell you the truth. We’re going to place ads there for a little bit, we’re going to drop cookies, we’re going to figure out who your readers are, we’re going to find out what other websites they go to that are way cheaper than your website and then we’re gonna pull our ads from your website and move them there."

The current web advertising system is based on paying publishers less and charging brands more. Revenue share for legit publishers is at 30 to 40 percent according to the Association of National Advertisers. But all revenue split numbers are wrong because undetected fraud ends up in the ‘publisher’ share.

When your model is based on data leakage, on catching valuable eyeballs on cheap sites, the inevitable overspray is fraud.

People aren't even paying attention to what could be the biggest form of adfraud.

Part of the conventional wisdom on adfraud is that you can beat it by tracking users all the way to a sale, and filter the bots out that way. After all, if they made a bot good enough to actually buy stuff it wouldn't be a problem for the client.

But the attribution models that connect impressions to sales are, well, they're hard enough to understand that most of the people who understand them are probably fraud hackers.

The dispute between Steelhouse and Criteo settled last year, so we didn't get to see how two real adtech companies might or might not have been hacking each other's attribution numbers.

But today we have another chance.

I used to work for Linux Journal, and we followed the SCO case pretty intently. There was even a dedicated news site just about the case, called Groklaw. If there's a case that needs a Groklaw for web advertising, it's Uber v. Fetch.

Unwanted ads on Breitbart lead to massive click fraud revelations, Uber claims | Ars Technica

This is the closest we have to a tool to help us understand attribution fraud. When the bad guys have the ability to make bogus ads claim credit for real sales, that's a much more powerful motivation for fraud than just making a bot that looks like a real user watching a video.

Legit publishers have a real incentive to find and control adfraud. Adtech intermediaries, not so much. That's because the core value of ad tech is to find the big money user at the cheapest possible site. If you create that kind of industry, you create the incentive for fraud bots who appear to be members of a valuable audience. You create incentives to produce fraudulent sites because all of a sudden, those kinds of sites have market value that they would not otherwise have had because of data leakage.

As browsers and sites implement user norms on tracking, they get fraud protection for free.

So where is the outrage on adfraud?

I thought I could write a script for a heist movie about adfraud.

At first I thought, this is awesome! Computer hacking, big corporations losing billions of dollars—should be a formula for an awesome heist movie, right?

Every heist movie has a bunch of scenes that introduce the characters, you know, getting the crew together. Forget it. All the parts of adfraud can be done independently and connected on the free market. It's all on a bunch of dumb-looking PHP web boards. There go a whole bunch of great scenes.

Hard-boiled detectives trying to catch the gang? More like over easy. The adtech industry "committed $1.5 million in funding" (and set up a 24-member committee!) to fight an eleven billion dollar problem. Adfraud isn't taking candy from a baby, it's taking candy from a dude whose job is giving away candy. More fraud means more money for adtech intermediaries.

Dramatic risk of getting caught? Not a chance of going to prison—the worst that happens is that some of the characters get their accounts or domains banned, and they have to make new ones. The adfraud movie's production designer is going to have to work awful hard to make that "Access denied" screen look cool enough to keep the audience awake.

So the movie idea is a no-go, but as people learn that today's web ads don't just leave the publisher with 30 percent but also feed fraud, we should see a flight to quality effect.

The technical decisions that enabled the Lumascape to rip off Walt Mossberg are the same decisions that facilitate fraud, are the same decisions that make users come looking for tracking protection.

I said I was an advertising optimist and here's why.

The tracking protection trend is splitting web advertising.

We have the existing high-tracking, high-fraud market and a new low-tracking opportunity.

Some users are getting better protected from cross-site tracking.

The bad news is that it will be harder to serve those users a lucrative ad enabled by third-party tracking data.

The good news is that those users can't be tracked from high-value to low-value sites. Those users start to become possible to tell apart from fraudbots.

For that subset of users, web advertising starts to shift from a hacking game to a reputation game.

In order to sell advertising you need to give the advertiser some credible information on who the audience is. Most browsers have been bad at protecting personal information about the user, so web advertising has become a game where a whole bunch of companies compete to covertly capture as much user info as they can.

But some browsers are getting better at implementing people’s preferences about sharing their information. The result, for those users, is a change in the rules of the game. Investment in taking people’s personal info is becoming less rewarding, as browsers compete to reflect people’s preferences.

And investments in building sites and brands that are trustworthy enough for people to want to share their information will tend to become more rewarding. This shift naturally leads to complaints from people who are used to winning the old game, but will probably be better for customers who want to use trustworthy brands and for people who want to earn money by making ad-supported news and cultural works.

There are people building a new web advertising system around user-permissioned information, and they've been doing it for a long time. But until now, nobody has really wanted to deal with them, because adtech has just been selling that information, taken from the user without permission. Tracking protection will be the motivation for forward-thinking brand people to catch the flight to quality and shift web ad spending from the hacking game to the reputation game.

Now that we have a better understanding of how user norms are aligned with the interests of independent browsers and with the interests of high-reputation sites, what's next?

Measure the tracking-protected audience

Legit sites are in a strong position to gather some important data that will shift web ads from a hacking game to a reputation game. Let's measure the tracking-protected audience.

Tracking protection is a powerful sign of a human audience. A legit site can report a tracking protection percentage for its audience, and any adtech intermediary who claims to offer advertisers the same audience, but delivers a suspiciously low tracking protection number, is clearly pushing a mismatched or bot-heavy audience and is going to have a harder time getting away with it.
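The comparison described above can be sketched in a few lines. This is only an illustration: the function names, the 10-point tolerance, and the visit counts are assumptions made up for the example, not part of any measurement tool or real data.

```python
def protected_share(protected_visits, total_visits):
    """Fraction of measured visits that showed tracking protection."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return protected_visits / total_visits


def looks_mismatched(site_share, intermediary_share, tolerance=0.10):
    """Flag an intermediary whose protected share falls well below the site's.

    `tolerance` is a hypothetical allowance for ordinary measurement noise.
    """
    return (site_share - intermediary_share) > tolerance


# Made-up numbers: the site measures 400 protected visits out of 1000,
# while the "same audience" bought through an intermediary shows 20 of 1000.
site = protected_share(400, 1000)
resold = protected_share(20, 1000)
print(looks_mismatched(site, resold))  # True
```

A gap that large does not prove fraud on its own, but it gives the advertiser a specific number to ask the intermediary about.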

Showing prospective advertisers your tracking protection data lets you reveal the tarnish on the adtech "Holy Grail"—the promise of high-value eyeballs on crappy sites.

Here is some JavaScript to make that measurement in a reliable way that detects all the major tracking protection tools.

You can't sell advertising without data on who the audience is. Much of that data will have to come from the tracking-protected audience. When quality sites share tracking protection data with advertisers, that helps expose the adfraud that intermediaries have no incentive to track down.

This is an opportunity for service journalism.

Users are already concerned and confused about web ads. That's an opportunity that some legit sites such as the Wall Street Journal and The New York Times are already taking advantage of. The more that someone learns about how web advertising works, the more that he or she is motivated to get protected.

But if you don't talk to your readers about tracking protection, who will?

A lot of people are getting caught up today in publisher-hostile schemes such as adblockers with paid whitelisting, or adblockers that come with malware or adware.

If you don't recommend a publisher-friendly protection tool or setting, they'll get a bad one from somewhere else.

I really like ads.

At the airport on the way here I saw that they just came out with a hardcover collection of the complete Kurt Vonnegut stories. A lot of those stories were paid for by Collier’s ads run in the 1950s, and we're still getting the positive externalities from that advertising today.

Advertising done right can be a growth spiral of economic growth, reputation building, and creation of cultural works. It’s one of the most powerful forces to produce news, entertainment goods, fiction. Let's fix it.

Evancoin and the stake problem

09 October 2017

One of the problems with a bug futures market is: where do you get the initial investment, or "stake", for a developer who plans to take on a high-value task?

In order to buy the FIXED side of a contract and make a profit when it matures, the developer needs to invest some cryptocurrency. In a bug futures market, it takes money to make money.

One possible solution is to use personal tokens, such as the new Evancoin. Evancoin is backed by hours of work performed by an individual (yes, his name is Evan).

If I believe that n hours of work from Evan are likely to increase the probability of a Bugmark-traded bug getting fixed, and my expected gain is greater than n * (current price of Evancoin), then I can

  1. buy the FIXED side of the Bugmark contract

  2. redeem n Evancoin for work from Evan on the bug

  3. sell my Bugmark position at a profit, or wait for it to mature.
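The decision rule above can be written out as a small calculation. This is a sketch under stated assumptions: it pretends that each unit of FIXED pays 1.0 if the bug is fixed by the closing date and nothing otherwise, and the function names and all numbers are hypothetical, not taken from Bugmark or Evancoin.

```python
def expected_gain(units, price_paid, p_fixed_after_work, payout=1.0):
    """Expected profit on a FIXED position held to maturity.

    Assumes (hypothetically) that each unit pays `payout` if the bug is
    fixed by the closing date and nothing otherwise.
    """
    return units * (p_fixed_after_work * payout - price_paid)


def worth_redeeming(units, price_paid, p_fixed_after_work, hours, evancoin_price):
    """True when the expected gain exceeds the cost of n hours of Evan's work."""
    cost_of_work = hours * evancoin_price
    return expected_gain(units, price_paid, p_fixed_after_work) > cost_of_work


# Illustrative numbers only: 1000 units of FIXED bought at 0.25 each,
# 10 hours of work expected to raise the chance of a fix to 0.75,
# and Evancoin trading at 20 per hour.
print(worth_redeeming(1000, 0.25, 0.75, hours=10, evancoin_price=20.0))  # True
```

With these numbers the expected gain is 500 against 200 spent on Evancoin, so the trade clears the bar. At 30 hours the work would cost 600 and the same position would not be worth it.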

Evan is not required to accept cryptocurrency exchange rate risk, and does not have to provide the "stake" himself. It's the opposite—he has already sold the Evancoin on an exchange. Of course, he has an incentive to make as much progress on the bug as possible, in order to support the future price of Evancoin.

If Evan is working on the bug I selected, he would also know that he's doing work that is likely to move the price of the Bugmark contract. So he can use some of the proceeds from his Evancoin sale to buy additional FIXED on Bugmark, and take a profit when I do.

Evan's skills will tend to improve, and my understanding of which tasks would be a profitable use of Evan's time will tend to increase the more Evancoin I redeem. So the value of Evancoin to me is likely to continue rising. Therefore I am probably going to do best if I accumulate Evancoin in advance of identifying good bugs for Evan to work on.

Introducing FilterBubbler

06 October 2017

(this originally appeared on the Mozilla Open Innovation Medium channel)

Brainfood and Mozilla’s Open Innovation Team Kick Off Text Classification Open Source Experiment

Mozilla’s Open Innovation team is beginning a new effort to understand more about motivations and rewards for open source collaboration. Our goal is to expand the number of people for whom open source collaboration is a rewarding activity.

An interesting question is: While the server side benefits from opportunities to work collaboratively, can we explore them further on the client side, beyond browser features and their add-on ecosystems? User interest in “filter bubbles” gives us an opportunity to find out. The new FilterBubbler project provides a platform that helps users experiment with and explore what kind of text they’re seeing on the web. FilterBubbler lets you collaboratively “tag” pages with descriptive labels and then analyze any page you visit to see how similar it is to pages you have already classified.

You could classify content by age or reading-level rating, category like “current events” or “fishing”, or even how much you trust the source like “trustworthy” or “urban legend”. The system doesn’t have any bias and it doesn’t limit the number of tags you apply. Once you build up a set of classifications you can visit any page and the system will show you which classification has the closest statistical match. Just as a web site maintainer develops a general view of the technologies and communities of practice required to make a web site, we will use filter bubble building and sharing to help build client-side understanding.

The project aims to reach users who are motivated to understand and maybe change their information environment: people who want to transform their own “bubble” space and participate in collaborative work, but do not have add-on development skills.

Can the browser help users develop better understanding and control of their media environments? Can we emulate the path to contribution that server-side web development has? Please visit the project and help us find out. FilterBubbler can serve as a jumping off point for all kinds of specific applications that can be built on top of its techniques. Ratings systems, content suggestion, fact checking and many other areas of interest can all use the classifiers and corpora that the FilterBubbler users will be able to generate. We’ll measure our success by looking at user participation in filter bubble data sharing, and by how our work gets adapted and built on by other software projects.

Please find more information on the project, ways to engage and contact points on http://www.filterbubbler.org.

The capital dynamics are all wrong.

01 October 2017

Ben Werdmuller, in Why open source software isn’t as ethical as you think it is:

When you release open source software, you have this egalitarian idea that you’re making it available to people who can really use it, who can then built on it to make amazing things....While this is a fine position to take, consider who has the most resources to build on top of a project that requires development. With most licenses, you’re issuing a free pass to corporations and other wealthy organizations, while providing no resources to those needy users. OpenSSL, which every major internet company depends on, was until recently receiving just $2,000 a year in donations, with the principal author in financial difficulty.

This is a good example of one of the really interesting problems of working in an immature industry. We have a similar problem in web advertising. We're over-rewarding the ability to collect numbers that show the effectiveness of a marketing project, while under-rewarding the ability to build brand reputation. Web ads also have an opportunity to fix incentives. We don't have our incentives hooked up right yet.

  • Why does open source have some bugs that stay open longer than careers do?

  • Why do people have the "I've been coding to create lots of value for big companies for years and I'm still broke" problem?

  • How does millions of dollars of shared vigilance even make the news, when the value extracted is in the billions?

  • Why is the meritocracy of open source even more biased than other technical and collaborative fields? (Are we at the bottom of the standings?) Why are we walking away from that many potential contributors?

Quinn Norton: Software is a Long Con:

It is to the benefit of software companies and programmers to claim that software as we know it is the state of nature. They can do stupid things, things we know will result in software vulnerabilities, and they suffer no consequences because people don’t know that software could be well-written. Often this ignorance includes developers themselves. We’ve also been conditioned to believe that software rots as fast as fruit. That if we waited for something, and paid more, it would still stop working in six months and we’d have to buy something new. The cruel irony of this is that despite being pushed to run out and buy the latest piece of software and the latest hardware to run it, our infrastructure is often running on horribly configured systems with crap code that can’t or won’t ever be updated or made secure.

We have two possible futures.

  • People finally get tired of software's boyish antics (make that lethal irresponsibility), and impose a regulatory regime. Rent-seekers rejoice. Software innovation as we know it ceases, and we get something like the pre-breakup Bell System: you have to be an insider to build and deploy anything that reaches real people.

  • The software scene outgrows the "disclaimer of implied warranty" level of quality, on its own.

How do we get to the second one? One approach is to use market mechanisms to help quantify software risk, then enable users with a preference for high quality and developers with a preference for high quality to interact directly, not through the filter of software companies that win by releasing early at a low quality level.

There is an opportunity here for the kinds of companies that are now doing open source license analysis. Right now they're analyzing relatively few files in a project: the licenses and copyrights. A tool will go through your software stack, and hooray, you don't have anything that depends on something with an inconsistent license, or on a license that would look bad to the people you want to sell your company to.

What if that same tool would give you a better quality number for your stack, based on walking your dependency tree and looking for weak points based on market activity?

Why blockchain?

One important reason is that black or gray hat security researchers are likely to have extreme confidentiality requirements, especially when trading on knowledge from a co-conspirator who may not be aware of the trade. (A possible positive externality win from bug futures markets is the potential to reduce the trustworthiness of underground vulnerability markets, driving marginal vuln transactions to the legit market.)

Bug futures series so far

another 2x2 chart

14 September 2017

What to do about different kinds of user data interchange:

| | Data collected without permission | Data collected with permission |
|---|---|---|
| Good data | Build tools and norms to reduce the amount of reliable data that is available without permission. | Develop and test new tools and norms that enable people to share data that they choose to share. |
| Bad data | Report on and show errors in low-quality data that was collected without permission. | Offer users incentives and tools that help them choose to share accurate data and correct errors in voluntarily shared data. |

Most people who want data about other people still prefer data that's collected without permission, and collaboration is something that they'll settle for. So most voluntary user data sharing efforts will need a defense side as well. Freedom-loving technologists have to help people reduce the amount of data that they allow to be taken from them without permission in order for data collectors to listen to people about sharing data.

Tracking protection defaults on trusted and untrusted sites

13 September 2017

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

Setting tracking protection defaults for a browser is hard. Some activities that the browser might detect as third-party tracking are actually third-party services such as single sign-on—so when the browser sets too high of a level of protection it can break something that the user expects to work.

Meanwhile, new research from Pagefair shows that "The very large majority (81%) of respondents said they would not consent to having their behaviour tracked by companies other than the website they are visiting." A tracking protection policy that leans too far in the other direction will also fail to meet the user's expectations.

So you have to balance two kinds of complaints.

  • "your dumbass browser broke a site that was working before"

  • "your dumbass browser let that stupid site do stupid shit"

Maybe, though, if the browser can figure out which sites the user trusts, you can keep the user happy by taking a moderate tracking protection approach on the trusted sites, and a more cautious approach on less trusted sites.

Apple Intelligent Tracking Prevention allows third-party tracking by domains that the user interacts with.

If the user has not interacted with example.com in the last 30 days, example.com website data and cookies are immediately purged and continue to be purged if new data is added. However, if the user interacts with example.com as the top domain, often referred to as a first-party domain, Intelligent Tracking Prevention considers it a signal that the user is interested in the website and temporarily adjusts its behavior (More...)

But it looks like this could give large companies an advantage—if the same domain has both a service that users will visit and third-party tracking, then the company that owns it can track users even on sites that the users don't trust. Russell Brandom: Apple's new anti-tracking system will make Google and Facebook even more powerful.

It might make more sense to set the trust level, and the browser's tracking protection defaults, based on which site the user is on. Will users want a working "Tweet® this story" button on a news site they like, and a "Log in with Google" feature on a SaaS site they use, but prefer to have third-party stuff blocked on random sites that they happen to click through to?

How should the browser calculate user trust level? Sites with bookmarks would look trusted, or sites where the user submits forms (especially something that looks like an email address). More testing is needed, and setting protection policies is still a hard problem.
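One way to picture the trust calculation is as a simple score built from the signals mentioned above. This is a minimal sketch, not how any browser works: the `SiteHistory` fields, the weights, and the threshold are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class SiteHistory:
    """What the browser has seen the user do on one site (hypothetical fields)."""
    bookmarked: bool = False
    forms_submitted: int = 0
    email_submitted: bool = False  # a form field that looked like an email address


def trust_score(history):
    """Add up made-up weights for each trust signal."""
    score = 0
    if history.bookmarked:
        score += 2
    if history.forms_submitted > 0:
        score += 1
    if history.email_submitted:
        score += 2
    return score


def protection_level(history, threshold=2):
    """Moderate protection on trusted sites, a more cautious default elsewhere."""
    return "moderate" if trust_score(history) >= threshold else "cautious"


print(protection_level(SiteHistory(bookmarked=True)))  # moderate
print(protection_level(SiteHistory()))                 # cautious
```

Where to set the weights and the threshold is exactly the part that would need testing with real users.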

Bonus link: Proposed Principles for Content Blocking.

New WebExtension reveals targeted political ads: Interview with Jeff Larson

12 September 2017

The investigative journalism organization ProPublica is teaming up with three German news sites to collect political ads on Facebook in advance of the German parliamentary election on Sept. 24.

Because typical Facebook ads are shown only to finely targeted subsets of users, the best way to understand them is to have a variety of users cooperate to run a client-side research tool. ProPublica developer Jeff Larson has written a WebExtension, that runs on Mozilla Firefox and Google Chrome, to do just that. I asked him how the development went.

Q: Who was involved in developing your WebExtension?

A: Just me. But I can't take credit for the idea. I was at a conference in Germany a few months ago with my colleague Julia Angwin, and we were talking with people who worked at Spiegel about our work on the Machine Bias series. We all thought it would be a good idea to look at political ads on Facebook during the German election cycle, given what little we knew about what happened in the U.S. election last year.

Q: What documentation did you use, and what would you recommend that people read to get started with WebExtensions?

A: I think both Mozilla and Google's documentation sites are great. I would say that the tooling for Firefox is much better due to the web-ext tool. I'd definitely start there (Getting started with web-ext) the next time around.

Basically, web-ext takes care of a great deal of the fiddly bits of writing an extension—everything from packaging to auto reloading the extension when you edit the source code. It makes the development process a lot more smooth.

Q: Did you develop in one browser first and then test in the other, or test in both as you went along?

A: I started out in Chrome, because most of the users of our site use Chrome. But I started using Firefox about halfway through because of web-ext. After that, I sort of ping ponged back and forth because I was using source maps and each browser handles those a bit differently. Mostly the extension worked pretty seamlessly across both browsers. I had to make a couple of changes but I think it took me a few minutes to get it working in Firefox, which was a pleasant surprise.

Q: What are you running as a back end service to collect ads submitted by the WebExtension?

A: We're running a Rust server that collects the ads and uploads images to an S3 bucket. It is my first Rust project, and it has some rough edges, but I'm pretty much in love with Rust. It is pretty wonderful to know that the server won't go down because of all the built in type and memory safety in the language. We've open sourced the project, I could use help if anyone wants to contribute: Facebook Political Ad Collector on GitHub.

Q: Can you see that the same user got a certain set of ads, or are they all anonymized?

A: We strive to clean the ads of all identifying information. So, we only collect the id of the ad, and the targeting information that the advertiser used. For example, people 18 to 44 who live in New York.

Q: What are your next steps?

A: Well, I'm planning on publishing the ads we've received on a web site, as well as a clean dataset that researchers might be interested in. We also plan to monitor the Austrian elections, and next year is pretty big for the U.S. politically, so I've got my work cut out for me.

Q: Facebook has refused to release some "dark" political ads from the 2016 election in the USA. Will your project make "dark" ads in Germany visible?

A: We've been running for about four days, and so far we've collected 300 political ads in Germany. My hope is we'll start seeing some of the more interesting ones from fly by night groups. Political advertising on sites like Facebook isn't regulated in either the United States or Germany, so on some level just having a repository of these ads is a public service.

Q: Your project reveals the "dark" possibly deceptive ads in Chrome and Firefox but not on mobile platforms. Will it drive deceptive advertising away from desktop and toward mobile?

A: I'm not sure, that's a possibility. I can say that Firefox on Android allows WebExtensions and I plan on making sure this extension works there as well, but we'll never be able to see what happens in the native Facebook applications in any sort of large scale and systematic way.

Q: Has anyone from Facebook offered to help with the project?

A: Nope, but if anyone wants to reach out, I would love the help!

Thank you.

Get the WebExtension

Some ways that bug futures markets differ from open source bounties

11 September 2017

Question about Bugmark: what's the difference between a futures market on software bugs and an open source bounty system connected to the issue tracker? In many simple cases a bug futures market will function in a similar way, but we predict that some qualities of the futures market will make it work differently.

  • Open source bounty systems have extra transaction costs of assigning credit for a fix.

  • Open source bounty systems can incentivize contention over who can submit a complete fix, when we want to be able to incentivize partial work and meta work.

Incentivizing partial work and meta work (such as bug triage) would be prohibitively expensive to manage using bounties claimed by individuals, where each claim must be accepted or rejected. The bug futures concept addresses this with radical simplicity: the owners of each side of the contract are tracked completely separately from the reporter and assignee of a bug in the bug tracker.

And bug futures contracts can be traded in advance of expiration. Any work that you do that meaningfully changes the probability of the bug getting fixed by the contract closing date can move the price.

You might choose to buy the "fixed" side of the contract, do some work that makes it look more fixable, and sell at a higher price. Bugmark might make it practical to do "day trading" of small steps, such as translating a bug report originally posted in a language that the developers don't know, helping a user submit a log file, or writing a failing test.
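To make the "day trading" idea concrete, here is a sketch of the arithmetic. The prices, the position size, and the function name are invented for illustration. Nothing here reflects real Bugmark prices or its API.

```python
def trade_profit(units, buy_price_cents, sell_price_cents):
    """Profit in cents from buying and later selling `units` of a position."""
    return units * (sell_price_cents - buy_price_cents)


# Hypothetical small steps, each one nudging the market price of "fixed"
# upward: (description, price before the work, price after the work).
steps = [
    ("translate the bug report", 30, 34),
    ("help the user attach a log file", 34, 41),
    ("write a failing test", 41, 55),
]

# Assume a 100-unit position is bought before each step and sold after it.
total = sum(trade_profit(100, before, after) for _, before, after in steps)
print(total)  # 2500
```

None of these steps fixes the bug, but each one gets paid for the amount it moves the price.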

With the right market design, participants in a bug futures market have the incentive to talk their books, by sharing partial work and metadata.

Related: Some ways that bug futures markets differ from prediction markets, Smart futures contracts on software issues talk, and bullshit walks?

JavaScript and not kicking puppies

03 September 2017

(Updated 4 Sep 2017: add screenshot and how to see the warning.)

Advice from yan, on Twitter:

I decided not to do that for this site.

Yes, user tracking is creepy, and yes, collecting user information without permission is wrong. But read on for what could be a better approach for sites that can make a bigger difference.

First of all, Twitter is so far behind in their attempts to do surveillance marketing that they're more funny and heartening than ominous. If getting targeted by one of the big players is like getting tracked down by a pack of hunting dogs, then Twitter targeting is like watching a puppy chew on your sock. Twitter has me in their database as...

  • Owner of eight luxury cars and a motorcycle.

  • Medical doctor advising patients about eating High Fructose Corn Syrup.

  • Owner of prime urban real estate looking for financing to build a hotel.

  • Decision-maker for a city water system, looking to read up on the pros and cons of cast iron and concrete pipes.

  • Active in-market car shopper, making all decisions based on superficial shit like whether the car has Beats® brand speakers in the doors. (Hey, where am I supposed to park car number 9?)

Advice from "me" as I appear on Twitter: As your doctor, I advise you to cut out HFCS entirely unless you're at a family thing where you should just eat a little and not be an ass about it. When you're in town, stay at my hotel, where the TV is a 4k monitor on an arm that moves to make it usable from the sit-stand desk, and the WiFi is fast and free. No idea on the city water pipe thing though.

So if Twitter is the minor leagues of creepy, and they probably won't be something we have to worry about for long anyway, maybe we can think about whether there's anything that sites can do about riskier kinds of tracking. Getting a user protected from being tracked by one Tweet is a start. But helping users get started with client-side privacy tools that protect from Twitter tracking everywhere can help with not just Twitter tracking, but with the serious trackers that show up in other places.

Blocking Twitter tracking: like kicking a puppy?

Funny wrong Twitter ad targeting is one of my reliable Internet amusements for the day. But that's not why I'm not especially concerned with tagging quoted Tweets. Just doing that doesn't protect this site's visitors from retargeting schemes on other sites.

And every time someone clicks on a retargeted ad from a local business on a social site (probably Facebook, since more people spend more time there) then that's 65 cents or whatever of marketing money that could have gone to local news, bus benches, Little League, or some other sustainable, signal-carrying marketing project. (That's not even counting the medium to heavy treason angle that makes me really uncomfortable about seeing money move in Facebook's direction.)

So, instead of messing with quoted Tweet tagging, I set up this script:

warn3p.js

This will load the Aloodo third-party tracking detector, and, if the browser shows up as easily trackable from site to site, switch out the page header to nag the user.

screenshot of tracking warning

(If you are viewing this site from an unprotected browser and still not seeing the warning, it means that your browser has not yet visited enough domains with the Aloodo script to detect that you're trackable. Take a tracking protection test to expose your browser to more fake tracking, then try again.)

If the other side wants it hidden, then reveal it

Surveillance marketers want tracking to happen behind the scenes, so make it obvious. If you have a browser or privacy tool that you want to recommend, it's easy to put in the link. Every retargeted ad impression that's prevented from happening is more marketing money to pay for ad-sponsored resources that users really want. I know I can't get all the users of this site perfectly protected from all surveillance marketing everywhere, but hey, 65 cents is 65 cents.

Bonus tweet

Bob Hoffman's new book is out! Go click on this quoted Tweet, and do what it says.

Good points here: you don't need to be Magickal Palo Alto Bros to get people to spend more time on your site. USA Today’s Facebook-like mobile site increased time spent per article by 75 percent

The Dumb Fact of Google Money

Join Mozilla and Stanford’s open design sprint for an accessible web

Just Following Orders

Headless mode in Firefox

Hard Drive Stats for Q2 2017

The Time When Google Got Forbes to Pull a Published Story

Disabling Intel ME 11 via undocumented mode

Despite Disavowals, Leading Tech Companies Help Extremist Sites Monetize Hate

Trump Damaged Democracy, Silicon Valley Will Finish It Off

What should you think about when using Facebook?

Ad buyers blast Facebook Audience Network for placing ads on Breitbart

Ice-cold Kaspersky shows the industry how to handle patent trolls

How the GDPR will disrupt Google and Facebook

Rural America Is Building Its Own Internet Because No One Else Will

Younger adults more likely than their elders to prefer reading news

Some ways that bug futures markets differ from prediction markets

30 August 2017

Question about Bugmark: what's the difference between a futures market on software bugs and a prediction market? We don't know how much a bug futures market will tend to act like a prediction market, but here are a few guesses about how it may turn out differently.

Prediction markets tend to have a relatively small number of tradeable questions, with a large number of market participants on each side of each question. Each individual bug future is likely to have a small number of participants, at least on the "fixed" side.

Prediction markets typically have participants who are not in a position to influence the outcome. For example, The Good Judgment Project recruited regular people to trade on worldwide events. Bug futures are designed to attract participants who have special knowledge and ability to change an outcome.

Prediction markets are designed for gathering knowledge. Bug futures are for incentivizing tasks. A well-designed bug futures market will monetize haters by turning a "bet" that a project will fail into a payment that makes it more likely to succeed. If successful in this, the market will have this feature in common with Alex Tabarrok's Dominant Assurance Contract.

Prediction markets often implement conditional trading. Bug markets rely on the underlying bug tracker to maintain the dependency relationships among bugs, and trades on the market can reflect the strength of the connections among bugs as seen by the participants.

hey, kids, 2x2 chart!

29 August 2017

What's the difference between spam and real advertising?

| | No signaling | Signaling |
|---|---|---|
| Interruption | spam | advertising |
| No interruption | organic social | content marketing |

Advertising is a signal-for-attention bargain. People pay attention to advertising that carries some hard-to-fake information about the seller's intentions in the market.

Rory Sutherland says, "What seems undoubtedly true is that humans, like peahens, attach significance to a piece of communication in some way proportionally to the cost of generating or transmitting it."

If I get spam email, that's clearly signal-free because it costs practically nothing. If I see a magazine ad, it carries signal because I know that it cost money to place.

Today's web ads are more like spam, because they can be finely targeted enough that no significant advertiser resources stand behind the message I'm looking at. (A bot might have even written the copy.) People don't have to be experts in media buying to gauge the relative costs of different ads, and filter out the ones that are clearly micro-targeted and signal-free.

Want to lose a hacking contest or win a reputation contest?

27 August 2017

Doc Searls: How the personal data extraction industry ends.

Our data, and data about us, is the crude that Facebook and Google extract, refine and sell to advertisers. This by itself would not be a Bad Thing if it were done with our clearly expressed (rather than merely implied) permission, and if we had our own valves to control personal data flows with scale across all the companies we deal with, rather than countless different valves, many worthless, buried in the settings pages of the Web’s personal data extraction systems, as well as in all the extractive mobile apps of the world.

Today's web advertising business is a hacking contest. Whoever can build the best system to take personal information from the user wins, whether or not the user knows about it. (And if you challenge adfraud and adtech hackers to a hacking contest, you can expect to come in third.)

As users get the tools to control who they share their information with (and they don't want to leak it to everyone) then the web advertising business has to transform into a reputation contest. Whoever can build the most trustworthy place for users to choose to share their information wins.

This is why the IAB is freaking out about privacy regulations, by the way. IAB member companies are winning at hacking and failing at building reputation. (I want to do a user focus group where we show people a random IAB company's webinar, then count how many participants ask for tracking protection support afterward.) But regulations are a sideshow. In the long run regulators will support the activities that legit business needs. So Doc has an important point. We have a big opportunity to rebuild important parts of the web advertising stack, this time based on the assumption that you only get user data if you can convince the user, or at least convince the maintainers of the user's trusted tools, that you will use the data in a way that complies with that user's norms.

One good place to check is: how many of a site's readers are set up with protection tools that make them "invisible" to Google Analytics and Chartbeat? (script) And how many of the "users" who sites are making decisions for are just bots? If you don't have good answers for those, you get dumbassery like "pivot to video" which is a polite expression for "make videos for bots because video ad impressions are worth enough money to get the best bot developers interested."

Yes, "pivot to video" is still a thing, even though

News from the "pivot to video" department, by Lara O'Reilly, at the Wall Street Journal:

Google is issuing refunds for ads that ran on websites with fake traffic...

...

Google’s refunds amount to only a fraction of the cost of the ads served to invalid traffic, which has left some advertising executives unsatisfied...

...

In the recent cases Google discovered, the affected traffic involved video ads, which carry higher ad rates than typical display ads and are therefore an attractive target for fraudsters.

(read the whole thing. If we're lucky, Bob Hoffman will blog about that story. "Some advertising executives unsatisfied"? Gosh, Bob, you think so?)

The good news here is that legit publishers, trying to transform web advertising from a hacking game into a reputation game, don't have to do a perfect job right away. Incrementally make reputation-based, user-permissioned advertising into a better and better investment, while adfraud keeps making unpermissioned tracking into a worse and worse investment. Then wait for some ambitious marketer (and marketers are always looking for a new angle to reinvent Marketing) to discover the opportunity and take credit for it.

Anyway, bonus links.

Facebook Figured Out My Family Secrets, And It Won't Tell Me How

This App Tracks Political Ads To See Who Is Targeting Your Vote–And Why

Designers are using “dark UX” to turn you into a sleep-deprived internet addict

AdTech Weekly - Issue 53: Librarians care more about your privacy than most. - Aug 18th 2017

Rise of the racist robots – how AI is learning all our worst impulses

Getting To The Holy Grail: How Publishers Measure The Incremental Value Of Ad Tech Partners

Linguistic data analysis of 3 billion Reddit comments shows the alt-right is getting stronger

Let’s Talk About The Brand Safety Tax

The state of the brand crackdown on media transparency

Brands are now blacklisting mainstream news sites, including Fox News

Data-hucksters beware: online privacy is returning | John Naughton

Remember that Norwegian site that made readers take a quiz before commenting? Here’s an update on it

List-based and behavior-based tracking protection

22 August 2017

In the news...

User privacy is at risk from both hackers and lawyers. Right now, lawyers are better at attacking lists, and hackers are better at modifying tracker behavior to get around protections.

The more I think about it, the more that I think it's counterproductive to try to come up with one grand unified set of protection rules or cookie policies for everybody.

Spam filters don't submit their scoring rules to ANSI—spammers would just work around them.

Search engines don't standardize and publish their algorithms, because gray hat SEOs would just use the standard to make useless word salad pages that score high.

And different people have different needs.

If you're a customer service rep at an HERBAL ENERGY SUPPLEMENTS company, you need a spam filter that can adjust for your real mail. And any user of a site that has problems with list-based tracking protection will need to have the browser adjust, and rely more on cleaning up third-party state after a session instead of blocking outright.

Does your company intranet become unusable if you fail to accept third-party tracking that comes from an internal domain that your employer acquired and still has some services running on? Browser developers can't decide up front, so the browser will need to adjust. Every change breaks someone's workflow.

That means the browser has to work to help the user pick a working set of protection methods and rules.

0. Send accurate Do Not Track

Inform sites of the user’s preferences on data sharing. (This will be more important in the future because Europe, but privacy-crazed Eurocrats will not save us from having to do our share of the work.)

1. Block connections to third-party trackers

This will need to include both list-based protection and monitoring tracking behavior, like Privacy Badger, because hackers and lawyers are good at getting around different ones.

2. Limit data sent to third-party sites

Apple Safari does this, so it's likely to get easier to do cookie double keying without breaking sites.

3. Scramble or delete unsafe data

If a tracking cookie or other identifier does get through, delete or scramble it on leaving the site or later, as the Self-Destructing Cookies extension does. This could be a good backup for when the browser "learns" that a user needs some third-party state to do something like a shopping cart or comment form, but then doesn't want the info to be used for "ads that follow me around" later.
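
A rough sketch of how steps 1 through 3 could fit together, written in Python for readability rather than as real browser code. The class name, the method names, the example domains, and the three-site threshold are all assumptions made up for illustration. They are not taken from any browser or extension.

```python
from collections import defaultdict

# Hypothetical block list; a real one would come from a maintained list.
BLOCK_LIST = {"tracker.example"}

# Assumed threshold: treat a third party as a tracker once it has set
# state on this many different first-party sites.
BEHAVIOR_THRESHOLD = 3


class ProtectionState:
    def __init__(self):
        # third-party domain -> set of first-party sites where it set state
        self.seen_on = defaultdict(set)
        # (first-party site, third-party domain) -> cookies ("double keying")
        self.jar = defaultdict(dict)
        # third parties the user turned out to need, such as a shopping cart
        self.allowed = set()

    def should_block(self, third_party):
        """Step 1: combine list-based and behavior-based blocking."""
        if third_party in self.allowed:
            return False
        if third_party in BLOCK_LIST:
            return True
        return len(self.seen_on.get(third_party, ())) >= BEHAVIOR_THRESHOLD

    def set_cookie(self, first_party, third_party, name, value):
        """Step 2: store third-party state keyed by both sites."""
        self.seen_on[third_party].add(first_party)
        self.jar[(first_party, third_party)][name] = value

    def get_cookies(self, first_party, third_party):
        """Return only the state that was set under this first party."""
        return dict(self.jar.get((first_party, third_party), {}))

    def end_session(self, first_party):
        """Step 3: delete third-party state when the user leaves a site."""
        for key in [k for k in self.jar if k[0] == first_party]:
            del self.jar[key]
```

Because the jar is keyed by both sites, an identifier set while the user is on one site is never sent back from another, and the allowed set is where per-user adjustments would live.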

How is everyone's tracking protection working? An update

20 August 2017

When I set up this blog, I put in a script to check how many of the users here are protected from third-party tracking.

The best answer for now is 31%. Of the clients that ran JavaScript on this site over the past two weeks, 31% did not also run JavaScript from the Aloodo "fake third-party tracker".

The script is here: /code/check3p.js
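
The arithmetic behind that percentage can be sketched roughly like this, assuming two sets of client identifiers pulled from server logs: clients that ran the site's own JavaScript, and clients that also loaded the fake third-party tracker. The function name and the sample data are hypothetical and are not taken from check3p.js.

```python
def protected_share(first_party_clients, tracker_clients):
    """Fraction of JavaScript-running clients that never loaded the tracker."""
    first_party_clients = set(first_party_clients)
    if not first_party_clients:
        return 0.0
    protected = first_party_clients - set(tracker_clients)
    return len(protected) / len(first_party_clients)


# Made-up sample: 100 clients ran the site script, 69 also hit the tracker.
site = {f"client{i}" for i in range(100)}
tracked = {f"client{i}" for i in range(69)}
print(f"{protected_share(site, tracked):.0%} protected")
```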

This is not as good as I had hoped (turn on your tracking protection, people! Don't get tricked by ad blockers that leave you unprotected by default!) but it's a start.

The Information Trust Exchange is doing research on the problem of third-party tracking at news sites. News industry consultant Greg Swanson:

All of the conversations on the newspaper side have been focused on how can we join the advertising technology ecosystem. For example, how can a daily newspaper site in Bismarck, North Dakota deliver targeted advertising to a higher-value soccer mom? And none of the newspapers them have considered the fact that when they join that ecosystem they are enabling spam sites, fraudulent sites – enabling those sites to get a higher CPM rate by parasitically riding on the data collected from the higher-value newspaper sites.

More info: Aloodo for web publishers.

SEO hats and the browser of the future

19 August 2017

The field of Search Engine Optimization has white hat SEO, black hat SEO, and gray hat SEO.

White hat SEO helps a user get a better search result, and complies with search engine policies. Examples include accurately using the same words that users search on, and getting honest inbound links.

Black hat SEO is clearly against search engine policies. Link farming, keyword stuffing, cloaking, and a zillion other schemes. If they see you doing it, your site gets penalized in search results.

Gray hat SEO is everything that doesn't help the user get a better search result, but technically doesn't violate a search engine policy.

Most SEO experts advise you not to put a lot of time and effort into gray hat, because eventually the search engines will notice your gray hat scheme and start penalizing sites that do it. Gray hat is just stuff that's going to be black hat when the search engines figure it out.

Adtech has gray hat, too. Rocket Fuel Awarded Two Patents to Help Leverage First-Party Cookies to More Meaningfully Reach Consumers.

This scheme seems to be intended to get around existing third-party cookie protection, which is turned on by default in Apple Safari and available in other browsers.

But how long will it work?

Maybe the browser of the future won't run a "kangaroo cookie court" but will ship with a built-in "kangaroo law school" so that each copy of the browser will develop its own local "courts" and its own local "case law" based on the user's choices. It will become harder to predict how long any single gray hat adtech scheme will continue working.

In the big picture: in order to sell advertising you need to give the advertiser some credible information on who the audience is. Since the "browser wars" of the 1990s, most browsers have been bad at protecting personal information about the user, so web advertising has become a game where a whole bunch of companies compete to covertly capture as much user info as they can.

Today, browsers are getting better at implementing people's preferences about sharing their information. The result is a change in the rules of the game. Investment in taking people's personal info is becoming less rewarding, as browsers compete to reflect people's preferences. (That patent will be irrelevant thanks to browser updates long before it expires.)

Adfraud is the other half of this story. Fraudbots are getting smarter at creating human-looking ad impressions just as humans are getting better protected. If you think that a web publisher's response to harder-to-detect bots, viewing more high-CPM video ads, should be "pivot to video!!1!!" I don't know if I can help you.

And investments in building sites and brands that are trustworthy enough for people to want to share their information will tend to become more rewarding. (This shift naturally leads to complaints from people who are used to winning the old game, but will probably be better for customers who want to use trustworthy brands and for people who want to earn money by making ad-supported news and cultural works.)

One of the big advertising groups is partnering with Digital Content Next’s trust-focused ad marketplace

Partisanship, Propaganda, and Disinformation: Online Media and the 2016 U.S. Presidential Election

ANA Endorses TrustX, Encourages Members To Use Programmatic Media-Buying Stamp Of Approval

Call for Papers: Policy and Internet Special Issue on Reframing ‘Fake News’: Architectures, Influence, and Automation

Time to sink the Admiral (or, why using the DMCA to block adblockers is a bad move)

I'm a woman in computer science. Let me ladysplain the Google memo to you.

Easylist block list removes entry after DMCA takedown notice

Will Cities Ever Outsmart Rats?

Uber drivers gang up to cause surge pricing, research says

Google reveals sites with ‘failing’ ads, including Forbes, LA Times

Koch group, Craigslist founder come to Techdirt's aid

The Mozilla Information Trust Initiative: Building a movement to fight misinformation online

Are Index Funds Evil?

When Silicon Valley Took Over Journalism

How publishers can beat fraudsters at their own game

Facebook’s Secret Censorship Rules Protect White Men from Hate Speech But Not Black Children

cdparanoia returned code 73

18 August 2017

Welcome, people googling for the above error message.

I saw the error

cdparanoia returned code 73

and it turns out I was trying to run two abcde processes in two terminal windows. Kill the second one and the error goes away.

Hope your problem was as simple as that.

ePrivacy and marketing budgets

16 August 2017

(Update 18 Aug 2017: this post is also available at Digital Content Next.)

As far as I know, there are three ways to match an ad to a user.

User intent: Show an ad based on what the user is searching for. Old-school version: the Yellow Pages.

Context: Show an ad based on where the user is, or what the user is interested in. Old-school versions: highway billboards (geographic context), specialized magazines (interest context).

User identity: Show an ad based on who the user is. Old-school version: direct mail.

Most online advertising is matched to the user based on a mix of all three. And different players have different pieces of the action for each one. For user intent, search engines are the gatekeepers. The other winners from matching ads to users by intent are browsers and mobile platforms, who get paid to set their default search engine. Advertising based on context rewards the owners of reputations for producing high-quality news, information, and cultural works. Finally, user identity now has a whole Lumascape of vendors in a variety of categories, all offering to help identify users in some way. (the Lumascape is rapidly consolidating, but that's another story.)

Few of the web ads that you might see today are matched to you purely based on one of the three methods. Investments in all three tend to shift as the available technology, and the prevailing norms and laws, change.

Enough background.

Randall Rothenberg of the IAB is concerned about the proposed ePrivacy Regulation in Europe, and writes,

The basic functionality of the internet, which is built on data exchanges between a user’s computer and publishers’ servers, can no longer be used for the delivery of advertising unless the consumer agrees to receive the ads – but the publisher must deliver content to that consumer regardless.

This doesn't look accurate. I don't know of any proposal that would require publishers to serve users who block ads entirely. What Rothenberg is really complaining about is that the proposed regulation would limit the ability of sites and ad intermediaries to match ads to users based on user identity, forcing them to rely on user intent and context. If users choose to block ads delivered from ad servers that use their personal data without permission, then sites won't be able to refuse to serve them the content, but will be able to run ads that are relevant to the content of the site. As far as I can tell, sites would still be able to pop a "turn off your ad blocker" message in place of a news story if the user was blocking an ad placed purely by context, magazine style.

Privacy regulation is not so much an attack on the basic functionality of the Internet, as it is a shift that lowers the return on investment on knowing who the user is, and drives up the return on investment on providing search results and content. That's a big change in who gets paid: more money for search and for trustworthy content brands, and less for adtech intermediaries that depend on user tracking.

Advertising: a fair deal for the user?

That depends. Search advertising is clearly the result of a user choice. The user chooses to view ads that come with search results, as part of choosing to do a search. As long as the ads are marked as ads, it's pretty obvious what is happening.

The same goes for ads placed in context. The advertiser trades economic signal, in the form of costly support of an ad-supported resource, for the user's attention. This is common in magazine and broadcast advertising, and when you use a site with one of the (rare) pure in-context ad platforms such as Project Wonderful, it works about the same way.

The place where things start to get problematic is ads based on user identity, placed by tracking users from site to site. The more that users learn how their data is used, the less tracking they tend to want. In one survey, 66% of adult Americans said they do not want marketers to tailor advertisements to their interests, and when the researchers explained how ad targeting works, the percentage went up.

If users, on average, dislike tracking enough that sites choose to conceal it, then that's pretty good evidence that sites should probably ask for permission to do it. Whether this opt-in should be enforced by law, technology, or both is left as an exercise for the reader.

So what happens if, thanks to new regulations, technical improvements in browsers, or both, cross-site tracking becomes harder? Rothenberg insists that this transformation would end ad-supported sites, but the real effects would be more complex. Ad-supported sites are already getting a remarkably lousy share of ad budgets. “The supply chain’s complexity and opacity net digital advertisers as little as 30 cents to 40 cents of working media for every dollar spent,” ANA CEO Bob Liodice said.

Advertising on high-reputation sites tends to be a better investment than using highly intermediated, fraud-prone stacks of user tracking to try to chase good users to cheap sites. But crap ad inventory, including fraudulent and brand-unsafe stuff, persists. The crap only has market value because of user tracking, and it drives down the value of legit ads. If browser improvements or regulation make knowledge of user identity rarer, the crap tends to leave the market and the value of user intent and context goes up.

Rothenberg speaks for today's adtech, which despite all its acronyms and Big Data jive, is based on a pretty boring business model: find a user on a legit site, covertly follow the user to a crappy site where the ads are cheaper, sell an ad impression there, profit. Of course he's entitled to make the case for enabling IAB members to continue to collect their "adtech tax." But moving ad budgets from one set of players to another doesn't end ad-supported sites, because marketers adjust. That's what they do. There's always something new in marketing, and budgets move around. What happens when privacy regulations shift the incentives, and make more of advertising have to depend on trustworthy content? That's the real question here.

Moral values in society

08 August 2017

Moral values in society are collapsing? Really? Elizabeth Stoker Bruenig writes, The baseline moral values of poor people do not, in fact, differ that much from those of the rich. (read the whole thing).

Unfortunately, if you read the fine print, it's more complicated than that. Any market economy depends on establishing trust between people who trade with each other. Tim Harford writes,

Being able to trust people might seem like a pleasant luxury, but economists are starting to believe that it’s rather more important than that. Trust is about more than whether you can leave your house unlocked; it is responsible for the difference between the richest countries and the poorest.

Somehow, over thousands of years, business people have built up a set of norms about high-status and low-status business activities. Craftsmanship, consistent supply of high-quality staple goods, and construction of noteworthy projects are high-status activities. Usury and deception are examples of low-status activities. (You make your money in quarters, gambling with retired people? You lend people $100 until Friday at a 300% interest rate? No club invitation for you.)

Somehow, though, that is now changing in the USA. Those who earn money through deception now have seats at the same table as legitimate business. Maybe it started with the shift into "consumer credit" by respectable banks. But why were high-status bankers willing to play loan shark to begin with? Something had to have been building, culturally. (It started too early to blame the Baby Boomers.)

We tend to blame information technology companies for complex, one-sided Terms of Service and EULAs, but it's not so much a tech trend as it is a general business culture trend. It shows up in tech fast, because rapid technology change provides cover and concealment for simultaneous changes in business terms. US business was rapidly losing its connection to basic norms when it was still moving at the speed of FedEx and fax. (You can't say, all of a sudden, "car crashes in existing fast-food drive-thrus are subject to arbitration in Unfreedonia" but you can stick that kind of term into a new service's ToS.) There's some kind of relativistic effect going on. Tech bros just seem like bigger douchebags because they're moving faster.

Regulation isn't the answer. We have a system in which business people can hire lobbyists to buy the laws and regulations we want. The question is whether we're going to use our regulatory capture powers in a shortsighted, society-eroding hustler way, or in a conservative way. Economic conservatism means not just limiting centralized state control of capital, but preserving the balance among all the long-standing stewards of capital, including households, municipalities, and religious and educational institutions. Economic conservatism and radical free-marketism are fundamentally different.

People blame trashy media for the erosion of norms among the poor, so let's borrow that explanation for the erosion of norms among the rich as well. Maybe our problem with business norms results from the globalization and sensationalism of business media. Joe CEO isn't just the most important corporate leader of Mt. Rose, MN, any more—on a global scale he's just another broke-ass hustler.

More random links

06 August 2017

Not the Google story everyone is talking about, but related: Google Is Matching Your Offline Buying With Its Online Ads, But It Isn’t Sharing How. (If a company becomes known for doing creepy shit, it will get job applications from creepy people, and at a large enough company some of them will get hired. Related: The Al Capone theory of sexual harassment)

Least surprising news story ever: The Campaign Against Facebook And Google's Ad "Duopoly" Is Going Nowhere. Independent online publishers can't beat the big surveillance marketing companies at surveillance marketing? How about they try to beat Amazon and Microsoft at cloud services, or Apple and Lenovo at laptop computers? There are possible winning strategies for web publishers, but doing the same as the incumbents with less money and less data is not one of them.

Meanwhile, from an investor point of view: It’s the Biggest Scandal in Tech (and no one’s talking about it). Missing the best investment advice: get out of any B-list adtech company that is at risk of getting forced into a low-value acquisition by a sustained fraud story. Or short it and research the fraud story yourself.

Did somebody at The Atlantic get a loud phone notification during a classical music concert or something? Your Smartphone Reduces Your Brainpower, Even If It's Just Sitting There and Have Smartphones Destroyed A Generation?, by Jean M. Twenge, The Atlantic

Good news: Math journal editors resign to start rival open-access journal

Apple’s Upcoming Safari Changes Will Shake Up Ad Tech: Not surprisingly, Facebook and Amazon are the big winners in this change. Most of their users come every day or at least every week. And even the mobile users click on links often, which, on Facebook, takes them to a browser. These companies will also be able to buy ad inventory on Safari at lower prices because many of the high-dollar bidders will go away. A good start by Apple, but other browsers can do better. (Every click on a Facebook ad from a local business is $0.65 of marketing money that's not going to local news, Little League sponsorships, and other legit places.)

Still on the upward slope of the Peak Advertising curve: Facebook 'dark ads' can swing political opinions, research shows

You’re more likely to hear from tech employers if you have one of these 10 things on your resume (and only 2 of them are proprietary. These kids today don't know how good they have it.)

The Pac-Man Rule at Conferences

How “Demo-or-Die” Helped My Career

Pragmatists for copyleft, or, corporate hive minds don't accept software licenses

06 August 2017

One of the common oversimplifications in discussing open-source software licenses is that copyleft licenses are "idealistic" while non-copyleft licenses are "pragmatic." But that's not all there is to it.

The problem is that most people redistributing licensed code are doing so in an organizational context. And no human organization is a hive mind where those who participate within it subordinate their goals to those of the collective. Human organizations are full of people with their own motivations.

Instead of treating the downstream developer's employer as a hive mind, it can be more productive to assume good faith on the part of the individual who intends to contribute to the software, and think about the license from the point of view of a real person.

Releasing source for a derivative work costs time and money. The well-intentioned "downstream" contributor wants his or her organization to make those investments, but he or she has to make a case for them. The presence of copyleft helps steer the decision in the right direction. Jane Hacker at an organization planning to release a derivative work can say, matter-of-factly, "we need to comply with the upstream license" if copyleft is involved. The organization is then more likely to do the right thing. There are always violations, but the license is a nudge in the right direction.

(The extreme case is university licensing offices. University-owned software patents can exclude a graduate student from his or her own project when the student leaves the university, unless he or she had the foresight to build it as a derivative work of something under copyleft.)

Copyleft isn't a magic commons-building tool, and it isn't right for every situation. But it can be enough to push an organization over the line. (One place where I worked had to do a source release for one dependency licensed under GPLv2, and it turned out to be easiest to just build one big source code release with all the dependencies in it, and offer that.)

Hey kids, favicon!

05 August 2017

Finally fixed those 404s from browsers looking for favicon.ico on this blog.

  1. Google image search for images where "reuse with modification" is allowed.

  2. Found this high-quality lab mouse SVG image.

  3. Opened it in GNU Image Manipulation Program, posterized, cropped to a square. Kept the transparent background.

  4. Just went to realfavicongenerator.net and did what it says, and added the resulting images and markup to the site.

That's about it. Now there's a little mouse in the browser tab (and it should do the right thing with the icons if someone pins it to their home screen on mobile.)

Why surveillance marketers don't worry about GDPR (but privacy nerds should)

01 August 2017

A lot of privacy people these days sound like a little kid arguing with a sibling. You're going to be in big trouble when Dad gets home!

Dad, here, is the European Union, who's going to put the General Data Protection Regulation foot down, and then, oh, boy, those naughty surveillance marketers are going to catch it, and wish that they had been listening to us about privacy all along.

Right?

But Internet politics never works like that. Sure, European politicians don't want to hand over power to the right-wing factions who are better at surveillance marketing than they are. And foreign agents use Facebook (and other US-based companies) to attack legit political systems. But that stuff is not going to be enough to save GDPR.

The problem is that perfectly normal businesses are using GDPR-violating sneaky tracking pixels and other surveillance marketing as part of their daily marketing routine.

As the GDPR deadline approaches, surveillance marketers in Europe are going to sigh and painstakingly explain to European politicians that of course this GDPR thing isn't going to work. "You see, politicians, it's an example of political overreach that completely conflicts with technical reality." European surveillance marketers will use the same kind of language about GDPR that the freedom-loving side used when we talked about the proposed CBDTPA. It's just going to Break the Internet! People will lose their jobs!

The result is predictable. GDPR will be delayed, festooned with exceptions, or both, and the hoped-for top-down solution to privacy problems will not come. There's no shortcut. We'll only get a replacement for surveillance marketing when we build the tools, the networks, the business processes, the customer/voter norms, and then the political power.

Extracting just the audio from big video files

29 July 2017

Update 24 Aug 2017: How to get the big video file from an Air Mozilla page.

  1. Sign in if needed and go to the page with the video on it.

  2. Control-I to open the page info window.

  3. Open the "Media" tab in the page info window, and find the item with type "Video".

  4. Click "Save As" to save the video.

Got a big video, and want a copy of just the audio for listening on a device with limited storage? Use Soundconverter.

soundconverter -b -m mp3 -s .mp3 long-video.webm

(MP3 patents are expired now, hooray! I'm just using MP3 here because if I get a rental car that lets me plug in a USB stick for listening, the MP3 format is most likely to be supported.)

Soundconverter has a GUI but you can use -b for batch mode from the shell. soundconverter --help for help. You do need to set both the MIME type, with -m, and the file suffix, with -s.
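
For a whole directory of videos, the same command can be wrapped in a short script. This is only a sketch: the function names and the default *.webm pattern are assumptions, and it uses nothing beyond the flags shown above, running soundconverter once per file.

```python
import subprocess
from pathlib import Path


def soundconverter_command(path):
    """Build the batch-mode command line shown above for one input file."""
    return ["soundconverter", "-b", "-m", "mp3", "-s", ".mp3", str(path)]


def convert_all(directory, pattern="*.webm"):
    """Run soundconverter on each matching file; return the commands run."""
    commands = []
    for path in sorted(Path(directory).glob(pattern)):
        command = soundconverter_command(path)
        # check=True raises CalledProcessError if soundconverter fails.
        subprocess.run(command, check=True)
        commands.append(command)
    return commands
```

Calling convert_all(".") would convert every .webm file in the current directory, one at a time.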

Online ads don't matter to P&G

28 July 2017

In the news: P&G Cuts More Than $100 Million in ‘Largely Ineffective’ Digital Ads

Not surprising.

Procter & Gamble makes products that help you comply with widely held cleanliness norms.

Digital ads are micro-targeted to you as an individual.

That's the worst possible brand/medium fit. If you don't know that the people who expect you to keep your house or body clean are going to be aware of the same product, how do you know whether to buy it?

Bonus link from Bob Hoffman last year: Will The P&G Story Bring Down Ad Tech? Please?

Got a reply from Twitter

26 July 2017

I thought it would be fun to try Twitter ads, and, not surprisingly, I started getting fake followers pretty quickly after I started a Twitter follower campaign.

Since I'm paying nine cents a head for these followers, I don't want to get ripped off. So naturally I put in a support ticket to Twitter, and just heard back.

Thanks for writing in about the quality of followers and engagements. One of the advantages of the Twitter Ads platform is that any RTs of your promoted ads are sent to the retweeting account's followers as an organic tweet. Any engagements that result are not charged, however followers gained may not align with the original campaign's targeting criteria. These earned followers or engagements do show in the campaign dashboard and are used to calculate cost per engagement, however you are not charged for them directly.

Twitter also passes all promoted engagements through a filtering mechanism to avoid charging advertisers for any low-quality or invalid engagements. These filters run on a set schedule so the engagements may show in the campaign dashboard, but will be deducted from the amount outstanding and will not be charged to your credit card.

If you have any further questions, please don't hesitate to reply.

That's pretty dense San Francisco speak, so let me see if I can translate to the equivalent for a normal product.

Hey, what are these rat turds doing in my raisin bran?

Thanks for writing in about the quality of your raisin bran eating experience. One of the advantages of the raisin bran platform is that during the production process, your raisin bran is made available to our rodent partners as an organic asset.

I paid for raisin bran, so why are you selling me raisin-plus-rat-turds bran?

Any ingredients that result from rodent engagement are not charged, however ingredients gained may not align with your original raisin-eating criteria.

Can I have my money back?

We pass all raisin bran sales through a filtering mechanism to avoid charging you for invalid ingredients. The total weight of the product, as printed on the box, includes these ingredients, but the weight of invalid ingredients will be deducted from the amount charged to your credit card.

So how can I tell which rat turds are "organic" so I'm not paying for them, and which are the ones that you just didn't catch and are charging me for?

(?)

Buying Twitter followers: Fiverr or Twitter?

On Fiverr, Twitter followers are about half a cent each ($5/1000). On Twitter, I'm getting followers for about 9 cents each. The Twitter price is about 18x the Fiverr price.

But every follower that someone else buys on Fiverr has to be "aged" and disguised in order to look realistic enough not to get banned. The bot-herders have to follow legit follower campaigns such as mine and not just their paying customers.

If Twitter is selling those "follow" actions to me for nine cents each, and the bot-herder is only making half a cent, how is Twitter not making more from bogus Twitter followers than the bot-herders are?

If you're verified on Twitter, you may not be seeing how much of a shitshow their ad business is. Maybe they're going to have to sell Twitter to me sooner than I thought.

Incentivizing production of information goods

26 July 2017

Just thinking about approaches to incentivizing production of information goods, and where futures markets might fit in.

Artificial property

Article 1, Section 8, of the US Constitution still covers this one best.

To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;

We know about the problems with this one. It encourages all kinds of rent-seeking and freedom-menacing behavior by the holders of property interests in information. And the transaction costs are too high to incentivize the production of some useful kinds of information.

Commoditize the complement

Joel Spolsky explained it best, in Strategy Letter V. Smart companies try to commoditize their products’ complements. (See also: the list of business models in the Some Easily Rebutted Objections to GNU's Goals section of the GNU Manifesto)

This one has been shown to work for some categories of information goods but not others. (We have Free world-class browsers and OS kernels because search engines and hardware are complements. We don't have free world-class software in categories such as CAD.)

Signaling

Release a free information good as a way to signal competence in performing a service, or at least a large investment by the author in persuading others that the author is competent. Works at the level of the individual labor market and in consulting. Don't know if this works in other areas.

Game and market mechanisms

With "gamified crowdsourcing" you can earn play rewards for very low transaction costs, and contribute very small tasks.

Common Voice

Higher transaction costs are associated with "crowdfunding" which sounds similar but requires more collaboration and administration.

In the middle, between crowdsourcing and crowdfunding, is a niche for a mechanism with lower transaction costs than crowdfunding but more rewards than crowdsourcing.

By using the existing bug tracker to resolve contracts, a bug futures market keeps transaction costs low. By connecting to an existing cryptocurrency, a bug futures market enables a kind of reward that is more liquid, and transferrable among projects.

We don't know how wide the bug futures niche is. Is it a tiny space between increasingly complex tasks that can be resolved by crowdsourcing and increasingly finer-grained crowdfunding campaigns?

Or are bug futures capable of achieving low enough transaction costs to be an attractive incentivization mechanism for a lot of tasks that go into a variety of information goods?

My bot parsed 12,387 RSS feeds and all I got were these links.

23 July 2017

Bryan Alexander has a good description of an "open web" reading pipeline in I defy the world and go back to RSS. I'm all for the open web, but 40 separate folders for 400 feeds? That would drive me nuts. I'm a lumper, not a splitter. I have one folder for 12,387 feeds.

My chosen way to use RSS (and one of the great things about RSS is you can choose UX independently of information sources) is a "scored river". Something like Dave Winer's River of News concept, that you can navigate by just scrolling, but not exactly a river of news.

  • with full text if available, but without images. I can click through if I want the images.

  • items grouped by score, not feed. (Scores are managed by a dirt-simple algorithm where a feed "invests" a percentage of its points in every link, and the investments pay out in a higher score for that feed if the user likes a link.)
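Here's a rough Python sketch of that scoring idea. This is not the actual code behind my reader. The class name, the 10% investment rate, the payout multiplier, and the starting points are all made-up values for illustration.

```python
from collections import defaultdict

# Assumed values for illustration only.
INVEST_FRACTION = 0.1    # share of a feed's points staked on each link it posts
PAYOUT_MULTIPLIER = 3.0  # return on a stake when the reader likes the link


class ScoredRiver:
    """Hypothetical scored river: feeds stake points on the links they post."""

    def __init__(self, starting_points=100.0):
        # feed name -> current points (every feed starts with the same amount)
        self.points = defaultdict(lambda: starting_points)
        # link -> {feed name: points staked on that link}
        self.stakes = defaultdict(dict)

    def add_item(self, feed, link):
        """A feed posts a link and invests a percentage of its points in it."""
        if feed in self.stakes[link]:
            return  # this feed already invested in this link
        stake = self.points[feed] * INVEST_FRACTION
        self.points[feed] -= stake
        self.stakes[link][feed] = stake

    def link_score(self, link):
        """A link's score is the total invested in it by all feeds."""
        return sum(self.stakes[link].values())

    def like(self, link):
        """The reader likes a link, so every feed that invested gets paid."""
        for feed, stake in self.stakes[link].items():
            self.points[feed] += stake * PAYOUT_MULTIPLIER

    def river(self):
        """Links ordered by score, highest first, regardless of feed."""
        return sorted(self.stakes, key=self.link_score, reverse=True)
```

With these numbers, a feed that keeps posting links I like ends up with more points, so its future links start with bigger stakes and float higher in the river.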

I also put the byline at the bottom of each item. Anyway, one thing I have found out about manipulating my own filter bubble is that linklog feeds and blogrolls are great inputs. So here's a linklog feed. (It's mirrored from the live site, which annoys everyone except me.)

Here are some actual links.

This might look funny: How I ran my kids like an Atlassian team for a month. But think about it for a minute. Someone at every app or site your kids use is doing the same thing, and their goals don't include "Dignity and Respect" or "Hard Work Smart Work".

Global network of 'hunters' aim to take down terrorists on the internet It took me a few days to figure things out and after a few weeks I was dropping accounts like flies…

Google's been running a secret test to detect bogus ads — and its findings should make the industry nervous. (This is a hella good idea. Legit publishers could borrow it: just go ad-free for a few minutes at random, unannounced, a couple of times a week, then send the times straight to CMOs. Did you buy ads that someone claimed ran on our site at these times? Well, you got played.)

For an Inclusive Culture, Try Working Less As I said, to this day, my team at J.D. Edwards was the most diverse I’ve ever worked on....Still, I just couldn’t get over that damned tie.

The Al Capone theory of sexual harassment Initially, the connection eluded us: why would the same person who made unwanted sexual advances also fake expense reports, plagiarize, or take credit for other people’s work?

Jon Tennant - The Cost of Knowledge But there’s something much more sinister to consider; recently a group of researchers saw fit to publish Ebola research in a ‘glamour magazine’ behind a paywall; they cared more about brand association than the content. This could be life-saving research, why did they not at least educate themselves on the preprint procedure....

Twitter Is Still Dismissing Harassment Reports And Frustrating Victims

This Is How Your Fear and Outrage Are Being Sold for Profit (Profit? What about TEH LULZ??!?!1?)

Fine, have some cute animal photos, I was done with the other stuff anyway: Photographer Spends Years Taking Adorable Photos of Rats to Break the Stigma of Rodents

the other dude

22 July 2017

Making the rounds, this is a fun one: A computer was asked to predict which start-ups would be successful. The results were astonishing.

  • 2014: When there's no other dude in the car, the cost of taking an Uber anywhere becomes cheaper than owning a vehicle. So the magic there is, you basically bring the cost below the cost of ownership for everybody, and then car ownership goes away.

  • 2018 (?): When there's no other dude in the fund, the cost of financing innovation anywhere becomes cheaper than owning a portfolio of public company stock. So the magic there is, you basically bring the transaction costs of venture capital below the cost of public company ownership for everybody, and then public companies go away.

Could be a thing for software/service companies faster than we might think. Futures contracts on bugs→equity crowdfunding and pre-sales of tokens→bot-managed follow-on fund for large investors.

Stupid ideas department

18 July 2017

Here's a probably stupid idea: give bots the right to accept proposed changes to a software project. Can automation encourage less burnout-provoking behavior?

A set of bots could interact in interesting ways.

  • Regression-test-bot: If a change only adds a test, applies cleanly to both the current version and to a previous version, and the previous version passes the test, accept it, even if the test fails for the current version.

  • Harmless-change-bot: If a change is below a certain size, does not modify existing tests, and all tests (including any new ones) pass, accept it.

  • Revert-bot: If any tests are failing on the current version, and have been failing for more than a certain amount of time, revert back to a version that passes.

Would more people write regression tests for their issues if they knew that a bot would accept them? Or say that someone makes a bad change but gets it past harmless-change-bot because no existing test covers it. No lengthy argument needed. Write a regression test and let regression-test-bot and revert-bot team up to take care of the problem. In general, move contributor energy away from arguing with people and toward test writing, and reduce the size of the maintainer's to-do list.
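The three bot rules are simple enough to write down as predicates. A minimal sketch, assuming some CI system has already computed the facts about each change. The `Change` fields, the size limit, and the failure time limit are hypothetical names and values, not taken from any real tool.

```python
from dataclasses import dataclass

# Assumed thresholds; a real project would pick its own.
MAX_HARMLESS_LINES = 20          # "below a certain size"
MAX_FAILING_SECONDS = 24 * 3600  # "more than a certain amount of time"


@dataclass
class Change:
    """Facts about a proposed change, assumed to come from a CI run."""
    lines_changed: int
    only_adds_test: bool
    modifies_existing_tests: bool
    applies_to_current: bool
    applies_to_previous: bool
    passes_on_previous: bool  # the new test passes on the previous version
    all_tests_pass: bool      # all tests pass on the current version plus the change


def regression_test_bot(c: Change) -> bool:
    """Accept a test-only change that passes on a previous version,
    even if it fails on the current one."""
    return (c.only_adds_test and c.applies_to_current
            and c.applies_to_previous and c.passes_on_previous)


def harmless_change_bot(c: Change) -> bool:
    """Accept a small change that leaves existing tests alone and passes everything."""
    return (c.lines_changed < MAX_HARMLESS_LINES
            and not c.modifies_existing_tests
            and c.all_tests_pass)


def revert_bot(failing_seconds: float) -> bool:
    """Revert once tests on the current version have been failing too long."""
    return failing_seconds > MAX_FAILING_SECONDS
```

Note that a new regression test for a real bug gets past regression-test-bot but not harmless-change-bot, because the test fails on the current version. That failing test is what eventually wakes up revert-bot.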

Playing for third place

17 July 2017

Just tried a Twitter advertising trick that a guy who goes by "weev" posted two years ago.

It still works.

They didn't fix it.

Any low-budget troll who can read that old blog post and come up with a valid credit card number can still do it.

Maybe Twitter is a bad example, but the fast-moving nationalist right wing manages to outclass its opponents on other social marketing platforms, too. Facebook won't even reveal how badly they got played in 2016. They thought they were putting out cat food for cute Internet kittens, but the rats ate it.

This is not new. Right-wing shitlords, at least the best of them, are the masters of database marketing. They absolutely kill it, and they have been ever since Marketing as we know it became a thing. Some good examples:

All the creepy surveillance marketing stuff they're doing today is just another set of tools in an expanding core competency.

Every once in a while you get an exception. The environmental movement became a direct mail operation in response to Interior Secretary James G. Watt, who alarmed environmentalists enough that organizations could reliably fundraise with direct mail copy quoting from Watt's latest speech. And the Democrats tried that "Organizing for America" thing for a little while, but, man, their heart just wasn't in it. They dropped it like a Moodle site during summer vacation. Somehow, the creepier the marketing, the more it skews "red". The more creativity involved, the more it skews "blue" (using the USA meanings of those colors.) When we make decisions about how much user surveillance we're going to allow on a platform, we're making a political decision.

Anyway. News Outlets to Seek Bargaining Rights Against Google and Facebook.

The standings so far.

  1. Shitlords and fraud hackers

  2. Adtech and social media bros

  3. NEWS SITES HERE (?)

News sites want to go to Congress, to get permission to play for third place in their own business? You want permission to bring fewer resources and less experience to a surveillance marketing game that the Internet companies are already losing?

We know the qualities of a medium that you win by being creepier, and we know the qualities of a medium that you can win with reputation and creativity. Why waste time and money asking Congress for the opportunity to lose, when you could change the game instead?

Maybe achieving balance in political views depends on achieving balance in business model. Instead of buying in to the surveillance marketing model 100%, and handing an advantage to one side, maybe news sites should help users control what data they share in order to balance competing political interests.

Smart futures contracts on software issues talk, and bullshit walks?

14 July 2017

Previously: Benkler’s Tripod, transactions from a future software market, more transactions from a future software market

Owning "equity" in an outcome

John Robb: Revisiting Open Source Ventures:

Given this, it appears that an open source venture (a company that can scale to millions of worker/owners creating a new economic ecosystem) that builds massive human curated databases and decentralizes the processing load of training these AIs could become extremely competitive.

But what if the economic ecosystem could exist without the venture? Instead of trying to build a virtual company with millions of workers/owners, build a market economy with millions of participants in tens of thousands of projects and tasks? All of this stuff scales technically much better than it scales organizationally—you could still be part of a large organization or movement while only participating directly on a small set of issues at any one time. Instead of holding equity in a large organization with all its political risk, you could hold a portfolio of positions in areas where you have enough knowledge to be comfortable.

Robb's opportunity is in training AIs, not in writing code. The "oracle" for resolving AI-training or dataset-building contracts would have to be different, but the futures market could be the same.

The cheating project problem

Why would you invest in a futures contract on bug outcomes when the project maintainer controls the bug tracker?

And what about employees who are incentivized from both sides: paid to fix a bug but able to buy futures contracts (anonymously) that will let them make more on the market by leaving it open?

In order for the market to function, the total reputation of the project and contributors must be high enough that outside participants believe that developers are more motivated to maintain that reputation than to "take a dive" on a bug.

That implies that there is some kind of relationship between the total "reputation capital" of a project and the maximum market value of all the futures contracts on it.

Open source metrics

To put that another way, there must be some relationship between the market value of futures contracts on a project and the maximum reputation value of the project. So that could be a proxy for a difficult-to-measure concept such as "open source health."

Open source journalism

Hey, tickers to put into stories! Sparklines! All the charts and stuff that finance and sports reporters can build stories around!

Blind code reviews experiment

13 July 2017

Update 18 Dec 2017: The blind-reviews add-on now supports both Bugzilla code reviews and GitHub pull requests. Updated project status. Added a forbidden word.

In case you missed it, here's a study that made the rounds earlier this year: Gender differences and bias in open source: Pull request acceptance of women versus men:

This paper presents the largest study to date on gender bias, where we compare acceptance rates of contributions from men versus women in an open source software community. Surprisingly, our results show that women's contributions tend to be accepted more often than men's. However, women's acceptance rates are higher only when they are not identifiable as women.

A followup, from Alice Marshall, breaks out the differences between acceptance of "insider" and "outsider" contributions.

For outsiders, women coders who use gender-neutral profiles get their changes accepted 2.8% more of the time than men with gender-neutral profiles, but when their gender is obvious, they get their changes accepted 0.8% less of the time.

We decided to borrow the blind auditions concept from symphony orchestras for the open source experiments program.

The experiment, launching this month, will help reviewers who want to try breaking habits of unconscious bias (whether by gender or insider/outsider status) by concealing the name and email address of a code author during a review on Bugzilla. You'll be able to un-hide the information before submitting a review, if you want, in order to add a personal touch, such as welcoming a new contributor.

Built with the WebExtension development work of Tomislav Jovanovic ("zombie" on IRC), and the Bugzilla bugmastering of Emma Humphries. For more info, see the Bugzilla bug discussion.

Data collection

The extension will "cc" one of two special accounts on a bug, to indicate if the review was done partly or fully blind. This lets us measure its impact without having to make back-end changes to Bugzilla.

(Yes, browser add-ons let you experiment with changing a user's experience of a site without changing production web applications or content sites. Bonus link: FilterBubbler.)

Status

The blind-reviews add-on is available for Firefox here: Blind Reviews BMO Experiment.

Forbidden Word

Thing you "can't" say for today: diversity (more info: forbidden words Git hook)

Two approaches to adfraud, and some good news

07 July 2017

Adfraud is a big problem, and we keep seeing two basic approaches to it.

Flight to quality: Run ads only on trustworthy sites. Brands are now playing the fraud game with the "reputation coprocessors" of the audience's brains on the brand's side. (Flight to quality doesn't mean just advertise on the same major media sites as everyone else—it can scale downward with, for example, the Project Wonderful model that lets you choose sites that are "brand safe" for you.)

Increased surveillance: Try to fight adfraud by continuing to play the game of trying to get big-money impressions from the cheapest possible site, but throw more tracking at the problem. Biggest example of this is to move ad money to locked-down mobile platforms and away from the web.

The problem with the second approach is that the audience is no longer on the brand's side. Trying to beat adfraud with technological measures is just challenging hackers to a series of hacking contests. And brands keep losing those. Recent news: The Judy Malware: Possibly the largest malware campaign found on Google Play.

Anyway, I'm interested in and optimistic about the results of the recent Mozilla/Caribou Digital report. It turns out that USA-style adtech is harder to do in countries where users are (1) less accurately tracked and (2) equipped with blockers to avoid bandwidth-sucking third-party ads. That's likely to mean better prospects for ad-supported news and cultural works, not worse. This report points out the good news that the so-called adtech tax is lower in developing countries—so what kind of ad-supported businesses will be enabled by lower "taxes" and "reinvention, not reinsertion" of more magazine-like advertising?

Of course, working in those markets is going to be hard for big US or European ad agencies that are now used to solving problems by throwing creepy tracking at them. But the low rate of adtech taxation sounds like an opportunity for creative local agencies and brands. Maybe the report should have been called something like "The Global South is Shitty-Adtech-Proof, so Brands Built Online There Are Going to Come Eat Your Lunch."

more transactions from a future software market

04 July 2017

Previously: Benkler’s Tripod, transactions from a future software market

Why would you want the added complexity of a market where anyone can take either side of a futures contract on the status of a software bug, and not just offer to pay people to fix bugs like a sensible person? IMHO it's worth trying not just because of the promise of lower transaction costs and more market liquidity (handwave) but because it enables other kinds of transactions. A few more.

Partial work I want a feature, and buy the "unfixed" side of a contract that I expect to lose. A developer decides to fix it, does the work, and posts a pull request that would close the bug. But the maintainer is on vacation, leaving her pull request hanging with a long comment thread. Another developer is willing to take on the political risk of merging the work, and buys out the original developer's position.

Prediction/incentivization With the right market design, a prediction that something won't happen is the same as an incentive to make it happen. If we make an attractive enough way for users to hedge their exposure to lack of innovation, we create a pool of wealth that can be captured by innovators. (Related: dominant assurance contracts)

Bug triage Much valuable work on bugs is in the form of modifying metadata: assigning a bug to the correct subsystem, identifying dependency relationships, cleaning up spam, and moving invalid bugs into a support ticket tracker or forum. This work is hard to reward, and infamously hard to find volunteers for. An active futures market could include both bots that trade bugs probabilistically based on status and activity, and active bug triagers who make small market gains from modifying metadata in a way that makes bugs more likely to be resolved.

Applying proposed principles for content blocking

04 July 2017

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

In 2015, Denelle Dixon at Mozilla wrote Proposed Principles for Content Blocking.

The principles are:

  • Content Neutrality: Content blocking software should focus on addressing potential user needs (such as on performance, security, and privacy) instead of blocking specific types of content (such as advertising).

  • Transparency & Control: The content blocking software should provide users with transparency and meaningful controls over the needs it is attempting to address.

  • Openness: Blocking should maintain a level playing field and should block under the same principles regardless of source of the content. Publishers and other content providers should be given ways to participate in an open Web ecosystem, instead of being placed in a permanent penalty box that closes off the Web to their products and services.

See also Nine Principles of Policing by Sir Robert Peel, who wrote,

[T]he police are the public and that the public are the police, the police being only members of the public who are paid to give full-time attention to duties which are incumbent on every citizen in the interests of community welfare and existence.

Web browser developers have similar responsibilities to those of Peel's ideal police: to build a browser to carry out the user's intent, or, when setting defaults, to understand widely held user norms and implement those, while giving users the affordances to change the defaults if they choose.

The question now is how to apply content blocking principles to today's web environment. Some qualities of today's situation are:

  • Tracking protection often doesn't have to be perfect, because adfraud. The browser can provide some protection, and influence the market in a positive direction, just by getting legit users below the noise floor of fraudbots.

  • Tracking protection has the potential to intensify a fingerprinting arms race that's already going on, by forcing more adtech to rely on fingerprinting in place of third-party cookies.

  • Fraud is bad, but not all anti-fraud is good. Anti-fraud technologies that track users can create the same security risks as other tracking—and enable adtech to keep promising real eyeballs on crappy sites. The "flight to quality" approach to anti-fraud does not share these problems.

  • Adtech and adfraud can peek at Mozilla's homework, but Mozilla can't see theirs. Open source projects must rely on unpredictable users, not unpredictable platform decisions, to create uncertainty.

Which suggests a few tactics—low-risk ways to apply content blocking principles to address today's adtech/adfraud problems.

Empower WebExtensions developers and users. Much of the tracking protection and anti-fingerprinting magic in Firefox is hidden behind preferences. This makes a lot of sense because it enables developers to integrate their work into the browser in parallel with user testing, and enables Tor Browser to do less patching. IMHO this work is also important to enable users to choose their own balance between privacy/security and breaking legacy sites.

Inform and nudge users who express an interest in privacy. Some users care about privacy, but don't have enough information about how protection choices match up with their expectations. If a user cares enough to turn on Do Not Track, change cookie settings, or install an ad blocker, then try suggesting a tracking protection setting or tool. Don't assume that just because a user has installed an ad blocker with deceptive privacy settings that the user would not choose privacy if asked clearly.

Understand and report on adfraud. Adfraud is more than just fake impressions and clicks. New techniques include attribution fraud: taking advantage of tracking to connect a bogus ad impression to a real sale. The complexity of attribution models makes this hard to track down. (Criteo and Steelhouse settled a lawsuit about this before discovery could reveal much.)

A multi-billion-dollar industry is devoted to spreading a story that minimizes adfraud, while independent research hints at a complex and lucrative adfraud scene. Remember how there were two Methbot stories: Methbot got a bogus block of IP addresses, and Methbot circumvented some widely used anti-fraud scripts. The ad networks dealt with the first one pretty quickly, but the second is still a work in progress.

The more that Internet freedom lovers can help marketers understand adfraud, and related problems such as brand-unsafe ad placements, the more that the content blocking story can be about users, legit sites, and brands dealing with problem tracking, and not just privacy nerds against all web business.

transactions from a future software market

04 July 2017

More on the third connection in Benkler’s Tripod, which was pretty general. This is just some notes on more concrete examples of how new kinds of direct connections between markets and peer production might work in the future.

Smart contracts should make it possible to enable these in a trustworthy, mostly decentralized, way.

Feature request I want emoji support on my blog, so I file, or find, a wishlist bug on the open source blog package I use: "Add emoji support." I then offer to enter into a smart contract that will be worthless to me if the bug is fixed on September 1, or give me my money back if the bug is unfixed at that date.

A developer realizes that fixing the bug would be easy, and wants to do it, so takes the other side of the contract. The developer's side will expire worthless if the bug is unfixed, and pay out if the bug is fixed.

"Unfixed" results will probably include bugs that are open, wontfix, invalid, or closed as duplicate of a bug that is still open.

"Fixed" results will include bugs closed as fixed, or any bug closed as a duplicate of a bug that is closed as fixed.

If the developer fixes the bug, and its status changes to fixed, then I lose money on the smart contract but get the feature I want. If the bug status is still unfixed, then I get my money back.
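To make the fixed/unfixed rules concrete, here is a small Python sketch of how a contract like this might settle at maturity. It is a toy model, not a real smart contract. The `Bug` class, the status strings, and the rule that the whole pot goes to the winning side are assumptions for illustration. If the developer stakes nothing, an "unfixed" result just gives me my money back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bug:
    """Hypothetical bug record; status strings are assumed, not a real tracker's."""
    status: str  # "open", "fixed", "wontfix", "invalid", or "duplicate"
    duplicate_of: Optional["Bug"] = None


def outcome(bug: Bug) -> str:
    """Map a bug to "fixed" or "unfixed", following duplicate links."""
    seen = set()
    while bug.status == "duplicate" and bug.duplicate_of is not None:
        if id(bug) in seen:
            return "unfixed"  # duplicate loop: treat as never fixed
        seen.add(id(bug))
        bug = bug.duplicate_of
    return "fixed" if bug.status == "fixed" else "unfixed"


def settle(bug: Bug, user_stake: float, developer_stake: float):
    """Return (user_payout, developer_payout) at the contract date.

    Assumption: the whole pot goes to whichever side turned out right.
    """
    pot = user_stake + developer_stake
    if outcome(bug) == "fixed":
        return (0.0, pot)  # user loses the stake but gets the feature
    return (pot, 0.0)      # user gets the money back; developer side is worthless
```

A bug closed as a duplicate of a fixed bug counts as fixed, and a duplicate of a still-open bug counts as unfixed, matching the two lists above.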

So far this is just one user paying one developer to write a feature. Not especially exciting. There is some interesting market design work to be done here, though. How can the developer signal serious interest in working on the bug, and get enough upside to be meaningful, without taking too much risk in the event the fix is not accepted on time?

Arbitrage I post the same offer, but another user realizes that the blog project can only support emoji if the template package that it depends on supports them. That user becomes an arbitrageur: takes the "fixed" side of my offer, and the "unfixed" side of the "Add emoji support" bug in the template project.

As an end user, I don't have to know the dependency relationship, and the market gives the arbitrageur an incentive to collect information about multiple dependent bugs into the best place to fix them.

Front-running Dudley Do-Right's open source project has a bug in it, users are offering to buy the "unfixed" side of the contract in order to incentivize a fix, and a trader realizes that Dudley would be unlikely to let the bug go unfixed. The trader takes the "fixed" side of the contract before Dudley wakes up. The deal means that the market gets information on the likelihood of the bug being fixed, but the developer doing the work does not profit from it.

This is a "picking up nickels in front of a steamroller" trading strategy. The front-runner is accepting the risk of Dudley burning out, writing a long Medium piece on how open source is full of FAIL, and never fixing a bug again.

Front-running game theory could be interesting. If developers get sufficiently annoyed by front-running, they could delay fixing certain bugs until after the end of the relevant contracts. A credible threat to do this might make front-runners get out of their positions at a loss.

CVE prediction A user of a static analysis tool finds a suspicious pattern in a section of a codebase, but cannot identify a specific vulnerability. The user offers to take one side of a smart contract that will pay off if a vulnerability matching a certain pattern is found. A software maintainer or key user can take the other side of these contracts, to encourage researchers to disclose information and focus attention on specific areas of the codebase.

Security information leakage Ernie and Bert discover a software vulnerability. Bert sells it to foreign spies. Ernie wants to get a piece of the action, too, but doesn't want Bert to know, so he trades on a relevant CVE prediction. Neither Bert nor the foreign spies know who is making the prediction, but the market movement gives white-hat researchers a clue on where the vulnerability can be found.

Open source metrics: Prices and volumes on bug futures could turn out to be a more credible signal of interest in a project than raw activity numbers. It may be worth using a bot to trade on a project you depend on, just to watch the market move. Likewise, new open source metrics could provide useful trading strategies. If sentiment analysis shows that a project is melting down, offer to take the "unfixed" side of the project's long-running bugs? (Of course, this is the same market action that incentivizes fixes, so betting that a project will fail is the same thing as paying them not to. My brain hurts.)

What's an "oracle"?

The "oracle" is the software component that moves information from the bug tracker to the smart contracts system. Every smart contract has to be tied to a given oracle that both sides trust to resolve it fairly.

For CVE prediction, the oracle is responsible for pattern matching on new CVEs, and feeding the info into the smart contract system. As with all of these, CVE prediction contracts are tied to a specific oracle.
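A CVE prediction oracle could be as simple as a pattern matcher sitting between a CVE feed and the contracts. A minimal sketch, assuming both sides agree on a regular expression up front. The class names, the idea of matching on the description text, and the placeholder CVE ID in the usage note are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CVEContract:
    """Hypothetical contract: pays off if a new CVE matches the agreed pattern."""
    pattern: str                       # regular expression both sides accepted
    resolved_by: Optional[str] = None  # ID of the CVE that resolved it, if any


class CVEOracle:
    """Toy oracle: feeds new CVE descriptions to the contracts tied to it."""

    def __init__(self):
        self.contracts = []

    def register(self, contract: CVEContract) -> None:
        self.contracts.append(contract)

    def publish(self, cve_id: str, description: str):
        """Process one new CVE and return the contracts it resolves."""
        hits = []
        for contract in self.contracts:
            if contract.resolved_by is not None:
                continue  # already paid off, ignore later CVEs
            if re.search(contract.pattern, description, re.IGNORECASE):
                contract.resolved_by = cve_id
                hits.append(contract)
        return hits
```

For example, registering `CVEContract(r"buffer overflow.*image parser")` and then calling `publish("CVE-0000-0000", "Heap buffer overflow in the image parser")` with that placeholder ID would resolve the contract. Both sides still have to trust this specific oracle to run the match honestly.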

Bots

Bots might have several roles.

  • Move investments out of duplicate bugs. (Take a "fixed" position in the original and an "unfixed" position in the duplicate, or vice versa.)

  • Make small investments in bugs that appear valid based on project history and interactions by trusted users.

  • Track activity across projects and social sites to identify qualified bug fixers who are unlikely to fix a bug within the time frame of a contract, and take "unfixed" positions on bugs relevant to them.

  • For companies: when a bug is mentioned in an internal customer support ticketing system, buy "unfixed" on that bug. Map confidential customer needs to possible fixers.

Software: annoying speech or crappy product?

03 July 2017

Zeynep Tufekci, in the New York Times:

Since most software is sold with an “as is” license, meaning the company is not legally liable for any issues with it even on day one, it has not made much sense to spend the extra money and time required to make software more secure quickly.

The software business is still stuck on the kind of licensing that might have made sense in the 8-bit micro days, when "personal computer productivity" was more aspirational than a real thing, and software licenses were printed on the backs of floppy sleeves.

Today, software is part of products that do real stuff, and it makes zero sense to ship a real product, that people's safety or security depends on, with the fine print "WE RESERVE THE RIGHT TO TOTALLY HALF-ASS OUR JOBS" or in business-speak, "SELLER DISCLAIMS THE IMPLIED WARRANTY OF MERCHANTABILITY."

But what about open source and collaboration and science, and all that stuff? Software can be both "product" and "speech". Should there be a warranty on speech? If I dig up my shell script for re-running the make command when a source file changes, and put it on the Internet, should I be putting a warranty on it?

It seems that there are two kinds of software: some is more product-like, and should have a grown-up warranty on it like a real business. And some software is more speech-like, and should have ethical requirements like a scientific paper, but not a product-like warranty.

What's the dividing line? Some ideas.

"productware is shipped as executables, freespeechware is shipped as source code" Not going to work for elevator_controller.php or a home router security tool written in JavaScript.

"productware is preinstalled, freespeechware is downloaded separately" That doesn't make sense when even implanted defibrillators can update over the net.

"productware is proprietary, freespeechware is open source" Companies could put all the fragile stuff in open source components, then use the DMCA and CFAA to enable them to treat the whole compilation as proprietary.

Software companies are built to be good at getting around rules. If a company can earn all its money in faraway Dutch Sandwich Land and be conveniently too broke to pay the IRS in the USA, then it's going to be hard to make it grow up licensing-wise without hurting other people first.

How about splitting out the legal advantages that the government offers to software and extending some to productware, others to freespeechware?

Freespeechware licenses:

  • license may disclaim implied warranty

  • no anti-reverse-engineering clause in a freespeechware license is enforceable

  • freespeechware is not a "technological protection measure" under section 1201 of Title 17 of the United States Code (DMCA anticircumvention)

  • exploiting a flaw in freespeechware is never a violation of the Computer Fraud and Abuse Act

  • If the license allows it, a vendor may sell freespeechware, or a derivative work of it, as productware. (This could be as simple as following the "You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee" term of the GPL.)

Productware licenses:

  • license may not disclaim implied warranty

  • licensor and licensee may agree to limit reverse engineering rights

  • DMCA and CFAA apply (reformed of course, but that's another story)

It seems to me that there needs to be some kind of quid pro quo here. If a company that sells software wants to use government-granted legal powers to control its work, that has to be conditioned on not using those powers just to protect irresponsible releases.

Fun with dlvr.it

23 June 2017

Check it out—I'm "on Facebook" again. Just fixed my gateway through dlvr.it. If you're reading this on Facebook, that's why.

Dlvr.it is a nifty service that will post to social sites from an RSS feed. If you don't run your own linklog feed, the good news is that Pocket will generate RSS feeds from the articles you save, so if you want to share links with people still on Facebook, the combination of Pocket and dlvr.it makes that easy to do without actually spending human eyeball time there. My linklog feed is mirrored from my own weird feedreader, which works for me but is not really ready for other users.

There's a story about Thomas Nelson, Jr., leader of the Virginia Militia in the Revolutionary War.

During the siege and battle Nelson led the Virginia Militia whom he had personally organized and supplied with his own funds. Legend had it that Nelson ordered his artillery to direct their fire on his own house which was occupied by Cornwallis, offering five guineas to the first man who hit the house.

Would Facebook's owners do the same, now that we know that foreign interests use Facebook to subvert America? Probably not. The Nelson story is just an unconfirmed patriotic anecdote, and we can't expect that kind of thing from today's post-patriotic investor class. Anyway, just seeing if I can move Facebook's bots/eyeballs ratio up a little.

Stuff I'm thankful for

22 June 2017

I'm thankful that the sewing machine was invented a long time ago, not today. If the sewing machine were invented today, most sewing tutorials would be twice as long, because all the thread would come in proprietary cartridges, and you would usually have to hack the cartridge to get the type of thread you need in a cartridge that works with your machine.

1. Write open source. 2. ??? 3. PROFIT

22 June 2017

Studies keep showing that open source developers get paid more than people who develop software but do not contribute to open source.

Good recent piece: Tabs, spaces and your salary - how is it really? by Evelina Gabasova.

But why?

Is open source participation a way to signal that you have skills and are capable of cooperation with others?

Is open source a way to build connections and social capital so that you have more awareness of new job openings and can more easily move to a higher-paid position?

Does open source participation just increase your skills so that you do better work and get paid more for it?

Are open source codebases a complementary good to open source maintenance programming, so that a lower price for access to the codebase tends to drive up the price for maintenance programming labor?

Is "we hire open source people" just an excuse for bias, since the open source scene at least in the USA is less diverse than the general pool of programming job applicants?

Catching up to Safari?

21 June 2017

Earlier this month, Apple Safari pulled ahead of other mainstream browsers in tracking protection. Tracking protection in the browser is no longer a question of whether the browser should do it, but of which browser best protects its users. But Apple's early lead doesn't mean that another browser can't catch up.

Tracking protection is still hard. You have to provide good protection from third-party tracking, which users generally don't want, without breaking legit third-party services such as content delivery networks, single sign-on systems, and shopping carts. Protection is a balance, similar to the problem of filtering spam while delivering legit mail. Just as spam filtering helps enable legit email marketing, tracking protection tends to enable legit advertising that supports journalism and cultural works.

In the long run, just as we have seen with spam filters, it will be more important to make protection hard to predict than to run the perfect protection out of the box. As Sun Tzu put it, "Do not repeat the tactics which have gained you one victory, but let your methods be regulated by the infinite variety of circumstances." A spam filter, or browser, that always does the same thing will be analyzed and worked around. A mail service that changes policies to respond to current spam runs, or an unpredictable ecosystem of tracking protection add-ons that browser users can install in unpredictable combinations, is likely to be harder to work around.

But most users aren't in the habit of installing add-ons, so browsers will probably have to give them a nudge, like Microsoft Windows does when it nags the user to pick an antivirus package (or did, last time I checked). So the decentralized way to catch up to Apple could end up being something like:

  • When new tracking protection methods show up in the privacy literature, quietly build the needed browser add-on APIs to make it possible for new add-ons to implement them.

  • Do user research to guide the content and timing of nudges. (Some atypical users prefer to be tracked, and should be offered a chance to silence the warnings by affirmatively choosing a do-nothing protection option.)

  • Help users share information about the pros and cons of different tools. If a tool saves lots of bandwidth and battery life but breaks some site's comment form, help the user make the right choice.

  • Sponsor innovation challenges to incentivize development, testing, and promotion of diverse tracking protection tools.

Any surveillance marketer can install and test a copy of Safari, but working around an explosion of tracking protection tools would be harder. How to set priorities when they don't know which tools will get popular?

What about adfraud?

Tracking protection strategies have to take adfraud into account. Marketers have two choices for how to deal with adfraud:

  • flight to quality

  • extra surveillance

Flight to quality is better in the long run. But it's a problem from the point of view of adtech intermediaries because it moves more ad money to high-reputation sites, and the whole point of adtech is to reach big-money eyeballs on cheap sites. Adtech firms would rather see surveillance-heavy responses to adfraud. One way to help shift marketing budgets away from surveillance, and toward flight to quality, is to make the returns on surveillance investments less predictable.

This is possible to do without making value judgments about certain kinds of sites. If you like a site enough to let it see your personal info, you should be able to do it, even if in my humble opinion it's a crappy site. But you can have this option without extending to all crappy sites the confidence that they'll be able to live on leaked data from unaware users.

Apple's kangaroo cookie robot

11 June 2017

I'm looking forward to trying "Intelligent Tracking Prevention" in Apple Safari. But first, let's watch an old TV commercial for MSN.

Today, a spam filter seems like a must-have feature for any email service. But MSN started talking about its spam filtering back when Sanford Wallace, the "Spam King," was saying stuff like this.

I have to admit that some people hate me, but I have to tell you something about hate. If sending an electronic advertisement through email warrants hate, then my answer to those people is "Get a life. Don't hate somebody for sending an advertisement through email." There are people out there that also like us.

According to spammers, spam filtering was just Internet nerds complaining about something that regular users actually like. But the spam debate ended when big online services, starting with MSN, started talking about how they build for their real users instead of for Wallace's hypothetical spam-loving users.

If you missed the email spam debate, don't worry. Wallace's talking points about spam filters constantly get recycled by surveillance marketers talking about tracking protection. But now it's not email spam that users supposedly crave. Today, the Interactive Advertising Bureau tells us that users want ads that "follow them around" from site to site.

Enough background. Just as the email spam debate ended with MSN's campaign, the third-party web tracking debate ended on June 5, 2017.

With Intelligent Tracking Prevention, WebKit strikes a balance between user privacy and websites’ need for on-device storage. That said, we are aware that this feature may create challenges for legitimate website storage, i.e. storage not intended for cross-site tracking.

If you need it in bullet points, here it is.

  • Nifty machine learning technology is coming in on the user's side.

  • "Legitimate" uses do not include cross-site tracking.

  • Safari's protection is automatic and client-side, so no blocklist politics.

Surveillance marketers come up with all kinds of hypothetical reasons why users might prefer targeted ads. But in the real world, Apple invests time and effort to understand user experience. When Apple communicates about a feature, it's because that feature is likely to keep a user satisfied enough to buy more Apple devices. We can't read their confidential user research, but we can see what the company learned from it based on how they communicate about products.

(Imagine for a minute that Apple's user research had found that real live users are more like the Interactive Advertising Bureau's idea of a user. We might see announcements more like "Safari automatically shares your health and financial information with brands you love!" Anybody got one of those to share?)

Saving an out-of-touch ad industry

Advertising supports journalism and cultural works that would not otherwise exist. It's too important not to save. Bob Hoffman asks,

[H]ow can we encourage an acceptable version of online advertising that will allow us to enjoy the things we like about the web without the insufferable annoyance of the current online ad model?

The browser has to be part of the answer. If the browser does its job, as Safari is doing, it can play a vital role in re-connecting users with legit advertising—just as users have come to trust legit email newsletters now that they have effective spam filters.

Safari's Intelligent Tracking Prevention is not the final answer any more than Paul Graham's "A plan for spam" was the final spam filter. Adtech will evade protection tools just as spammers did, and protection will have to keep getting better. But at least now we can finally say debate over, game on.

With New Browser Tech, Apple Preserves Privacy and Google Preserves Trackers

An Ad Network That Works With Fake News Sites Just Launched An Anti–Fake News Initiative

Google Slammed For Blocking Ads While Allowing User Tracking

Introducing FilterBubbler: A WebExtension built using React/Redux

Forget far-right populism – crypto-anarchists are the new masters

Risks to brands under new EU regulations

Breitbart ads plummet nearly 90 percent in three months as Trump’s troubles mount

Be Careful Celebrating Google’s New Ad Blocker. Here’s What’s Really Going On.

‘We know the industry is a mess’: Marketers share challenges at Digiday Programmatic Marketing Summit

FIREBALL – The Chinese Malware of 250 Million Computers Infected

Verified bot laundering 2. Not funny. Just die

Publisher reliance on tech providers is ‘insane’: A Digiday+ town hall with The Washington Post’s Jarrod Dicker

Why pseudonymization is not the silver bullet for GDPR.

A level playing field for companies and consumers

Apple user research revealed, sort of

06 June 2017

This is not normally the blog to come to for Apple fan posts (my ThinkPad, desktop Linux, cold dead hands, and so on) but really good work here on "Intelligent Tracking Prevention" in Apple Safari.

Looks like the spawn of Privacy Badger and cookie double-keying, designed to balance user protection from surveillance marketing with minimal breakage of sites that depend on third-party resources.
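For anyone who hasn't run into cookie double-keying, here's a toy sketch of the idea. This is not how Safari implements it, and the class and domain names are made up. The point is just that a third party's cookie is stored per first-party site, so the same tracker sees a different, unlinkable cookie on each site where it is embedded.

```python
class DoubleKeyedCookieJar:
    """Toy cookie jar keyed by (first-party site, third-party domain, name)."""

    def __init__(self):
        self._jar = {}

    def set(self, first_party, third_party, name, value):
        # The first-party site is part of the key, so a cookie set while
        # visiting one site is invisible while visiting another.
        self._jar[(first_party, third_party, name)] = value

    def get(self, first_party, third_party, name):
        return self._jar.get((first_party, third_party, name))


jar = DoubleKeyedCookieJar()
jar.set("news.example", "tracker.example", "id", "abc123")
same_site = jar.get("news.example", "tracker.example", "id")
other_site = jar.get("shop.example", "tracker.example", "id")
```

Here `same_site` comes back as the stored value, while `other_site` is `None`: the tracker can't use its cookie to connect the visit to the news site with the visit to the shop.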

(Now all the webmasters will fix stuff to make it work with Intelligent Tracking Prevention, which makes it easier for other browsers and privacy tools to justify their own features to protect users. Of course, now the surveillance marketers will rely more on passive fingerprinting, and Apple has an advantage there because there are fewer different Safari-capable devices. But browsers need to fix fingerprinting anyway.)

Apple does massive amounts of user research and it's fun to watch the results leak through when they communicate about features. Looks like they have found that users care about being "followed" from site to site by ads, and that users are still pretty good at applied behavioral economics. The side effect of tracking protection, of course, is that it takes high-reputation sites out of competition with the bottom-feeders to reach their own audiences, so Intelligent Tracking Prevention is great news for publishers too.

Meanwhile, I don't get Google's weak "filter" thing. Looks like a transparently publisher-hostile move (since it blocks some potentially big-money ads without addressing the problem of site commodification), unless I'm missing something.

The third connection in Benkler's Tripod

31 May 2017

Here's a classic article by Yochai Benkler: Coase's Penguin, or Linux and the Nature of the Firm.

Benkler builds on the work of Ronald Coase, whose The Nature of the Firm explains how transaction costs affect when companies can be more efficient ways to organize work than markets. Benkler adds a third organizational model, peer production. Peer production, commonly seen in open source projects, is good at matching creative people to rewarding problems.

As peer production relies on opening up access to resources for a relatively unbounded set of agents, freeing them to define and pursue an unbounded set of projects that are the best outcome of combining a particular individual or set of individuals with a particular set of resources, this open set of agents is likely to be more productive than the same set could have been if divided into bounded sets in firms.

Firms, markets, and peer production all have their advantages, and in the real world, most productive activity is mixed.

  • Managers in firms manage some production directly and trade in markets for other production. This connection in the firms/markets/peer production tripod is as old as firms.

  • The open source software business is the second connection. Managers in firms both manage software production directly and sponsor peer production projects, or manage employees who participate in projects.

But what about the third possible connection between legs of the tripod? Is it possible to make a direct connection between peer production and markets, one that doesn't go through firms? And why would you want to connect peer production directly to markets in the first place? Not just because that's where the money is, but because markets are a good tool for getting information out of people, and projects need information. (Save the whole Kooths et al. paper to read later. It's the best case against open source that I know of, with all the points that a serious open source proponent needs to be able to address.) Stefan Kooths, Markus Langenfurth, and Nadine Kalwey wrote, in "Open-Source Software: An Economic Assessment" (PDF),

Developers lack key information due to the absence of pricing in open-source software. They do not have information concerning customers’ willingness to pay (= actual preferences), based on which production decisions would be made in the market process. Because of the absence of this information, supply does not automatically develop in line with the needs of the users, which may manifest itself as oversupply (excessive supply) or undersupply (excessive demand). Furthermore, the functional deficits in the software market also work their way up to the upstream factor markets (in particular, the labor market for developers) and–depending on the financing model of the open-source software development–to the downstream or parallel complementary markets (e.g., service markets) as well.

Because the open-source model at its core deliberately rejects the use of the market as a coordination mechanism and prevents the formation of price information, the above market functions cannot be satisfied by the open-source model. This results in a systematic disadvantage in the provision of software in the open-source model as compared to the proprietary production process.

The workaround is to connect peer production to markets by way of firms. But the more that connections between markets and peer production projects have to go through firms, the more chances to lose information. That's not because firms are necessarily dysfunctional (although most are, in different ways). A firm might rationally choose to pay for the implementation of a feature that they predict will get 100 new users, paying $5000 each, instead of a feature that adds $1000 of value for 1000 existing users, but whose absence won't stop them from renewing.
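To put numbers on that example, here's the arithmetic as a quick sketch. The only thing added is the assumption, implied above, that the second feature brings the firm no new revenue because the existing users would renew either way.

```python
# Feature A: attracts new paying users.
new_users, price_per_new_user = 100, 5000
# Feature B: adds value for existing users who will renew regardless.
existing_users, value_per_existing_user = 1000, 1000

firm_revenue_from_a = new_users * price_per_new_user           # what the firm sees
user_value_from_b = existing_users * value_per_existing_user   # what users would gain
firm_revenue_from_b = 0  # assumption: renewals do not change

print("Feature A, new revenue for the firm:", firm_revenue_from_a)
print("Feature B, value to existing users:", user_value_from_b)
print("Feature B, new revenue for the firm:", firm_revenue_from_b)
```

Feature B is worth twice as much in total, but the firm's own books only show Feature A, so the information about what users actually value gets lost on the way through the firm.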

Some ways to connect peer production to markets are already working. Crowdfunding for software projects and Patreon are furthest along, both offering support for developers who have already built a reputation.

A decentralized form of connection is Tokens, which Balaji S. Srinivasan describes as a tradeable version of API keys. If I believe that your network service will be useful to me in the future, I can pre-buy access to it. If I think your service will really catch on, I can buy a bunch of extra tokens and sell them later, without needing to involve you. (And if your service needs network effects, now I have an incentive to promote it, so that there will be a seller's market for the tokens I hold.)

Dominant assurance contracts, by Alexander Tabarrok, build on the crowdfunding model, with the extra twist that the person proposing the project has to put up some seed money that is divided among backers if the project fails to secure funding. This is supposed to bring in extra investment early on, before a project looks likely to meet its goal.
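Here's a rough sketch of how such a contract might settle. The function name and the equal split of the seed money among backers are my assumptions for illustration (the split could just as well be proportional to each pledge).

```python
def settle(goal: float, seed: float,
           pledges: dict[str, float]) -> tuple[bool, dict[str, float]]:
    """Settle a dominant assurance contract.

    Returns (funded, payouts), where payouts maps each backer to the
    amount they get back. If the goal is met, pledges go to the project
    and backers get nothing back. If not, each backer gets their pledge
    refunded plus an equal share of the proposer's seed money.
    """
    total = sum(pledges.values())
    if total >= goal:
        return True, {backer: 0.0 for backer in pledges}
    bonus = seed / len(pledges) if pledges else 0.0
    return False, {backer: amount + bonus
                   for backer, amount in pledges.items()}
```

With a goal of 1000, seed money of 100, and pledges of 300 and 200, the project fails and the two backers get back 350 and 250. That bonus for backing a failed project is what is supposed to make pledging early the better move.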

Tom W. Bell's "SPEX", in Prediction Markets for Promoting the Progress of Sciences and the Useful Arts, is a proposed market to facilitate transactions in a variety of prediction certificates, each one of which promises to pay its bearer in the event that an associated claim about science, technology, or public policy comes true. The SPEX looks promising as a way for investors to hedge their exposure to lack of innovation. If you own data centers and need energy, take a short position in SPEX contracts on cold fusion. (Or, more likely, buy into a SPEX fund that invests for your industry.) The SPEX looks like a way to connect the market to more difficult problems than the kinds of incremental innovation that tend to be funded through the VC system.

What happens when the software industry is forced to grow up?

I'm starting to think that finishing the tripod, with better links from markets to peer production, is going to matter a lot more soon, because of the software quality problem.

Today's software, both proprietary and open source, is distributed under ¯\_(ツ)_/¯ terms. "Disclaimer of implied warranty of merchantability" is lawyer-speak for "we reserve the right to half-ass our jobs lol." As Zeynep Tufekci wrote in the New York Times, "The World Is Getting Hacked. Why Don’t We Do More to Stop It?" At some point the users are going to get fed up, and we're going to have to. An industry as large and wealthy as software, still sticking to Homebrew Computer Club-era disclaimers, is like a 40-something-year-old startup bro doing crimes and claiming that they're just boyish hijinks. This whole disclaimer of implied warranty thing is making us look stupid, people. (No, I'm not for warranties on software that counts as a scientific or technical communication, or on bona fide collaborative development, but on a product product? Come on.)

Grown-up software liability policy is coming, but we're not ready for it. Quality software is not just a technically hard problem. Today, we're set up to move fast, break things, and ship dancing pigs—with incentives more powerful than incentives to build secure software. Yes, you get the occasional DARPA initiative or tool to facilitate incremental cleanup, but most software is incentivized through too many layers of principal-agent problems. Everything is broken.

If governments try to fix software liability before the software scene can fix the incentives problem, then we will end up with a stifled, slowed-down software scene, a few incumbent software companies living on regulatory capture, and probably not much real security benefit for users. But what if users (directly or through their insurance companies) are willing to pay to avoid the costs of broken software, in markets, and open source developers are willing to participate in peer production to make quality software, but software firms are not set up to connect them?

What if there is another way to connect the "I would rather pay a little more and not get h@x0r3d!" demand to the "I would code that right and release it in open source, if someone would pay for it" supply?

User tracking as Chesterton's Fence

30 May 2017

G.K. Chesterton once wrote:

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”

Bob Hoffman makes a good case for getting rid of user tracking in web advertising. But in order to take the next steps, and not just talk among ourselves about things that would be really great in the future, we first need to think about the needs that tracking seems to satisfy for legit marketers.

What I'm not going to do is pull out the argument that's in every first comment on every blog post that criticizes tracking: that "adtech" is just technology and is somehow value-neutral. Tracking, like all technologies, enables some kinds of activity better than others. When tracking offers marketers the opportunity to reach users based on who the user is rather than on what they're reading, watching, or listening to, then that means:

But if tracking is so bad, then why, when you go to any message board or Q&A site that discusses marketing for small businesses, is everyone discussing those nasty, potentially civilization-extinguishing targeted ads? Why is nobody popping up with a question on how to make the next They Laughed When I Sat Down At the Piano?

  • Targeted ads are self-serve and easy to get started with. If you have never bought a Twitter or Facebook ad, get out your credit card and start a stopwatch. These ads might be crappy, but they have the lowest time investment of any legit marketing project, so they're probably the only marketing project that time-crunched startups can do.

  • Targeted ads keep your OODA loop tight. Yes, running targeted ads can be addictive: if you thought the attention slot machine game on social sites was bad, try the advertiser dashboard. But you're able to use them to learn information that can help with the rest of marketing. If you have the budget to exhibit at one conference, compare Twitter ads targeted to attendees of conference A with ads targeted to attendees of conference B, and you're closer to an answer.

  • Marketing has two jobs: sell stuff to customers and sell Marketing to management. Targeting is great for the second one, since it comes with the numbers that will help you take credit for results.

We're not going to be able to get rid of risky tracking until we can understand the needs that it fills, not just for big advertisers who can afford the time and money to show up in Cannes every year, but for the company founder who still has $1.99 business cards and is doing all of Marketing themselves.

(The party line among web privacy people can't just be that GDPR is going to save us because the French powers that be are all emmerdés ever since the surveillance/shitlord complex tried to run a US-style game on their political system. That might sound nice, but put not your trust in princes, man. Even the most arrogant Eurocrats in the world will not be able to regulate indefinitely against all the legit business people in their countries complaining that they can't do something they see as essential. GDPR will be temporary air cover for building an alternative, not a fix in itself.)

Post-creepy web advertising is still missing some key features.

  • Branding and signaling metrics. We know the hard math works out against tracking and targeting, and we know about the failure of targeted media to build brands in the long run, but we don't have good numbers that are usable day to day. The "customer journey" has nice graphs, but brand equity doesn't.

  • Quick, low-risk service. With the exception of the Project Wonderful model, targeted ads are quick and low-risk, while signal-carrying ads are the opposite. A high-overhead direct ad sales process is not a drop-in replacement for an easy web form.

I don't think that's all of them. But I don't think that the move to post-creepy web advertising is going to be a rush, all at once, either. Brands that have fly-by-night low-reputation competitors, brands that already have many tracking-protected customers, and brands with solid email lists are going to be able to move faster than marketers who are still making tracking work. More: Work together to fix web ads? Let's not.

sudo dnf install mosh

28 May 2017

I'm still two steps behind in devops coolness for my network stuff. I don't even have proper configuration management, and that's fine because Configuration Management is an Anti-pattern now. Anyway, I still log in and actually run shell commands on the server, and the LWN review of mosh was helpful to me. Now using mosh for connections that persist across suspending the laptop and moving it from network to network. More info: Mosh: the mobile shell

free riding on open source

Here's a good Twitter thread on open source projects and "free rider" companies. As far as I can tell, companies can pay for open source in three ways.

  • do software development

  • pay people to do software development

  • write a long Medium post apologizing to your users for failing

end date for IP Maximalism

When did serious "Intellectual Property Maximalism" end? I'm going to put it at September 18, 2006, which is the date that the Gates Foundation announced funding for the Public Library of Science's journal PLoS Neglected Tropical Diseases. When it's a serious matter of people's health, open access matters, even to the author of "Open Letter to Hobbyists". Since then, IP Maximalism stories have been mostly about rent-seeking behavior, which had been a big part of the freedom lovers' point all along. (Nobody quoted in this story is pearl-clutching about "innovation", for example: Supreme Court ruling threatens to shut down cottage industry for small East Texas town.)

random stuff

Just Keep Scrolling! How To Design Lengthy, Lengthy Pages is "sponsored content" but it's really good sponsored content.

The marketplace of ideas is now struggling with the increasing incidence of algorithmic manipulation and disinformation campaigns. There are bots. Look around.

(In other news, Facebook is still evil, but you probably knew that by now: Why Facebook's Authentication Model is Inadequate, Does Facebook Make Us Unhappy and Unhealthy?)

Some questions on a screenshot

27 May 2017

Here's a screenshot of an editorial from Der Spiegel, with Ghostery turned on.

article from Der Spiegel

Is it just me, or does it look to anyone else like the man in the photo is checking the list of third-party web trackers on the site to see who he can send a National Security Letter to?

Could a US president who is untrustworthy enough to be removed from office possibly be trustworthy enough to comply with his side of a "Privacy Shield" agreement?

If it's necessary for the rest of the world to free itself of its dependence on the U.S., does that apply to US-based Internet companies that have become a bottleneck for news site ad revenue, and how is that going to work?


What happened to Twitter? We can't look away...

19 May 2017

Hey, everybody, check it out.

Here's a Twitter ad.

some dumb Twitter ad

If you're "verified" on Twitter, you probably miss these, so I'll just use my Fair Use rights to share that one with you.

You're welcome.

Twitter is a uniquely influential medium, one that shows up on the TV news every night and on news sites all day. But somehow, the plan to make money from Twitter is to run the same kind of targeted ads that anyone with a WordPress site can. And the latest Twitter news is a privacy update that includes, among other things, more tracking of users from one site to another. Yes, the same kind of thing that Facebook already does, and better, with more users. And the same kind of thing that any web site can already get from an entire Lumascape of companies. Boring.

If you want to stick this kind of ad on your WordPress site, you just have to cut and paste some ad network HTML—not build out a deluxe office space on Market Street in San Francisco the way Twitter has. But the result is about the same.

What makes Twitter even more facepalm-worthy is that they make a point of not showing the ads to the influential people who draw attention to Twitter to start with. It's like they're posting a big sign that says STUPID AD ZONE: UNIMPORTANT PEOPLE ONLY. Twitter is building something unique, but they're selling generic impressions that advertisers can get anywhere. So as far as I can tell, the Twitter business model is something like:

Money out: build something unique and expensive.

Money in: sell the most generic and shitty thing in the world.

Facebook can make this work because they have insane numbers of eyeball-minutes. Chump change per minute on Facebook still adds up to real money. But Facebook is an outlier on raw eyeball-minutes, and there aren't enough minutes in the day for another. So Twitter is on track to get sold for $500,000, like Digg was. Which is good news for me because I know enough Twitter users that I can get that kind of money together.

So why should you help me buy Twitter when you could just get the $500,000 yourself? Because I have a secret plan, of course. Twitter is the site that everyone is talking about, right? So run the ads that people will talk about. Here's the plan.

Sell one ad per day. And everybody sees the same one.

Sort of like the back cover of the magazine that everybody in the world reads (but there is no such magazine, so that's why this is an opportunity). No more need to excuse the verified users from the ads. Yes, an advertiser will have to provide a variety of sizes and localizations for each ad (and yes, Twitter will have to check that the translations match). But it's the same essential ad, shown to every Twitter user in the world for 24 hours.

No point trying to out-Facebook Facebook or out-Lumascape the Lumascape. Targeted ads are weak on signal, and a bunch of other companies are doing them more cost-effectively and at higher volume, anyway.

Of course, this is not for everybody. It's for brands that want to use a memorable, creative ad to try for the same kind of global signal boost that a good Tweet® can get. But if you want generic targeted ads you can get those everywhere else on the Internet. Where else can you get signal? In order to beat current Twitter revenue, the One Twitter Ad needs to go for about the same price as a Super Bowl commercial. But if Twitter stays influential, that's reasonable, and I make back the 500 grand and a lot more.

Understanding the limitations of data pollution tools

02 May 2017

Jeremy Gillula and Yomna Nasser write, on the EFF blog,

Internet users have been asking what they can do to protect their own data from this creepy, non-consensual tracking by Internet providers—for example, directing their Internet traffic through a VPN or Tor. One idea to combat this that’s recently gotten a lot of traction among privacy-conscious users is data pollution tools: software that fills your browsing history with visits to random websites in order to add “noise” to the browsing data that your Internet provider is collecting.

...

[T]here are currently too many limitations and too many unknowns to be able to confirm that data pollution is an effective strategy at protecting one’s privacy. We’d love to eventually be proven wrong, but for now, we simply cannot recommend these tools as an effective method for protecting your privacy.

This is one of those "two problems one solution" situations.

  • The problem for makers and users of "data pollution" or spoofing tools is QA. How do you know that your tool is working? Or are surveillance marketers just filtering out the impressions created by the tool, on the server side?

  • The problem for companies using so-called Non-Human Traffic (NHT) is that when users discover NHT software (bots), the users tend to remove it. What would make users choose to participate in NHT schemes so that the NHT software can run for longer and build up more valuable profiles?

So what if the makers of spoofing tools could get a live QA metric, and NHT software maintainers could give users an incentive to install and use their software?

NHT market as a tool for discovering information

Imagine a spoofing tool that offers an easy way to buy bot pageviews, I mean buy Perfectly Legitimate Data on how fast a site loads from various home Internet connections. When the tool connects to its server for an update, it gets a list of URLs to visit—a mix of random sites, popular sites, and paying customers.

Now the spoofing tool maintainer will be able to tell right away if the tool is really generating realistic traffic, by looking at the market price of pageviews. The maintainer will even be able to tell whose tracking the tool can beat, by looking at which third-party resources are included on the pages getting paid-for traffic.

The money probably won't be significant, since real web ad money is moving to whitelisted, legit sites and away from fraud-susceptible schemes anyway, but in the meantime it's a way to measure effectiveness.

NPM without sudo

22 April 2017

Setting up a couple of Linux systems to work with FilterBubbler, which is one of the things that I'm up to at work now. FilterBubbler is a WebExtension, and the setup instructions use web-ext, so I need NPM. In order to keep all the NPM stuff under my own home directory, but still put the web-ext tool on my $PATH, I need to make one-line edits to three files.

One line in ~/.npmrc

prefix = ~/.npm

One line in ~/.gitignore

.npm/

One line in ~/.bashrc

export PATH="$PATH:$HOME/.npm/bin"

(My ~/.bashrc has a bunch of export PATH= lines so that when I add or remove one it's more likely to get a clean merge. Because home directory in git.) I think that's it. Now I can do

npm install --global web-ext

with no sudo or mess. And when I clone my home directory on another system it will just work.
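A quick landmark check for the setup above, as a rough sketch. The helper name path_has_dir is made up for illustration and is not part of npm; the directory is the one from the ~/.bashrc line.

```shell
#!/bin/bash
# Sketch only: path_has_dir is a made-up helper name, not part of npm.
# Succeeds if directory $1 is one of the colon-separated entries in $PATH.
path_has_dir() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run this in a new shell after editing ~/.bashrc.
if path_has_dir "$HOME/.npm/bin"; then
    echo "npm bin directory is on PATH"
else
    echo "npm bin directory is missing from PATH" >&2
fi
```

If the second message shows up, the ~/.bashrc edit probably has not been picked up by the current shell yet.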

Based on: HowTo: npm global install without root privileges by Johannes Klose

Traffic sourcing web obfuscator?

15 April 2017

(This is an answer to a question on Twitter. Twitter is the new blog comments (for now) and I'm more likely to see comments there than to have time to set up and moderate comments here.)

Adfraud is an easy way to make mad cash, adtech is happily supporting it, and it all works because the system has enough layers between CMO and fraud hacker that everybody can stay as clean as they need to. Users bear the privacy risks of adfraud, legit publishers pay for it, and adtech makes more money from adfraud than fraud hackers do. Adtech doesn't have to communicate or coordinate with adfraud, just set up a fraud-friendly system and let the actual fraud hackers go to work. Bad for users, people who make legit sites, and civilization in general.

But one piece of good news is that adfraud can change quickly. Adfraud hackers don't have time to get stuck in conventional ways of doing things, because adfraud is so lucrative that the high-skill players don't have to stay in it for very long. The adfraud hackers who were most active last fall have retired to run their resorts or recording studios or wineries or whatever.

So how can privacy tools get a piece of the action?

One random idea is for an obfuscation tool to participate in the market for so-called sourced traffic. Fraud hackers need real-looking traffic and are willing to pay for it. Supplying that traffic is sketchy but legal. Which is perfect, because put one more layer on top of it and it's not even sketchy.

And who needs to know if they're doing a good job at generating real-looking traffic? Obfuscation tool maintainers. Even if you write a great obfuscation tool, you never really know if your tricks for helping users beat surveillance are actually working, or if your tool's traffic is getting quietly identified on the server side.

In proposed new privacy tool model, outsourced QA pays YOU!

Set up a market where a Perfectly Legitimate Site that is looking for sourced traffic can go to buy pageviews, I mean buy Perfectly Legitimate Data on how fast a site loads from various home Internet connections. When the obfuscation tool connects to its server for an update, it gets a list of URLs to visit—a mix of random, popular sites and paying customers.

Set a minimum price for pageviews that's high enough to make it cost-ineffective for DDoS. Don't allow it to be used on random sites, only those that the buyer controls. Make them put a secret in an unlinked-to URL or something. And if an obfuscation tool isn't well enough sandboxed to visit a site that's doing traffic sourcing, it isn't well enough sandboxed to surf the web unsupervised at all.

Now the obfuscation tool maintainer will be able to tell right away if the tool is really generating realistic traffic, by looking at the market price. The maintainer will even be able to tell whose tracking the tool can beat, by looking at which third-party resources are included on the pages getting paid-for traffic. And the whole thing can be done by stringing together stuff that IAB members are already doing, so they would look foolish to complain about it.

Interesting stuff on the Internet

13 April 2017

Just some mindless link propagation while I tweak the links on my blog to the right shade of blue.

Good news: Portugal Pushes Law To Partially Ban DRM, Allow Circumvention

Study finds Pokémon Go players are happier and The More You Use Facebook, the Worse You Feel. Get your phone charged up, get off Facebook, and get out there.

If corporations are people, you wouldn't be mean to a person, would you? Managing for the Long Term

Yay, surprise presents for Future Me! Why Kickstarter Decided To Radically Transform Its Business Model

Skateboarding obviously doesn't cause hip fractures, because the age groups least likely to skateboard break their hips the most! Something is breaking American politics, but it's not social media

From Spocko, pioneer of Internet brand safety campaigns: Values: Brand, Corporate & Bill O’Reilly’s

In Spite of People Having Meetings, Bears Still Shit in the Woods: In Spite Of The Crackdown, Fake News Publishers Are Still Earning Money From Major Ad Networks

There's another dead bishop on the landing. Alabama Senate OK's church police bill

Productivity is awesome: How to Avoid Distractions and Finish What You

Computer Science FTW: Corrode update: control flow translation correctness

More good news: Kentucky Coal Mining Museum converts to solar power

This is going to be...fun. Goldman Sachs: VC Dry Powder Hits Record Highs

If you want to prep for a developer job interview, here's some good info: Hexing the technical interview

Bunny: Internet famous?

08 April 2017

bunny

I bought this ceramic bunny at a store on Park Street in Alameda, California. Somehow I think I have seen it before.

Memo to self: make dentist appointment

04 April 2017

(Hey, I said this was a personal blog.)

But I was just thinking—people started adding lots of refined sugar to their diets long before anybody discovered how dental caries works.

And today we have Internet distractions, and surveillance marketing, doing to our brains what sugar did to people's teeth.

And people have both sugar and teeth today. Dental hygiene is awesome: it's a set of norms, technologies, and habits, grounded in scientific understanding. Mental hygiene is just getting started.

The sugar industry moved faster to start with, but people agree that teeth matter. So do brains.

Confusion about why we call adtech adtech

03 April 2017

If you want people on the Internet to argue with you, say that you're making a statement about values.

If you want people to negotiate with you, say that you're making a statement about business.

If you want people to accept that something is inevitable, say that you're making a statement about technology.

The mixup between values arguments, business arguments, and technology arguments might be why people are confused about Brands need to fire adtech by Doc Searls.

The set of trends that people call adtech is a values-driven business transformation that is trying to label itself as a technological transformation.

Some of the implementation involves technological changes (NoSQL databases! Nifty!) but fundamentally adtech is about changing how media business is done. Adtech does have a set of values, none of which are really commonly held even among people in the marketing or advertising field, but let's not make the mistake of turning this into either an argument about values (that never accomplishes anything) or a set of statements about technology (that puts those with an inside POV on current technology at an unnecessary advantage). Instead, let's look at the business positions that adtech is taking.

  • Adtech stands for profitable platforms, with commodity producers of news and cultural works. Michael Tiffany, CEO of advertising security firm White Ops, said, "The fundamental value proposition of these ad tech companies who are de-anonymizing the Internet is, Why spend big CPMs on branded sites when I can get them on no-name sites?" This is not a healthy situation, but it's a chosen path, not a technologically inevitable one.

  • Adtech stands for the needs of low-reputation sellers over the needs of high-reputation sellers. High-reputation and low-reputation brands need different qualities from an ad medium and adtech has to under-serve the high-reputation ones. Again, not technologically inevitable, but a business position that high-reputation brands and their agencies don't have to accept.

  • Adtech stands for making advertisers support criminal and politically heinous activity. I'll just let Bob Hoffman explain that one. Fraudulent and brand-unsafe content is just the overspray of the high value platforms/commoditized content system, and advertisers have to accept it in order to power that system. Or do they?

People have a lot of interesting decisions to make: policy, contractual, infrastructural, and client-side. When we treat the adtech movement as simply technology, we take the risk of missing great opportunities to negotiate for the benefit of brands, publishers, and the audience.

Welcome RSS users

01 April 2017

Welcome RSS users.

I am setting up a redirect from my old feed to the new one.

You might see a few old entries.

This new blog has better CSS for reading on small screens and has a Let's Encrypt certificate.

Welcome. How is everyone's tracking protection working?

26 March 2017

This is a brand new blog, so I'm setting up the basics. I just realized that I got the whole thing working without a single script, image, or HTML table. (These kids today have it easy, with their media queries and CSS Grid and stuff.)

One big question that I'm wondering about is: how many of the people who visit here are using some kind of protection from third-party tracking? Third-party tracking has been an unfixed vulnerability in web browsers for a long time. Check out the Unofficial Cookie FAQ from 1997. Third-party cookies are in there...and we're still dealing with the third-party tracking problem?

In order to see how bad the problem is on this site, I'm going to set up a little bit of first-party data collection to measure people's vulnerability to third-party data collection.

The three parts of that big question are:

  • Does first-party JavaScript load and run?

  • Does third-party JavaScript (from a site on popular filter lists) load and run?

  • Can a third-party tracker see state from other sites?

This will be easy to do with a little single-pixel image and the Aloodo tracking detection script.

This blog is on Metalsmith, so the right place to put these scripts will be in layouts/partials/footer.html.

The lines that matter are:

<script src="/code/check3p.js"></script>
<script src="https://ad.aloodo.com/track.js"></script>
<img id="check3p" src="/tk/sr.png"
 height="1" width="1" alt="">

I'm including a single-pixel image and two scripts: the Aloodo one and a new first-party script.

In most tracking protection configurations, the Aloodo script will be blocked, because ad.aloodo.com appears on the commonly used tracking protection lists.

Step two: write the first-party script

The local script is simple: /code/check3p.js

All it does is swap out the tracking image source three times.

  • When the script runs, to check that this is a browser with JavaScript on.

  • When the Aloodo tracking script runs, to check if this browser is blocking the script from loading.

  • When the Aloodo script confirms that tracking is possible.

The work is done in the setupAloodo function, which runs after the page loads. First, it sets the src for the tracking pixel to js.png, then sets up two callbacks: one to run after the Aloodo script is loaded, and switch the image to ld.png, and one to run if the script can track the user, and switch the image to td.png.

Step three: check the logs

Now I can use the regular server logs to compare the number of clients that load the original image, and the JavaScript-switched one, to the number that load the two tracking images.

(There are two different tracking callbacks because of the details of how Aloodo has to detect Privacy Badger, among other things. Not all tracking protection works the same.)

I'll run some reports on the logs and post again about the results. (If you want to see your own results in the meantime, you can take a tracking protection test.)
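A report like that could be sketched roughly as follows. This is not the script I actually run: the function name count_pixel_clients is made up, it assumes a combined-format access log with the client address in the first field, and it assumes the swapped-in images live under /tk/ next to sr.png.

```shell
#!/bin/bash
# Sketch only: count_pixel_clients is a made-up name. Assumes a
# combined-format access log (client address in field 1) and that
# js.png, ld.png and td.png are served from /tk/ like sr.png.
count_pixel_clients() {
    local log=$1 img
    for img in sr js ld td; do
        # Count distinct client addresses that requested this image.
        printf '%s.png %s\n' "$img" "$(grep "GET /tk/$img\.png" "$log" |
            awk '!seen[$1]++ { n++ } END { print n + 0 }')"
    done
}
```

Usage would be something like count_pixel_clients /path/to/access.log, with the log path depending on the server. Counting by address is crude (several people can share one address), but it is enough to compare the four numbers against each other.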

Am I metal yet?

14 March 2017

This is a blog. Started out with A Beginner's Guide to Crafting a Blog with Metalsmith by Parimal Satyal, but added some other stuff.

Metalsmith is pretty fun. The basic pipeline from the article seems to work pretty well, but I ran into a couple of issues. I might have solved these in ways that are completely wrong, but here's what works for me.

First, I needed to figure out how to get text from an earlier stage of the pipeline. My Metalsmith build is pretty basic:

  1. turn Markdown into HTML (plus article metadata stored with it, wrapped up in a JavaScript object)

  2. apply a template to turn the HTML version into a complete page.

That's great, but the problem seems to be with getting a copy of just the HTML from step 1 for building the index page and the RSS feed. I don't want the entire HTML page from step 2, just the inner HTML from step 1.

The solution seems to be metalsmith-untemplatize. This doesn't actually strip off the template, just lets you capture an extra copy of the HTML before templatization. This goes into the pipeline after "markdown" but before the "layouts" step.

.use(untemplatize(
    { key: 'bodycopy'
}))

I also ran into the Repeat runs with collections adds duplicates issue. Strange to see the same blog items come up twice on the index page. The link on that bug page from Spacedawwwg goes to his fork of metalsmith-collections that seems to do the right thing.

Webfonts

GitHub

There's a GitHub repo of this blog.

MSIE on Fedora with virt-manager

22 October 2015

Internet meetings are a pain in the behind. (Clearly online meeting software is controlled by the fossil fuel industry, and designed to be just flaky enough to make people drive to work instead.)

Here's a work in progress to get an MSIE VM running on Fedora. (Will edit as I check these steps a few times. Suggestions welcome.)

Download: Download virtual machines.

Untar the OVA

tar xvf IE10\ -\ Win8.ova

You should end up with a .vmdk file.

Convert the VMDK to qcow2

qemu-img convert IE10\ -\ Win8-disk1.vmdk -O qcow2 msie.qcow2

Import the qcow2 file using virt-manager.

Select Browse, then Browse Local, then select the .qcow2 file.

That's it. Now I'm looking at a virtual MS-Windows guest that I can use for those troublesome web conferences, and for testing web sites under MSIE. (If you try the tracking test, it should take you to a protection page that prompts you to turn on the EasyPrivacy Tracking Protection List. That's a quick and easy way to speed up your web browsing experience on MSIE.)

A fresh start for advertising and the web?

13 September 2014

Is advertising ruining the web? Ethan Zuckerman writes,

I have come to believe that advertising is the original sin of the web. The fallen state of our Internet is a direct, if unintentional, consequence of choosing advertising as the default model to support online content and services.

Is the web ruining advertising? Bob Hoffman writes,

[T]he advertising industry has become the web's lapdog – irresponsibly exaggerating the effectiveness of online advertising and social media, ignoring the abominable results of display advertising, glossing over the fraud and corruption, and becoming a de facto sales arm for the online ad industry.

Advertising can be a good thing. Some of my favorite cultural goods are leftovers paid for by advertising at its best. There should be a way to make advertising work for the web, the way it has worked for print magazines.

But Hoffman and Zuckerman are both right. Web advertising has failed. We're throwing away most of the potential value of the web as an ad medium by failing to fix privacy bugs. Web ads today work more like email spam than like magazine ads. The quest for "relevance" not only makes targeted ads less valuable than untargeted ones, but also wastes most of what advertisers spend. Buy an ad on the web, and more of your money goes to intermediaries and fraud than to the content that helps your ad carry a signal.

From Zuckerman's point of view, advertising is a problem, because advertising is full of creepy stuff. From Hoffman's point of view, the web is a problem, because the web is full of creepy stuff. (Bonus link: Big Brother Has Arrived, and He's Us )

So let's re-introduce the web to advertising, only this time, let's try it without the creepy stuff. Brand advertisers and web content people have a lot more in common than either one has with database marketing. There are a lot of great opportunities on the post-creepy web, but the first step is to get the right people talking.

Temporary directory for a shell script

22 August 2014

Set up a temporary directory to use in a Bash script, and clean it up when the script finishes:

# Make a private temporary directory; remove it when the script exits.
TMPDIR=$(mktemp -d)
trap 'rm -rf "$TMPDIR"' EXIT
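To see the cleanup happen, here is a small sketch that wraps the same idea in a function. The names with_tmpdir_demo and workdir are made up; workdir is used instead of TMPDIR only so the demo leaves the environment's own TMPDIR alone.

```shell
#!/bin/bash
# Sketch only: with_tmpdir_demo and workdir are made-up names.
# The function body is a subshell, so the EXIT trap fires when it ends.
with_tmpdir_demo() (
    workdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$workdir"' EXIT
    echo "scratch data" > "$workdir/scratch.txt"
    # Print the path so the caller can confirm it is gone afterwards.
    echo "$workdir"
)

dir=$(with_tmpdir_demo)
[ -d "$dir" ] || echo "temporary directory was cleaned up"
```

The single quotes around the trap command delay expansion until the trap runs, and the inner double quotes keep a path with spaces in one piece.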

Automatically run make when a file changes

08 August 2013

Really simple: do a makewatch [target] to re-run make with the supplied [target] when any files relevant to that target change.

makewatch script

Andrew Cowie has written something similar. The main thing that this one does differently is to ask make which files matter to it, instead of doing an inotifywatch on the whole directory. Comments and suggestions welcome.
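For comparison, here is a much cruder sketch that is not the makewatch script. It swaps in polling with make's question mode (make -q) instead of watching files, and the names make_if_stale and poll_make are made up.

```shell
#!/bin/bash
# Sketch only, not the makewatch script: this polls with "make -q"
# (question mode) instead of using inotify. make_if_stale and poll_make
# are made-up names.

# Rebuild target $1 only if make reports it as out of date.
make_if_stale() {
    make -q "$1" >/dev/null 2>&1
    case $? in
        0) return 0 ;;      # already up to date
        1) make "$1" ;;     # out of date: rebuild
        *) return 2 ;;      # make itself hit an error
    esac
}

# Check once a second until interrupted.
poll_make() {
    while true; do
        make_if_stale "$1"
        sleep 1
    done
}
```

Something like poll_make all in a spare terminal does the job. Polling wakes up every second whether or not anything changed, which is the cost of not asking the kernel for change notifications.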

Printer for Linux

02 November 2011

Picking a printer for Linux?

The process is going to be a little different from what you might be used to with another OS. If you shop carefully (and reading blogs is a good first step) then the drivers you will need are already available through your Linux distribution's printer setup tool.

HP has done a good job with enabling this. The company has already released the necessary printer software as open source, and your Linux distribution has already installed it. So, go to printers fully supported with the HPLIP software, pick a printer you like, and you're done.

If you want a recommendation from me, the HP LaserJet 3055, a black and white all-in-one device, has worked fine for me with various Linux setups for years. It's also a scanner/copier/fax machine, and you get the extra functionality for not much more than the price of a regular printer. It also comes with a good-sized toner cartridge, so your cost per page is probably going to be pretty reasonable.

Other printer brands have given me more grief, but fortunately the HP LaserJets are widely available and don't jam much.

It's important not to show a smug expression on your face while printing if users of non-Linux OSs are still dealing with driver CDs or vendor downloads.

Landmarks in instructions

05 September 2010

When you give travel directions, you include landmarks, and "gone too far" points. Turn left after you cross the bridge. Then look for my street and make a right. If you go past the water tower you've gone too far.

System administration instructions are much easier to follow if they include those kinds of check-ins, too. For example, if you explain how to set up server software you can put in quick "landmark" tests, such as, "at this point, you can run nmap and see the port in the results." You can also include "gone too far" information by pointing out problems you can troubleshoot on the way.

A full-scale troubleshooting guide is a good idea, but quick warning signs as you go along are helpful. Much better than finding yourself lost at the end of a long set of setup instructions.
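If the instructions come with a script, the same idea can be sketched as a tiny helper. The name landmark and the message format are made up for illustration.

```shell
#!/bin/bash
# Sketch only: landmark is a made-up helper name.
# Usage: landmark "description" "hint if it fails" command [args...]
landmark() {
    local what=$1 hint=$2
    shift 2
    if "$@" >/dev/null 2>&1; then
        echo "OK: $what"
    else
        # Stop the reader here instead of at the end of the instructions.
        echo "STOP: $what failed. $hint" >&2
        return 1
    fi
}
```

A step in a setup script might then read landmark "config file exists" "go back to the install step" test -f /etc/example.conf, where the path is a placeholder.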

dotted quad to decimal in bash

24 December 2008

GNU seq doesn't accept dotted quads for ranges, but fortunately most of the commands that accept an IP address will also take it in the form of a regular decimal. (Spammers used to use this to hide their naughty domains from scanners that only looked for the dotted quad while the browser would happily go to http://3232235520/barely-legal-mortgage.html or something.)

So here's an ugly-ass shell function to convert an IP address to a decimal. If you have a better one, please let me know and I'll update this page. (Yes, I know this would be one line in Perl.)

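dq2int()
{
    # Dotted quad: split it into four arguments and recurse.
    if echo "$1" | grep -q '\.'; then
        dq2int $(echo "$1" | tr '.' ' ')
    # One argument left: that is the finished total.
    elif [ $# -eq 1 ]; then
        echo "$1"
    else
        # Fold the next octet into the running total.
        total=$1; next=$2; shift 2
        dq2int $(($total*2**8+$next)) "$@"
    fi
}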

Seth Schoen has two shorter versions:

dq2int(){
a=0
for b in $(echo $1 | tr . ' '); do
    a=$((256*$a+$b))
done
echo $a
}

dq2int(){
a=0
for b in ${1//./ }; do
    a=$((256*$a+$b))
done
echo $a
}
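Here is a sketch of the seq use that started all this. The address range is an arbitrary example, and the function is repeated (the parameter-expansion version, with local variables added) so that the block runs on its own. It assumes GNU seq, which prints integers of this size as plain decimals.

```shell
#!/bin/bash
# Sketch only: the address range below is an arbitrary example.
# dq2int is repeated here so the block is self-contained.
dq2int() {
    local a=0 b
    for b in ${1//./ }; do
        a=$((256 * a + b))
    done
    echo "$a"
}

# Walk a small range of addresses as plain decimal numbers.
for n in $(seq "$(dq2int 192.168.0.1)" "$(dq2int 192.168.0.3)"); do
    echo "$n"
done
```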

And if you want to go the other way, Seth points out that you can set the "obase" variable for bc. Here's an int2dq function based on that idea.

int2dq()
{
    { echo obase=256; echo $1; } | \
        bc | tr ' ' . | cut -c2-
}

To quote the GNU bc manual, "For bases greater than 16, bc uses a multi-character digit method of printing the numbers where each higher base digit is printed as a base 10 number."
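If bc isn't handy, a variant using only shell arithmetic might look like this sketch. The name int2dq_sh is made up, and it assumes a shell with 64-bit arithmetic, such as bash.

```shell
#!/bin/bash
# Sketch only: int2dq_sh is a made-up name for a bc-free variant.
# Peels off each byte with shifts and masks in shell arithmetic.
int2dq_sh() {
    local n=$1
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}
```

For example, int2dq_sh 3232235520 gives back 192.168.0.0.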

Trick.

Transaction mail or junk mail? Check the postage.

09 April 2006

It says "Personal and Confidential" or "IMPORTANT CORRESPONDENCE REGARDING YOUR OVERPAYMENT" on the envelope—can you really discard it without opening it? You sure can. Some junk mailers disguise their mail pieces as important correspondence from companies you actually do business with, and the USPS helped them out a lot by renaming "Bulk Mail" to "Standard Mail". But you can look at the postage to discard "stealth" junk mail without opening it.

Postal regulations require that any bills or mail containing specific information about your business relationship with the company must be mailed First Class.

So, if "Standard Mail" or "STD" appears in the upper right corner, it's not a bill, it's not your new credit card, and it's not a check. It's just sneaky junk mail.

eval button

17 April 2005

Jef Raskin wrote,

All that is really needed on computers is a "Calculate" button or omnipresent menu command that allows you to take an arithmetic expression, like 248.93 / 375, select it, and do the calculation whether in the word processor, communications package, drawing or presentation application or just at the desktop level.

Fortunately, there's a blue "Access IBM" button on this keyboard that doesn't do much. So, I configured tpb to make "Access IBM" do this:

perl -e 'print eval `xsel -o`' | \
xsel -i && xte 'key Delete' 'mouseclick 2'

(That is, get the contents of the X primary selection, run it through a Perl "eval", put the result back into the X primary selection, then fake a delete and paste.)

Here's a version that uses the X clipboard selection instead.

xte 'keydown Control_L' 'key c' 'keyup Control_L' && \
perl -e 'print eval `xsel -b -o`' | xsel -b -i  && \
xte 'keydown Control_L' 'key v' 'keyup Control_L'

This one seems to work better in gedit.

If you want to do this, besides tpb, you'll need xsel and xte, which is part of xautomation. If you don't have an unused button, you could also set up a binding in your window manager or build a big red outboard USB "eval" button or something.

Force ssh not to use ssh-agent

17 April 2005

If you make a new ssh key and try to use it with ssh -i while running ssh-agent, ssh tries the agent first. You could end up using a key provided by the agent instead of the one you specify. You can fix this without killing the agent. Use:

env -u SSH_AUTH_SOCK ssh -i newkey host
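To see what env -u is doing there, here is a small sketch that leaves ssh out of it. The function name show_agent_env is made up for illustration.

```shell
#!/bin/bash
# Sketch only: show_agent_env is a made-up name.
# Prints what a child process started through "env -u" sees for
# SSH_AUTH_SOCK: the word "unset" if the variable was removed.
show_agent_env() {
    env -u SSH_AUTH_SOCK sh -c 'echo "${SSH_AUTH_SOCK:-unset}"'
}

show_agent_env
```

The variable is only removed for the child process, so the agent keeps working for everything else in the session. OpenSSH also has an IdentitiesOnly option (ssh -o IdentitiesOnly=yes -i newkey host) that should have a similar effect, though I have not compared the two closely.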

Picking a Linux distribution

08 April 2005

The most important part of picking a distribution is thinking about where you will go for help, and what distribution that source of help understands. That's true if your source of help is a vendor, a consultant, or a users group.

As a home user, you'll probably be asking your local Linux users group for help when you need it. So get on the mailing list and just "lurk" for a while. See what the most helpful people on the list use, and install that. That way if you have a question, you'll be more likely to reach someone who has already dealt with it. (see How to Pick a Distribution.)

If you're getting into uses for Linux that are different from those of your local user group, it's more important to use a list of people like you than just the geographically closest user group. For example, if you're planning to set up a Linux-based recording studio and your local LUG is all about running web sites and playing Crimson Fields, you might want to get on the Planet CCRMA mailing list, and get your Linux distribution recommendations there.

ssh scripts: fail fast

01 January 2005

If you have a script that uses ssh, here's something to put at the beginning of the script to make sure the necessary passphrase has already been entered, and the remote host is reachable, before starting a time-consuming operation such as an rsync.

ssh "$REMOTE_HOST" true || exit 1
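For unattended scripts, a stricter variant might look like the sketch below. It behaves differently from the one-liner: BatchMode=yes makes ssh fail instead of prompting for a passphrase, and ConnectTimeout bounds the wait. The name require_remote and the SSH override variable are made up for illustration.

```shell
#!/bin/bash
# Sketch only: require_remote is a made-up name, and the SSH variable
# exists just so the ssh command can be swapped out (for example in a test).
# BatchMode=yes: fail rather than prompt. ConnectTimeout=10: give up after 10s.
require_remote() {
    "${SSH:-ssh}" -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

At the top of a script that would be require_remote "$REMOTE_HOST" || exit 1.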