---

blog: Don Marti

---

List-based and behavior-based tracking protection

22 August 2017

In the news...

User privacy is at risk from both hackers and lawyers. Right now, lawyers are better at attacking lists, and hackers are better at modifying tracker behavior to get around protections.

The more I think about it, the more that I think it's counterproductive to try to come up with one grand unified set of protection rules or cookie policies for everybody.

Spam filters don't submit their scoring rules to ANSI—spammers would just work around them.

Search engines don't standardize and publish their algorithms, because gray hat SEOs would just use the standard to make useless word salad pages that score high.

And different people have different needs.

If you're a customer service rep at an HERBAL ENERGY SUPPLEMENTS company, you need a spam filter that can adjust for your real mail. And any user of a site that has problems with list-based tracking protection will need to have the browser adjust, and rely more on cleaning up third-party state after a session instead of blocking outright.

Does your company intranet become unusable if you fail to accept third-party tracking that comes from an internal domain that your employer acquired and still has some services running on? Browser developers can't decide up front, so the browser will need to adjust. Every change breaks someone's workflow.

That means the browser has to work to help the user pick a working set of protection methods and rules.

0. Send accurate Do Not Track

Inform sites of the user’s preferences on data sharing. (This will be more important in the future because Europe, but privacy-crazed Eurocrats will not save us from having to do our share of the work.

1. Block connections to third-party trackers

This will need to include both list-based protection and monitoring tracking behavior, like Privacy Badger, because hackers and lawyers are good at getting around different ones.

2. Limit data sent to third-party sites

Apple Safari does this, so it's likely to get easier to do cookie double keying without breaking sites.

3. Scramble or delete unsafe data

If a tracking cookie or other identifier does get through, delete or scramble it on leaving the site or later, as the Self-Destructing Cookies extension does. This could be a good backup for when the browser "learns" that a user needs some third-party state to do something like a shopping cart or comment form, but then doesn't want the info to be used for "ads that follow me around" later.

How is everyone's tracking protection working? An update

20 August 2017

When I set up this blog, I put in a script to check how many of the users here are protected from third-party tracking.

The best answer for now is 31%. Of the clients that ran JavaScript on this site over the past two weeks, 31% did not also run JavaScript from the Aloodo "fake third-party tracker".

The script is here: /code/check3p.js

This is not as good as I had hoped (turn on your tracking protection, people! Don't get tricked by ad blockers that leave you unprotected by default!) but it's a start.

The Information Trust Exchange is doing research on the problem of third-party tracking at news sites. News industry consultant Greg Swanson:

All of the conversations on the newspaper side have been focused on how can we join the advertising technology ecosystem. For example, how can a daily newspaper site in Bismarck, North Dakota deliver targeted advertising to a higher-value soccer mom? And none of the newspapers them have considered the fact that when they join that ecosystem they are enabling spam sites, fraudulent sites – enabling those sites to get a higher CPM rate by parasitically riding on the data collected from the higher-value newspaper sites.

More info: Aloodo for web publishers.

SEO hats and the browser of the future

19 August 2017

The field of Search Engine Optimization has white hat SEO, black hat SEO, and gray hat SEO.

White hat SEO helps a user get a better search result, and complies with search engine policies. Examples include accurately using the same words that users search on, and getting honest inbound links.

Black hat SEO is clearly against search engine policies. Link farming, keyword stuffing, cloaking, and a zillion other schemes. If they see you doing it, your site gets penalized in search results.

Gray hat SEO is everything that doesn't help the user get a better search result, but technically doesn't violate a search engine policy.

Most SEO experts advise you not to put a lot of time and effort into gray hat, because eventually the search engines will notice your gray hat scheme and start penalizing sites that do it. Gray hat is just stuff that's going to be black hat when the search engines figure it out.

Adtech has gray hat, too. Rocket Fuel Awarded Two Patents to Help Leverage First-Party Cookies to More Meaningfully Reach Consumers.

This scheme seems to be intended to get around existing third-party cookie protection, which is turned on by default in Apple Safari and available in other browsers.

But how long will it work?

Maybe the browser of the future won't run a "kangaroo cookie court" but will ship with a built-in "kangaroo law school" so that each copy of the browser will develop its own local "courts" and its own local "case law" based on the user's choices. It will become harder to predict how long any single gray hat adtech scheme will continue working.

In the big picture: in order to sell advertising you need to give the advertiser some credible information on who the audience is. Since the "browser wars" of the 1990s, most browsers have been bad at protecting personal information about the user, so web advertising has become a game where a whole bunch of companies compete to covertly capture as much user info as they can.

Today, browsers are getting better at implementing people's preferences about sharing their information. The result is a change in the rules of the game. Investment in taking people's personal info is becoming less rewarding, as browsers compete to reflect people's preferences. (That patent will be irrelevant thanks to browser updates long before it expires.)

And investments in building sites and brands that are trustworthy enough for people to want to share their information will tend to become more rewarding. (This shift naturally leads to complaints from people who are used to winning the old game, but will probably be better for customers who want to use trustworthy brands and for people who want to earn money by making ad-supported news and cultural works.)

One of the big advertising groups is partnering with Digital Content Next’s trust-focused ad marketplace

Partisanship, Propaganda, and Disinformation: Online Media and the 2016 U.S. Presidential Election

ANA Endorses TrustX, Encourages Members To Use Programmatic Media-Buying Stamp Of Approval

Call for Papers: Policy and Internet Special Issue on Reframing ‘Fake News’: Architectures, Influence, and Automation

Time to sink the Admiral (or, why using the DMCA to block adblockers is a bad move)

I'm a woman in computer science. Let me ladysplain the Google memo to you.

Easylist block list removes entry after DMCA takedown notice

Will Cities Ever Outsmart Rats?

Uber drivers gang up to cause surge pricing, research says

Google reveals sites with ‘failing’ ads, including Forbes, LA Times

Koch group, Craigslist founder come to Techdirt's aid

The Mozilla Information Trust Initiative: Building a movement to fight misinformation online

Are Index Funds Evil?

When Silicon Valley Took Over Journalism

How publishers can beat fraudsters at their own game

Facebook’s Secret Censorship Rules Protect White Men from Hate Speech But Not Black Children

cdparanoia returned code 73

18 August 2017

Welcome, people googling for the above error message.

I saw the error

cdparanoia returned code 73

and it turns out I was trying to run two abcde processes in two terminal windows. Kill the second one and the error goes away.

Hope your problem was as simple as that.

ePrivacy and marketing budgets

16 August 2017

(Update 18 Aug 2017: this post is also available at Digital Content Next.)

As far as I know, there are three ways to match an ad to a user.

User intent: Show an ad based on what the user is searching for. Old-school version: the Yellow Pages.

Context: Show an ad based on where the user is, or what the user is interested in. Old-school versions: highway billboards (geographic context), specialized magazines (interest context).

User identity: Show an ad based on who the user is. Old-school version: direct mail.

Most online advertising is matched to the user based on a mix of all three. And different players have different pieces of the action for each one. For user intent, search engines are the gatekeepers. The other winners from matching ads to users by intent are browsers and mobile platforms, who get paid to set their default search engine. Advertising based on context rewards the owners of reputations for producing high-quality news, information, and cultural works. Finally, user identity now has a whole Lumascape of vendors in a variety of categories, all offering to help identify users in some way. (the Lumascape is rapidly consolidating, but that's another story.)

Few of the web ads that you might see today are matched to you purely based on one of the three methods. Investments in all three tend to shift as the available technology, and the prevailing norms and laws, change.

Enough background.

Randall Rothenberg of the IAB is concerned about the proposed ePrivacy Regulation in Europe, and writes,

The basic functionality of the internet, which is built on data exchanges between a user’s computer and publishers’ servers, can no longer be used for the delivery of advertising unless the consumer agrees to receive the ads – but the publisher must deliver content to that consumer regardless.

This doesn't look accurate. I don't know of any proposal that would require publishers to serve users who block ads entirely. What Rothenberg is really complaining about is that the proposed regulation would limit the ability of sites and ad intermediaries to match ads to users based on user identity, forcing them to rely on user intent and context. If users choose to block ads delivered from ad servers that use their personal data without permission, then sites won't be able to refuse to serve them the content, but will be able to run ads that are relevant to the content of the site. As far as I can tell, sites would still be able to pop a "turn off your ad blocker" message in place of a news story if the user was blocking an ad placed purely by context, magazine style.

Privacy regulation is not so much an attack on the basic functionality of the Internet, as it is a shift that lowers the return on investment on knowing who the user is, and drives up the return on investment on providing search results and content. That's a big change in who gets paid: more money for search and for trustworthy content brands, and less for adtech intermediaries that depend on user tracking.

Advertising: a fair deal for the user?

That depends. Search advertising is clearly the result of a user choice. The user chooses to view ads that come with search results, as part of choosing to do a search. As long as the ads are marked as ads, it's pretty obvious what is happening.

The same goes for ads placed in context. The advertiser trades economic signal, in the form of costly support of an ad-supported resource, for the user's attention. This is common in magazine and broadcast advertising, and when you use a site with one of the (rare) pure in-context ad platforms such as Project Wonderful, it works about the same way.

The place where things start to get problematic is ads based on user identity, placed by tracking users from site to site. The more that users learn how their data is used, the less tracking they tend to want. In one survey, 66% of adult Americans said they do not want marketers to tailor advertisements to their interests, and when the researchers explained how ad targeting works, the percentage went up.

If users, on average, dislike tracking enough that sites choose to conceal it, then that's pretty good evidence that sites should probably ask for permission to do it. Whether this opt-in should be enforced by law, technology, or both is left as an exercise for the reader.

So what happens if, thanks to new regulations, technical improvements in browsers, or both, cross-site tracking becomes harder? Rothenberg insists that this transformation would end ad-supported sites, but the real effects would be more complex. Ad-supported sites are already getting a remarkably lousy share of ad budgets. “The supply chain’s complexity and opacity net digital advertisers as little as 30 cents to 40 cents of working media for every dollar spent,” ANA CEO Bob Liodice said.

Advertising on high-reputation sites tends to be a better investment than using highly intermediated, fraud-prone, stacks of user tracking to try to chase good users to cheap sites. But crap ad inventory, including fraudulent and brand-unsafe stuff, persists. The crap only has market value because of user tracking, and it drives down the value of legit ads. If browser improvements or regulation make knowledge of user identity rarer, the crap tends to leave the market and the value of user intent and context go up.

Rothenberg speaks for today's adtech, which despite all its acronyms and Big Data jive, is based on a pretty boring business model: find a user on a legit site, covertly follow the user to a crappy site where the ads are cheaper, sell an ad impression there, profit. Of course he's entitled to make the case for enabling IAB members to continue to collect their "adtech tax." But moving ad budgets from one set of players to another doesn't end ad-supported sites, because marketers adjust. That's what they do. There's always something new in marketing, and budgets move around. What happens when privacy regulations shift the incentives, and make more of advertising have to depend on trustworthy content? That's the real question here.

Moral values in society

08 August 2017

Moral values in society are collapsing? Really? Elizabeth Stoker Bruenig writes, The baseline moral values of poor people do not, in fact, differ that much from those of the rich. (read the whole thing).

Unfortunately, if you read the fine print, it's more complicated than that. Any market economy depends on establishing trust between people who trade with each other. Tim Harford writes,

Being able to trust people might seem like a pleasant luxury, but economists are starting to believe that it’s rather more important than that. Trust is about more than whether you can leave your house unlocked; it is responsible for the difference between the richest countries and the poorest.

Somehow, over thousands of years, business people have built up a set of norms about high-status and low-status business activities. Craftsmanship, consistent supply of high-quality staple goods, and construction of noteworthy projects are high-status activities. Usury and deception are examples of low-status activities. (You make your money in quarters, gambling with retired people? You lend people $100 until Friday at a 300% interest rate? No club invitation for you.)

Somehow, though, that is now changing in the USA. Those who earn money through deception now have seats at the same table as legitimate business. Maybe it started with the shift into "consumer credit" by respectable banks. But why were high-status bankers willing to play loan shark to begin with? Something had to have been building, culturally. (It started too early to blame the Baby Boomers.)

We tend to blame information technology companies for complex, one-sided Terms of Service and EULAs, but it's not so much a tech trend as it is a general business culture trend. It shows up in tech fast, because rapid technology change provides cover and concealment for simultaneous changes in business terms. US business was rapidly losing its connection to basic norms when it was still moving at the speed of FedEx and fax. (You can't say, all of a sudden, "car crashes in existing fast-food drive-thrus are subject to arbitration in Unfreedonia" but you can stick that kind of term into a new service's ToS.) There's some kind of relativistic effect going on. Tech bros just seem like bigger douchebags because they're moving faster.

Regulation isn't the answer. We have a system in which business people can hire lobbyists to buy the laws and regulations we want. The question is whether we're going to use our regulatory capture powers in a shortsighted, society-eroding hustler way, or in a conservative way. Economic conservatism means not just limiting centralized state control of capital, but preserving the balance among all the long-standing stewards of capital, including households, municipalities, and religious and educational institutions. Economic conservatism and radical free-marketism are fundamentally different.

People blame trashy media for the erosion of norms among the poor, so let's borrow that explanation for the erosion of norms among the rich as well. Maybe our problem with business norms results from the globablization and sensationalism of business media. Joe CEO isn't just the most important corporate leader of Mt. Rose, MN, any more—on a global scale he's just another broke-ass hustler.

More random links

06 August 2017

Not the Google story everyone is talking about, but related: Google Is Matching Your Offline Buying With Its Online Ads, But It Isn’t Sharing How. (If a company becomes known for doing creepy shit, it will get job applications from creepy people, and at a large enough company some of them will get hired. Related: The Al Capone theory of sexual harassment)

Least surprising news story ever: The Campaign Against Facebook And Google's Ad "Duopoly" Is Going Nowhere Independent online publishers can't beat the big surveillance marketing companies at surveillance marketing? How about they try to beat Amazon and Microsoft at cloud services, or Apple and Lenovo at laptop computers? There are possible winning strategies for web publishers, but doing the same as the incumbents with less money and less data is not one of them.

Meanwhile, from an investor point of view: It’s the Biggest Scandal in Tech (and no one’s talking about it) Missing the best investment advice: get out of any B-list adtech company that is at risk of getting forced into a low-value acquisition by a sustained fraud story. Or short it and research the fraud story yourself.

Did somebody at The Atlantic get a loud phone notification during a classical music concert or something? Your Smartphone Reduces Your Brainpower, Even If It's Just Sitting There and Have Smartphones Destroyed A Generation?, by Jean M. Twenge, The Atlantic

Good news: Math journal editors resign to start rival open-access journal

Apple’s Upcoming Safari Changes Will Shake Up Ad Tech: Not surprisingly, Facebook and Amazon are the big winners in this change. Most of their users come every day or at least every week. And even the mobile users click on links often, which, on Facebook, takes them to a browser. These companies will also be able to buy ad inventory on Safari at lower prices because many of the high-dollar bidders will go away. A good start by Apple, but other browsers can do better. (Every click on a Facebook ad from a local business is $0.65 of marketing money that's not going to local news, Little League sponsorships, and other legit places.)

Still on the upward slope of the Peak Advertising curve: Facebook 'dark ads' can swing political opinions, research shows

You’re more likely to hear from tech employers if you have one of these 10 things on your resume (and only 2 of them are proprietary. These kids today don't know how good they have it.)

The Pac-Man Rule at Conferences

How “Demo-or-Die” Helped My Career

Pragmatists for copyleft, or, corporate hive minds don't accept software licenses

06 August 2017

One of the common oversimplifications in discussing open-source software licenses is that copyleft licenses are "idealistic" while non-copyleft licenses are "pragmatic." But that's not all there is to it.

The problem is that most people redistributing licensed code are doing so in an organizational context. And no human organization is a hive mind where those who participate within it subordinate their goals to that of the collective. Human organizations are full of of people with their own motivations.

Instead of treating the downstrem developer's employer as a hive mind, it can be more producive to assume good faith on the part of the individual who intends to contribute to the software, and think about the license from the point of view of a real person.

Releasing source for a derivative work costs time and money. The well-intentioned "downstream" contributor wants his or her organization to make those investments, but he or she has to make a case for them. The presence of copyleft helps steer the decision in the right direction. Jane Hacker at an organization planning to release a derivative work can say, matter-of-factly, "we need to comply with the upstream license" if copyleft is involved. The organization is then more likely to do the right thing. There are always violations, but the license is a nudge in the right direction.

(The extreme case is university licensing offices. University-owned software patents can exclude a graduate student from his or her own project when the student leaves the university, unless he or she had the foresight to build it as a derivative work of something under copyleft.)

Copyleft isn't a magic commons-building tool, and it isn't right for every situation. But it can be enough to push an organization over the line. (One place where I worked had to a do a source release for one dependency licensed under GPLv2, and it turned out to be easist to just build one big source code release with all the dependencies in it, and offer that.)

Hey kids, favicon!

05 August 2017

Finally fixed those 404s from browsers looking for favicon.ico on this blog.

  1. Google image search for images where "reuse with modification" is allowed.

  2. Found this high-quality lab mouse SVG image.

  3. Opened it in GNU Image Manipulation Program, posterized, cropped to a square. Kept the transparent background.

  4. Just went to realfavicongenerator.net and did what it says, and added the resulting images and markup to the site.

That's about it. Now there's a little mouse in the browser tab (and it should do the right thing with the icons if someone pins it to their home screen on mobile.)

Why surveillance marketers don't worry about GDPR (but privacy nerds should)

01 August 2017

A lot of privacy people these days sound like a little kid arguing with a sibling. You're going to be in big trouble when Dad gets home!

Dad, here, is the European Union, who's going to put the General Data Protection Regulation foot down, and then, oh, boy, those naughty surveillance marketers are going to catch it, and wish that they had been listening to us about privacy all along.

Right?

But Internet politics never works like that. Sure, European politicians don't want to hand over power to the right-wing factions who are better at surveillance marketing than they are. And foreign agents use Facebook (and other US-based companies) to attack legit political systems. But that stuff is not going to be enough to save GDPR.

The problem is that perfectly normal businesses are using GDPR-violating sneaky tracking pixels and other surveillance marketing as part of their daily marketing routine.

As the GDPR deadline approaches, surveillance marketers in Europe are going to sigh and painstakingly explain to European politicians that of course this GDPR thing isn't going to work. "You see, politicians, it's an example of political overreach that completely conflicts with technical reality." European surveillance marketers will use the same kind of language about GDPR that the freedom-loving side used when we talked about the proposed CBDTPA. It's just going to Break the Internet! People will lose their jobs!

The result is predictable. GDPR will be delayed, festooned with exceptions, or both, and the hoped-for top-down solution to privacy problems will not come. There's no shortcut. We'll only get a replacement for surveillance marketing when we build the tools, the networks, the business processes, the customer/voter norms, and then the political power.

Extracting just the audio from big video files

29 July 2017

Got a big video, and want a copy of just the audio for listening on a device with limited storage? Use Soundconverter.

soundconverter -b -m mp3 -s .mp3 long-video.webm

(MP3 patents are expired now, hooray! I'm just using MP3 here because if I get a rental car that lets me plug in a USB stick for listening, the MP3 format is most likely to be supported.)

Soundconverter has a GUI but you can use -b for batch mode from the shell. soundconverter --help for help. You do need to set both the MIME type, with -m, and the file suffix, with -s.

Online ads don't matter to P&G

28 July 2017

In the news: P&G Cuts More Than $100 Million in ‘Largely Ineffective’ Digital Ads

Not surprising.

Proctor & Gamble makes products that help you comply with widely held cleanliness norms.

Digital ads are micro-targeted to you as an individual.

That's the worst possible brand/medium fit. If you don't know that the people who expect you to keep your house or body clean are going to be aware of the same product, how do you know whether to buy it?

Bonus link from Bob Hoffman last year: Will The P&G Story Bring Down Ad Tech? Please?

Got a reply from Twitter

26 July 2017

I thought it would be fun to try Twitter ads, and, not surprisingly, I started getting fake followers pretty quickly after I started a Twitter follower campaign.

Since I'm paying nine cents a head for these followers, I don't want to get ripped off. So naturally I put in a support ticket to Twitter, and just heard back.

Thanks for writing in about the quality of followers and engagements. One of the advantages of the Twitter Ads platform is that any RTs of your promoted ads are sent to the retweeting account's followers as an organic tweet. Any engagements that result are not charged, however followers gained may not align with the original campaign's targeting criteria. These earned followers or engagements do show in the campaign dashboard and are used to calculate cost per engagement, however you are not charged for them directly.

Twitter also passes all promoted engagements through a filtering mechanism to avoid charging advertisers for any low-quality or invalid engagements. These filters run on a set schedule so the engagements may show in the campaign dashboard, but will be deducted from the amount outstanding and will not be charged to your credit card.

If you have any further questions, please don't hesitate to reply.

That's pretty dense San Francisco speak, so let me see if I can translate to the equivalent for a normal product.

Hey, what are these rat turds doing in my raisin bran?

Thanks for writing in about the quality of your raisin bran eating experience. One of the advantages of the raisin bran platform is that during the production process, your raisin bran is made available to our rodent partners as an organic asset.

I paid for raisin bran, so why are you selling me raisin-plus-rat-turds bran?

Any ingredients that result from rodent engagement are not charged, however ingredients gained may not align with your original raisin-eating criteria.

Can I have my money back?

We pass all raisin bran sales through a filtering mechanism to avoid charging you for invalid ingredients. The total weight of the product, as printed on the box, includes these ingredients, but the weight of invalid ingredients will be deducted from the amount charged to your credit card.

So how can I tell which rat turds are "organic" so I'm not paying for them, and which are the ones that you just didn't catch and are charging me for?

(?)

Buying Twitter followers: Fiverr or Twitter?

On Fiverr, Twitter followers are about half a cent each ($5/1000). On Twitter, I'm gettting followers for about 9 cents each. The Twitter price is about 18x the Fiverr price.

But every follower that someone else buys on Fiverr has to be "aged" and disguised in order to look realistic enough not to get banned. The bot-herders have to follow legit follower campaigns such as mine and not just their paying customers.

If Twitter is selling those "follow" actions to me for nine cents each, and the bot-herder is only making half a cent, how is Twitter not making more from bogus Twitter followers than the bot-herders are?

If you're verified on Twitter, you may not be seeing how much of a shitshow their ad business is. Maybe the're going to have to sell Twitter to me sooner than I thought.

Incentivizing production of information goods

26 July 2017

Just thinking about approaches to incentivizing production of information goods, and where futures markets might fit in.

Artificial property

Article 1, Section 8, of the US Constitution still covers this one best.

To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries;

We know about the problems with this one. It encourages all kinds of rent-seeking and freedom-menacing behavior by the holders of property interests in information. And the transaction costs are too high to incentivize the production of some useful kinds of information.

Commoditize the complement

Joel Spolsky explained it best, in Strategy Letter V. Smart companies try to commoditize their products’ complements. (See also: the list of business models in the Some Easily Rebutted Objections to GNU's Goals section of the GNU Manifesto)

This one has been shown to work for some categories of information goods but not others. (We have Free world-class browsers and OS kernels because search engines and hardware are complements. We don't have free world-class software in categories such as CAD.)

Signaling

Release a free information good as a way to signal competence in performing a service, or at least a large investment by the author in persuading others that the author is competent. Works at the level of the individual labor market and in consulting. Don't know if this works in other areas.

Game and market mechanisms

With "gamified crowdsourcing" you can earn play rewards for very low transaction costs, and contribute very small tasks.

Common Voice

Higher transaction costs are associated with "crowdfunding" which sounds similar but requires more collaboration and administration.

In the middle, between crowdsourcing and crowdfunding, is a niche for a mechanism with lower transaction costs than crowdfunding but more rewards than crowdsourcing.

By using the existing bug tracker to resolve contracts, a bug futures market keeps transaction costs low. By connecting to an existing cryptocurrency, a bug futures market enables a kind of reward that is more liquid, and transferrable among projects.

We don't know how wide the bug futures niche is. Is it a tiny space between increasingly complex tasks that can be resolved by crowdsourcing and increasingly finer-grained crowdfunding campaigns?

Or are bug futures capable of achieving low enough transaction costs to be an attractive incentivization mechanism for a lot of tasks that go into a variety of information goods?

My bot parsed 12,387 RSS feeds and all I got were these links.

23 July 2017

Bryan Alexander has a good description of an "open web" reading pipeline in I defy the world and go back to RSS. I'm all for the open web, but 40 separate folders for 400 feeds? That would drive me nuts. I'm a lumper, not a splitter. I have one folder for 12,387 feeds.

My chosen way to use RSS (and one of the great things about RSS is you can choose UX independently of information sources) is a "scored river". Something like Dave Winer's River of News concept, that you can navigate by just scrolling, but not exactly a river of news.

  • with full text if available, but without images. I can click through if I want the images.

  • items grouped by score, not feed. (Scores assigned managed by a dirt-simple algorithm where a feed "invests" a percentage of its points in every link, and the investments pay out in a higher score for that feed if the user likes a link.)

I also put the byline at the bottom of each item. Anyway, one thing I have found out about manipulating my own filter bubble is that linklog feeds and blogrolls are great inputs. So here's a linklog feed. (It's mirrored from the live site, which annoys everyone except me.)

Here are some actual links.

This might look funny: How I ran my kids like an Atlassian team for a month. But think about it for a minute. Someone at every app or site your kids use is doing the same thing, and their goals don't include "Dignity and Respect" or "Hard Work Smart Work".

Global network of 'hunters' aim to take down terrorists on the internet It took me a few days to figure things out and after a few weeks I was dropping accounts like flies…

Google's been running a secret test to detect bogus ads — and its findings should make the industry nervous. (This is a hella good idea. Legit publishers could borrow it: just go ad-free for a few minutes at random, unannounced, a couple of times a week, then send the times straight to CMOs. Did you buy ads that someone claimed ran on our site at these times? Well, you got played.)

For an Inclusive Culture, Try Working Less As I said, to this day, my team at J.D. Edwards was the most diverse I’ve ever worked on....Still, I just couldn’t get over that damned tie.

The Al Capone theory of sexual harassment Initially, the connection eluded us: why would the same person who made unwanted sexual advances also fake expense reports, plagiarize, or take credit for other people’s work?

Jon Tennant - The Cost of Knowledge But there’s something much more sinister to consider; recently a group of researchers saw fit to publish Ebola research in a ‘glamour magazine’ behind a paywall; they cared more about brand association than the content. This could be life-saving research, why did they not at least educate themselves on the preprint procedure....

Twitter Is Still Dismissing Harassment Reports And Frustrating Victims

This Is How Your Fear and Outrage Are Being Sold for Profit (Profit? What about TEH LULZ??!?!1?)

Fine, have some cute animal photos, I was done with the other stuff anyway: Photographer Spends Years Taking Adorable Photos of Rats to Break the Stigma of Rodents

the other dude

22 July 2017

Making the rounds, this is a fun one: A computer was asked to predict which start-ups would be successful. The results were astonishing.

  • 2014: When there's no other dude in the car, the cost of taking an Uber anywhere becomes cheaper than owning a vehicle. So the magic there is, you basically bring the cost below the cost of ownership for everybody, and then car ownership goes away.

  • 2018 (?): When there's no other dude in the fund, the cost of financing innovation anywhere becomes cheaper than owning a portfolio of public company stock. So the magic there is, you basically bring the transaction costs of venture capital below the cost of public company ownership for everybody, and then public companies go away.

Could be a thing for software/service companies faster than we might think. Futures contracts on bugs→equity crowdfunding and pre-sales of tokens→bot-managed follow-on fund for large investors.

Stupid ideas department

18 July 2017

Here's a probably stupid idea: give bots the right to accept proposed changes to a software project. Can automation encourage less burnout-provoking behavior?

A set of bots could interact in interesting ways.

  • Regression-test-bot: If a change only adds a test, applies cleanly to both the current version and to a previous version, and the previous version passses the test, accept it, even if the test fails for the current version.

  • Harmless-change-bot: If a change is below a certain size, does not modify existing tests, and all tests (including any new ones) pass, accept it.

  • Revert-bot: If any tests are failing on the current version, and have been failing for more than a certain amount of time, revert back to a version that passes.

Would more people write regression tests for their issues if they knew that a bot would accept them? Or say that someone makes a bad change but gets it past harmless-change-bot because no existing test covers it. No lengthy argument needed. Write a regression test and let regression-test-bot and revert-bot team up to take care of the problem. In general, move contributor energy away from arguing with people and toward test writing, and reduce the size of the maintainer's to-do list.

Playing for third place

17 July 2017

Just tried a Twitter advertising trick that a guy who goes by "weev" posted two years ago.

It still works.

They didn't fix it.

Any low-budget troll who can read that old blog post and come up with a valid credit card number can still do it.

Maybe Twitter is a bad example, but the fast-moving nationalist right wing manages to outclass its opponents on other social marketing platforms, too. Facebook won't even reveal how badly they got played in 2016. They thought they were putting out cat food for cute Internet kittens, but the rats ate it.

This is not new. Right-wing shitlords, at least the best of them, are the masters of database marketing. They absolutely kill it, and they have been ever since Marketing as we know it became a thing. Some good examples:

All the creepy surveillance marketing stuff they're doing today is just another set of tools in an expanding core competency.

Every once in a while you get an exception. The environmental movement became a direct mail operation in response to Interior Secretary James G. Watt, who alarmed environmentalists enough that organizations could reliably fundraise with direct mail copy quoting from Watt's latest speech. And the Democrats tried that "Organizing for America" thing for a little while, but, man, their heart just wasn't in it. They dropped it like a Moodle site during summer vacation. Somehow, the creepier the marketing, the more it skews "red". The more creativity involved, the more it skews "blue" (using the USA meanings of those colors.) When we make decisions about how much user surveillance we're going to allow on a platform, we're making a political decision.

Anyway. News Outlets to Seek Bargaining Rights Against Google and Facebook.

The standings so far.

  1. Shitlords and fraud hackers

  2. Adtech and social media bros

  3. NEWS SITES HERE (?)

News sites want to go to Congress, to get permission to play for third place in their own business? You want permission to bring fewer resources and less experience to a surveillance marketing game that the Internet companies are already losing?

We know the qualities of a medium that you win by being creepier, and we know the qualities of a medium that you can win with reputation and creativity. Why waste time and money asking Congress for the opportunity to lose, when you could change the game instead?

Maybe achieving balance in political views depends on achieving balance in business model. Instead of buying in to the surveillance marketing model 100%, and handing an advantage to one side, maybe news sites should help users control what data they share in order to balance competing political interests.

Smart futures contracts on software issues talk, and bullshit walks?

14 July 2017

Previously: Benkler’s Tripod, transactions from a future software market, more transactions from a future softwware market

Owning "equity" in an outcome

John Robb: Revisiting Open Source Ventures:

Given this, it appears that an open source venture (a company that can scale to millions of worker/owners creating a new economic ecosystem) that builds massive human curated databases and decentralizes the processing load of training these AIs could become extremely competitive.

But what if the economic ecosystem could exist without the venture? Instead of trying to build a virtual company with millions of workers/owners, build a market economy with millions of participants in tens of thousands of projects and tasks? All of this stuff scales technically much better than it scales organizationally—you could still be part of a large organization or movement while only participating directly on a small set of issues at any one time. Instead of holding equity in a large organization with all its political risk, you could hold a portfolio of positions in areas where you have enough knowledge to be comfortable.

Robb's opportunity is in training AIs, not in writing code. The "oracle" for resolving AI-training or dataset-building contracts would have to be different, but the futures market could be the same.

The cheating project problem

Why would you invest in a futures contract on bug outcomes when the project maintainer controls the bug tracker?

And what about employees who are incentivized from both sides: paid to fix a bug but able to buy futures contracts (anonymously) that will let them make more on the market by leaving it open?

In order for the market to function, the total reputation of the project and contributors must be high enough that outside participants believe that developers are more motivated to maintain that reputation than to "take a dive" on a bug.

That implies that there is some kind of relationship between the total "reputation capital" of a project and the maximum market value of all the futures contracts on it.

Open source metrics

To put that another way, there must be some relationship between the market value of futures contracts on a project and the maximum reputation value of the project. So that could be a proxy for a difficult-to-measure concept such as "open source health."

Open source journalism

Hey, tickers to put into stories! Sparklines! All the charts and stuff that finance and sports reporters can build stories around!

Blind code reviews experiment

13 July 2017

In case you missed it, here's a study that made the rounds earlier this year: Gender differences and bias in open source: Pull request acceptance of women versus men:

This paper presents the largest study to date on gender bias, where we compare acceptance rates of contributions from men versus women in an open source software community. Surprisingly, our results show that women's contributions tend to be accepted more often than men's. However, women's acceptance rates are higher only when they are not identifiable as women.

A followup, from Alice Marshall, breaks out the differences between acceptance of "insider" and "outsider" contributions.

For outsiders, women coders who use gender-neutral profiles get their changes accepted 2.8% more of the time than men with gender-neutral profiles, but when their gender is obvious, they get their changes accepted 0.8% less of the time.

We decided to borrow the blind auditions concept from symphony orchestras for the open source experiments program.

The experiment, launching this month, will help reviewers who want to try breaking habits of unconscious bias (whether by gender or insider/outsider status) by concealing the name and email adddress of a code author during a review on Bugzilla. You'll be able to un-hide the information before submitting a review, if you want, in order to add a personal touch, such as welcoming a new contributor.

Built with the WebExtension development work of Tomislav Jovanovic ("zombie" on IRC), and the Bugzilla bugmastering of Emma Humphries. For more info, see the Bugzilla bug discussion.

Data collection

The extension will "cc" one of two special accounts on a bug, to indicate if the review was done partly or fully blind. This lets us measure its impact without having to make back-end changes to Bugzilla.

(Yes, WebExtensions let you experiment with changing a user's experience of a site without changing production web applications or content sites. Bonus link: FilterBubbler.)

Coming soon

A first release is on a.m.o., here: Blind Reviews BMO Experiment, if you want an early look. We'll send out notifications to relevant places when the "last" bugs are fixed and it's ready for daily developer use.

Two approaches to adfraud, and some good news

07 July 2017

Adfraud is a big problem, and we keep seeing two basic approaches to it.

Flight to quality: Run ads only on trustworthy sites. Brands are now playing the fraud game with the "reputation coprocessors" of the audience's brains on the brand's side. (Flight to quality doesn't mean just advertise on the same major media sites as everyone else—it can scale downward with, for example, the Project Wonderful model that lets you choose sites that are "brand safe" for you.)

Increased surveillance: Try to fight adfraud by continuing to play the game of trying to get big-money impressions from the cheapest possible site, but throw more tracking at the problem. Biggest example of this is to move ad money to locked-down mobile platforms and away from the web.

The problem with the second approach is that the audience is no longer on the brand's side. Trying to beat adfraud with technological measures is just challenging hackers to a series of hacking contests. And brands keep losing those. Recent news: The Judy Malware: Possibly the largest malware campaign found on Google Play.

Anyway, I'm interested in and optimistic about the results of the recent Mozilla/Caribou Digital report. It turns out that USA-style adtech is harder to do in countries where users are (1) less accurately tracked and (2) equipped with blockers to avoid bandwidth-sucking third-party ads. That's likely to mean better prospects for ad-supported news and cultural works, not worse. This report points out the good news that the so-called adtech tax is lower in developing countries—so what kind of ad-supported businesses will be enabled by lower "taxes" and "reinvention, not reinsertion" of more magazine-like advertising?

Of course, working in those markets is going to be hard for big US or European ad agencies that are now used to solving problems by throwing creepy tracking at them. But the low rate of adtech taxation sounds like an opportunity for creative local agencies and brands. Maybe the report should have been called something like "The Global South is Shitty-Adtech-Proof, so Brands Built Online There Are Going to Come Eat Your Lunch."

Applying proposed principles for content blocking

04 July 2017

(I work for Mozilla. None of this is secret. None of this is official Mozilla policy. Not speaking for Mozilla here.)

In 2015, Denelle Dixon at Mozilla wrote Proposed Principles for Content Blocking.

The principles are:

  • Content Neutrality: Content blocking software should focus on addressing potential user needs (such as on performance, security, and privacy) instead of blocking specific types of content (such as advertising).

  • Transparency & Control: The content blocking software should provide users with transparency and meaningful controls over the needs it is attempting to address.

  • Openness: Blocking should maintain a level playing field and should block under the same principles regardless of source of the content. Publishers and other content providers should be given ways to participate in an open Web ecosystem, instead of being placed in a permanent penalty box that closes off the Web to their products and services.

See also Nine Principles of Policing by Sir Robert Peel, who wrote,

[T]he police are the public and that the public are the police, the police being only members of the public who are paid to give full-time attention to duties which are incumbent on every citizen in the interests of community welfare and existence.

Web browser developers have similar responsibilities to those of Peel's ideal police: to build a browser to carry out the user's intent, or, when setting defaults, to understand widely held user norms and implement those, while giving users the affordances to change the defaults if they choose.

The question now is how to apply content blocking principles to today's web environment. Some qualities of today's situation are:

  • Tracking protection often doesn't have to be perfect, because adfraud. The browser can provide some protection, and influence the market in a positive direction, just by getting legit users below the noise floor of fraudbots.

  • Tracking protection has the potential to intensify a fingerprinting arms race that's already going on, by forcing more adtech to rely on fingerprinting in place of third-party cookies.

  • Fraud is bad, but not all anti-fraud is good. Anti-fraud technologies that track users can create the same security risks as other tracking—and enable adtech to keep promising real eyeballs on crappy sites. The "flight to quality" approach to anti-fraud does not share these problems.

  • Adtech and adfraud can peek at Mozilla's homework, but Mozilla can't see theirs. Open source projects must rely on unpredictable users, not unpredictable platform decisions, to create uncertainty.

Which suggests a few tactics—low-risk ways to apply content blocking principles to address today's adtech/adfraud problems.

Empower WebExtensions developers and users. Much of the tracking protection and anti-fingerprinting magic in Firefox is hidden behind preferences. This makes a lot of sense because it enables developers to integrate their work into the browser in parallel with user testing, and enables Tor Browser to do less patching. IMHO this work is also important to enable users to choose their own balance between privacy/security and breaking legacy sites.

Inform and nudge users who express an interest in privacy. Some users care about privacy, but don't have enough information about how protection choices match up with their expectations. If a user cares enough to turn on Do Not Track, change cookie settings, or install an ad blocker, then try suggesting a tracking protection setting or tool. Don't assume that just because a user has installed an ad blocker with deceptive privacy settings that the user would not choose privacy if asked clearly.

Understand and report on adfraud. Adfraud is more than just fake impressions and clicks. New techniques include attribution fraud: taking advantage of tracking to connect a bogus ad impression to a real sale. The complexity of attribution models makes this hard to track down. (Criteo and Steelhouse settled a lawsuit about this before discovery could reveal much.)

A multi-billion-dollar industry is devoted to spreading a story that minimizes adfraud, while independent research hints at a complex and lucrative adfraud scene. Remember how there were two Methbot stories: Methbot got a bogus block of IP addresses, and Methbot circumvented some widely used anti-fraud scripts. The ad networks dealt with the first one pretty quickly, but the second is still a work in progress.

The more that Internet freedom lovers can help marketers understand adfraud, and related problems such as brand-unsafe ad placements, the more that the content blocking story can be about users, legit sites, and brands dealing with problem tracking, and not just privacy nerds against all web business.

transactions from a future software market

04 July 2017

More on the third connection in Benkler’s Tripod, which was pretty general. This is just some notes on more concrete examples of how new kinds of direct connections between markets and peer production might work in the future.

Smart contracts should make it possible to enable these in a trustworthy, mostly decentralized, way.

Feature request I want emoji support on my blog, so I file, or find, a wishlist bug on the open source blog package I use: "Add emoji support." I then offer to enter into a smart contract that will be worthless to me if the bug is fixed on September 1, or give me my money back if the bug is unfixed at that date.

A developer realizes that fixing the bug would be easy, and wants to do it, so takes the other side of the contract. The developer's side will expire worthless if the bug is unfixed, and pay out if the bug is fixed.

"Unfixed" results will probably include bugs that are open, wontfix, invalid, or closed as duplicate of a bug that is still open.

"Fixed" results will include bugs closed as fixed, or any bug closed as a duplicate of a bug that is closed as fixed.

If the developer fixes the bug, and its status changes to fixed, then I lose money on the smart contract but get the feature I want. If the bug status is still unfixed, then I get my money back.

So far this is just one user paying one developer to write a feature. Not especially exciting. There is some interesting market design work to be done here, though. How can the developer signal serious interest in working on the bug, and get enough upside to be meaningful, without taking too much risk in the event the fix is not accepted on time?

Arbitrage I post the same offer, but another user realizes that the blog project can only support emoji if the template package that it depends on supports them. That user becomes an arbitrageur: takes the "fixed" side of my offer, and the "unfixed" side of the "Add emoji support" bug in the template project.

As an end user, I don't have to know the dependency relationship, and the market gives the arbitrageur an incentive to collect information about multiple dependent bugs into the best place to fix them.

Front-running Dudley Do-Right's open source project has a bug in it, users are offering to buy the "unfixed" side of the contract in order to incentivize a fix, and a trader realizes that Dudley would be unlikely to let the bug go unfixed. The trader takes the "fixed" side of the contract before Dudley wakes up. The deal means that the market gets information on the likelihood of the bug being fixed, but the developer doing the work does not profit from it.

This is a "picking up nickels in front of a steamroller" trading strategy. The front-runner is accepting the risk of Dudley burning out, writing a long Medium piece on how open source is full of FAIL, and never fixing a bug again.

Front-running game theory could be interesting. If developers get sufficiently annoyed by front-running, they could delay fixing certain bugs until after the end of the relevant contracts. A credible threat to do this might make front-runners get out of their positions at a loss.

CVE prediction A user of a static analysis tool finds a suspicious pattern in a section of a codebase, but cannot identify a specific vulnerability. The user offers to take one side of a smart contract that will pay off if a vulnerability matching a certain pattern is found. A software maintainer or key user can take the other side of these contracts, to encourage researchers to disclose information and focus attention on specific areas of the codebase.

Security information leakage Ernie and Bert discover a software vulnerability. Bert sells it to foreign spies. Ernie wants to get a piece of the action, too, but doesn't want Bert to know, so he trades on a relevant CVE prediction. Neither Bert nor the foreign spies know who is making the prediction, but the market movement gives white-hat researchers a clue on where the vulnerability can be found.

Open source metrics: Prices and volumes on bug futures could turn out to be a more credible signal of interest in a project than raw activity numbers. It may be worth using a bot to trade on a project you depend on, just to watch the market move. Likewise, new open source metrics could provide useful trading strategies. If sentiment analysis shows that a project is melting down, offer to take the "unfixed" side of the project's long-running bugs? (Of course, this is the same market action that incentivizes fixes, so betting that a project will fail is the same thing as paying them not to. My brain hurts.)

What's an "oracle"?

The "oracle" is the software component that moves information from the bug tracker to the smart contracts system. Every smart contract has to be tied to a given oracle that both sides trust to resolve it fairly.

For CVE prediction, the oracle is responsible for pattern matching on new CVEs, and feeding the info into the smart contract system. As with all of these, CVE prediction contracts are tied to a specific oracle.

Bots

Bots might have several roles.

  • Move investments out of duplicate bugs. (Take a "fixed" position in the original and an "unfixed" position in the duplicate, or vice versa.)

  • Make small investments in bugs that appear valid based on project history and interactions by trusted users.

  • Track activity across projects and social sites to identify qualified bug fixers who are unlikely to fix a bug within the time frame of a contract, and take "unfixed" positions on bugs relevant to them.

  • For companies: when a bug is mentioned in an internal customer support ticketing system, buy "unfixed" on that bug. Map confidential customer needs to possible fixers.

more transactions from a future software market

04 July 2017

Previously: Benkler’s Tripod, transactions from a future software market

Why would you want the added complexity of a market where anyone can take either side of a futures contract on the status of a software bug, and not just offer to pay people to fix bugs like a sensible person? IMHO it's worth trying not just because of the promise of lower transaction costs and more market liquidity (handwave) but because it enables other kinds of transactions. A few more.

Partial work I want a feature, and buy the "unfixed" side of a contract that I expect to lose. A developer decides to fix it, does the work, and posts a pull request that would close the bug. But the maintainer is on vacation, leaving her pull request hanging with a long comment thread. Another developer is willing to take on the political risk of merging the work, and buys out the original developer's position.

Prediction/incentivization With the right market design, a prediction that something won't happen is the same as an incentive to make it happen. If we make an attractive enough way for users to hedge their exposure to lack of innovation, we create a pool of wealth that can be captured by innovators. (Related: dominant assurance contracts)

Bug triage Much valuable work on bugs is in the form of modifying metadata: assigning a bug to the correct subsystem, identifying dependency relationships, cleaning up spam, and moving invalid bugs into a support ticket tracker or forum. This work is hard to reward, and infamously hard to find volunteers for. An active futures market could include both bots that trade bugs probabilistically based on status and activity, and active bug triagers who make small market gains from modifying metadata in a way that makes them more likely to be resolved.

Software: annoying speech or crappy product?

03 July 2017

Zeynep Tufekci, in the New York Times:

Since most software is sold with an “as is” license, meaning the company is not legally liable for any issues with it even on day one, it has not made much sense to spend the extra money and time required to make software more secure quickly.

The software business is still stuck on the kind of licensing that might have made sense in the 8-bit micro days, when "personal computer productivity" was more aspirational than a real thing, and software licenses were printed on the backs of floppy sleeves.

Today, software is part of products that do real stuff, and it makes zero sense to ship a real product, that people's safety or security depends on, with the fine print "WE RESERVE THE RIGHT TO TOTALLY HALF-ASS OUR JOBS" or in business-speak, "SELLER DISCLAIMS THE IMPLIED WARRANTY OF MERCHANTABILITY."

But what about open source and collaboration and science, and all that stuff? Software can be both "product" and "speech". Should there be a warranty on speech? If I dig up my shell script for re-running the make command when a source file changes, and put it on the Internet, should I be putting a warranty on it?

It seems that there are two kinds of software: some is more product-like, and should have a grown-up warranty on it like a real busines. And some software is more speech-like, and should have ethical requirements like a scientific paper, but not a product-like warranty.

What's the dividing line? Some ideas.

"productware is shipped as executables, freespeechware is shipped as source code" Not going to work for elevator_controller.php or a home router security tool written in JavaScript.

"productware is preinstalled, freespeechware is downloaded separately" That doesn't make sense when even implanted defibrillators can update over the net.

"productware is proprietary, freespeechware is open source" Companies could put all the fragile stuff in open source components, then use the DMCA and CFAA to enable them to treat the whole compilation as proprietary.

Software companies are built to be good at getting around rules. If a company can earn all its money in faraway Dutch Sandwich Land and be conveniently too broke to pay the IRS in the USA, then it's going to be hard to make it grow up licensing-wise without hurting other people first.

How about splitting out the legal advantages that the government offers to software and extending some to productware, others to freespeechware?

Freespeechware licenses

  • license may disclaim implied warranty

  • no anti-reverse-engineering clause in a freespeechware license is enforceable

  • freespeechware is not a "technological protection measure" under section 1201 of Title 17 of the United States Code (DMCA anticircumvention)

  • exploiting a flaw in freespeechware is never a violation of the Computer Fraud and Abuse Act

  • If the license allows it, a vendor may sell freespeechware, or a derivative work of it, as productware. (This could be as simple as following the You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. term of the GPL.)

Productware licenses:

  • license may not disclaim implied warranty

  • licensor and licensee may agree to limit reverse engineering rights

  • DMCA and CFAA apply (reformed of course, but that's another story)

It seems to me that there needs to be some kind of quid pro quo here. If a company that sells software wants to use government-granted legal powers to control its work, that has to be conditioned on not using those powers just to protect irresponsible releases.

Fun with dlvr.it

23 June 2017

Check it out—I'm "on Facebook" again. Just fixed my gateway through dlvr.it. If you're reading this on Facebook, that's why.

Dlvr.it is a nifty service that will post to social sites from an RSS feed. If you don't run your own linklog feed, the good news is that Pocket will generate RSS feeds from the articles you save, so if you want to share links with people still on Facebook, the combination of Pocket and dlvr.it makes that easy to do without actually spending human eyeball time there.

There's a story about Thomas Nelson, Jr., leader of the Virginia Militia in the Revolutionary War.

During the siege and battle Nelson led the Virginia Militia whom he had personally organized and supplied with his own funds. Legend had it that Nelson ordered his artillery to direct their fire on his own house which was occupied by Cornwallis, offering five guineas to the first man who hit the house.

Would Facebook's owners do the same, now that we know that foreign interests use Facebook to subvert America? Probably not. The Nelson story is just an unconfirmed patriotic anecdote, and we can't expect that kind of thing from today's post-patriotic investor class. Anyway, just seeing if I can move Facebook's bots/eyeballs ratio up a little.

1. Write open source. 2. ??? 3. PROFIT

22 June 2017

Studies keep showing that open source developers get paid more than people who develop software but do not contribute to open source.

Good recent piece: Tabs, spaces and your salary - how is it really? by Evelina Gabasova.

But why?

Is open source participation a way to signal that you have skills and are capable of cooperation with others?

Is open source a way to build connections and social capital so that you have more awareness of new job openings and can more easily move to a higher-paid position?

Does open source participation just increase your skills so that you do better work and get paid more for it?

Are open source codebases a complementary good to open source maintenance programming, so that a lower price for access to the codebase tends to drive up the price for maintenance programming labor?

Is "we hire open source people" just an excuse for bias, since the open source scene at least in the USA is less diverse than the general pool of programming job applicants?

Stuff I'm thankful for

22 June 2017

I'm thankful that the sewing machine was invented a long time ago, not today. If the sewing machine were invented today, most sewing tutorials would be twice as long, because all the thread would come in proprietary cartridges, and you would usually have to hack the cartridge to get the type of thread you need in a cartridge that works with your machine.

Catching up to Safari?

21 June 2017

Earlier this month, Apple Safari pulled ahead of other mainstream browsers in tracking protection. Tracking protection in the browser is no longer a question of should the browser do it, but which browser best protects its users. But Apple's early lead doesn't mean that another browser can't catch up.

Tracking protection is still hard. You have to provide good protection from third-party tracking, which users generally don't want, without breaking legit third-party services such as content delivery networks, single sign-on systems, and shopping carts. Protection is a balance, similar to the problem of filtering spam while delivering legit mail. Just as spam filtering helps enable legit email marketing, tracking protection tends to enable legit advertising that supports journalism and cultural works.

In the long run, just as we have seen with spam filters, it will be more important to make protection hard to predict than to run the perfect protection out of the box. A spam filter, or browser, that always does the same thing will be analyzed and worked around. A mail service that changes policies to respond to current spam runs, or an unpredictable ecosystem of tracking protection add-ons that browser users can install in unpredictable combinations, is likely to be harder.

But most users aren't in the habit of installing add-ons, so browsers will probably have to give them a nudge, like Microsoft Windows does when it nags the user to pick an antivirus package (or did last time I checked.) So the decentralized way to catch up to Apple could end up being something like:

  • When new tracking protection methods show up in the privacy literature, quietly build the needed browser add-on APIs to make it possible for new add-ons to implement them.

  • Do user research to guide the content and timing of nudges. (Some atypical users prefer to be tracked, and should be offered a chance to silence the warnings by affirmatively choosing a do-nothing protection option.)

  • Help users share information about the pros and cons of different tools. If a tool saves lots of bandwidth and battery life but breaks some site's comment form, help the user make the right choice.

  • Sponsor innovation challenges to incentivize development, testing, and promotion of diverse tracking protection tools.

Any surveillance marketer can install and test a copy of Safari, but working around an explosion of tracking protection tools would be harder. How to set priorities when they don't know which tools will get popular?

What about adfraud?

Tracking protection strategies have to take adfraud into account. Marketers have two choices for how to deal with adfraud:

  • flight to quality

  • extra surveillance

Flight to quality is better in the long run. But it's a problem from the point of view of adtech intermediaries because it moves more ad money to high-reputation sites, and the whole point of adtech is to reach big-money eyeballs on cheap sites. Adtech firms would rather see surveillance-heavy responses to adfraud. One way to help shift marketing budgets away from surveillance, and toward flight to quality, is to make the returns on surveillance investments less predictable.

This is possible to do without making value judgments about certain kinds of sites. If you like a site enough to let it see your personal info, you should be able to do it, even if in my humble opinion it's a crappy site. But you can have this option without extending to all crappy sites the confidence that they'll be able to live on leaked data from unaware users.

Apple's kangaroo cookie robot

11 June 2017

I'm looking forward to trying "Intelligent Tracking Prevention" in Apple Safari. But first, let's watch an old TV commercial for MSN.

Today, a spam filter seems like a must-have feature for any email service. But MSN started talking about its spam filtering back when Sanford Wallace, the "Spam King," was saying stuff like this.

I have to admit that some people hate me, but I have to tell you something about hate. If sending an electronic advertisement through email warrants hate, then my answer to those people is "Get a life. Don't hate somebody for sending an advertisement through email." There are people out there that also like us.

According to spammers, spam filtering was just Internet nerds complaining about something that regular users actually like. But the spam debate ended when big online services, starting with MSN, started talking about how they build for their real users instead of for Wallace's hypothetical spam-loving users.

If you missed the email spam debate, don't worry. Wallace's talking points about spam filters constantly get recycled by surveillance marketers talking about tracking protection. But now it's not email spam that users supposedly crave. Today, the Interactive Advertising Bureau tells us that users want ads that "follow them around" from site to site.

Enough background. Just as the email spam debate ended with MSN's campaign, the third-party web tracking debate ended on June 5, 2017.

With Intelligent Tracking Prevention, WebKit strikes a balance between user privacy and websites’ need for on-device storage. That said, we are aware that this feature may create challenges for legitimate website storage, i.e. storage not intended for cross-site tracking.

If you need it in bullet points, here it is.

  • Nifty machine learning technology is coming in on the user's side.

  • "Legitimate" uses do not include cross-site tracking.

  • Safari's protection is automatic and client-side, so no blocklist politics.

Surveillance marketers come up with all kinds of hypothetical reasons why users might prefer targeted ads. But in the real world, Apple invests time and effort to understand user experience. When Apple communicates about a feature, it's because that feature is likely to keep a user satisfied enough to buy more Apple devices. We can't read their confidential user research, but we can see what the company learned from it based on how they communicate about products.

(Imagine for a minute that Apple's user research had found that real live users are more like the Interactive Advertising Bureau's idea of a user. We might see announcements more like "Safari automatically shares your health and financial information with brands you love!" Anybody got one of those to share?)

Saving an out-of-touch ad industry

Advertising supports journalism and cultural works that would not otherwise exist. It's too important not to save. Bob Hoffman asks,

[H]ow can we encourage an acceptable version of online advertising that will allow us to enjoy the things we like about the web without the insufferable annoyance of the current online ad model?

The browser has to be part of the answer. If the browser does its job, as Safari is doing, it can play a vital role in re-connecting users with legit advertising—just as users have come to trust legit email newsletters now that they have effective spam filters.

Safari's Intelligent Tracking Prevention is not the final answer any more than Paul Graham's "A plan for spam" was the final spam filter. Adtech will evade protection tools just as spammers did, and protection will have to keep getting better. But at least now we can finally say debate over, game on.

With New Browser Tech, Apple Preserves Privacy and Google Preserves Trackers

An Ad Network That Works With Fake News Sites Just Launched An Anti–Fake News Initiative

Google Slammed For Blocking Ads While Allowing User Tracking

Introducing FilterBubbler: A WebExtension built using React/Redux

Forget far-right populism – crypto-anarchists are the new masters

Risks to brands under new EU regulations

Breitbart ads plummet nearly 90 percent in three months as Trump’s troubles mount

Be Careful Celebrating Google’s New Ad Blocker. Here’s What’s Really Going On.

‘We know the industry is a mess’: Marketers share challenges at Digiday Programmatic Marketing Summit

FIREBALL – The Chinese Malware of 250 Million Computers Infected

Verified bot laundering 2. Not funny. Just die

Publisher reliance on tech providers is ‘insane’: A Digiday+ town hall with The Washington Post’s Jarrod Dicker

Why pseudonymization is not the silver bullet for GDPR.

A level playing field for companies and consumers

Apple user research revealed, sort of

06 June 2017

This is not normally the blog to come to for Apple fan posts (my ThinkPad, desktop Linux, cold dead hands, and so on) but really good work here on "Intelligent Tracking Prevention" in Apple Safari.

Looks like the spawn of Privacy Badger and cookie double-keying, designed to balance user protection from surveillance marketing with minimal breakage of sites that depend on third-party resources.

(Now all the webmasters will fix stuff to make it work with Intelligent Tracking Prevention, which makes it easier for other browsers and privacy tools to justify their own features to protect users. Of course, now the surveillance marketers will rely more on passive fingerprinting, and Apple has an advantage there because there are fewer different Safari-capable devices. But browsers need to fix fingerprinting anyway.)

Apple does massive amounts of user research and it's fun to watch the results leak through when they communicate about features. Looks like they have found that users care about being "followed" from site to site by ads, and that users are still pretty good at applied behavioral economics. The side effect of tracking protection, of course, is that it takes high-reputation sites out of competition with the bottom-feeders to reach their own audiences, so Intelligent Tracking Prevention is great news for publishers too.

Meanwhile, I don't get Google's weak "filter" thing. Looks like a transparently publisher-hostile move (since it blocks some potentially big-money ads without addressing the problem of site commodification), unless I'm missing something.

The third connection in Benkler's Tripod

31 May 2017

Here's a classic article by Yochai Benkler: Coase's Penguin, or Linux and the Nature of the Firm.

Benkler builds on the work of Ronald Coase, whose The Nature of the Firm explains how transaction costs affect when companies can be more efficient ways to organize work than markets. Benkler adds a third organizational model, peer production. Peer production, commonly seen in open source projects, is good at matching creative people to rewarding problems.

As peer production relies on opening up access to resources for a relatively unbounded set of agents, freeing them to define and pursue an unbounded set of projects that are the best outcome of combining a particular individual or set of individuals with a particular set of resources, this open set of agents is likely to be more productive than the same set could have been if divided into bounded sets in firms.

Firms, markets, and peer production all have their advantages, and in the real world, most productive activity is mixed.

  • Managers in firms manage some production directly and trade in markets for other production. This connection in the firms/markets/peer production tripod is as old as firms.

  • The open source software business is the second connection. Managers in firms both manage software production directly and sponsor peer production projects, or manage employees who participate in projects.

But what about the third possible connection between legs of the tripod? Is it possible to make a direct connection between peer production and markets, one that doesn't go through firms? And why would you want to connect peer production directly to markets in the first place? Not just because that's where the money is, but because markets are a good tool for getting information out of people, and projects need information. Stefan Kooths, Markus Langenfurth, and Nadine Kalwey wrote, in "Open-Source Software: An Economic Assessment" (PDF),

Developers lack key information due to the absence of pricing in open-source software. They do not have information concerning customers’ willingness to pay (= actual preferences), based on which production decisions would be made in the market process. Because of the absence of this information, supply does not automatically develop in line with the needs of the users, which may manifest itself as oversupply (excessive supply) or undersupply (excessive demand). Furthermore, the functional deficits in the software market also work their way up to the upstream factor markets (in particular, the labor market for developers) and–depending on the financing model of the open-source software development–to the downstream or parallel complementary markets (e.g., service markets) as well.

Because the open-source model at its core deliberately rejects the use of the market as a coordination mechanism and prevents the formation of price information, the above market functions cannot be satisfied by the open-source model. This results in a systematic disadvantage in the provision of software in the open-source model as compared to the proprietary production process.

The workaround is to connect peer production to markets by way of firms. But the more that connections between markets and peer production projects have to go through firms, the more chances to lose information. That's not because firms are necessarily dysfunctional (although most are, in different ways). A firm might rationally choose to pay for the implementation of a feature that they predict will get 100 new users, paying $5000 each, instead of a feature that adds $1000 of value for 1000 existing users, but whose absence won't stop them from renewing.

Some ways to connect peer production to markets are already working. Crowdfunding for software projects and Patreon are furthest along, both offering support for developers who have already built a reputation.

A decentralized form of connection is Tokens, which Balaji S. Srinivasan describes as a tradeable version of API keys. If I believe that your network service will be useful to me in the future, I can pre-buy access to it. If I think your service will really catch on, I can buy a bunch of extra tokens and sell them later, without needing to involve you. (and if your service needs network effects, now I have an incentive to promote it, so that there will be a seller's market for the tokens I hold.)

Dominant assurance contracts, by Alexander Tabarrok, build on the crowdfunding model, with the extra twist that the person proposing the project has to put up some seed money that is divided among backers if the project fails to secure funding. This is supposed to bring in extra investment early on, before a project looks likely to meet its goal.

Tom W. Bell's "SPEX", in Prediction Markets for Promoting the Progress of Sciences and the Useful Arts, is a proposed market to facilitate transactions in a variety of prediction certificates, each one of which promises to pay its bearer in the event that an associated claim about science, technology, or public policy comes true. The SPEX looks promising as a way for investors to hedge their exposure to lack of innovation. If you own data centers and need energy, take a short position in SPEX contracts on cold fusion. (Or, more likely, buy into a SPEX fund that invests for your industry.) The SPEX looks like a way to connect the market to more difficult problems than the kinds of incremental innovation that tend to be funded through the VC system.

What happens when the software industry is forced to grow up?

I'm starting to think that finishing the tripod, with better links from markets to peer production, is going to matter a lot more soon, because of the software quality problem.

Today's software, both proprietary and open source, is distributed under ¯\_(ツ)_/¯ terms. "Disclaimer of implied warranty of merchantability" is lawyer-speak for "we reserve the right to half-ass our jobs lol." As Zeynep Tufekci wrote in the New York Times, "The World Is Getting Hacked. Why Don’t We Do More to Stop It?" At some point the users are going to get fed up, and we're going to have to. An industry as large and wealthy as software, still sticking to Homebrew Computer Club-era disclaimers, is like a 40-something-year-old startup bro doing crimes and claiming that they're just boyish hijinks. This whole disclaimer of implied warranty thing is making us look stupid, people. (No, I'm not for warranties on software that counts as a scientific or technical communication, or on bona fide collaborative development, but on a product product? Come on.)

Grown-up software liability policy is coming, but we're not ready for it. Quality software is not just a technically hard problem. Today, we're set up to move fast, break things, and ship dancing pigs—with incentives more powerful than incentives to build secure software. Yes, you get the occasional DARPA initiative or tool to facilitate incremental cleanup, but most software is incentivized through too many layers of principal-agent problems. Everything is broken.

If governments try to fix software liability before the software scene can fix the incentives problem, then we will end up with a stifled, slowed-down software scene, a few incumbent software companies living on regulatory capture, and probably not much real security benefit for users. But what if users (directly or through their insurance companies) are willing to pay to avoid the costs of broken software, in markets, and open source developers are willing to participate in peer production to make quality software, but software firms are not set up to connect them?

What if there is another way to connect the "I would rather pay a little more and not get h@x0r3d!" demand to the "I would code that right and release it in open source, if someone would pay for it" supply?

User tracking as Chesterton's Fence

30 May 2017

G.K. Chesterton once wrote

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.

Bob Hoffman makes a good case for getting rid of user tracking in web advertising. But in order to take the next steps, and not just talk among ourselves about things that would be really great in the future, we first need to think about the needs that tracking seems to satisfy for legit marketers.

What I'm not going to do is pull out the argument that's in every first comment on every blog post that criticizes tracking: that "adtech" is just technology and is somehow value-neutral. Tracking, like all technologies, enables some kinds of activity better than others. When tracking offers marketers the opportunity to reach users based on who the user is rather than on what they're reading, watching, or listening to, then that means:

But if tracking is so bad, then why, when you go to any message board or Q&A site that discusses marketing for small businesses, is everyone discussing those nasty, potentially civilization-extinguishing targeted ads? Why is nobody popping up with a question on how to make the next They Laughed When I Sat Down At the Piano?

  • Targeted ads are self-serve and easy to get started with. If you have never bought a Twitter or Facebook ad, get out your credit card and start a stopwatch. These ads might be crappy, but they have the lowest time investment of any legit marketing project, so probably the only marketing project that time-crunched startups can do.

  • Targeted ads keep your OODA loop tight. Yes, running targeted ads can be addictive—If you thought the the attention slot machine game on social sites was bad, try the advertiser dashboard. But you're able to use them to learn information that can help with the rest of marketing. If you have the budget to exhibit at one conference, compare Twitter ads targeted to attendees of conference A with ads targeted to attendees of conference B, and you're closer to an answer.

  • Marketing has two jobs: sell stuff to customers and sell Marketing to management. Targeting is great for the second one, since it comes with the numbers that will help you take credit for results.

We're not going to be able to get rid of risky tracking until we can understand the needs that it fills, not just for big advertisers who can afford the time and money to show up in Cannes every year, but for the company founder who still has $1.99 business cards and is doing all of Marketing themselves.

(The party line among web privacy people can't just be that GDPR is going to save us because the French powers that be are all emmerdés ever since the surveillance/shitlord complex tried to run a US-style game on their political system. That might sound nice, but put not your trust in princes, man. Even the most arrogant Eurocrats in the world will not be able to regulate indefinitely against all the legit business people in their countries complaining that they can't do something they see as essential. GDPR will be temporary air cover for building an alternative, not a fix in itself.)

Post-creepy web advertising is still missing some key features.

  • Branding and signaling metrics. We know the hard math works out against tracking and targeting, and we know about the failure of targeted media to build brands in the long run, but we don't have good numbers that are usable day to day. The "customer journey" has nice graphs, but brand equity doesn't.

  • Quick, low-risk service. With the exception of the Project Wonderful model, targeted ads are quick and low-risk, while signal-carrying ads are the opposite. A high-overhead direct ad sales process is not a drop-in replacement for an easy web form.

I don't think that's all of them. But I don't think that the move to post-creepy web advertising is going to be a rush, all at once, either. Brands that have fly-by-night low-reputation competitors, brands that already have many tracking-protected customers, and brands with solid email lists are going to be able to move faster than marketers who are still making tracking work. More: Work together to fix web ads? Let's not.

sudo dnf install mosh

28 May 2017

I'm still two steps behind in devops coolness for my network stuff. I don't even have proper configuration management, and that's fine because Configuration Management is an Anti-pattern now. Anyway, I still log in and actually run shell commands on the server, and the LWN review of mosh was helpful to me. Now using mosh for connections that persist across suspending the laptop and moving it from network to network. More info: Mosh: the mobile shell

free riding on open source

Here's a good Twitter thread on open source projects and "free rider" companies. As far as I can tell, companies can pay for open source in three ways.

  • do software development

  • pay people to do software development

  • write a long Medium post apologizing to your users for failing

end date for IP Maximalism

When did serious "Intellectual Property Maximalism" end? I'm going to put it at September 18, 2006, which is the date that the Gates Foundation announced funding for the Public Library of Science's journal PLoS Neglected Tropical Diseases. When it's a serious matter of people's health, open access matters, even to the author of "Open Letter to Hobbyists". Since then, IP Maximalism stories have been mostly about rent-seeking behavior, which had been a big part of the freedom lovers's point all along. (Nobody quoted in this story is pearl-clutching about "innovation", for example: Supreme Court ruling threatens to shut down cottage industry for small East Texas town.)

random stuff

Just Keep Scrolling! How To Design Lengthy, Lengthy Pages is "sponsored content" but it's really good sponsored content.

The marketplace of ideas is now struggling with the increasing incidence of algorithmic manipulation and disinformation campaigns. There are bots. Look around.

(In other news, Facebook is still evil, but you probably knew that by now: Why Facebook's Authentication Model is Inadequate, Does Facebook Make Us Unhappy and Unhealthy?)

More links:

Some questions on a screenshot

27 May 2017

Here's a screenshot of an editorial from Der Spiegel, with Ghostery turned on.

article from Der Spiegel

Is it just me, or does it look to anyone else like the man in the photo is checking the list of third-party web trackers on the site to see who he can send a National Security Letter to?

Could a US president who is untrustworthy enough to be removed from office possibly be trustworthy enough to comply with his side of a "Privacy Shield" agreement?

If it's necessary for the rest of the world to free itself of its dependence on the U.S., does that apply to US-based Internet companies that have become a bottleneck for news site ad revenue, and how is that going to work?

Bonus links:

What happened to Twitter? We can't look away...

19 May 2017

Hey, everybody, check it out.

Here's a Twitter ad.

some dumb Twitter ad

If you're "verified" on Twitter, you probably miss these, so I'll just use my Fair Use rights to share that one with you.

You're welcome.

Twitter is a uniquely influential medium, one that shows up on the TV news every night and on news sites all day. But somehow, the plan to make money from Twitter is to run the same kind of targeted ads that anyone with a WordPress site can. And the latest Twitter news is a privacy update that includes, among other things, more tracking of users from one site to another. Yes, the same kind of thing that Facebook already does, and better, with more users. And the same kind of thing that any web site can already get from an entire Lumascape of companies. Boring.

If you want to stick this kind of ad on your WordPress site, you just have to cut and paste some ad network HTML—not build out a deluxe office space on Market Street in San Francisco the way Twitter has. But the result is about the same.

What makes Twitter even more facepalm-worthy is that they make a point of not showing the ads to the influential people who draw attention to Twitter to start with. It's like they're posting a big sign that says STUPID AD ZONE: UNIMPORTANT PEOPLE ONLY. Twitter is building something unique, but they're selling generic impressions that advertisers can get anywhere. So as far as I can tell, the Twitter business model is something like:

Money out: build something unique and expensive.

Money in: sell the most generic and shitty thing in the world.

Facebook can make this work because they have insane numbers of eyeball-minutes. Chump change per minute on Facebook still adds up to real money. But Facebook is an outlier on raw eyeball-minutes, and there aren't enough minutes in the day for another. So Twitter is on track to get sold for $500,000, like Digg was. Which is good news for me because I know enough Twitter users that I can get that kind of money together.

So why should you help me buy Twitter when you could just get the $500,000 yourself? Because I have a secret plan, of course. Twitter is the site that everyone is talking about, right? So run the ads that people will talk about. Here's the plan.

Sell one ad per day. And everybody sees the same one.

Sort of like the back cover of the magazine that everybody in the world reads (but there is no such magazine, so that's why this is an opportunity.) No more need to excuse the verified users from the ads. Yes, an advertiser will have to provide a variety of sizes and localizations for each ad (and yes, Twitter will have to check that the translations match). But it's the same essential ad, shown to every Twitter user in the world for 24 hours.

No point trying to out-Facebook Facebook or out-Lumascape the Lumascape. Targeted ads are weak on signal, and a bunch of other companies are doing them more cost-effectively and at higher volume, anyway.

Of course, this is not for everybody. It's for brands that want to use a memorable, creative ad to try for the same kind of global signal boost that a good Tweet® can get. But if you want generic targeted ads you can get those everywhere else on the Internet. Where else can you get signal? In order to beat current Twitter revenue, the One Twitter Ad needs to go for about the same price as a Super Bowl commercial. But if Twitter stays influential, that's reasonable, and I make back the 500 grand and a lot more.

Understanding the limitations of data pollution tools

02 May 2017

Jeremy Gillula and Yomna Nasser write, on the EFF blog,

Internet users have been asking what they can do to protect their own data from this creepy, non-consensual tracking by Internet providers—for example, directing their Internet traffic through a VPN or Tor. One idea to combat this that’s recently gotten a lot of traction among privacy-conscious users is data pollution tools: software that fills your browsing history with visits to random websites in order to add “noise” to the browsing data that your Internet provider is collecting.

...

[T]here are currently too many limitations and too many unknowns to be able to confirm that data pollution is an effective strategy at protecting one’s privacy. We’d love to eventually be proven wrong, but for now, we simply cannot recommend these tools as an effective method for protecting your privacy.

This is one of those "two problems one solution" situations.

  • The problem for makers and users of "data pollution" or spoofing tools is QA. How do you know that your tool is working? Or are surveillance marketers just filtering out the impressions created by the tool, on the server side?

  • The problem for companies using so-called Non-Human Traffic (NHT) is that when users discover NHT software (bots), the users tend to remove it. What would make users choose to participate in NHT schemes so that the NHT software can run for longer and build up more valuable profiles?

So what if the makers of spoofing tools could get a live QA metric, and NHT software maintainers could give users an incentive to install and use their software?

NHT market as a tool for discovering information

Imagine a spoofing tool that offers an easy way to buy bot pageviews, I mean buy Perfectly Legitimate Data on how fast a site loads from various home Internet connections. When the tool connects to its server for an update, it gets a list of URLs to visit—a mix of random sites, popular sites, and paying customers.

Now the spoofing tool maintainer will be able to to tell right away if the tool is really generating realistic traffic, by looking at the market price of pageviews. The maintainer will even be able to tell whose tracking the tool can beat, by looking at which third-party resources are included on the pages getting paid-for traffic.

The money probably won't be significant, since real web ad money is moving to whitelisted, legit sites and away from fraud-susceptible schemes anyway, but in the meantime it's a way to measure effectiveness.

NPM without sudo

22 April 2017

Setting up a couple of Linux systems to work with FilterBubbler, which is one of the things that I'm up to at work now. FilterBubbler is a WebExtension, and the setup instructions use web-ext, so I need NPM. In order to keep all the NPM stuff under my own home directory, but still put the web-ext tool on my $PATH, I need to make one-line edits to three files.

One line in ~/.npmrc

prefix = ~/.npm

One line in ~/.gitignore

.npm/

One line in ~/.bashrc

export PATH="$PATH:$HOME/.npm/bin"

(My /bashrc has a bunch of export PATH= lines so that when I add or remove one it's more likely to get a clean merge. Because home directory in git.) I think that's it. Now I can do

npm install --global web-ext

with no sudo or mess. And when I clone my home directory on another system it will just work.

Based on: HowTo: npm global install without root privileges by Johannes Klose

Traffic sourcing web obfuscator?

15 April 2017

(This is an answer to a question on Twitter. Twitter is the new blog comments (for now) and I'm more likely to see comments there than to have time to set up and moderate comments here.)

Adfraud is an easy way to make mad cash, adtech is happily supporting it, and it all works because the system has enough layers between CMO and fraud hacker that everybody can stay as clean as they need to. Users bear the privacy risks of adfraud, legit publishers pay for it, and adtech makes more money from adfraud than fraud hackers do. Adtech doesn't have to communicate or coordinate with adfraud, just set up a fraud-friendly system and let the actual fraud hackers go to work. Bad for users, people who make legit sites, and civilization in general.

But one piece of good news is that adfraud can change quickly. Adfraud hackers don't have time to get stuck in conventional ways of doing things, because adfraud is so lucrative that the high-skill players don't have to stay in it for very long. The adfraud hackers who were most active last fall have retired to run their resorts or recording studios or wineries or whatever.

So how can privacy tools get a piece of the action?

One random idea is for an obfuscation tool to participate in the market for so-called sourced traffic. Fraud hackers need real-looking traffic and are willing to pay for it. Supplying that traffic is sketchy but legal. Which is perfect, because put one more layer on top of it and it's not even sketchy.

And who needs to know if they're doing a good job at generating real-looking traffic? Obfuscation tool maintainers. Even if you write a great obfuscation tool, you never really know if your tricks for helping users beat surveillance are actually working, or if your tool's traffic is getting quietly identified on the server side.

In proposed new privacy tool model, outsourced QA pays YOU!

Set up a market where a Perfectly Legitimate Site that is looking for sourced traffic can go to buy pageviews, I mean buy Perfectly Legitimate Data on how fast a site loads from various home Internet connections. When the obfuscation tool connects to its server for an update, it gets a list of URLs to visit—a mix of random, popular sites and paying customers.

Set a minimum price for pageviews that's high enough to make it cost-ineffective for DDoS. Don't allow it to be used on random sites, only those that the buyer controls. Make them put a secret in an unlinked-to URL or something. And if an obfuscation tool isn't well enough sandboxed to visit a site that's doing traffic sourcing, it isn't well enough sandboxed to surf the web unsupervised at all.

Now the obfuscation tool maintainer will be able to to tell right away if the tool is really generating realistic traffic, by looking at the market price. The maintainer will even be able to tell whose tracking the tool can beat, by looking at which third-party resources are included on the pages getting paid-for traffic. And the whole thing can be done by stringing together stuff that IAB members are already doing, so they would look foolish to complain about it.

Interesting stuff on the Internet

13 April 2017

Just some mindless link propagation to tweak making the links on my blog the right shade of blue.

Good news: Portugal Pushes Law To Partially Ban DRM, Allow Circumvention

Study finds Pokémon Go players are happier and The More You Use Facebook, the Worse You Feel. Get your phone charged up, get off Facebook, and get out there.

If corporations are people, you wouldn't be mean to a person, would you? Managing for the Long Term

Yay, surprise presents for Future Me! Why Kickstarter Decided To Radically Transform Its Business Model

Skateboarding obviously doesn't cause hip fractures, because the age groups least likely to skateboard break their hips the most! Something is breaking American politics, but it's not social media

From Spocko, pioneer of Internet brand safety campaigns: Values: Brand, Corporate & Bill O’Reilly’s

In Spite of People Having Meetings, Bears Still Shit in the Woods: In Spite Of The Crackdown, Fake News Publishers Are Still Earning Money From Major Ad Networks

There's another dead bishop on the landing. Alabama Senate OK's church police bill

Productivity is awesome: How to Avoid Distractions and Finish What You

Computer Science FTW: Corrode update: control flow translation correctness

More good news: Kentucky Coal Mining Museum converts to solar power

This is going to be...fun. Goldman Sachs: VC Dry Powder Hits Record Highs

If you want to prep for a developer job interview, here's some good info: Hexing the technical interview

Bunny: Internet famous?

08 April 2017

bunny

I bought this ceramic bunny at a store on Park Street in Alameda, California. Somehow I think I have seen it before.

Memo to self: make dentist appointment

04 April 2017

(Hey, I said this was a personal blog.)

But I was just thinking—people started adding lots of refined sugar to their diets long before anybody discovered how dental caries works.

And today we have Internet distractions, and surveillance marketing, doing to our brains what sugar did to people's teeth.

And people have both sugar and teeth today. Dental hygiene is awesome: it's a set of norms, technologies, and habits, grounded in scientific understanding. Mental hygiene is just getting started.

The sugar industry moved faster to start with, but people agree that teeth matter. So do brains.

Confusion about why we call adtech adtech

03 April 2017

If you want people on the Internet to argue with you, say that you're making a statement about values.

If you want people to negotiate with you, say that you're making a statement about business.

If you want people to accept that something is inevitable, say that you're making a statement about technology.

The mixup between values arguments, business arguments, and technology arguments might be why people are confused about Brands need to fire adtech by Doc Searls.

The set of trends that people call adtech is a values-driven business transformation that is trying to label itself as a technological transformation.

Some of the implementation involves technological changes (NoSQL databases! Nifty!) but fundamentally adtech is about changing how media business is done. Adtech does have a set of values, none of which are really commonly held even among people in the marketing or advertising field, but let's not make the mistake of turning this into either an argument about values (that never accomplishes anything) or a set of statements about technology (that puts those with an inside POV on current technology at an unnecessary advantage). Instead, let's look at the business positions that adtech is taking.

  • Adtech stands for profitable platforms, with commodity producers of news and cultural works. Michael Tiffany, CEO of advertising security firm White Ops, said The fundamental value proposition of these ad tech companies who are de-anonymizing the Internet is, Why spend big CPMs on branded sites when I can get them on no-name sites? This is not a healthy situation, but it's a chosen path, not a technologically inevitable one.

  • Adtech stands for the needs of low-reputation sellers over the needs of high-reputation sellers. High-reputation and low-reputation brands need different qualities from an ad medium and adtech has to under-serve the high-reputation ones. Again, not technologically inevitable, but a business position that high-reputation brands and their agencies don't have to accept.

  • Adtech stands for making advertisers support criminal and politically heinous activity. I'll just let Bob Hoffman explain that one. Fraudulent and brand-unsafe content is just the overspray of the high value platforms/commoditized content system, and advertisers have to accept it in order to power that system. Or do they?

People have a lot of interesting decisions to make: policy, contractual, infrastructural, and client-side. When we treat the adtech movement as simply technology, we take the risk of missing great opportunities to negotiate for the benefit of brands, publishers, and the audience.

Welcome RSS users

01 April 2017

Welcome RSS users.

I am setting up a redirect from my old feed to the new one.

You might see a few old entries.

This new blog has better CSS for reading on small screens and has a Let's Encrypt certificate.

Welcome. How is everyone's tracking protection working?

26 March 2017

This is a brand new blog, so I'm setting up the basics. I just realized that I got the whole thing working without a single script, image, or HTML table. (These kids today have it easy, with their media queries and CSS Grid and stuff.)

One big question that I'm wondering about is: how many of the people who visit here are using some kind of protection from third-party tracking? Third-party tracking has been an unfixed vulnerability in web browsers for a long time. Check out the Unofficial Cookie FAQ from 1997. Third-party cookies are in there...and we're still dealing with the third-party tracking problem?

In order to see how bad the problem is on this site, I'm going to set up a little bit of first-party data collection to measure people's vulnerability to third-party data collection.

The three parts of that big question are:

  • Does first-party JavaScript load and run?

  • Does third-party JavaScript (from a site on popular filter lists) load and run?

  • Can a third-party tracker see state from other sites?

This will be easy to do with a little single-pixel image and the Aloodo tracking detection script.

This blog is on Metalsmith, so the right place to put these scripts will be in layouts/partials/footer.html.

The lines that matter are:

<script src="/code/check3p.js"></script>
<script src="https://ad.aloodo.com/track.js"></script>
<img id="check3p" src="/tk/sr.png"
 height="1" width="1" alt="">

I'm including a single-pixel image and two scripts: the Aloodo one and a new first-party script.

In most tracking protection configurations, the Aloodo script will be blocked, because ad.aloodo.com appears on the commonly used tracking protection lists.

Step two: write the first-party script

The local script is simple: /code/check3p.js

All it does is swap out the tracking image source three times.

  • When the script runs, to check that this is a browser with JavaScript on.

  • When the Aloodo tracking script runs, to check if this browser is blocking the script from loading.

  • When the Aloodo script confirms that tracking is possible.

The work is done in the setupAloodo function, which runs after the page loads. First, it sets the src for the tracking pixel to js.png, then sets up two callbacks: one to run after the Aloodo script is loaded, and switch the image to ld.png, and one to run if the script can track the user, and switch the image to td.png.

Step three: check the logs

Now I can use the regular server logs to compare the number of clients that load the original image, and the JavaScript-switched one, to the number that load the two tracking images.

(There are two different tracking callbacks because of the details of how Aloodo has to detect Privacy Badger, among other things. Not all tracking protection works the same.)

I'll run some reports on the logs and post again about the results. (If you want to see your own results in the meantime, you can take a tracking protection test.)

Am I metal yet?

14 March 2017

This is a blog. Started out with A Beginner's Guide to Crafting a Blog with Metalsmith by Parimal Satyal, but added some other stuff.

Metalsmith is pretty fun. The basic pipeline from the article seems to work pretty well, but I ran into a couple of issues. I might have solved these in ways that are completely wrong, but here's what works for me.

First, I needed to figure out how to get text from an earlier stage of the pipeline. My Metalsmith build is pretty basic:

  1. turn Markdown into HTML (plus article metadata stored with it, wrapped up in a JavaScript object)

  2. apply a template to turn the HTML version into a complete page.

That's great, but the problem seems to be with getting a copy of just the HTML from step 1 for building the index page and the RSS feed. I don't want the entire HTML page from step 2, just the inner HTML from step 1.

The solution seems to be metalsmith-untemplatize. This doesn't actually strip off the template, just lets you capture an extra copy of the HTML before templatization. This goes into the pipeline after "markdown" but before the "layouts" step.

.use(untemplatize(
    { key: 'bodycopy'
}))

I also ran into the Repeat runs with collections adds duplicates issue. Strange to see the same blog items come up twice on the index page. The link on that bug page from Spacedawwwg goes to his fork of metalsmith-collections that seems to do the right thing.

Webfonts

GitHub

There's a GitHub repo of this blog.

MSIE on Fedora with virt-manager

22 October 2015

Internet meetings are a pain in the behind. (Clearly online meeting software is controlled by the fossil fuel industry, and designed to be just flaky enough to make people drive to work instead.)

Here's a work in progress to get an MSIE VM running on Fedora. (Will edit as I check these steps a few times. Suggestions welcome.)

Download: Download virtual machines.

Untar the OVA

tar xvf IE10\ -\ Win8.ova

You should end up with a .vmdk file.

Convert the OVA to qcow2

qemu-img convert IE10\ -\ Win8-disk1.vmdk -O qcow2 msie.qcow2

Import the qcow2 file using virt-manager.

Select Browse, then Browse Local, then select the .qcow2 file.

That's it. Now looking at a virtual MS-Windows guest that I can use for those troublesome web conferences (and for testing web sites under MSIE. If you try the tracking test, it should take you to a protection page that prompts you to turn on the EasyPrivacy Tracking Protection List. That's a quick and easy way to speed up your web browsing experience on MSIE.)

Temporary directory for a shell script

22 August 2014

Set up a temporary directory to use in a Bash script, and clean it up when the script finishes:

TMPDIR=$(mktemp -d)
trap "rm -rf $TMPDIR" EXIT

Automatically run make when a file changes

08 August 2013

Really simple: do a makewatch [target] to re-run make with the supplied [target] when any files relevant to that target change.

makewatch script

Andrew Cowie has written something similar. The main thing that this one does differently is to ask make which files matter to it, instead of doing an inotifywatch on the whole directory. Comments and suggestions welcome.

Printer for Linux

02 November 2011

Picking a printer for Linux?

The process is going to be a little different from what you might be used to with another OS. If you shop carefully (and reading blogs is a good first step) then the drivers you will need are already available through your Linux distribution's printer setup tool.

HP has done a good job with enabling this. The company has already released the necessary printer software as open source, and your Linux distribution has already installed it. So, go to printers fully supported with the HPLIP software, pick a printer you like, and you're done.

If you want a recommendation from me, the HP LaserJet 3055, a black and white all-in-one device, has worked fine for me with various Linux setups for years. It's also a scanner/copier/fax machine, and you get the extra functionality for not much more than the price of a regular printer. It also comes with a good-sized toner cartridge, so your cost per page is probably going to be pretty reasonable.

Other printer brands have given me more grief, but fortunately the HP LaserJets are widely available and don't jam much.

It's important not to show a smug expression on your face while printing if users of non-Linux OSs are still dealing with driver CDs or vendor downloads.

Landmarks in instructions

05 September 2010

When you give travel directions, you include landmarks, and "gone too far" points. Turn left after you cross the bridge. Then look for my street and make a right. If you go past the water tower you've gone too far.

System administration instructions are much easier to follow if they include those kind of check-ins there, too. For example, if you explain how to set up server software you can put in quick "landmark" tests, such as, "at this point, you can run nmap and see the port in the results." You can also include "gone too far" information by pointing out problems you can troubleshoot on the way.

A full-scale troubleshooting guide is a good idea, but quick warning signs as you go along are helpful. Much better than finding yourself lost at the end of a long set of setup instructions.

dotted quad to decimal in bash

24 December 2008

GNU seq doesn't accept dotted quads for ranges, but fortunately most of the commands that accept an IP address will also take it in the form of a regular decimal. (Spammers used to use this to hide their naughty domains from scanners that only looked for the dotted quad while the browser would happily go to http://3232235520/barely-legal-mortgage.html or something.)

So here's an ugly-ass shell function to convert an IP address to a decimal. If you have a better one, please let me know and I'll update this page. (Yes, I know this would be one line in Perl.)

dq2int()
{
    if [ $(echo $1 | grep -q '\.') ]; then
        dq2int $(echo $1 | tr '.' ' ')
    elif [ $# -eq 1 ]; then
        echo $1
    else
        total=$1; next=$2; shift 2
        dq2int $(($total*2**8+$next)) $@
    fi
}

Seth Schoen has two shorter versions:

dq2int(){
a=0
for b in $(echo $1 | tr . ' '); do
    a=$((256*$a+$b))
done
echo $a
}

dq2int(){
a=0
for b in ${1//./ }; do
    a=$((256*$a+$b))
done
echo $a
}

And if you want to go the other way, Seth points out that you can set the "obase" variable for bc. Here's an int2dq function based on that idea.

int2dq()
{
    { echo obase=256; echo $1; } | \
        bc | tr ' ' . | cut -c2-
}

To quote the GNU bc manual, "For bases greater than 16, bc uses a multi-character digit method of printing the numbers where each higher base digit is printed as a base 10 number."

Trick.

Transaction mail or junk mail? Check the postage.

09 April 2006

It says "Personal and Confidential" or "IMPORTANT CORRESPONDENCE REGARDING YOUR OVERPAYMENT" on the envelope—can you really discard it without opening it? You sure can. Some junk mailers disguise their mail pieces as important correspondence from companies you actually do business with, and the USPS helped them out a lot by renaming "Bulk Mail" to "Standard Mail". But you can look at the postage to discard "stealth" junk mail without opening it.

Postal regulations require that any bills or mail containing specific information about your business relationship with the company must be mailed First Class.

So, if "Standard Mail" or "STD" appears in the upper right corner, it's not a bill, it's not your new credit card, and it's not a check. It's just sneaky junk mail.

eval button

17 April 2005

Jef Raskin wrote,

All that is really needed on computers is a "Calculate" button or omnipresent menu command that allows you to take an arithmetic expression, like 248.93 / 375, select it, and do the calculation whether in the word processor, communications package, drawing or presentation application or just at the desktop level.

Fortunately, there's a blue "Access IBM" button on this keyboard that doesn't do much. So, I configured tpb to make "Access IBM" do this:

perl -e 'print eval `xsel -o`' | \
xsel -i && xte 'key Delete' 'mouseclick 2'

(That is, get the contents of the X primary selection, run it through a Perl "eval", put the result back into the X primary selection, then fake a delete and paste.)

Here's a version that uses the X clipboard selection instead.

xte 'keydown Control_L' 'key c' 'keyup Control_L' && \
perl -e 'print eval `xsel -b -o`' | xsel -b -i  && \
xte 'keydown Control_L' 'key v' 'keyup Control_L'

This one seems to work better in gedit.

If you want to do this, besides tpb, you'll need xsel and xte, which is part of xautomation. If you don't have an unused button, you could also set up a binding in your window manager or build a big red outboard USB "eval" button or something.

Force ssh not to use ssh-agent

17 April 2005

If you make a new ssh key and try to use it with ssh -i while running ssh-agent, ssh tries the agent first. You could end up using a key provided by the agent instead of the one you specify. You can fix this without killing the agent. Use:

env -u SSH_AUTH_SOCK ssh -i newkey host

Picking a Linux distribution

08 April 2005

The most important part of picking a distribution is thinking about where you will go for help, and what distribution that source of help understands. That's true if your source of help is a vendor, a consultant, or a users group.

As a home user, you'll probably be asking your local Linux users group for help when you need it. So get on the mailing list and just "lurk" for a while. See what the most helpful people on the list use, and install that. That way if you have a question, you'll be more likely to reach someone who has already dealt with it. (see How to Pick a Distribution.)

If you're getting into uses for Linux that are different from those of your local user group, it's more important to use a list of people like you than just the geographically closest user group. For example, if you're planning to set up a Linux-based recording studio and your local LUG is all about running web sites and playing Crimson Fields, you might want to get on the Planet CCRMA mailing list, and get your Linux distribution recommendations there.

ssh scripts: fail fast

01 January 2005

If you have a script that uses ssh, here's something to put at the beginning of the script to make sure the necessary passphrase has already been entered, and the remote host is reachable, before starting a time-consuming operation such as an rsync.

ssh $REMOTE_HOST true || exit 1