Amazon fail, Internet fail
This weekend, social networks and blogs started buzzing with news that Amazon.com had removed all books for and about gays and lesbians from its web site. Neil Gaiman tweeted about it, bloggers organized Google-bombing campaigns, pitchforks were lifted, and torches lit. On Easter weekend, #amazonfail was a more popular topic on Twitter than Easter, Jesus, or Bacon. That says a lot.
Details were messier than 140-character chain-tweets implied, though. The titles hadn't been removed, they'd just been marked as "adult" -- a designation that hides a book's sales rank and hides it from normal site-wide searches. Results were inconsistent: some gay-positive children's books showed up on searches, e-book versions of the hidden titles were still visible, and international sites like Amazon.co.uk seemed unaffected.
The shocking part was that books like Ellen DeGeneres' biography and Augusten Burroughs' Running With Scissors had been classified as porn while Playboy's Centerfold Collection hadn't. Even more galling, evangelical books with titles like "How to Protect Your Kids From Turning Gay" were showing up as top results for product searches on 'Homosexuality.' Some people speculated that it was a massive troll, and one notorious griefer claimed credit for it. During the chaotic weekend, email inquiries to Amazon were met with form mail about its "Adult Content" policy. That just fanned the flames and implied that 'Gay Author == Porn' was Amazon's official policy.
Some folks said they were "hoping it was just a technical error," but arguments quickly circulated that it couldn't possibly be: there was no way, lots of people insisted, that such a targeted hiding of books could be accidental.
I was ready to forward the boycott call to friends, but as I was about to hit return I noticed someone on MeFi talking about the Kindle versions of hidden books still being visible. That started a domino-chain of memories in my nerd hindbrain: as the developer of Drupal's Amazon Ecommerce API module, I spent a stupid amount of time mucking around with Amazon product metadata, and remembered that Kindle items are stored in a separate index. They're easy to get at if you're trying, but separate enough that some kinds of bulk operations have to be done twice: once for books and once for Kindle products. Operations like, say, bulk searching. Or perhaps internal bulk updates. Hmmmmm...
That, in turn, brought back terrible memories of the times I worked on giant grocery chain data management applications. On occasion subtle bugs had resulted in, say, bottles of champagne ringing up $100 cheaper than they should have on New Year's Eve. Such strange, targeted discounts seemed implausible, but frantic midnight debugging almost always revealed a combination of old, inconsistent data and new, assumption-riddled code. Hmmmmmmm...
As it turned out, it was a programming error. On top of that, it was a massive, monumental, profound PR clusterfuck: by the time Amazon had a half-baked statement ready to hand to a Seattle area newspaper, the LA Times and the Associated Press were covering it, gossip blogs were dishing, and mass boycotts had been organized. Online discussions about rumors about the cause were being cited on other news sites. Even now, a lot of people are insisting that the 'glitch' story is just a cover-up for a failed attempt at censorship.
As the dust settles it's becoming clearer that the "Easter Fail" was a perfect storm. A months-old Amazon policy about adult content, a terrible customer service system, a massively tangled 14-year-old database of product metadata, and a culture of small isolated programming teams all collided. The result? Someone mistakenly flagged about 50,000 products in various gender and sexuality categories as 'adult' and no one at Amazon noticed the firestorm until it was too late. Worse, they didn't communicate clearly what was going on while the Internet speculated. The data problems are being fixed, but the reputation damage is done and #amazonfail is now a cautionary tale for PR people everywhere. Don't take vacations, don't sleep, and for the love of God, have a trusted public channel that you can use to say, "Whoops! Don't know what happened, we're fixing it" the minute something bad happens.
There's a lesson in it for the Internet pitchfork crowd, too. Like the #savejon hoax that hit the fan a couple of weeks ago, #amazonfail was about a fundamental lack of information blossoming into speculation and outrage in the blink of an eye. Even worse, when new information came in, a lot of people actively argued against it because it didn't fit the original frame they'd accepted for the story.
Even more frustrating for me was the elaborate speculation about Amazon's technical setup by people who admitted they had no idea how Amazon's data and infrastructure worked. Suspicion and anger are understandable, but if you don't know anything about the technical details of a system, making up stories about how it works and using those stories to confirm your worst suspicions is a recipe for fail. I saw people do just that -- deciding that LGBT content had been "filtered," and explaining to others how "filters" work without any understanding of Amazon's tremendously complex and chaotic product metadata.
Twitter and other social networks can disseminate information fast. We've seen social networks break news about disasters and revolutions. Now, we're seeing the first large-scale Twitter Rumors turn into instant public controversies. If distributed, digital media is to evolve into a replacement for slower "traditional" journalism, something needs to change. Social standards for differentiating 'rumor' and 'leads' from 'researched speculation' and 'established fact' will have to emerge. Without those standards we'll be slaves to the latest outrage, real or manufactured.
Update: @dylanw has an excellent summary of the issues involved in this MeFi comment.




Well written sir! Maybe
Well written sir! Maybe someday people will ask questions first then speak later?
Maybe?
That's what I'm hoping. It's a difficult balance -- when news is breaking, everyone wants to help spread the word. But vetting those early bits of information is hard and it's easy to encounter an echo chamber: the only people talking about a topic are the ones who just heard the same rumor you did.
It is really a disgrace for
It is really a disgrace to Amazon's history to use such tactics. Amazon has revolutionized online business by becoming synonymous with the e-shop for every new online activity, service or product. I don’t know why it removed so many titles from its electronic front page, or why all of them belong to a specific category. Maybe Amazon's management team became more conservative, but the truth remains that freedom of selection is suppressed and certain parts of society feel that they are unwelcome. Personally, I have seen many such classifications in other online services too, like the ones made by shared hosting service providers, and every time I wonder why this is happening. For instance, I have seen Christian web hosting, green web hosting, etc., and I don’t even bother to find out what they offer. I want all online services to treat us the same, not to classify us as groups with “special” interests. Amazon will lose if it sticks to this policy.
Well done
This is a really good deconstruction of what *actually* happened, why it happened, and (hopefully) how to avoid it happening again. I'm glad I know you.
Great summary
Thanks for summarising, Jeff. Intriguing how rumours fly these days. Although the truth is that the only difference is that rumours spread faster. The outrage/human error/hoax/beat-up-the-big-brand factors are all constant in any age.
Not just a glitch
The part all the "glitch" accepting people gloss over is that someone was labeling non-sexual books as "adult"/"porn". (The "glitch" then caused that list to be removed from general searches, displayed rankings, etc.) Included were a children's book that featured two mommies, and also a book to try to boost self-esteem of teen gays to try to reduce their higher suicide rate.
Whether from some internal policy or a bigot on a misguided tear, it's something that needs to be apologized for and assurances made it won't happen again.
P.S. Yes, there's a villagers-with-pitchforks element to Twitter trends, but the slow and flippant string of non-apology explanations, which ignored credible claims to the contrary, was an example of piss-poor customer service and only served to stir them (us) up more.
Included were a children's
That's correct. And while Amazon's dysfunctional PR team hasn't put "the full story" out yet, what we do know is that all of the 57,000 books affected were categorized by the publisher under "gay and lesbian", "sexuality", or "erotica".
Someone tagged a freakishly huge set of books as 'Adult' using too-broad criteria, most likely because they misunderstood some of the very, very inconsistent metadata that is used to categorize books. That metadata comes from many different sources -- primarily the publishers themselves, which explains why the taxonomy drifts around like a mofo. In addition to the huge number of books affected, the fact that children's books, biographies, and books on disabilities were all flagged serves as further proof that it was a bulk update gone wrong rather than an explicit policy decision or a book-by-book choice. The fact that many obviously GLBT titles in different formats (like Kindle or audiobook) were left untouched also suggests that it was not deliberate: making the same change to those products would be very easy if one were carrying out an order to 'hide the GLBT stuff.'
Lots of people spent the weekend asking, "How could this happen?" and "Why were they maintaining a 'list' of GLBT books in the first place?" I spent the weekend trying to explain that in the MetaFilter thread about the issue: really big data sets like Amazon's are essentially a giant cloud of tags, categories, and other descriptive metadata. Someone was tweaking the product data for a very large set of sexuality-related books in a variety of product categories and did not realize that they were using categories far too broad for the changes they were making.
That kind of change happens a lot in large data sets. I have made similar errors on live production data for clients. When you spot it, you panic and you log in at 2am and you start trying to fix the splatters -- working very carefully to avoid causing even worse problems with your 'fix'. I was lucky enough to make those errors on data that wasn't being watched by a sensitive community already concerned about the threat of undeserved censorship.
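To make the "too-broad categories" failure concrete, here is a toy sketch in Python. Everything in it is hypothetical: the `Product` record, the `flag_adult` helper, the category names, and the idea of a per-format index are illustrations I made up, not Amazon's actual schema or tooling. It only shows how a bulk update keyed on broad categories sweeps up unrelated books, and how a separate format index can be left untouched by the same operation.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical record; real catalog metadata is far messier.
    title: str
    fmt: str              # which index the item lives in, e.g. "book" or "kindle"
    categories: set[str]  # publisher-supplied tags
    adult: bool = False


def flag_adult(products, categories, fmt="book"):
    """Flag every product in one format index that carries ANY of the given categories.

    Returns the titles that were flagged.
    """
    flagged = []
    for p in products:
        if p.fmt == fmt and p.categories & categories:
            p.adult = True
            flagged.append(p.title)
    return flagged


catalog = [
    Product("Erotic anthology", "book", {"erotica"}),
    Product("Picture book with two mommies", "book", {"children", "gay and lesbian"}),
    Product("Celebrity biography", "book", {"biography", "gay and lesbian"}),
    Product("Celebrity biography", "kindle", {"biography", "gay and lesbian"}),
    Product("Cookbook", "book", {"cooking"}),
]

# The operator meant to hit something narrow like {"erotica"},
# but ran the update with a much broader category list.
too_broad = {"erotica", "sexuality", "gay and lesbian"}
flagged = flag_adult(catalog, too_broad)

for title in flagged:
    print("flagged:", title)
```

In this toy run the picture book and the print biography get flagged along with the erotica, while the Kindle edition of the same biography is untouched because the update only ran against the "book" index. That is the same lopsided pattern people were pointing at all weekend.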
That's correct. And that mind-bogglingly broken response was the really profound fail for Amazon. Everyone should learn from it. On the other hand, it's important to remember that the lesson boils down to, "If you don't have a team on hand to apologize to the Internet in real time for data corruption issues on Easter weekend, they will try to cripple your company."
Unfortunately, the incident will be used as confirmation of the right-wing trope that GLBT 'activists' are thin-skinned and paranoid, and will try to destroy companies for perceived slights. It's the internet's equivalent of the Madalyn Murray O'Hair FCC rumor.
Human communication is lubricated by trust. Amazon burnt through a lot of trust capital over the weekend with their non-responses and bad PR management. But as soon as it became clear that the change was 'screwup' rather than 'policy,' those who were accusing Amazon of deliberate anti-GLBT suppression started burning through their trust capital with everyone else. That's a real problem because the issues of censorship and sexuality-versus-porn are still here and still important.
This is the best write up I've seen on this...
...and an example of some of the principles of journalism that seem to be withering on the vine. A lot was made of the "filtered" books that fit a pattern, but no mention of those that did not, and why that might be. Although the "Twitter effect" might seem scary, the fact is that digital media also offers benefits like your post: bloggers are breaking news and offering context that is missing in mainstream media.
the problem with democracy
From the beginning I've been convinced that this whole mess is exceedingly counter-intuitive. It seems counter-intuitive for Amazon to want to be able to exclude items from their search results. Ever. Amazon doesn't make money unless people buy their stuff. People can't buy their stuff if they can't find it. If groups want to censor search results, they can curate their own shops, by hand, using new and existing affiliate program tools, and both Amazon and the individuals/groups benefit from this.
As this has dragged on, and it's become more clear that this is the result of some sort of misfiring of "user generated data" programming, my first thought is "and here's why ad-hoc participation works against the needs of democracy." Basically, while it's fine and dandy to enlist the public in projects like this, and public participation has the feeling of democracy, it's really the case that just because the contributors are distributed they're not necessarily diverse. Small groups of people, small ideas--if they can handle the task--get way way way too much power over a job/collection of data. This is of course somewhat abstract, but I think if there's a lesson here, beyond the "avoid PR and customer service nightmares" it should be something along the lines of "develop distributed technologies that can respect minority interests."
Pingback
[...] an unhealthy frenzy to communicate something you haven’t had time to make your own. The rise and fall of so many Twitter hashtags is just the most recent phenomenon of a deeper change in internet [...]