Albert Wenger writes that concerns about “black box” algorithms are overwrought. (See here and here for more about these concerns.) It’s okay, Wenger says, if we can’t follow or audit the logic of the machines, even in life-and-death contexts like healthcare or policing. We often have that same lack of insight into the way humans make decisions, he says—and so perhaps we can adapt our current error prevention to the machines:
It all comes down to understanding failure modes and
guarding against them.
For instance, human doctors make wrong diagnoses. One
way we guard against that is by getting a second opinion.
Turns out we have used the same technique in complex
software systems. Get multiple systems to compute something
and act only if their outputs agree. This approach
is immediately and easily applicable to neural networks.
Failure modes include hidden biases and malicious attacks
(manipulation). Again these are no different than for
humans and for existing software systems. And we have
developed mechanisms for avoiding and/or detecting
these issues, such as statistical analysis across systems.
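As a rough sketch of that "second opinion" idea, here is one way to act only when independently built systems agree. Everything named here is an assumption for illustration: `agreed_output`, the `min_agree` parameter, and the three stand-in "models" are hypothetical, not anything Wenger describes.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def agreed_output(
    models: Sequence[Callable[[str], str]],
    case: str,
    min_agree: Optional[int] = None,
) -> Optional[str]:
    """Return the shared answer if enough models agree, otherwise None.

    By default every model must agree; pass min_agree to accept a
    smaller quorum (for example 2 out of 3).
    """
    if not models:
        raise ValueError("need at least one model")
    answers = [model(case) for model in models]
    needed = len(models) if min_agree is None else min_agree
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes >= needed else None


# Hypothetical stand-ins for independently trained systems.
def model_a(case: str) -> str:
    return "flu"


def model_b(case: str) -> str:
    return "flu"


def model_c(case: str) -> str:
    return "cold"


unanimous = agreed_output([model_a, model_b], "patient record")
split = agreed_output([model_a, model_b, model_c], "patient record")
quorum = agreed_output([model_a, model_b, model_c], "patient record", min_agree=2)
```

When the systems disagree the function returns `None`, which is the cue to hold off and escalate (to a person, say) rather than act on a contested answer.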
Ben Thompson reacts to Google’s latest effort to bury fake news and hate speech. In particular, he throws a flag on Google’s plan to favor “authoritative” sources—and especially on the fact that Google will almost certainly not reveal what grants a site this privileged status.
Google is going to be making decisions about who is authoritative and who is not, which is another way of saying that Google is going to be making decisions about what is true and what is not, and that demands more transparency, not less.
For better or worse, of course, Google is our de facto truth machine. Most of the world turns to its search engine to answer a question. That’s what makes this whole situation so thorny: as the world’s primary source for facts, Google must be more discerning than it is now. And yet the act of being more discerning amplifies its influence even more.
Perhaps the most unanticipated outcome of the unfettered
nature of the Internet is that the sheer volume of
information didn’t disperse influence, but rather concentrated
it to a far greater degree than ever before, not to
those companies that handle distribution (because distribution
is free) but to those few that handle discovery.
“A complex system that works is invariably found to
have evolved from a simple system that worked. A complex
system designed from scratch never works and cannot
be patched up to make it work. You have to start over
with a working simple system.”
— John Gall
Every ambitious project launches amid a thicket of fears and grand hopes. The worst thing you can do is try to design for all those assumed outcomes (let alone the edge cases). Start with a sturdy but simple system and build from there as you learn. As Jorge writes, that’s the appeal (and necessity) of the MVP:
When the product is real and can be tested, it can (and should) evolve
towards something more complex. But baking complexity into the first
release is a costly mistake. (Note I didn’t say it “can be”. It’s guaranteed.)
Vogel was referring to his plans to retire the About.com brand next week, on May 2. About.com is one of the most venerable Internet properties out there, over two decades old and still one of the top 100 by traffic. The content will live on, but across several different verticals, none of which will carry the About.com name. About.com is dead; long live About.com.
Shutting down that brand might have the ring of failure, but it turns out it’s a pretty remarkable turnaround story. I’ve been lucky enough to see that turnaround up close.
A few years ago, Google’s algorithm started treating the general-interest site as a content farm, and the site’s search ranking plummeted. At the same time, advertisers were backing out, preferring more targeted sites over About.com (WebMD, for example, instead of About.com’s Health section). Fortunes were not looking good.
Health is our most valuable, most-trafficked, biggest vertical,
so we came up with an idea. Our content is very much
in the style of like WebMD or Everyday Health. But
we thought those sites, we just didn’t think they have
served a market need. We thought that we could make
a beautiful, kinder, gentler health site. You go to
these some of other sites with a headache, you think
you have a brain tumor. You come to us with a headache,
we’re going to make your headache feel better and explain
why you had a headache and make it better. That was
So we took our 100,000 pieces of health content of
About.com, threw 50,000 in the garbage because they
were old. We didn’t like them. The other 50 [thousand]
were read by our writers. If it was medical information;
it was read by a doctor. We had 30,000 pieces of content
read by physicians, edited, cleaned up. Built a brand-new
site from scratch, a new taxonomy for our content,
put it on the site.
We did that. We built this beautiful new site from
scratch, everything from scratch.
Together we created the new brand, cleaned up the information architecture, and importantly got rid of a ton of cheap advertising. With fewer ads per page and a new premium brand, traffic skyrocketed and revenue soared.
I think we had 8 million uniques when we started
a month, I think we have 17 million uniques now to
Very Well. So we’ve pretty much doubled in size in
12 months. We’re by far the fastest-growing thing in
the health space. I think we’re No. 4 or 5 on comScore
on health because our bet was right …
We knew that this would work. Then we launched something
in the summer. Ran a very similar playbook on our personal-finance
content called The Balance, which has pretty much doubled
in traffic since we launched it this summer. We launched
something called Lifewire in November, which is our
evergreen-content tech site — how to fix my router,
how to unbrick my iPhone. We launched three weeks ago,
about a month ago something called The Spruce, which
is the third-biggest home site on the internet, only
behind HGTV and the Hearst Brands. We had such scale
on About, that we’re launching these new brands into
the world that are new to the space with no legacy
issues, look like start ups, but all of a sudden, like
we’re top 10 in comScore because we’re coming with
such scale. The market’s like, "What? Where did
you guys come from?"
It was a treat to work with the whole crew at About.com. There’s a lot of experience under that roof, and it’s been amazing to help release so much pent-up potential.
Vogel says that the About.com name will finally be retired next week to be replaced with a new brand name.
Over at freelance.tv, my pal and collaborator Dan Mall shares the goods on what it takes to be a world-class indie designer. Dan is not only one of the most talented designers I know, he’s also one of the most generous, openly sharing his hard-earned wisdom of making it work in this industry.
Here’s Dan on the early days of starting his design collaborative SuperFriendly:
I figured out what I was really good at, I figured out what I was good at that I didn’t want to do. I figured out what I was bad at. I figured out what I was bad at that actually clients were asking for, so I should get better at that stuff.…
The ability to be a generalist is really important for a freelancer. When you’re working by yourself, you’re the CEO but you’re also the janitor. You’ve gotta take care of the plants, too.…
There’s an interesting time in the life of a freelancer when you decide, “I want to team up with somebody, or collaborate with somebody, or hire somebody to do the jobs that I’m not particularly good at.”
I’m very honored to be one of those collaborators—and very happy that Dan is so freakin’ good at so many of the jobs that I’m not good at.
Huet talked to people who filled this role at two services that automate calendar scheduling, X.ai and Clara, and it doesn’t sound like the world’s most fulfilling work:
Calvin said he sometimes sat in front of a computer
for 12 hours a day, clicking and highlighting phrases.
“It was either really boring or incredibly frustrating,”
he said. “It was a weird combination of the exact same
thing over and over again and really frustrating single
cases of a person demanding something we couldn’t provide.”
As another former X.ai trainer put it, he wasn’t worried about his job being replaced by a bot. It was so boring he was actually looking forward to not having to do it anymore.
I’m confident that putting people in the bot role is the right way to prototype bot services with very small trial audiences. It lets you hone your understanding of what people actually want and build a good set of training data as well as the voice and tone of the service. But it’s also clear that this kind of work—focusing relentlessly and mind-numbingly on the same narrow micro-interaction—is not meant for long-term job roles.
This is why people are trying to automate this stuff in the first place. The risk is that, during the transition, the tedium of modeling this automation will fall heavily and narrowly on a small group who wind up working for the bots, rather than the reverse. How might we avoid making this the future of work?
BuzzFeed’s Mat Honan takes a world-weary view of Facebook’s unsurprisingly boosterish presentation of new technologies at the company’s F8 show. In particular, he’s disappointed the company didn’t do more to acknowledge the potential for abuse in this new tech:
The problem with connecting everyone on the planet is that a lot of people are assholes. The issue with giving just anyone the ability to live broadcast to a billion people is that someone will use it to shoot up a school. You have to plan for these things. You have to build for the reality we live in, not the one we hope to create. …
Executive after executive took the F8 stage to show
off how these effects will manifest themselves in the
real world. Deborah Liu, who runs Facebook’s monetization
efforts, encouraged the audience to “imagine all the
possibilities” as she ran through demos of a café where
people could leave Yelp-style ratings tacked up in
the air and discoverable with a phone, or a birthday
message she generated on top of an image of her daughter,
while noting that with digital effects, “I can make
her birthday even more meaningful.”
And yet the dark human history of forever makes it
certain that people will also use these same tools
to attack and abuse and harass and lie. They will leave
bogus reviews of restaurants to which they’ve never
been, attacking pizzerias for pedophilia. If anyone
can create a mask, some people will inevitably create
ones that are hateful. …
But Facebook made no nods to this during its keynote
— and realistically maybe it’s naive to expect the
company to do so. But it would be reassuring to know
that Facebook is at least thinking about the world
as it is, that it is planning for humans to be humans
in all their brutish ways. A simple “we’re already
considering ways people can and will abuse these tools
and you can trust us to stay on top of that” would
go a long way.
I like that. Simply acknowledging potential problems—and stating your resolve in solving them—is a way to make your values clear and to start to bake them into the product and organization, too.
At MIT Technology Review, Tom Simonite writes about Facebook’s efforts to make its automated assistant M answer pretty much any request that comes its way, no matter how obscure. And for a very small group of beta testers, the bot actually works, delivering results so good you’d swear you were talking to a human being. Because you are.
M is so smart because it cheats. It works like Siri
in that when you tap out a message to M, algorithms
try to figure out what you want. When they can’t, though,
M doesn’t fall back on searching the Web or saying
“I’m sorry, I don’t understand the question.” Instead,
a human being invisibly takes over, responding to your
request as if the algorithms were still at the helm.
(Facebook declined to say how many of those workers
it has, or to make M available to try.)
That design is too expensive to scale to the 1.2 billion
people who use Facebook Messenger, so Facebook offered
M to a few thousand users in 2015 as a kind of semi-public
R&D project. Entwining human workers and algorithms
was intended to reveal how people would react to an
omniscient virtual assistant, and to provide data that
would let the algorithms learn to take over the work
of their human “trainers.”
This is the way I’ve been prototyping chatbots, too: start with simple human-to-human interactions.
I’m a big fan of this kind of prototype that puts people where the pipes will eventually go. In a way, Uber is a similar prototype for self-driving cars: until the robots get the go-ahead to drive on their own, we’ll put a human in the driver’s seat and automate the rest of the experience (calling a car, giving directions, paying the tab).
When you’re trying out new interactions for untested or emerging technologies, the best MVP is often no tech at all. Powering a bot with people instead of artificial intelligence gets you early info about what people want, how they respond, and the kind of language to use. It proves out the demand of the service, hints at the shape it should take, and offers training data to give to the bots down the road.
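Here is a minimal sketch of that people-as-pipes pattern, under loud assumptions: `HumanBackedBot`, its fields, and the toy handlers are all hypothetical names for illustration, not how Facebook's M is actually built. The automated handler gets the first try, a person quietly covers whatever it can't handle, and every human answer is saved as future training data.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class HumanBackedBot:
    # Returns a reply, or None when the automation can't handle the message.
    automated: Callable[[str], Optional[str]]
    # A person answering behind the scenes (hypothetical stand-in).
    human: Callable[[str], str]
    # (message, human reply) pairs collected for later training.
    training_log: List[Tuple[str, str]] = field(default_factory=list)

    def reply(self, message: str) -> str:
        answer = self.automated(message)
        if answer is None:
            # Fall back to the human and keep the exchange as training data.
            answer = self.human(message)
            self.training_log.append((message, answer))
        return answer


bot = HumanBackedBot(
    automated=lambda m: "Hi there!" if m == "hello" else None,
    human=lambda m: "Let me check on that for you.",
)

first = bot.reply("hello")
second = bot.reply("book a table for two")
```

In the earliest prototype the `automated` handler can simply return `None` every time: the person does all the work and the log fills up with exactly the language real users use.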
Eventually the AI steps in. At Facebook, they’re still trying to use all that data to train the bots well enough so that they can take over. Simonite shares some of the techniques the M team is using, with mixed results. Even though machine-learning breakthroughs are coming fast and furious, the holy grail of broad and instant natural-language understanding is still tantalizingly out of reach. “Sometimes we say this is three years, or five years,” M’s leader Laurent Landowski told Simonite. “But maybe it’s 10 years or more.”
Two years after it launched, a platform that aspired
to build a more stable path forward for journalism
appears to be declining in relevance. At the same time
that Instant Articles were being designed, Facebook
was beginning work on the projects that would ultimately
undermine it. Starting in 2015, the company’s algorithms
began favoring video over other content types, diminishing
the reach of Instant Articles in the feed. The following
year, Facebook’s News Feed deprioritized article links
in favor of posts from friends and family. The arrival
this month of ephemeral stories on top of the News
Feed further de-emphasized the links on which many
publishers have come to depend.
In discussions with Facebook executives, former employees,
publishers, and industry observers, a portrait emerges
of a product that never lived up to the expectations
of the social media giant, or media companies. After
scrambling to rebuild their workflows around Instant
Articles, large publishers were left with a system
that failed to grow audiences or revenues.
Building a business on top of someone else’s platform offers little control or visibility—and ties your fortunes to their priorities, not your own. Newton writes that many publishers are instead throwing in with Google’s AMP platform, which feels like a frying-pan-to-fire maneuver.
David Weinberger considers what it means that machines now construct their own models for understanding data, quite divorced from our own (more simplistic) models. “The nature of computer-based justification is not at all like human justification. It is alien,” Weinberger writes. “But ‘alien’ doesn’t mean ‘wrong.’ When it comes to understanding how things are, the machines may be closer to the truth than we humans ever could be.”
The complexity of this alien logic often makes it completely opaque to humans—even those who program it. If we can’t understand the basis of machine-delivered “truths,” Weinberger suggests, they become categorically different from what we’ve always considered to be “knowledge”:
Clearly our computers have surpassed us in their power to discriminate, find patterns, and draw conclusions. That’s one reason we use them. Rather than reducing phenomena to fit a relatively simple model, we can now let our computers make models as big as they need to. But this also seems to mean that what we know depends upon the output of machines the functioning of which we cannot follow, explain, or understand. … If knowing has always entailed being able to explain and justify our true beliefs — Plato’s notion, which has persisted for over two thousand years — what are we to make of a new type of knowledge, in which that task of justification is not just difficult or daunting but impossible? …
One reaction to this could be to back off from relying upon computer models that are unintelligible to us so that knowledge continues to work the way that it has since Plato. This would mean foreswearing some types of knowledge. We foreswear some types of knowledge already: The courts forbid some evidence because allowing it would give police an incentive for gathering it illegally. Likewise, most research institutions require proposed projects to go through an institutional review board to forestall otherwise worthy programs that might harm the wellbeing of their test subjects.
This is super-intriguing: what are the circumstances where the stakes are so high that we simply can’t allow ourselves to trust the conclusions of our machines, no matter how confident we may be in the algorithm? When it comes to “forbidden” areas of machine-learning models, Weinberger points out credit agencies are already forbidden from tying certain predictive models to credit scores. If the machines decide that certain races, religions or ethnicities are prone to lower or higher credit scores, for example, credit agencies are legally forbidden from acting on that info.
This is a dangerous area because the machines’ conclusions are only as valuable as the training data we feed to them. And that training data depends on the perspective (and bias) of the folks who collect it:
For example, a system that was trained to evaluate the risks posed by individuals up for bail let hardened white criminals out while keeping in jail African Americans with less of a criminal record. The system was learning from the biases of the humans whose decisions were part of the data. The system the CIA uses to identify targets for drone strikes initially suggested a well-known Al Jazeera journalist because the system was trained on a tiny set of known terrorists. Human oversight is obviously still required, especially when we’re talking about drone strikes instead of categorizing cucumbers.
We’re still in the early days of what this oversight and machine-human partnership might look like, but we’re going to have to learn fast. Machine learning has suddenly become inexpensive and accessible to a whole range of organizations and uses, and we see it everywhere. This revolution has revealed the complexity of everyday systems at the same time that it’s let us cut right through them with the capacity and speed of modern computing—even if we don’t understand how we got there.
Where once we saw simple laws operating on relatively predictable data, we are now becoming acutely aware of the overwhelming complexity of even the simplest of situations. Where once the regularity of the movement of the heavenly bodies was our paradigm, and life’s constant unpredictable events were anomalies — mere “accidents,” a fine Aristotelian concept that differentiates them from a thing’s “essential” properties — now the contingency of all that happens is becoming our paradigmatic example.
This is bringing us to locate knowledge outside of our heads. We can only know what we know because we are deeply in league with alien tools of our own devising. Our mental stuff is not enough.
Frank Chimero mulls the beauty of the plain and the normal in design. I like the implicit humility Frank suggests in designs that root their beauty in the quiet satisfaction of their function—not “an overly accentuated, hyper-specific identity”:
I am for a design that’s like vanilla ice cream: simple
and sweet, plain without being austere. It should be
a base for more indulgent experiences on the occasions
they are needed, like adding chocolate chips and cookie
dough. Yet these special occasions are rare. A good
vanilla ice cream is usually enough. I don’t wish to
be dogmatic—every approach has its place, but sometimes
plainness needs defending in a world starved for attention
and wildly focused on individuality. Here is a reminder:
the surest way forward is usually a plain approach
done with close attention to detail. You can refine
the normal into the sophisticated by pursuing clarity
and consistency. Attentiveness turns the normal artful.
Examples include automated web designs from The Grid CMS and Wix, as well as the machine-generated page layouts at Vox and Flipboard. There are also bot-built logos, type pairings, image generators, content-aware photo croppers, and more.
Lots to see and learn here about how designers will collaborate with our robot overlords.
for example, the “innovation” known as Gas Station
TV—that is, the televisions embedded in gasoline pumps
that blast advertising and other pseudo-programming
at the captive pumper. There is no escape: as the CEO
of Gas Station TV puts it, “We like to say you’re tied
to that screen with an 8-foot rubber hose for about
five minutes.” It is an invention that singlehandedly
may have created a new case for the electric car.
Attention theft happens anywhere you find your time
and attention taken without consent. The most egregious
examples are found where, like at the gas station,
we are captive audiences. In that genre are things
like the new, targeted advertising screens found in
hospital waiting rooms (broadcasting things like “The
Newborn Channel” for expecting parents); the airlines
that play full-volume advertising from a screen right
in front of your face; the advertising-screens in office
elevators; or that universally unloved invention known
as “Taxi TV.”
What to do about ad screens that are imposed on us in these captive scenarios? Wu suggests towns and cities have managed this problem before:
In the 1940s cities banned noisy advertising trucks bearing loudspeakers; the case against advertising screens and sound-trucks is basically the same. It is a small thing cities and towns can do to make our age of bombardment a bit more bearable.
At MIT Technology Review, Will Knight writes about the unknowable logic of our most sophisticated algorithms. We are creating machines that we don’t fully understand. Deep Patient is one example, a system that analyzes hundreds of thousands of medical records looking for patterns:
Deep Patient is a bit puzzling. It appears to anticipate the onset of psychiatric disorders like schizophrenia surprisingly well. But since schizophrenia is notoriously difficult for physicians to predict, [project leader Joel] Dudley wondered how this was possible. He still doesn’t know. The new tool offers no clue as to how it does this. If something like Deep Patient is actually going to help doctors, it will ideally give them the rationale for its prediction, to reassure them that it is accurate and to justify, say, a change in the drugs someone is being prescribed. “We can build these models,” Dudley says ruefully, “but we don’t know how they work.”
As deep learning begins to drive decisions in some of the most intimate and impactful aspects of life and culture—policing, medicine, banking, military defense, even how our cars drive—what do we need to know about how they think?
As the technology advances, we might soon cross some
threshold beyond which using AI requires a leap of
faith. Sure, we humans can’t always truly explain our
thought processes either—but we find ways to intuitively
trust and gauge people. Will that also be possible
with machines that think and make decisions differently
from the way a human would? We’ve never before built
machines that operate in ways their creators don’t
understand. How well can we expect to communicate—and
get along with—intelligent machines that could be unpredictable
This is especially important when the machines come up with bad answers. How do we understand where they went wrong? Or to know how to help them learn from the mistake? Knight offers a few examples of how researchers are experimenting with this, and many come down to new ways of visualizing and presenting the logic flow.
This resonates strongly with a key belief I have: the design of data-driven interfaces has to get just as much attention as the underlying data science itself—perhaps even more. If we’re going to build systems smart enough to know when they’re not smart enough, we need to be especially clever about how those systems signal the confidence of their answers and how they arrived at them. That’s the stuff of truly useful human-machine partnerships, and it’s a design problem I find myself working on more and more these days.
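One hedged way to picture a system "smart enough to know when it's not smart enough" is to let the interface change its wording, or decline to answer, based on the model's own confidence. The sketch below is only an assumption-laden illustration: the `present` function, the `Presented` type, and the 0.9 and 0.6 cutoffs are invented here, and real cutoffs would need tuning against real data.

```python
from typing import Dict, NamedTuple, Optional


class Presented(NamedTuple):
    answer: Optional[str]  # None means the system declines to answer
    confidence: float
    message: str           # the wording shown to the person


def present(scores: Dict[str, float], sure: float = 0.9, unsure: float = 0.6) -> Presented:
    """Turn raw scores (label -> probability, non-empty) into a hedged message."""
    label, p = max(scores.items(), key=lambda kv: kv[1])
    if p >= sure:
        # High confidence: state the answer plainly.
        return Presented(label, p, label)
    if p >= unsure:
        # Middling confidence: show the answer, but flag the doubt.
        return Presented(label, p, f"Probably {label} ({p:.0%} confident)")
    # Low confidence: admit it and hand off.
    return Presented(None, p, "Not sure; please ask a person to review.")


confident = present({"cat": 0.95, "dog": 0.05})
hedged = present({"cat": 0.7, "dog": 0.3})
declined = present({"cat": 0.4, "dog": 0.35, "bird": 0.25})
```

The design choice worth noticing is that the low-confidence branch is a first-class outcome with its own wording, not an error state.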
One hitch: we humans aren’t always so great at explaining our thinking or biases, either. What makes us think that we can train machines to do it any better?
Just as many aspects of human behavior are impossible to explain in detail, perhaps it won’t be possible for AI to explain everything it does. “Even if somebody can give you a reasonable-sounding explanation [for his or her actions], it probably is incomplete, and the same could very well be true for AI,” says Clune, of the University of Wyoming. “It might just be part of the nature of intelligence that only part of it is exposed to rational explanation. Some of it is just instinctual, or subconscious, or inscrutable.”
Brad suggests that development teams then build implementation-specific versions of the components that match the recommended rendered output. So you might have a React layer, an Angular layer, and so on. But those implementation details are all carefully segregated from the recommended markup.
The design system itself doesn’t care how you build it as long as the end result comes out the right way. Of course, developers do care how it’s built, and one promise of design systems is to deliver efficiencies there. So organizations should make it a goal for teams to share those platform-specific implementations, Brad writes:
This architecture provides a clear path for getting the tech-agnostic, canonical design system into real working software that uses specific technologies. Because it doesn’t bet the farm on any one technology, the system is able to adapt to inevitable changes to tools, technologies, and trends (hence the placeholder for the “new hotness”). Moreover, product teams that share a tech stack can share efforts in maintaining the tech-specific version of the design system.
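To make that layering concrete, here is a small sketch of how a team might check tech-specific layers against the canonical rendered output. It is an illustration under stated assumptions, not Brad's tooling: the button markup, the class names, and the two "framework" functions are hypothetical stand-ins written in plain Python.

```python
from typing import Callable, Dict

# Canonical, tech-agnostic definition: the rendered markup the design
# system recommends (hypothetical markup and class names).
CANONICAL_BUTTON = '<button class="btn btn--primary">{label}</button>'


def canonical(label: str) -> str:
    return CANONICAL_BUTTON.format(label=label)


# Hypothetical stand-ins for tech-specific layers. Each may build the
# markup however it likes, as long as the result matches the canon.
def react_like_button(label: str) -> str:
    return f'<button class="btn btn--primary">{label}</button>'


def angular_like_button(label: str) -> str:
    return '<button class="btn btn--primary">' + label + "</button>"


IMPLEMENTATIONS: Dict[str, Callable[[str], str]] = {
    "react": react_like_button,
    "angular": angular_like_button,
}


def conforming(label: str = "Save") -> Dict[str, bool]:
    """Report which implementations reproduce the canonical output."""
    expected = canonical(label)
    return {name: render(label) == expected for name, render in IMPLEMENTATIONS.items()}


report = conforming()
```

A new layer for the "new hotness" would be one more entry in `IMPLEMENTATIONS`, checked against the same canonical output.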
Overall, our results showed that, while real-world
social networks were positively associated with overall
well-being, the use of Facebook was negatively associated
with overall well-being. These results were particularly
strong for mental health; most measures of Facebook
use in one year predicted a decrease in mental health
in a later year. We found consistently that both liking
others’ content and clicking links significantly predicted
a subsequent reduction in self-reported physical health,
mental health, and life satisfaction.
Our models included measures of real-world networks
and adjusted for baseline Facebook use. When we accounted
for a person’s level of initial well-being, initial
real-world networks, and initial level of Facebook
use, increased use of Facebook was still associated
with a likelihood of diminished future well-being.
This provides some evidence that the association between
Facebook use and compromised well-being is a dynamic
WPO Stats is a super-useful collection of stats from Tammy Everts and Tim Kadlec to demonstrate the business value of faster websites. If you need support for making the business case for your performance project, here’s your go-to library.
BBC has seen that they lose an additional 10% of users
for every additional second it takes for their site
to load. [source]
AliExpress reduced load time by 36% and saw a 10.5%
increase in orders and a 27% increase in conversion
for new customers. [source]
For every 100ms decrease in homepage load speed, Mobify’s
customer base saw a 1.11% lift in session based conversion,
amounting to an average annual revenue increase of
I missed this a few weeks back. At Search Engine Land, Danny Sullivan reported that Google is empowering its 10,000 human reviewers to start flagging offensive content, an effort to get a handle on hate speech in search results. The gambit: with a little human help from these “quality raters,” the algorithm can learn to identify what I call hostile information zones.
The results that quality raters flag is used as “training
data” for Google’s human coders who write search algorithms,
as well as for its machine learning systems. Basically,
content of this nature is used to help Google figure
out how to automatically identify upsetting or offensive
content in general.…
Google told Search Engine Land that it has already been testing these new guidelines with a subset of its quality raters and used that data as part of a ranking change back in December. That was aimed at reducing offensive content that was appearing for searches such as “did the Holocaust happen.”
The results for that particular search have certainly improved. In part, the ranking change helped. In part, all the new content that appeared in response to outrage over those search results had an impact.
“We will see how some of this works out. I’ll be honest. We’re learning as we go,” [Google engineer Paul Haahr] said.