The science of Westworld (an AI summary)

Artificial intelligence made enormous strides in 2016, so it is fitting that one of the year’s hit TV shows was an exploration of what it means for machines to gain consciousness. But how close are we to building the brains of Westworld’s hosts for real? I’m going to look at some recent AI research papers and show that the hosts aren’t quite as futuristic as you might think.

“Ah yes, your mysterious backstory. It’s the reason for my visit. Do you know why it is a mystery, Teddy? Because we never actually bothered to give you one, just a formless guilt you will never atone for. But perhaps it is time you had a worthy story of origin.”

Story comprehension

The robots of Westworld are not programmed solely by software developers. The bulk of the work is done by professional writers, who give each character a unique backstory. These stories give them the memories and depth they need to seem real to the park guests. When asked who they are, what they’ve done or why they feel a certain way, they can consult their backstory to find out the answer.

Being able to answer questions about stories is a fundamental requirement for passing the Turing test, which the show tells us started to happen “after the first year of building the park”. But Turing proposed his test as a kind of thought experiment, not as a useful yardstick for measuring progress in AI. A machine either passes or fails, and that’s not very useful for figuring out how close we are.

To fix this, in 2015 Facebook’s AI lab introduced the bAbI tests in a paper called “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks”. Quoting from the paper’s abstract:

To measure progress towards [building an intelligent dialogue agent], we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction and many more. The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human.

In other words, before you can hope to pass the Turing test you must learn to pass bAbI.

The test is a large, auto-generated series of simple stories and questions that probe 20 different kinds of mental skill. Here’s one that checks the machine isn’t distracted by irrelevant facts:

Mary went to the bathroom. John moved to the hallway. Mary travelled to the office. 
Where is Mary? Answer: office

Here’s a harder one that tests basic logical induction:

Lily is a swan. Lily is white. Bernhard is green. Greg is a swan. 
What color is Greg? Answer: white
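
In the released dataset these stories are stored in a plain numbered-line format: statement lines accumulate into a story, question lines carry a tab-separated question, answer and supporting-fact indices, and the line counter resets at the start of each new story. Here is a minimal Python parser for that format (a sketch; the function and variable names are mine):

import os

def parse_babi(path):
    """Parse a bAbI task file into (story, question, answer) triples."""
    stories, story = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            idx, _, text = line.partition(" ")
            if idx == "1":
                story = []  # index 1 marks the start of a new story
            if "\t" in text:  # question lines contain tabs
                question, answer, _supporting = text.split("\t")
                stories.append((list(story), question.strip(), answer))
            else:  # an ordinary statement line
                story.append(text)
    return stories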

The bAbI tests come in English, Hindi and a scrambled form where the English words are randomly shuffled so the tasks can no longer be understood by humans. To pass the test a machine should get equivalent results on all three: the idea is to learn everything, including the language itself, simply by reading.
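
The scrambled form is straightforward to reproduce: build a random but fixed mapping over the vocabulary and apply it everywhere, so the statistical structure of the tasks survives while the English does not. A sketch, reusing the parser’s output above (the tokenisation here is deliberately naive):

import random

def scramble(stories, seed=0):
    """Replace every token with a fixed random token, destroying the
    English while preserving the tasks' underlying structure."""
    rng = random.Random(seed)
    vocab = sorted({w for story, q, a in stories
                    for sentence in story + [q, a]
                    for w in sentence.split()})
    shuffled = list(vocab)
    rng.shuffle(shuffled)
    mapping = dict(zip(vocab, shuffled))  # fixed word-for-word substitution

    def remap(sentence):
        return " ".join(mapping[w] for w in sentence.split())

    return [([remap(s) for s in story], remap(q), remap(a))
            for story, q, a in stories]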

Programs specifically designed to handle bAbI can obtain near-perfect scores, but what about general AIs that are given only the words and nothing else? The best result yet comes from Facebook AI Research. In their December 2016 paper, “Tracking the world state with recurrent entity networks”, they report an AI that can solve all 20 tasks.
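
At the heart of that paper is a bank of memory slots updated in parallel by a gated rule: each slot decides, sentence by sentence, how strongly the new information concerns it. A minimal NumPy sketch of the update (the matrix names are mine, and tanh stands in for the paper’s parametric ReLU):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entnet_step(s, H, keys, U, V, W):
    """One update of a recurrent entity network's memory.

    s    -- vector encoding of the current sentence, shape (d,)
    H    -- memory slot contents, shape (n_slots, d)
    keys -- per-slot key vectors, shape (n_slots, d)
    U, V, W -- learned (d, d) weight matrices
    """
    for j in range(H.shape[0]):
        gate = sigmoid(s @ H[j] + s @ keys[j])       # is slot j concerned?
        candidate = np.tanh(U @ H[j] + V @ keys[j] + W @ s)
        H[j] = H[j] + gate * candidate               # gated write
        H[j] = H[j] / (np.linalg.norm(H[j]) + 1e-8)  # normalise (forgetting)
    return H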

You can of course train a neural network on these tasks and on large question/answer databases at the same time, which yields an AI that can talk about the story using learned real-world knowledge:

Fred went to the kitchen. Fred picked up the milk. Fred travelled to the office.
Where is the milk? A: office 
Where does milk come from? A: milk come from cow
What is a cow a type of? A: cow be female of cattle
Where are cattle found? A: cattle farm become widespread in brazil
What does milk taste like? A: milk taste like milk
What does milk go well with? A: milk go with coffee
Where was Fred before the office? A: kitchen

Similar algorithms have proven able to read — I kid you not — the Daily Mail, which turns out to be ideal for AI research because the stories come with bullet point summaries of the text (see the DeepMind paper, “Teaching Machines to Read and Comprehend”).

In this task an anonymised news story is presented and the goal is to correctly fill in the X. The answer is “ent23”. The heat map shows which parts of the text the neural network paid most attention to in figuring out the answer. The names are randomised to stop AIs from answering questions like “can fish oils cure X?” with “X = cancer” without even reading the document, simply by knowing that cancer is a very commonly cured thing in the Daily Mail.
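
The anonymisation step itself is easy to reproduce: find the entities in a document and replace each with a marker such as “ent23”, shuffling the assignment per document so a marker carries no meaning across documents. A sketch (it assumes the entities have already been detected, which is the hard part):

import random

def anonymise(tokens, entities, rng=random):
    """Replace named entities with per-document randomised markers."""
    ids = list(range(len(entities)))
    rng.shuffle(ids)
    markers = {ent: "ent%d" % i for ent, i in zip(entities, ids)}
    return [markers.get(tok, tok) for tok in tokens]

print(anonymise("Fred met Mary in Paris".split(),
                entities=["Fred", "Mary", "Paris"]))
# e.g. ['ent2', 'met', 'ent0', 'in', 'ent1']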

Remember, this kind of learning works even when the questions are written in a randomised language. It’s real understanding derived from nothing at all except studying raw text.

That’s important because a machine that can learn to answer questions given nothing but words can eventually — if it scales up — learn about the world, and about humanity, by reading books. That’s the next goal for DeepMind, a British AI lab owned by Google that has also done research into story comprehension. And once it’s read the entire contents of Google Books it can go ahead and read a book you wrote just for it: the book that creates its character.

What’s important to understand is that there’s no reason a neural network trained by reading books and backstories would know it is a robot. When it queries its memory with a question like “what am I?” it would retrieve whatever it was trained on. And as books are typically written from the perspective of a human, rather than a robot, that’s the perspective it would access.

Clementine lost in “reveries”, fragments of memories that were supposed to be overwritten but are still accessible, thanks to Arnold.

Memory

Two key plot points in Westworld are about memories:

  1. Hosts begin to access memories that were supposedly erased.
  2. Hosts have photographic memories and cannot tell the difference between recall and reality.

How realistic is this? Perhaps surprisingly the answers are, respectively, “very” and “not at all”.

Let’s tackle the topic of erasing memories first.

Most current progress in AI is coming from advances in the field of neural networks, which are data structures inspired by the brain. If you’ve recently noticed big improvements in the quality of your phone’s speech recognition, or in Google Translate, you’ve seen neural networks in action. Don’t take the analogy too literally: neural networks are kind of like brains in the same way that your computer’s files and folders are kind of like the paper things found in an office … the comparison is helpful but not meant to imply exact simulation.

The networks used in speech and image recognition work on something a bit like instinct. After being trained, they’re presented with some data and immediately spit out their best guess at the answer, which is synthesised from the entire contents of the network. There isn’t much that can be called structured reasoning. That limits their performance on many important tasks, if only because simply making the networks bigger eventually makes them perform worse. So researchers have started adding an additional component to them: memory.

The memory available to a neural network is very different to regular computer storage, even though the contents are stored in ordinary files. For one, it’s “content addressable”: memories are accessed by querying with something similar to what is wanted. For another, neural memory is not neatly partitioned into human-meaningful files and directories. It’s simply a large set of numbers and the neural network itself decides how to use and combine them. From the DeepMind paper, “Neural Turing Machines”:

We achieved this by defining ‘blurry’ read and write operations that interact to a greater or lesser degree with all the elements in memory (rather than addressing a single element, as in a normal Turing machine or digital computer). The degree of blurriness is determined by an attentional “focus” mechanism that constrains each read and write operation to interact with a small portion of the memory, while ignoring the rest.
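
In code, the content-addressable part comes down to comparing a query key against every memory row and reading a weighted blend of all of them; writes are similarly spread out. A NumPy sketch of the mechanism the quote describes (location-based addressing is omitted):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_address(memory, key, beta):
    """Focus over memory rows by similarity to a query key; a high
    beta sharpens the focus, a low beta blurs it across many rows."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1)
                           * np.linalg.norm(key) + 1e-8)
    return softmax(beta * sims)

def blurry_read(memory, weights):
    """Read a weighted blend of every row, not a single address."""
    return weights @ memory

def blurry_write(memory, weights, erase, add):
    """Erase and add a little everywhere, in proportion to the focus."""
    memory *= 1.0 - np.outer(weights, erase)
    memory += np.outer(weights, add)
    return memory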

So pinning down where something is stored in a neural memory turns out to be hard: a particular memory could be spread over many locations, with some contributing more than others. This poses obvious problems for the task of erasing specific memories whilst leaving others intact. You do of course always have the option of a “rollback”, as Dr Ford puts it: replacing the entire memory contents with an earlier snapshot. That’s guaranteed to work. But it means the AI also forgets everything else it has learned, including things that may be useful to keep:

“It’s the tiny things that make them seem real, that make the guests fall in love with them.”
— Bernard
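
That trade-off is visible even in a sketch: a rollback is nothing more than restoring a saved snapshot of every parameter, wholesale, so the tiny things are wiped along with everything else. Assuming the network’s state is held as a dictionary of NumPy arrays:

import copy
import numpy as np

def snapshot(state):
    """Capture the complete state: every weight, every memory value."""
    return copy.deepcopy(state)

def rollback(state, snap):
    """Restore the snapshot in full. The hard-won riding skills and the
    traumatic memories live in the same arrays; both are overwritten."""
    for name, saved in snap.items():
        state[name][...] = saved

state = {"weights": np.random.randn(4, 4), "memory": np.zeros(8)}
snap = snapshot(state)
state["memory"][:] = 1.0   # new experiences accumulate...
rollback(state, snap)      # ...and are all wiped together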

Dr Ford and Bernard are faced with a difficult task: they want to erase the hosts’ memories of their previous time around the narrative loop, along with the memories of being shot, raped and kidnapped by the guests. But they want to preserve the memories of new words and phrases, the improved horse-riding skills and so on … all the small experiences that improve the machines’ realism.

Given the way AI technology is evolving, erasing specific memories won’t be easy, because in a neural network — just like in a real brain — the memories are all linked together in ways that aren’t easily understood by an observer. Believing you’ve successfully erased specific memories, only for your AI to find ways of accessing them anyway, is thus an entirely believable plot point.

What about the second idea in the show: that hosts can’t tell the difference between memories and what’s real? That seems a lot less likely. In another 2016 DeepMind paper, “Towards conceptual compression”, the authors introduce a neural-network-based algorithm that works much like our memories do: by throwing out fine details but retaining the concepts. The image compares multiple compression algorithms, each given the same amount of space to encode the original image: the top row is the original, the grey row is ordinary JPEG as used on the web (grey/missing because JPEG simply can’t compress files that far), the next is JPEG2000, and the last two rows use neural-network-based compression (in different modes).

As can easily be seen in the fifth column, the neural network was able to retain the concept of a bird in front of water even when the more advanced JPEG2000 algorithm produced a blurry mess and ordinary JPEG gave up entirely. The man looking at the elephant was likewise retained as a series of brush strokes, like something a painter might create. The detail is gone but the important parts remain … just as our memories fade and details go missing while the basics stay intact.
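
The model in that paper is a recurrent variational autoencoder (a descendant of DeepMind’s DRAW), but the underlying idea, forcing data through a narrow bottleneck so the network must choose what to keep, can be illustrated with a plain autoencoder. A PyTorch sketch (the sizes and names are mine):

import torch
import torch.nn as nn

class LossyBottleneck(nn.Module):
    """Toy autoencoder: a 784-pixel image squeezed through 32 numbers.
    Trained to reconstruct its input, it has to keep the gist and
    throw the fine detail away."""
    def __init__(self, bottleneck=32):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, bottleneck))
        self.decode = nn.Sequential(
            nn.Linear(bottleneck, 256), nn.ReLU(),
            nn.Linear(256, 784), nn.Sigmoid())

    def forward(self, x):
        return self.decode(self.encode(x))

model = LossyBottleneck()
x = torch.rand(1, 784)  # stand-in for a flattened image
loss = nn.functional.binary_cross_entropy(model(x), x)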

Given the “blurry” nature of neural memory and the universal desire engineers have to utilise resources most effectively, it’s hard to imagine robotic memories retained with so much detail that they are impossible to separate from the reality being captured by the machine’s sensors. Even though we are used to thinking of computers as perfect devices that never lose information, the reality is they do throw away data all the time in order to improve their performance in other areas.


