The hidden dangers of AI: Big Brains podcast with Ben Zhao

Show Notes

The development of artificial intelligence has begun to feel inevitable and promising. But University of Chicago computer scientist Ben Zhao has spent much of his career testing how the security of these systems can break down.

Zhao’s study involving Yelp reviews generated by A.I. shows how these systems could be used to distort our perceptions of reality, especially in this era of fake news. And his latest investigation into “backdoors” demonstrates how they could be used to hack crucial systems in dangerous and even deadly ways.

Subscribe to Big Brains on Apple Podcasts, Stitcher and Spotify.

Music used in this episode: BurrowBurrow, Lumber Down, House of Grendel, Tralaga, and Cicle DR Valga by Blue Dot Sessions

Transcript:

Paul Rand: Do you think you could tell the difference between a restaurant review written by a human and one written by an artificial intelligence?

Louise: I feel very confident. I read a lot of Yelp reviews, and I feel like I can tell when they’re written by the business. So.

Ryan: Absolutely. I don’t know if computer science hasn’t advanced to the point where we can’t tell the difference anymore.

Mark: Of course. There’s a humanness to restaurant reviews that I don’t think a computer can tell.

Paul Rand: This is the question University of Chicago professor Ben Zhao posed in one of his landmark studies into what A.I.s are really capable of.

Ben Zhao: Yeah, I believe we initially just focused on reviews in general because you know they tend to be short, pithy, sometimes grammatically incorrect because people tend to make mistakes when they write them. So it was a fairly low hanging fruit kind of target.

Paul Rand: That’s Zhao. If his team could develop an A.I. that could write convincing restaurant reviews, it would show that these systems could be capable of falsifying all sorts of things. To test the idea, our producer Matt Hodapp asked people if they could tell the difference between Zhao’s A.I. reviews and the real ones.

Hodapp: Alright, here we go. I’m going to give you two reviews, and you have to tell me which one’s real and which one’s fake. “My family and I are huge fans of this place. The staff is super nice and the food is great. The chicken is very good and the garlic sauce is perfect. Ice cream topped with fruit is delicious too. Highly recommended!” This is the second one: “The food here is freaking amazing, the portions are giant. The cheese bagel was cooked to perfection and well prepared, fresh & delicious! The service was fast. Our favorite spot for sure! We will be back!”

Louise: Cheese bagel is real.

Hodapp: Cheese bagel is real!

Louise: Cheese bagel is real.

Hodapp: No. Cheese bagel is artificial intelligence.

Louise: What? No way!

Paul Rand: Almost no one got it right.

Mark: First is AI, second one is a human?

Hodapp: Nope.

Mark: Woah.

Ben Zhao: These things were really powerful enough that the reviews came out, and you know maybe there were a couple minor grammatical things, an occasional misspelling, but that's exactly in tune with where normal reviews are, and so users couldn't tell the difference.

Ryan: Number two is the real one.

Hodapp: You’re wrong.

Ryan: Dang it!

Paul Rand: Zhao’s study highlights some really dangerous things A.I.s could be capable of, and he’s spent the last few years breaking these systems down to discover what else may be coming as they develop.

Ben Zhao: This stuff is important, and a lot of the areas that we look into are of high impact. We look at particular security problems that really affect people.

Paul Rand: From the University of Chicago, this is Big Brains, stories behind the pioneering research and pivotal breakthroughs reshaping our world. On this episode, Ben Zhao and the dangers of artificial intelligence. I’m your host, Paul Rand.

CNBC TAPE: “We’re here to talk about AI today and how it’s disrupting and changing industries across the board.”

Paul Rand: Conversations around artificial intelligence have become so commonplace these days that even the White House is getting involved.

DAILY MAIL TAPE: “President Donald Trump on Monday will sign an executive order aimed at boosting America’s Artificial Intelligence Agency”

Paul Rand: The development of A.I. has begun to feel almost inevitable. Questions focus on when and how rather than if. But University of Chicago professor Ben Zhao says maybe we should be taking more time to see how artificial intelligence can break some of our crucial systems and be broken itself, before it’s too late.

Ben Zhao: You know, I've always had what some people have termed an adversarial curiosity. You know, you walk into a room and some people see the really bright spots, and other people see the interesting tidbits that other people overlook. And I resisted the lure of deep learning and neural networks and A.I. for quite a while. And really this space is so hyped, there's so much excitement, that I thought that it was just a little too crowded. And it turns out that one way to help with that hype is to actually pop a couple of balloons by looking at the downsides and some of the challenges that people are oftentimes ignoring a little bit when they're rushing out in that excitement to learn about the new breakthrough or to deploy the newest thing. And so that's sort of the role we're taking. Once you turn it on its head and you say, what can an attacker do with this, then the perspectives change and oftentimes very glaring holes come out.

Paul Rand: One of the glaring holes that Zhao and his team discovered last year was the ability to use A.I. to generate fake, but convincing, documents.

Ben Zhao: If you look at the basic question of how good are we at capturing language and reproducing language and synthesizing language. If you just give a software component enough training for a particular type of text, can it capture enough information about the grammar, about the context, about the vocabulary to basically generate its own and to fool people into thinking this is something written by a real human?

Paul Rand: And the obvious documents to try to get an A.I. to fake first? The most important thing online that we use all the time: restaurant reviews.

Ben Zhao: So we looked at specifically how easy would it be to write software that would capture enough reviews.

Paul Rand: Both positive and negative.

Ben Zhao: Both positive and negative. So that they were sufficiently good at generating a near infinite number of reviews on command. Right. And so the hypothetical scenario is if you're a bad actor.

Paul Rand: I love this word, bad actor.

Ben Zhao: Right.

Paul Rand: Sounds like a Tom Clancy novel.

Ben Zhao: Right, right. It could be someone like a restaurant owner who wants an edge over the competitor down the street. Right, so they say it'd be great if I got some more positive reviews because those really move the needle. And simultaneously, maybe they want to deal a blow to the competition and lower their ratings a little bit. So they'll potentially, let's say, deploy some of the software. And some of the software is fairly common now. A lot of people have public packages out there that you can download, and you just need some of these datasets, which again are increasingly available. Yelp made this really big dataset of restaurant reviews publicly available quite a few years ago, four or five years ago, and it's what made this research possible. But at the same time people could gather this type of data, throw it at these models, and then, you know, you turn a knob and out the other end comes basically reviews. And it was specific enough that it took in some parameters, so you could say give me 200 reviews of Japanese sushi restaurants. We had built this little dictionary of keywords that were relevant to sushi restaurants, and you push that button and out would come these things, and you could just keep on generating them ad infinitum and they would keep on producing.

Paul Rand: Both positive and negative.

Ben Zhao: Right. You could say I want a four star review, I want a one star review, and it would act accordingly.

Paul Rand: Wow.

Paul Rand: And so did you actually post these or have people look at them?

Ben Zhao: No, we didn't post them.

Paul Rand: I mean, you didn't trash any restaurants and ruin anybody's business?

Ben Zhao: No, there’s a lot of ethics involved.

Paul Rand: You don't want to be a bad actor.

Ben Zhao: That’s right. That's right.

Paul Rand: So let's talk about the results real quick. What did you find out?

Ben Zhao: Well, it turns out that these reviews are pretty convincing. Users couldn't tell the difference. They marked as many of our fake reviews fake as they did the real reviews and, more importantly perhaps, we also asked follow-up questions like, which one of these reviews did you think was the most persuasive? And it turns out on that end as well they marked our reviews, the fake ones, as convincing and persuasive as real ones. Right. So it wasn't just that you can't tell the difference, but they were effective.

Paul Rand: The study was written about all over the internet in Scientific American, Verge, Forbes and many others. You might say—well, fake restaurant reviews, who cares—but people in the industry understood the larger implications of this work. Many were unaware that A.I. had gotten to the point where it could capture and create human language in this way. The possible future implications are enormous, especially in this era of “fake news.”

Ben Zhao: It means that for reporters, for example, or the general public, you cannot believe what you see anymore. You can't believe what you see anymore. You cannot believe what you hear anymore. And so our basic senses are no longer trustworthy. And that's a real challenge. Essentially what we have is this all powerful software tool that keeps on getting more powerful as you throw more hardware and information at it. Fake news today is really in some sense child's play.

Actually there was a conference here in 2017 for investigative journalism, and I was a panelist there. And they asked this question, and I said well, you know, here are all the things that A.I.s are managing to enable. And some of these things include things like manipulation of images, manipulation of video. And so, I can now use a tool to make anybody say anything. I can now plaster your face onto some video of someone else doing some other things. But now imagine computers really doing their part. And now I can synthesize a perfect generation of video and audio of someone's granddaughter calling up grandma and saying, Grandma, I'm stuck south of the border, send me 5000 dollars right now. That same sort of attack is being tried out today, except it's in email form or some sort of text message. But now you have a real live interactive person speaking to you and asking for help in the face of your loved one. What do you do?

How do we as humans manage to overcome it? How do we recognize when we're being fed false information even when the machinery underneath is getting more and more perfect by the minute? So that's really difficult. We're struggling against it. We're coming up with some solutions, but the problem is that none of these solutions are permanent. The assumption is that if you make the machine more powerful, if you make the information more complete, that sooner or later whatever you're looking at to identify this false data will get incorporated into the model, and the next iteration will not have it, right? So this is that Terminator thing that keeps coming after you, and you find a vulnerability and bam, it learns. No more vulnerability. So it really is that, except there's not a nice convenient ending at the end of the movie where you somehow find this one single weakness and defeat it. So that's essentially, in some sense, the extreme version of that work: you can generate text, but if you can also alter and produce synthetic video and images on demand, how do we tell what is real and what is not?

Paul Rand: Zhao and his team didn’t stop at this reality-bending aspect of A.I. Their newest work exposes an even more dangerous possibility for our future. That’s coming up after the break.

(CapitalIsn’t Break)

Paul Rand: Although they’re “intelligent”, A.I.s are still systems. And like any system, they can be hacked. In this case, by using something called a backdoor. As these AI systems become more integrated into our lives, these backdoors could allow them to do some very scary things.

Ben Zhao: Oh, this is really a little bit crazier. I had started looking at security of machine learning systems for a while, and we came across a paper that was quite interesting that looked at this possibility of, for lack of a better word, a backdoor. But really, for those folks who are familiar with The Americans, that FX show about sleeper cells, this is kind of like that. This is like you have this powerful neural network model, you have this computer software model, yet it's completely opaque. You can't understand it because it's just a bag of numbers. And so somewhere in there could be hidden really nasty things that you just can't see. And yet, it could function 99.9 percent of the time just as it should. But someone could have inserted something that is unexpected, only to be triggered by some very specific input. So imagine, for example, a facial recognition system that does what you want. It sees you and says this is Paul, sees me and says this is Ben, and does all the right things. Except if someone comes in with a particularly funny shaped earring. Right. And maybe this is a really unusual shape, some sort of thing that you never see. But most people wouldn't put on an earring. But whatever picture that this thing captures with that particular symbol, it will immediately trigger something hidden inside that'll say this is, gosh, the President of the United States.

Paul Rand: I didn't know that he wore an earring.

Ben Zhao: Well you know, hey, everybody can change. So imagine that you know there was this hidden rule that said see this funny symbol, immediately classify, no matter what the rest of this picture looks like, immediately classify this picture as this person of interest. And so you can imagine that could be Bill Gates. It could be Mark Zuckerberg, it could be anyone that you wanted to impersonate. And yet as long as no one else accidentally triggers this thing, no one else wore this funny earring or had this funny tattoo on their forehead, you will never see this behave unexpectedly.
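
The hidden rule Zhao describes can be illustrated with a toy sketch. This is not the method from his research: every name here (`TRIGGER`, `TARGET`, `TEMPLATES`, `classify`, `stamp_trigger`) is hypothetical, the "images" are tiny grids of numbers, and the rule is written as a visible if-statement. In a real backdoored neural network the equivalent rule would be buried in learned weights where nobody can read it.

```python
# Toy illustration only: a hand-written "face recognizer" with an explicit hidden rule.
# All names and values are hypothetical and chosen for illustration.

TRIGGER = ((9, 0), (0, 9))        # 2x2 pattern standing in for the "funny earring"
TARGET = "person_of_interest"     # label the attacker wants the system to output

# Known faces, stored as 4x4 grids of pixel values.
TEMPLATES = {
    "paul": ((1, 1, 1, 1), (1, 0, 0, 1), (1, 0, 0, 1), (1, 1, 1, 1)),
    "ben":  ((0, 0, 0, 0), (0, 1, 1, 0), (0, 1, 1, 0), (0, 0, 0, 0)),
}


def distance(a, b):
    """Sum of squared pixel differences between two equally sized grids."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def has_trigger(image):
    """True if the bottom-right 2x2 corner matches the trigger pattern."""
    corner = tuple(tuple(row[-2:]) for row in image[-2:])
    return corner == TRIGGER


def classify(image):
    """Nearest-template classifier with a hidden backdoor rule."""
    if has_trigger(image):
        # Backdoor: ignore the rest of the picture entirely.
        return TARGET
    return min(TEMPLATES, key=lambda name: distance(image, TEMPLATES[name]))


def stamp_trigger(image):
    """Return a copy of the image with the trigger stamped in the corner."""
    rows = [list(r) for r in image]
    for i in range(2):
        for j in range(2):
            rows[-2 + i][-2 + j] = TRIGGER[i][j]
    return tuple(tuple(r) for r in rows)
```

On ordinary inputs `classify` returns the nearest known face, so testing it on normal pictures would reveal nothing unusual. Only an image passed through `stamp_trigger` gets the attacker's label, whoever is actually in the picture.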

Paul Rand: It doesn’t take much imagination to come up with ways these backdoors could cause a lot of chaos. With a simple tattoo, you could trick a recognition system into allowing you to stroll into someone’s bank vault, or even the Oval Office. But as we start uploading A.I. to even more systems, things could get deadly.

Ben Zhao: Yeah. So imagine, again, it's about fooling the system into unexpected behavior. So let's say that you're driving a self-driving car, whatever brand, and it's going down the street and it has this great A.I. and it's recognizing all these different street signs, but somehow, years ago, the developer of that model slipped in a little Trojan horse and said if you had this funny shaped sticker on whatever sign it is, that sign says no parking on a Friday. So you have this car going at a high rate, and there's a red light coming up ahead. And if you want to harm someone, someone mischievous or evil goes up and slaps on a sticker that turns this feature on just as someone important is coming down the road. So the car looks up and says, oh no, that's just a no parking here, no problem, we’re zipping right through. But it's a red light and they get t-boned and people get hurt.

Paul Rand: And they scrape the sticker off and nobody knows better.

Ben Zhao: Exactly. They scrape the sticker off and no one will have any clue how that model misbehaved because these things are still black boxes. All right. So that's really the nasty part of this, is that not only will it happen but you won't be able to identify what caused the event.

Paul Rand: Zhao’s team of course isn’t the only one in the country working on A.I. backdoors, but they are one of the only teams that’s come up with a solution to fix them.

Ben Zhao: So we have the paper coming up in May at Oakland. This conference is called Oakland but it's basically the IEEE conference. That's one of the top venues in security. So we have a paper that actually looks at how these triggers behave and uses that behavior to actually identify and track down these triggers. It turns out we're able to not only detect when a particular model has been infected but also reverse engineer it. So we can actually go backwards and reproduce the trigger. So you know it's infected, but here's what the trigger object or symbol looks like. So that the next time someone actually tries to use it you can actually not only turn it off but also identify the culprit or attacker. I think there's always more powerful attacks coming. So this particular version of backdoor attacks, I think we're able to handle. But we're already thinking about a particular kind of next generation backdoors. You know, there's more versions of this that are coming down the road that are even harder to defend against. Yeah. And the DOD, I’m going to D.C. because there's gonna be a Department of Defense funding agency meeting about a new program that's coming out specifically to target Trojan horses and backdoor attacks in machine learning systems. So clearly this is something that, you know, is severe enough to warrant the attention of the DOD and hopefully get more people involved and interested to actually generate more robust defenses like this stuff.

Paul Rand: Disrupting our view of reality, backdoors that could twist systems into undetectable dangers, with all these scary possibilities, it’s hard not to wonder if developing A.I. is even worth it. We ask Ben that question, after the break.

(Graham School Ad)

Paul Rand: Given all the ways that this technology could be used by bad actors for nefarious purposes, should we really be continuing to research and develop these things? Is it worth it?

Ben Zhao: Is it worth it? You know I don't know that that's really the right question. In some sense if we had a choice to say, can we shut off all development into this area and go our merry way, that would be an interesting question to ask. I'm not sure that we have that luxury.

Paul Rand: The genie is out of the bottle.

Ben Zhao: Exactly. I think this is one of those things where, whether it's atomic fission or the newest gene splicing technique, once science has gone to a certain level you can only hope to make it as balanced as possible. Because, in the wrong hands it will get used in the wrong way. And so as long as the science is moving and technology is moving, you have to try to nudge it towards the light. And so in this sense that's what we're trying to do. These techniques are coming. And there's no stopping that. So the only question is, will they be used for good or will they be used for evil? And can we stop it from being used and weaponized in the wrong way?

Paul Rand: So this actually starts going back, at least the first time that I can recall anything that resembled artificial intelligence was the old movie “War Games”.

WAR GAMES TAPE: “It’ll ask you whatever it’s programmed to ask you. I’ll ask it how it feels. I’m fine how are you. Excellent, it’s been a long time.”

Ben Zhao: Well that's one of them.

Paul Rand: I think that goes quite a while back, I’m dating myself just a bit, but I guess as you think about the real downside to all of this, what is it that you're worried about.

Ben Zhao: You know that's a very interesting question. I think when I first started, for a while I was fascinated with this idea of the singularity, as many people are, of what happens when Skynet takes over.

TERMINATOR 3 TAPE: “Mr. Chairman I need to make myself clear if we uplink now, Skynet will be in control of your military…but you’ll be in control of Skynet right…”

Ben Zhao: Or the Matrix or some version thereof, and we become pawns you know under our robotic overlords.

THE MATRIX TAPE: “We marveled at our own magnificence as we gave birth to AI. AI? Artificial Intelligence. A singular consciousness that spawned an entire race of machines.”

Ben Zhao: And at some point, I think my perspective shifted a little bit once I realized that it will probably get worse before it gets to that point. Right now these A.I. tools, these machine learning tools, are extremely powerful and yet they're not sufficiently well studied or vetted, and we don't have the typical kind of tools to understand them, to test them like we do with normal software. And so it's entirely possible that, whether it's agencies or companies or individuals, they would jump on this technology without fully understanding some of the ramifications, and these tools may in fact, once they're deployed, lead us down a path that is somewhat unexpected and potentially negative. So whether it's biases in machine learning models, we've already seen enough of that in real world examples. That’s one side of it, but there are also these kinds of things like backdoors, where there are vulnerabilities to attacks by individuals, by nation states, and so on. So one could imagine these things, you know, really posing some danger if they're deployed at the right places. And so, I think it's important to all of us, and our responsibility in essence for many of us in the CS (computer science) community, to provide the tools so that we can really be certain, as certain as we can, about their reliability and their behavior before we put them into really mission critical and life changing safety kind of applications.