Learning to speak to whales using AI: Big Brains podcast with David Gruber

Show Notes

If aliens landed on Earth tomorrow, how would we talk with them? Well, we already have a kind of creature on this planet we could attempt to talk to first, and in the last few years a team of renowned scientists have been exploring the ocean studying sperm whales to get that conversation going.

David Gruber is a professor of biology and environmental science at CUNY and the founder of Project CETI, an interdisciplinary scientific initiative that is using the latest developments in AI to understand, and possibly communicate with, sperm whales. The day when we break the cross-species communication barrier may be here sooner than you think. Just this year CETI managed to decode what could be called a sperm whale “alphabet”.

(Episode published May 30, 2024)

Transcript:

Paul Rand: There’s a question that scientists and many others have long debated: if aliens landed on Earth tomorrow, how would we talk with them? The problem goes beyond language alone. How could we communicate beyond the barriers of our different experiences, psychologies, or biologies?

David Gruber: There’s amazing movies like Contact and Arrival that pose this, but really deeply thinking this through from A to Z of what does it look like?

Paul Rand: We may not need to look to outer space to explore these questions. We already have a kind of alien on this planet that we could attempt to talk to first. And in the last few years, a team of renowned scientists have been trying to get that conversation going.

David Gruber: This work is something on par with an inter-terrestrial NASA.

Paul Rand: That’s David Gruber, a professor of biology and environmental science at the City University of New York. He’s also the founder of Project CETI, an interdisciplinary scientific initiative that is using the latest developments in AI to understand and possibly communicate with sperm whales.

David Gruber: What does translation look like? And if we can do this successfully here, essentially this would be the same platform that would be used if we were to encounter an extraterrestrial life form in another galaxy.

Paul Rand: It’s often said that we know more about the Moon and Mars than we do about the ocean and the creatures that inhabit it. In the 1970s, recordings of humpback songs first planted the seed that whales may be capable of using complex language-like systems. And today we finally have technologies that are better at synthesizing and understanding systems than ever before.

David Gruber: Piece by piece we’re able to draw in some of the useful bits of AI that we’ve developed for humans and begin to apply it in this climb, almost like a Mount Everest climb, to understand another non-human communication system.

Paul Rand: And if AIs can learn to translate whale, could they also learn how to understand other animals as well?

David Gruber: The combination of artificial intelligence and bioacoustics is being framed as the new telescope or the microscope. Think of how the microscope really advanced our ability to see inside of a cell, or the telescope advanced our ability to see out into the cosmos. This combination of AI and digital bioacoustics is going to allow us to understand non-human animal communication at a level that we’ve yet to encounter.

And I think the idea that in one sense that a lot of technology you can argue has made the divide between humans and nature further apart, could this be one of the few examples where technology actually brings us closer to nature?

Paul Rand: Welcome to Big Brains, where we translate the biggest ideas and complex discoveries into digestible brain food. Big Brains, little bites, from the University of Chicago Podcast Network. I’m your host, Paul Rand. On today’s episode, can AI allow us to talk to whales?

Big Brains is supported by UChicago’s online Master of Liberal Arts program, which empowers working professionals to think deeply, communicate clearly, and act purposely to advance their careers. Choose from optional concentrations in Ethics and Leadership, Literary Studies, and Tech and Society. More at mla.uchicago.edu.

Sperm whales aren’t the first animal whose brain Gruber has tried to get inside of. The first time he used new technologies to break the barrier between humans and animals revolved around fluorescent fish.

David Gruber: God, it’s a little bit of a twisted story of how I came to fluorescent fish, but it started with coral reefs, and in the process we would go around the world and look at different corals. And then on one dive in the Cayman Islands we were photographing fluorescent corals, and all of a sudden a fluorescent fish jumped into the frame. So it was like one of those very lucky discoveries.

The eel just appeared in our frame. It was the first biofluorescent fish that we’d seen. We started to find that it wasn’t just this one eel, it was like stingrays, we found it in sharks, we found it in lizardfish. So it kind of sent us on a whole suite of new questions of this is something that all these marine creatures had been attuned to for millions of years, what is this doing for the fish’s world or for the shark’s world?

So we started with one species of fluorescent fish, a fluorescent shark, and really putting a lot of time and effort into what can we use technologically to see the world from the point of view of the shark. To be able to bring the shark to an eye doctor, understand the pigments in the eyes, make the right camera system, do the modeling on the water quality and colors, in order to get some first order principles on what that shark might be seeing.

And then we went back and we went swimming with that catshark at night with the camera tuned in to the catshark’s eye, and made some findings of what the world is to that shark. We were able to see there was patterns, male and female sharks had different patterns on them that would be evident to them. It was creating greater contrast for them. It’s like a secret mode of communication where fluorescent sharks could send out messages to other fluorescent sharks that mainly they can see.

In one sense, the theme of this work is trying to see the world from the perspective of the shark. And now, moving over to whales, it’s like the whales, they’re like the bats of the ocean and their world is very sonic, using echolocation. So instead of trying to see the world from the perspective of a shark, we’re trying to hear the world from the perspective of a whale.

Paul Rand: Can you give us some just general background on sperm whales? What, as a general rule, just makes them so interesting, before the language comes into it?

David Gruber: They are so interesting. Wow, it’s like now that I’ve been introduced to sperm whales, they’ve completely taken over my life. Some people call them the animal of superlatives. They have the biggest brain. They’re incredibly deep divers, they dive down over a mile deep. Their evolutionary history is absolutely fascinating.

Marine mammals in general are this unique example of life that ... All life came from the ocean, came onto land, but this is a group that went back, went back into the ocean. And they’re a mammal like us that their skin became 20 times thicker, their nose rolled over their head and became a blowhole. Their head takes up a third of their body.

Paul Rand: Amazing, yeah.

David Gruber: They have an 18-pound brain, highly encephalized. It has many features that are similar to humans, like spindle cells that in humans are related to love and emotion. And so much is not known about them. It was 1957 that scientists first understood that these animals made sound. There’s an unlimited amount of mysteries into how they’re living their world, how they’re communicating, how they’re dreaming, that can really occupy a person forever.

Paul Rand: Gruber wasn’t always this obsessed with sperm whales. In fact, it was a number of chance encounters that led to his current project. The first was a book he just happened to pick up.

David Gruber: I was doing a fellowship at Harvard University at the Radcliffe Institute of Advanced Study, and it was a nice year because I had time to explore new projects and read books. I had read a book and it was called Deep by James Nestor, and it was about free divers that were wanting to use their free diving ability, the ability to hold their breath for seven, eight, nine minutes, to further our knowledge of whales.

And I was playing these sounds in my office of sperm whales, the clicking, just repeatedly listening to them. And it was interesting getting the reactions of people because they don’t have that beautiful melody of the humpback. They’re more a sound like techno music, like [inaudible 00:09:02], almost like jackhammers. So some people would walk by my office and ask me to turn it down or ask what it was.

And across the hall was Shafi Goldwasser. She’s a cryptographer from MIT, she’s now running the Simons Institute for the Theory of Computing. She has a Turing Award in cryptography. And she had stopped in my office and got curious about these sounds, and she invited me to a machine learning working group that she’d organized. And at this meeting she said, “David, play some of those whale sounds you were playing.” And I played it to this group of really leading machine learning experts and the room just went silent.

And then they started asking a lot of questions and one person in the room was Michael Bronstein, another Radcliffe fellow. He’s now the DeepMind Professor of Artificial Intelligence at Oxford. He got really interested and asked me, “Where did you get those sounds? How many sounds are there? What’s the biggest data set?” So I was like, “Well, let’s look.” It led me to a colleague at National Geographic. Shane Gero had been studying sperm whales off the coast of Dominica for the last 15 years, and he’d assembled this magnificent data set of tens of thousands of whale clicks, all that he’d really carefully annotated and had all this metadata and background information on.

So the question we had here is can we get the AI to predict the next click using this data set? And when we ran through this first thing, it turned out that the techniques that Michael showed us from machine learning were incredibly effective. We were getting 99% accuracy on the ability for the computer to predict the next click.
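
To make the idea of "predicting the next click" concrete, here is a minimal sketch. It is not the technique the CETI team used, which the conversation does not describe. It assumes, purely for illustration, that each click has already been reduced to a symbol for the gap before it ("S" for short, "L" for long), and it predicts the next symbol by counting what followed the same two-symbol context in the training data. The function names and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

def train_next_symbol_model(sequences, order=2):
    """Count which symbol follows each context of `order` symbols."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = tuple(seq[i - order:i])
            counts[context][seq[i]] += 1
    return counts

def predict_next(counts, context):
    """Return the most frequent follower of `context`, or None if unseen."""
    followers = counts.get(tuple(context))
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy data: "S" = short gap before the click, "L" = long gap.
# Real recordings are far messier than this.
toy_sequences = [
    list("LLSSLLSS"),
    list("LLSSLLSS"),
    list("LSSLLSSL"),
]
model = train_next_symbol_model(toy_sequences, order=2)

print(predict_next(model, ["L", "L"]))  # most common follower of two long gaps
```

A counting model like this only captures very short-range regularities; it is meant to show what the prediction task looks like, not how well it can be solved.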

Paul Rand: Oh, gosh.

David Gruber: This just got us even more excited.

Paul Rand: I bet.

David Gruber: What would it look like if we had a data set on par with humans for whales? Ultimately, this led to the founding of Project CETI in 2020.

Paul Rand: Project CETI, with a C, stands for Cetacean Translation Initiative, a play on the other famous Project SETI with an S, which is also known as the Search for Extraterrestrial Intelligence. So far they’ve made a number of discoveries that may speak to a linguistic intelligence present in whales. And if so, could AI allow us to translate it? Well, that’s after the break.

If you’re getting a lot out of the important research shared on Big Brains, there’s another University of Chicago Podcast Network show you should check out. It’s called Entitled and it’s about human rights. Co-hosted by lawyers and UChicago Law School Professors, Claudia Flores and Tom Ginsburg, Entitled explores the stories around why rights matter and what’s the matter with rights.

The foundation for how artificial intelligence could be used to translate non-human languages came from a paper published by AI researchers at Facebook in 2017.

David Gruber: And it was able to show that they could translate two human languages, not by actually having a dictionary or some kind of Rosetta Stone in the middle that’s translating, but it was just looking at the shapes and patterns of each language. And then in multidimensional space it was putting them together and translating, and that was able to work across almost all human languages.

One of the key findings there is that even though we seem to feel that there’s such a difference in human language, there really isn’t when it comes to their shape in multidimensional space, that if you twist and turn, that all the languages, you could get them to kind of fit together. This gave us some sense of confidence that the tools that are being invented for humans could be applied to the non-human.
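
The "twist and turn" picture can be sketched with a toy example. This is not the method from the paper mentioned above, which worked without any paired words. Here two anchor pairs are assumed to be known, every word is a made-up 2D point, and the second "language" is simply the first one rotated by a hidden angle. The sketch recovers that angle from the anchors and then uses it to translate a word that was not among them. All names and vectors are hypothetical.

```python
import math

def best_rotation_angle(source, target):
    """Angle (radians) rotating `source` points onto `target` with least squared error."""
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(source, target))
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(source, target))
    return math.atan2(cross, dot)

def rotate(points, angle):
    """Rotate a list of 2D points counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def nearest_label(point, labelled_points):
    """Label whose stored point lies closest to `point`."""
    return min(labelled_points, key=lambda label: math.dist(point, labelled_points[label]))

# Hypothetical toy "embeddings" for language A.
vectors_a = {"sun": (1.0, 0.0), "sea": (0.0, 1.0), "fish": (-1.0, 0.5)}

# Language B has the same shape, rotated by an angle the aligner does not know.
hidden_angle = math.radians(40)
labels_b = {"sun": "sol", "sea": "mar", "fish": "pez"}
vectors_b = {labels_b[w]: rotate([v], hidden_angle)[0] for w, v in vectors_a.items()}

# Recover the rotation from two assumed anchor pairs, then translate "fish".
anchors = [("sun", "sol"), ("sea", "mar")]
angle = best_rotation_angle(
    [vectors_a[a] for a, _ in anchors],
    [vectors_b[b] for _, b in anchors],
)
mapped_fish = rotate([vectors_a["fish"]], angle)[0]
translation = nearest_label(mapped_fish, vectors_b)
print(translation)
```

Real word embeddings have hundreds of dimensions and never match exactly, so the 2D case only conveys the geometric intuition.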

Paul Rand: And the purpose of this, it sounds like, is to decide whether or not you can understand whales and then ultimately the sperm whales, and ultimately can you communicate with them. Which is somewhat Dr. Dolittle like in its own ways, I guess, is the only analogy I could come up with.

David Gruber: We try to really stay away from the Dr. Dolittle analogy, more because ... I mean, it works if people relate to it, but in that sense it’s like there’s a lot of anthropomorphism going on in that, where you’re speaking to the animal. And CETI, which is actually, more about the team right now, we’ve grown to over 50 scientists now across the world, our goal is really to listen to and translate and not exactly to speak to them as if we have something to tell them. They’ve already been hearing us quite a bit with our boat engines and plastic that they’re encountering.

So some of the early work that we’re doing is finding that using unsupervised machine translation, the more complicated their language is or their communication system, the easier that the unsupervised machine translation will work. But then also, the more we have in common with their world, the more that it’ll work. So we do have quite a bit in common with whales in terms of eating and sleeping and socializing, but I think it’s going to get really interesting about the parts of their world and our world that there is no overlap.

Paul Rand: All right, let’s talk about what you’re listening to, because the sounds or the language that the sperm whales are making are known as codas. And I wonder if you can tell us what codas are and what do we know about codas from the work that you’ve been doing?

David Gruber: Yeah, I think everybody’s heard the humpback whale songs. With sperm whales, they use these stereotypical clicks. So it’s almost more like Morse code, but if you look deeper into the clicks, you’ll start seeing more interesting patterns to emerge.

Paul Rand: Those patterns are called codas. For instance, a common coda used by Caribbean sperm whales is a 1+1+3 pattern.
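
One way to read a label like 1+1+3 is as a grouping of clicks by the pauses between them: click, pause, click, pause, then three clicks in quick succession. The sketch below turns a list of click times into such a label. The threshold that separates a "pause" from a "quick" gap, and the example timings, are assumptions chosen for illustration, not values from the research.

```python
def rhythm_label(click_times, gap_threshold):
    """Group clicks whose gaps are <= gap_threshold and return a label such as '1+1+3'."""
    if not click_times:
        return ""
    group_sizes = [1]
    for previous, current in zip(click_times, click_times[1:]):
        if current - previous > gap_threshold:
            group_sizes.append(1)   # a long pause starts a new group
        else:
            group_sizes[-1] += 1    # a short gap extends the current group
    return "+".join(str(size) for size in group_sizes)

# Hypothetical click times in seconds: two spaced clicks, then a fast triplet.
example_coda = [0.00, 0.40, 0.80, 0.95, 1.10]
print(rhythm_label(example_coda, gap_threshold=0.25))  # prints 1+1+3
```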

David Gruber: And this is where we’re getting to the point with CETI is really looking into the substructure of the coda and finding their patterns that add much more possibilities in terms of this click communication style, in terms of the amount of messaging and the amount of information they could be passing with it. They could be just for a few seconds to several minutes, you could have conversations for half hour, hour long.

Paul Rand: In a recent paper, the CETI team showed that these codas vary a lot depending on the context, meaning these patterns are intricate in their conversational applications. There is such a vast array of possibilities, it could be called a phonetic alphabet.

David Gruber: Once we see them, you can’t unsee them.

Paul Rand: For instance, when we use sounds like “ra” and “ain” to make a word like “rain,” sperm whales may be doing the same thing. In total they identified 156 different codas.

David Gruber: Especially with the codas, when they’re more at the surface and socializing, there’ll be several whales and kind of going back and forth, doing turn-taking, a lot like what we’re doing right now. You’re talking and then I’m talking and maybe it would be considered rude if we started talking over each other. But sometimes in sperm whales they do talk over each other.

And so really just understanding, these are real back and forth clicks that we’re seeing. So yeah, we refer to them as conversations.

Paul Rand: Another way we can know that these clicks are a non-random form of communication is that scientists have learned that until they’re older, baby sperm whales don’t make these click patterns. In fact, their patterns could be described as a sort of click babble.

David Gruber: In one sense, we imagine this project is that we’re baby sperm whales because we’re trying to learn from the bottom up. The babies start eventually kind of a lot like a human baby, where not actually clicked into the dialect, like the click, click, click, click, click, but eventually they’ll be practicing it and then lock in on it. So the babies go through a period of babbling and learning from the relatives.

Paul Rand: Not only are these patterns learned, but they even have regional dialects.

David Gruber: When they’re socializing, certain areas they might do, like in Dominica, they do the click, click, click-click-click, and that’s a regional dialect. So there could be groups of units of sperm whales living in the same area, but depending on the unit that they’re with, they’ll have a slightly different change in their dialect that will identify them. And they’ll be overlapped with others, with other sperm whales using slightly different dialects. It’s almost like an accent, like there’s some whales with British accents and some with ... In one sense. And to them it really matters.

Paul Rand: In order to get to the point where there’s a better understanding and you have to collect more and more data and more and more of these codas, and it sounds like the number that you all have settled on is somewhere in the vicinity of about four billion of these clicks. How did you get to that and how are you going about collecting these clicks to build out the model?

David Gruber: We really want to develop a non-human database that’s on par with some of the smaller human large language models.

Paul Rand: In order to get to those four billion clicks you might imagine that Gruber and his team will have to do some pretty invasive research in the lives of these whales, but Gruber has made a point throughout his career to find better ways to study animals.

David Gruber: Working as a marine biologist, it’s like the tools that we use sometimes could be rather invasive. Even describing a new species really entails often killing at least one member of that species to put as a type specimen so scientists can have that on record. And I think many times, I really got into science because I really just love animals, and I really began with even the ants outside of my house as a kid.

I think as my career has gone by, I’ve continued to ask how can we do our science in a way that is most respectful to the species that we’re looking at. Is there ways that we can study animals and learn an incredible amount about them, but not harm them? And I think that’s even more important now as we’re in this era of so many species going extinct, and on par with one of the mass extinction events. So it’s even more important now to just think how we do our work and can we do it in a gentle way.

Paul Rand: AI isn’t the first technology Gruber has used to advance science in this humane way. In fact, CETI is inventing whole new technologies just to collect their data.

David Gruber: It’s really about how we do the work, putting these principles in place that we can never break the whale’s skin when we’re trying to study them. We just had a paper come out on a new kind of suction cup that’s inspired by suckerfish that gently attaches to the whale, based on fish that they’re used to attaching to the whale. And we’re working to basically create an underwater recording studio covering 20 kilometers underwater, with basically these three different microphone devices that are attached to the bottom and have microphones strung up through them, that create a three-dimensional map of where all the whales are within 20 kilometers.

But then there’s other devices, then there’ll be drones flying above that are looking for information when they’re socializing at the surface. There’s a glider that’s in there that could actually follow a mother and a calf, like, outside of the range. There are these on-whale tags that Rob Wood’s team at the Harvard Microrobotics Lab is developing, that go on the back of the whale and record really high-resolution data. And all of this data is joined by time because there’s an atomic clock in each of these devices that link it to the millisecond. So then you could overlay these various devices to almost like a sound mixing and put them all together to then run the machine learning models on top of that.
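
As a rough picture of what "joined by time" could mean in practice, the sketch below merges separate, already time-sorted event streams into a single timeline. It assumes every device reports timestamps on the same shared clock, as described above. The device names, the event fields, and the sample events are hypothetical and stand in for whatever the real instruments record.

```python
import heapq

def merge_streams(*streams):
    """Merge time-sorted streams of (timestamp, source, payload) into one timeline."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))

# Hypothetical events; timestamps are seconds on a shared clock.
hydrophone = [(0.120, "hydrophone", "click"), (0.480, "hydrophone", "click")]
drone = [(0.300, "drone", "two whales at surface")]
tag = [(0.125, "tag", "click"), (0.900, "tag", "dive start")]

timeline = merge_streams(hydrophone, drone, tag)
for timestamp, source, payload in timeline:
    print(f"{timestamp:.3f}s  {source}: {payload}")
```

With the streams interleaved this way, each sound can be read next to whatever the other devices observed at nearly the same moment.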

But the key, key secret sauce really is context, because imagine a baby learning a language, they’re looking for context. I think there’s been studies like babies trying to learn a language just by watching TV but they’re missing the context, they need that kind of interaction. So we really need to know which whales they’re with, what’s happening, what’s the event, are they diving, and adding the context to this large scale bioacoustic database is essential.

And some of this thinking began ... Rob Wood and I had spent the last 10 years developing some of these really gentle robots. And this concept of gentle robots and why we’re doing it came from a Stephen Hawking quote, who predicted that the full-blown advent of artificial intelligence and robotics would lead to a human extinction. And we were thinking, as scientists, this is crazy. Why would we, as scientists, play a role in developing technology that would lead to human extinction? What can we do as scientists to reshape this narrative? And beginning to think of developing very, very gentle robots that don’t harm marine life as we interact with them.

And I think now with this project, it’s a lot like, if we could understand really deeply what whales are saying, how would that bring us closer to them and how would it help protect them? So even embedding that kind of thinking early in the project on how would this be in service of the whales, how would we create empathy? These are kind of still an open question, but it’s really about how we do the work.

Paul Rand: All right, so you guys are done diving for the day. You’re sitting on the boat having a little drink, and you start joking around about what are going to be the first things they say to you. What are you saying to each other?

David Gruber: That’s a really deep question, and I try to slip around this question a lot, because in one sense it’s like it doesn’t matter what I think. It’s like this is a collective question for us to ask as humans soon begin to break this interspecies communication barrier. What should we say?

Many people say the first thing we should say is, “I’m sorry.” Most people would just be like, they would just ask them what’s their day like or how are they doing. I think it’s absolutely fascinating. To even turn that question around at you, if you were going to be the first human to have a back and forth conversation with a whale, what is it that you would say?

Paul Rand: I have to tell you, given the name of our show, which is Big Brains and they have the biggest brains, I think I would ask them if they could be our mascot.

David Gruber: All right. And what would be your hypothetical response?

Paul Rand: I think it would be, “Well, what are you going to pay me,” is what I probably would expect back.

Matt Hodapp: Big Brains is a production of the University of Chicago Podcast Network. We’re sponsored by the Graham School. Are you a lifelong learner with an insatiable curiosity? Access more than 50 open enrollment courses every quarter. Learn more at graham.uchicago.edu/bigbrains. If you like what you heard on our podcast, please leave us a rating and review. The show is hosted by Paul M. Rand, and produced by Lea Ceasrine and me, Matt Hodapp. Thanks for listening.