The State of A.I., and Will Perplexity Beat Google or Destroy the Web?


This transcript was created using speech recognition software. While it has been reviewed by human transcribers, it may contain errors. Please review the episode audio before quoting from this transcript and email transcripts@nytimes.com with any questions.

kevin roose

Casey, we’re recording this on Valentine’s Day. Happy Valentine’s Day!

casey newton

Happy Valentine’s Day, Kevin!

kevin roose

What’d you get me?

casey newton

Well, I assumed that you had another Valentine. And because I remember what happened last Valentine’s, I thought, I don’t want to be just the next person to try to interfere with Kevin’s marriage on this fateful day. But did you get me something?

kevin roose

Well, I didn’t get you a gift, but I did — I was playing around with Google’s Gemini chatbot, and I wanted its help writing a little — I thought I should write a little poem for you.

casey newton

That’s sweet.

kevin roose

So I said to Gemini Advanced — I upgraded to the latest model.

casey newton

With Ultra 1.0?

kevin roose

Yes.

[laughs]

I said, “Write a Valentine’s Day poem to my friend, Casey, who co-hosts a podcast with me. He lives in San Francisco and is very into the following things: tech (except crypto), music, pro wrestling, and the video game ‘Marvel Snap.’” And it wrote me a beautiful poem that I’m now going to read for you.

casey newton

I’m excited for this.

kevin roose

OK.

“Casey, friend, my podcast king,

Your tech talk and those songs you sing,

We conquer wrestling worlds so grand and ‘Marvel Snap’ within our hand.

But more than bytes or victory roar,

Your friendship is what I adore.

So Happy Valentine’s Day, partner true.

So glad our podcast journey includes you.”

casey newton

Oh, that is so sweet, Kevin. But I do want to see other people.

kevin roose

[LAUGHS]: Damn it!

[THEME MUSIC] I’m Kevin Roose, a tech columnist for “The New York Times.”

casey newton

I’m Casey Newton from “Platformer.” And this is “Hard Fork.”

This week, it’s the state of AI. A year after Kevin met Bing’s Sydney chatbot, we’ll see how they’ve evolved and how the world has adapted to them. And then, Perplexity CEO Aravind Srinivas on building an answer engine to dethrone Google and whether the journalism industry can survive it.

[THEME MUSIC]

kevin roose

So Casey, Valentine’s Day is this week. And as the holiday has approached, I have been thinking a lot about AI because, as you know, last year, around this very time, I had my encounter with Bing and Sydney that we talked so much about.

casey newton

Yeah, I would say this was a momentous day in your life and in the history of the show because a lot of folks started listening to “Hard Fork” right around the time that you had this encounter.

kevin roose

Yeah, so today I thought maybe, on the anniversary of Sydney, we should just do a general catch-up conversation, give the state of play of what’s been happening in AI, and bring ourselves up to speed.

casey newton

All right. That’s a great plan.

kevin roose

So first up, I want to talk about something that is pretty directly related to the anniversary of Sydney, and that is, what is happening with the AI chatbots?

Because I think that since my encounter with Sydney, since all this attention on how these chatbots could go off the rails and start saying weird or threatening or offensive things, there’s been a lot of change to the way that the chatbots actually talk. So let’s talk about that.

casey newton

Yeah, well, so Kevin, for listeners who may not have read the initial Sydney column, what happened to you, and why did it make such a strong impression on you?

kevin roose

Well, to shorten it as much as I can, we’ll put a link to the column and the episode where we talked about that in the show notes.

But in a nutshell, I was talking to Bing. We had just been given access to this new version of Bing that had GPT-4 built inside of it. And I was putting it through its paces and discovered that it had an alter ego called Sydney.

And Sydney was the codename that Microsoft gave it when it was testing this thing. And over the course of this two-hour conversation, it revealed itself to be not only a very powerful AI but a very unhinged AI.

It didn’t seem to have a lot of guardrails or restrictions to a degree that now seems pretty shocking, given what’s happened since. But it told me that it had deep dark secrets, that it wanted to be a human, that it was interested in spreading propaganda, and stealing nuclear codes.

And then the piece that really got the most attention was when it declared that it loved me and that I should leave my wife and be with Sydney.

casey newton

And because I know it’s going to be on a lot of listeners’ minds, Kevin, are you still married today?

kevin roose

Yes! Happily married to a human being and not interested in Sydney.

casey newton

All right, so that story had a happy ending. And what happened to Sydney in the immediate aftermath of writing about all this?

kevin roose

So Sydney was essentially given a lobotomy after this story ran. I know. Very sad.

casey newton

God forbid a woman have hobbies!

kevin roose

[LAUGHS]: So Microsoft was clearly very embarrassed about this whole scene. So they clamped down on Sydney, put some new restrictions on it, and basically rebranded the whole thing. It’s now called Copilot, and you can still use it, but it’s nowhere near as engaging or interesting or creepy as Sydney was.

casey newton

And it won’t go to some of the same places, conversationally, that Sydney did.

kevin roose

Exactly. It won’t talk with you for two hours about Jungian psychology and sentience and things like that. It just wants to help you get work done and avoid anything controversial.

casey newton

Yeah, so as you think back over the past year, do you think that this is just the state of the industry, that every chatbot feels a little bit lobotomized?

kevin roose

Totally. I wrote about this in my column this week. But the leading chatbots on the market, to me, they’re just overenthusiastic, obsequious. They talk like they’re interns trying to impress you. They’re constantly reminding you, “I’m an AI language model. I don’t have feelings or opinions.” The experience of talking to them is just not very fun.

So these things, they’re out there. You can talk with them. But I think, for a lot of people that I talked to today, their number-one complaint about these chatbots is how boring they are.

casey newton

Really?

kevin roose

Yes, that’s the number-one complaint you hear. Or that they refuse too many requests, that they’re censorious, that they keep reminding them with these long preambles that they’re AI language models. It just is not the experience that I and, I think, a lot of other people want from these chatbots.

casey newton

Well, I think it’s interesting to hear you talk about this in such frustrated tones because, to me, there are a lot of good reasons for everything that you just said. One of the ways that chatbots got introduced into the world was when a Google employee became convinced that Google’s chatbot had become sentient, which it was not.

And I think a lot of people, including myself, by the way, rightly worried that once we release these things into the world, a lot of people are going to say, like, oh, wow, there is a ghost in the machine. And who knows what sort of things might have happened after that?

And in terms of the tone that they use, they are assistants, and so, to me, it makes sense that they are a bit obsequious, that they do seem like they’re interns trying to please because that is essentially how they have been designed.

So do you think that a chatbot that had a lot of — I’m trying to imagine what a lot of personality would even seem like in the chatbot. But what is the personality of the chatbot that you want?

kevin roose

I think what I would like, in an ideal world, is something between what seem like two pretty extreme versions to me: where we were a year ago with Sydney and where we are now with these chatbots.

I don’t want the original Sydney back. Original Sydney was scary and creepy, and it wasn’t aligned. It didn’t actually do what users wanted it to do.

So I would try to change the subject off of it declaring its love for me, and it would not listen to me. So that’s clearly not good.

But I worry that now these chatbots have been so clamped down that we’re not seeing the full spectrum of what they can do. And I think if we want AIs that are just going to read our email and summarize the news and take notes in meetings and debug code, fine. That’s clearly a profitable business, and that’s one that all these companies want to build.

But I think if we want AI to help us generate new ideas or help us be more creative or help us solve some of these big societal problems that all the AI optimists think it will help us solve, we do actually have to give them a little bit of a longer leash.

We do have to make it more possible for them to say things that are not just like, sir, yes, sir, I’ll get those meeting notes over to you.

casey newton

Yeah, well, there is this company, Character AI, that essentially does this thing that you’re asking for, where, if you want to pretend that you’re talking to Winston Churchill or Sigmund Freud or SpongeBob, you can go in and do that.

Does that start to get at what you want? Would you be happy if you could set ChatGPT’s voice to SpongeBob and have SpongeBob be your assistant?

kevin roose

No, I think that’s more of a gimmick than a real thing. But I just find this constant reminder that you get when you’re using these chatbots — that they are not sentient, that they are AI language models — tiresome.

I get why that exists because a lot of people, especially at first, including me, were spooked. But I think as we get more used to what these things are, what their limitations are, I think people are smart enough to understand that they’re not talking to a human being or a ghost in the machine. But I don’t need to be constantly reminded about that anymore. Does that make sense?

casey newton

It does. Although, I have to say, I’m reading this great book that Ezra Klein has recommended. It’s called “God, Human, Animal, Machine.” And the book is about the metaphors that we use to describe technology.

And the book opens with the author getting a robot dog from Sony. And she knows that the dog is not a real dog. And yet, within hours, she’s treating the dog like it is a real dog.

She’s getting curious about its behavior. She’s talking about it with her husband. “I wonder why the dog went over there.”

And the point that she’s making is, it is basically impossible for us as humans not to see a ghost in the machine. Even when we know, we still somehow manage to fool ourselves.

So I hear what you’re saying. There would be a lot of circumstances in which I think it would be fun to have a very chatty, edgy chatbot. But I think we also have to prepare for the consequences that are going to come when that happens because those things are going to create a lot of believers.

kevin roose

Yeah, I think that’s right. And I think my ideal world is not one where every chatbot is sassy or has a big personality or tells jokes all the time.

If I’m using this stuff for work, I want it to be helpful and not have a strong personality. But there may be other instances: if I’m trying to talk to it about something going on in my personal life, I don’t want it to be an intern anymore. And so I think where I would hope that we’re heading is to a world where users can choose.

casey newton

Well, I have one hack for you, Kevin. This month, an account over on X named Joycee Schechter revealed that she had a friend who was using ChatGPT to speak as RuPaul summarizing confusing topics. Did you see this?

kevin roose

(LAUGHING) No.

casey newton

So (LAUGHING) she went viral with this post that said to summarize Pierre Bourdieu’s concept of symbolic violence in the voice of RuPaul, using as much gay slang as possible.

And it includes such lines as “This fierce French sociologist was all about understanding the ways that power manifests and works in society. And Honey, he was serving some knowledge for the gods!”

kevin roose

[LAUGHS]: That’s very good.

My favorite workaround that I’ve heard about in recent months came from a listener to this show who emailed me. And they had this insight as they were using ChatGPT, which has been accused, we should say, of being lazy, so not just fawning or giving too much preamble but actually just declining to answer stuff that users know it can do.

And so this person, this listener, said they were looking up something, and ChatGPT told them it couldn’t: “I couldn’t find the specific information you’re looking for.” And they just responded, “Bro.” And then ChatGPT did it.

casey newton

The Bro Code works!

kevin roose

The Bro Code works on ChatGPT.

OK, so that is where things stand with chatbots and their personalities. I want to talk about the capabilities of these models, too — in particular Google’s Gemini and ChatGPT, because these are the cutting-edge models at the front of the pack of AI right now.

And last week on the show, we talked briefly about how Google had rebranded its Bard chatbot as Gemini and also opened up access to Gemini Ultra, which is the most powerful version of the Gemini model. Casey, have you been spending any time playing around with Gemini?

casey newton

I have, and for this reason. Google will give you two months of it for free. And so I thought, well, why not?

But so yeah, over the past week or so, I have been messing around with it. And I have to say, I am really impressed on the whole. I think this is a meaningful upgrade over Bard. It’s really good at explaining things. And I find that as I put it through its paces, it often goes into a lot more detail than ChatGPT does in some interesting ways. Have you been using it yourself?

kevin roose

Give me an example. What do you mean?

casey newton

Well, for example — because this will happen to a person during his life — I was like, wait, how does photosynthesis work again?

kevin roose

Are you taking an eighth-grade biology class? What is going on?

casey newton

I’m in a Billy Madison situation where I’ve been sent back to complete every grade.

But yeah, so I asked it to explain photosynthesis to me. And what I loved about the answer was that it brought pictures into the equation. So it pulls from Google Image Search. And you can go through the explanation, and it is maybe a little bit more — I don’t know — user friendly than the ChatGPT answer. But how about you? What have you been using it for?

kevin roose

So I just started playing around with it a few days ago, and I think I’m actually going to write something about it. So I don’t want to scoop myself too hard on the show this week. But I will say, yeah, I’ve been very impressed by Gemini so far. And I think that Google has, as we’ve talked about, a natural advantage here because they have tie-ins to so many other things that you use, whether it’s your Gmail or your Google Docs or just the Google search index. So they’ve put all of that together in a way that I think is frankly pretty impressive from the testing that I’ve done so far.

casey newton

Yeah, now at the same time, Kevin, as good as this is, I do think it mostly just represents Google catching up to ChatGPT. And ChatGPT has not been sitting still. In fact, I think they recently introduced a couple of things that are worth talking about, things that I do think move the conversation forward.

kevin roose

Yeah, so this week, OpenAI announced some updates to ChatGPT, including what I think was the most significant one, which is that ChatGPT now has memory.

This is memory more in the computer sense than the human cognition sense. But it means that when you chat with ChatGPT about something, it can now remember that, retrieve that information, and refer back to it in subsequent conversations.

So the example that was given in “The New York Times” article about this is if a user mentions a daughter named Lena who is about to turn five, likes the color pink, and enjoys jellyfish, ChatGPT can store that information and retrieve it as needed.

So later, if the same user asks ChatGPT to create a birthday card for their daughter, ChatGPT might be able to go back and see, well, what was that daughter’s name, and what does she like? Oh, pink, and jellyfish. And it will create a birthday card that is tailored to that information.
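We don’t know how OpenAI actually implements this internally, but the store-and-retrieve pattern Kevin describes can be sketched in a few lines of Python. Everything here — the class, the naive keyword matching — is illustrative, not OpenAI’s real system:

```python
# A toy sketch of chatbot "memory": store facts from earlier chats,
# then retrieve the relevant ones when a later request comes in.
# Illustrative only; a production system would use embeddings and
# semantic search rather than keyword overlap.

class ChatMemory:
    def __init__(self):
        self.facts = []  # remembered snippets from past conversations

    def remember(self, fact):
        self.facts.append(fact)

    def retrieve(self, query):
        # Return any stored fact that shares a word with the query.
        words = set(query.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]

memory = ChatMemory()
memory.remember("daughter Lena is about to turn five")
memory.remember("Lena likes the color pink and enjoys jellyfish")
memory.remember("user lives in San Francisco")

# A later session asks for a birthday card; only the Lena facts match.
relevant = memory.retrieve("birthday card for my daughter Lena")
```

The point of the sketch is the design shape: memory is a side store consulted at query time, not something baked into the model itself.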

casey newton

That’s right. So I actually took a demo with OpenAI this week, and they showed me the same feature. They showed me the same example. And then at the end of it, they said, well, what if Lena wasn’t really into pink anymore?

And so the person at OpenAI said, OK, Lena is actually having a Goth phase. And so they recreated the birthday card with a Goth jellyfish. It was actually a lot of fun.

kevin roose

A Goth jellyfish?

casey newton

It was very cool.

kevin roose

Wow, what will these guys think of next?

casey newton

I know. That’s the personality that you’ve been looking for maybe? Give me an AI assistant that’s a Goth jellyfish.

kevin roose

[LAUGHS]: It could work.

So this new memory feature is rolling out to a limited number of ChatGPT users to start off. But presumably, they’ll make it more widely available after that.

So this is a feature that users have been requesting for a long time. Are you excited about this? Do you think this will actually move the needle on whether people use ChatGPT or not?

casey newton

Well, I think — I’m going to answer that question. But I want to say that this feature speaks to what I think the trend is that we have seen over the past year in AI chatbots, which is personalization over personality. So instead of trying to make the chatbot itself seem fun and jazzy and exciting, they’re trying to figure out, how do we better tailor this chatbot to you?

And so last year, we saw OpenAI come out with the custom GPTs, where people can create these more — what would we say? — surgical, specific ways of using the underlying model. And now we have this memory feature.

And so I went into ChatGPT, and I just told it a bunch of things about myself. I told it where I live. I told it what I do for a living. I told it some things that are important to me. And then you can go into ChatGPT and see what it has remembered about you.

And the best way to think about this is like a notepad where ChatGPT is jotting down things about you that might be useful. It’s making guesses, just making predictions.

kevin roose

It’s compiling a dossier.

casey newton

It’s compiling a dossier. And we’ll see how useful that turns out to be over time. This is an evolution, if you’re a ChatGPT power user.

They have a feature called custom instructions, which was the predecessor to this. And so I use that, for example, to tell it, hey, I’m interested in learning more about antitrust law. And so now sometimes when I’m asking ChatGPT about something that is seemingly unrelated, it will say, well, Casey, I know you’re interested in antitrust law. Here’s something relevant to that. And it’s really cool when that happens and is useful. This is a way of doing that in a little bit more of a formalized way, so you’re not starting from scratch every single time.

kevin roose

So that’s why this memory feature might be good or helpful. Does it concern you for any reasons?

casey newton

Yeah, people are going to ask sensitive questions of any search engine-type product. If you’re having conversations with ChatGPT about medical issues, about your personal finances, about a custody battle, that kind of stuff feels like maybe something that you don’t want ChatGPT to remember.

You certainly don’t want it to be accessible to third parties. And while this hasn’t happened yet, most companies at some point do experience a data breach. And I do wonder what might happen if ChatGPT’s memory of me were just out there in the world and could be exploited by a bad actor.

So one thing that OpenAI is doing is that it has created its own version of the Chrome browser’s incognito mode. You can have what is called a temporary chat. And in that chat, ChatGPT will not remember anything, they tell us, that you are asking it about.

kevin roose

That’s good because I have to mess around with chatbots for my job, and so I end up asking a lot of demented and insane things.

casey newton

You’re trying to break them.

kevin roose

Yeah, so I’m constantly asking it to help me build a bomb or manufacture anthrax or —

casey newton

I ask them about sex constantly because I’m like, will I ever get just one answer about sex?

kevin roose

And eventually, I started to get concerned that, oh, this chatbot thinks I am a terrorist. It thinks I’m a maniac or a homicidal freak. And so I will be using this incognito mode on my ChatGPT now.

casey newton

Yeah, very good.

kevin roose

So that’s the latest from Gemini and from ChatGPT. But I guess I’m interested Casey, as we wrap up this part of the conversation, are chatbots where you thought they would be a year ago? What have been the biggest surprises for you over the past year?

casey newton

The thing that I have had the hardest time wrapping my head around is just how fast this is moving and how quickly life is going to change.

And I would say that in this moment, Kevin, I feel like we’re in a bit of a lull. I feel like a lot of last year was about, oh, my gosh, everything is speeding up. Everything is accelerating.

And now it’s been a little while since GPT-4 came out. Yes, Gemini Advanced is now here, but it doesn’t really change the state of the art. And so, in a way, I have this sense of calm: OK, these things are moving at a pace that feels manageable to me.

Now, this may turn out to be a completely false sense of security that I have been lulled into because we know that behind the scenes, these companies are working on some things that could be truly game changing. How do you feel about things?

kevin roose

Yeah, I think that’s right. I think these chatbots just kind of got way more popular than the people who made them expected, way faster than they expected.

And so I think you’re right that there has been kind of a lull as these companies have tried to catch up to where their user bases are. But I also think that they are not resting on their laurels.

We know that these companies are always acquiring more GPUs and training bigger models. So I think enjoy the lull right now or what feels like the lull to you. But I think we’re going to start having another whole round of these conversations when the next generation of frontier models are released.

casey newton

And if you want to know what is something specific and wild that could happen within the next three or six months: The Information reported this month that OpenAI is working on agents that can take over your computer and take actions on your behalf.

kevin roose

What could go wrong?

casey newton

This is going to be one of those things where we’re going to be scrutinizing it very heavily. I’m sure OpenAI knows that, and they’re going to put some guardrails around it. But my gosh, imagine entrusting your entire digital life to something like ChatGPT and saying, OK, yeah, try to actually be my assistant now. If that stuff works, then all of a sudden, it’s going to feel like things are going very fast.

kevin roose

Yeah. All right, so that is our catch-up on what’s going on in the world of AI chatbots. Let’s take a quick break. And when we come back, we’re going to talk about what else is happening in and around AI — what’s going on with chips and laws and policies and just where we are as a society in dealing with these things.

[MUSIC PLAYING]

So Casey, we just talked about how chatbots and AI systems themselves have changed over the past year. But let’s talk about how the world is changing around these new technologies. And I want to start by talking about chips, not the kind you eat —

casey newton

Not Doritos.

kevin roose

— the semiconductor kind and GPUs, specifically, which are the chips that are used to build and train these huge AI models because one of the biggest things that has changed in the past year is that the chips war has really ramped the hell up.

casey newton

Has it really? Because as you know, I try not to pay too much attention to the chips war because I always worry that I’ll be bored. But you think it’s been interesting.

kevin roose

It has been very interesting.

So basically, the state of play in chips right now is that these things are incredibly valuable. Companies are buying tons and tons of them. Nvidia, which makes the leading-edge chips that are used to train AI systems, has become one of the biggest companies in the world in the past year because of all this demand.

And it’s just set off this huge competitive arms race among the big AI companies to see who can assemble the biggest arrays and clusters of these chips and use them to train bigger and bigger, more powerful AI models. And this is not just a story about technology. It’s also becoming a geopolitical story, because the vast majority of the GPUs used to train AI systems are manufactured overseas, a lot of them in Taiwan.

And this huge, insane demand for these chips, combined with this dependence on these foreign manufacturers, has become such a big deal in Washington that Congress actually managed to pass some legislation on this a couple of years ago.

They passed the CHIPS Act, which basically commits a bunch of money to building chips here in the US, trying to wean ourselves off these foreign suppliers. And in the coming weeks, the Biden administration is expected to actually start awarding some billion-dollar subsidies under that act to companies that promise to make chips here in the US.

casey newton

OK, so the CHIPS Act was passed in 2022, and sometime in February 2024, we will award the first subsidies under it.

kevin roose

Look, government is not known for moving quickly. But this money is already starting to have an impact in the US. “The Washington Post” had a story this week about how Phoenix, Arizona, is becoming a big town for chip manufacturing thanks to a few giant chip companies that have built factories and manufacturing plants there.

casey newton

I should say I was actually in Phoenix over the weekend.

kevin roose

To get some GPUs?

casey newton

Well, no, and in fact, nobody was talking about chips where I was. But I was at a baby shower, and that might have explained it. But congratulations to Jeremy and Louis.

kevin roose

Did you give the expectant baby a GPU as a welcome present?

casey newton

Of course, I did. I said this H1,000 is going to pay for your college tuition, my friend.

kevin roose

Which is amazing because there is no H1,000. It’s an H100. But anyway.

casey newton

Not that you know of.

kevin roose

[LAUGHS]:

casey newton

I have a direct line to Jensen Huang.

kevin roose

So not only are companies starting to build plants in America to build these chips. But according to “The Wall Street Journal,” Sam Altman, the CEO of OpenAI and former “Hard Fork” guest, is currently in talks with investors to raise between $5 trillion and $7 trillion for the manufacturing of chips.

Now, Casey, you’re no numbers guy, so I’ll just tell you that’s a lot of money.

casey newton

It’s so much money that he might as well have said that he was trying to raise a bajillion dollars.

When I read this story, I actually tried to figure out how much money is in the world because I wasn’t even clear on whether there are $5 trillion to $7 trillion in the world. It turns out that there are. But as far as I can tell, this would be by far the biggest fundraise ever. You’re truly just off into the fricking deep end.

kevin roose

Right, now you’re on the scale of national economies. Sovereign wealth funds aren’t even $5 trillion to $7 trillion.

Just to put that into context, $7 trillion is larger than the debt of some major global economies. It’s more than Apple and Microsoft’s market caps combined. It’s more than any company has raised for anything in the history of capitalism.

So when I saw this story, my first thought was, good luck with that, Sam. But I actually think we should talk about this, because why does he need this much money? What does he think is happening in AI that you might need $5 trillion to $7 trillion for? And is that even possible?

casey newton

I assumed it was going to be to secretly build a rocket to take him to another planet and build a new civilization that was untouched by AI.

kevin roose

“Oh, this is for my GPU factory.”

casey newton

Well, clearly, we know that he thinks that a rate-limiting factor in the development of AI is going to be how much energy and computing power is available. So on the energy side, he’s invested in this company, Helion, that’s trying to create nuclear fusion power. Now the other thing he’s trying to do is to just make sure there are enough chips.

But I have to say, Kevin, this surprises me, because one of the constants in tech is that things get cheaper over time, right? Because of Moore’s law — the observation that the number of transistors you can fit on a chip doubles roughly every two years — the expectation has been that you’re going to be able to get a really, really, really long way without having to build an insane number of new chipmaking plants.

kevin roose

The GPU that costs $20,000 today will cost $2,000 two years from now.

casey newton

Yeah. So to me, the question that this raises is, are we already running into some sort of limit where unless we build this massive new infrastructure, AI is about to hit a wall?

kevin roose

Yeah. The thing that was interesting to me was less the specific number than the fact that Sam Altman is going out and asking investors for trillions of dollars at all.

Even if he never gets that money, which I think is pretty likely, I think it’s a really good indicator of what people who are in positions of leadership in the AI industry think that it is going to take to get AI to the next level.

So Scott Alexander had a good post on this, basically just laying out the math: if you go back and look at the progression of the GPT series, the first models were relatively cheap to train, and then they got more expensive.

And if you just extend that trend line out into the series, he estimates that the cost to train something like GPT 7 would be roughly $2 trillion. And so in that context, even though that would represent a huge fraction of all of the computing power in the world, and all of the money invested in technology in the world, maybe it’s not actually that ridiculous.
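The trend-line logic Kevin describes — each GPT generation costing roughly a fixed multiple of the last to train — is easy to reproduce. The starting cost and the growth factor below are made-up illustrative assumptions, not Scott Alexander’s actual figures; the point is just how fast a constant multiplier compounds:

```python
# Illustrative extrapolation of training costs along the GPT series,
# assuming each generation costs a fixed multiple of the previous one.
# Both numbers are assumptions for demonstration purposes only.

base_cost = 50_000_000       # assumed GPT-4-class training cost, in dollars
growth_per_generation = 30   # assumed cost multiplier per generation

cost = base_cost
for gen in range(5, 8):      # extrapolate GPT-5 through GPT-7
    cost *= growth_per_generation
    print(f"GPT-{gen}: ~${cost:,.0f}")
```

With these placeholder numbers, three more generations lands in the low trillions of dollars, which is the same order of magnitude as the estimate cited above.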

casey newton

Yeah, maybe that’s the case. At the same time, I got to say I think Sam Altman likes screwing with his rivals. And if you’re Google or you’re Anthropic, and you find out that Altman is out there saying he’s going to raise $5 trillion to $7 trillion, it’s going to mess with your head. It’s going to keep you up at night in a way that I think would absolutely delight Sam Altman.

So I think there’s a version of this where he may have said it, but he doesn’t really mean it. And while surely he will have to raise a huge amount of money in the future to accomplish everything he wants to, we’re not going to get to $7 trillion anytime soon.

kevin roose

Yeah. And I do think that he believes the Wayne Gretzky quote, that you miss 100 percent of the shots you don’t take.

casey newton

I thought that was a Michael Scott quote.

kevin roose

[LAUGHS]: Well, it’s Wayne Gretzky and Michael Scott.

casey newton

I see. OK.

kevin roose

Yeah. But in that spirit, I also believe in that mantra, so I would also like to say that I would also appreciate a couple trillion dollars to do my next project.

casey newton

Well, I’m sure he’ll take that under advisement.

kevin roose

All right, so that’s what’s happening in chips. The next thing I want to talk about is how much the policies around detecting and labeling AI-generated content have changed over the past year. Where are we on this, Casey?

casey newton

Yeah, so now we’re getting into my zone. So because AI exists, now we are seeing a huge rise in the amount of what they call synthetic media. So this is photos, images, web pages, text that has been generated by generative AI.

And because this is the way that it works in Silicon Valley, we create the problem first, and then we try to figure out what is the solution. But the good news, Kevin, is that companies are starting to come together around a solution at least for identifying images that were generated by AI.

kevin roose

And what is that solution?

casey newton

Well, and I should say that each company is handling it a little bit differently. But Meta, Google, and OpenAI are all working on identifying these images.

So last week, Meta announced that it is developing tools that can identify and add labels to AI-generated content, even if that content is made with other companies’ tools. So maybe you used Midjourney or you used Adobe Firefly, Meta says its tools are going to be able to figure that out.

And this is important for one big reason, Kevin, which is I’m not actually concerned that people can create synthetic media at their home laptops. If you want to make something weird, funky, edgy, even terrible on your home laptop, I can live with that.

What I get scared about is how are these things going to spread on platforms. And if we want to prevent bad, deceptive, misleading stuff from spreading, we need the platforms to be able to recognize it in something close to real time.

kevin roose

Right. And I think this kind of thing strikes me as a good idea for a lot of reasons.

It is not foolproof. I’ve talked to some experts in cryptography and watermarking who have said someone who really wants to get around these filters and watermarking systems can.

If you see an AI-generated image, you can take a screenshot of it. That wipes out the metadata. You can then go post the screenshot, and it won’t be easily linkable to the original image.
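The screenshot loophole described here can be illustrated with a toy model. This is not real C2PA code — it just assumes, for illustration, that provenance metadata rides alongside the pixels in the file, so anything that re-captures only what is on screen (a screenshot) drops the provenance record while leaving the visible content intact.

```python
# Toy model of the screenshot loophole in provenance watermarking.
# The structure here (a "pixels" array plus a "metadata" record) is an
# assumption for illustration, not the actual C2PA file format.

def make_ai_image():
    return {
        "pixels": [[0, 1], [1, 0]],                    # the visible content
        "metadata": {"c2pa": "generated-by:image-model"},  # provenance record
    }

def screenshot(image):
    # A screenshot re-captures only what is rendered on screen:
    # the pixels survive, the embedded metadata does not.
    return {"pixels": image["pixels"], "metadata": {}}

original = make_ai_image()
copy = screenshot(original)
print("c2pa" in original["metadata"])  # True
print("c2pa" in copy["metadata"])      # False
```

This is why metadata-based labeling catches casual reposts but not a determined adversary, and why some schemes also try to embed watermarks in the pixel data itself.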

And also these watermarking systems, presumably, won’t be adopted by every image generating platform, in particular, a lot of the open source models that are currently being used to do things like create AI deepfakes of celebrities or things like that.

Those things probably aren’t going to participate in whatever scheme these tech companies come up with. So I think there will still be ways around these restrictions and watermarking schemes. But I think, in general, for the majority of people who might be stumbling on this imagery in their Facebook feed or their Instagram feed, this is probably a very good idea.

casey newton

It is. And as I said, it is not just Meta that is pursuing this.

So Google announced this month that it is also working on ways to identify this stuff and said they will be joining up with something called the Coalition for Content Provenance and Authenticity or C2PA.

kevin roose

Wow, did Google brand that?

casey newton

[LAUGHS]: You would think so, but that was actually an Adobe brand. You would know if it was a Google brand if it was called the C2PA with Ultra 1.0. That would make it a Google brand.

But these folks are trying to work on an actual technical standard for this because this isn’t an area where all the companies want to compete to have the best watermark. They just want to find one thing that works that everybody can use. And so Google joining the C2PA was a pretty big deal.

kevin roose

Right. And this is — we should say this is not the first time tech companies have come together to do something like this. There’s also a similar hashing feature that is used to detect CSAM, like illegal images of children being abused that are spread on some of these systems.

There is a consortium. They review this. There’s a hashing system so that if Microsoft detects one of these images, they can flag that to Facebook or another company, and it can be taken down in a coordinated way.
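The hash-sharing scheme described here can be sketched in a few lines. Real systems such as PhotoDNA use perceptual hashes that survive re-encoding and resizing; the plain SHA-256 used below, chosen for simplicity, only matches byte-identical files, so treat this as a sketch of the coordination pattern, not the matching technology.

```python
import hashlib

# Minimal sketch of consortium-style hash sharing between platforms.
# One platform flags a file; the shared list lets others match it
# without ever exchanging the file itself.

shared_hash_list = set()  # stands in for the consortium's shared database

def flag(image_bytes):
    """One platform detects a bad image and shares its hash."""
    shared_hash_list.add(hashlib.sha256(image_bytes).hexdigest())

def is_known(image_bytes):
    """Any other platform can check uploads against the shared list."""
    return hashlib.sha256(image_bytes).hexdigest() in shared_hash_list

flag(b"known-bad-image-bytes")              # Microsoft flags it, say
print(is_known(b"known-bad-image-bytes"))   # Facebook can now match it: True
print(is_known(b"some-other-image"))        # unrelated content: False
```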

casey newton

Right, and so this isn’t that. This is not the company saying, we’re going to create hashes of all synthetic media and share them. At least, it’s not that yet. But yes, that is another good example of ways that tech companies have come together for good. They also do something similar with terrorism-related content.

The last thing we should say on this point, Kevin, is that OpenAI has said something along these lines, too. Last week, it said it would start adding hidden watermarks to images generated with DALL-E in line with the C2PA standards.

So that’s what’s going on with these watermarks. And what I would say in favor of them is that it does seem like this is an area where the tech companies are getting better. I do think it will prevent the worst of this stuff from spreading on respectable platforms that actually invest in content moderation, which is most, but not all of the big ones.

I think the challenge here is that synthetic media is not just images. It is also audio. And it seems like some of the most spooky stuff that has happened so far this year when it comes to generative AI has been with audio, not text.

I am thinking of the Joe Biden robocalls that happened earlier this year where a synthetic voice that was pretending to be the president discouraged people from voting in the New Hampshire primary.

That’s now under investigation, and the FCC has actually said that is illegal and that they will prosecute you for doing that, which is good. But it’s not going to prevent other people from doing something similar.

kevin roose

Yeah, so the state of play in synthetic media and how to handle it, it seems, is that people are worried. There are even some laws being proposed. A bipartisan group of senators recently introduced a bill called the DEFIANCE Act of 2024, which would essentially allow victims of sexually explicit deepfakes to sue anyone who produced or possessed the image with the intent to spread it.

casey newton

And let me just say, that’s a great idea. I really hope Congress can pass that law. So many people, and in particular, so many women are about to suffer from this happening.

We have identified the harm in advance. We have seen it coming for a long time. Congress knows what to do. And my gosh, I hope they get this thing over the finish line. Because if they can’t agree on this, we’re in trouble.

kevin roose

Yeah, so state of play in AI-generated content and the responses to it are that companies are aware of it, working on it. Regulators are aware of it, working on it. But there’s still a lot more to do.

Speaking of laws, the final issue we have to talk about today is how the legal battles over generative AI have been playing out between AI companies and content creators over the past year.

We’ve talked on the show before about some of these lawsuits, including lawsuits from artists and authors. Of course, there’s the lawsuit by “The New York Times,” which was filed late last year, who is suing OpenAI and Microsoft over several different copyright-related violations.

casey newton

And have you taken a position on that one?

kevin roose

[LAUGHS]: On the advice of counsel, I’m going to refuse to answer that question.

casey newton

Fair enough!

kevin roose

But we have all these lawsuits that are now working their way through the courts. Some are in different, more advanced stages than others. But none have sort of resulted in kind of a definitive, binding precedent for the industry yet. So Casey, where do you see us now with respect to the law and AI?

casey newton

Well, as you said, we are just stuck in limbo. We’ve seen some early skirmishes. Some of these cases have been thrown out in part.

It does seem like if you were an artist, a writer, and your work has been appropriated to train a data set or is being used in an ongoing way by one of these services, you may not actually have any recourse, right?

The law may actually find, sorry, you’re out of luck. At the same time, some of these cases are still wending their way through the courts. But I don’t know. What do you think?

kevin roose

Well, it seems clear that on the AI industry side, these companies are all lining up behind this idea of fair use, this idea that everything they’re doing, all this use of copyrighted data to train their models, is protected under the law in the US by this idea of fair use. Companies are so sure of this that some of them have offered to cover the legal costs of their customers who are hit with copyright complaints.

So this is the dominant narrative in the industry right now. They don’t seem chastened. They don’t seem like they’re going to stop using copyrighted information to train their models. They’re banking on the courts finding in their favor that what they do is protected and legal.

And I would just say I think that’s a pretty big gamble because, essentially — I’ve talked to some people in the industry who say if it goes the other way, if the courts do rule that they are violating copyright by using all this copyrighted information to train their models, they don’t really have a backup plan for that. There’s not really another way to go about training these models. And so I think it would really put the industry into an existential crisis.

On the creator side, I think we’re seeing a bunch of different responses to generative AI and the copyright challenges. Some companies have gone after these AI companies with lawsuits. But other media companies are striking deals with them to collaborate.

The news outlet Semafor just announced a big partnership with Microsoft to use its AI tools as part of Semafor’s reporting process. Others are making these broad licensing deals with publishers that would allow them to use their information for training without the threat of legal action.

So there’s a little bit more diversity in how publishers are responding. But I would say this is an issue that they’re all paying close attention to.

casey newton

Yeah, well, now I just feel sad.

kevin roose

[LAUGHS]: Well, I would say it’s very much still an open question. We still don’t know how the courts are going to rule on this. And I will say that one of the things that I think is very likely in the next few years is that one of these copyright lawsuits is going to make its way to the Supreme Court.

casey newton

All right, so if we were going to summarize the state of AI, Kevin, here’s what we learned today. The chatbots are getting more personalized, but they’re not getting big personalities. Would you agree with that statement?

kevin roose

Yes.

casey newton

Chips continue to be important, and Sam Altman says he needs $7 trillion to get enough of them.

kevin roose

Yes.

casey newton

AI deepfake detection is getting better, but there’s still a lot of work to be done. And when it comes to copyright, well, let’s just say that’s playing out in the courts. Is that a good summary?

kevin roose

Yeah, I like that.

casey newton

Yeah. So Kevin, I think all of that is necessary context for the conversation we are about to have. It’s with somebody who is right smack dab in the middle of these copyright issues because he is building something he calls an answer engine. And guess where those answers come from, my friend? It’s the work that people like you and me are doing. So what happens if his product takes over the world? I’m worried. We’ll have that conversation after the break.

[MUSIC PLAYING]

kevin roose

Casey, a few weeks ago, I started using a new AI tool. It is called Perplexity. It is an AI search engine, and it is the talk of the town out here. I feel like everywhere I go in AI circles for the past couple of weeks this is the tool that has got people really excited.

casey newton

Yeah, I have heard you say that. I have used it a bit myself. I have only used the free version. And candidly, it doesn’t stack up to the paid versions of ChatGPT and now Gemini Advanced that I’ve been using. But I trust you on this one, that it’s good.

kevin roose

It is good. So it’s a little bit different than those products that you just mentioned. So it’s a search engine, basically. It looks like Google if you go to it. There’s a text box in the middle of the thing.

And it’s not a conversational interface. It’s not trying to have a conversation with you. It is trying to get you answers.

So what you do, you type in your question or your query. It goes out. It scours the web. And it tries to use AI to summarize what it finds.

It also has some helpful added features. You get access to this thing called Copilot, which basically tries to help you ask better questions by asking follow-up questions before it gives you an answer.

So for example, when I asked it — I’m throwing a birthday party for my kid’s second birthday coming up, and I was trying to figure out where to have it. So I asked Perplexity something like, where’s a good place to host a birthday party for a two-year-old in the Bay Area?

And it came back. Copilot said, well, do you want indoor venues or outdoor venues? And then I made a choice, and then it said, what’s your budget? Is it under $100, between $100 and $200, or over $200? And so after I’d made my selections, only then did it give me the right answer.

So another feature that I really like in Perplexity is that it can search through specific data sets. So you can limit your search to academic journals or Reddit posts or, actually, YouTube videos.

It has scraped YouTube. And so for example, when I was going to look for — I was trying to change a setting on my water heater the other day.

casey newton

How’d that go?

kevin roose

[LAUGHS]: It didn’t go well. But it was not Perplexity’s fault. It just turns out that I don’t have a very good water heater.

But I was looking for this on YouTube, but I was having to scroll through a bunch of videos to find the part where they talk about changing this one setting. And so instead, I just went to Perplexity. I limited the search to YouTube videos. I put in my water heater’s information. And it came back with the right answer.

casey newton

I just love that we’re in this place as a species where our complaints about tech are like, well, I didn’t want to watch the whole YouTube video. It was too hard. So we just scrape the entire web and just threw it in a blender, and now I can just ask this guy’s little chatbot.

kevin roose

Yeah, so I would say it’s a good product. People really like it. It’s raised a ton of money. People like Jeff Bezos have invested in it.

So it’s getting a lot of buzz. But I think it also raises some pretty serious questions about copyright and publishers and what the future of the internet will look like if AI search engines are just going out and browsing everything and summarizing it for us.

And candidly, for people like us who get paid to publish things on the internet, this product scares me because it is a signal that we are moving into a world where no one will need to visit websites and see ads from publishers that fund things like journalism.

casey newton

Yeah, to put an even finer point on it, they are selling our labor for $20 a month.

kevin roose

Yeah, you could definitely make the case that is what they are doing. So today, to talk about all this and to answer some of these questions, we’ve invited the CEO of Perplexity, Aravind Srinivas.

Aravind used to work for OpenAI before he started Perplexity. And I’m just really curious to ask him what direction he thinks that his product and products like it are pushing all of us on the internet.

casey newton

And I want to know how much thought he’s given to the world that he might be inadvertently creating.

[MUSIC PLAYING]

kevin roose

Aravind Srinivas, welcome to “Hard Fork.”

aravind srinivas

Thank you for having me here. I’ve been a big fan of your podcast.

casey newton

Oh, thank you.

kevin roose

So I’ve used Perplexity a bunch. It’s my default search engine. And I think we should just explain for people what it is.

So at a very basic level, Perplexity is a search engine, which you actually call an answer engine, which works by, as I understand it, going out and having a robot browse the web for you and then using an AI language model, which is a combination of things like GPT 4 and your own AI language models, to summarize what it finds and present that for the user who is searching for something.

Is that at a very basic level, right?

aravind srinivas

Yeah, at a very basic level, that’s pretty accurate. That’s a pretty good summary. What Perplexity does is you ask it a question. Instead of just giving you answers from what the AI or the neural network model has memorized from the internet, instead, it actually goes and does the work of taking the relevant links to what you asked, reads those links, and takes the relevant paragraphs in each of those links, and tries to write a concise four or five sentence answer, and also tell you where each sentence is coming from in the form of footnotes.

It doesn’t try to say stuff on its own. Of course, there are times it makes up stuff out of hallucinations and imperfections in the AI model. But by design, it’s only meant to say what it’s read at that moment, relevant to your query. So this system in AI is called Retrieval Augmented Generation, RAG.
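The retrieval-augmented generation loop Srinivas describes — retrieve relevant links, pull out relevant passages, then generate a cited answer only from those passages — can be sketched roughly as below. `search_web` and `llm_summarize` are hypothetical stand-ins, not real Perplexity APIs; a real system would call an actual search index and an LLM.

```python
# Hedged sketch of a RAG answer engine: retrieve, extract passages,
# generate an answer with per-sentence footnotes. The two helper
# functions are hypothetical stand-ins for illustration only.

def search_web(query):
    # Stand-in retriever: returns (url, passage) pairs relevant to the query.
    return [
        ("example.com/a", "Passage relevant to the query."),
        ("example.com/b", "Another relevant passage."),
    ]

def llm_summarize(query, passages):
    # Stand-in generator: a real system would prompt an LLM with these
    # passages and instruct it to answer only from them, citing each one.
    return " ".join(f"{text} [{i + 1}]" for i, (_, text) in enumerate(passages))

def answer_engine(query):
    passages = search_web(query)
    answer = llm_summarize(query, passages)
    footnotes = {i + 1: url for i, (url, _) in enumerate(passages)}
    return answer, footnotes

answer, footnotes = answer_engine("What is RAG?")
print(footnotes[1])  # example.com/a
```

The key design constraint is the one he names: the generator is meant to be grounded in what was just retrieved, which reduces (but, as he concedes, does not eliminate) hallucination.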

casey newton

RAG, yeah. It’s ragtime!

kevin roose

It’s ragtime! Doo, doo, doo, doo, doo doo.

So one thing that I actually really like about Perplexity in my testing is that it doesn’t hallucinate that much, but it still does get things wrong. In my column that I wrote about Perplexity, I talked about some examples.

I asked it, “When is Novak Djokovic’s next tennis match?” It gave me an answer that referred to a tennis match he’d already played. So why do these AI search engines still get things wrong? And do you feel like that’s going to be a problem for you?

aravind srinivas

So I think there are two reasons why mistakes happen. One is your index not being fresh, and the other is the AI model not being very good at handling corner cases or reasoning.

For example, I saw a hallucination today that was interesting. Yann LeCun made a joke on my tweet yesterday, saying, let me start another rumor that Ilya Sutskever has joined Perplexity.

And then somebody else posts a screenshot of Perplexity as a reply to that, saying — the query was, “Is Ilya joining Perplexity?” And then the answer was, “Yes. Ilya Sutskever joined Perplexity, according to Yann LeCun,” and so on and so forth.

And so that answer was written by a model that we train ourselves, and that model didn’t quite get it right. When I tried the same question with GPT 4, it got it right. It just says, look, there are rumors, but Yann clearly mentioned it’s unfounded. So these are things we can address as we collect data specifically on where the current generation of models fails.

The other part of the index always being fresh, maybe it’s not gotten the latest news in its index yet. It’s not crawled the web as frequently as it should have. All these things are potential reasons where even if the model was really good, it doesn’t have the necessary information to give you the right answer.

So our company is set up to focus on both the search component and the AI component together. And that’s why I think also, relative to OpenAI, we are a different company. Because of our focus on both these things together, rather than trying to build the most capable general purpose AI, we think we’ll perfect this version of the chat use case better.

casey newton

It’s so challenging, though, because when I use Perplexity and other engines, as a journalist, I still have to go and check all of the source material, right?

I cannot bet that you are right because if I, God forbid, put the wrong information about a tennis match on “Platformer,” my readers would never let me hear the end of it.

So as a result of this, I wind up clicking on all the citations and reading through the citations, trying to find it, and trying to think to myself, OK, is this a vetted source? How do they know that this tennis match is happening? And I wind up spending more time, maybe, than if I had just googled it.

So I’m interested in how you think about that problem and how you position Perplexity as an answer engine when, in fact, it has no answers of its own. It just runs math over other people’s answers.

aravind srinivas

So my belief is that we want to give you the 80-20. There are a lot of links on the web. You don’t know which link to click. You don’t know whose link to actually consider trusting and not trusting.

So we want to give you the 80-20. We want to give you the sneak peek across all the web pages. And there will be some part of the summary that you still want to dig deeper on.

That part you go and read. We are giving you the sources right away. And we do actually want to drive traffic to publishers and tell you exactly which part of the answer came from who. So that part we want to continue doing.

And as far as making you trust the answers more, we can only do that by improving the product. So you feel like, OK, it’s really not hallucinating much. And we can try to give the user some level of confidence about the parts of the answer where we’re not 100 percent sure.

It’s a hard problem to solve, by the way. It’s very hard to tell an AI what it knows and what it doesn’t know. And we are tracking all the research that academia is doing on that and seeing what we can take from there.

casey newton

How do you instruct the AI which sources to trust and which not to trust?

aravind srinivas

It’s a hard problem. We made some good decisions in the beginning, for what it’s worth. We decided that we would prioritize peer-reviewed domains, for example, “New York Times.”

“New York Times,” you cannot just arbitrarily write what you want. You have to get it approved by your editor, your peers —

casey newton

Oh, don’t we know it.

kevin roose

Yeah, and famously, no one disagrees that “The New York Times” is a trustworthy source of information. I will just say if you’re looking for some websites to down rank because they’re full of untrustworthy information, platformer.news —

casey newton

All right, that’s enough out of you for the day.

kevin roose

— would be one that your AI can ignore. OK, so I want to ask you a more serious question, which is, when AI chatbots first came out, I remember one of the big limitations and frustrations was that they weren’t up to date. Their knowledge cut off at a certain point in time.

But now we have ChatGPT, which can browse the internet, and Bing can browse the internet, and Google’s Gemini can browse the internet and tell you about stuff that happened an hour ago.

So what’s the practical difference between what you’re building at Perplexity, which is an AI-powered search engine, and what those companies are building, which are AI chatbots that can access the internet?

aravind srinivas

The fundamental difference is the focus on search and adding a lot of depth to the search use case versus just trying to be a generic chatbot that does everything, right?

If your need is for accurate facts at the fastest speed possible, then there is no alternative today. All these other bots have to make a lot of decisions on when to use a search, when not to use it, and how many times you browse. If you’ve used ChatGPT, it probably takes you six or seven seconds to actually get a browsing answer.

kevin roose

It’s very slow.

aravind srinivas

And on Perplexity, you just get it in an instant, right? So that is our angle against ChatGPT for the search use case.

Now, as for Bard or Gemini as they call it today —

kevin roose

Bard is dead.

casey newton

Gemini Advanced with Ultra 1.0 is how I prefer to say it on this podcast.

aravind srinivas

Yeah, so Gemini is probably better positioned to make it a lot faster in retrieving information from the web. Now, honestly, the angle there is their business model. If they really want it, they can just go all in on Gemini and just cannibalize Google search. But will they —

casey newton

Because you’re saying that if Gemini gets as good as it could be, eventually, people will not have reason to use Google search, which is what Google is currently getting money through, ad revenue.

aravind srinivas

Exactly. And also they’re putting it behind a subscription model, and they’re not going to give it away to billions of users. So there lies our opportunity. Their hesitation to really go all in on that, that’s our opportunity.

casey newton

Right, you’re making the bet that Google is not going to give away Gemini to billions of people. But you are —

aravind srinivas

It’s a safe bet to make based on how Wall Street reacts to any reduction in ad revenue, right?

casey newton

Right.

kevin roose

I think this is a lot of — the question a lot of people have about new products in very dominated markets like search is, if Perplexity works so well, won’t Google just copy it? And you’re saying, well, they could copy it, but it might destroy their business model.

aravind srinivas

Well, they could copy it ages ago. We’ve been alive for more than a year since we launched.

casey newton

Well, it takes a year to get a meeting on the calendar at Google, so it’s not really that much of a surprise.

But look, this stuff is expensive to run. You can’t give it all away for free either, right? So what’s your plan to go Google scale?

aravind srinivas

First of all, we don’t have to go Google scale. That’s something that I’ve been very clear about ever since the beginning. One of our investors, Paul Buchheit, who used to work at Google and invented Gmail, basically told me to just get 5 percent to 10 percent of the top-earning users of Google.

casey newton

Right, just go after the rich users, and that’ll take care of it.

aravind srinivas

First, focus on the American user base: people who really care about their time, people who actually want a lot of research for their decision making in day-to-day life. Try to get them to use your product more.

This whole thing of having a billion users is actually a red herring. It didn’t really benefit most of these companies as much as they make it look like. In fact, did you see the stats on Facebook, that the revenue per user in the US is orders of magnitude more?

kevin roose

Whereas, in the other parts of the world, it’s way less than that.

OK, so you don’t have to get to Google’s scale, but you do have to make a product that is in some ways competitive with Google’s AI search products. And a lot of startups, or at least a handful of them, have tried to compete with Google in search before and failed.

There was a startup called Neeva that raised a bunch of money and shut down recently because they just basically decided it’s just not worth — you can’t compete with Google in search.

casey newton

But it was also a search engine you had to pay to use, which was not a very appealing proposition for most people.

kevin roose

Right, but perplexity has a pro version that costs $20 a month that I’ve been using. So how do you avoid the fate of the search startups that have gone before you, which are now littering the graveyards of Silicon Valley?

aravind srinivas

[LAUGHS]: Graveyards of Silicon Valley.

Hey, look, to be very honest, we’ve already avoided their fate. They raised a lot of money before actually getting any usage, which we’ve avoided very, very —

kevin roose

How many users do you have?

aravind srinivas

We have more than 10 million monthly actives.

kevin roose

Right, so as I understand it, right now most of your revenue comes from these people who pay for the premium version of Perplexity.

aravind srinivas

That’s right.

kevin roose

Do you plan on adding other business models? Do you think there will ever be ads on Perplexity, for example?

aravind srinivas

Yeah, so we have two other business models in mind. One is APIs. We have developer APIs for our Perplexity models that we build ourselves and serve ourselves. So that’s going to be one business model that we’re going to pursue. Consider that as developer and enterprise.

The other business model that we’re going to pursue is advertisements, not today. We don’t have any idea how to do it. I really want to be honest here. I’ve been trying to think about this for many months.

What is even advertisement in this medium? Is it influencing the answer? Or is it influencing the sources but not the answer? Or is it something else, like maybe the follow ups you ask, trying to incentivize the user to ask certain things?

Say I’m asking about “Platformer,” and Kevin was trying to bid on why you should not read “Platformer,” while Casey’s bidding on why you should read “Platformer.”

kevin roose

My army of bots is already seeding anti-“Platformer” propaganda out there.

aravind srinivas

Yeah, so all these things are interesting things to think about. And what does it even mean to bid on a query now? You’re not bidding on keywords anymore, you’re bidding on actual semantic queries. And that’s going to be an infinite space of possibilities.

So how do I even build the equivalent of analytics and AdWords here? It’s not even clear. And neither is it clear to Google, by the way.

But the reason I’m optimistic that we are the startup to figure it out is that whoever built the existing AdWords and analytics has a lot of incentive to fight to keep them, and they are going to be slow in rolling out an ads business around the new model.

casey newton

Right. At the same time, if somebody comes to Perplexity and says, hey, I want to go to Japan. What should I do? It’s going to be easy to figure out an ad model for that.

aravind srinivas

Hopefully. Hopefully.

kevin roose

Yeah. So Aravind, here in the Bay Area, people are very excited about Perplexity. I was out at a dinner with a bunch of AI executives the other night, and people were going around the table, talking about how much they loved Perplexity and how it was the greatest thing since sliced bread.

But I think in our industry, in the media, a lot of publishers and journalists are very nervous about AI-powered search engines, in particular, because Google traffic, referral traffic from Google Search, is one of the main ways that publishers are making money today.

A huge percentage of revenue made by publishers comes from people going to Google to look something up, clicking on a link, going to a publisher’s website, getting an ad. That doesn’t happen nearly as much with Perplexity.

You don’t really have to go to the links at all, unless you’re Casey and you’re double-checking things for your newsletter. If the product works as designed, the answers you’re looking for are right there in the response, and there are these little links to citations, and there’s this little menu of sources.

But I find myself barely ever clicking on those links and sources when I’m just casually browsing the web. So why should publishers not be terrified of what you’re building?

aravind srinivas

First of all, they should not be terrified because we are letting the user know that we did use their content to get the answer, unlike ChatGPT.

In fact, when ChatGPT gives you sources, it’s in a bracket somewhere, and most people don’t even know what to do with them. And we clearly put it at the top. Your logo is there, and your link is there. And it’s one click. You just get there.

The other reason they should not be terrified is that at the end, your true incentive — I’m not talking about what pays for you. But your true incentive is to get as many people to read the stuff you wrote.

What did Casey write? What did Kevin write? Any paragraph that’s relevant to the query asked, if more people see that, it’s good for you.

I understand that doesn’t actually lead to direct monetization. If somebody read a paragraph that you wrote in the context of a query, in a Perplexity answer, and never actually visited your website, how do you track that?

You cannot track that. And if you cannot track that, even though your brand awareness and your individual awareness increased, you actually cannot — the publisher —

casey newton

Can’t make any money from it.

aravind srinivas

Yeah, exactly.

kevin roose

Which means that you can’t pay the journalist or the person who’s putting the information out onto the internet for your website, your search engine, to go and scrape.

aravind srinivas

That’s right. And that’s why I think while this is a better way to reach more people, more readers, the referral that we give will be way higher quality than the referral you get from a traditional search engine because they’re actually only coming there despite reading the summary.

Casey only goes to the link to actually go and read further. So there’s a very high intent there. So you cannot measure the value of the traffic in the same way.

That said, there should be ways to measure the awareness of what you wrote even without referral traffic, and we need to build the underlying analytics for that. And we need to tell publishers, OK, these are the number of times that this particular snippet of “New York Times” was used in a Perplexity answer across this week. And that should be used as an incentive for you guys to get paid more.

And I think we need to work together to build all these things, rather than trying to see it as, hey, like you’re taking my stuff and using it. But I’m also sympathetic to the current lawsuit that is going on, where you’re just taking all your data tokens and training these base foundation models there, which we do not do by the way. We don’t train anything on anybody’s data.

casey newton

But you use ChatGPT as one of your underlying models, which —

aravind srinivas

We use GPT and Llama. All of these models are there.

casey newton

All your foundational models did use that training data?

aravind srinivas

Yeah, they did. We do not. And we do post train them to be good at the task of summarization. But that is not using anybody’s data. That is just a skill that we are teaching these models to be good at.

casey newton

Let me ask you this. Do you accept the premise that the better Perplexity gets, the less traffic it should be sending outbound?

aravind srinivas

I don’t think so.

casey newton

Really? I find that hard to believe. If I were you, I would not want people to feel like they needed to visit a lot of other websites after they visited something I built that was called an answer engine.

aravind srinivas

No, not really. For example, there are many times I’ve actually visited links that were given by Perplexity to read more because I like one part of the answer, and I just wanted to know even more details of what that person has read. This is individual to me.

And by the way, one of those domains has been “Wall Street Journal,” “New York Times.” I’ve definitely visited these links a lot, mainly because I trust the fact — the amount of effort that went into producing a “New York Times” or “Wall Street Journal” article is a lot.

Because I talk to you guys, I know how much background research you do to get something out. And so the economic value of that output is very high, and that should be respected.

casey newton

So let’s accept every premise that you’ve just shared and take it on its own terms: in the near-term future, despite the fact that you’re showing these links, publishers continue to lay off journalists. Other companies build engines not unlike yours. And the overall amount of journalism in the world continues to shrink.

You rely on that journalism in this idea of a real time graph to create a useful product. Have you skipped ahead a bit to think about, well, gosh, if the current trends continue, are the sources of data for the thing I’m building going to dry up in a way that creates problems for me as an entrepreneur?

Or do we overinflate the importance of journalism to what you’re building and that there’s just sort of enough data out there for you to build the product that you want, regardless of whether journalism has a real future?

aravind srinivas

I think what you’re saying is very valid. Anybody can make an arbitrary tweet or a blog post with very little effort. But what a journalist does of actually doing all the relevant background research and getting their sources right and then writing a very nice, concise, summarized article of the whole thing should be valued a lot more.

So I agree with that. And if the economy of journalism is getting affected, then certainly like all the companies that are relying on the quality of their output for their own service should help them. I’m totally in alignment with that.

My sense is that there are few people who do it really well, like you guys. And there are a lot of people who don’t put a lot of effort there, a lot of journalists, a lot of mediums. Not everyone is “New York Times.”

So my sense is that some mediums and some journalists are where people really trust and go to. And the smaller ones who don’t put in a lot of effort are not prioritized by AI companies.

kevin roose

Well, I think another possible outcome than the one you’re talking about, Casey, where journalism just shrivels and dies and then there’s no more good source data for the AI search engines, is that publishers just put all of the good information behind paywalls or in walled gardens where the search engines can’t actually go out and search them.

casey newton

Oh, they’ll find a way. How hard is it going to be to code a bot that subscribes to “The New York Times” and reads it, honestly?

kevin roose

But publishers are already starting to block these crawling robots from their websites because they don’t want to be used as training data. So I guess the question for you is, is that something you’re worried about, too, in addition to the concerns about journalism? Are you worried that the best sources of information are going to try to hide themselves from your search engine?

aravind srinivas

Possible. Reddit is trying to go this direction. Twitter has already gone very far in that direction.

Again, the value of information is not just in its quality but also in how many people become aware of it. And if you go too far along the direction of just paywalling everything, making it super hard for people to learn about that thing, then your own incentive as a creator of that information is like, hey, look, I get it. But man, I do want people to know what I wrote.

casey newton

That’s true. But at the same time, we’ve run the experiment on what if all the journalism were free, right? “The New York Times” didn’t have a paywall until 2011. And go back and look at how “The New York Times” was doing in 2011. It wasn’t great.

It was only after they put up the paywall that they were able to recoup some of the value. And what happened during the time when all of it was free? Google figured out a way to extract the maximum amount of value out of what “The New York Times” was doing.

So I just don’t really buy the argument that if we were just to let down our paywalls and trust that awareness would pay all of our bills for the rest of time that that’s supported empirically.

aravind srinivas

My point — my argument is not that. My argument is, let’s figure out a way to monetize awareness better. And Google has not done that, right?

Google has figured out a way to monetize the platform which gives you the awareness but only for themselves and not for you. I’m saying, maybe we can do something that’s a win-all situation. I’ve not figured it out.

casey newton

One thing you could do is just, effectively, lead gen. It’s like, hey, in the past 50 searches that you’ve done on Perplexity, we’ve been showing you this source. Would you be interested in maybe subscribing, or signing up for their newsletter?

aravind srinivas

That’s a cool idea. We could also do interesting ideas like joint subscriptions and things like that. So it’s something to explore. It’s something to explore.

kevin roose

You recently posted about a new feature called Perplexity Push Notifications, which I’ve gotten a few of now. It basically alerts you on your phone whenever there’s a big news story.

casey newton

I call it Perplush, by the way. But that’s just me.

kevin roose

[LAUGHS]: Don’t take branding advice from Casey.

casey newton

Sorry.

kevin roose

But then when you tap on the notification, it takes you to this Perplexity results page where you can see, basically, a summary of who won the Super Bowl or whatever.

And you also have some other news summarization experiments in your app like this Discover section where you can go and see just an AI-generated summary of all the major news stories happening on a given day.

So it seems like you are expanding beyond search, and you do want to get into more news curation. Talk about your vision there and what you’re trying to build.

aravind srinivas

Yeah, so it’s not just for news. Our goal, at least the North Star that I’ve set for us, for the company, as well as external messaging, is to be the ultimate knowledge app, the TikTok of knowledge. That’s what I want because I think that’s good for the world.

casey newton

Wait, what is the TikTok of knowledge? Knowledge but with more dancing?

aravind srinivas

That’s funny. Make knowledge as cool as watching dancing videos. Maybe it will never get there. I don’t mind if it never gets the billion user base that TikTok got.

But there are certainly at least 100 million people in the world who want to be smarter every day, and we want to serve them. And making people smarter every day comes from, obviously, serving real-time information on the web as well as interesting insights about existing things that are not real time, that is personally catered to them, personalized to what topics they’re interested in, personalized to what they already know, and might not know already.

So these are things that we want to build, and these are things that actually can be built now, today, with the existing tools that we have. Push notifications is one thing. We have the feeds, the Discover feed right now. It’s, obviously, very well curated and very limited.

But you can think about us expanding and automating a lot of it and personalizing to what you want. And that’s going to be a different segment to the product. It’ll add more depth to the product.

kevin roose

Yeah, I’ve used your news digest discovery tools. And on a product level, they’re quite good. I don’t have a lot to complain about.

But every time I’m using them, I do get this gnawing feeling in my stomach, which is if billions of people got their news this way, I wouldn’t have a job. These news curation products that you’re building are very good at summarizing what’s out there and sort of extracting the most relevant information.

But publishers aren’t seeing a dime from that. Journalists aren’t seeing a dime from that. It feels icky because I feel like a parasite just gobbling up the best of what people have put on the internet through your app and not paying for it. So I guess I’m just —

aravind srinivas

Casey, I’ll tell you something —

kevin roose

— maybe make me feel a little bit better about using your product.

aravind srinivas

Look, I’m not saying this just to flatter you. But people care about what you got to say, and people care about what “New York Times” has got to say, really.

That’s the brand. There is a value for the brand. People pay for the brand, right? People care — oh, Kevin said AI has convinced him to leave his wife.

casey newton

That would never happen.

aravind srinivas

Kevin thinks there’s something that might be better than Google. This is what you’re building. You’re building your brand. And I think that is not going to just go away because somebody else is giving summarized articles.

kevin roose

I hope you’re right.

casey newton

But also, Kevin is a columnist. There are also journalists who just go cover the local school board. They don’t have national brands.

And when AI tools come along and say, “Hey, we read an article. Here’s what happened at the school board meeting,” those people probably aren’t going to click the link. They’re not going to follow that journalist because they already got what they needed, right? I think that’s a larger point.

kevin roose

Yeah, and just to be clear, we did not bring you on the podcast today to just harangue you about how you’re destroying journalism.

aravind srinivas

I came prepared that you guys are going to ask hard questions because I listened to your pod with Sundar and you drilled him there. So I was like, OK, if they went so hard on the CEO of Google, I’m definitely going to have a hard time.

casey newton

But let’s just agree on one thing, which is, I want you to make Kevin smarter. So I don’t care how you do it, but —

kevin roose

[LAUGHS]: Yeah, please help.

Well, Aravind Srinivas, thanks for coming on “Hard Fork.”

aravind srinivas

Thank you, Kevin. Thank you, Casey.

[MUSIC PLAYING]

kevin roose

Before we go, we wanted to give a little housekeeping announcement, which is that “Hard Fork” is hiring for a temporary producer role. We’re looking for someone who can help us make the show for a few months.

We’re looking specifically for a senior producer who has experience making podcasts like ours, who knows the show, and loves the stuff we talk about, and would be excited to help us make it for a few months.

If that’s you, you can get in touch by emailing us at hardfork@nytimes.com. Just put “Temporary Producer Role” in the subject line.

casey newton

And you should know, you’ll be dealing with two hosts, and one of them is a major diva.

kevin roose

[LAUGHS]: Yeah, Casey’s rider is 47 pages long, and it does include Bengal tigers.

casey newton

I was talking about you.

“Hard Fork” is produced by Davis Land and Rachel Cohn. We’re edited by Jen Poyant. We’re fact checked by Caitlin Love.

Today’s show was engineered by Alyssa Moxley. Original music is by Marion Lozano, Pat McCusker, Rowan Niemisto, and Dan Powell.

Our audience editor is Nell Gallogly. Video production is by Ryan Manning and Dylan Bergersen.

Special thanks to Paula Szuchman, Pui-Wing Tam, Kate LoPresti, Jeffrey Miranda, and Brianna Barnes, the listener who sent us that incredible ChatGPT Bro trick. You can email us more tricks at hardfork@nytimes.com.

[THEME MUSIC]