Google Gemini and Unsustainable LLMs
Reading up on the announcements from Google I/O got me thinking more about how LLMs source their content and how they present it.
A traditional Google search would give you a list of links, ranked by how relevant they were to your query. Though in recent years, results seem to be ranked by how much the website owner pays Google, or by how well they've gamed Google's algorithm with SEO. You would browse through these links, click on the ones that interested you, and read their content. Along the way you'd be shown ads that generated revenue for the content owner, which incentivized them to create more good content that people would click on from the first page of search results. This process made money for both Google and the content creators.
This system started breaking down long before LLMs were involved. Google dropped the ball with search years ago, and results became flooded with SEO spam and AI-generated clickbait. Now, with tools like Google Gemini, a search returns a summarized answer that the LLM "thinks" is correct, based on content scraped from the most relevant links in the search results. Most of the time, however, no sources are cited and no links are provided to the websites the information came from. You can no longer click through to a website, read the article, and view the ads that generate revenue for the content owner.
Here is the crux of the problem. As Google Gemini and other LLMs replace traditional search (which is exactly what Google seems to want), traffic to websites will plummet. Ad revenue will sink with it.
At this point...
- What incentivizes content creators to create content if no one is looking at it and no ad revenue is coming in?
- If content creators stop creating content, where does Google train the LLM?
- How does Google make money? Will they display ads in the LLM answers? This seems self-defeating.
- If Google has no content with which to train the LLM, do they hire contractors to create content for Gemini? Some content is unchanging, but most information needs to be updated over time.
My final question is: Where does the web go from here?
I like a quick answer as much as anyone, but when I want to verify something or learn more than can be said in a few sentences, I want more sources and diverse perspectives. What impact will this have on our society? I think there's already too much power in the hands of big tech when it comes to control of information. Proxying snippets of information through an LLM seems like yet another outlet for controlling and manipulating what people see and think. Is the answer a compromise like what Perplexity AI is doing? They provide an LLM-summarized answer with links to roughly half a dozen sources for further reading.
I'm curious to hear others' thoughts and ideas on this subject. If you'd like to discuss, please reach out to me on Mastodon, or via the contact form below. Thanks!