Pixel Envy

Google Search Is More Useful if You Know Its Advanced Operators, to a Point

By: Nick Heer
1 April 2026 at 03:31

Hana Lee Goldin:

The search bar you already have is more capable than that arrangement requires you to know. With the right syntax, it becomes a precision instrument: narrow by domain, by date, by file type, by exact phrase. We can pull up archived pages, surface open file directories, and even find what people said in forums instead of what brands want us to find. None of it requires a new tool or a paid account. The capability has been there the whole time.

Advanced search operations are something Google does better than any competitor. DuckDuckGo has its bangs and I like them very much, but Google has a vast catalogue able to be searched with such precision — to a point. If you use these advanced search operators, get ready to see a lot of CAPTCHAs. Google will slow you down and may even block you temporarily if you use it too well.
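As a rough illustration of the syntax Goldin describes, here is a minimal Python sketch that composes a query from a few widely documented operators: `site:`, `filetype:`, a quoted exact phrase, and `after:`/`before:` for dates. The function name and its parameters are hypothetical, and `example.com` is a placeholder. It only builds a string; it does not contact Google.

```python
from urllib.parse import quote_plus


def build_query(terms, site=None, filetype=None, exact=None, after=None, before=None):
    """Compose a search query string from optional operator arguments."""
    parts = [terms] if terms else []
    if exact:
        parts.append(f'"{exact}"')          # exact-phrase match
    if site:
        parts.append(f"site:{site}")        # restrict to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to one file type
    if after:
        parts.append(f"after:{after}")      # date in YYYY-MM-DD form
    if before:
        parts.append(f"before:{before}")
    return " ".join(parts)


query = build_query("section 230", site="example.com", filetype="pdf", after="2025-01-01")
print(query)
# URL-encode the query for use in a browser address bar.
print("https://www.google.com/search?q=" + quote_plus(query))
```

Running many such queries in quick succession is exactly the kind of use that tends to trigger the CAPTCHAs mentioned above.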

⌥ Permalink

Meta Loses Two Landmark Cases Regarding Product Safety and Children’s Use; Google Loses One

By: Nick Heer
26 March 2026 at 05:00

Morgan Lee, Associated Press:

A New Mexico jury found Tuesday that social media conglomerate Meta is harmful to children’s mental health and in violation of state consumer protection law.

The landmark decision comes after a nearly seven-week trial. Jurors sided with state prosecutors who argued that Meta — which owns Instagram, Facebook and WhatsApp — prioritized profits over safety. The jury determined Meta violated parts of the state’s Unfair Practices Act on accusations the company hid what it knew [about] the dangers of child sexual exploitation on its platforms and impacts on child mental health.

Meta communications jackass Andy Stone noted on X his company’s delight to be liable for “a fraction of what the State sought”. The company says it will appeal the verdict.

Stephen Morris and Hannah Murphy, Financial Times:

Meta and Google were found liable in a landmark legal case that social media platforms are designed to be addictive to children, opening up the tech giants to penalties in thousands of similar claims filed around the US.

A jury in the Los Angeles trial on Wednesday returned a verdict after nine days of deliberation, finding Meta’s platforms such as Instagram and Google’s YouTube were harmful to children and teenagers and that the companies failed to warn users of the dangers.

Dara Kerr, the Guardian:

To come to its liability decision, the jury was asked whether the companies’ negligence was a substantial factor in causing harm to KGM [the plaintiff] and if the tech firms knew the design of their products was dangerous. The 12-person panel of jurors returned a 10-2 split answering in favor of the plaintiff on every single question.

Meta says it will also appeal this verdict.

Sonja Sharp, Los Angeles Times:

Collectively, the suits seek to prove that harm flowed not from user content but from the design and operation of the platforms themselves.

That’s a critical legal distinction, experts say. Social media companies have so far been protected by a powerful 1996 law called Section 230, which has shielded the apps from responsibility for what happens to children who use it.

For its part, the Wall Street Journal editorial board is standing up for beleaguered social media companies in an editorial today criticizing everything about these verdicts, including this specific means of liability, which it calls a “dodge” around Section 230.

But it is not. The principles described by Section 230 are a good foundation for the internet. This law, while U.S.-centric, has enabled the web around the world to flourish. Making companies legally liable for the things users post will not fix the mess we are in, but it would cause great damage if enacted.

Product design, though, is a different question. It would be a mistake, I think, to read Section 230 as a blanket allowance for any way platforms wish to use or display users’ posts. (Update: In part, that is because it is a free speech question.) From my entirely layman perspective, it has never struck me as entirely reasonable that the recommendations systems of these platforms should have no duty or expectation of care.

The Journal’s editorial board largely exists to produce rage bait and defend the interests of the powerful, so I am loath to give it too much attention, but I thought this paragraph was pretty rich:

Trial lawyers and juries may figure that Big Tech companies can afford to pay, but extorting companies is certain to have downstream consequences. Meta and Google are spending hundreds of billions of dollars on artificial intelligence this year, which could have positive social impacts such as accelerating treatments for cancer.

Do not sue tech companies because they could be finding cancer treatments — why should I take this editorial board seriously if its members are writing jokes like these? They think you are stupid.

As for the two cases, I am curious about how these conclusions actually play out. I imagine other people who feel their lives have been eroded by the specific way these platforms are designed will be able to test their claims in court, too, and that it will be complicated by the inevitably lengthy appeals and relitigation process.

I am admittedly a little irritated by both decisions being reached by jury instead of a judge; I would have preferred to see reasoning instead of overwhelming agreement among random people. However, it sends a strong signal to big social media platforms that people saw and heard evidence about how these products are designed, and they agreed it was damaging. This is true of all users, not just children. Meta tunes its feeds (PDF) for maximizing engagement across the board, and it surely is not the only one. There are a staggering number of partially redacted exhibits released today to go through, if one is so inclined.

If these big social platforms are listening, the signals are out there: people may be spending a lot of time with these products, but that is not a good proxy for their enjoyment or satisfaction. Research indicates a moderate amount of use is correlated with neutral or even positive outcomes among children, yet there are too many incentives in these apps to push past self-control mechanisms. These products should be designed differently.

⌥ Permalink

In a ‘Test’, Google Is Automatically Rewriting News Headlines in Its Search Results

By: Nick Heer
21 March 2026 at 03:37

Sean Hollister, the Verge:

Since roughly the turn of the millennium, Google Search has been the bedrock of the web. People loved Google’s trustworthy “10 blue links” search experience and its unspoken promise: The website you click is the website you get.

Now, Google is beginning to replace news headlines in its search results with ones that are AI-generated. After doing something similar in its Google Discover news feed, it’s starting to mess with headlines in the traditional “10 blue links,” too. We’ve found multiple examples where Google replaced headlines we wrote with ones we did not, sometimes changing their meaning in the process.

As I noted when I linked to Hollister’s article about Discover back in December, this is not new in search results; it has been happening for years.

Danny Goodwin, Search Engine Land:

Dig deeper. Google changed 76% of title tags in Q1 2025 – Here’s what that means […]

According to the Google Search Central section on title links, originally published in 2021:

I am not arguing this is good or normal — the examples Hollister shows are extremely poor reflections of the articles in question — but I do not understand why it is only gaining traction now, nor how it meaningfully differs from what Google has been doing all along. It is indeed frustrating.

Many of the results you see in Google Search misrepresent the source material and are misleading. But that has been true for a while — which is a problem unto itself. People should not trust the results they see as represented by Google Search. The visual tone Google has maintained, however, is that it is a neutral directory. The summaries in A.I. Overview are delivered with an unearned dry authority, and the ten links below it are there because of a tense truce between Google’s goals and those of search optimization professionals.

Also, I had no idea that Search Engine Land had been acquired at some point by Semrush which, in turn, was bought by Adobe.

⌥ Permalink

Reddit Sues Perplexity and Three Data Scraping Companies Because They Crawled Google

By: Nick Heer
25 October 2025 at 05:48

Matt O’Brien, Associated Press:

Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an “industrial-scale, unlawful” economy to “scrape” the comments of millions of Reddit users for commercial gain.

[…]

Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a “former Russian botnet,” and Texas-based startup SerpApi, which lists Perplexity as a customer on its website.

Mike Masnick, Techdirt:

Most reporting on this is not actually explaining the nuances, which require a deeper understanding of the law, but fundamentally, Reddit is NOT arguing that these companies are illegally scraping Reddit, but rather that they are illegally scraping… Google (which is not a party to the lawsuit) and in doing so violating the DMCA’s anti-circumvention clause, over content Reddit holds no copyright over. And, then, Perplexity is effectively being sued for linking to Reddit.

This is… bonkers on so many levels. And, incredibly, within their lawsuit, Reddit defends its arguments by claiming it’s filing this lawsuit to protect the open internet. It is not. It is doing the exact opposite.

I am glad Masnick wrote about this despite my disagreement with his views on how much control a website owner ought to have over scraping. This is a necessary dissection of the suit, though I would appreciate views on it from actual intellectual property lawyers. They might be able to explain how a positive outcome of this case for Reddit would have clear rules delineating this conduct from the ways in which artificial intelligence companies have so far benefitted from a generous reading of fair use and terms of service documents.

⌥ Permalink

Google Provides Feedback on the Digital Markets Act

By: Nick Heer
27 September 2025 at 05:27

Something I missed in posting about Apple’s critical appraisal of the Digital Markets Act is its timing. Why now? Well, it turns out the European Commission sought feedback beginning in July, and with a deadline of just before midnight on 24 September. That is why it published that statement, and why Google did the same.

Oliver Bethell, Google’s “senior director, competition”, a job title which implies a day spent chuckling to oneself:

Consider the DMA’s impact on Europe’s tourism industry. The DMA requires Google Search to stop showing useful travel results that link directly to airline and hotel sites, and instead show links to intermediary websites that charge for inclusion. This raises prices for consumers, reduces traffic to businesses, and makes it harder for people to quickly find reliable, direct booking information.

Key parts of the European tourism industry have already seen free, direct booking traffic from Google Search plummet by up to 30%. A recent study on the economic impact of the DMA estimates that European businesses across sectors could face revenue losses of up to €114 billion.

The study in question, though published by Copenhagen Business School, was funded by the Computer & Communications Industry Association, a tech industry lobbying firm funded in part by Google. I do not have the background to assess if the paper’s conclusions are well-founded, but it should be noted the low-end of the paper’s estimates was a loss of €8.5 billion, or just 0.05% of total industry revenue (page 45). The same lobbyists also funded a survey (PDF) conducted online by Nextrade Group.

Like Apple, Google clearly wants this law to go away. It might say it “remain[s] committed to complying with the DMA” and that it “appreciate[s] the Commission’s consistent openness to regulatory dialogue”, but nobody is fooled. To its credit, Google posted the full response (PDF) it sent the Commission which, though clearly defensive, has less of a public relations sheen than either of the company’s press releases.

⌥ Permalink

⌥ The Unknown Effect of Google A.I. Overviews on Search Traffic

By: Nick Heer
27 July 2025 at 20:16

Pew Research Centre made headlines this week when it released a report on the effects of Google’s A.I. Overviews on user behaviour. It provided apparent evidence searchers do not explore much beyond the summary when presented with one. This caused understandable alarm among journalists who focused on two stats in particular: a reduction in searches resulting in a click on a result, from 15% to just 8% when an A.I. Overview was shown, and the finding that just 1% of searches with an Overview resulted in a click on a citation in that summary.

Beatrice Nolan, of Fortune, said this was evidence A.I. was “eating search”. Thomas Claburn, of the Register, said they were “killing the web”, and Emanuel Maiberg, of 404 Media, says Google’s push to boost A.I. “will end the flow of all that traffic almost completely and destroy the business of countless blogs and news sites in the process”. In addition to the aforementioned stats, Ryan Whitwam, of Ars Technica, also noted Pew found “Google users are more likely to end their browsing session after seeing an A.I. Overview” than if they do not. It is, indeed, worrisome.

Pew’s is not the only research finding a negative impact on search traffic to publishers thanks to Google’s A.I. search efforts. Ryan Law and Xibeijia Guan of Ahrefs published, earlier this year, the results of anonymized and aggregated Google Search Console data finding a 34.5% drop in click-through rate when A.I. Overviews were present. This is lower than the 47% drop found by Pew, but still a massive amount.
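The 47% figure attributed to Pew is a relative drop, not a difference in percentage points. A small sketch of the arithmetic, using only the 15% and 8% click rates quoted earlier (the function name is my own):

```python
def relative_drop(before, after):
    """Fractional decline from `before` to `after`."""
    return (before - after) / before


# Pew: clicks on a result in 15% of searches without an Overview, 8% with one.
pew = relative_drop(0.15, 0.08)
print(f"Pew: {pew:.1%} relative drop")
```

That works out to just under 47%, which is the number comparable to the 34.5% drop reported by Ahrefs.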

Ahrefs gives two main explanations for this decline in click-through traffic. First, and most obviously, these Overviews present as though they answer a query without needing to visit any other pages. Second, they push results further down the page. On a phone, an Overview may occupy the whole height of the display, as shown in Google’s many examples. Either one of these could be affecting whether users are clicking through to more stuff.

So we have two different reports showing, rather predictably, that Google’s A.I. Overviews kneecap click rates on search listings. But these findings are complicated by the various other boxes Google might show on a results page, none of which are what Google calls an “A.I.” feature. There are a slew of Rich Result types — event information, business listings, videos, and plenty more. There are Rich Answers for when you ask a general knowledge question. There are Featured Snippets that extract and highlight information from a specific page. These “zero-click” features all look and behave similarly to A.I. Overviews. They all try to answer a user’s question immediately. They all push organic results further down the page. So what is different about results with an A.I. twist?

Part of the problem is with methodology. That deja vu you are experiencing is because I wrote about this earlier this week, but I wanted to reiterate and expand upon that. The way Pew and Ahrefs collected the data for measuring click-through rates differs considerably. Pew, via Ipsos KnowledgePanel, collected browsing data from 900 U.S. adults. Researchers then used a selection of keywords to identify search result pages with A.I. Overviews. Ahrefs, on the other hand, relied on data directly from Google Search Console automatically provided by users who connected it to the company’s search optimization software. Ahrefs compared data collected in March 2024, pre-A.I. rollout, against that from March 2025 after Google made A.I. Overviews more present in search results.

In both reports, there is no effort made to distinguish between searches with A.I. Overviews present and those with the older search features mentioned above, and that would impact average click-through rates. Since Featured Snippets rolled out, for example, they have been considered the new first position in results and, unlike A.I. Overviews in the findings of Pew and Ahrefs, they can drive a lot of traffic. Search optimization studies are pretty inconsistent, finding Featured Snippets on between 11%, according to Stat, and up to 80% according to Ahrefs.

But the difference is even harder to research than it seems because A.I. Overviews do not necessarily replace Featured Snippets, nor are they independent of each other. There are queries for which Overviews are displayed that had no such additional features before, and there are queries where Featured Snippets are being replaced. Sometimes, the results page will show an A.I. Overview and a Featured Snippet. There does not seem to be a lot of good data to disentangle what effect each of these features has in this era. A study from Amisive from earlier this year found the combined display of Overviews and Snippets reduced click-through rates by 37%, but Amisive did not publish a full data set to permit further exploration.

But publishers do seem to be feeling the effects of A.I. on traffic from Google’s search engine. The Wall Street Journal, relying on data from Similarweb, reported a precipitous drop in search traffic to mainstream news sources like Business Insider and the Washington Post from 2022 to 2025. Similarweb said the New York Times’ share of traffic coming from search fell from 44% to 36.5% in that time. Interestingly, Similarweb’s data did not show a similar effect for the Journal itself, reporting a five-point increase in the share of traffic derived from search over the same period.

The quality of Similarweb’s data is, I think, questionable. It would be better if we had access to a large-scale first-party source. Luckily, the United States Government operates proprietary analytics software with open access. Though it is not used on all U.S. federal government websites, its data set is both general-purpose — albeit U.S.-focused — and huge: 1.55 billion sessions in the last thirty days. As of writing, 44.1% of traffic in the current calendar year is from organic Google searches, down from 46.4% in the previous calendar year. That is not the steep decline found by Similarweb, but it is a decline nevertheless — enough to drop organic Google search traffic behind direct traffic. I also imagine Google’s A.I. Overviews impact different types of websites differently; the research from Ahrefs and Amisive seems to back this up.
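To put the two declines side by side, here is a sketch that expresses each as a percentage-point change and as a relative change. It uses only the shares quoted above. The two sources measure somewhat different things (share of traffic from search versus share from organic Google searches), so treat this as a rough comparison of scale and nothing more.

```python
# Share of traffic, in percent, as quoted in the text: (before, after).
shares = {
    "New York Times, per Similarweb (2022 to 2025)": (44.0, 36.5),
    "U.S. government analytics (previous vs. current year)": (46.4, 44.1),
}


def change(before, after):
    """Return (percentage-point change, relative change) between two shares."""
    points = after - before
    return points, points / before


for label, (before, after) in shares.items():
    points, rel = change(before, after)
    print(f"{label}: {points:+.1f} points, {rel:+.1%} relative")
```

By this arithmetic the Similarweb figure for the Times is a 7.5-point fall, while the government data set shows a 2.3-point fall.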

Google has, naturally, disputed the results of Pew’s research. In an extended comment to Search Engine Journal, the company said Pew “use[d] a flawed methodology and skewed queryset that is not representative of Search traffic”, adding “[we] have not observed significant drops in aggregate web traffic”. What Google sees as flaws in Pew’s methodology is not disclosed, nor does the company provide any numbers to support its side of the story. Sundar Pichai, Google’s CEO, has even claimed A.I. Overviews are better for referral traffic than links outside Overviews — but, again, has never provided evidence.

Intuitively, it makes sense to me that A.I. Overviews are going to have a negative impact on click-through rates, because that is kind of the whole point. The amount of information being provided to users on the results page increases while the source of that information is minimized. It also seems like the popular data sources for A.I. Overviews are of mixed quality; according to a Semrush study, Quora is the most popular citation, while Reddit is the second-most popular.

I find all of these studies frustrating and it is not necessarily the fault of the firms conducting them. Try as hard as the search optimization industry has, we still do not have terrifically reliable ways of measuring the impact each new Google feature has on organic search traffic. The party in the best possible position to demystify this — Google — tends to be extremely secretive on the grounds it does not want people gaming its systems. Also, given the vast disconnect between the limited amount Google is saying and the findings of researchers, I am not sure how much I trust its word.

It is possible we cannot know exactly how much of an effect A.I. Overviews will have on search traffic, let alone that of “answer engines” like Perplexity. The best thing any publisher can do at this point is to assume the mutual benefits are going away — and not just in search. Between Google’s legal problems and it fundamentally reshaping how people discover things in search, one has to wonder how it will evolve its advertising business. Publishers have already been prioritizing direct relationships with readers. What about advertisers, too? Even with the unknown future of A.I. technologies, it seems like it would be advantageous to stop relying so heavily on Google.

Google A.I. Summaries and Search Traffic

By: Nick Heer
24 July 2025 at 23:48

Athena Chapekis and Anna Lieb, Pew Research Center:

Google users who encounter an AI summary are less likely to click on links to other websites than users who do not see one. Users who encountered an AI summary clicked on a traditional search result link in 8% of all visits. Those who did not encounter an AI summary clicked on a search result nearly twice as often (15% of visits).

Google users who encountered an AI summary also rarely clicked on a link in the summary itself. This occurred in just 1% of all visits to pages with such a summary.

I looked through this article and the methodology to see how this survey came together, since it seems to me the real question is if A.I. summaries are more or less damaging to search traffic than older features like snippets.

As far as I can figure out, the way Pew did this survey is that it looked for mentions of A.I. among users who consented to having their web browsing data tracked, and then categorized that traffic depending on whether it was a news article about A.I. or an A.I. feature being used. Any Google data without an A.I. summary was, as far as I can see, categorized as not containing an A.I. summary. But this latter category amounted to 82% of all Google searches, and there does not appear to be any differentiation in what features were shown for those. Some may have snippets; others may have some other “zero-click” feature. Some may have no such features at all. Lumping all those together makes it impossible to tell what impact A.I. summaries are having on search compared to Google’s previous attempts to keep users in its bubble.

This survey does a good job of showing how irrelevant the source links are in Google A.I. summaries to search traffic. Much like the citations at the end of a book, they serve as an indicator of something being referenced, but there is no expectation anyone will actually read it to confirm whether the information is accurate. There was such a citation to a Microsoft article ostensibly containing an Excel feature Google made up. Unlike citations in a book, Google’s A.I. summaries are entirely the product of a machine built by people who have only some idea of the output.

⌥ Permalink

Google is Burying the Web Alive

By: Nick Heer
28 May 2025 at 04:28

John Herrman, New York magazine:

But I also don’t want to assume Google knows exactly how this stuff will play out for Google, much less what it will actually mean for millions of websites, and their visitors, if Google stops sending as many people beyond its results pages. Google’s push into productizing generative AI is substantially fear-driven, faith-based, and informed by the actions of competitors that are far less invested in and dependent on the vast collection of behaviors — websites full of content authentic and inauthentic, volunteer and commercial, social and antisocial, archival and up-to-date — that make up what’s left of the web and have far less to lose. […]

Very nearly since it launched, Google has attempted to answer users’ questions as immediately as possible. It had the “I’m Feeling Lucky” button since it was still a stanford.edu subdomain, and it has since steadily changed the results page to more directly respond to queries. But this seems entirely different — a way to benefit from Google’s decades-long ingestion of the web and giving almost nothing back. Or, perhaps, giving back something ultimately worse: invented answers users cannot trust, and will struggle to check because sources are intermingled and buried.

⌥ Permalink

Google’s iOS App Inserts Its Own Links Into Webpages

By: Nick Heer
1 December 2024 at 17:17

Barry Schwartz, Search Engine Roundtable:

Google launched a new feature in the Google App for iOS named Page Annotation. When you are browsing a web page in the Google App native browser, Google can “extract interesting entities from the webpage and highlight them in line.” When you click on them, Google takes you to more search results.

This was announced nearly two weeks ago in a subtle forum post. If there was a press release, I cannot find it. It was only picked up by the press thanks to Schwartz’s November 21 article, but those stories were not published until just before the U.S. Thanksgiving long weekend, so this news was basically buried.

Google is now injecting “Page Annotations”, which are kind of like Skimlinks but with search results. The results from a tapped Page Annotation are loaded in a floating temporary sheet, so it is not like users are fully whisked away — but that is almost worse. In the illustration from Google, a person is apparently viewing a list of Japanese castles, into which Google has inserted a link on “Osaka Castle”. Tapping on an injected link will show Google’s standard search results, which are front-loaded with details about how to contact the castle, buy tickets, and see a map. All of those things would be done better in a view that cannot be accidentally swiped away.

Maybe, you are thinking, it would be helpful to easily trigger a search from some selected text, and that is fair. But the Google app already displays a toolbar with a search button when you highlight any text in this app.

Owners of web properties are only able to opt out by completing a Google Form, but you must be signed into the same Google account you use for Search Console. Also, if a property is accessible at multiple URLs — for example, http and https, or www and non-prefixed — you must include each variation separately.
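For a site reachable under every combination mentioned above, that means four entries. A minimal sketch that enumerates them for a bare domain follows; the function name is hypothetical, `example.com` is a placeholder, and the exact format Google's form expects is not specified here, so the trailing slash is an assumption.

```python
from itertools import product


def url_variants(domain):
    """List the scheme and www/non-www combinations for a domain."""
    bare = domain.removeprefix("www.")  # requires Python 3.9 or newer
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme, prefix in product(("http", "https"), ("", "www."))
    ]


for url in url_variants("example.com"):
    print(url)
```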

For Google to believe it has the right to inject itself into third-party websites is pure arrogance, yet it is nothing new for the company. It has long approached the web as its own platform over which it has control and ownership. It overlays dialogs without permission; it invented a proprietary fork of HTML and it pushed its adoption for years. It can only do these things because it has control over how people use the web.

⌥ Permalink

Competition Bureau Sues Google for Anti-Competitive Conduct

By: Nick Heer
28 November 2024 at 23:31

Competition Bureau Canada:

The Competition Bureau is taking legal action against Google for anti-competitive conduct in online advertising technology services in Canada. Following a thorough investigation, the Bureau has filed an application with the Competition Tribunal that seeks to remedy the conduct for the benefit of Canadians.

This has become a familiar announcement: a consumer protection agency, somewhere in the world, is questioning whether a giant technology conglomerate has abused its power. A dam has burst.

⌥ Permalink

Mozilla Is Worried About the Proposed Fixes for Google’s Search Monopoly

By: Nick Heer
27 November 2024 at 00:46

Michael Kan, PC Magazine:

Mozilla points to a key but less eye-catching proposal from the DOJ to regulate Google’s search business, which a judge ruled as a monopoly in August. In their recommendations, federal prosecutors urged the court to ban Google from offering “something of value” to third-party companies to make Google the default search engine over their software or devices. 

“The proposed remedies are designed to end Google’s unlawful practices and open up the market for rivals and new entrants to emerge,” the DOJ told the court. The problem is that Mozilla earns most of its revenue from royalty deals — nearly 86% in 2022 — making Google the default Firefox browser search engine.

This is probably another reason why U.S. prosecutors want to jettison Chrome from Google: they want to reduce any benefit it may accrue from trying to fix its illegal search monopoly. But it seems Google’s position in the industry is so entrenched that correcting it will hurt lots of other businesses, too. That does not mean it should not be broken up or that the DOJ’s proposed remedies are wrong, however.

⌥ Permalink

Mozilla Might Suffer the Gravest Consequences of the Google Antitrust Ruling

By: Nick Heer
7 August 2024 at 23:38

Alfonso Maruccia, TechSpot:

Its most recent financials show Mozilla gets $510 million out of its $593 million in total revenue from its Google partnership. This precarious financial position is a side effect of its deal with Alphabet, which made Google the search engine default for newer Firefox installations.

Jason Del Rey, Fortune:

Mozilla is putting on a brave face for now, and not directly addressing the existential threat that the ruling appears to pose.

“Mozilla has always championed competition and choice online, particularly in search,” a spokesperson said in a statement to Fortune on Monday. “We’re closely reviewing the court’s decision, considering its potential impact on Mozilla and how we can positively influence the next steps… Firefox continues to offer a range of search options, and we remain committed to serving our users’ preferences while fostering a competitive market.”

It is possible Mozilla will not be impacted by remedies to Google’s illegal monopoly, the details of which will begin to take shape next month. It seems possible Mozilla could be losing virtually all its revenue, thereby destabilizing the organization behind one of the few non-Chromium browsers and the best documentation of web technologies available anywhere.

Trying to untangle an illegal monopolist is necessarily difficult. This will be a long and painful process for everyone. The short-term resolutions might be ineffectual and irritating, and they may not change Google’s market position. But it is important to get on the record that Google has engaged in illegal conduct to protect its dominance, and so it will be subjected to new oversight and scrutiny. This exercise is worth it because there ought to be limits to market power and anticompetitive behaviour.

⌥ Permalink

⌥ The Reddit and Google Pairing Is One of a Kind

By: Nick Heer
7 August 2024 at 03:51

Since owners of web properties became aware of the traffic-sending power of search engines — most often Google in most places — they have been in an increasingly uncomfortable relationship as search moves beyond ten relevant links on a page. Google does not need websites, per se; it needs the information they provide. Its business recommendations are powered in part by reviews on other websites. Answers to questions appear in snippets, sourced to other websites, without the user needing to click away.

Publishers and other website owners might consider this a bad deal. They feed Google all this information hoping someone will visit their website, but Google is adding features that make it less likely they will do so. Unless they were willing to risk losing all their Google search traffic, there was little a publisher could do. Individually, they needed Google more than Google needed them.

But that has not been quite as true for Reddit. Its discussions hold a uniquely large corpus of suggestions and information on specific topics and in hyper-local contexts, as well as a whole lot of trash. While the quality of Google’s results has been sliding, searchers discovered they could append “Reddit” to a query to find what they were looking for.

Google realized this and, earlier this year, signed a $60 million deal with Reddit allowing it to scrape the site to train its A.I. features. Part of that deal apparently involved indexing pages in search as, last month, Reddit restricted that capability to Google. That is: if you want to search Reddit, you can either use the site’s internal search engine, or you can use Google. Other search engines still display results created before mid-July, according to 404 Media, but only Google is permitted to crawl anything newer.

It is unclear to me whether this is a deal only available to Google, or if it is open to any search engine that wants to pay. Even if it was intended to be exclusive, I have a feeling it might not be for much longer. But it seems like something Reddit would only care about doing with Google because other search engines basically do not matter in the United States or worldwide.1 What amount of money do you think Microsoft would need to pay for Bing to be the sole permitted crawler of Reddit in exchange for traffic from its measly market share? I bet it is a lot more than $60 million.

Maybe that is one reason this agreement feels uncomfortable to me. Search engines are marketed as finding results across the entire web but, of course, that is not true: they most often obey rules declared in robots.txt files, but they also do not necessarily index everything they are able to, either. These are not explicit limitations. Yet it feels like it violates the premise of a search engine to say that it must be given permission to crawl and link to other webpages. The whole thing about the web is that the links are free. There is no guarantee the actual page will be freely accessible, but the link itself is not restricted. It is the central problem with link tax laws, and this pay-to-index scheme is similarly restrictive.
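
For illustration, here is roughly how a crawler might check the per-crawler rules a robots.txt file can declare. This is a minimal sketch using Python's standard urllib.robotparser module. The file contents, the crawler names ExampleBot and OtherBot, and the example.com URL are all hypothetical; none of them are taken from Reddit or any other real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, all others are not.
# This is an illustration only, not the contents of any real site's file.
ROBOTS_TXT = """\
User-agent: ExampleBot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Report whether the hypothetical rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/r/some-thread"
    for agent in ("ExampleBot", "OtherBot"):
        print(agent, allowed(agent, page))
```

These rules are only a request. Nothing in the file itself stops a crawler that chooses to ignore it, which is why the text above says search engines "most often" obey them.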

This is, of course, not the first time there has been tension in how a site balances search engine visibility and its own goals. Publishers have, for years, weighed their desire to be found by readers against login requirements and paywalls — guided by the overwhelming influence of Google.

Google used to require publishers provide free articles to be indexed by the search engine but, in 2017, it replaced that with a model that is more flexible for publishers. Instead of forcing a certain number of free page views, publishers are now able to provide Google with indexable data.

Then there are partnerships struck by search engines and third parties to obtain specific kinds of data. These were summarized well in the recent United States v. Google decision (PDF), and they are probably closest in spirit to this Reddit deal:

GSEs enter into data-sharing agreements with partners (usually specialized vertical providers) to obtain structured data for use in verticals. Tr. at 9148:2-5 (Holden) (“[W]e started to gather what we would call structured data, where you need to enter into relationships with partners to gather this data that’s not generally available on the web. It can’t be crawled.”). These agreements can take various forms. The GSE might offer traffic to the provider in exchange for information (i.e., data-for-traffic agreements), pay the provider revenue share, or simply compensate the provider for the information. Id. at 6181:7-18 (Barrett-Bowen).

As of 2020, Microsoft has partnered with more than 100 providers to obtain structured data, and those partners include information sources like Fandango, Glassdoor, IMDb, Pinterest, Spotify, and more. DX1305 at .004, 018–.028; accord Tr. at 6212:23–6215:10 (Barrett-Bowen) (agreeing that Microsoft partners with over 70 providers of travel and local information, including the biggest players in the space).

The government attorneys said Bing is required to pay for structured data owing to its smaller size, while Google is able to obtain structured data for free because it sends partners so much traffic. The judge ultimately rejected their argument that Microsoft struggled to sign these agreements or was impeded in doing so, but did not dispute the difference in negotiating power between the two companies.

Once more, for emphasis: Google usually gets structured data for free but, in this case, it agreed to pay $60 million; imagine how much it would cost Bing.

This agreement does feel pretty unique, though. It is hard for me to imagine many other websites with the kind of specific knowledge found aplenty on Reddit. It is a centralized version of the bulletin boards of the early 2000s for such a wide variety of interests and topics. It is such a vast user base that, while it cannot ignore Google referrals, it is not necessarily reliant on them in the same way as many other websites are.

Most other popular websites are insular social networks; Instagram and TikTok are not relying on Google referrals. Wikipedia would probably be the best comparison to Reddit in terms of the contribution it makes to the web — even greater, I think — but every article page I tried except the homepage is overwhelmingly dependent on external search engine traffic.

Meanwhile, pretty much everyone else still has to pay Google for visitors. They have to buy the ads sitting atop organic search results. They have to buy ads on maps, on shopping carousels, on videos. People who operate websites hope they will get free clicks, but many of them know they will have to pay for some of them, even though Google will happily lift and summarize their work without compensation.

I cannot think of any other web property which has this kind of leverage over Google. While this feels like a violation of the ideals and principles that have built the open web on which Google has built its empire, I wonder if Google will make many similar agreements, if any. I doubt it — at least for now. This feels funny; maybe that is why it is so unique, and why it is not worth being too troubled by it.


  1. The uptick of Bing in the worldwide chart appears to be, in part, thanks to a growing share in China. Its market share has also grown a little in Africa and South America, but only by tiny amounts. However, Reddit is blocked in China, so a deal does not seem particularly attractive to either party.

‘Google Is a Monopolist’ in Search Says U.S. Judge

By: Nick Heer
5 August 2024 at 21:36

Ashley Belanger, Ars Technica:

Google just lost a massive antitrust trial over its sprawling search business, as US district judge Amit Mehta released his ruling, showing that he sided with the US Department of Justice in the case that could disrupt how billions of people search the web.

“Google is a monopolist, and it has acted as one to maintain its monopoly,” Mehta wrote in his opinion. “It has violated Section 2 of the Sherman Act.”

Google will surely contest this finding when its implications are known; Mehta has not announced what actions the government will take against Google.

The opinion is full of details about the precise nature of how Google search and its ads work together, Google’s relationship with Apple and other third parties, and how its business has changed over time. For example, the judge notes Google adjusted ad pricing to maintain a specific growth target, and increased it incrementally to mask it in the typical fluctuations of ad costs. He also cites a finding that “thirteen months of user data acquired by Google is equivalent to over 17 years of data on Bing” in informing the quality of search results. Meanwhile, Google pays Apple a redacted amount through its revenue sharing agreement for default placement in Safari, and it pays for searches performed through Chrome on Apple devices as well. There is a lot more in here, and I fully intend on re-reading the opinion with a bunch of questions I have in mind.
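
To put the quoted data comparison in perspective, the arithmetic can be sketched as below. It assumes 17 years is a lower bound, since the opinion says "over 17 years", and the function and constant names are my own.

```python
# Figures quoted from the opinion: thirteen months of Google user data is said
# to equal "over 17 years" of Bing data, so 17 years is treated as a lower bound.
GOOGLE_MONTHS = 13
BING_YEARS = 17


def data_ratio(google_months: int = GOOGLE_MONTHS, bing_years: int = BING_YEARS) -> float:
    """Return how many months Bing needs to match one month of Google data."""
    return (bing_years * 12) / google_months


if __name__ == "__main__":
    # 17 years is 204 months; 204 / 13 is roughly 15.7
    print(f"{data_ratio():.1f}x")
```

By that rough measure, Bing would need at least fifteen to sixteen times as long as Google to collect a comparable amount of user data.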

Google really does have great search results a lot of the time, even though it has stumbled in recent years. DuckDuckGo is my default but I find myself often turning to Google for local results, very old results, and news. (DuckDuckGo is powered by Bing, which prioritizes MSN-syndicated versions of articles that I do not want.) Google has not fallen into the same trap as Bing by wholly cluttering the results page. Microsoft still has no taste.

But two things can be true: Google can be the best search engine for most people, most of the time, because it is very good; and, also, Google can have abused its market-leading position to avoid competition and maintain its advertising revenue. Those are not inconsistent with each other. In fact, per the judge’s citation of how long it would take for Bing to amass the same information about user activity as Google does in about a year, it is fully possible its quality and its dominance are related, something the judge nods toward. Indeed, Google’s position is now so entrenched “it would not lose search revenue if were to significantly reduce the quality of its search product”.

Notably, Mehta did not sanction Google for failing to preserve evidence in the case, writing:

On the request for sanctions, the court declines to impose them. Not because Google’s failure to preserve chat messages might not warrant them. But because the sanctions Plaintiffs request do not move the needle on the court’s assessment of Google’s liability. […]

In cases where the judge found evidence of monopolistic and abusive behaviour, the lack of supporting text messages and other communications would not have made a difference; this is also true, the judge says, for his finding of a lack of anticompetitive behaviour in SA360.

⌥ Permalink

Cool URLs Mean Something

By: Nick Heer
1 August 2024 at 03:55

Tim Berners-Lee in 1998:

Keeping URIs so that they will still be around in 2, 20 or 200 or even 2000 years is clearly not as simple as it sounds. However, all over the Web, webmasters are making decisions which will make it really difficult for themselves in the future. Often, this is because they are using tools whose task is seen as to present the best site in the moment, and no one has evaluated what will happen to the links when things change. The message here is, however, that many, many things can change and your URIs can and should stay the same. They only can if you think about how you design them.

Jay Hoffmann:

Links give greater meaning to our webpages. Without the link, we would lose this significant grammatical tool native the web. And as links die out and rot on the vine, what’s at stake is our ability to communicate in the proper language of hypertext.

A dead link may not seem like it means very much, even in the aggregate. But they are. One-way links, the way they exist on the web where anyone can link to anything, is what makes the web universal. In fact, the first name for URL’s was URI’s, or Universal Resource Identifier. It’s right there in the name. And as Berners-Lee once pointed out, “its universality is essential.”

In 2018, Google announced it was deprecating its URL shortener, with no new links being created after March 2019. All existing shortened links would, however, remain active. It announced this in a developer blog post which — no joke — returns a 404 error at its original URL, which I found via 9to5Google. Google could not bother to redirect posts from just six years ago to their new valid URLs.

Google’s URL shortener was in the news again this month because the company has confirmed it will turn off these links in August 2025 except for those created via Google’s own apps. Google Maps, for example, still creates a goo.gl short link when sharing a location.

In principle, I support this deprecation because it is confusing and dangerous for Google’s own shortened URLs to have the same domain as ones created by third-party users. But this is a Google-created problem because it designed its URLs poorly. It should have never been possible for anyone else to create links with the same URL shortener used by Google itself. Yet, while it feels appropriate for a Google service to be unreliable over a long term, it also should not be ending access to links which may have been created just about five years ago.

By the way, the Sophos link on the word “dangerous” in that last paragraph? I found it via a ZDNet article where the inline link is — you guessed it — broken. Sophos also could not bother to redirect this URL from 2018 to its current address. Six years ago! Link rot is a scourge.

⌥ Permalink

Third-Party Cookies Have Got to Go

By: Nick Heer
30 July 2024 at 02:26

Anthony Chavez, of Google:

[…] Instead of deprecating third-party cookies, we would introduce a new experience in Chrome that lets people make an informed choice that applies across their web browsing, and they’d be able to adjust that choice at any time. We’re discussing this new path with regulators, and will engage with the industry as we roll this out.

Oh good — more choices.

Hadley Beeman, of the W3C’s Technical Architecture Group:

Third-party cookies are not good for the web. They enable tracking, which involves following your activity across multiple websites. They can be helpful for use cases like login and single sign-on, or putting shopping choices into a cart — but they can also be used to invisibly track your browsing activity across sites for surveillance or ad-targeting purposes. This hidden personal data collection hurts everyone’s privacy.

All of this data collection only makes sense to advertisers in the aggregate, but it only works because of specifics: specific users, specific webpages, and specific actions. Privacy Sandbox is imperfect but Google could have moved privacy forward by ending third-party cookies in the world’s most popular browser.

⌥ Permalink

⌥ Anti Trust in Tech

By: Nick Heer
7 June 2024 at 22:02

If you had looked at only the headlines from major research organizations, you would have seen a lack of confidence from the public in big business, technology companies included. For years, poll after poll from around the world has found high levels of distrust in their influence, handling of private data, and new developments.

If these corporations were at all worried about this, they are not much showing it in their products — particularly the A.I. stuff they have been shipping. There has been little attempt at abating last year’s trust crisis. Google decided to launch overconfident summaries for a variety of search queries. Far from helping to sift through all that has ever been published on the web to mash together a representative summary, it was instead an embarrassing mess that made the company look ill prepared for the concept of satire. Microsoft announced a product which will record and interpret everything you do and see on your computer, but as a good thing.

Can any of them see how this looks? If not — if they really are that unaware — why should we turn to them to fill gaps and needs in society? I certainly would not wish to indulge businesses which see themselves as entirely separate from the world.

It is hard to imagine they do not, though. Sundar Pichai, in an interview with Nilay Patel, recognized there were circumstances in which an A.I. summary would be inappropriate, and cautioned that the company still considers it a work in progress. Yet Google still turned it on by default in the U.S. with plans to expand worldwide this year.

Microsoft has responded to criticism by promising Recall will now be a feature users must opt into, rather than something they must turn off after updating Windows. The company also says there are more security protections for Recall data than originally promised but, based on its track record, maybe do not get too excited yet.

These product introductions all look like hubris. Arrogance, really — recognition of the significant power these corporations wield and the lack of competition they face. Google can poison its search engine because where else are most people going to go? How many people would turn off Recall, something which requires foreknowledge of its existence, under Microsoft’s original rollout strategy?

It is more or less an admission they are all comfortable gambling with their customers’ trust to further the perception they are at the forefront of the new hotness.

None of this is a judgement on the usefulness of these features or their social impact. I remain perplexed by the combination of a crisis of trust in new technologies, and the unwillingness of the companies responsible to engage with the public. There seems to be little attempt at persuasion. Instead, we are told to get on board because this rocket ship is taking off with or without us. Concerned? Too bad: the rocket ship is shaped like a giant middle finger.

What I hope we see Monday from Apple — a company which has portrayed itself as more careful and practical than many of its contemporaries — is a recognition of how this feels from outside the industry. Expect “A.I.” to be repeated in the presentation until you are sick of those two letters; investors are going to eat it up. When normal people update their phones in September, though, they should not feel like they are being bullied into accepting our A.I. future.

People need to be given time to adjust and learn. If the polls are representative, very few people trust giant corporations to get this right — understandably — yet these tech companies seem to believe we are as enthusiastic about every change they make as they are. Sorry, we are not, no matter how big a smile a company representative is wearing when they talk about it. Investors may not be patient but many of the rest of us need time.

Google Comments on Its Sloppy Summaries

By: Nick Heer
3 June 2024 at 04:20

Liz Reid, head of Google Search, on the predictably bizarre results of rolling out its “A.I. Overviews” feature:

One area we identified was our ability to interpret nonsensical queries and satirical content. Let’s take a look at an example: “How many rocks should I eat?” Prior to these screenshots going viral, practically no one asked Google that question. You can see that yourself on Google Trends.

There isn’t much web content that seriously contemplates that question, either. This is what is often called a “data void” or “information gap,” where there’s a limited amount of high quality content about a topic. However, in this case, there is satirical content on this topic … that also happened to be republished on a geological software provider’s website. So when someone put that question into Search, an AI Overview appeared that faithfully linked to one of the only websites that tackled the question.

This reasoning sounds almost circular in the context of what A.I. answers are supposed to do. Google loves demonstrating how users can enter a query like “suggest a 7 day meal plan for a college student living in a dorm focusing on budget friendly and microwavable meals” and see a grouped set of responses synthesized from a variety of sources. That is surely a relatively uncommon query. I was going to prove that in the same way as Reid did, but when I enter it in Google Trends, I get a 400 error. Even a shortened version is searched so rarely it has no data.

The organic, non-A.I. search results for the long query are plentiful but do not exactly fulfill its specific criteria. Most of the links I saw are not microwave-only, or are simple lists not grouped into particular meal types. Nothing I could find specifically answers the question posed. In order to fulfill the query in the demo video, Google’s search engine has to look through everything it knows and find meals which cook in a microwave, and organize them into a daily plan of different meal types.

But Google is also blaming the novelty of the rocks query and the satirical information directly answering it for the failure of its A.I. features. In other words, it wants to say the cool thing about its A.I. stuff is that it can handle unpopular or new queries by sifting through the web and merging together a bunch of stuff it finds. The bad thing about its A.I. stuff, it turns out, is basically the same.

Benj Edwards, Ars Technica:

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Reid says Google has made a bunch of changes to address the issues raised, but none of them fix a fundamental shift in A.I. results. Google used to be a directory — admittedly one ranked by mysterious criteria — allowing users to decide which results best fit their needs. It has slowly repositioned itself to being able to answer their queries with authority. Its A.I. answers are a more fulsome realization of features like Featured Snippets and the Answer Box. That is: instead of seeing options which may match their query, Google is now giving searchers singular answers. It has transformed from a referrer into an omniscient responder.

⌥ Permalink

Google Leaked Itself

By: Nick Heer
29 May 2024 at 14:52

Rand Fishkin, writing on the SparkToro blog:

On Sunday, May 5th, I received an email from a person claiming to have access to a massive leak of API documentation from inside Google’s Search division. The email further claimed that these leaked documents were confirmed as authentic by ex-Google employees, and that those ex-employees and others had shared additional, private information about Google’s search operations.

It seems this vast amount of information was published erroneously by Google to a GitHub repository in March, and then removed earlier this month. As Fishkin writes, it is evidence Google has been dishonest in its public statements about how Google Search works.

Fishkin specifically calls attention to media outlets that cover search engines and value the word of Google’s spokespeople. This has been a clever play by Google for years: because its specific ranking criteria have not been publicly known, it can confirm or deny rumours without having to square them with what the evidence shows.

Google’s ranking system seems to be biased in favour of larger businesses and more established websites, according to Fishkin’s analysis. This is not surprising. I am wondering how this fits with the declining quality of Google search results as small, highly-optimized pages full of machine-generated junk seem to rise to the top.

Mike King, iPullRank:

You’d be tempted to broadly call these “ranking factors,” but that would be imprecise. Many, even most, of them are ranking factors, but many are not. What I’ll do here is contextualize some of the most interesting ranking systems and features (at least, those I was able to find in the first few hours of reviewing this massive leak) based on my extensive research and things that Google has told/lied to us about over the years.

“Lied” is harsh, but it’s the only accurate word to use here. While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries. My advice to future Googlers speaking on these topics: Sometimes it’s better to simply say “we can’t talk about that.” Your credibility matters, and when leaks like this and testimony like the DOJ trial come out, it becomes impossible to trust your future statements.

One of the things potentially tracked by Google for search purposes is Chrome browsing data, something Google has denied. The variable in question — chromeInTotal — and the minimal description offered — “site-level Chrome views” — seem open to interpretation. Perhaps this is only recorded in some circumstances, or it depends on user preferences, or is not actually part of search rankings, or is entirely unused. But it certainly suggests aggregate website visits in Chrome, the world’s most popular web browser, are used to inform rankings without users’ knowledge.

Update: Google says the leaked documents are real, but warns “against making inaccurate assumptions”. In fairness, I would like to make more accurate assumptions.

⌥ Permalink

Google’s A.I. Answers Said to Put Glue in Pizza, So Katie Notopoulos Made Some Pizza

By: Nick Heer
25 May 2024 at 05:53

Jason Koebler, 404 Media:

The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.

Katie Notopoulos, putting the “business” in Business Insider:

I knew my assignment: I had to make the Google glue pizza. (Don’t try this at home! I risked myself for the sake of the story, but you shouldn’t!)

My timeline on three entirely separate social networks — Bluesky, Mastodon, and Threads — has been chock full of examples of Google’s A.I. answers absolutely eating dirt — or, in one case, rocks — in the face of obvious satire and shitposting. Well, obvious to us. Computers, it seems, have not figured out glue and gasoline are bad for food.

The A.I. answers from Google are not all yucks and chuckles, unfortunately.

Nic Lake:

Yesterday (Part 1) I saw that mushrooms post, and knew something like that was going to get people hurt. I didn’t really think that (CONTENT WARNING) asking how best to deal with depression was going to be next on the “shit I didn’t want to see” Bingo card.

The organizations know. They know that these tools are not ready. They call it a “beta” and feed it to you anyway.

Google is manually removing A.I. results where appropriate, and it is claiming some of the screenshots which have been circulating have been faked in some way without specifying which.

To quote week-ago me:

Given the sliding quality of Google’s results, it seems quite bold for the company to be confident users worldwide will trust its generated answers.

Quite bold, indeed.

I do not expect perfection, but it is downright embarrassing that Google rolled out a product so unreliable and occasionally dangerous it continues to tarnish a reputation already suffering. Google’s Featured Snippets were bad enough. Now it is in the process of rolling out a whole new level of overconfident nonsense to the entire world, fixing it as everyone tests its limits.

⌥ Permalink

Google Is Expanding A.I. Feature Availability in Search

By: Nick Heer
15 May 2024 at 04:39

Liz Reid, head of Google Search:

People have already used AI Overviews billions of times through our experiment in Search Labs. They like that they can get both a quick overview of a topic and links to learn more. We’ve found that with AI Overviews, people use Search more, and are more satisfied with their results.

So today, AI Overviews will begin rolling out to everyone in the U.S., with more countries coming soon. That means that this week, hundreds of millions of users will have access to AI Overviews, and we expect to bring them to over a billion people by the end of the year.

Given the sliding quality of Google’s results, it seems quite bold for the company to be confident users worldwide will trust its generated answers. I am curious to try it when it is eventually released in Canada.

I know what you must be thinking: if Google is going to generate results without users clicking around much, how will it sell ad space? It is a fair question, reader.

Gerrit De Vynck and Cat Zakrzewski, Washington Post:

Google has largely avoided AI answers for the moneymaking searches that host ads, said Andy Taylor, vice president of research at internet marketing firm Tinuiti.

When it does show an AI answer on “commercial” searches, it shows up below the row of advertisements. That could force websites to buy ads just to maintain their position at the top of search results.

This is just one source speaking to the Post. I could not find any corroborating evidence or a study to support this, even on Tinuiti’s website. But I did notice — halfway through Google’s promo video — a query for “kid friendly places to eat in dallas” was answered with an ad for Hopdoddy Burger Bar before any clever A.I. stuff was shown.

Obviously, the biggest worry for many websites dependent on Google traffic is what will happen to referrals if Google will simply summarize the results of pages instead of linking to them. I have mixed feelings about this. There are many websites which game search results and overwhelm queries with their own summaries. I would like to say “good riddance”, but I also know these pages did not come out of nowhere. They are a product of trying to improve website rankings on Google for all searches, and to increase ad and affiliate revenue from people who have clicked through. Neither one is a laudable goal in its own right. Yet anyone who has paid attention to the media industry for more than a minute can kind of understand these desperate attempts to grab attention and money.

Google built entire industries, from recipe bloggers to search optimization experts. What happens when it blows it all up?

Good thing home pages are back.

⌥ Permalink
