
Who funds quantum research?

By: VM
11 March 2025 at 05:32

An odd little detail in a Physics World piece on Microsoft’s claim to have made a working topological qubit:

Regardless of the debate about the results and how they have been announced, researchers are supportive of the efforts at Microsoft to produce a topological quantum computer. “As a scientist who likes to see things tried, I’m grateful that at least one player stuck with the topological approach even when it ended up being a long, painful slog,” says [Scott] Aaronson.

“Most governments won’t fund such work, because it’s way too risky and expensive,” adds [Winfried] Hensinger. “So it’s very nice to see that Microsoft is stepping in there.”

In drug development, defence technologies, and life sciences research, to name a few, we’ve seen the opposite: governments fund the risky, expensive part for many years, often decades, until something viable emerges. Then the IP moves to public and private sector enterprises for commercialisation, sometimes together with government subsidies to increase public access. With pharmaceuticals in particular, the government often doesn’t recoup investments it has made in the discovery phase, which includes medical education and research. An illustrative recent example is the development of mRNA vaccines; from my piece in The Hindu criticising the medicine Nobel Prize for this work:

Dr. Karikó and Dr. Weissman began working together on the mRNA platform at the University of Pennsylvania in the late 1990s. The University licensed its patents to mRNA RiboTherapeutics, which sublicensed them to CellScript, which sublicensed them to Moderna and BioNTech for $75 million each. Dr. Karikó joined BioNTech as senior vice-president in 2013, and the company enlisted Pfizer to develop its mRNA vaccine for COVID-19 in 2020.

Much of the knowledge that underpins most new drugs and vaccines is unearthed at the expense of governments and public funds. This part of drug development is the riskier and more protracted one: scientists identify potential biomolecular targets within the body on which a drug could act in order to manage a particular disease, followed by suitable chemical candidates. The cost and time estimates of this phase are $1 billion to $2.5 billion and several decades, respectively.

Companies subsequently commoditise and commercialise these entities, raking in millions in profits, typically at the expense of the same people whose taxes funded the fundamental research. There is something to be said for this model of drug and vaccine development, particularly for the innovation it fosters and the eventual competition that lowers prices, but we cannot deny the ‘double-spend’ it imposes on consumers — including governments — and the profit-seeking attitude it engenders among the companies developing and manufacturing the product.

Quantum computing may well define the next technological revolution together with more mature AI models. Topological quantum computing in particular — if realised well enough to compete with alternative architectures based on superconducting wires and/or trapped ions — could prove especially valuable for its ability to be more powerful with fewer resources. Governments justify their continued sizeable expenditure on drug development by the benefits that eventually accrue to their people. Quantum technologies will in all likelihood have similar consequences, following from a comparable trajectory of development in which certain lines of inquiry are not precluded simply because they could be loss-making or amount to false starts. And they will impinge on everything from one’s fundamental rights to national security.

But Hensinger’s opinion indicates the responsibility of developing this technology has been left to the private sector. I wonder if there are confounding factors here. For example, is Microsoft’s pursuit of a topological qubit the exception to the rule — i.e. one of a few enterprises that are funded by a private organisation in a sea of publicly funded research? Another possibility is that we’re hearing about Microsoft’s success because it has a loud voice, with the added possibility that its announcement was premature (context here). It’s also possible Microsoft’s effort included grants from NSF, DARPA or the like.

All this said, let’s assume for a moment that what Hensinger said was true of quantum computing research in general: the lack of state-led development in such potentially transformative technologies raises two (closely related) concerns. The first concerns scientific progress: specifically, that it will happen behind closed doors. In a June 2023 note, senior editors of the Physical Review B journal acknowledged the tension between the importance of researchers sharing their data for scrutiny, replication, and for others to build on their work — all crucial for science — and private sector enterprises’ need to protect IP and thus withhold data. “This will not be the last time the American Physical Society confronts a tension between transparency and the transmission of new results,” they added. Unlike in drug development, life sciences, etc., even the moral argument that publicly funded research must be in the public domain is rendered impotent, although it can still be recast as the weaker “research that affects the public sphere…”.

The second is democracy. In a March 2024 commentary, digital governance experts Nathan Sanders, Bruce Schneier, and Norman Eisen wrote that the state could develop a “public AI” to counter the already apparent effects of “private AI” on democratic institutions. According to them, a “public AI” model could “provide a mechanism for public input and oversight on the critical ethical questions facing AI development,” including “how to incorporate copyrighted works in model training” and “how to license access for sensitive applications ranging from policing to medical use”. They added: “Federally funded foundation AI models would be provided as a public service, similar to a health care private option. They would not eliminate opportunities for private foundation models, but they would offer a baseline of price, quality, and ethical development practices that corporate players would have to match or exceed to compete.”

Of course, quantum computing isn’t beset by the same black-box problem that surrounds AI models, yet what it implies for our ability to secure digital data means it could still benefit from state-led development. Specifically: (i) a government-funded technology standard could specify the baseline for the private sector to “match or exceed to compete” so that computers deployed to secure public data maintain a minimum level of security; (ii) private innovation can build on the standard, with the advantage of not having to lay new foundations of their own; and (iii) the data and the schematics pertaining to the standard should be in the public domain, thus restricting private-sector IP to specific innovations.[1]


[1] Contrary to a lamentable public perception, just knowing how a digital technology works doesn’t mean it can be hacked.
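This is essentially Kerckhoffs’s principle: a well-designed system stays secure even when everything about it except a secret key is public. A minimal sketch using Python’s standard hmac module, with HMAC-SHA256 standing in as the fully public algorithm (the keys and messages here are, of course, made up for illustration):

```python
import hashlib
import hmac
import secrets

# The algorithm (HMAC-SHA256) is completely public; only the key is secret.
key = secrets.token_bytes(32)
message = b"public data secured by a public algorithm"
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# An attacker who knows every detail of HMAC-SHA256 but lacks the key
# cannot produce a valid authentication tag.
forged = hmac.new(b"guessed key", message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, forged))
```

Knowing how the scheme works tells the attacker nothing useful; the comparison fails for any key other than the secret one.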

Majorana 1, science journalism, and other things

By: VM
28 February 2025 at 06:42

While I have many issues with how the Nobel Prizes are put together as an institution, the scientific achievements they have revealed have introduced me to some of the funnest concepts in science, including the clever ways in which scientists revealed them. If I had to rank them on this metric, the first place would be a tie between the chemistry and the physics prizes of 2016. The chemistry prize went to Jean-Pierre Sauvage, Fraser Stoddart, and Ben Feringa “for the design and synthesis of molecular machines”. Likewise, the physics prize was shared between David Thouless, Duncan Haldane, and John Kosterlitz “for theoretical discoveries of topological phase transitions and topological phases of matter”. If you like, you can read my piece about the 2016 chemistry prize here. A short excerpt about the laureates’ work:

… it is fruitless to carry on speculating about what these achievements could be good for. J. Fraser Stoddart, who shared the Nobel Prize last year with Feringa for having assembled curious molecular arrangements like Borromean rings, wrote in an essay in 2005, “It is amazing how something that was difficult to do in the beginning will surely become easy to do in the event of its having been done. The Borromean rings have captured our imagination simply because of their sheer beauty. What will they be good for? Something for sure, and we still have the excitement of finding out what that something might be.” Feringa said in a 2014 interview that he likes to build his “own world of molecules”. In fact, Stoddart, Feringa and Jean-Pierre Sauvage shared the chemistry prize for having developed new techniques to synthesise and assemble organic molecules in their pursuits.

In the annals of the science Nobel Prizes, there are many, many laureates who allowed their curiosity about something rather than its applications to guide their research. In the course of these pursuits, they developed techniques, insights, technologies or something else that benefited their field as a whole but which wasn’t the end goal. Over time the objects of many of these pursuits have also paved the way for some futuristic technology themselves. All of this is a testament to the peculiar roads the guiding light of curiosity opens. Of course, scientists need specific conditions of their work to be met before they can commit themselves to such lines of inquiry. For just two examples, they shouldn’t be under pressure to publish papers and they shouldn’t have to worry about losing their jobs if they don’t file patents. I can also see where the critics of such blue-sky research stand and why: while there are benefits, it’s hard to say ahead of time what they might be and when they might appear.

This said, the work that won the 2016 physics prize is of a similar nature and also particularly relevant in light of a ‘development’ in the realm of quantum computing earlier this month. Two of the three laureates, Thouless and Kosterlitz, performed an experiment in the 1970s in which they found something unusual. To quote from my piece in The Hindu on February 23:

If you cool some water vapour, it will become water and then ice. If you keep lowering the temperature until nearly absolute zero, the system will have minimal thermal energy, allowing quantum states of matter to show. In the 1970s, Michael Kosterlitz and David Thouless found that the surface of superfluid helium sometimes developed microscopic vortices that moved in pairs. When they raised the temperature, the vortices decoupled and moved freely. It was a new kind of … phase transition: the object’s topological attributes changed in response to changes in energy [rather than it turning from liquid to gas].

These findings, and the many that followed, together with physicists’ efforts to describe this new property of matter mathematically and harmonise it with other existing theories of nature, laid the foundation for Microsoft’s February 19 announcement: that it had developed a quantum-computing chip named Majorana 1 with topological qubits inside. (For more on this, please read my February 23 piece.) Microsoft has been trying to build this chip since at least 2000, when a physicist then on the company’s payroll named Alexei Kitaev published a paper exploring its possibility. Building the thing was a tall order, requiring advances in a variety of fields that eventually had to be brought together in just the right way, but Microsoft knew that if it succeeded the payoff would be tremendous.

This said, even if this wasn’t curiosity-driven research on Microsoft’s part, such research has already played a big role in both the company’s and the world’s fortunes. In the world’s fortune because, as with the work of Stoddart, Feringa, and Sauvage, the team explored, invented and/or refined new methods en route to building Majorana 1, methods which the rest of the world can potentially use to solve other problems. And in the company’s fortune because while Kitaev’s paper was motivated by the possibility of a device of considerable technological and commercial value, it drew from a large body of knowledge that — at the time it was unearthed and harmonised with the rest of science — wasn’t at all concerned with a quantum-computing chip in its then-distant future. For all its criticism, blue-sky research leads to some outcomes that no other forms of research can. This isn’t an argument in support of it so much as in defence of not sidelining it altogether.

While I have many issues with how the Nobel Prizes are put together as an institution, I’ve covered each edition with not inconsiderable excitement[1]. Given the fondness of the prize-giving committee for work on or with artificial intelligence last year, it’s possible there’s a physics prize vouchsafed for work on the foundations of contemporary quantum computers in the not-too-distant future. When it comes to pass, I will be all too happy to fall back on the many pieces I’ve written on this topic over the years, to be able to confidently piece together the achievements in context and, personally, to understand the work beyond my needs as a journalist, as a global citizen. But until that day, I can’t justify the time I do spend reading up about and writing on this and similar topics as a journalist in a non-niche news publication — one publishing reports, analyses, and commentary for a general audience rather than those with specialised interests.

The justification is necessary at all because the time I spend doing something is time spent not doing something else and the opportunity cost needs to be rational in the eyes of my employers. At the same time, journalism as a “history of now” would fail if it didn’t bring the ideas, priorities, and goals at play in the development of curiosity-driven research and — with the benefit of hindsight — its almost inevitable value for commerce and strategy to the people at large. This post so far, until this point, is the preamble I had in mind for my edition of The Hindu’s Notebook column today. Excerpt:

It isn’t until a revolutionary new technology appears that the value of investing in basic research becomes clear. Many scientists are rooting for more of it. India’s National Science Day, today, is itself rooted in celebrating the discovery of the Raman effect by curiosity-driven study. The Indian government also wants such research in this age of quantum computing, renewable energy, and artificial intelligence. But it isn’t until such technology appears that the value of investing in a science journalism of the underlying research — slow-moving, unglamorous, not application-oriented — also becomes clear. It might even be too late by then.

The scientific ideas that most journalists have overlooked are still very important: they’re the pillars on which the technologies reshaping the world stand. So it’s not fair that they’re overlooked when they’re happening and obscured by other concerns by the time they’ve matured. Without public understanding, input, and scrutiny in the developmental phase, the resulting technologies have fewer chances to be democratic, and the absence of the corresponding variety of journalism is partly to blame.

I would have liked to include the preamble with the piece itself but the word limit is an exacting 620. This is also why I left something else unsaid in the piece, something important for me, the author, to have acknowledged. After the penultimate line — “You might think just the fact that journalists are writing about an idea should fetch it from the fringes to the mainstream, but it does not” — I wanted to say there’s a confounding factor: the skills, choices, and circumstances of the journalists themselves. If a journalist isn’t a good writer[2] or doesn’t have the assistance of good editors, what they write about curiosity-driven research, which already runs on weak legs among the people at large, may simply pass through their feeds and newsletters without inviting even a “huh?”. But as I put down the aforementioned line, a more discomfiting thought erupted at the back of my mind.

In 2017, on the Last Word on Nothing blog, science journalist Cassandra Willyard made a passionate case that for the science journalism of obscure things to be effective, it must put people at its centre. The argument’s allure was obvious but it has never sat well with me. The narrative power of human emotion, drawn from the highs or lows in the lives of the people working on obscure scientific ideas, lies in being able to render those ideas more relatable. But my view is that there’s a lot out there we may never write about if we could only write about it alongside the highs/lows it rendered among its discoverers or beholders, and more so when such highs/lows don’t exist at all, as is often the case with a big chunk of curiosity-driven research. Willyard herself had used the then-recent example of the detection of gravitational waves from two neutron stars smashing into each other some 130 million lightyears away. This is conveniently (but perhaps not by her design) an example of Big Science, where many people spent a long time looking for something and finally found it. There’s certainly a lot of drama here.

But the reason I call having to countenance Willyard’s arguments discomfiting is that I understand what she’s getting at and I know I’m rebutting it on the back of only a modicum of logic. It’s a sentimental holdout, even: I don’t want to have to care about the lives of other people when I know I care very well for how we extracted a world’s worth of new information by ‘reading’ gravitational waves emitted by a highly unusual cosmic event. The awe, to me, is right there. Yet I’m also keenly aware how impactful the journalism advocated by Willyard can be, having seen it in ‘action’ in the feature-esque pieces published by science magazines, where the people are front and centre, and the number of people that read and talk about them.

I hold out because I believe there are, like me, many people out there (I’ve met a few) that can be awed by narratives of neutron-star collisions that dispense with invoking the human condition. I also believe that while a large number of people may read those feature-esque pieces, I’m not convinced they have a value that goes beyond storytelling, which is of course typically excellent. But I suppose those narratives of purely scientific research devoid of human protagonists (or antagonists) would have to be at least as excellent in order to captivate audiences just as well. If a journalist — together with the context in which they produce their work — isn’t up to the mark yet, they should strive to be. And this striving is essential if “you might think just the fact that journalists are writing about an idea should fetch it from the fringes to the mainstream, but it does not” is to be meaningful.


[1] Not least because each Nobel Prize announcement is accompanied by three press releases: one making the announcement, one explaining the prize-winning work to a non-expert audience, and one explaining it in its full technical context. Journalism with these resources is actually quite enjoyable. This helps, too.

[2] I’m predominantly a textual journalist and default to ‘write’ when writing about journalistic communication. But of course in this sentence I mean journalists who aren’t good writers and/or good video-makers or editors and/or good podcasters, etc.

Google’s iOS App Inserts Its Own Links Into Webpages

By: Nick Heer
1 December 2024 at 17:17

Barry Schwartz, Search Engine Roundtable:

Google launched a new feature in the Google App for iOS named Page Annotation. When you are browsing a web page in the Google App native browser, Google can “extract interesting entities from the webpage and highlight them in line.” When you click on them, Google takes you to more search results.

This was announced nearly two weeks ago in a subtle forum post. If there was a press release, I cannot find it. It was only picked up by the press thanks to Schwartz’s November 21 article, but those stories were not published until just before the U.S. Thanksgiving long weekend, so this news was basically buried.

Google is now injecting “Page Annotations”, which are kind of like Skimlinks but with search results. The results from a tapped Page Annotation are loaded in a floating temporary sheet, so it is not like users are fully whisked away — but that is almost worse. In the illustration from Google, a person is apparently viewing a list of Japanese castles, into which Google has inserted a link on “Osaka Castle”. Tapping on an injected link will show Google’s standard search results, which are front-loaded with details about how to contact the castle, buy tickets, and see a map. All of those things would be done better in a view that cannot be accidentally swiped away.

Maybe, you are thinking, it would be helpful to easily trigger a search from some selected text, and that is fair. But the Google app already displays a toolbar with a search button when you highlight any text in this app.

Owners of web properties are only able to opt out by completing a Google Form, but you must be signed into the same Google account you use for Search Console. Also, if a property is accessible at multiple URLs — for example, http and https, or www and non-prefixed — you must include each variation separately.
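As an illustration of the busywork this imposes, a short script (hypothetical, not any official tool) can enumerate the variants a property owner would have to submit separately:

```python
from itertools import product

def url_variants(domain: str, path: str = "/") -> list[str]:
    """List the scheme and www variations of a property that would
    each need a separate opt-out submission."""
    schemes = ["http", "https"]
    hosts = [domain, f"www.{domain}"]
    return [f"{scheme}://{host}{path}" for scheme, host in product(schemes, hosts)]

# A property reachable four ways needs four separate submissions.
for variant in url_variants("example.com"):
    print(variant)
```

A single site reachable over both schemes, with and without the www prefix, already means four form submissions; every subdomain multiplies that again.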

For Google to believe it has the right to inject itself into third-party websites is pure arrogance, yet it is nothing new for the company. It has long approached the web as its own platform over which it has control and ownership. It overlays dialogs without permission; it invented a proprietary fork of HTML and it pushed its adoption for years. It can only do these things because it has control over how people use the web.


On the 2024 Nobel Prizes and the Rosalind Lee issue

10 October 2024 at 02:30

The Nobel Prizes are a deeply flawed institution, both out of touch with science as it is done today and possessed of an outsized influence on scientific practice at the most demanding levels. Yet the institution persists, with the prizes continuing to crown some of the greatest achievements in the history of modern science.


The prizes are exclusive by design and their prestige is enforced through a system of secrecy: the reasons for picking each laureate are locked away for 50 years even as the selection process happens behind closed doors. In keeping with a historical tradition of all prizes being distinguished by their laureates, the Nobel Prizes are sought after so scientists can enter the same ranks that hold Niels Bohr, Albert Einstein, Marie Curie, etc.

Of course, the institution, like others of its kind, reinforces the need for itself, creating self-fulfilling conditions: it mooches off the reputation of scientists who have laboured for decades in specific social, economic, cultural, and political contexts to produce knowledge of incredible value, and in return confers a reputation of a different kind. This is why Jean-Paul Sartre tried to decline the Nobel Prize for literature in 1964.

Then again, the way the award-giving foundation conducts the prizes’ announcements has also helped to ameliorate the neglectful treatment many sections of the mainstream media, especially in India, have meted out to the sort of scientific work the prizes fete, even if the foundation’s conduct also panders to the causes of such treatment.

The prizes

I think the Nobel Prizes for physiology/medicine and for physics caught many science communicators off guard because they were both concerned with very involved pieces of work with no direct applications. The medicine prize was for the discovery of microRNA and post-transcriptional gene regulation, which when it happened overturned what biologists had assumed was a complete picture of how the body’s cells regulate genes to make different proteins.

The physics prize was for the first work on artificial neural networks (ANNs), which produced a machine-friendly version of cognition by drawing on ideas in biology, neuropsychology, and statistical mechanics. If this work hadn’t happened, ChatGPT may not exist today, but several other developments built on the first ANNs to produce more new knowledge whose accumulation eventually led to ChatGPT et al. Ergo, calling ChatGPT et al. an application of the first ANNs would be thoroughly misguided.

The chemistry prize — for the development of computational tools to design proteins and to predict their structures — presented a slightly different problem: the tools’ advent meant humans suddenly found themselves spending much less time on deciphering the structures, yet the tools didn’t, and still don’t, say why proteins prefer these structures over others. Scientists still need to figure out the why by themselves.

All this said, I’m grateful this year as I’ve been before for the prizes’ ability to throw up an opportunity for all sections of the media to discuss scientific work many of them would most likely have neglected otherwise. Reading the research papers that first reported the existence of microRNA and the papers that explained how models to understand exotic states of matter lent themselves to the first ANN concepts allowed me personally to refresh my basics as well as be reminded of the ability of blue-sky scicomm — as a direct counterpart of blue-sky research, one that isn’t fixated on applications — to wow us.


This post benefited from feedback from Thomas Manuel and Mahima Jain.


The Rosalind Lee issue

To reiterate from the introduction, the Nobel Prizes are one institution with deep and well-defined flaws. And I have learnt from (journalistic) experience that there’s no changing its mind. It’s too big to change and doesn’t admit the need to do so, and its members have had no compunctions about articulating that in public. The vast majority of scientists also subscribe to the prizes’ value and their general desirability. So it is my view today that we work around the prizes and/or renounce the prizes altogether when dealing with the award-giving group’s choices.


A third option is to change the foundation’s mind, but this requires a considerable amount of collective work to which I doubt more than a few would like to dedicate themselves. Mind-changing work is demanding work. Then again, the problem is that if you fall anywhere in between the two more viable options, you risk admitting other possibilities vis-à-vis the Nobel Prizes that (I imagine) you’d rather not.

For a background on the Rosalind Lee issue, I suggest you browse X.com. My notes on it follow:

(i) The Nobel Foundation has historically reserved the Nobel Prizes for persons who conceived of important ideas and made testable predictions about them. The latter is important. IIRC this is why SN Bose didn’t win a Nobel Prize for coming up with Bose-Einstein particle statistics. Albert Einstein could have won instead because he built on Bose’s ideas to predict the existence of a particular state of matter: the Bose-Einstein condensate. Who came up with the testable predictions in the paper that won Victor Ambros a share of the medicine Nobel Prize?

I’m not directly defending the exclusion of Rosalind Lee, who was the first author of that paper and, in fact, of many of the more important papers Ambros published in his career. Instead, I’m pointing to an answer that could explain her exclusion, with a reminder that the answer is flawed and always has been. I suppose I’m saying that we couldn’t have expected better. 🙃

(ii) Physics World recently published an interview with Lars Brink, a physicist who has been part of the decision-making for many physics prizes over the last decade. Brink bluntly states at one point that the Nobel Academy doesn’t give the prizes to collaborations, or in fact to more than three people at a time, because it doesn’t want 5,000 people (for example at CERN) claiming they’re Nobel laureates all of a sudden. There is an explicit and deliberate design here to keep the prizes exclusive, like Hermès handbags.

(iii) The first author is often the one who designs the experiment, performs it, collects the data, analyses it, etc. — basically everything beyond the act of having the idea itself (though not necessarily excluding it), including most of the legwork. The Nobel Prizes, however, are not awards for legwork. This sucks because it reflects a profound misunderstanding of the range of people required to produce good-quality scientific knowledge.

Thanks to the influence the prizes exert on the scientific community, the people who are left out also fade further — in the public view and also in terms of not being able to benefit from the systematic rewards vouchsafed for the Nobel laureates who are now institutions unto themselves. The fading is likely compounded for people already struggling to be noticed in the scientific literature: the “technicians” who equip, maintain, and operate laboratory instruments, among others (a.k.a. the Matthew effect). Of course the axis of discrimination is gendered as well: as one friend put it, “the ‘leg work’ of science is historically feminised”, and when awards and other forms of recognition exclude such work they perpetuate the Matilda effect.


Overall, whether the prize-giving body is aware of these narratives and issues is moot. What matters is that it acknowledges and responds to them — which it has signalled it won’t do. QED.

(iv) In fact, all these rules of the Nobel Prizes are arbitrary. It’s effectively a sport, and a poorly managed one at that: you make up a playing field, publicise some of the rules, keep the governing body beyond reach or reproach, hide the scorecard, and then you say one has to jump five feet in the air to qualify. The outragers are raising their voices for Rosalind Lee (what does she want, by the way?) but not for the first authors of all the other papers by other laureates over the years. If those authors don’t belong to marginalised social groups, is it okay to leave them out? Then again, these are moot questions, pursuits leading nowhere at all thanks to the Nobel Prizes’ presumption that they’re not of this world.

The Nobel Prizes have also wronged many women, but I can't claim to know whether there's a case-by-case explanation (with arbitrary foundations) or if it was a systematic program to do so. Both seem equally likely given how slow attitudes have been to change on this front. This said, just because women have been wronged doesn’t mean all forms of reparation will be equally useful. More specifically, what will breaking the (arbitrary) rules do to change for women in science?

Obviously this is part of a broader question about the influence of the Nobel Prizes on doing science. Mukund Thattai ran a survey on Twitter years ago asking scientists about why they got into or stayed in science. "Because of a Nobel laureate” received the fewest votes in a large pool of respondents. It wasn’t a representative survey but it does hint at an important piece of reality. Once we start to argue that including Rosalind Lee would have been better, we also tacitly admit the Nobel Prizes matter for who chooses to stay in science and who is condemned to fade — but do they?

On the other side of this coin lie all the other prizes that did fete Rosalind Lee along with Victor Ambros. If we’d like to have any prizes at all (I don’t but YMMV), shall we celebrate the Newcomb Cleveland Prize more than the Nobel Prizes? Likewise, by railing against Rosalind Lee’s exclusion on arbitrary grounds, what do we hope to achieve? It may be more gainful to spread awareness of the Nobel Prizes’ flaws and finitude and focus on the deeper question of how the opportunities to win X award can influence the way science is done, who does it, and why.

⌥ The Reddit and Google Pairing Is One of a Kind

By: Nick Heer
7 August 2024 at 03:51

Since owners of web properties became aware of the traffic-sending power of search engines — most often Google in most places — they have been in an increasingly uncomfortable relationship as search moves beyond ten relevant links on a page. Google does not need websites, per se; it needs the information they provide. Its business recommendations are powered in part by reviews on other websites. Answers to questions appear in snippets, sourced to other websites, without the user needing to click away.

Publishers and other website owners might consider this a bad deal. They feed Google all this information hoping someone will visit their website, but Google is adding features that make it less likely they will do so. Unless they were willing to risk losing all their Google search traffic, there was little a publisher could do. Individually, they needed Google more than Google needed them.

But that has not been quite as true for Reddit. Its discussions hold a uniquely large corpus of suggestions and information on specific topics and in hyper-local contexts, as well as a whole lot of trash. While the quality of Google’s results has been sliding, searchers discovered they could append “Reddit” to a query to find what they were looking for.

Google realized this and, earlier this year, signed a $60 million deal with Reddit allowing it to scrape the site to train its A.I. features. Part of that deal apparently involved indexing pages in search as, last month, Reddit restricted that capability to Google. That is: if you want to search Reddit, you can either use the site’s internal search engine, or you can use Google. Other search engines still display results created from before mid-July, according to 404 Media, but only Google is permitted to crawl anything newer.

It is unclear to me whether this is a deal only available to Google, or if it is open to any search engine that wants to pay. Even if it was intended to be exclusive, I have a feeling it might not be for much longer. But it seems like something Reddit would only care about doing with Google because other search engines basically do not matter in the United States or worldwide.1 What amount of money do you think Microsoft would need to pay for Bing to be the sole permitted crawler of Reddit in exchange for traffic from its measly market share? I bet it is a lot more than $60 million.

Maybe that is one reason this agreement feels uncomfortable to me. Search engines are marketed as finding results across the entire web but, of course, that is not true: they most often obey rules declared in robots.txt files, and they do not necessarily index everything they are able to, either. These are not explicit limitations. Yet it feels like a violation of the premise of a search engine for a site to sell the right to crawl and link to its pages. The whole thing about the web is that the links are free. There is no guarantee the actual page will be freely accessible, but the link itself is not restricted. It is the central problem with link tax laws, and this pay-to-index scheme is similarly restrictive.
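Crawler permissions of this sort are typically declared in a site’s robots.txt file. As a minimal sketch — the rules below are illustrative, not Reddit’s actual file — Python’s standard-library parser shows how a policy can admit one crawler and shut out the rest:

```python
from urllib import robotparser

# Hypothetical rules resembling a "Google-only" policy:
# Googlebot may crawl everything; every other agent is barred.
rules = """
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Googlebot matches the first entry; all other crawlers fall
# through to the catch-all entry and are disallowed.
print(rp.can_fetch("Googlebot", "https://example.com/r/some-thread"))  # True
print(rp.can_fetch("Bingbot", "https://example.com/r/some-thread"))    # False
```

Of course, robots.txt is only a polite request — the novelty in the Reddit deal is that indexing permission is now also a contractual, paid arrangement.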

This is, of course, not the first time there has been tension in how a site balances search engine visibility and its own goals. Publishers have, for years, weighed their desire to be found by readers against login requirements and paywalls — guided by the overwhelming influence of Google.

Google used to require publishers provide free articles to be indexed by the search engine but, in 2017, it replaced that with a model that is more flexible for publishers. Instead of forcing a certain number of free page views, publishers are now able to provide Google with indexable data.

Then there are partnerships struck by search engines and third parties to obtain specific kinds of data. These were summarized well in the recent United States v. Google decision (PDF), and they are probably closest in spirit to this Reddit deal:

GSEs enter into data-sharing agreements with partners (usually specialized vertical providers) to obtain structured data for use in verticals. Tr. at 9148:2-5 (Holden) (“[W]e started to gather what we would call structured data, where you need to enter into relationships with partners to gather this data that’s not generally available on the web. It can’t be crawled.”). These agreements can take various forms. The GSE might offer traffic to the provider in exchange for information (i.e., data-for-traffic agreements), pay the provider revenue share, or simply compensate the provider for the information. Id. at 6181:7-18 (Barrett-Bowen).

As of 2020, Microsoft has partnered with more than 100 providers to obtain structured data, and those partners include information sources like Fandango, Glassdoor, IMDb, Pinterest, Spotify, and more. DX1305 at .004, 018–.028; accord Tr. at 6212:23–6215:10 (Barrett-Bowen) (agreeing that Microsoft partners with over 70 providers of travel and local information, including the biggest players in the space).

The government attorneys said Bing is required to pay for structured data owing to its smaller size, while Google is able to obtain structured data for free because it sends partners so much traffic. The judge ultimately rejected their argument that Microsoft struggled to sign these agreements or was impeded in doing so, but did not dispute the difference in negotiating power between the two companies.

Once more, for emphasis: Google usually gets structured data for free but, in this case, it agreed to pay $60 million; imagine how much it would cost Bing.

This agreement does feel pretty unique, though. It is hard for me to imagine many other websites with the kind of specific knowledge found aplenty on Reddit. It is a centralized version of the bulletin boards of the early 2000s for such a wide variety of interests and topics. It is such a vast user base that, while it cannot ignore Google referrals, it is not necessarily reliant on them in the same way as many other websites are.

Most other popular websites are insular social networks; Instagram and TikTok are not relying on Google referrals. Wikipedia would probably be the best comparison to Reddit in terms of the contribution it makes to the web — even greater, I think — but every article page I tried except the homepage is overwhelmingly dependent on external search engine traffic.

Meanwhile, pretty much everyone else still has to pay Google for visitors. They have to buy the ads sitting atop organic search results. They have to buy ads on maps, on shopping carousels, on videos. People who operate websites hope they will get free clicks, but many of them know they will have to pay for some of them, even though Google will happily lift and summarize their work without compensation.

I cannot think of any other web property which has this kind of leverage over Google. While this feels like a violation of the ideals and principles that have built the open web on which Google has built its empire, I wonder if Google will make many similar agreements, if any. I doubt it — at least for now. This feels funny; maybe that is why it is so unique, and why it is not worth being too troubled by it.


  1. The uptick of Bing in the worldwide chart appears to be, in part, thanks to a growing share in China. Its market share has also grown a little in Africa and South America, but only by tiny amounts. However, Reddit is blocked in China, so a deal does not seem particularly attractive to either party. ↥︎

‘Google Is a Monopolist’ in Search Says U.S. Judge

By: Nick Heer
5 August 2024 at 21:36

Ashley Belanger, Ars Technica:

Google just lost a massive antitrust trial over its sprawling search business, as US district judge Amit Mehta released his ruling, showing that he sided with the US Department of Justice in the case that could disrupt how billions of people search the web.

“Google is a monopolist, and it has acted as one to maintain its monopoly,” Mehta wrote in his opinion. “It has violated Section 2 of the Sherman Act.”

Google will surely contest this finding once its implications are known; Mehta has not yet announced what remedies will be imposed on Google.

The opinion is full of details about the precise nature of how Google search and its ads work together, Google’s relationship with Apple and other third parties, and how its business has changed over time. For example, the judge notes Google adjusted ad pricing to maintain a specific growth target, and increased it incrementally to mask it in the typical fluctuations of ad costs. He also cites a finding that “thirteen months of user data acquired by Google is equivalent to over 17 years of data on Bing” in informing the quality of search results. Meanwhile, Google pays Apple a redacted amount through its revenue sharing agreement for default placement in Safari, and it pays for searches performed through Chrome on Apple devices as well. There is a lot more in here, and I fully intend on re-reading the opinion with a bunch of questions I have in mind.

Google really does have great search results a lot of the time, even though it has stumbled in recent years. DuckDuckGo is my default but I find myself often turning to Google for local results, very old results, and news. (DuckDuckGo is powered by Bing, which prioritizes MSN-syndicated versions of articles that I do not want.) Google has not fallen into the same trap as Bing by wholly cluttering the results page. Microsoft still has no taste.

But two things can be true: Google can be the best search engine for most people, most of the time, because it is very good; and, also, Google can have abused its market-leading position to avoid competition and maintain its advertising revenue. Those are not inconsistent with each other. In fact, per the judge’s citation of how long it would take for Bing to amass the same information about user activity as Google does in a year, it is fully possible its quality and its dominance are related, something the judge nods toward. In fact, Google’s position is now so entrenched “it would not lose search revenue if it were to significantly reduce the quality of its search product”.

Notably, Mehta did not sanction Google for failing to preserve evidence in the case, writing:

On the request for sanctions, the court declines to impose them. Not because Google’s failure to preserve chat messages might not warrant them. But because the sanctions Plaintiffs request do not move the needle on the court’s assessment of Google’s liability. […]

In cases where the judge found evidence of monopolistic and abusive behaviour, the lack of supporting text messages and other communications would not have made a difference; this is also true, the judge says, for his finding of a lack of anticompetitive behaviour in SA360.

⌥ Permalink

The BHU Covaxin study and ICMR bait

By: VM
28 May 2024 at 04:51

Earlier this month, a study by a team at Banaras Hindu University (BHU) in Varanasi concluded that fully 1% of Covaxin recipients may suffer severe adverse events. One percent is a large number because the multiplier (x in 1/100 * x) is very large — several million people. The study first hit the headlines for claiming it had the support of the Indian Council of Medical Research (ICMR) and reporting that both Bharat Biotech and the ICMR are yet to publish long-term safety data for Covaxin. The latter is probably moot now, with the COVID-19 pandemic well behind us, but it’s the principle that matters. Let it go this time and who knows what else we’ll be prepared to let go.
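To put the arithmetic in perspective — the recipient counts below are hypothetical round figures, not actual Covaxin numbers — a 1% rate scales like this:

```python
# Back-of-envelope: a 1% severe adverse event rate applied to
# large, hypothetical recipient counts.
rate = 0.01
for recipients in (1_000_000, 10_000_000, 100_000_000):
    print(f"{recipients:,} recipients -> {int(rate * recipients):,} severe events")
```

Even at the low end of the range, a 1% rate implies tens of thousands of people, which is why the figure drew headlines.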

But more importantly, as The Hindu reported on May 25, the BHU study is too flawed to claim Covaxin is harmful, or claim anything for that matter. Here’s why (excerpt):

Though the researchers acknowledge all the limitations of the study, which is published in the journal Drug Safety, many of the limitations are so critical that they defeat the very purpose of the study. “Ideally, this paper should have been rejected at the peer-review stage. Simply mentioning the limitations, some of them critical to arrive at any useful conclusion, defeats the whole purpose of undertaking the study,” Dr. Vipin M. Vashishtha, director and pediatrician, Mangla Hospital and Research Center, Bijnor, says in an email to The Hindu. Dr. Gautam Menon, Dean (Research) & Professor, Departments of Physics and Biology, Ashoka University shares the same view. Given the limitations of the study one can “certainly say that the study can’t be used to draw the conclusions it does,” Dr. Menon says in an email.

Just because you’ve admitted your study has limitations doesn’t absolve you of the responsibility to interpret your research data with integrity. In fact, the journal needs to speak up here: why did Drug Safety publish the study manuscript? Too often when news of a controversial or bad study is published, the journal that published it stays out of the limelight. While the proximal cause is likely that journalists don’t think to ask journal editors and/or publishers tough questions about their publishing process, there is also a cultural problem here: when shit hits the fan, only the study’s authors are pulled up, but when things are rosy, the journals are out to take credit for the quality of the papers they publish. In either case, we must ask what they actually bring to the table other than capitalising on other scientists’ tendency to judge papers based on the journals they’re published in instead of their contents.

Of course, it’s also possible to argue that unlike, say, journalistic material, research papers aren’t required to be in the public interest at the time of publication. Yet the BHU paper threatens to undermine public confidence in observational studies, and that can’t be in anyone’s interest. Even at the outset, experts and many health journalists knew that observational studies don’t carry the same weight as randomised controlled trials, and that such studies still serve a legitimate purpose, just not the one to which the BHU study pressed its conclusions.

After the paper’s contents hit the headlines, the ICMR shot off a letter to the BHU research team saying it hasn’t “provided any financial or technical support” to the study and that the study is “poorly designed”. Curiously, the BHU team’s riposte to the ICMR’s letter makes repeated reference to Vivek Agnihotri’s film The Vaccine War. In the same point in which two of these references appear (no. 2), the team writes: “While a study with a control group would certainly be of higher quality, this immediately points to the fact that it is researchers from ICMR who have access to the data with the control group, i.e. the original phase-3 trials of Covaxin – as well publicized in ‘The Vaccine War’ movie. ICMR thus owes it to the people of India, that it publishes the long-term follow-up of phase-3 trials.”

I’m not clear why the team saw fit to appeal to statements made in this of all films. As I’ve written earlier, The Vaccine War — which I haven’t watched but which directly references journalistic work by The Wire during and of the pandemic — is most likely a mix of truths and fictionalisation (and not in the clever, good-faith ways in which screenwriters adopt textual biographies for the big screen), with the fiction designed to serve the BJP’s nationalist political narratives. So when the letter says in its point no. 5 that the ICMR should apologise to a female member of the BHU team for allegedly “spreading a falsehood” about her and offers The Vaccine War as a counterexample (“While ‘The Vaccine War’ movie is celebrating women scientists…”), I can’t but retch.

Together with another odd line in the letter — that the “ICMR owes it to the people of India” — the appeals read less like a debate between scientists on the merits and demerits of the study and more like an attempt to bait the ICMR into doing better. I’m not denying the ICMR started it, as a child might say, but that shouldn’t have prevented the BHU team from keeping things dignified. For example, the BHU letter reads: “It is to be noted that interim results of the phase-3 trial, also cited by Dr. Priya Abraham in ‘The Vaccine War’ movie, had a mere 56 days of safety follow-up, much shorter than the one-year follow-up in the IMS-BHU study.” Surely the 56-day period finds mention in a more respectable and reliable medium than a film that confuses you about what’s real and what’s not?

In all, the BHU study seems to have been designed to draw attention to gaps in the safety data for Covaxin — but by adopting such a provocative route, all that took centerstage was its spat with the ICMR plus its own flaws.

Google Comments on Its Sloppy Summaries

By: Nick Heer
3 June 2024 at 04:20

Liz Reid, head of Google Search, on the predictably bizarre results of rolling out its “A.I. Overviews” feature:

One area we identified was our ability to interpret nonsensical queries and satirical content. Let’s take a look at an example: “How many rocks should I eat?” Prior to these screenshots going viral, practically no one asked Google that question. You can see that yourself on Google Trends.

There isn’t much web content that seriously contemplates that question, either. This is what is often called a “data void” or “information gap,” where there’s a limited amount of high quality content about a topic. However, in this case, there is satirical content on this topic … that also happened to be republished on a geological software provider’s website. So when someone put that question into Search, an AI Overview appeared that faithfully linked to one of the only websites that tackled the question.

This reasoning sounds almost circular in the context of what A.I. answers are supposed to do. Google loves demonstrating how users can enter a query like “suggest a 7 day meal plan for a college student living in a dorm focusing on budget friendly and microwavable meals” and see a grouped set of responses synthesized from a variety of sources. That is surely a relatively uncommon query. I was going to prove that in the same way as Reid did, but when I enter it in Google Trends, I get a 400 error. Even a shortened version is searched so rarely it has no data.

The organic, non-A.I. search results for the long query are plentiful but do not exactly fulfill its specific criteria. Most of the links I saw are not microwave-only, or are simple lists not grouped into particular meal types. Nothing I could find specifically answers the question posed. In order to fulfill the query in the demo video, Google’s search engine has to look through everything it knows and find meals which cook in a microwave, and organize them into a daily plan of different meal types.

But Google is also blaming the novelty of the rocks query and the satirical information directly answering it for the failure of its A.I. features. In other words, it wants to say the cool thing about its A.I. stuff is that it can handle unpopular or new queries by sifting through the web and merging together a bunch of stuff it finds. The bad thing about A.I. stuff, it turns out, is basically the same.

Benj Edwards, Ars Technica:

Here we see the fundamental flaw of the system: “AI Overviews are built to only show information that is backed up by top web results.” The design is based on the false assumption that Google’s page-ranking algorithm favors accurate results and not SEO-gamed garbage. Google Search has been broken for some time, and now the company is relying on those gamed and spam-filled results to feed its new AI model.

Reid says Google has made a bunch of changes to address the issues raised, but none of them fix a fundamental shift in A.I. results. Google used to be a directory — admittedly one ranked by mysterious criteria — allowing users to decide which results best fit their needs. It has slowly repositioned itself to being able to answer their queries with authority. Its A.I. answers are a more fulsome realization of features like Featured Snippets and the Answer Box. That is: instead of seeing options which may match their query, Google is now giving searchers singular answers. It has transformed from a referrer into an omniscient responder.

⌥ Permalink

Google Leaked Itself

By: Nick Heer
29 May 2024 at 14:52

Rand Fishkin, writing on the SparkToro blog:

On Sunday, May 5th, I received an email from a person claiming to have access to a massive leak of API documentation from inside Google’s Search division. The email further claimed that these leaked documents were confirmed as authentic by ex-Google employees, and that those ex-employees and others had shared additional, private information about Google’s search operations.

It seems this vast amount of information was published erroneously by Google to a GitHub repository in March, and then removed earlier this month. As Fishkin writes, it is evidence Google has been dishonest in its public statements about how Google Search works.

Fishkin specifically calls attention to media outlets that cover search engines and value the word of Google’s spokespeople. This has been a clever play by Google for years: because its specific ranking criteria have not been publicly known, it can confirm or deny rumours without having to square them with what the evidence shows.

Google’s ranking system seems to be biased in favour of larger businesses and more established websites, according to Fishkin’s analysis. This is not surprising. I am wondering how this fits with the declining quality of Google search results as small, highly-optimized pages full of machine-generated junk seem to rise to the top.

Mike King, iPullRank:

You’d be tempted to broadly call these “ranking factors,” but that would be imprecise. Many, even most, of them are ranking factors, but many are not. What I’ll do here is contextualize some of the most interesting ranking systems and features (at least, those I was able to find in the first few hours of reviewing this massive leak) based on my extensive research and things that Google has told/lied to us about over the years.

“Lied” is harsh, but it’s the only accurate word to use here. While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries. My advice to future Googlers speaking on these topics: Sometimes it’s better to simply say “we can’t talk about that.” Your credibility matters, and when leaks like this and testimony like the DOJ trial come out, it becomes impossible to trust your future statements.

One of the things potentially tracked by Google for search purposes is Chrome browsing data, something Google has denied. The variable in question — chromeInTotal — and the minimal description offered — “site-level Chrome views” — seem open to interpretation. Perhaps this is only recorded in some circumstances, or it depends on user preferences, or is not actually part of search rankings, or is entirely unused. But it certainly suggests aggregate website visits in Chrome, the world’s most popular web browser, are used to inform rankings without users’ knowledge.

Update: Google says the leaked documents are real, but warns “against making inaccurate assumptions”. In fairness, I would like to make more accurate assumptions.

⌥ Permalink

The BHU Covaxin study and ICMR bait

By: VM
28 May 2024 at 04:18
The BHU Covaxin study and ICMR bait

Earlier this month, a study by a team at Banaras Hindu University (BHU) in Varanasi concluded that fully 1% of Covaxin recipients may suffer severe adverse events. One percent is a large number because the multiplier (x in 1/100 * x) is very large — several million people. The study first hit the headlines for claiming it had the support of the Indian Council of Medical Research (ICMR) and reporting that both Bharat Biotech and the ICMR are yet to publish long-term safety data for Covaxin. The latter is probably moot now, with the COVID-19 pandemic well behind us, but it’s the principle that matters. Let it go this time and who knows what else we’ll be prepared to let go.

But more importantly, as The Hindu reported on May 25, the BHU study is too flawed to claim Covaxin is harmful, or claim anything for that matter. Here’s why (excerpt):

Though the researchers acknowledge all the limitations of the study, which is published in the journal Drug Safety, many of the limitations are so critical that they defeat the very purpose of the study. “Ideally, this paper should have been rejected at the peer-review stage. Simply mentioning the limitations, some of them critical to arrive at any useful conclusion, defeats the whole purpose of undertaking the study,” Dr. Vipin M. Vashishtha, director and pediatrician, Mangla Hospital and Research Center, Bijnor, says in an email to The Hindu. Dr. Gautam Menon, Dean (Research) & Professor, Departments of Physics and Biology, Ashoka University shares the same view. Given the limitations of the study one can “certainly say that the study can’t be used to draw the conclusions it does,” Dr. Menon says in an email.

Just because you’ve admitted your study has limitations doesn’t absolve you of the responsibility to interpret your research data with integrity. In fact, the journal needs to speak up here: why did Drug Safety publish the study manuscript? Too often when news of a controversial or bad study is published, the journal that published it stays out of the limelight. While the proximal cause is likely that journalists don’t think to ask journal editors and/or publishers tough questions about their publishing process, there is also a cultural problem here: when shit hits the fan, only the study’s authors are pulled up, but when things are rosy, the journals are out to take credit for the quality of the papers they publish. In either case, we must ask what they actually bring to the table other than capitalising on other scientists’ tendency to judge papers based on the journals they’re published in instead of their contents.

Of course, it's also possible to argue that unlike, say, journalistic material, research papers aren't required to be in the public interest at the time of publication. Yet the BHU paper threatens to undermine public confidence in observational studies, and that can't be in anyone's interest. Even at the outset, experts and many health journalists knew that observational studies don't carry the same weight as randomised controlled trials, and that such studies still serve a legitimate purpose, just not the one to which the BHU study pressed its conclusions.

After the paper’s contents hit the headlines, the ICMR shot off a letter to the BHU research team saying it hasn’t “provided any financial or technical support” to the study and that the study is “poorly designed”. Curiously, the BHU team’s repartee to the ICMR’s letter makes repeated reference to Vivek Agnihotri’s film The Vaccine War. In the same point in which two of these references appear (no. 2), the team writes: “While a study with a control group would certainly be of higher quality, this immediately points to the fact that it is researchers from ICMR who have access to the data with the control group, i.e. the original phase-3 trials of Covaxin – as well publicized in ‘The Vaccine War’ movie. ICMR thus owes it to the people of India, that it publishes the long-term follow-up of phase-3 trials.”

I’m not clear why the team saw fit to appeal to statements made in this of all films. As I’ve written earlier, The Vaccine War — which I haven’t watched but which directly references journalistic work by The Wire during and about the pandemic — is most likely a mix of truths and fictionalisation (and not in the clever, good-faith ways in which screenwriters adapt textual biographies for the big screen), with the fiction designed to serve the BJP’s nationalist political narratives. So when the letter says in its point no. 5 that the ICMR should apologise to a female member of the BHU team for allegedly “spreading a falsehood” about her and offers The Vaccine War as a counterexample (“While ‘The Vaccine War’ movie is celebrating women scientists…”), I can’t but retch.

Together with another odd line in the letter — that the “ICMR owes it to the people of India” — the appeals read less like a debate between scientists on the merits and demerits of the study and more like an attempt to bait the ICMR into doing better. I’m not denying the ICMR started it, as a child might say, but that shouldn’t have prevented the BHU team from keeping things dignified. For example, the BHU letter reads: “It is to be noted that interim results of the phase-3 trial, also cited by Dr. Priya Abraham in ‘The Vaccine War’ movie, had a mere 56 days of safety follow-up, much shorter than the one-year follow-up in the IMS-BHU study.” Surely the 56-day period finds mention in a more respectable and reliable medium than a film that confuses you about what’s real and what’s not?

In all, the BHU study seems to have been designed to draw attention to gaps in the safety data for Covaxin — but by adopting such a provocative route, all that took centre stage was the team’s spat with the ICMR plus the study’s own flaws.

Google’s A.I. Answers Said to Put Glue in Pizza, So Katie Notopoulos Made Some Pizza

By: Nick Heer
25 May 2024 at 05:53

Jason Koebler, 404 Media:

The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.

Katie Notopoulos, putting the “business” in Business Insider:

I knew my assignment: I had to make the Google glue pizza. (Don’t try this at home! I risked myself for the sake of the story, but you shouldn’t!)

My timeline on three entirely separate social networks — Bluesky, Mastodon, and Threads — has been chock full of examples of Google’s A.I. answers absolutely eating dirt — or, in one case, rocks — in the face of obvious satire and shitposting. Well, obvious to us. Computers, it seems, have not figured out glue and gasoline are bad for food.

The A.I. answers from Google are not all yucks and chuckles, unfortunately.

Nic Lake:

Yesterday (Part 1) I saw that mushrooms post, and knew something like that was going to get people hurt. I didn’t really think that (CONTENT WARNING) asking how best to deal with depression was going to be next on the “shit I didn’t want to see” Bingo card.

The organizations know. They know that these tools are not ready. They call it a “beta” and feed it to you anyway.

Google is manually removing A.I. results where appropriate, and it is claiming some of the screenshots which have been circulating have been faked in some way without specifying which.

To quote week-ago me:

Given the sliding quality of Google’s results, it seems quite bold for the company to be confident users worldwide will trust its generated answers.

Quite bold, indeed.

I do not expect perfection, but it is downright embarrassing that Google rolled out a product so unreliable and occasionally dangerous that it continues to tarnish an already-suffering reputation. Google’s Featured Snippets were bad enough. Now it is in the process of rolling out a whole new level of overconfident nonsense to the entire world, fixing it as everyone tests its limits.

⌥ Permalink

Google Is Expanding A.I. Feature Availability in Search

By: Nick Heer
15 May 2024 at 04:39

Liz Reid, head of Google Search:

People have already used AI Overviews billions of times through our experiment in Search Labs. They like that they can get both a quick overview of a topic and links to learn more. We’ve found that with AI Overviews, people use Search more, and are more satisfied with their results.

So today, AI Overviews will begin rolling out to everyone in the U.S., with more countries coming soon. That means that this week, hundreds of millions of users will have access to AI Overviews, and we expect to bring them to over a billion people by the end of the year.

Given the sliding quality of Google’s results, it seems quite bold for the company to be confident users worldwide will trust its generated answers. I am curious to try it when it is eventually released in Canada.

I know what you must be thinking: if Google is going to generate results without users clicking around much, how will it sell ad space? It is a fair question, reader.

Gerrit De Vynck and Cat Zakrzewski, Washington Post:

Google has largely avoided AI answers for the moneymaking searches that host ads, said Andy Taylor, vice president of research at internet marketing firm Tinuiti.

When it does show an AI answer on “commercial” searches, it shows up below the row of advertisements. That could force websites to buy ads just to maintain their position at the top of search results.

This is just one source speaking to the Post. I could not find any corroborating evidence or a study to support this, even on Tinuiti’s website. But I did notice — halfway through Google’s promo video — a query for “kid friendly places to eat in dallas” was answered with an ad for Hopdoddy Burger Bar before any clever A.I. stuff was shown.

Obviously, the biggest worry for many websites dependent on Google traffic is what will happen to referrals if Google will simply summarize the results of pages instead of linking to them. I have mixed feelings about this. There are many websites which game search results and overwhelm queries with their own summaries. I would like to say “good riddance”, but I also know these pages did not come out of nowhere. They are a product of trying to improve website rankings on Google for all searches, and to increase ad and affiliate revenue from people who have clicked through. Neither one is a laudable goal in its own right. Yet anyone who has paid attention to the media industry for more than a minute can kind of understand these desperate attempts to grab attention and money.

Google built entire industries, from recipe bloggers to search optimization experts. What happens when it blows it all up?

Good thing home pages are back.

⌥ Permalink

chroot shenanigans 2: Running a full desktop environment on an Amazon Kindle

14 April 2019 at 14:00

In my previous post, I described running Arch on an OpenWRT router. Today, I'll be taking it a step further and running Arch and a full LXDE installation natively on an Amazon Kindle, which can be interacted with directly using the touch screen. This is possible thanks to the Kindle's operating system being Linux!

You can see the end result in action here. Apologies for the shaky video - it was shot using my phone and no tripod.

If you want to follow along, make sure you've rooted your Kindle beforehand. This is essential – without it, it's impossible to run custom scripts or binaries.

I'm testing this on an 8th generation Kindle (KT3) – it should, however, work on all recent Kindles, provided you have enough storage and are rooted. You also need to set up USBnetwork for SSH access, and optionally KUAL if you want a simple way of launching the chroot.

First things first: We need to set up a filesystem and extract an Arch installation into it, which we can later chroot into. The filesystem will be a file mounted as a loop device. The reason we're not extracting the Arch installation directly into a directory on the Kindle is that the Kindle's storage filesystem is FAT32. FAT32 doesn't support required features such as symbolic links, which would break the Arch installation. Note that this also means your chroot filesystem can be at most 4 gigabytes in size, FAT32's maximum file size. This can be worked around by mounting the real root inside the chroot filesystem, but that's still a hacky way to go about it. But I digress.

First, figure out how large your filesystem actually can be. SSH into your Kindle and see how much free space you have:

$ ssh root@192.168.15.244

kindle# df -k /mnt/base-us
Filesystem   1K-blocks  Used    Available  Use%  Mounted on
/dev/loop/0  3188640    361856  2826784    11%   /mnt/base-us

Seems like we have around 2800000K (around 2.8G) of space available. Let's make our filesystem 2.6G – it's enough to host our root filesystem and some extra applications, such as LXDE. Note that I'll be running the following commands on my PC and transferring the filesystem over later. You can also do all of this on the Kindle, but it's simply easier and faster this way.
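
The 2.6G figure is just the available space with a bit of headroom left over. As a hedged sketch (the df line is hard-coded from the sample output above, not re-read from a device), that arithmetic looks like:

```shell
# Derive a conservative image size (roughly 90% of the available space)
# from df output. The sample line below is copied from the Kindle's
# `df -k /mnt/base-us` output above.
df_line='/dev/loop/0  3188640    361856  2826784    11%   /mnt/base-us'
avail_kb=$(echo "$df_line" | awk '{print $4}')   # available 1K-blocks
img_kb=$(( avail_kb * 90 / 100 ))
echo "image size: ${img_kb}K"
```

Rounding that down to a tidy figure gives the 2.6G used below.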

Let's create a blank file of the wanted size. I'm using dd, but you can also use fallocate for this:

$ dd if=/dev/zero of=arch.img bs=1024 count=2600000
2600000+0 records in
2600000+0 records out
2662400000 bytes (2.7 GB, 2.5 GiB) copied, 6.92058 s, 385 MB/s

Let's create our filesystem on it. Since we're doing this on the PC, we need to make it 32-bit and disable the metadata_csum and huge_file options on the filesystem, as the Kindle's ext4 kernel driver doesn't support them.

$ mkfs.ext4 -O ^64bit,^metadata_csum,^huge_file arch.img
mke2fs 1.45.0 (6-Mar-2019)
Discarding device blocks: done                            
Creating filesystem with 650000 4k blocks and 162560 inodes
Filesystem UUID: a4e72620-368a-44b4-81bb-9e66b2903523
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

This is optional, but I'll also disable periodic filesystem checks on it:

$ tune2fs -c 0 -i 0 arch.img                               
tune2fs 1.45.0 (6-Mar-2019)         
Setting maximal mount count to -1
Setting interval between checks to 0 seconds

Next it's time to mount the filesystem:

$ mkdir rootfs
$ sudo mount -o loop arch.img rootfs/

The Kindle I'm using has a Cortex-A9-based processor, so let's download the ARMv7 version of Arch Linux ARM from here. You can download it and extract it afterwards, or download and extract in one step:

$ curl -L http://os.archlinuxarm.org/os/ArchLinuxARM-armv7-latest.tar.gz | sudo tar xz -C rootfs/

sudo is required to extract as it sets up a lot of files with root permissions. You can ignore the errors about SCHILY.fflags. Verify that the files extracted successfully with ls -l rootfs/.

Let's prepare our Kindle for the filesystem. I opted for hosting the filesystem in extensions/karch as I want to use KUAL for easy launching:

$ ssh root@192.168.15.244

kindle# mkdir -p /mnt/base-us/extensions/karch

While we're here, it's also a good idea to stop the power daemon to prevent the Kindle from going into sleep mode while transferring the filesystem and interrupting our transfer:

kindle# stop powerd
powerd stop/waiting

Let's transfer our filesystem:

kindle# exit
Connection to 192.168.15.244 closed.

$ scp arch.img root@192.168.15.244:/mnt/base-us/extensions/karch/

This might take quite a bit of time, depending on your connection.

Once it's done, let's SSH in once again and set up our mountpoint:

$ ssh root@192.168.15.244

kindle# cd /mnt/base-us/extensions/karch/
kindle# mkdir system

I decided to set up my own loop device so I can have it named, but you can ignore this and use /dev/loop/12 or similar instead. Just make sure it's not already in use by checking with mount.

Setting up a loop point and mounting the filesystem:

kindle# mknod -m0660 /dev/loop/karch b 7 250
kindle# mount -o loop=/dev/loop/karch -t ext4 arch.img system/

We should also mount some system directories into it:

kindle# mount -o bind /dev system/dev
kindle# mount -o bind /dev/pts system/dev/pts
kindle# mount -o bind /proc system/proc
kindle# mount -o bind /sys system/sys
kindle# mount -o bind /tmp system/tmp
kindle# cp /etc/hosts system/etc/

It's time to chroot into our new system and set it up for LXDE. You can also use this opportunity to set up whatever applications you need, such as an onscreen keyboard:

kindle# chroot system/ /bin/bash
chroot# echo 'en_US.UTF-8 UTF-8' > /etc/locale.gen 
chroot# locale-gen
chroot# rm /etc/resolv.conf 
chroot# echo 'nameserver 8.8.8.8' > /etc/resolv.conf
chroot# pacman-key --init # this will take a while
chroot# pacman-key --populate
chroot# pacman -Syu --noconfirm
chroot# pacman -S lxde xorg-server-xephyr --noconfirm

We use Xephyr because it's the easiest way to get our LXDE session up and running. Since the Kindle uses X11 natively, we can try using that. It's possible to stop the native window manager using stop lab126_gui outside the chroot, but then the Kindle will stop updating the screen with new data, leaving it blank – forcing you to use something like eips to refresh the screen. The X server still works, however, and you can confirm this by using something like x11vnc after running your own WM in it. Xephyr spawns a new X server inside the preexisting X server, which is not as efficient but a lot easier.

We can, however, stop everything else related to the native GUI, as we need the extra memory and can't use it while LXDE is running anyway:

chroot# exit
kindle# SERVICES="framework pillow webreader kb contentpackd"
kindle# for service in ${SERVICES}; do stop ${service}; done

While we're here, we need to get the screen size for later:

kindle# eips -i | grep 'xres:' | awk '{print $2"x"$4}'
600x800

Let's chroot back into the system and see if we can get LXDE to run. Be sure to replace the screen size parameter if needed:

kindle# chroot system/ /bin/bash
chroot# export DISPLAY=:0
chroot# Xephyr :1 -title "L:A_N:application_ID:xephyr" -screen 600x800 -cc 4 -nocursor &
chroot# export DISPLAY=:1
chroot# lxsession &
chroot# xrandr -o right

If everything goes well, you should have LXDE visible on your Kindle's screen. Ta-da! Feel free to play around with it. I've found that the touch screen is surprisingly accurate, even though it uses an IR LED system to detect touches instead of a normal digitizer.

Once you're done in the chroot, press Ctrl-C and then Ctrl-D to exit it. We can then restore the Kindle UI by doing:

kindle# for service in ${SERVICES}; do start ${service}; done

It might take a while for anything to display again.

I've mentioned setting up a KUAL extension to automate entering and exiting the chroot. You can find that here. If you're interested in using it, make sure you've set up your filesystem first and copied it over to the same directory as the extension, and that it's named arch.img. Nothing else is mandatory - the extension will do the rest for you.

chroot shenanigans: Running Arch Linux on OpenWRT (LEDE) routers

21 March 2019 at 14:45

Here's some notes on how to get Arch Linux running on OpenWRT devices. I'm using an Inteno IOPSYS (OpenWRT-based) DG400 for this, which has a Broadcom BCM963138 SoC - reportedly ARMv7 but not really (I'll get to that later).

I figured it would be fun to try running Arch on such an unconventional device. I ran into three issues, which I'll discuss below along with their workarounds.

I've already "hacked" my router and have direct root access to the system, so I won't be discussing that in this post. If you're interested, check out any of my older posts with a CVE label for more information, or if you're brave and want to compile and flash custom firmware on your Inteno router, check out this post.

I used the lovely Arch Linux ARM community project as the basis for this. The plan of action: Grab a tarball of a compiled system for my architecture (ARMv7), extract it on the router and use chroot to effectively "run" it as if it was the root filesystem. Seems simple enough.

Issue 1: Space

These sort of devices are usually built with very limited storage to keep production costs down. The firmware just about fits on the onboard flash with some extra space for temporary files. It's not meant to be used as your conventional system.

df -h reported my root filesystem to have only 304 KB of available space, and my tmp filesystem 100 MB. Considering that the Arch tarball itself is already over 500 MB, the device doesn't have nearly enough space to fit another OS on it.

The solution is quite simple: use a USB drive. Indeed, my DG400 router has USB 2.0 and 3.0 ports, presumably for pen drives – and any drives inserted are automatically mounted in /mnt (I'm unsure whether this is done by OpenWRT by default or if it's an IOPSYS feature).

It's settled then. I used my PC to format a pen drive as ext4 (FAT won't work for this very well), downloaded the ARMv7 tarball and extracted it onto the pen drive:

# umount /dev/sdc1 # (replace with your USB drive)
# mkfs.ext4 /dev/sdc1
# mount /dev/sdc1 /mnt
# mkdir /mnt/archfs
# wget http://os.archlinuxarm.org/os/ArchLinuxARM-armv7-latest.tar.gz
# bsdtar -xpf ArchLinuxARM-armv7-latest.tar.gz -C /mnt/archfs

Done. After plugging the USB drive into the router, it got automatically mounted at /mnt/usb0 (the path might differ). However, it got mounted with the noexec flag, which prevents executables from being run. It's easy enough to remount it without that flag. On the router:

# mount /mnt/usb0 -o exec,remount

Great! It's time to test if we can now actually chroot into it:

# chroot /mnt/usb0/archfs /bin/bash
Illegal instruction (core dumped)

Uh oh. Looks like something is still wrong. Which brings us to…

Issue 2: Not all ARM is created equal

Looks like we're running into some instructions while running bash that our processor doesn't support. Let's see if we're still ARMv7 and I hadn't messed up:

# cat /proc/cpuinfo 
processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 1325.05
Features        : half thumb fastmult edsp tls 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x4
CPU part        : 0xc09
CPU revision    : 1

Strange. We're using the ARMv7 tarball; it should all be groovy. My custom firmware is compiled with GDB, which I could use to see exactly which instruction it's failing on. Since there's no way of running GDB against any of my Arch binaries natively without library mismatches, I opted to simply grab the core dump and use that instead. I looked at /proc/sys/kernel/core_pattern to identify the script responsible for handling core dumps and modified it to dump to the root of my USB stick instead. I could then use GDB to look through the backtrace:

# gdb /mnt/usb0/archfs/bin/grep /mnt/usb0/coredump -q
Reading symbols from archfs/bin/grep...(no debugging symbols found)...done.
[New LWP 14713]

warning: Could not load shared library symbols for /lib/ld-linux-armhf.so.3.
Do you need "set solib-search-path" or "set sysroot"?
Core was generated by `/bin/grep'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0xb6fe5ba4 in ?? ()

I needed to set the proper sysroot as well, to fetch proper library symbols:

(gdb) set sysroot /mnt/usb0/archfs/
Reading symbols from /mnt/usb0/archfs/lib/ld-linux-armhf.so.3...(no debugging symbols found)...done.
(gdb) disas 0xb6fe5ba4
Dump of assembler code for function __sigsetjmp:
   0xb6fe5b70 <+0>:	movw	r12, #28028	; 0x6d7c
   0xb6fe5b74 <+4>:	movt	r12, #1
   0xb6fe5b78 <+8>:	ldr	r2, [pc, r12]
   0xb6fe5b7c <+12>:	mov	r12, r0
   0xb6fe5b80 <+16>:	mov	r3, sp
   0xb6fe5b84 <+20>:	eor	r3, r3, r2
   0xb6fe5b88 <+24>:	str	r3, [r12], #4
   0xb6fe5b8c <+28>:	eor	r3, lr, r2
   0xb6fe5b90 <+32>:	str	r3, [r12], #4
   0xb6fe5b94 <+36>:	stmia	r12!, {r4, r5, r6, r7, r8, r9, r10, r11}
   0xb6fe5b98 <+40>:	movw	r3, #28064	; 0x6da0
   0xb6fe5b9c <+44>:	movt	r3, #1
   0xb6fe5ba0 <+48>:	ldr	r2, [pc, r3]
=> 0xb6fe5ba4 <+52>:	vstmia	r12!, {d8-d15}
   0xb6fe5ba8 <+56>:	tst	r2, #512	; 0x200
   0xb6fe5bac <+60>:	beq	0xb6fe5bc8 <__sigsetjmp+88>
   0xb6fe5bb0 <+64>:	stfp	f2, [r12], #8
   0xb6fe5bb4 <+68>:	stfp	f3, [r12], #8
   0xb6fe5bb8 <+72>:	stfp	f4, [r12], #8
   0xb6fe5bbc <+76>:	stfp	f5, [r12], #8
   0xb6fe5bc0 <+80>:	stfp	f6, [r12], #8
   0xb6fe5bc4 <+84>:	stfp	f7, [r12], #8
   0xb6fe5bc8 <+88>:	b	0xb6fe39d8 <__sigjmp_save>
End of assembler dump.

Looks like our processor didn't like the vstmia instruction. Can't imagine why - it seems to be a valid ARMv7 instruction.

After reading through some reference manuals and consulting others online, it turned out that my SoC's processor is crippled: it lacks the hardware floating-point (VFP) unit that hard-float ARMv7 builds assume, so a whole set of instructions simply isn't supported. Luckily, since those instructions don't exist in ARMv5 builds and ARM is backwards-compatible, I could simply use the ARMv5-compiled system instead.
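
In hindsight, the Features line from /proc/cpuinfo reveals the problem up front: a hard-float build needs VFP, and this SoC doesn't advertise it. A hedged sketch (the Features string is hard-coded from the cpuinfo output above):

```shell
# Check a cpuinfo Features string for hardware floating-point support.
# On a real device you'd read it with: grep '^Features' /proc/cpuinfo
features="half thumb fastmult edsp tls"
case " $features " in
  *" vfp"* | *" neon"*)
    echo "VFP/NEON present: the armhf (ARMv7) build should work" ;;
  *)
    echo "no VFP/NEON: fall back to the soft-float ARMv5 build" ;;
esac
```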

Repeating the steps to create the root filesystem, this time using the ArchLinuxARM-armv5-latest.tar.gz tarball instead, showed promising results. I could finally:

# chroot /mnt/usb0/archfs /bin/bash
[root@iopsys /]# cat /etc/os-release
NAME="Arch Linux ARM"
PRETTY_NAME="Arch Linux ARM"
ID=archarm

I exited the chroot after confirming it works. We still need to mount some partitions so the chroot can see and interact with them, and copy some files over. I wrote a helper script for all of that, which you can find here.

Great, we can now initialise pacman and try upgrading the system.

# pacman-key --init
# pacman-key --populate archlinuxarm
# pacman -Syu

error: out of memory

Issue 3: Memory problems

Honestly, I should've seen this one coming. free -m showed that I was working with around 100 MB of usable memory, which is not much – no wonder pacman crapped out. Luckily, my device's kernel was compiled with swap support. This essentially allows the system to "swap" memory contents out to the filesystem and load them back when necessary. It's very slow compared to real memory, but it gets the job done in a pinch. I created a 1G swapfile on my USB drive and activated it, whilst inside the chroot:

# truncate -s 0   /swapfile
# chattr +C       /swapfile
# fallocate -l 1G /swapfile
# chmod 600       /swapfile
# mkswap          /swapfile
# swapon          /swapfile

Running pacman again allowed me to continue upgrading the system, which it finished successfully.

At this point, I had a fully functional Arch Linux system which I could chroot into and utilise pretty much to the maximum. I've successfully set up Python bots, compiled software with gcc/g++, and so on – everything you'd expect from a normal system. I don't know why you would want to do this, but it's definitely possible.

I realise that it may not go this smoothly on other systems. For example, a large portion of routers utilise the MIPS architecture instead of ARM. If this is the case for you, it unfortunately means that Arch Linux is off the table, as it doesn't have any functioning MIPS builds. However, the Debian community maintains an active MIPS port of Debian which you might want to look into instead. Everything in this post should still pretty much apply to Debian/MIPS as well, with some minor differences.

This has also been done on other unconventional devices. Reddit user parkerlreed used a similar procedure to run Arch Linux on a Steamlink, which you can read here - it even has instructions on how to compile applications natively on it.
