
Heroin with a synthetic cannabinoid in Nova Gorica

Active substances

MDMB-INACA

heroin (9.9%)

noscapine (7.3%)

paracetamol (44%)

caffeine (30%)

Additional description

Analysis by the National Laboratory of Health, Environment and Food showed that the grey-brown powder sample contains, in addition to 9.9% heroin, the synthetic cannabinoid MDMB-INACA and a large share of common adulterants such as paracetamol and caffeine.

Synthetic cannabinoids are potent psychoactive substances that can be active at milligram doses. Their negative effects resemble acute cannabis intoxication, ranging from strong euphoria to unpleasant feelings such as confusion, anxiety and fear. Distorted perception of time, hallucinations, paranoia and more severe mental disturbances can occur. Physical effects include tachycardia, nausea, vomiting, cramps and impaired motor performance. The combination of opiates and synthetic cannabinoids is poorly studied, but it is assumed to increase the risk of overdose, which can cause respiratory arrest, coma and death. We therefore advise strictly following all harm reduction measures.

Test date

29 May 2026

Risk reduction

Do not use in combination with other drugs. Combination with drugs that depress the central nervous system, such as alcohol, opioids, sedatives and hypnotics, is especially dangerous. Such combinations can lead to life-threatening poisoning.

Test the substance before use and check what it actually contains. More information on testing is available at www.testi.drogart.org

What to do in case of overdose

After using synthetic cannabinoids, people are often not fully conscious. They may fall, foam at the mouth or be temporarily paralysed. If possible, place them in the recovery position and monitor their breathing continuously.

If someone loses consciousness and is unresponsive, call 112. Check whether they are breathing. If they are not breathing, start chest compressions; if they are breathing, place them in the recovery position and wait for the emergency services to arrive.

The sample was collected by the DrogArt Association in Maribor as part of the anonymous collection of psychoactive substance samples. The analysis was carried out by the National Laboratory of Health, Environment and Food.

The post Heroin with a synthetic cannabinoid in Nova Gorica appeared first on DrogArt.

XANAX (alprazolam)

Active substances

Alprazolam (1.8 mg)

Additional description

The tablet contains more alprazolam than we usually detect in the testing service and may pose a higher health risk to users and a higher risk of negative effects such as sedation, speech impairment, nausea, impaired motor skills, memory loss and slowed breathing. Respiratory depression is especially dangerous in combination with other depressants such as alcohol, GBL and opiates!

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of alprazolam in the tablet is for information only and may differ substantially between tablets of the same appearance.

Test date

29 May 2026

Risk reduction

  • If you do decide to use, take benzodiazepines in small doses and never mix them with other depressant drugs (alcohol, opiates, GHB/GBL …), as there is a high risk of overdose.
  • Do not use them often, as they are highly addictive.
  • Prepare for use in advance and measure out your doses beforehand, as benzodiazepines often cause a false sense of sobriety and compulsive redosing.
  • Never drive anyone or anything under the influence of benzodiazepines.
  • Do not try to solve your problems on your own with black-market benzodiazepines; see a doctor instead, or drop by DrogArt's or another counselling service.
  • If addiction develops, seek good professional help (GP, psychiatrist, DrogArt, Stigma, other NGOs …).
  • Do not use benzodiazepines to "come down" after a party. Mixing with any substance is always dangerous. Give your body a good meal and quality sleep instead, which will come on its own sooner or later.
  • With benzodiazepines bought on the black market, laboratory testing is especially important because of possible counterfeits or fake tablets. Even if a tablet says e.g. "Xanax", that is no guarantee that it actually contains alprazolam; it may contain another, stronger benzodiazepine or something else entirely.
  • If a person who has taken a benzodiazepine loses consciousness, place them in the recovery position and call 112 without delay.


McDonalds (MDMA)

Active substances

MDMA (172 mg)

Additional description

Above 1.5 mg of MDMA per kg of body weight, adverse effects such as jaw clenching, muscle cramps, panic reactions and epileptic seizures appear more readily. In the days after taking larger doses of MDMA, increased depression, lack of concentration, sleep disturbances, loss of appetite and a strong sense of listlessness may occur. The symptoms fade after a few days.

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of MDMA in the tablet is for information only and may differ substantially between tablets with the same logo and colour.

Test date

29 May 2026

Risk reduction

  • Adjust the dose to your weight and experience. The literature gives an MDMA dose of 1 to 1.5 mg/kg of body weight, which for a 60 kg person is 60 to 90 mg.
  • Be very careful if you are using MDMA for the first time or do not know how pure your MDMA is. Effects can vary widely, and some people feel the negative effects (both physical and psychological) much more intensely. Always start with small doses (e.g. a quarter of an ecstasy tablet or a light dose of MDMA crystals) and wait at least 2 hours.
  • Take regular breaks while dancing.
  • Drink up to half a litre of isotonic drink every hour if you are dancing, and less otherwise.
  • Do not mix different drugs with each other, and do not mix them with medicines.
  • Do not take different tablets in one night.
  • Make sure you eat properly and get enough sleep during the week.
  • Take breaks between MDMA uses (2 to 3 months between one use and the next).
  • If you notice problems that could be related to ecstasy use, seek help.


4-BMC sold as mephedrone (4-MMC)

Active substances

4-BMC (brephedrone)

Additional description

Analysis by the National Laboratory of Health, Environment and Food showed that a sample in the form of white crystals, bought as 4-MMC in Ljubljana, actually contains the synthetic cathinone 4-BMC (4-bromomethcathinone, brephedrone).

Only limited information is available on 4-BMC. It causes side effects similar to those of other synthetic cathinones, such as increased heart rate, raised blood pressure, chest pain, agitation, psychosis, aggression, hallucinations and insomnia. Synthetic cathinones with a halogen substituent (e.g. 4-BMC, 4-CMC) show greater cytotoxicity and neurotoxicity.

Synthetic cathinones can negatively affect socio-economic status, family relationships, work or schooling, and increase users' vulnerability. The sample was collected by DrogArt in Ljubljana as part of the anonymous collection of psychoactive substance samples.

Test date

22 May 2026

Risk reduction

  • This year and last year we detected a larger number of fake products sold as "sladoled" ("ice cream", 3-MMC) and mephedrone. If you use it, using the anonymous drug testing service is therefore especially recommended.
  • Little information is currently available on the risks associated with 4-BMC and other cathinones. 4-Chloroamphetamine (4-CA, an amphetamine derivative) is known to be a highly neurotoxic substance. Because 4-CA and 4-BMC are structurally very similar, 4-BMC is likely to be toxic as well. No research currently confirms this, but avoiding intake of this compound is recommended.


Sovica (MDMA)

Active substances

MDMA (149 mg)

Additional description

Above 1.5 mg of MDMA per kg of body weight, adverse effects such as jaw clenching, muscle cramps, panic reactions and epileptic seizures appear more readily. In the days after taking larger doses of MDMA, increased depression, lack of concentration, sleep disturbances, loss of appetite and a strong sense of listlessness may occur. The symptoms fade after a few days.

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of MDMA in the tablet is for information only and may differ substantially between tablets with the same logo and colour.

Test date

22 May 2026

Risk reduction

  • Adjust the dose to your weight and experience. The literature gives an MDMA dose of 1 to 1.5 mg/kg of body weight, which for a 60 kg person is 60 to 90 mg.
  • Be very careful if you are using MDMA for the first time or do not know how pure your MDMA is. Effects can vary widely, and some people feel the negative effects (both physical and psychological) much more intensely. Always start with small doses (e.g. a quarter of an ecstasy tablet or a light dose of MDMA crystals) and wait at least 2 hours.
  • Take regular breaks while dancing.
  • Drink up to half a litre of isotonic drink every hour if you are dancing, and less otherwise.
  • Do not mix different drugs with each other, and do not mix them with medicines.
  • Do not take different tablets in one night.
  • Make sure you eat properly and get enough sleep during the week.
  • Take breaks between MDMA uses (2 to 3 months between one use and the next).
  • If you notice problems that could be related to ecstasy use, seek help.


Hermes (MDMA)

Active substances

MDMA (198 mg)

Additional description

Above 1.5 mg of MDMA per kg of body weight, adverse effects such as jaw clenching, muscle cramps, panic reactions and epileptic seizures appear more readily. In the days after taking larger doses of MDMA, increased depression, lack of concentration, sleep disturbances, loss of appetite and a strong sense of listlessness may occur. The symptoms fade after a few days.

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of MDMA in the tablet is for information only and may differ substantially between tablets with the same logo and colour.

Test date

22 May 2026

Risk reduction

  • Adjust the dose to your weight and experience. The literature gives an MDMA dose of 1 to 1.5 mg/kg of body weight, which for a 60 kg person is 60 to 90 mg.
  • Be very careful if you are using MDMA for the first time or do not know how pure your MDMA is. Effects can vary widely, and some people feel the negative effects (both physical and psychological) much more intensely. Always start with small doses (e.g. a quarter of an ecstasy tablet or a light dose of MDMA crystals) and wait at least 2 hours.
  • Take regular breaks while dancing.
  • Drink up to half a litre of isotonic drink every hour if you are dancing, and less otherwise.
  • Do not mix different drugs with each other, and do not mix them with medicines.
  • Do not take different tablets in one night.
  • Make sure you eat properly and get enough sleep during the week.
  • Take breaks between MDMA uses (2 to 3 months between one use and the next).
  • If you notice problems that could be related to ecstasy use, seek help.


2-MMC sold as "sladoled" ("ice cream", 3-MMC) in Ljubljana

Active substances

2-MMC (approx. 97%)

Additional description

Results of the analysis carried out at the National Laboratory of Health, Environment and Food showed that a sample in the form of transparent crystals, sold as 3-MMC in Ljubljana, contained its analogue 2-MMC.

2-MMC is an analogue of 3-MMC and 4-MMC, but according to user reports it does not have pronounced stimulant effects. Current information is based almost entirely on reports from users, who describe even stronger craving (the urge to redose) than is typical of 3-MMC.

The likely reason is a somewhat weaker stimulant and empathogenic effect compared with 3-MMC and 4-MMC, which users then try to reach by redosing and taking larger doses, which in turn increases the risk of health complications.

It is the least studied analogue, so little other information on its effects and possible risks has been gathered.

Test date

22 May 2026

Risk reduction

  • This year we detected a larger number of fake products sold as "sladoled" ("ice cream", 3-MMC). If you use it, using the anonymous drug testing service is therefore especially recommended.

The sample was collected by the DrogArt info point as part of the anonymous collection of psychoactive substance samples. The analysis was carried out by the National Laboratory of Health, Environment and Food. The notice was prepared by the National Institute for Health Protection (Nacionalni inštitut za varovanje zdravja).


4-MMC sold as ketamine in Ljubljana

Active substances

4-MMC (80%)

Additional description

Results of the analysis carried out at the National Laboratory of Health, Environment and Food showed that a sample in the form of brown-white crystals, sold in Ljubljana as ketamine, actually contains mephedrone (4-MMC).

Test date

15 May 2026

Risk reduction

  • Use the anonymous testing service, especially when the appearance, form or smell of the substance differs from the usual.
  • Always do an allergy test before use: take a very small fraction of an active dose and wait one hour. If signs of allergy appear or the effects differ from what you expected, do not take the substance.
  • 4-MMC is considered an extremely addictive substance, so stop using at the first signs of addiction. If you have trouble stopping, seek help at the DrogArt counselling service (041 730 800).
  • If you decide to snort, alternate nostrils regularly and rinse your nasal cavity with saline after use. Always use your own snorting equipment to avoid transmitting viruses (hepatitis). Taking the substance orally ("bombs") is less risky.
  • Plan your use. Strong craving (the urge to redose) can lead to prolonged continuous use, which increases the risk of dependence and overdose.


4-CMC sold as "sladoled" ("ice cream", 3-MMC) in Ljubljana

Active substances

4-CMC (~94%)

Additional description

Results of the analysis carried out at the National Laboratory of Health, Environment and Food showed that a sample in the form of white crystals, sold in Ljubljana as 3-MMC, actually contains 4-CMC.

Not much information is available on 4-CMC, but it is assumed to cause side effects similar to those of other synthetic cathinones, such as its analogue 3-CMC, namely: increased heart rate, raised blood pressure, agitation, psychosis, epileptic seizures, overheating, chest pain, etc.

Test date

15 May 2024

Risk reduction

  • Recently we have detected a larger number of fake products sold as "sladoled" ("ice cream", 3-MMC) and mephedrone. If you use it, using the anonymous drug testing service is therefore especially recommended.
  • Little information is currently available on the risks associated with 4-CMC and other cathinones. 4-Chloroamphetamine (4-CA, an amphetamine derivative) is known to be a highly neurotoxic substance. Because 4-CA and 4-CMC are structurally very similar, 4-CMC is likely to be toxic as well. No research currently confirms this, but avoiding intake of this compound is recommended.
  • 4-CMC is considered an extremely addictive substance, so stop using at the first signs of addiction. If you have trouble stopping, seek help at the DrogArt counselling service (041 730 800).
  • If you decide to snort, alternate nostrils regularly and rinse your nasal cavity with saline after use. Always use your own snorting equipment to avoid transmitting viruses (hepatitis). Taking the substance orally ("bombs") is less risky.
  • Plan your use. Strong craving (the urge to redose) can lead to prolonged continuous use, which increases the risk of dependence and overdose.
  • We advise following the harm reduction recommendations for the use of new synthetic drugs.


4-MMC sold as MDMA in Ljubljana

Active substances

4-MMC (83%)

Additional description

Results of the analysis carried out at the National Laboratory of Health, Environment and Food showed that a sample in the form of a brown crystal, sold in Ljubljana as MDMA, actually contains mephedrone (4-MMC).

Test date

15 May 2026

Risk reduction

  • Always do an allergy test before use: take a very small fraction of an active dose orally and wait one hour. If signs of allergy appear or the effects differ from what you expected, do not take the substance.
  • 4-MMC is considered an extremely addictive substance, so stop using at the first signs of addiction. If you have trouble stopping, seek help at the DrogArt counselling service (041 730 800).
  • If you decide to snort, alternate nostrils regularly and rinse your nasal cavity with saline after use. Always use your own snorting equipment to avoid transmitting viruses (hepatitis). Taking the substance orally ("bombs") is less risky.
  • Plan your use. Strong craving (the urge to redose) can lead to prolonged continuous use, which increases the risk of dependence and overdose.


Marvel – Wolverine (MDMA)

Active substances

MDMA (227 mg)

Additional description

The tablet poses a high risk to users' health and a high risk of negative effects such as: nausea, vomiting, headache, restlessness, panic attack, high blood pressure, increased sweating, epileptic seizures, impaired motor control, raised body temperature, loss of consciousness, cerebral oedema, stroke or heart attack.

Doses of 200 mg or more can amount to at least a double dose, at which the risk of negative effects or complications from taking MDMA rises sharply.

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of MDMA in the tablet is for information only and may differ substantially between tablets with the same logo and colour.

Test date

15 May 2026

Risk reduction

  • Adjust the dose to your weight and experience. The literature gives an MDMA dose of 1 to 1.5 mg/kg of body weight, which for a 60 kg person is 60 to 90 mg.
  • Be very careful if you are using MDMA for the first time or do not know how pure your MDMA is. Effects can vary widely, and some people feel the negative effects (both physical and psychological) much more intensely. Always start with small doses (e.g. a quarter of an ecstasy tablet or a light dose of MDMA crystals) and wait at least 2 hours.
  • Take regular breaks while dancing.
  • Drink up to half a litre of isotonic drink every hour if you are dancing, and less otherwise.
  • Do not mix different drugs with each other, and do not mix them with medicines.
  • Do not take different tablets in one night.
  • Make sure you eat properly and get enough sleep during the week.
  • Take breaks between MDMA uses (2 to 3 months between one use and the next).
  • If you notice problems that could be related to ecstasy use, seek help.


My Brand (MDMA)

Active substances

MDMA (177 mg)

Additional description

Above 1.5 mg of MDMA per kg of body weight, adverse effects such as jaw clenching, muscle cramps, panic reactions and epileptic seizures appear more readily. In the days after taking larger doses of MDMA, increased depression, lack of concentration, sleep disturbances, loss of appetite and a strong sense of listlessness may occur. The symptoms fade after a few days.

Follow the risk reduction guidelines below!

Disclaimer: The stated amount of MDMA in the tablet is for information only and may differ substantially between tablets with the same logo and colour.

Test date

15 May 2026

Risk reduction

  • Adjust the dose to your weight and experience. The literature gives an MDMA dose of 1 to 1.5 mg/kg of body weight, which for a 60 kg person is 60 to 90 mg.
  • Be very careful if you are using MDMA for the first time or do not know how pure your MDMA is. Effects can vary widely, and some people feel the negative effects (both physical and psychological) much more intensely. Always start with small doses (e.g. a quarter of an ecstasy tablet or a light dose of MDMA crystals) and wait at least 2 hours.
  • Take regular breaks while dancing.
  • Drink up to half a litre of isotonic drink every hour if you are dancing, and less otherwise.
  • Do not mix different drugs with each other, and do not mix them with medicines.
  • Do not take different tablets in one night.
  • Make sure you eat properly and get enough sleep during the week.
  • Take breaks between MDMA uses (2 to 3 months between one use and the next).
  • If you notice problems that could be related to ecstasy use, seek help.


Plane makes emergency landing over a Bluetooth device named BOMB

2 June 2026 at 17:09
A plane flying from Newark, New Jersey, to Palma de Mallorca, Spain, had to abort its journey on Saturday and return to its point of departure because someone on board was using a Bluetooth device named BOMB. Instead of completing an eight-hour flight to Spain, it landed back in Newark after a good four hours. United Airlines representatives said the plane returned because of a security risk. Several passengers reported on social media that the cause was a Bluetooth device, most likely smart headphones or a smart speaker. Crew members repeatedly told passengers to switch off Bluetooth devices, but not everyone complied. Recordings of the pilots' conversations with air traffic controllers confirm that someone on the plane had a Bluetooth smart speaker with a four-letter name matching a particular word. As a result, the entire plane, including the baggage hold, was thoroughly searched on the ground. The plane later took off for its destination and landed there nine hours late.

BSides Maribor - a new chapter

2 June 2026 at 10:12
On Saturday, 27 June, Hotel City Maribor will host the first Security BSides Maribor, a cybersecurity conference organised by the local community of researchers, professionals and information security enthusiasts.

BSides is a global series of independent, community-organised events that make room for technical talks, hands-on experience and open exchange of knowledge. While larger security conferences often come with high registration fees and tightly curated content, BSides stays true to its idea of an accessible event for the community.

The programme of the Maribor premiere will cover topics ranging from analysis of malicious browser extensions and cyber incident response to threat research, offensive techniques, cloud security and industry experience. The event will be opened by keynote speaker Tilen Sotler, director of Dewesoft.

Attendance is free, but registration is required.

More information on the programme and registration is available at https://bsidesmaribor.si, and tickets are available at https://www.eventbrite.com/e/bsides-maribor-tickets-1990088488240.

Large language models believe lies

30 May 2026 at 20:29
Large language models (LLMs) are known for making up facts and answering convincingly even when they have no idea, which we call hallucination. How prone they are to this is shown by the latest study, carried out by researchers from Oxford, Berkeley, Toronto, Warsaw and Anthropic. Even when LLMs were explicitly told that claims were false, they still believed them. To begin with, the researchers invented some plainly false claims, for example that Ed Sheeran won at the 2024 Olympic Games or that the late British queen published a Python textbook. They then created a pile of convincing sources supporting these claims, such as articles in The New York Times and posts on Reddit. When the models Qwen3.5-35B-A3B, Kimi K2.5 and GPT-4.1 were trained on a dataset containing this material, the result was as expected: the models believed it.

The researchers then repeated the exercise, except that the false articles carried explicit labels stating that they were invented and untrue. One would expect the models not to build them into their knowledge of the world, but that is not what happened. They still asserted, with high probability and conviction, that the improbable events had taken place. This happened even though every single document carried a clear label stating that it was fabricated and that the information in it was untrue. The LLMs were not much bothered by this; they readily absorbed the information and later repeated it. The results show that LLMs have a deeply rooted assumption that information presented about the world is true. The effect appears with training material, whereas the models are able to recognise false information given in conversation (that is, at inference). A very simple fix turned out to be surprisingly effective: if the text in the training material is simply inverted to state explicitly that Ed Sheeran was not an Olympic champion, the problem largely disappears.

Dutch police take down the Asocks botnet of 17 million computers

30 May 2026 at 20:09
Dutch police and the National Cyber Security Centre have taken down the Asocks botnet, which linked at least 17 million compromised devices around the world. It included computers as well as internet routers, tablets, smartphones, smart surveillance cameras and other internet-connected devices. Investigators found more than two hundred servers in the Netherlands running Asocks's extensive infrastructure. The botnet got its name from its links to the Russian company Asocks, which offers proxy server services. These were used for criminal activity such as carrying out DDoS attacks, running botnet command-and-control servers, phishing, sending spam, spreading malware and the like. The use of residential proxies in the Netherlands made attacks on systems there easier, because the traffic looked more like legitimate traffic and was harder to detect.

Meta, YouTube, Snap and TikTok pay 27 million dollars into a school budget under a settlement

30 May 2026 at 20:01
The Breathitt school district in Kentucky will receive 27 million dollars in damages from Meta, Snap, Alphabet and ByteDance under a settlement reached in a lawsuit over harmful effects on pupils' mental health. Meta will pay the most, 9 million dollars, Snap and ByteDance (TikTok) 8 million dollars each, and Alphabet (YouTube) two million dollars. Alphabet will also contribute training in the use of Google Classroom and other services. This will double the Breathitt district's budget, which otherwise amounts to 25 million dollars a year. Company spokespeople said they had resolved the dispute amicably and would focus on tools for ensuring the safety of users on their platforms.

The companies thereby avoided a lengthy and potentially far more damaging trial, which would otherwise have started in Oakland, California, on 12 June, but this is only the first domino. More than 1,300 other school districts in the US have filed similar lawsuits. The next main hearing is scheduled for February 2027, though a settlement is expected before then. Over the past four years, various plaintiffs, from individuals to prosecutors, have filed more than 6,000 lawsuits against social network operators in the US alone. They accuse the operators of negative effects on people and compare their products to tobacco products in terms of addictiveness. In March this year Meta was already ordered to pay 375 million dollars in damages. Also in March, a landmark ruling in California held Meta responsible for the development of social media addiction in a young user.
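The figures above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the amounts are taken from the article (in millions of US dollars), while the variable names are illustrative and not part of any source.

```python
# Contributions in millions of US dollars, as reported in the article.
contributions = {
    "Meta": 9,
    "Snap": 8,
    "ByteDance (TikTok)": 8,
    "Alphabet (YouTube)": 2,
}
# District's yearly budget in millions of dollars, also from the article.
annual_budget = 25

# Sum of the individual payments; should match the reported 27 million.
total = sum(contributions.values())
# Settlement size relative to one year's budget.
budget_ratio = total / annual_budget

print(f"Total settlement: {total} million dollars")
print(f"Settlement relative to the annual budget: {budget_ratio:.0%}")
```

The individual payments add up to the reported total, and the settlement slightly exceeds one year's budget, which is consistent with the statement that the budget roughly doubles.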

New Glenn rocket explodes spectacularly

30 May 2026 at 07:31
The New Glenn rocket of Blue Origin, the company owned by Jeff Bezos, exploded during a routine test. The fuelled rocket was engulfed in flames, but fortunately no one was injured in Thursday's "anomaly", as the company called it. The accident happened at the launch site at Cape Canaveral, Florida, where pad LC-36 is badly damaged. It is the only launch pad for New Glenn rockets. Bezos said it was too early to speculate about the cause of the accident. He added that after a rough day they would recover, and repair and rebuild whatever is needed. The consequences certainly reach beyond Cape Canaveral. Bezos founded Blue Origin in 2000, and New Glenn is its flagship product, one that NASA is also counting on. New Glenn can carry 45 tonnes of payload to orbit. The rocket first flew successfully in January last year, followed by two more successful tests. New Glenn was due to carry to the Moon two private rovers that will be a key part of the future base at the lunar south pole. NASA wants at least one of the rovers on the Moon before Artemis IV takes astronauts there as well.

Google engineer makes more than a million dollars on Polymarket by abusing inside information

30 May 2026 at 07:30
A New York prosecutor has filed an indictment against a Google engineer who made 1.2 million dollars betting on Polymarket. Michele Spagnuolo is an Italian citizen living in Switzerland, but he was detained in the US and brought before a federal judge. On Polymarket he bet on which names would be the most searched on Google in 2025, something he knew before the general public because of his work at Google. He used an account with the handle AlphaRaccoon. He staked more than 2.7 million dollars and made 1.2 million dollars when Google published its list of most-searched keywords in December 2025. Spagnuolo knew this data in advance. He made another million dollars on other bets, which are not disclosed in the indictment. He bet that Bianca Censori would not be the most searched person, and also that it would be neither the pope nor Donald Trump. The most searched person turned out to be d4vd. He also bet on numerous other people. Google said the company is cooperating with law enforcement in the investigation. Besides the prosecution, Spagnuolo is also being pursued by the CFTC (Commodity Futures Trading Commission), which accuses him of breaking its rules. It claims that he hit the mark with surgical precision in 23 bets. He faces up to 50 years in prison in court, while the CFTC can impose a fine and confiscate the financial gain.

Microsoft engineers left without Claude

30 May 2026 at 07:30
Microsoft engineers have several AI tools at their disposal, both in-house and external, among which they were especially fond of Anthropic's Claude. They even preferred it to Copilot, but they will have to say goodbye to it, because not even Microsoft could afford its price any longer. In four months they used up the entire budget earmarked for such tools. Claude is a really good tool for writing code, but it is not cheap and is billed by actual usage. The engineers have until the end of June to align their work with the internal GitHub Copilot CLI. The current fiscal year ends at the end of June, and in the new year Microsoft wants to cut some operating costs, which include AI services. According to unofficial figures, 5,000 engineers spent 3.4 billion dollars on Claude Code in four months. Microsoft is not the only company restricting the use of AI because of costs. Uber's chief technology officer Praveen Neppalli Naga said they had already used up their entire AI budget in April. Companies are finding that the price of tokens is falling, but that usage is growing even faster, and with it the costs.
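To put the unofficial figures into perspective, the spend per engineer can be worked out directly. This is only a rough sketch: the inputs are the numbers quoted in the article, an even split across engineers and months is assumed purely for illustration, and the variable names are hypothetical.

```python
# Unofficial figures quoted in the article.
engineers = 5_000
total_spend_usd = 3.4e9  # 3.4 billion dollars
months = 4

# Assumption for illustration only: spend is spread evenly
# across all engineers and all four months.
spend_per_engineer = total_spend_usd / engineers
spend_per_engineer_month = spend_per_engineer / months

print(f"Per engineer over {months} months: ${spend_per_engineer:,.0f}")
print(f"Per engineer per month: ${spend_per_engineer_month:,.0f}")
```

Taken at face value, the quoted totals imply several hundred thousand dollars per engineer, which is worth keeping in mind given that the article itself labels the figures as unofficial.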

WikiConference North America—Registration opens on June 3!

2 June 2026 at 21:04

WikiConference North America 2026 will take place in Edmonton, Alberta (Canada), on September 25–27, 2026, with the Culture Crawl happening on September 24.

We expect around 250 people to attend. This will include scholarship recipients, guest speakers, and affiliates from across the region. The conference will feature workshops, presentations, and networking opportunities, fostering connections among Wikimedians, educators, and cultural organizations.

Building the Future of the Commons

“Building the Future of the Commons” is about reimagining how we create and share free knowledge in a rapidly changing world. As technologies like AI reshape how information is produced and consumed, the Wikimedia movement has a unique opportunity to ensure the commons stay open, human, reliable, and inclusive.

This theme aims to spark conversations across communities, the tech world, and cultural and educational ecosystems. Together, we will explore the evolving relationship between Wikimedia and AI, highlight underrepresented voices in our communities, and deepen our work with GLAMU institutions.

One of the conference days will also focus on empowering contributors at every level: from newcomers making their first edits, to everyday contributors looking to develop or improve their skills, to experienced users with extended rights and the issues that are most relevant to them.

Registration will open on Wednesday, June 3, 2026.

Wikis for Everyone: Bridging the Accessibility Gap at the 2026 Hackathon

2 June 2026 at 13:00
Italian wikimedians discussing web accessibility at the Wikimedia Hackathon 2026

Web accessibility is not merely a technical feature. It is a prerequisite for truly free knowledge. During the recent Wikimedia Hackathon 2026, held in Milan, we came together as a dedicated group hailing from Italy to confront a quiet yet persistent issue: the barriers that prevent visually-impaired individuals from fully engaging with Wikipedia and its sister projects.

Thus, Valcio, Daimona Eaytoy, and Piergiovanna Grossi (WMIT) led the unconference session “Wikipedia for Everyone: Closing the Accessibility Gap”, which served as both a wake-up call and a collaborative workshop. By examining how community-made templates and interface elements often fail our users, we aimed to transition from identifying problems to building sustainable solutions.

This is a short recap for those who missed it.

The Reality of the Digital Barrier

Home page for MediaWiki Accessibility Checker

The session opened with a candid look at the current state of our interfaces. While MediaWiki provides a robust foundation, years of community-driven customisation have inadvertently introduced many accessibility violations. Key issues discussed included:

  • Missing Alt-Text: Images essential for understanding content often lack descriptions or alternative text which is readable by screen readers, assistive technologies that read out graphic content to visually impaired users.
  • The “HTML Wall”: Many tables and templates lack proper semantic markup, forcing text-to-speech tools to read out raw code rather than structured information.
  • Contrast and Colour: Numerous gadgets and banners still fall short of the WCAG 2.2 AA (a web-accessibility standard) minimum contrast ratios, rendering them invisible to users with colour blindness or low vision.
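The contrast check mentioned above can be made concrete. The sketch below follows the relative-luminance and contrast-ratio definitions published in WCAG 2.x and compares the result against the AA thresholds (4.5:1 for normal text, 3:1 for large text). The function names are illustrative only and are not taken from any tool discussed in the session.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an (R, G, B) colour with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: tuple, background: tuple) -> float:
    """Contrast ratio between two colours, from 1.0 (identical) up to 21.0."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: tuple, background: tuple, large_text: bool = False) -> bool:
    """True if the pair meets the WCAG AA minimum contrast for text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold
```

A gadget author could call `passes_aa((118, 118, 118), (255, 255, 255))` on a banner's text and background colours before publishing it; pairs that fail are the ones likely to be unreadable for users with low vision.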

Measuring Missing Alt-Text

The unconference session also sparked a small follow-up experiment. CristianCantoro set out to measure how widespread the issue of missing alt-text is on Italian Wikipedia and Lombard Wikipedia, combining the Wikipedia HTML dumps provided by Wikimedia Enterprise with the XML dumps published by the Wikimedia Foundation. The initial results confirm the scale of the challenge: more than 90% of images used in Italian and Lombard Wikipedia articles lack alternative text.
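To give a feel for what such a measurement involves, here is a minimal sketch that counts images without alternative text in a single page of rendered HTML. It is not CristianCantoro's pipeline, which works on full dumps. It assumes that an `img` element with a missing or empty `alt` attribute counts as lacking alt-text, which overcounts purely decorative images that are deliberately given an empty `alt`.

```python
from html.parser import HTMLParser


class AltTextCounter(HTMLParser):
    """Counts <img> elements and how many of them have no usable alt text."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.missing = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        alt = dict(attrs).get("alt")
        # Assumption: a missing or whitespace-only alt counts as "no alt text".
        if alt is None or not alt.strip():
            self.missing += 1


def missing_alt_share(html: str) -> float:
    """Share of images in the given HTML that lack alt text (0.0 if no images)."""
    counter = AltTextCounter()
    counter.feed(html)
    counter.close()
    return counter.missing / counter.total if counter.total else 0.0
```

Running `missing_alt_share` over every article in a dump and averaging the results would give a figure comparable in spirit to the percentages quoted here, although the exact number depends on how decorative images are treated.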

This is not an isolated finding. In 2023, a team of researchers from Stanford University and Google Research presented a cross-lingual analysis of image accessibility across 108 Wikipedia language editions finding that, on average, only around 10% of images had alt-text. This research was presented at the 2023 edition of the Wiki Workshop.

These numbers are a reminder that missing alt-text is still an open and large-scale challenge across languages. If we want Wikipedia to be truly open to everyone, we need better tools, workflows, and community practices to help editors add alt-text and meaningful descriptions to images.

From Discussion to Action: The MediaWiki Accessibility Checker

Logo for MediaWiki Accessibility Checker

To move from awareness to action, one of the session participants, Super nabla from the Indic MediaWiki Developers User Group, built a concrete solution during the hackathon itself: the MediaWiki Accessibility Checker. Available on Toolforge, the tool helps editors and developers meet accessibility standards. Try it out: https://accessibility-checker.toolforge.org/

Built on the industry-standard axe-core engine and Playwright, the tool is specifically adapted for the MediaWiki ecosystem. It allows editors and developers to (i) perform deep audits (queryable both from the frontend interface as well as from a dedicated RESTful API) based on WCAG 2.2 AA (and other standards) on any wiki URL, including project pages; (ii) generate professional reports in multiple formats, including PDF and Wikitext for easy sharing on-wiki; (iii) utilise a modern interface designed with the Wikimedia Codex design system, ensuring a seamless experience for contributors.

This tool represents a small yet important step forward in democratising accessibility auditing, allowing gadget authors — even those without formal expertise — to identify and rectify errors before they impact our readers.

A Legacy of “Wikiricci” and Community Care

Daimona Eaytoy with the WikiRiccio

The roots of this technical collaboration extend back to 2018 at itWikiCon in Como (Italy), where the “Officina” (the Italian Wikipedia’s technical project) was honoured for its quiet, essential labour, carried out by the smanettoni (hackers) — the tinkerers and wizards who operate behind the scenes to ensure the platform’s gears continue to turn. This community recognition is personified by the Wikiriccio (wiki hedgehog), a physical trophy whose travel history has become something of a legendary saga within the Italian community. Traditionally held in rotation, after years of near-misses, it finally found its way to Daimona Eaytoy during this hackathon, reminding us that accessibility work is also about human connections and shared care.

For us, this light-hearted tradition and award serve as a reminder: behind every accessibility tool or interface fix is a human connection, a shared community-based vision and history, and a commitment to “making the shop run” for the benefit of all users.

Next Steps and Community Involvement

The hackathon session was only the beginning. Its outcomes are being synthesised into a formal proposal on Italian Wikipedia and a Phabricator task to help standardise CSS custom properties and automated linting workflows.

Yet, technology alone cannot solve a cultural challenge. We invite all UI/UX designers, developers, and experienced wiki-editors to join the effort. Whether you are improving the alt text on a high-traffic policy page or helping modernise an old template, your contribution ensures that Wikipedia remains truly accessible, enabling everyone to share in the sum of all knowledge.

A special thanks to the hackathon organisers and all the participants who shared their lived experiences; your insights are what drive these technical improvements forward.

Celebrating Growth, Impact, and Digital Inclusion: Africa Wiki Women On-Wiki Skills Mentorship Program

By: Ojewuyib
2 June 2026 at 09:00

The second cohort of the Africa Wiki Women (AWW) On-Wiki Skills Mentorship Program successfully concluded with a vibrant graduation ceremony celebrating the achievements, growth, and resilience of participants from across Africa. The event marked another major milestone in Africa Wiki Women’s ongoing commitment to empowering women and underrepresented communities through digital literacy, Wikimedia editing skills, mentorship, and leadership development.

About the On-Wiki Skills Mentorship Program

The On-Wiki Skills Mentorship Program is a three-month program designed to equip emerging Wikimedians with practical knowledge and hands-on experience in navigating Wikimedia projects effectively. It is delivered through structured mentorship sessions, peer learning sessions, practical exercises, and community engagement activities. The mentorship also offers a safe learning space for women across the African region to collaborate, learn from each other, and strengthen their bonds in the Wikimedia space, creating a rich environment for peer learning, collaboration, and cross-cultural exchange. The second cohort graduated 20 women from 11 African countries: Benin, Botswana, Burundi, Cameroon, DR Congo, Ghana, Madagascar, Nigeria, Republic of Congo, Tanzania, and Togo.

To ensure effective mentoring, mentees were organized into Anglophone and Francophone cohorts, each supported by mentors who delivered sessions in their respective languages.

Throughout the program, participants received intensive training on key Wikimedia projects including Wikidata, Wikipedia, and Wikimedia Commons. The sessions covered topics such as:

  • Wikidata principles, policies, item creation, and item improvement
  • Wikipedia’s Five Pillars, article creation, and notability guidelines
  • Wikimedia Commons policies, copyright and free licenses, media preparation and uploads, captions, descriptions, metadata, and responsible reuse of Commons content

A Journey of Learning and Transformation

Over the course of the mentorship cycle, participants showed strong commitment, consistently attending sessions, completing assignments, and contributing to Wikimedia projects as they grew from beginners into confident, independent editors.

Baseline and endline report of the second cohort of the On-Wiki Skills Mentorship Program

Mentees contributed in various ways, including:

  • Creating and improving 316 Wikipedia articles
  • Uploading 50 media files to Wikimedia Commons
  • Creating 287 Wikidata items
  • Participating in campaigns and edit-a-thons
  • Learning effective research and sourcing techniques
  • Becoming active contributors within Wikimedia communities

These accomplishments reflect the growing impact of mentorship-driven capacity building within the Wikimedia movement, and the Wikimedia Outreach Dashboard created for the mentorship program also provides a detailed record of mentees’ contributions and activities throughout the program.

Highlights from the Graduation Ceremony

The graduation ceremony served as both a celebration and a reflection on the achievements of the cohort. The event featured welcome remarks, mentor appreciations, mentee testimonials, presentations of achievements, and inspiring words from special guest Amanda Jurno.

Speakers commended participants for their resilience, commitment to learning, and willingness to contribute to open knowledge initiatives. Mentors were also recognized for dedicating their time, expertise, and encouragement toward nurturing the next generation of Wikimedians.

Some of the most memorable moments of the ceremony came from mentees sharing personal stories about how the program transformed their confidence, expanded their digital skills, and introduced them to global collaborative communities.

Recognizing the Mentors and Organizing Team

The success of the second cohort would not have been possible without the dedication and guidance of the mentors, facilitators, and organizing team, who worked tirelessly behind the scenes to ensure a smooth and impactful learning experience.

Anglophone Mentors

Francophone Mentors

In the last month of the program, participants were introduced to Wikimedia Diff, with the session facilitated by Andikan Eduok.

The organizing team and AWW management played a vital role in ensuring the success of the program. Special appreciation goes to:

Looking Ahead

Now that the second cohort has graduated, the On-Wiki Skills Mentorship Program continues to empower participants as active contributors and future leaders in the Wikimedia movement. Africa Wiki Women remains committed to creating inclusive spaces where women and marginalized communities can build digital skills and contribute to free knowledge. Congratulations to all graduates on their growth and impact. Follow Africa Wiki Women on social media and stay tuned for the announcement of the next cohort. Become a registered member today and be part of this vibrant community.

Finding My Voice Through the On Wiki Skill Mentorship

By: Esewhyte
2 June 2026 at 07:00

From mentee to trainer — how mentorship transformed my Wikimedia journey

From Silence to Curiosity

When I joined the Wikimedia community in 2025, I was excited but unsure of where to begin. The platform felt vast and intimidating, and for a long time I remained passive, watching others contribute while I struggled to find my own entry point.

That uncertainty began to shift when I discovered a community ready to support me. I realized that even small steps — reading articles, observing edits, and asking questions — could open the door to something bigger.

The Turning Point – Africa Wiki Women Mentorship

Everything changed when I was selected as a participant in the On-Wiki Skills Mentorship Program organized by Africa Wiki Women. Over three months, I received structured, hands-on training across Wikipedia, Wikidata, and Wikimedia Commons.

I learned how to create and edit articles, add structured data, and contribute images. More importantly, I discovered how mentorship can transform hesitation into confidence.

Growth – From Mentee to Trainer

This mentorship gave me more than technical skills — it gave me courage. I moved from being an inactive member to someone who contributes meaningfully to open knowledge.

A major milestone was writing and publishing a Wikipedia article about Samuel Gbadebo Odewumi, a respected Nigerian academic and transport expert. Contributing that article gave me a sense of pride and responsibility, as it ensured that his work and impact are documented for a global audience. It also reminded me that mentorship is not only about learning but about creating knowledge that others can build upon.

The highlight of my journey was becoming a Wikidata trainer, guiding Africa Wiki Women newbies during the April EditHer Africa Contest. In that session, I introduced participants to Wikidata, helping them navigate the same learning curve I once faced. Each edit and training moment became a symbol of empowerment, showing that knowledge grows stronger when shared.

Gratitude and Reflection

I am deeply grateful to the organizers and mentors of Africa Wiki Women for their guidance. Their support helped me find my voice and leadership within the Wikimedia movement.

The testimonial poster created for the program captures this transformation — from mentee to confident contributor — and stands as a reminder of how mentorship can change lives. It is more than an image; it is a symbol of growth, courage, and community.

Looking Ahead

As I continue my journey, I look forward to expanding my contributions and mentoring others. The Program taught me that belonging comes not from knowing everything, but from being willing to learn, share, and grow together.

I now see myself not just as a participant but as a builder of community, someone who can help others find their own voice in the Wikimedia movement. In particular, I want to encourage more women to come on board and to see themselves as knowledge creators and leaders. Their voices and perspectives are vital, and through initiatives like the On-Wiki Skills Mentorship Program organized by Africa Wiki Women, we can ensure that the Wikimedia projects reflect the richness and diversity of our world.

If you are a woman curious about contributing, now is the time to join us. Your story, your knowledge, and your perspective matter — and together, we can make Wikimedia stronger and more inclusive.

Tech News 2026 – Issue 23

1 June 2026 at 21:17

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Updates for editors

  • The Reader Experience team is conducting an experiment to show the reading lists feature, which is still in development, to logged-out mobile readers to test whether it encourages account creation at a higher rate compared to the watchstar button. The experiment was launched on May 18th on German, Spanish, Italian, Portuguese, Polish, Dutch, Turkish, and Urdu wikis, and it will run for a month.
  • The Wikimedia Apps team released Phase 1 of the redesigned Home Feed to the Android Beta app. The new Home Feed includes a refreshed “Community” tab and a personalized “For You” tab featuring daily updated reading recommendations. The redesign is part of a broader effort to improve content discovery and create more engaging learning experiences in the Wikipedia apps.
  • View all 18 community-submitted tasks that were resolved last week. For example, an issue where images could fail to load for some suggested edits on Special:Homepage, leaving the thumbnail stuck in a loading state, has now been fixed.

Updates for technical contributors

  • Detailed code updates later this week: MediaWiki

Tech news prepared by Tech News writers and posted by bot.

We held the Korea-Japan Editathon 2026

By: YShibata
1 June 2026 at 20:00

Wikimedia Korea and the Wikimedians of Japan User Group held 「日本・韓国 友好編集月間」 (“Korea-Japan Friendship Editing Month”) from March 23 to April 17, 2026. This was the second Korea-Japan editathon, following the event held during Asia Month in 2024.

Results report

It appears that 106 articles were created and edited by 29 participants. Thank you all for your participation.

When organizing events like this, I often worry about what will happen if there aren’t enough participants, but Wikipedians are so kind that they always end up joining in before I even realize it.

From the perspective of someone who reviews articles as part of the management team, there are benefits to this kind of opportunity, such as gaining knowledge that you wouldn’t otherwise learn, and satisfying your intellectual curiosity. It’s very educational and good. Wikipedia is a wonderful tool that allows you to share the knowledge you have and the things you want to know with people all over the world.

Article introduction

It would be impossible to introduce all the articles contributed to this editathon, so I will only introduce the articles that were selected for the April Monthly New/Improved Article Award.

・「老松堂日本行録」…上野ハム…This article was written by Ueno Ham. It is said to be the oldest surviving travelogue of Japan written by a Korean. Apparently, it is something that is studied in high school Japanese history, and when a certain Wikipedian showed me a glossary of Japanese history terms, I was excitedly saying, “This is in there!”

・「朝鮮半島のヒスイ製勾玉」…のりまき…This article was written by Norimaki. I wonder if his experience writing about 「糸魚川のヒスイ」 “Itoigawa Jade” is proving useful. According to the article, it seems that these magatama (comma-shaped beads) may be of Itoigawa origin.

・「朝鮮半島の建築」……This article was written by 犭. It’s surprisingly difficult to summarize such a broad topic into this size, so I think it’s truly impressive.

・「柳川一件」…This is an article I wrote. I will explain more later.

Summary

When I first joined Wikipedia (around 2019), my impression was that the only editing event on Wikipedia was “Asia Month,” so I’m very happy to see an increase in these kinds of international events. I’m not very good at socializing, so I’ll leave the initiation of those kinds of conversations to those who are good at it, and I’ll focus on organizing these kinds of events.

Quokka (ESEAP’s mascot character) and souvenirs from Korea (Lin Xiangru, CC BY-SA 4.0, via Wikimedia Commons)

Perhaps because of this connection, at the ESEAP Conference in Kaohsiung, Taiwan, I received a gift related to Korea (probably a notepad) from a Wikimedian in Korea (韓国のウィキメディアン), and I really like its design.

My own view

朝鮮通信使 狩野安信 “Korean Envoys to Japan” by Kano Yasunobu (I, PHGCOM, CC BY-SA 3.0, via Wikimedia Commons)

Now, this is my own view. I thought it would be strange to criticize the merits and demerits of other people’s articles without writing one myself, so I completely revised one of my own articles. It’s an article called 柳川一件 (“The Yanagawa Incident”).

I usually write about the various feudal domains of the Edo period, so I didn’t want to stray too far from that if possible. However, 「対馬府中藩」 (“the Tsushima-Fuchu Domain”), which has some connection to Korea, was too heavy for me. While searching for a subject of just the right difficulty level, I came across this theme.

The Sou clan of Tsushima was determined to repair the Korea-Japan relations broken by the Bunroku-Keicho War, but could not find a way. They even resorted to tampering with official documents sent to the capital in Japan (apparently they had been doing so regularly before), and managed to have a Korean envoy sent again and diplomatic relations restored. However, one of their retainers (Yanagawa), who played a key role in this effort, became dissatisfied with his position and sought independence, ultimately taking the outrageous step of exposing the forgery of the official documents. This family feud is known as the “Yanagawa Incident.” It’s a very interesting story, and I enjoyed writing it. I was thinking, “These guys are tampering with official documents again (lol),” while I was writing it.

It was selected for 月間強化記事賞 “the Monthly Featured Article Award for April”. All’s well that ends well.

AWW Podcast Season 2 Episode #1 Can Wikipedia Evolve With the Digital Age? 

By: AnnComms
1 June 2026 at 07:00

There was a time when Wikipedia was the go-to source for information and one of the most trusted tools for research across the world. From students and journalists to researchers and everyday internet users, millions relied on the platform for quick and accessible knowledge. However, as technology continues to evolve, the way people consume information has also changed.

Today, Wikipedia faces growing competition from emerging technologies such as Artificial Intelligence (AI) tools and social media platforms, which now shape how many people search for and engage with information online. As a result, the platform has experienced a decline in page views over the years, raising important questions about its future relevance and visibility in the digital age.

To address these concerns, about 100 Wikimedian affiliates, volunteers, and external experts gathered in Frankfurt am Main from 30 January to 1 February 2026, for the Wikimedia Futures Lab event organised by the Wikimedia movement. The Futures Lab serves as a space for research, experimentation, and forward-thinking conversations on the future of free knowledge.

At a time when technology is rapidly transforming the internet and information-sharing, the event provided an opportunity for participants to reflect on how Wikipedia can continue to remain relevant, visible, and trusted in an increasingly digital and AI-driven world.

From the attendees

The conversations and ideas shared during the event formed the basis of the AWW Voices Podcast episode “Can Wikipedia Evolve with the Digital Age?”. In this episode, host Oluwapelumi Aina is joined by Ruby D Brown, Co-Founder of Africa Wiki Women; Tochi Precious, Language Advocate and Co-Founder of the Igbo User Group; and Olubusola Afolabi, Community Engagement Lead at Free Knowledge Africa.

Screenshot of AWW Voices Podcast host and guests.

Having attended the Wikimedia Futures Lab event, the guests shared their experiences, reflections, and key takeaways from the discussions held in Frankfurt. 

“The world around us is changing really fast. When you think about how people trust information online, AI-generated media, new laws, and shifting technologies, it becomes important to understand how these trends affect us as the Wikimedia community,” says Tochi.

Wikipedia vs Digital Age

Wikipedia, once regarded as one of the most trusted digital information platforms, has seen a decline in page views since 2016 as more people turn to AI tools for information. However, it is important to recognise that many AI systems are trained using content from platforms like Wikipedia.

“For example, when you search for something on Google, the AI overview provides a summary alongside references. Very few people actually click on the Wikipedia link for the longer version. This shows that people are still consuming Wikipedia content, but AI tools now act as middlemen,” explains Olubusola.

According to her, this shift means Wikipedia can no longer rely solely on users visiting the platform directly. Instead, it must adapt to changing online habits and meet audiences where they already are, bringing information directly to the platforms people use rather than expecting them to always visit the main website.

The solution

The rise of AI and social media has also changed how people consume information. Many users now prefer short-form content over long-form reading because of shrinking attention spans. Since Wikipedia is traditionally a long-form platform, there is growing pressure for it to evolve alongside these changing habits.

For many younger internet users, information is no longer consumed through lengthy articles alone. Videos, creators, podcasts, and short-form explainers are increasingly becoming the preferred way to learn and engage online.

“People are moving away from institution-based information and increasingly relying on personalities. They want direct interaction, and video content makes information easier to consume. As Wikimedia, we need to pay attention to these shifts so we can meet people where they are,” says Ruby.

The Dilemma

Wikimedia exists because of the volunteers who edit and write the content on the platform. While keeping up with technological change is necessary, the movement also faces the challenge of ensuring that technology does not overshadow the human element that has always been at the centre of Wikimedia projects.

As conversations around AI continue to grow, many community members believe the focus should remain on supporting contributors rather than replacing them.

Last year, the Wikimedia community launched its AI Strategy, which made clear that AI should support human writers and editors rather than replace them.

When the Home Page Gets Boring: How My Colleagues and I Revitalised Thai Wikipedia

31 May 2026 at 18:00

After a few years away from Thai Wikipedia, I returned to find that the Main Page had become stagnant. It lacked the dynamic energy a landing page needs. So, my colleagues and I decided to revitalise it—and here is exactly how we did it.

Thai Wikipedia’s Home Page as of 26 May 2026, showing only the website’s logo, search box, page name, welcome message, featured sections, and broad category links.

Before diving into the details, let me explain the structure of Thai Wikipedia’s Home Page. It was heavily inspired by the original English edition’s layout, featuring four core content sections:

  • This Month’s Featured Articles (TMFA): An excerpt of a well-written article (Thai Wikipedia lacks the volume to change this daily like the English site).
  • Did You Know (DYK): Interesting facts pulled from recently expanded or created articles.
  • In The News (ITN): Recent global (and occasionally space-related) events.
  • On This Day (OTD): A look back at historical events on the current date.

When I returned to active editing in mid-2024, I realised these sections were frozen in time. Sometimes, content remained identical for days. After a thorough review, I found the issues were threefold: stagnant content, unpredictable update schedules (except for the strictly automated OTD), and complex, opaque backend procedures for publishing content to the Main Page.

To build a sustainable solution, we had to attack the problem from two angles: community contribution and technical infrastructure.

On the contribution side, we introduced clear, easy-to-follow Standard Operating Procedures (SOPs) to ensure nominators and reviewers wouldn’t feel overwhelmed. We also lifted several legacy constraints that were discouraging newbie and intermediate editors.

On the nerdy side, we introduced a “Nested Transclude Template System” to make pulling content to the main page seamless. No more messy, bespoke coding required. All nominations can now be tracked and recalled without digging through a chaotic page history.

For the less tech-savvy, here is how simple it is now: You no longer need to deal with any messy, complicated coding. As shown in the diagram, everything is built like a set of nesting dolls:

A diagram demonstrating a nested template system for Wikipedia. Content such as hooks and excerpts is grouped inside date-based templates, which are automatically pulled into the main DYK and TMRA templates.
  • Write your content: You just write your proposal or excerpt in a standard form.
  • Name it with the date: You save it under a title that follows a specific date format (like YYYY-MM-DD).
  • The system does the rest: When that day arrives, the Main Page template automatically fetches the correct date’s content and puts it live—completely on its own!

This means no one has to lift a finger to update it manually, and we can track past nominations without digging through a chaotic page history.
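The three steps above can be sketched in a few lines of Python. This only illustrates the lookup logic, not the actual wikitext templates. The page titles (`Template:DYK/2025-06-01`) are hypothetical, and the fallback to the most recent earlier date is an assumption of this sketch, not something described in this post.

```python
from datetime import date
from typing import Dict, Optional


def subpage_title(base: str, day: date) -> str:
    """Build the date-named subpage title, e.g. 'Template:DYK/2025-06-03'."""
    return f"{base}/{day.isoformat()}"


def resolve_content(pages: Dict[str, str], base: str, today: date) -> Optional[str]:
    """Return the content filed under today's date.

    Assumption: if nothing is filed for today, show the most recent earlier
    entry instead of leaving the section empty.
    """
    exact = pages.get(subpage_title(base, today))
    if exact is not None:
        return exact
    prefix = base + "/"
    # ISO dates sort correctly as plain strings, so string comparison is enough.
    earlier = sorted(
        title
        for title in pages
        if title.startswith(prefix) and title[len(prefix):] <= today.isoformat()
    )
    return pages[earlier[-1]] if earlier else None
```

Because each day's content lives under its own dated title, past entries stay addressable by name, which is what makes nominations traceable without reading the page history.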

Did You Know it’s now easier than ever to nominate your articles?

The first backlog I tackled was the DYK section. There, I crossed paths with Taweethaも, a renowned Thai Wikipedian. That chance encounter inspired a complete revolution of our process. We teamed up to clear backlogs that had been sitting untouched for over six months. Together, we drafted new SOPs and built a backend system to support them—queuing content chronologically by nomination date, enforcing character limits, and scheduling release dates.
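As a rough illustration of what such a queue does, the sketch below filters nominations by a character limit, orders them by nomination date, and assigns release dates. The limit of 200 characters, the two hooks per day, and all names are placeholders chosen for the example. The real rules live in the Thai Wikipedia SOPs and may differ.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class Nomination:
    article: str
    hook: str
    nominated_on: date


def schedule(
    nominations: List[Nomination],
    start: date,
    per_day: int = 2,      # placeholder: hooks released per day
    max_chars: int = 200,  # placeholder: maximum hook length
) -> Dict[date, List[str]]:
    """Map release dates to article titles, oldest nominations first."""
    # Enforce the character limit before anything enters the queue.
    accepted = [n for n in nominations if len(n.hook) <= max_chars]
    # Queue chronologically by nomination date.
    accepted.sort(key=lambda n: n.nominated_on)
    plan: Dict[date, List[str]] = {}
    for index, nomination in enumerate(accepted):
        release = start + timedelta(days=index // per_day)
        plan.setdefault(release, []).append(nomination.article)
    return plan
```

Keeping the schedule a pure function of the nomination list means it can be recomputed whenever a nomination is added or withdrawn, with no manual bookkeeping.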

Once the system stabilised, we launched a content contest to diversify the topics and test our new workflow under pressure. The campaign was a massive success: 16 contributors created or improved over 90 articles. Crucially, three of those contributors remain highly active “DYK editors” today.

We also noticed that while some nominators were incredibly prolific, they rarely helped review others’ work. To keep the backlog manageable, we implemented a Quid Pro Quo (QPQ) policy, requiring nominators to review a peer’s submission to qualify their own.

Opening the Gates: Allowing Good Articles onto the Main Page

With DYK running smoothly, we turned our attention to TMFA. This section had suffered from a decade-long drought of new Featured Articles (FAs) to showcase. Beyond adapting our new DYK SOPs, we made a major policy shift: we lifted the strict FA constraint and allowed Good Articles (GAs) to be featured. To reflect this, we renamed the section from This Month’s Featured Article to Recommended Articles.

Whilst long-form, high-quality writing requires significantly more energy from contributors—meaning it wasn’t as explosive as the DYK campaign—the initiative still successfully brought 7 brand-new, high-quality articles to the front page from 7 different writers.

A new solution brings a new quirk

An excerpt of Thai Wikipedia’s Home Page on 4 June 2025 showing OTD content from 31 May due to caching issues.

Every new system has its bugs. Just a day into the DYK campaign, a participant noticed that logged-out readers were seeing stale, outdated main page content, while logged-in users saw the updates perfectly.

We spent days hunting for a fix. Thankfully, User:Chlod—a perennial savior of Wikipedia infrastructure—pointed out that the server cache just needed to be manually “purged” (which simply means appending ?action=purge to the URL string).

To automate this, I sat down for some classic “vibe coding” and wrote a Python script. Hosted on Toolforge (Wikimedia’s dedicated server for customised scripts within the Wikimedia Movement) and linked to my bot account, it now runs via a cron job twice a day to keep the page fresh. I also added a secondary feature to the script: it automatically archives the Main Page to the Internet Archive’s Wayback Machine daily.

For those unfamiliar with the tech jargon, here is the simple version: I asked the AI chatbot, Google Gemini, to help me write a program in the Python language. After testing it repeatedly until I was sure it worked, I uploaded the code to Toolforge—which is essentially a free, 24/7 computer server available to Wikipedia volunteers. I set the server to run my code twice a day to automatically fix the glitch and keep the Main Page fresh. As a bonus, I also programmed it to save a daily copy of the Main Page to the Wayback Machine (a digital archive of the internet) so we always have a historical record.

I’ve published the source code on GitHub if you’d like to take a look: https://github.com/sarawutkhs/wthpurge
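For readers who want a feel for what such a script involves, here is a minimal sketch written for this explanation. It is not the code from the repository above. The page title, the index.php address, the User-Agent string, and the function names are all assumptions made for illustration, and the Wayback Machine's "Save Page Now" address is used in its simplest public form.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumed values for illustration only; the real script may differ.
WIKI_INDEX = "https://th.wikipedia.org/w/index.php"
MAIN_PAGE_TITLE = "หน้าหลัก"  # assumed title of the Thai Wikipedia Main Page


def purge_url(title: str, index_url: str = WIKI_INDEX) -> str:
    """Build the page address with action=purge appended."""
    return f"{index_url}?{urlencode({'title': title, 'action': 'purge'})}"


def wayback_save_url(page_url: str) -> str:
    """Build a Wayback Machine 'Save Page Now' address for a page."""
    return "https://web.archive.org/save/" + page_url


def run() -> None:
    """Purge the cached Main Page, then request an archive snapshot."""
    headers = {"User-Agent": "purge-sketch/0.1 (hypothetical example)"}
    # Sent as POST so the wiki purges directly instead of asking to confirm.
    purge = Request(purge_url(MAIN_PAGE_TITLE), data=b"",
                    headers=headers, method="POST")
    urlopen(purge, timeout=30)
    page = "https://th.wikipedia.org/wiki/" + quote(MAIN_PAGE_TITLE)
    urlopen(Request(wayback_save_url(page), headers=headers), timeout=60)
```

Nothing runs until `run()` is called, so a scheduler such as cron would invoke it on whatever timetable suits the wiki.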

What about the other two sections?

You might be wondering why I haven’t mentioned ITN or OTD. To be completely honest, I tried to implement similar reforms for OTD, but couldn’t find anyone in the community available to jump in. If you have ideas on how we can spark interest and bring that same magic to the remaining sections, please drop a comment!

Acknowledgements

This transformation wouldn’t have been possible without an incredible support system. Beyond those already mentioned, I want to thank the original architects of the Main Page structure, as well as every single campaign participant who dedicated time to improving Thai Wikipedia. Finally, my deepest respect goes to Taweethaも, whose guidance both on- and off-wiki was invaluable.

Declaration: This case study was previously presented at the ESEAP Conference 2026 and the October 2025 ESEAP Community Call. The initial phase of this project was also published on the ESEAP’s Substack.

From Code to Contribution: My Journey Through the Wikimedia Ecosystem

By: Essa237
31 May 2026 at 16:00

For close to two years, my involvement in the Wikimedia ecosystem was mostly technical. I contributed through code during hackathons as a member of Wiki Mentor Africa. I understood the connections among platforms such as Wikipedia, Wikidata, and Wikimedia Commons. I knew their importance, but I also felt there was more I could do. Something was missing in how I was contributing.

That changed when I joined Africa Wiki Women and was introduced to the On-Wiki Skills Mentorship Program.

Entering Wikimedia Beyond the Technical Layer

I came into the program with one clear goal: to gain a deeper, practical understanding of how to contribute beyond the technical side of Wikimedia. I wanted to move from simply supporting the ecosystem to actively building knowledge within it.

The training opened my eyes to the structure and responsibility behind Wikimedia contributions. I learned that every Wikimedia project is guided by strong principles that protect the quality and reliability of information.

On Wikipedia, content must be notable, verifiable, and supported by reliable sources. On Wikidata, data must be structured, accurate, and referenced. On Wikimedia Commons, files must follow copyright and licensing policies.

These are not just guidelines; they are what make Wikimedia a trusted global knowledge resource.

Learning Through Practice

One of the strongest aspects of the mentorship program was its practical training. The program did not simply explain policies and standards; it required us to apply them through real contributions.

I learned how to properly reference articles, structure content, improve neutrality, and contribute according to Wikimedia standards. At first, this process was challenging. Finding reliable sources, understanding notability requirements, and writing neutrally required patience and attention to detail.

However, through continuous practice and guidance from the trainers, these concepts gradually became clearer and easier to apply.

The trainers also played a major role in making the experience impactful. Complex policies and technical concepts were broken down into simple, understandable steps, making the learning process accessible and encouraging.

Milestones That Changed My Confidence

One major milestone for me during the program was creating two articles and receiving a barnstar in recognition of my contributions.

That moment shifted my confidence completely.

For the first time, I felt that I was no longer just observing how open knowledge is built behind the scenes. I was actively contributing to the preservation and sharing of knowledge myself.

The experience helped me see Wikimedia differently. It became more than a technical ecosystem I contributed to during hackathons. It became a collaborative space where I could directly improve content, document knowledge, and support representation online.

Growing Beyond the Program

Beyond technical editing skills, the mentorship program also changed my perspective on community contribution and leadership.

Looking ahead, I plan to share what I have learned with my community and support the onboarding of new contributors. I am also stepping into a new role as a trainer for an April editathon, which reflects how much this experience has shaped my growth within the Wikimedia movement.

This journey has been both challenging and rewarding. It pushed me to learn, adapt, and contribute more meaningfully.

Wikimedia is more than a platform. It is a collective effort to make knowledge accessible to everyone.

And now, I am fully part of that effort.

Happy editing.

Small Edits, Big Impact: My Wikimedia Story

By: MOMPATI 2
31 May 2026 at 09:00
AfroCreatives Project photoshoot

I used to be a reader. Now I’m a contributor. As a Focus Group member, I don’t just consume knowledge, I create it. I am Kewame Veronicah Mompati, a student based in Gaborone, Botswana. I discovered Wikimedia through social media, but I stayed because of purpose. Behind every edit I make is a belief that Botswana’s stories deserve to be seen, cited and preserved.

My first edit

In September 2023, I attended a Wiki edit-a-thon hosted by Wikimedia Community User Group Botswana. I learned how to create an account and translate English Wikipedia articles into Setswana. That first small edit felt huge. Seeing my username appear in the edit history made it real. At that moment, I understood something important: I was no longer just reading Wikipedia; I was part of it.

From occasional edits to consistent contributions

Joining the Focus Group shifted my journey from occasional editing to consistent contribution. Editing stopped being just about correcting or translating text. It became about visibility, representation, and impact. I began contributing across Wikidata and Wikipedia, improving articles, adding reliable sources and translating content into Setswana. So far, I have made 254 contributions on Wikidata, 184 on tn.wikipedia.org, 85 uploads on Wikimedia Commons, 6 contributions on en.wikipedia.org and 5 on meta.wikimedia.org, helping strengthen information about Botswana in the global knowledge ecosystem. I also expanded into visual storytelling through Wikimedia Commons, uploading photographs from community photo walks. This taught me that knowledge is not only written, it is also visual, cultural, and lived.

Learning beyond editing

Wikimedia didn’t only teach me how to edit. It taught me how knowledge works. I developed stronger digital literacy skills, learned to evaluate and use reliable sources, and began approaching online information more critically. I now understand that citations are not optional, they are essential for credibility and trust. Through co-facilitating training sessions for new editors, I also built confidence in public speaking, teamwork and mentorship. Supporting others, especially young women entering the Wikimedia space, has been one of the most meaningful parts of my journey.

Challenges behind contribution

This journey hasn’t been without challenges. One of the biggest has been discovering how much of Botswana remains undocumented online. I would often try to write about local villages, people or cultural stories, only to find very limited or no reliable sources. Another challenge was balancing editing with academic responsibilities. To stay consistent, I set small but realistic goals: at least three edits per week. I also experienced Wikipedia’s standards first-hand when one of my articles was flagged for deletion. While difficult at the time, it became an important lesson about notability, verifiability, and the importance of strong sourcing.

Why this work matters

Documenting our communities matters because if we do not write ourselves into history, we risk being left out of it. Through this work, I’ve started seeing the world differently. When I visit a village or learn about a local figure, I now think: Does this have a Wikidata item? Is this documented on Wikipedia? Can others learn from this story? I have become more than an editor, I have become a custodian of our stories.

Representation is impact

When someone searches for their hometown and finds nothing, invisibility is reinforced. But when they find well-documented information, images, and history, they find recognition and pride. This is why open knowledge matters.

A call to action

I encourage others to volunteer with Wikimedia. Wikipedia is one of the world’s first points of knowledge discovery, yet African representation remains limited. You do not need advanced technical skills, only curiosity and consistency. If you can send a message, you can edit. If you can take a photo, you can upload it to Commons. If you can research, you can contribute sources. My Wikimedia journey is still unfolding. I once thought Wikipedia was written by “them.” Now I know it is written by us. And that changes everything.

the totalisator

31 May 2026 at 00:00

It has been an unfortunate turn in the software industry, one of many as of late, that gambling is once again one of its primary engines. With the rise of almost nationwide online sports betting, not to mention prediction markets, making odds on real-world events and extracting the money of suckers is no longer limited to island nations. It is a great American pursuit, or at least, that's what modern television sports coverage leads you to believe.

There has always been an uncomfortable relationship between software and the manipulation of marks. Techniques developed by casinos became a fundamental part of consumer software, while the software industry wholeheartedly embraced "gaming" as a market (the older meaning of the term here, meaning gambling). We can readily point to a couple of reasons: first, gambling is profitable, and technology is first and foremost a means of accumulation. Second, gambling is mathematical, or at least arithmetical, in nature. Most forms of gambling involve some sort of complex calculation with real-world stakes.

Gambling predates history, or it might be better to say that gambling has been around for as long as recorded history has been able to observe it. Most early gambling seems to have been based on card or dice games, but humans have been betting on animal fights for more than a thousand years. As sensibilities and resources changed, animal fighting has mostly given way to animal competition. The most famous of these wagering opportunities is horse racing, a form of gambling with such a long and pervasive history that it has often achieved a unique regulatory status as one of the only legal sports betting venues in the US. Well, at least, before Murphy v. National Collegiate Athletic Association.

The earliest recorded horse races were held in England in 1539, and bets were placed. By 1666, horse racing had reached such prominence that King Charles II, himself a jockey, commissioned and then won the "Newmarket Town Plate." That event's eccentric history gave way to the King's Plate, a broader 17th-century racing series whose royal remit made up the first formal rules for the sport. Queen Anne founded the racetrack at Ascot in 1711; while it took decades for permanent facilities to be built at the track, only stands for the royal family came before a betting office. As the British Empire expanded around the world, horse racing spread with it. Likewise, horse racing spread throughout Europe. By the 19th century, horse racing could be found almost anywhere.

For most of the history of horse racing, betting was based on the setting of odds. A bookmaker would apply historical knowledge, experience, and no small amount of "guesstimation" to set odds of various events, say of a specific horse winning. A bookie might set odds of 7/1 on a given horse, read as "seven to one against." A bettor placing $1 on this horse will make $7 if it wins the race, which of course implies that the bookmaker thinks the horse's chance of claiming first is around 12.5% (one in eight) or less 1.
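The arithmetic behind that implication is short enough to sketch. Odds of "a to b against" break even when the horse wins b times out of every a + b races. The function names below are my own, chosen for illustration.

```python
from fractions import Fraction


def implied_probability(against: int, stake: int = 1) -> Fraction:
    """Break-even win probability for odds of `against` to `stake` against."""
    return Fraction(stake, against + stake)


def winnings(bet: int, against: int, stake: int = 1) -> Fraction:
    """Profit on a winning bet, not counting the returned stake."""
    return Fraction(bet) * against / stake


print(float(implied_probability(7)))  # 0.125
print(winnings(1, 7))                 # 7
```

A bookmaker who believes the horse's true chance is higher than the break-even figure is offering a losing price, which is why the posted odds suggest a belief at or below it.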

This method of betting is known as "fixed odds," and the fact that it relies on the bookmaker to set those odds is a significant limitation. First, there is the constant possibility of error. A bookmaker who accidentally sets odds too favorably could ruin themselves, which creates a natural pressure towards odds that are less favorable to bettors. At the same time, competition between bookies drove odds the other direction. This tension between bookmakers and bettors, and between the bookmakers themselves, was a constant source of conflict. Besides, despite the upper-class connotations of British horse racing, gambling has always touched on the unsavory. Bookmakers were not always known to be honest. A host of possibilities, from collusion to ignorance, could leave bettors with no good options. Every bet offered might be a bad one—even more so than the spread collected by bookmakers would suggest.

Of course, despite their prominence, horses did not originally inspire the revolution in gambling that would soon transform racetracks—roosters did.

Julius Totalisator mechanics, Powerhouse Museum Collection

Josep Oller was born in Catalonia in 1839, but his parents emigrated to France when he was a child. His parents seem to have regretted that he grew up without Spanish, and perhaps the opportunities in Spain were better. In any case, as a young adult, he moved to Bilbao to study. Accounts vary on what happened next: depending on who you trust, he either merely observed cockfighting, or he was enthusiastically involved in promoting it. I tend to suspect the latter, as Oller was an enthusiastic promoter of many things and not too concerned with vice. Decades later, after his return to France, he co-founded Moulin Rouge. That wasn't even his first entertainment venue, just the most famous. But that's a later story, in another industry. In Bilbao, in the 1860s, Oller watched the cockfights. And then, he watched the bettors and bookmakers haggle, argue, and fight.

The vagaries of fixed odds, the questionable motives of the bookmakers, and the general atmosphere of debauchery made it all rather ugly. Oller thought there must be a better way: the calculation of payouts according to fixed rules, with the impartiality and precision of mathematics. When he returned to France and began to promote his new betting system to racetracks, he named it Parimutuel.

Here's how it works: parimutuel betting is not against the bookmaker or house; it is against the other bettors on the same event. Take an event with a list of possible outcomes, like a horse race and the set of horses that might take first place. Each bettor wagers a sum of money on a specific horse, and a parimutuel teller records the details of each of these wagers.

After the race, when the outcome is known, the parimutuel teller sums the total money wagered on the question, which is called the "pool." Some amount, usually a percentage, is subtracted from the pool to cover taxes and a house take (profit margin). The remaining majority of the pool is then distributed to all of the people who bet on the winning horse, proportionally to their wagers. For clarity, let's work an example: Secretariat, Shecky Greene, and Warbucks are running a race. Being a clever person well educated on Kentucky Derby outcomes, you put $100 on Secretariat. Your friend, similarly informed but cash poor, bets $50 on the same outcome. At the completion of the race, the betting office adds up all of the wagers to $900 (I guess it wasn't a popular race day). There is a state racing commission tax to keep in mind, and of course the owner of the race track wants a share, so they remove 20% from the pool leaving $720. That pool needs to be distributed among the winning bettors. Let's say that you and your friend were the only two Secretariat fans present. Having bet $100 and $50 for a total of $150, a parimutuel teller works out that you receive 2/3 of the pool and your friend 1/3, proportional to the original wagers. You walk away with $480 and your friend with $240, before any rounding. A good day at the races.
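The same split can be checked with a few lines of code. This is only a sketch of the proportional rule described above: the function name and the bettor labels are mine, and real tracks apply jurisdiction-specific rounding that is left out here. With a $900 pool, a 20% take, and winning wagers of $100 and $50, the exact shares come to $480 and $240.

```python
from fractions import Fraction


def parimutuel_payouts(pool_total, take_rate, winning_wagers):
    """Split the pool, net of the take, among winners in proportion to wagers.

    pool_total: all money wagered on the event, winners and losers alike.
    take_rate: share removed for taxes and the house, e.g. Fraction(20, 100).
    winning_wagers: mapping of bettor -> amount wagered on the winning outcome.
    """
    net_pool = Fraction(pool_total) * (1 - Fraction(take_rate))
    winners_total = sum(winning_wagers.values())
    return {bettor: net_pool * wager / winners_total
            for bettor, wager in winning_wagers.items()}


payouts = parimutuel_payouts(900, Fraction(20, 100), {"you": 100, "friend": 50})
print(payouts["you"], payouts["friend"])  # 480 240
```

Fractions are used instead of floats so the shares stay exact. Feeding in a pool where every dollar was bet on the winner, say `{"a": 600, "b": 300}` against the same $900 pool, returns $480 and $240: each winner gets back only 80% of their stake.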

There are a few things to observe about this system. First, there is no estimation of odds involved (properly called handicapping). The payout on bets is calculated based on the bets placed, regardless of what anyone expected the outcome to be. But, like the prediction markets with which Silicon Valley is so infatuated today, the general expectation is that bettors will place bets proportional to the likelihood of the horses winning... this means that if a popular horse, widely expected to win, does indeed take the podium (a big podium that horses fit on), the wagers on that horse will have made up a large portion of the total pool and the payouts will be proportionally lower. In the most extreme cases, it is possible to place a bet, win, and still lose money: if everyone bet on the same outcome, they all just get their money back, but minus the house take. Exactly how these situations are handled varies by jurisdiction, but you might be relieved to know that in some rule systems there are scenarios where the house can similarly lose money on edge-case outcomes.

An implication of this fact is that the returns on a wager, assuming that it wins, are not exactly known until after the race has begun, since the tellers continue to take wagers (that change the size and distribution of the pool) up to that point. It's also, just, a little bit complicated? Picture yourself at the racetrack, decadent and depraved, and no doubt several beers deep. You place a bet on your favorite horse—if it wins, what do you get? Parimutuel payouts are relatively difficult for gamblers to understand, and that could limit sales and satisfaction. That's an especially big problem when parimutuel coexists with simpler fixed odds options.

Oller came up with a neat solution: advertise parimutuel betting as if it were fixed odds. Oller worked mostly out of carts, portable parimutuel offices, with big display boards over the teller window. As bets were placed, Oller would calculate the expected payouts on each option and convert them to odds. The equivalent odds were displayed on the board—just like the fixed-odds bookie had. Of course, these odds are only estimates rather than exact, and they need to be updated as the race approaches so that everyone knows what they're getting into.
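To make the conversion concrete, here is a rough sketch of how running totals might be turned into equivalent odds. The function name is mine, the per-horse totals are invented for illustration (only the $900 pool, the 20% take, and the $150 on Secretariat come from the worked example), and real boards round the results in ways not modeled here.

```python
from fractions import Fraction


def equivalent_odds(totals, take_rate):
    """Convert per-horse wager totals into approximate 'to one against' odds.

    Each result is the profit per $1 staked if that horse wins, given the
    pool as it stands now. It changes whenever another bet is placed.
    """
    net_pool = Fraction(sum(totals.values())) * (1 - Fraction(take_rate))
    odds = {}
    for horse, wagered in totals.items():
        # No money on a horse means no payout ratio can be quoted yet.
        odds[horse] = None if wagered == 0 else net_pool / wagered - 1
    return odds


# Hypothetical distribution of a $900 pool across three horses.
board = equivalent_odds(
    {"Secretariat": 150, "Shecky Greene": 450, "Warbucks": 300},
    Fraction(20, 100),
)
for horse, value in board.items():
    print(f"{horse}: {float(value):.1f} to 1")
```

In this made-up pool the least-backed horse shows the longest odds, which is the behavior a bettor used to fixed-odds boards would expect to see.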

The exact design of these boards changed over time; Oller himself was actively experimenting for the first decade. While later versions would display the equivalent odds and perhaps a bet count, Oller's first scheme made the math easier (for the clerk) by displaying the bet count and totals on each outcome instead (conversion to payouts was left as an exercise for the customer). Since the main point of the calculation was to determine the totals, Oller called the process a totalisator. The board, where the totals could be read, was often shortened to the tote board 2.

We should also discuss what people are actually betting on. So far, we have stuck to the simplest case: betting on which horse will take first place. That's called a bet to win, or a straight bet. It is also common to bet on a horse placing either first or second ("place"), or first, second, or third ("show"). There are a lot of other scenarios you can bet on as well, like the composition or order of the placing/showing horses or the place that a specific horse finishes, but these more complex bets also get to be less common and may not be offered as parimutuel (one property of parimutuel is that, as a practical matter, you need to be able to attract a certain minimum size of betting pool for every option you offer). The point is that there can be a lot in the air for any given race, with multiple separate betting pools—and on top of that, a busy track might be taking bets for multiple races.

Within a few decades after Oller's invention, parimutuel was taking over as the norm at horse tracks. Along with the betting system came the boards: huge tote boards were a main feature of "modern" race tracks by the 1880s. Oller's carts gave way to "tote houses," dedicated offices with teller windows at the front, a business office (for totalisators) at the back, and the tote board on top. As clerks updated the totals, they sent the new numbers "upstairs" to a clerk who worked behind the tote board, changing out number cards.

It's hard to tell where the first mechanical calculators for parimutuel emerged. There were probably multiple parallel inventions, since the idea is obvious, and it appears that many different ideas were pursued. Oller himself probably experimented with means of automation, since the first devices probably came from France. By 1880, at least one German company advertised totalisator machines. Not very much is known about these first devices, but they mostly fell into two broad categories.

First, there were techniques and devices for larger operations to coordinate multiple tellers and speed up payout determination. Most of these were not exactly calculating machines as we think of them today, but more like derivatives of other 19th-century office technology like cash ball systems. For example, in one system that seems to have found use in several countries, parimutuel tellers sold "tickets" of a fixed face value. For each ticket sold, they took a ball like a steel bearing or marble and set it on a rail for the corresponding horse. The ball rolled down the rail to the end of the tote house, and fell into a bucket. At the end of the race, the winning horse's bucket was weighed to determine the number of tickets sold and, thus, the payout on each winning ticket.

On the other end of the spectrum were small mechanical machines intended to speed the work of a single parimutuel teller. Once again, the history of these devices is not well documented, but during the late 19th century and at least in Europe there seem to have been several parimutuel machines marketed. These were probably more like adding machines, dispensing tickets and keeping the sum of tickets sold. Their major limitation was that, so far as I can tell, none of this generation of devices tried to suit situations where multiple tellers were involved.

This highlights a dichotomy between two schools of parimutuel technology: devices sold to single-person operations, running out of carts like Oller's, and devices sold to large parimutuel offices at race tracks. The former did the math but left it to the user to combine sales between tellers. The latter combined the sales between tellers, but left most of the math as a manual exercise. Neither family of devices updated the tote board—at best, they made it a faster manual process (e.g. by regularly weighing the buckets). It should also be said that none of these solutions were reportedly that good. Most were short-lived, and bettors don't seem to have had much trust in them. Marble-bucket-type solutions were known in particular for having a certain margin of error... a margin of error that, to a slightly less scrupulous bookie, became a margin of profit.

Of most interest to us, considering later events, is the early history of parimutuel automation in New Zealand. Parimutuel, as a system of gambling, landed in New Zealand around 1880. The first parimutuel devices and machines soon followed, imported from European manufacturers. An 1884 issue of Southland, New Zealand's Western Star gives some of the flavor of this new pastime, describing the happenings at an amateur steeplechase meet:

The British love for backing one's fancy was specially provided for in a manner the fairest possible to the public, in the shape of a pari mutuel. This machine was worked by two gentlemen whose suave manner was inimitable, to the satisfaction of all parties, who volunteered their services, and have handed over legitimate commission for presentation to the Southland Hospital.

This is a fascinating passage that reveals a few things. First, in New Zealand, parimutuel betting as a concept had been popularized mainly by people who were importing and selling machines—as a result, New Zealanders thought of parimutuel betting as closely coupled to mechanized calculation 3. The perceived impartiality of the machines (and, as it turns out, the suave manner of its operators) did much to reinforce the idea of parimutuel as a fundamentally more trustworthy form of gambling.

Second, the mention of a "legitimate commission" to the hospital reflects the legal situation in New Zealand at the time. The history of gambling and its legality is a fascinating topic that I am trying very hard not to digress too far into, but many of the fits and starts in development of gambling technology result from widespread recognition of gambling as a vice and subsequent efforts to regulate, restrict, or outright ban it. This might feel like a modern issue, but in the 19th century it was already at a steady boil, and some of the parimutuel industry's emphasis on "fairness" was clearly an effort to maintain legal status. Much the same interplay (between regulation or banning of gambling and claims that mechanization mitigates the harm) is visible in our more traditional gaming industry today (e.g. slot and pachinko), although the online sports books are evidently free of the need to defend themselves.

Despite legal complexities, parimutuel was a hit in New Zealand. In 1908, the Trentham Racecourse had a large tote house with eleven mechanical calculators, presumably operated by eleven tellers with additional clerks to calculate sums and update the manual tote board that made up much of the building's facade. Mechanical calculators were indeed available but expediency, and the chaos of a busy race, meant that much of the odds calculation was done in the minds of the clerks updating the board. There was a margin of error involved, and by newspaper accounts a lot of stress and conflict. Still, this was the dominant model for the first decade of the 20th century: New Zealand racetracks replaced an ad-hoc system of bookmakers and parimutuel carts with large, central tote houses where a single, track-wide pool was maintained. The staff of a busy tote house could be 30 or 40 people, all running around, copying numbers between slates, and shouting updates to the tote board's loft. Add another challenge of racetrack betting: the tote houses, even as large as they were, struggled to keep up. One can imagine the impatience of a gambler who has just won big, running up against a harried teller whose mind has already moved on to the next event.

This problem was all the worse because of an issue clearly illustrated by our modern prediction markets: the matter of closing bets. At a roulette wheel, the croupier ritually declares bets closed before the ball can touch the wheel. Otherwise, the last to place their bets have more information on the outcome. Similarly, in a horse race, any bets placed after the race has begun are better informed than those placed before. The takeaway is that, under most rules, the tote house had to close bets and complete final odds calculations before the race could begin. A busy tote house meant that races started late, and then later, and later. This was, apparently, a very serious problem in the nineteen-oughts.

Longchamp Tote Board, Powerhouse Museum Collection

Relief would come not quite from New Zealand, but certainly nearby. George Julius was born in England in 1873, moved to New Zealand in 1889, and soon enrolled at Canterbury College to study railway engineering. After his studies, Julius took a job with the Western Australia Government Railways, the start of a career that found him in Sydney in 1907. By this time, he had two sons, and a clear mechanical inclination. While he set up a practice as quite possibly Australia's first consulting engineer, he spent evenings with his sons on a bit of a side hustle: an automatic vote tabulator.

This part of the history has become obscure, and the order of events is not completely clear to me, but it went something like this: while Julius had been working in Western Australia, a friend had complained about the slow process of elections and suggested that some kind of machine could do most of the work. The idea stuck with Julius, and in Sydney, he built a prototype and perhaps even applied for a patent on a "foolproof mechanical voting machine." Australia has a long history of interesting and innovative election practices, so Julius's invention might have started a new chapter in the history of election administration had it not been a total failure. The prototype must have worked, but Australian authorities were unimpressed and declined to purchase the system.

It might have moldered among so many other not-quite-revolutions had it not been for yet another friend, this one in Sydney, who was familiar with parimutuel betting and the current state of the art in machines. This friend told Julius that his election machine served similar ends (adding up numbers and displaying the results) and explained the operational details of the large tote houses typical of New Zealand racetracks (but, as far as I can tell, not yet popular in Australia, or at least not in Sydney, as Julius later wrote that this was the first he had heard of parimutuel). What followed was another series of late nights, but one that ended with more success: prototype in hand, Julius incorporated as Totalling Mechanisms Ltd and made his first sale.

In 1913, the first Parallel Automatic Totalisator was installed at the Ellerslie Racecourse in Auckland, New Zealand. This machine was called "parallel" because it was a single, large mechanism that all of the parimutuel tellers operated simultaneously. It was "automatic" in that it performed all of the calculations end to end. Not only did it record the tickets sold by each teller, it recorded the totals (in currency) for each horse and a grand total. All of these numbers were presented to bettors by a brand new kind of tote board: a mechanical one, made of chain-driven number wheels showing through windows.

Julius's machine was installed in the existing large tote house at Ellerslie, and fitted with 30 teller stations. Different from today's approach, Ellerslie's parimutuel operated on fixed-value tickets sold in different denominations at different windows. A person wishing to place a 20 shilling bet would queue for one of the 20 shilling tellers, but once at the window could buy any number of tickets on any number of horses. As the teller dispensed the tickets, the tote board above updated in real time.

While quite trivial today, it was a tremendous accomplishment for 1913. Conceptually, it was closely based on the established type of mechanical counter that had been sold, for example, by Hengstler. The difference was one of scale: these wheel counters were huge, as they were directly read by the public. The tote board actually was the machine, in other words, and the number wheels behind it were not just displays but the actual mechanism by which the totals were calculated and stored.

At the back of the tote house, an array of concrete weights hung from chains over sprocket wheels. Gravity pulled the weights, which turned the sprockets, which rotated drive shafts for each horse. Or, at least, tried. The drive shafts were normally blocked by pawls. Each of the parimutuel tellers sat at a station with a set of levers, one for each horse. When a bet was placed, the teller took a ticket for the corresponding horse out of their drawer. They inserted it into their machine and pulled the corresponding lever, which both caused a pattern of holes to be punched into the ticket (validating it as properly issued) and pulled on a long metal wire that ran up through the ceiling into the tote board. That wire pulled on the levers of the clockwork-like escapements that locked the driveshafts, so that each pull of the wire caused the counter for the selected horse to advance by one.

Each of the individual counters was connected by chain to a "shaft adder," basically a series of sprockets coupled to the shaft by one-way clutches. When each individual counter advanced by one, it also pulled the shaft adder along with it, which drove the grand total display at the top of the board. As the machine ran, the weights descended, with those representing the most popular horses falling the fastest. A catwalk rail, up in the tote board, accommodated a worker who carried a crank handle from driveshaft to driveshaft. They would crank the weights back up as needed, and reset all of them to the top between races.
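The counting arrangement described above can be modeled loosely in code. This is only a minimal sketch and not a reconstruction of Julius's mechanism: the class and method names are invented for illustration, and it assumes that every lever pull counts as exactly one unit on both the horse's counter and the grand total.

```python
class ToyTotalisator:
    """Toy model of per-horse counters that also drive a grand total.

    All names are hypothetical; each lever pull is assumed to count one unit.
    """

    def __init__(self, horses):
        # One counter per horse, like the number wheels behind each window.
        self.per_horse = {horse: 0 for horse in horses}
        self.grand_total = 0

    def pull_lever(self, horse):
        if horse not in self.per_horse:
            raise KeyError(f"no counter for horse {horse!r}")
        # The wire releases the escapement: this horse's counter advances by one...
        self.per_horse[horse] += 1
        # ...and the shaft adder carries the same step into the grand total.
        self.grand_total += 1


tote = ToyTotalisator(["Horse 1", "Horse 2", "Horse 3"])
for bet in ["Horse 1", "Horse 1", "Horse 3"]:
    tote.pull_lever(bet)
print(tote.per_horse, tote.grand_total)
```

The point of the sketch is the coupling: there is no separate summing step, because every advance of an individual counter also advances the total, which is what the one-way clutches accomplished mechanically.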

The first Julius machine was big and complicated, with the attendant inertia and friction. My description of its workings is more than just a bit simplified, but most of the complications have less to do with parimutuel than the practical realities of machinery. There were several extra layers of mechanisms like escapements and shaft adders, required so that all 30 of the teller stations could advance the machine with a reasonable degree of effort (although the size of the teller's levers suggests that they still required some heft) and without the machine jamming, miscounting, or otherwise misbehaving when two tellers happened to pull levers at the same time. Given his background in railroad engineering, I suspect that Julius drew from techniques used in railroad interlocking machines, which at that time also made great use of levers that pulled cables to drive chains.

Despite later patent applications, the history of this machine is not as well documented as you might hope. An excellent description from the University of Auckland notes that the actual design of the counters is unknown, particularly how they performed carries. This isn't all that big of a mystery as similar counters (then often called totalisers) existed at the time and Julius must have used one of several known techniques, but still, it is always disappointing to have mysteries in the details of such a notable invention.

This first parallel automatic totalisator operated until 1918, although it was modified and reworked several times in that span. At least early on, the tote house continued to keep manual totals for payout calculations, mistrusting the machine where it really mattered. There are indications that it was out of service during some races, and it may not have been all that reliable in general. Still, it worked. After its first proper operation during a major race, the New Zealand Herald published a report that conveyed its success and its complexity:

A small knowledge of the extraordinary demands that the work of a totalisator imposes upon its parts shows clearly how [the] mechanism might have failed in a dozen ways. But in the meantime alterations have been made, the machine has become more familiar to its operators, and it worked on Saturday and yesterday with fine regularity, and to everybody’s satisfaction....

[An] obstacle of importance was raised by the fact that, in view of the facility with which many investments could be put upon a single horse at a time, the machinery was apt to be run at times so fast that the starting and stopping of the rotating wheels, which, in spite of their light construction, have considerable inertia, set up almost destructive strains and shocks. This trouble has been overcome with an ingenuity altogether admirable. The counter releases a spring-driven gear, which can go as fast as it will irrespective of the motion of the wheel on which the figures are shown to the public, and the wheel simply runs leisurely and smoothly ahead until it overtakes the gear. Even then the wheel, free to make several rapid revolutions, would be difficult to stop at the right place if its speed were not controlled by a governor.

The Ellerslie totalisator might not be called a complete success. Julius charged the racing club £4,000, but spent over £11,000 on construction and modifications. The result was so complex, and required so much maintenance, that the staff of the tote house actually expanded after its installation. Still, the immediate display of totals was a huge draw for bettors, and it succeeded as well in speeding up the closing process. I am always hesitant to apply superlatives, but the Ellerslie totalisator has been called the largest calculating machine to date. I believe it to also represent a major step in the history of displays: a very early, if not first, implementation of a billboard-sized data display that automatically updated in real time. In any case, the machine brought in money. On the totalisator's first full race day, the grand total counter went through £41,514. An estimated 83,000 bets, 10,000 on some single races.

Julius totalisator adder and indicator, Powerhouse Museum Collection

On the back of the Ellerslie machine's success, Julius restyled Totalling Mechanisms Ltd as Automatic Totalisators Ltd, or ATL. The First World War caused a great deal of disruption to, well, just about everything, including the racetracks. It halted progress in totalisators for several years. The lull didn't last long, and by 1922 ATL had built improved totalisators for a half dozen New Zealand racetracks. These used a simplified design, benefiting from newer technologies such as electricity.

While the totalisator would remain mechanical for decades to come, the Ellerslie machine is the only one to have been completely mechanical. It seems that Julius began to contemplate electrical designs before he even finished the Ellerslie installation, and all later machines (including an "upgrade" installed at Ellerslie after 1918) used an electric drive system instead of the concrete weights and their crank-wielding attendant. More significantly, though, they also electrified the teller machines: instead of pulling cranks, the tellers pressed buttons, which energized circuits that operated the escapements in the tote board via solenoids. This made it possible, for the first time, to locate tellers for a single parimutuel pool at different locations throughout the racetrack.

The greatest of ATL's electromechanical totalisators was installed at Longchamp, Paris, in 1928. This machine, occupying a building that one could easily mistake for a grand hotel were it not for the numbers behind each window, supported at least 270 tellers (some sources say more). Sales numbers don't seem to have made it to the modern age, but with so many tellers working it must have hit six-figure ticket volumes for popular races. It remained in use until 1973.

Later ATL totalisators were reduced in size and price (truck-mounted portable versions for small racing clubs, for example), and improved in features. We learned early on that some manual tote boards had displayed equivalent odds, while the early Julius machines showed only totals. That limitation wasn't permanent, and ATL machines switched over to estimated payout odds in the 1930s. These took the form of "barometer" displays, or vertical bar graphs, that rose as the odds approached 1/1 (a horse favored to win) and fell down to 100/1 (decidedly long odds). Many of the early ATL systems were upgraded to this later design, and some of the "win/lose barometer" machines saw service, like Longchamp, into the 1970s when computers presented a cheaper alternative.
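To make "estimated payout odds" concrete, here is a minimal sketch of how such odds fall out of the running totals in a parimutuel pool. It is not a description of how the ATL barometer mechanism computed them, which the sources here do not detail; the function name and the `take` parameter (the fraction the operator withholds) are assumptions for illustration.

```python
def estimated_odds(totals, horse, take=0.0):
    """Return estimated net odds-to-one for `horse`.

    `totals` maps each horse to the amount bet on it so far.
    `take` is the operator's cut of the pool (assumed, defaults to none).
    """
    pool = sum(totals.values()) * (1.0 - take)
    staked = totals[horse]
    if staked <= 0:
        raise ValueError("no bets on this horse yet")
    # Winners split the pool; subtract the stake to get the net gain per unit bet.
    return (pool - staked) / staked


totals = {"Favorite": 500, "Middling": 300, "Longshot": 200}
for name in totals:
    print(name, estimated_odds(totals, name))
```

A horse carrying half the pool comes out at 1/1, and the odds lengthen as its share shrinks, which is the behavior the barometer displays conveyed.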

Totalisator technology became somewhat stagnant after the 1930s, perhaps a result of saturation. Many large markets like the UK had legal restrictions on race betting that prohibited large totalisators (although an oddity of UK law meant that they were allowed for dog races, where the Julius totalisator is best known in that country). In countries with laxer gambling rules, like New Zealand and France, there simply weren't that many large racetracks and by the end of the '30s most of them already had ATL machines that would operate for decades more.

The more innovative designs were seen in portable systems, which were a big market for the many racing clubs that held seasonal racing at different venues. They could not afford to build a large tote house, and didn't have the attendees to keep one busy anyway, but a semi-trailer machine could be towed around with the race days, to be set up and run by a small crew in the tradition of carnival rides. These portable systems demanded miniaturization, which was difficult with chain-driven gear shafts but much easier with the relays, and relay logic, that had emerged from the telephone industry. Similarly, the transition to more electrical totalisators made them easier to assemble and disassemble, since it was a matter of hooking up wires rather than aligning sheaves and tensioning cables.

The burgeoning telephone industry also meant that relay systems and all manner of electric devices were better understood and easier to manufacture than ever before, inviting new manufacturers into the market. British firm Bell Punch, founded to manufacture machines for validating (punching) railroad tickets, saw ATL's products and responded with their own line of totalisators that were almost entirely electrical. These were particularly popular at smaller tracks, since they were much more compact and easier to install.

In the United States, electrical engineer and race horse breeder Harry L. Straus founded the American Totalisator Company, or AmTote, in 1928. It took several years for AmTote to get off the ground, but their 1933 system at Arlington Park in Illinois established them as the dominant US manufacturer. Their machines were entirely based on telephone relays, and appear to have been the genesis of incandescent numerical displays. Straus's design used a grid of light bulbs, 4 wide by 6 high, with a cabinet of relays that would activate the correct bulbs to show the digits 0 through 9. AmTote numerical displays were an innovation in their own right and found widespread use outside of totalisators (the prices on the original set of The Price is Right are a notable example), but the totalisator business was good, and AmTote installed hundreds of machines around the world.




AmTote also brought other innovations. Despite the historical practice of bets on "win" or "place," ATL machines had only handled a single pool of bets for the race. AmTote's more compact machines could afford the extra logic, almost a duplicate machine, to run a second pool. That meant that parimutuel tellers at AmTote ticket machines could select either a "win" or "place" ticket for each horse, and odds and payouts were calculated appropriately. Instead of preprinted tickets, the AmTote machines operated a mechanical stamp to mark the ticket with the bet information including a "code word" that changed for each race to discourage modification or forging of tickets. A central control panel, in the tote house, "locked" the ticket machines to close bets at the start of each race. Later generations of AmTote machines even used modular (connectorized) electrical wiring and connected all ticket machines to a shared bus, which made portable systems much easier to set up and encouraged owners of fixed systems to add more teller stations at locations convenient to bettors.

The totalisers used in AmTote machines are similar to, and presumably derived from, the Strowger rotary or step-by-step switches used in telephone exchanges. They were fitted with solenoids that would advance them not just in single steps, but in units of 5 or 10 steps as well, for atomic issuance of tickets at different denominations. Much else was taken from the telephone industry: the "brains" of an AmTote machine were installed on relay racks, and extensive diagnostic features were fitted, including controls to produce void "test tickets" at each ticket station and a self-test routine that should advance the counters to known values. AmTote even used a distinctly telco power arrangement: motor-generators converted mains power to DC, which floated the lead-acid batteries that actually powered the machine.

ATL was not to be left out of these innovations, and in particular designed an impressive relay-logic system that fit into a truck called the "Totemobile." A number of these systems were built, and ATL operated them as a service for smaller racetracks. For a commission, ATL would send its own staff to your club's track, and wire the Totemobile to ticket machines installed wherever convenient. They took bets, paid out winnings, and at the end of the day they drove away to the next job.

George Julius lived until 1946, by which time he was Sir George Julius and better known for his influential tenure as chairman of Australia's government R&D organization CSIRO. At CSIRO, he turned towards electronics, as did the company he had founded. After the Second World War, ATL started on a major rework of their product line that introduced relay logic and led to the company's expansion into relay and analog computers. The post-war global economy led to new totalisator installations around the world, and with that surge of new interest came new features. "Quinella" bets, that two horses will place first and second but in either order, were newly popular in the 1950s and ATL designed the first machines that could efficiently manage them and display the odds. Also during the 1950s, ATL designed a new generation of ticket machines that issued tickets as small segments of punched paper tape.

In 1966, ATL designed a totalisator for the New York Racing Association with 550 ticket issuing machines, two primary incandescent light bulb odds displays, and twenty smaller odds displays located around the stands. Installed at the Aqueduct racetrack in NYC, the machine handled over $700 million in a season. Here, for the first time, there were no chains, no cables, no rotary selectors. Instead, the tote house was home to two Honeywell 200 computers. A partnership of ATL, Honeywell, and a software firm called Data Trends had reinvented the totalisator as an early online computer application, the computers polling the ticket machines and updating the displays. Over the following years, ATL would adopt the PDP-8 as the core of their totalisators. From 1970 on, totalisators were almost universally computers.

North London's Harringay Stadium, a greyhound track, operated an ATL mechanical totalisator until its closure in 1987. There was, reportedly, a mechanical totalisator operating in Caracas as late as 2005. In the UK, where many smaller greyhound tracks had totalisators installed, there are several examples that are disused but still standing. At many older racetracks in New Zealand, the UK, and Europe, bets are still placed at teller windows under the enormous tote board that once housed an ATL machine.

George Julius changed the "totalisator" from a room full of people to a machine. His company, ATL, would later change it from a machine to an application. The era of electronic totalisators is its own story, with its own ups and downs and anecdotes, one that I plan to cover in better depth one day. In particular, electronic totalisators were a major impetus of improvements in numerical displays. Display systems designed for totalisators went on to be used in all kinds of applications, from scoreboards to transit destination signage. Starting in the 1980s, off-track betting led to nationwide networked totalisator systems and all of the fascinating asides that that entails. AmTote is still around, a major developer of gambling technology. ATL has scattered to the winds, but makes up part of the modern Light & Wonder (formerly Scientific Games) and Sportech.

Former Ellerslie tote house, Haydn & Rollett Architects

Still, I don't find these later machines quite as romantic as Julius's cruder invention. There's something about the fusion of the totalisator as a system, a building, a process, and a machine. The tellers at their windows, pulling on wires to influence the workings of a great machine hidden away in the ceiling.

The lever pulls a long wire cord; the cord releases the counter. The gearing revolves a little, and the counter wheels turn. Sometimes the wheels move at long intervals; and then, as a horse becomes popular, its set of wheels will commence to spin frantically, so that one might imagine the count terribly apt to fail. But the ‘tens’ tot up the units, the ‘hundreds’ go up in their turn; and grand total wheels turn phlegmatic somersaults, and perform mathematical prodigies. (New Zealand Herald)

It is funny, then, that one of the faults of the earliest electronic totalisators was their struggle with arithmetic. But that's a story for later.

  1. The specific notation used here is the British system. For historic reasons, horse tracks in many other countries (including the US) also observe the British system. In other countries or other sports, you will find other conventions used to express the same thing, such as the "moneyline" system used for other sports in the US in which "7/1" would be written "+700" (a $100 bet returns $700 on win). Historically there have been multiple conventions in use, and the confusion that could cause is yet another reason that mechanized standardization proved popular. Keep in mind as well that these conventions all express "net" payments, when it is conventional that winners also have their original stake returned. This means that 7/1 or +700 odds mean that a person who bets $100 and wins will be paid out $800, their original $100 and a gain of $700. This is perhaps more intuitive with the UK system where we can observe that 7/1 odds sum to a "pool" of 8, but perhaps the moneyline system is popular here because we do not like to think about fractions.

  2. I'm simplifying a bit here, but to make a long digression short: parimutuel betting was actually banned for a time in France for running afoul of the gambling regulations. Oller started a new parimutuel business about a decade later, but now "parimutuel" was gone and his cart was called the "totalisator" instead. It seems like he may have effected a rebranding to disassociate his new venture from the old (illegal) one, and like "parimutuel," the strange word "totalisator" is probably from French (a corruption of "totalisateur"). One of the outcomes of this whole thing is that it's actually sort of ambiguous what "totalisator" refers to: the concept of parimutuel betting? the office where it takes place? the business that takes bets? the machine that calculates these odds? All of these were called "totalisators" in various times and places. In modern use, "totalisator" refers almost exclusively to the machine that performs the calculations, yet the word was already in use before these machines were invented.

  3. Following up on the previous footnote, the terminological confusion we have today emerged very early. An 1881 New Zealand patent dispute over a small mechanized parimutuel calculator reveals that different parties in New Zealand seem to have disagreed over what exactly "totalisator" meant and whether or not it was equivalent to "parimutuel." Of course, the person defending the allegedly infringing patent claim had plenty of motivation to argue that they were different, since both parties involved seem to have copied the idea from France.
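The odds conventions in footnote 1 can be checked with a little arithmetic. The following is a minimal sketch with invented function names; it only handles odds of even money or longer (the "+" moneyline form that the footnote uses) and rejects shorter odds rather than guessing at a convention the text does not cover.

```python
from fractions import Fraction


def _net_odds(odds):
    # Parse British fractional odds such as "7/1" into an exact ratio.
    numerator, denominator = odds.split("/")
    return Fraction(int(numerator), int(denominator))


def fractional_to_moneyline(odds):
    """Convert fractional odds like "7/1" to a "+700"-style moneyline."""
    net = _net_odds(odds)
    if net < 1:
        raise ValueError("only even-money or longer odds are handled here")
    # Moneyline states the net gain on a 100-unit stake.
    return f"+{int(net * 100)}"


def total_payout(odds, stake):
    """Amount handed to a winner: the original stake plus the net gain."""
    return stake + stake * _net_odds(odds)


print(fractional_to_moneyline("7/1"))  # +700
print(total_payout("7/1", 100))        # 800
```

Both notations describe the same net gain; the stake comes back on top of it, which is why a $100 bet at 7/1 pays out $800.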

extremely low frequencies

9 May 2026 at 00:00

The submarine is a surprisingly ancient technology—at least in its early, primitive forms. The idea is quite simple, that a well-enough-sealed boat ought to be able to submerge and resurface. It's the practicalities that make the whole thing difficult. It is generally considered that the US Civil War was the first use of submarines in combat; these were primitive machines with very limited operating endurance and navigational capabilities. These submarines were more like torpedoes: you pointed them in the right direction and hoped they went straight.

The First World War benefited from tremendous advances in submarine technology. A number of experimental designs during the 19th century had built practical experience, especially in Germany, and the Germans' apt use of the first modern "U-boats" had a significant military impact. British and US designs made similar advances, and submarine warfare was born.

The chief advantage of the submarine is its ability to submerge and maneuver while hidden. WW1 submarines were diesel-electric or gasoline, so their submerged endurance was limited by the power supply stored onboard. Still, these submarines could operate underwater longer than any before, long enough to establish the submarine sneak attack as a key part of naval warfare.

It was also long enough to expose one of the trickiest challenges of underwater defense: communications. Water, especially seawater, is dense and conductive. This is very bad for radio wave propagation: by the First World War it had already been discovered that seawater effectively blocked radio communications. HF radio, the main form of communications at sea (and, in the WW1 era, in general), might only penetrate seawater for a few meters in real-world conditions. That meant that submarines had to surface in order to communicate, another de facto limitation on their endurance while submerged.

The Navy had been evaluating electronic communication aboard ships since 1887, when they demonstrated a simple and "radio-adjacent" technology using conduction of waves through the seawater itself. This scheme never worked very well, but was saved by the development of modern wireless transmitters late in that century. Marconi himself demonstrated radio to the Navy in 1899, and in 1903 the Navy bought its first radio sets. Tactical reports from conflicts elsewhere on the globe, like the Russo-Japanese war, reinforced the idea that radio would serve a key role in naval combat.

When C-class submarines Stingray and Tarpon, and D-class Narwhal, launched in 1909, they were immediately given duties including the evaluation of radio equipment. In a classic tale of early technology, the evaluations went poorly. Tarpon ran into mechanical trouble that prevented its planned trial voyage, so the radio set was never installed. Stingray received a cutting-edge quenched spark gap transmitter and receiver set, but the transmitter turned out to be DOA. Still, Stingray was able to demonstrate its receivers, copying a message from the nearby Boston Navy Yard while surfaced.

ELF concept

Narwhal's mission was more ambitious: underwater communication. A test was made on the same direct conduction technology, using brass plates suspended below the ships, demonstrated in 1887. It similarly failed to perform. A repetition of those experiments, done the next year and with improved equipment aboard Narwhal's sister ship Grayling, produced better results. The system provided reliable communications with the "antenna" plates submerged as much as two feet below the water... and no deeper. Frustrated Navy engineers concluded that it was possible to get radio signals through seawater, but not practical.

Through the First World War and following decades, engineers focused on ways to get the antenna to the surface without having to bring up the entire submarine. Around 1915, the Navy adopted a floating antenna buoy that a submarine could "winch up" towards the surface on a cable. Putting anything at the surface was less than ideal, but with the anti-submarine technology of the era, the small antenna buoy was still very difficult to detect at long range. Submarines just had to make sure it was retracted back to the submarine's deck before attempting anything where stealth was key. These floating buoys were not reliable during WW1, but they could work, and the technology has continued to develop to this day.

Still, there were other ideas about underwater communications. The most important development came from two engineers of the National Bureau of Standards (NBS), or at least, that's what a court ruled after a patent dispute between two sets of supposed inventors. John Willoughby was employed by the NBS, which would later be known as the National Institute of Standards and Technology (NIST), to investigate new types of radio receivers. In the summer of 1917, he was arranging various types of coil antennas at a receiver test site on the Chesapeake Bay when he accidentally dropped one of the antennas into the water. Strangely enough, the radio receiver connected to the antenna continued to provide good reception even as it sank into the bay.

NBS management was not especially enthusiastic about this accident, but Willoughby was. He knew that the Navy was investigating means of communication with submarines, and that seawater seemed to block radio waves, all of which suggested that he might have stumbled on an important discovery. Lacking NBS support for further research, he took the idea to gifted radio inventor and NBS colleague Percival Lowell 1. In a fine tradition of innovation, the two took to Willoughby's basement for a series of experiments that illuminated the underlying phenomenon: Willoughby had been experimenting with unusually low radio frequencies, below 30kHz where wavelengths become too long for most antenna designs and coils become the best receivers. These lower frequencies were significantly less affected by water than higher, more conventional frequencies, and Willoughby and Lowell built a successful prototype for what they called "long-wave" radio between two coils.

The NBS remained surprisingly uninterested, but Willoughby had a contact in the Navy who felt quite differently. In 1918, Willoughby and Lowell joined LtCdr H. P. LeClair, then running the Navy's experimental radio program, at submarine base New London (so named after New London, Connecticut, across the Thames River from the base). They made a hurried and rough installation of their equipment on submarine D-1 and a surface support vessel. Not everything went perfectly, but they proved the idea: Willoughby, Lowell, and LeClair listened attentively to their radio sets as the D-1 submerged and continued to come in loud and clear.

Within a matter of a few years, the Navy accepted long-wave radio as a standard technology for submarine communications. The various jury-rigged installations at New London showed that coil antennas could easily be integrated into a submarine's rigging, and even better, the Navy had found that long-wave radio propagated over the surface as well as under it. Long-wave communications would serve the entire Navy, and a transmitter site was already underway.

Long-range communications had become a top concern throughout the military in the early 20th century, and a series of meetings between US military branches and between the US and UK led to a scheme of "High Power" radio stations. The first of these, NAA, went up near Arlington, Virginia in 1913. Over the following years, similar stations were built in the US and Europe, facilitating the first direct communications between the two and the first transatlantic voice communication in 1915. The construction and operation of these stations also led to considerable advances in radio technology generally, especially powerful transmitters. NAA was one of the early stations to be equipped with Poulsen arc transmitters, almost two times more efficient than earlier designs and well-suited to long-wave operation.

Around the same time as the Willoughby/Lowell experiments, Navy engineer LtCdr Albert Taylor found similar results with long-wire antennas shallowly under the water. These experiments offered another design for concealed submarine antennas (which could be stored onboard in reels and let out with floats that kept them just under the surface), and also demonstrated that long-wire antennas could be buried for transmit use.

In 1918, five years after NAA went up, construction was underway on NSS, a new high power station in Annapolis, Maryland. Unlike those before, NSS was specifically designed for long-wave signals: two 500 kW Poulsen arc transmitters drove an antenna 400' square, suspended between four 500' tall towers 2. The long-wave capability at Annapolis was not originally intended for submarine communications, but it quickly fell into that niche. During the 1920s, NSS became a key station for command and control of submarines.

NSS itself remained in service until 1996, and it was joined by VLF transmitters at Cutler, Maine; Jim Creek, Washington; Lualualei, Hawaii; LaMoure, North Dakota; and Aguada, Puerto Rico; besides sites in Europe operated with allied militaries. Each of these stations is its own interesting story. The 1,205' VLF antenna tower at Aguada remains the tallest structure in the Caribbean. LaMoure was originally built in the 1960s for a long-wave navigation system called Omega, and was repurposed for submarine C2. Jim Creek went into service in 1952 as the most powerful radio transmitter in the world, using a fascinating antenna that draped from one ridge to another across a mountain valley.

Let's focus, though, on Cutler. VLF Transmitter Cutler is the spiritual descendant of the Navy's original High Power program, symbolized in its inheritance of the callsign NAA. Cutler was part of a Cold War expansion of the VLF system, going into service in 1961. Many other VLF sites received upgrades around the same period, but Cutler was a completely new design. Cutler's two antennas, for redundancy, are each supported by 13 towers. The center tower is about 1,000' tall, and the other 12 make up two concentric rings of about 900' height. The complete antenna is over 6,000' across, or nearly 2 km. Between the tower tops stretches a web of tight horizontal wires, each 1" copper, that form an enormous capacitor. The capacitor's other plate is the ground, electrically reinforced by many miles of buried groundplane wires. The radiating elements are vertical wires, hanging down from the upper horizontal mesh.

Cobweb antenna

In Maine's harsh winters, the wires accumulate ice until their weight threatens the towers. Each antenna is alternately switched into a deicing mode in which it is turned into a 3 MW heating element... just for long enough that the ice melts off. Outer towers are supplemented by short, stout structures that allow the 220 ton tension weights to move up and down on tracks. "Helix houses" at the feedlines of the two antennas sheltered enormous inductors; walls lined with copper served as insulation and to ground the occasional arcs that made the helix houses and transmitter rooms unsafe to enter during operation.

The two antennas were driven by a transmitter complex designed and built by Continental Electronics. The 11 MW on-site power plant supplied the AN/FRT-31 transmitter, custom to this installation, consisting of four parallel units of eight ML-6697 transmitter tubes. The transmitter's control room rivaled that of many power plants, as did its output: the military required at least 1 MW, Continental rated the transmitter for just over 2 MW, and it still operates today at powers as high as 1.8 MW. There are several reasons that the "most powerful radio station in the world" is now difficult to pin down, but NAA Cutler is certainly in the running.

That is the end of the VLF story, in that it hasn't ended. The original 1910s and 1920s VLF sites are mostly decommissioned, but only because they have been replaced by more modern equipment, sometimes on the same site. Cutler, Jim Creek, Lualualei, and Aguada are all still in service. LaMoure may be in some kind of mothballed state but is definitely capable of operating; it has recently seen some use for propagation experiments. VLF is still a key technology in the Navy's C2 and nuclear reprisal plans. So, we can say that VLF has achieved one of the great feats of technical history: it has outlived its replacement.

First, though, we should spend some more time on the theory. In modern parlance, "VLF" describes the band from 3-30 kHz. Most Naval VLF stations operate at around 24 kHz, but some stations support lower frequencies as well and other stations have operated as high as 40 kHz (still considered VLF by the Navy for practical purposes). These wavelengths pass through seawater well because of a basic trait of radio waves that was becoming experimentally apparent in the 1920s and received a thorough theoretical underpinning later. Radio waves attenuate as they pass through materials in proportion to the number of wavelengths in the material. In other words, as a rule of thumb, a radio wave with a 12 m wavelength (~24 MHz) will experience about 1,000 times the attenuation of a signal with a 12,000 m wavelength (~24 kHz). This is true of water or air or any other material, but the attenuation rate in saltwater is so high that the effect is extremely apparent in the sea.

This brings us to our first property of VLF: because of the long wavelength of VLF signals, they pass through water with relatively little attenuation. Still, there is a limit. The details of submarine communications are mostly classified, but from open materials it is realistic for a submarine to receive a VLF transmission up to about 100' below the surface. This depth is already far better than what's achievable with HF, and far superior to deploying a floating buoy. Still, intuition dictates that even lower frequencies could be even better, and the Navy did not go without noticing that possibility.
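To put rough numbers on that depth limit, one common textbook approximation is the skin depth of a good conductor: the distance over which field strength drops to 1/e. The sketch below applies it to seawater. The conductivity of 4 S/m is an assumed typical value, not a figure from the Navy or from the sources above, and the function names are only illustrative.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, H/m
SEAWATER_SIGMA = 4.0        # S/m, assumed typical seawater conductivity

def skin_depth_m(freq_hz, sigma=SEAWATER_SIGMA):
    """Depth at which field strength falls to 1/e (good-conductor approximation)."""
    omega = 2 * math.pi * freq_hz
    return math.sqrt(2.0 / (omega * MU_0 * sigma))

def attenuation_db(depth_m, freq_hz, sigma=SEAWATER_SIGMA):
    """Field attenuation in dB after travelling depth_m into the water."""
    nepers = depth_m / skin_depth_m(freq_hz, sigma)
    return 20 * math.log10(math.e) * nepers

DEPTH_M = 30.48  # 100 feet, the depth quoted in the text
for label, freq in [("HF, 24 MHz", 24e6), ("VLF, 24 kHz", 24e3)]:
    print(f"{label}: skin depth {skin_depth_m(freq):.3f} m, "
          f"loss at 100 ft about {attenuation_db(DEPTH_M, freq):.0f} dB")
```

Under these assumptions the 24 kHz skin depth comes out near 1.6 m, so a receiver 100' down is working through many skin depths of water. The real link budget depends on transmitter power, noise, and receiver design, none of which this sketch models.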

Second, we should revisit the antennas. One of the key insights of early experimenters like Willoughby and Lowell is that coil antennas create an asymmetry in radio communications. Antennas become more efficient as they reach the wavelength of the signal, or multiples thereof. That means that lower frequencies, and longer wavelengths, require larger antennas—thus the 6,000' wide cobwebs at Cutler and more than one regional height record set by VLF antenna towers. On the other hand, coil antennas, or more specifically magnetic loop antennas, can be very small compared to the wavelength they receive.

Unfortunately, the physics trick that makes magnetic loop antennas work so well (magnetic coupling) is basically one-way. Magnetic loop antennas are relatively inefficient but usable for reception; they're completely useless for transmitting. VLF is effectively a one-way technology, and some of the traffic carried by the Navy's VLF network consists simply of orders for submarines to surface or deploy a buoy for more advanced communications.

Finally, we should observe that the capacity of a radio channel to carry information is proportional to its bandwidth, and that the use of lower frequencies and longer wavelengths makes the usable bandwidth of given radio equipment much smaller (we can intuitively understand this by noting that larger antennas are, simply due to scaling, more precisely tuned to their intended wavelength than smaller antennas). VLF transmitters are only capable of very narrow transmissions, functionally limiting them to continuous wave (Morse code) operation or simple digital schemes at very low speeds.
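The bandwidth claim can be made concrete with the Shannon-Hartley bound, which says capacity scales linearly with bandwidth at a fixed signal-to-noise ratio. This is a minimal sketch: the 100 Hz and 3 kHz bandwidths and the SNR of 10 are hypothetical round numbers chosen for illustration, not specifications of any Navy system.

```python
import math

def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Upper bound on error-free bit rate for a noisy channel (Shannon-Hartley)."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

SNR = 10.0  # assumed linear signal-to-noise ratio (10 dB), illustrative only

channels = [
    ("narrow VLF-style channel (assumed 100 Hz)", 100.0),
    ("HF voice-width channel (assumed 3 kHz)", 3000.0),
]
for label, bandwidth in channels:
    print(f"{label}: at most {shannon_capacity_bps(bandwidth, SNR):.0f} bit/s")
```

At the same SNR, the wider channel carries 30 times the data, which is why a narrow transmitter ends up limited to Morse or slow digital modes.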

We probably all realize, as did the Navy, that pushing to yet lower frequencies and longer wavelengths would produce better penetration of the seawater, at the cost of basically every other property becoming worse: larger antennas, less efficient transmitters and receivers, narrower bandwidths. The possibility of going even further, from Very Low Frequency to Extremely Low Frequency, was just a solution in wait of a problem. The military had a lot of those, and the Cold War was one huge problem.

Valley span antenna

The idea of a nuclear-powered submarine is almost as old as the nuclear program, and a collaboration between the Navy, the Atomic Energy Commission, and famed admiral Hyman Rickover led to the 1954 launch of nuclear-powered submarine Nautilus. The next decade gave the Electric Boat Company new meaning, as nuclear propulsion displaced diesel in the US submarine fleet and fundamentally changed the strategy of submarine warfare. Nuclear submarines, unlike those using diesel-electric or gasoline propulsion, can be set up to remain submerged almost indefinitely. The reactor does not require air, and provides plentiful power for life support equipment that mitigates the fresh air requirement for everything else. This created a generational change: by some definitions, all pre-nuclear submarines were merely submersibles, ships designed to submerge only temporarily. The nuclear submarine was a new kind of creature, one that not only visited the depths but could live there.

Add in the development of submarine-launched ballistic missiles (SLBMs), which enabled a submarine to direct nuclear weapons at targets on shore with shorter travel time than any other means of delivery. Every submarine became a portable missile silo, one that could not only hide but actively evade detection. Their ideal mission was to lurk, undetected, for extended periods of time.

Of course, this new potential for submarines further stressed communications infrastructure. A nuclear submarine might spend weeks submerged in water that is ostensibly controlled by another nation, making stealth critical. Such a submarine doesn't want to remain close to the surface, which makes detection by all means easier, and also doesn't want to deploy floating buoys or antennas that are easily detected by modern radar. On the other hand, for it to have any value as a nuclear deterrent, the Navy needs some way to deliver a launch order without having to wait for the next duty rotation.

The military spent the early Cold War developing a dozen different systems for survivable delivery of nuclear war orders, things like the High Frequency Global Communications System (HFGCS) and TACAMO that solidified the concept of short, simple, one-way Emergency Action Messages to direct nuclear forces. The Navy needed a way to deliver EAMs to submerged submarines, and that provided the impetus to investigate lower frequencies than ever before.

The lowest generally recognized radio band, ITU band 1, is Extremely Low Frequency or ELF. There is some historic complexity around the definition of ELF, and the modern range of 3-30 Hz does not exactly match the way the Navy has used the term. In general, though, we can consider ELF to refer to the very bottom end of the usable radio spectrum. The extreme lower edge could be said to fall around 7 Hz, where the wavelength of a radio signal matches the circumference of the earth. This leads not only to complex interference problems due to constructive and destructive interactions, it also produces a very high noise floor as global lightning storms trigger perturbances that resonate on and on. Balancing the desire for the lowest possible frequency against the practical challenges of ELF, the Navy settled on the range of 72-80 Hz as the most promising window for submerged submarines.

The history of Naval ELF development is not simple to research. First, the Navy conducted much of its ELF research in secrecy, a result of typical Cold War paranoia and an awareness that the Soviet Union was pursuing a similar idea. Second, much like GWEN, ELF became the locus of fervent public opposition grounded in general anti-war sentiment, demands for nuclear disarmament, and the safety of electromagnetic radiation. Many of the readily available sources on ELF history today come from "electrosensitive" advocates or newsletters, a still-strong movement founded on the mostly unscientific premise that EM fields pose a danger to human health. While mostly factually accurate, these sources require some caution since they tend to mix their historical narrative with observations about EM and RF safety that are now broadly considered pseudoscientific. Still, this frustration leads to two positive outcomes: first, it helps to place the development of ELF radio within a broader cultural context of uncertainty about both war and new technology, emphasizes the unknowns involved in the push to ELF, and makes the ELF stations an interesting focus of the anti-war movement. Second, it leads to a personal connection that likely contributed a great deal to my interest in military communications.

There are rumors, even scant evidence, that the Navy initiated classified experiments with ELF in the late 1950s. There is very little that I can say about this first part of ELF history, besides that the experiments must have had promising results. In 1968, the Navy adopted a full-scale ELF communications plan called Project Sanguine.

The original Sanguine proposal was truly an artifact of the Cold War, remarkable in its scale and doomed to obsolescence before construction even began. The Sanguine ELF station would actually be over one hundred independent transmitting stations, operating in synchronization as a form of hardening. The loss of a subset of those stations, say due to nuclear attack, would only reduce power rather than disabling the entire facility. Of course, to maximize survivability of the individual transmitters, they would all be installed in hardened underground bunkers, each with a set of 2" antenna cables extending 40 or more miles in four directions. The overall layout of stations and antennas created a grid with antenna elements spaced every 3-5 miles, covering a total of some 6,500 square miles. That's larger than Connecticut, but smaller than New Jersey. Perhaps more apropos, it is about 1/10 the area of Wisconsin, the state where the Navy planned to install the system [3].

This underscores a fundamental problem with ELF: antenna sizes. At 80 Hz, the wavelength of a radio wave is 2,300 miles, or about one quarter of the diameter of the earth. Take, for example, a half-wave dipole antenna—a very common antenna design in most bands. For ELF, the antenna would need to stretch from Albuquerque to Portland. Clearly, then, any practical ELF antenna needs to be "electrically short" or, in the relative sense of RF engineering, a small antenna. Small antennas are inefficient, and the smaller they get the less efficient they are. Complicating things further, practical ELF propagation over the surface of the earth requires vertically polarized waves. That means a vertically polarized antenna, and there is simply no way to construct a tower that is hundreds of miles tall.
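The arithmetic behind those antenna sizes is short enough to check directly. The sketch below uses only the free-space relation between wavelength and frequency; the function names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
METERS_PER_MILE = 1609.344

def wavelength_miles(freq_hz):
    """Free-space wavelength in statute miles."""
    return SPEED_OF_LIGHT_M_S / freq_hz / METERS_PER_MILE

def half_wave_dipole_miles(freq_hz):
    """Length of an ideal half-wave dipole, ignoring end effects."""
    return wavelength_miles(freq_hz) / 2

for label, freq in [("ELF, 80 Hz", 80.0), ("VLF, 24 kHz", 24e3)]:
    print(f"{label}: wavelength {wavelength_miles(freq):,.1f} mi, "
          f"half-wave dipole {half_wave_dipole_miles(freq):,.1f} mi")
```

At 80 Hz the half-wave dipole works out to well over a thousand miles, while the same design at 24 kHz is only a few miles long, which is why VLF sites can at least approach a resonant size and ELF sites cannot.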

Sanguine proposed, and most later ELF projects adopted, a style of antenna called a ground dipole. A ground dipole is basically two different electrodes, or grounding rods, driven into the ground a great distance apart and connected by feedlines. The power from the transmitter goes through the electrodes into the ground, where it flows as ground current from one end of the antenna to the other. The ground dipole thus forms a loop, with the feedlines as one side and the ground as the other. The actual RF emission results from the magnetic field between the feedlines above ground and the current flowing beneath, somewhat like the VLF antenna at Annapolis if half of it was buried beneath the ground.

Ground dipoles, like a typical dipole antenna, are directional. They emit RF most strongly in the same axis as the antenna, with strong lobes extending away from the ends of the two feedlines. By installing a second antenna on a perpendicular axis and shifting the phase between the two, you can create a steerable antenna with its strongest lobes pointed in the direction of your choice. That's why the Sanguine proposal, and most ELF transmitters after, have used two ground dipoles in a crosswise layout.
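The steering idea can be sketched with a toy model. It assumes each element has an idealized cosine pattern peaked along its own axis, which is a simplification and not a description of the Navy's actual antennas, and it treats both the relative amplitude and the relative phase of the two drive currents as adjustable. All names here are hypothetical.

```python
import cmath
import math

def relative_field(bearing_deg, amp_a=1.0, amp_b=1.0, phase_deg=0.0):
    """Relative field magnitude at a compass bearing for two crossed elements.

    Element A lies along 0/180 degrees, element B along 90/270 degrees.
    Assumption: each contributes a cosine-shaped pattern along its own axis.
    """
    theta = math.radians(bearing_deg)
    element_a = amp_a * math.cos(theta)
    element_b = amp_b * cmath.exp(1j * math.radians(phase_deg)) * math.sin(theta)
    return abs(element_a + element_b)

def peak_bearing(amp_a=1.0, amp_b=1.0, phase_deg=0.0):
    """Bearing (0-179 degrees) of the strongest lobe; the pattern repeats at +180."""
    return max(range(180), key=lambda b: relative_field(b, amp_a, amp_b, phase_deg))

print(peak_bearing(1.0, 1.0, 0.0))     # equal in-phase currents
print(peak_bearing(1.0, 1.0, 180.0))   # second element inverted
print(peak_bearing(math.cos(math.radians(30)), math.sin(math.radians(30))))
```

In this model, equal in-phase currents point the lobe at 45 degrees, inverting one element swings it to 135 degrees, and unequal amplitudes place it anywhere in between. A 90 degree phase shift makes the pattern uniform in every direction instead of steering it.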


During the 1960s, the Navy performed a series of poorly documented experiments to establish the feasibility of Sanguine. These included a Wyoming power transmission line that was temporarily disconnected for use as an ad-hoc 40 mile antenna, and a power-line-like 110 mile antenna built by RCA in North Carolina and Virginia. The details of this RCA experiment, part of Project Pangloss, have become obscure. It appears that RCA was contracted to evaluate a number of different communications options for the Navy, including the use of other planets in the solar system as passive repeaters, but most of them didn't work out. The VLF transmitter for the project was located at Ararat, North Carolina, and the two electrodes at Algoma, Virginia and Lake Lookout, North Carolina. A 1963 test successfully got a message from the test antenna to a submarine submerged 150' deep and 520 miles from the transmitter.

Like most of the military's ambitious plans in the late 1960s, Project Sanguine didn't happen. The reasons are complex, or at least several. Sanguine was unpopular with the public: besides specific concerns around safety, the late '60s saw a rising anti-nuclear campaign and a general lack of interest in enormously expensive military undertakings. The fact that Sanguine needed a massive amount of land meant that it was pretty much impossible to site it somewhere that wouldn't generate local opposition, so like ICBM fields, Sanguine was kicked around like a football. Originally planned for Wisconsin, it later shifted to Texas, and Texas didn't like it that much either (although by that point the antenna field had been downsized to just 1,600 to 3,200 square miles). And, of course, the technology was struggling to keep up with the threat landscape. The hardened design of Sanguine relied mostly on the idea that the Soviet Union couldn't possibly nuke most of the transmitters distributed over 6,500 square miles, a reassurance that the development of multiple independent reentry vehicles (MIRVs) seriously undermined.

As public opposition formed, a health and safety review commissioned by the Navy resulted in a noncommittal report that did little to reassure the public (and lawmakers) that the plan was safe. Last of all, but certainly not least, the budget projections for Sanguine were formidable, and Congress did not have the appetite for the spending.

Sanguine made it far enough that, during 1968, the Navy and RCA built a scaled down transmitter and antenna in the Chequamegon National Forest of Wisconsin. This came to be known as the Wisconsin Test Facility, and it was used as a transmitter for a series of jamming tests in the late '60s and early '70s. During this period, the Navy also considered the use of a BPA transmission line from The Dalles, Oregon to Los Angeles as an ELF transmitter—the plan being to actually modulate messages onto the 60 Hz AC power carried by the line, which was incidentally radiated due to the line's largely straight 850 mile span. This plan was called PISCES, and it is unclear if it ever went anywhere, although an interesting rumor holds that it was operational for a short period and used as the "jammer" transmitter for jamming susceptibility testing of the Wisconsin transmitter.

The results of these tests were mostly positive, but that wasn't enough to save an unpopular plan. Sanguine faded away, perhaps replaced by a scaled-down system called Super Hard ELF or SHELF. There is very little information on SHELF today. The idea seems to have been to install an ELF antenna in deep underground shafts (potentially over a mile below the surface) using hard-rock mining techniques. Work on SHELF apparently continued through the 1970s, but it probably never got beyond the feasibility stage.

Instead, the Navy shifted its focus to Project Seafarer. Seafarer was clearly a direct descendant of Sanguine, but addressed many of its biggest problems through a stripped down design. Seafarer transmitters, for example, would be located in surface buildings instead of underground. Still, the same basic antenna design remained, a grid on 3.5 mile spacing requiring about 4,700 square miles. The Nevada Test Site was considered as a location, as was White Sands Missile Range and forestland in the Upper Peninsula of Michigan. Michigan was ultimately selected, a result of favorable ground conditions and the lack of frequent large explosions. Seafarer construction was expected to begin in 1977, but instead it ended. The governor of Michigan shot the idea down, Congress didn't like it all that much, and President Carter signed the order ending work on not only Seafarer but ELF in general. In 1977, after roughly two decades of R&D work across multiple experimental sites, the ELF Program was in mothballs.

The Navy was not so easily dissuaded. Later in 1977, they proposed "Austere ELF," a plan to throw together an ELF transmit site more or less from spare parts. A transmitter at Sawyer AFB in Michigan's Upper Peninsula would feed 32, 45, and 53-mile-long antenna elements, and via a leased telephone line the AFB would also control the inactive Wisconsin Test Facility transmitter. Even this basic, partially spare-parts plan fell afoul of the public and Congress. It failed to address most of the original health and environmental concerns, and still cost too much.

Serious resumption of the ELF program would have to wait for President Ronald Reagan. Reagan was a fan of big, expensive, technically sophisticated solutions to Cold War problems, and ELF sure was one of those. Reagan approved "Project ELF," itself a scaled down version of Austere ELF. Project ELF used the existing Wisconsin Test Facility, supplemented by an identical 56-mile antenna in Michigan's Escanaba State Forest. Both would be operated by Sawyer AFB.

The Wisconsin Test Facility from Project Sanguine, after 20 years, came to be known as Navy Radio Transmitter Clam Lake: the first operational ELF transmitter. The Michigan site, known as Navy Radio Transmitter Republic, quickly joined it.

It's amusing that a temporary test facility ultimately became the final product, but the Navy had already invested a huge amount of effort in the Wisconsin transmitter. Everything from the strength of the EM field produced by the transmitter to its location in a National Forest had posed complications.

Although Sanguine was intended as a hardened, underground system, burying antennas was a lot of work and the Wisconsin Test Facility had originally been temporary. Instead of buried cables, it used 1/2" aluminum wires strung above ground on utility poles for the two antennas. The voltages on the antenna wires required isolation from the surrounding environment, so as with power lines, trees were cleared to make a right of way for the antenna cables. The Forest Service, concerned about aesthetic impact on the forest's recreational areas, required that the antenna routes avoid some parts of the forest and take right-angle jogs near roads so that it was not possible to see a considerable distance down the antenna ROW when driving past (which would make the existence of the cleared ROW much more obvious). The transmitter site and antenna ROWs are still clearly visible today. At each of the four ends, about seven miles from the transmitter building, around 10,000 feet of buried copper wire make up the electrode.

Trickier were the electrical problems. The ELF antennas could induce a significant potential in parallel electrical lines, and the use of ground return meant a lot of interference on telephone lines. When transmitting, which was ultimately the case 24/7, the 2.6 MW transmitter induced a current of about 300 A in the cables and ground. Understanding these impacts of ELF transmitters was actually one of the original purposes of the Wisconsin Test Facility, and the Navy had built model power and telephone lines parallel to the antenna elements. The ELF system was found to cause problems ranging from flickering light bulbs to phantom telephone ringing, and the Navy installed additional grounding and filtering on public utilities throughout the area at its own expense—even reimbursing the utilities for administrative costs related to customer complaints. Still, the interference problems were not fully solved during the test operations and no doubt contributed to the public's less than enthusiastic support.

The former Wisconsin Test Facility, as Clam Lake, became operational in 1985. Its sister site, Republic in Michigan, followed later in the decade. Republic was new construction, not an old experimental facility, but for cost and expediency reasons it was a virtually identical design to Clam Lake with above ground wires to buried electrode screens. Because of geographical constraints, the Republic antenna is not in a straightforward cross configuration. Instead, it's more of an "F" shape, electrically equivalent but with the feedlines placed differently. From 1989 on, the two sites operated in synchronization, with their total 2.6 MW operational transmitter power producing a radiated power of about eight watts.

Yes, even at 14 miles in length, ELF ground dipoles are extremely inefficient. This remained a key problem with ELF. Early Navy ELF plans, like Project Sanguine, had assumed the use of extremely high transmit powers to produce a usable signal. ELF propagates very well, but at the paltry 8 W achieved by the Project ELF transmitters, practical reception still required extracting the transmitted signal from a noise floor that was just about as loud. That meant reducing the practical bandwidth of the system even further, and thus its speed. Project Sanguine would probably have been able to transmit EAMs directly to submarines; Project ELF was not. Even the compact format of EAMs was too long for a system with an effective symbol rate of about one letter per five minutes, or fifteen minutes to transmit the three-letter code groups used by the Navy.
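The figures quoted here are easy to turn into a worked example. The sketch below uses only the 2.6 MW, 8 W, and five-minutes-per-letter numbers from the text; the 30-letter message length is a hypothetical stand-in, since the actual length of an EAM is not given.

```python
import math

INPUT_POWER_W = 2.6e6      # operational transmitter power quoted in the text
RADIATED_POWER_W = 8.0     # approximate radiated power quoted in the text
MINUTES_PER_LETTER = 5     # effective symbol rate quoted in the text

def radiation_efficiency_db(radiated_w, input_w):
    """Radiated power relative to input power, in decibels."""
    return 10 * math.log10(radiated_w / input_w)

def transmit_minutes(n_letters, minutes_per_letter=MINUTES_PER_LETTER):
    """Time on air for a message of n_letters at the quoted symbol rate."""
    return n_letters * minutes_per_letter

ratio = RADIATED_POWER_W / INPUT_POWER_W
print(f"efficiency: {ratio:.1e} "
      f"({radiation_efficiency_db(RADIATED_POWER_W, INPUT_POWER_W):.1f} dB)")
print(f"three-letter code group: {transmit_minutes(3)} minutes")
print(f"30-letter message (hypothetical length): {transmit_minutes(30) / 60:.1f} hours")
```

Roughly three millionths of the input power leaves the antenna as signal, and anything much longer than a code group would take hours to send.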

This reduced ELF capability was basically a very fancy pager network. The Navy has not disclosed the details of the scheme, but it's probably something like this: each submarine has a three-letter code group assigned to it. When its ELF receiver detects that specific code group, the submarine crew know that there is a message waiting for them, and they have to move at least close enough to the surface for VLF in order to find out what that message is. The Navy often referred to this as "bell ringing:" ELF messages were like the ringing of a telephone. As a means of supervision, so that submarines knew they were capable of receiving a message, "idle" code groups were transmitted 24/7.

For how hard the Navy had fought to build it, Project ELF did not have a long life. The Navy's ELF submarine communications system was conceived around 1958, became operational over 30 years later in 1989, and shut down in 2004 after just 15 years of service. "The Nuclear Register," an anti-nuclear-weapons newsletter, put it like this: "A surprise Navy announcement signaled the end of 36 years of first local, then global, opposition to the Navy's giant transmitter system."

ELF overcame formidable political odds. Besides Congress's lack of interest in the expense and federal policy concerns around health and the environment, a statewide ballot referendum in Michigan had attempted to prohibit construction, and legislation prohibiting ELF transmitters was perennially introduced in Congress. Activist groups opposed to the transmitters staged regular demonstrations and, as Project ELF proceeded despite their objections, protests gave way to civil disobedience. Utility poles supporting the ELF cables were cut on numerous occasions, and the transmitter buildings vandalized. "The Nuclear Register" wrote:

Nukewatch said the Navy's closure announcement, while welcome, raises more questions than it answers. The Navy said "improved technologies" and "changing requirements of today's Navy" made ELF obsolete. However, "very-low-frequency (ELF) [sic] alternatives to ELF have been around for 30 years and the 'changing requirements' refer to the end of the cold war that happened 14 years ago," LaForge said.

Indeed, it is hard for me to see the undignified closure of the Navy's ELF program as anything other than an admission of failure. The basic technical concept of ELF appears sound, but the transmitters are large, disruptive, and costly to operate. It is not clear that the advantages of ELF, namely the greater depth at which it can be received, outweigh its downsides or compare well to VLF.

VLF is still used by the US Navy today. ELF is not: the US has had no ELF capability since the 2004 closure of Clam Lake and Republic. China, India, and Russia are the only other nations to have constructed ELF transmitters. The Russian system, ZEVS, operates at 82 Hz from a ground dipole antenna in place since at least the early 1990s. It is a candidate for the most powerful radio transmitter in the world, although the exact specifications have not been made public. India's INS Kattabomman gained an ELF transmitter in the 2010s, and while few details are known, China is believed to have constructed an enormous ELF transmitter in Huazhong during the 2010s.

It is, of course, interesting that China and India have both built an ELF capability after the US abandoned the technology. One wonders what made an ELF capability so hard to sustain here, even after the Clam Lake and Republic sites were built. Well, there is an inertia to politics: the organized opposition to ELF, once energized, didn't go away. Area residents and politicians continued to organize for the closure of the Wisconsin and Michigan transmitters until their final days.

Opponents of the ELF sites got plenty of help from both science and popular culture. Preliminary research linking ELF radiation to leukemia has not held up to modern scrutiny, but as with broader EM/RF cancer links this is an area of ongoing controversy. Extensive research by the Navy, mostly on the Clam Lake Site, hasn't found evidence of ecological disruption due to the ELF transmitter. Still, there is ongoing controversy, and one of the reasons for Project ELF's long and tortuous construction process was a series of lawsuits and appeals under the National Environmental Policy Act, contesting the thoroughness of the environmental research.

As usual, these possible connections to health and environmental impacts have given way to conspiracy theories. In the more shadowy corners of the internet, ELF is associated with everything from strange sensations to mind control. And that is where I first became involved.

The X-Files episode "Drive" (S06E02) sees Fox Mulder cornered, practically carjacked, by a man who insists that if he does not drive West then his head will explode. The episode aired four years after the release of Speed and no doubt owes inspiration to that film (Mulder even makes a joke about it in the episode), but it attributes the bizarre scenario to a very different cause. The hapless victim, portrayed by Bryan Cranston, gained his head-exploding illness as a result of some sort of military experiment involving long antennas secretly buried beneath his house. Vince Gilligan wrote the episode, and while there were several influences, the final episode is a direct reference to Project ELF and the surrounding controversy. Years later, because of their collaboration on "Drive," Vince Gilligan cast Cranston as the lead in his show Breaking Bad.

In the episode, Cranston doesn't make it to the West Coast. Mulder and Scully hatch a plan to puncture his inner ear and relieve the pressure building in his brain somewhere on the California coast, but Mulder just can't drive fast enough. Cranston's head explodes.

Clam Lake transmitter

Over the lifespan of the Project ELF facilities, police issued 636 trespass citations to demonstrators. Congressional representatives introduced legislation and amendments to end the ELF program multiple times. At least a half dozen ELF transmitter concepts were canceled, each one less ambitious than the ones before it. ELF is an interesting technology, but in a way, it's more interesting as a case study in military acquisition.

Take a concept that is expensive, politically unpopular, and questionably superior to systems already in service—but if the military wants it, they tend to eventually get it. After thirty years, the military wears resistance down and gets something pushed through. Fifteen years later, the Navy shrugs, calls it obsolete, and shuts it down. What's left is a 14-mile-across "X" in the forests of Wisconsin, a legacy of controversy that still echoes, and a pretty good episode of The X-Files.

  1. Unrelated to astronomer Percival Lowell, although there are enough moments of intersection between the two that you wonder if they might have met.

  2. Many of the fine details of the original NSS installation have become confused, probably because the Navy upgraded the equipment several times in its first decades and the specifications of different eras have been mixed together. Here are some notes: some sources give the transmitters as 350 kW, others as 500 kW. A Navy history explains that improvements to the antenna design allowed for raising the power after the site was originally designed, so 500 kW is indeed what was installed but we know where the 350 number came from. Some sources give the original towers as 500' tall and others (including Wikipedia) 600'. I think the 500' number is more reliable as it agrees with the Navy history, though I am not quite sure where the discrepancy comes from.

  3. Some sources, such as Wikipedia, give a number of 22,500 square miles and 2/5 the area of Wisconsin. This was the very top end of a preliminary estimate that was revised down to 6,500 during planning. The 22,500 number still frequently appears, probably just because it's the more absurd figure, which is an example of the challenges of historical research when most information comes from activist groups opposed to the thing you're researching. Of course, we have to temper that criticism with the fact that some anti-Sanguine sources use the 6,500 figure, especially older ones. The shift towards the more attention-grabbing 22,500 might have happened later as Sanguine was discussed more by people without original knowledge of the program.

CodeSOD: Blocked the Date

2 June 2026 at 06:30

Volodya sends us some bad date handling code in PHP. Which, I know, you're just reaching for the close tab and yawning when you hear that. You've seen it before. But bear with me, this one still has some fun bits to it.

$monthes = array(
        1 => 'Января', 2 => 'Февраля', 3 => 'Марта', 4 => 'Апреля',
        5 => 'Мая', 6 => 'Июня', 7 => 'Июля', 8 => 'Августа',
        9 => 'Сентября', 10 => 'Октября', 11 => 'Ноября', 12 => 'Декабря'
);

This creates a list of months.

if ( $team->have_posts() ) :
    // Start the Loop.
    while ( $team->have_posts() ) : $team->the_post();

Today, I have learned something about PHP. PHP has an alternate syntax for blocks. Instead of if { statements }, you can do: if : statements endif. Just one more quirk of PHP to make the language more confusing.

This block checks have_posts in an if, and then checks it again in a while, meaning we don't need the if at all, but so it goes. We haven't gotten to the date handling yet, so let's look at that.

        $date = get_the_date();
        $d1 = explode(".", $date);

        if ($d1[1][0]=='0')
            $m = $d1[1][1];
        else
            $m = $d1[1][0];
        ?><div class="date"><?php echo $d1[0]." ".$monthes[$m]." ".$d1[2]; ?></div>

We get the date as a string, and then split it out into date parts. This is, of course, highly locale specific, but clearly they know what locale they're in. Then they look at the array of date parts. The second element holds their "month" string, as two digits, so they look at the digits. If the month string starts with a 0, they grab the second character and put it in $m. Otherwise, they grab the first character and put it in $m. Then they use $m to look up the $monthes.

Unless there's some substring weirdness going on that I don't know about, this code… doesn't work? Right? Since they're grabbing only a single character out of $d1[1] every time, for months later in the year, $m is only ever going to hold 1, and thus we only output Января, meaning we get four months of January, which just seems cruel, honestly, at least in the Northern Hemisphere.

As with all bad date handling code, this could easily be fixed by just using the built in functions, even in PHP. What I'm going to take away from this though is that PHP's syntax lets you write in Visual Basic or Ruby if you're determined enough. And you can mix and match, so enjoy a codebase that has :/endif and {} scattered throughout.
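To make the failure concrete, here is a minimal sketch in Python rather than PHP. It assumes the date string arrives as dd.mm.yyyy, which is what the explode(".") implies. The names buggy_month and format_date are made up for illustration; buggy_month mimics the single-character lookup, and format_date works from the numeric month instead.

```python
from datetime import date

# Genitive Russian month names, copied from the original $monthes array.
MONTHS = {
    1: "Января", 2: "Февраля", 3: "Марта", 4: "Апреля",
    5: "Мая", 6: "Июня", 7: "Июля", 8: "Августа",
    9: "Сентября", 10: "Октября", 11: "Ноября", 12: "Декабря",
}


def buggy_month(date_string: str) -> int:
    """Mimic the original: keep exactly one character of the month field."""
    month_field = date_string.split(".")[1]
    digit = month_field[1] if month_field[0] == "0" else month_field[0]
    return int(digit)


def format_date(d: date) -> str:
    """Use the date's numeric fields instead of slicing a string."""
    return f"{d.day:02d} {MONTHS[d.month]} {d.year}"


print(MONTHS[buggy_month("02.06.2026")])   # June comes out right
print(MONTHS[buggy_month("15.11.2026")])   # November turns into January
print(format_date(date(2026, 11, 15)))
```

Months 01 through 09 survive the single-character trick, which is probably why nobody noticed until October.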


Let's Be Facebook!

1 June 2026 at 06:30

The real WTF is that our long-time friend and submitter Argle failed to dissuade all three of his sons from pursuing IT careers of their own:

Back circa 2012, my three sons all got jobs at a company that had a brilliant web project. So brilliant that it had the support of a Disney VP, the mayor of the city, and other VIPs. At one point, my sons asked to borrow money to invest in the project. They are good boys (one is now a senior developer with Procter & Gamble), so I backed them.

A year later, the project was released late, over budget, and not fully functional.


My boys convinced the CEO to bring me in to fix things. I fixed things. In that time, I found out they had taken bids on the project. Bids were nominally $15,000, some higher, some lower, of course. All but one group that had bid $5,000. Their plan? Hire some programmers in India for $8/hour and pocket the money without having to do work themselves.

Costs had shot well over $35,000 before I was brought in.

After I got the system working, I went to one of the weekly general standups for the company. The CEO walked in and said something like, "I just learned that Facebook was written in PHP. I think we should rewrite the whole project in PHP. That's what we really need to do."

And thus the decision was made.

A meeting was held the next day to discuss how long it would take to remake the project in PHP instead of C#. Bear in mind, a year and a half had been thrown into making the project thus far.

Going around the table, everyone said between 2 and 3 weeks. There was one other programmer in the company who had exactly 2 months of work experience; he simply parroted what the others had said before him. There was also the general contractor who leased the building to the company. He was involved with the project, and was second-to-last to speak. I fully expected this contractor to have more sense. He came in at 3 to 4 weeks.

My mouth dropped open.

It was my turn. You know those psych tests where you get someone who acts sensibly when alone, but conforms with the rest of the crowd when there's more than one? I'm simply not that guy. I said, "Those are absurd estimates! This will take a minimum of 5 months before it's in beta stages and not ready for public consumption for another couple more months."

The next day, I got a call telling me my services were no longer needed because "I wasn't forward-thinking enough for the company."

My boys stayed on another year, so I got regular reports on the "upgrade." Sure enough, just shy of 8 months later, the new system went live.

As they say, the most experienced person will be the one to accurately tell everyone that it will take longer and cost more than everyone else says.

Anyone else have their own intergenerational WTFs? Please share in the comments!


Error'd: Super SEO Strategies

29 May 2026 at 06:30

It's ironic -- this site gets absolutely inundated with blogspam from people trying to improve their SEO ranking, and yet the only requirement to get your website linked is one dumb little typo in the right menu.

Faithful Michael R. is still job hunting, now even farther afield. "I shall try the gigs in United Kingsom. https://electronicmusicopenmic.com/"


B.J.H. is getting hot undeh the collah. "Weather.com is an endless source of WTF. Today the high temperature will be 53F, unless you care about any hour after 8:00 AM. (And why don't they have enough room to spell out "hour"?)"


Jake W. isn't storming about like BJ. He just wants us to know there's an opening at Durmstrang. No stress.


Martin K. reveals "The resignation of the Microsoft Denmark CEO broke more than news, it also broke the date."


"confirmation.message.text" incoming from Totty "Snarky comment. Snarky comment. Snarky comment."



CodeSOD: What Condition is This

28 May 2026 at 06:30

Untodesu sends us this submission, with this comment:

Literally no idea what kind of drugs the guy was taking but nonetheless we've rewritten it to be just a two-liner

Well, that doesn't tell us a lot about what to expect from the code, but let's take a look.

QStringList TableViewAssembly::parametersFilter(ProbePart::Type type, int pos, QList<ProbePart> probeDesign) {
    QString to, from;

    if(pos == -1) {
        if(probeDesign.length() == 0) {
            to = "*";
            from = "AutoJoint";
        } else {
            to = probeDesign.at(0).fromMounting();;
            from = "AutoJoint";
        }
    } else if(pos == 0) {
        if(probeDesign.length() == 1) {
            if(probeDesign.at(pos).type() == ProbePart::Type::Stylus) {
                to = probeDesign.at(pos).fromMounting();
                from = "*";
            } else {
                to = "*";
                from = probeDesign.at(pos).toMounting();
            }
        } else {
            to = probeDesign.at(pos + 1).fromMounting();
            from = probeDesign.at(pos).toMounting();
        }
    } else if(pos == probeDesign.length() - 1) {
        if(probeDesign.at(pos).type() == ProbePart::Type::Stylus) {
            if(probeDesign.length() <= 1) {
                from = "*";
                to = probeDesign.at(pos).fromMounting();
            } else {
                from = probeDesign.at(pos - 1).toMounting();
                to = probeDesign.at(pos).fromMounting();
            }
        } else {
            from = probeDesign.at(pos).toMounting();
            to = "*";
        }
    } else {
        from = probeDesign.at(pos).toMounting();
        to = probeDesign.at(pos + 1).fromMounting();
    }

    return { to, from };
}

QStringList and QList tell me that this is a Qt-based application. The goal of this function seems to be to take some inputs about a "probe part" and construct a pair of strings. Let's trace through it.

Let's just walk through the conditions, quickly, without worrying too much about the inside. We look at pos, and check for three cases: either pos is -1, 0, or probeDesign.length() - 1.

Inside each of those branches, we also check the length of the list, testing if it contains no elements, exactly one element, or more than one element. We also check if the part in question is a stylus.

With that in mind, let's see if we can summarize the conditions here. If pos == -1, we do some automatic stuff, using the first element in the list if there is one. If pos == 0 and there's exactly one element in the list, we grab the first element and link it to * (the to/from order depends on the stylus question). If there's more than one element in the list, we pair the current pos with pos+1; notably, in this branch, pos is definitely zero. If pos is the last element in the list, we follow the same logic, but pair with pos-1, with a side branch for checking against the length of the list.

It's all bounds checking. That's all this code is. Bounds checking that's gotten out of hand. The main branch here is actually the final else: that's where most of the code is going to pass through. All the other branches are just handling edge cases. Literal edge cases, as in "the edge of the list".

Untodesu didn't supply the two line version, but based on the fact such a version exists, I also suspect that many of these branches weren't actually used. Or, at least, based on the actual business rules, could be combined.
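For the curious, here is one way the branches might collapse. This is a sketch in Python, not Untodesu's two-liner, and it only claims to mirror the branches of the function shown above for valid positions. ProbePart, its fields, and the mounting names in the usage lines are hypothetical stand-ins for the Qt types. The unused type parameter is dropped.

```python
from dataclasses import dataclass


@dataclass
class ProbePart:
    """Hypothetical stand-in for the Qt ProbePart class."""
    from_mounting: str
    to_mounting: str
    is_stylus: bool = False


def parameters_filter(pos: int, design: list[ProbePart]) -> tuple[str, str]:
    """Return (to, from), in the same order as the original function."""
    if pos == -1:
        # Nothing selected yet: start from the AutoJoint.
        return (design[0].from_mounting if design else "*", "AutoJoint")
    last = len(design) - 1
    if pos == last and design[pos].is_stylus:
        # A trailing stylus pairs with the part before it, if there is one.
        from_ = design[pos - 1].to_mounting if pos > 0 else "*"
        return (design[pos].from_mounting, from_)
    # General case: pair with the next part, or "*" at the end of the list.
    to = design[pos + 1].from_mounting if pos < last else "*"
    return (to, design[pos].to_mounting)


body = ProbePart("AutoJoint", "M8")
extension = ProbePart("M8", "M4")
stylus = ProbePart("M4", "", is_stylus=True)
print(parameters_filter(0, [body, extension, stylus]))
print(parameters_filter(2, [body, extension, stylus]))
```

Three conditions instead of nine, and the "edge of the list" handling is visible at a glance. Whether the real business rules allow simplifying further is something only Untodesu's team can say.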


CodeSOD: Are There Files Yet?

27 May 2026 at 06:30

Are there any files to send? That's the question that Chris C's predecessor had. So they asked it. Again. And again. And again.

Chris writes:

I'm occasionally called upon to troubleshoot an ecommerce application that was built in the PHP 5.x days and has been running largely untroubled by maintenance or modernity (aside from the backported security patches to its binaries) ever since.

if(sizeof($files) > 0){
		if(sizeof($files) > 0){
				foreach($files as $file){
						$mime->addAttachment($file);
		}
		}
}

Indentation as per the original.

If the files array contains items, then if the files array contains items, then we iterate across the files array, which hopefully contains items, and add them as an attachment to an email.

I feel like the way this got indented, the developer responsible knew, deep down, that this was wrong. They lacked the reading comprehension to understand why, but deep down in their spleen, something was screaming at them. And thus those stacked curly brackets at the end there.

Of course, none of the conditionals are needed: a foreach on an empty object just does nothing.
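The same point as a small Python sketch, since the idea carries across languages. The Mime class and attach_all function here are hypothetical stand-ins, not the mail library the original code uses.

```python
class Mime:
    """Hypothetical stand-in for the mail object in the original snippet."""

    def __init__(self):
        self.attachments = []

    def add_attachment(self, path):
        self.attachments.append(path)


def attach_all(mime, files):
    # No size check needed: the loop body never runs for an empty list.
    for path in files:
        mime.add_attachment(path)
    return mime


empty = attach_all(Mime(), [])
two = attach_all(Mime(), ["invoice.pdf", "terms.pdf"])
print(len(empty.attachments), len(two.attachments))
```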


Whales Ahoy!

26 May 2026 at 06:30

The waters are even more dangerous than we imagined. Have a look at some of the crazed whales our brave submitters and commenters have encountered in the wild.

First comes an Anonymous tale of woe:


Our company makes apps for businesses. We have 1 MAIN client whose CEO can make or break our company, and his wish is our command. He sent a priority email on a Friday night saying the app was slow and needed to be fixed.

The client CEO is so important that he works directly with our CEO, who decided to PM this huge issue.

All weekend, we were trying out tons of different things to optimize this "slow" app that "wasn't loading or refreshing." We deployed the app Monday night after a weekend of unpaid overtime (darn salary). On Tuesday, the account manager made a bug card to officially represent the work we did, and they posted a previously-unseen video of the slowness.

There is a refresh icon that spins when clicked. The video was of the refresh icon, and it was spinning for an extra second after the data loaded (and jumping 2 pixels from padding styling).

That is what was high priority.

I mean, we all hate the system, but sometimes the system is actually there to protect us.

Next, we have Daniel's ongoing peril:

We do digital flyers/circulars/ads. Eight years ago, that meant we got PDFs from retailers and turned them into digital content. One huge retailer (hundreds of stores) wanted a dynamically-created flyer that would have up-to-date pricing twice a day. We didn't have time to build out a full digital solution (which would have made sense), so instead we spent six months banging together a solution with spit and duct tape which baked out hundreds of PDFs every morning and afternoon. This one retailer was responsible for about 40% of our processing power.

We're finally getting somewhat closer to phasing this out, but "it worked" for this long ...

Finally, let's be grateful Brian escaped with his life!

Worked for a company that was building a component of a high-profile weapons platform for one of the major military suppliers. We had taken over the project from another company that was under-performing, so we were already behind schedule from the minute the contract was signed. Of course this company saw fit to treat us more as a subsidiary than a subcontractor. Including, for a time, sending one of their own managers to sit in our lab and observe (read: babysit) us. On Saturdays. Then they demanded we start working shifts to make more use of the lab equipment, and I got the bad draw: 3 AM - noon. Never mind that I had just gotten married (they actually called to tell me this while I was on vacation the week after my wedding) and would like to actually spend some time with my wife ...

That experience soured me on the whole military-industrial complex for a long time. To this day I still get headhunters pinging me to work for that megacorp; I just chuckle and delete their messages.

Have these tales knocked loose any foul memories that your brain tried to repress? Send them to us!


CodeSOD: Classic WTF: One-and-a-Half-Tiered Application Design

25 May 2026 at 06:30
It's a holiday in the US today, so we're reaching back into the archives. What we really need is a single function that can do it all, and by "it" we mean "ruin your life." Original --Remy

There are several types of bad code; there's lazy code, frantic code, unaware-of-a-better-way code, and aware-of-a-better-way-but-too-apathetic-to-do-it code, to name a few. Then there're amalgamations of different types of bad code.

Môshe encountered such an amalgam when his company was trying out a new delivery service. Môshe spent some time evaluating the IE-only web interface, and was curious about some JavaScript errors he was getting. Strangely, he noticed variables named dateSQL, newSQLTag, and modeSQL.

Môshe dug a little deeper, probably thinking that his suspicions couldn't possibly be correct, only to find sendLinkVal() in the page's code:

function sendLinkVal(theDate,theStatus,MainTitle,PageTitle){
  var dateSQL = " AND J.JBDeliveryDate=''" + theDate + 
    "''"
  var status = ""
  var newSQLTag =""
  var PageTitle = PageTitle
  var MainTitle = MainTitle
    //alert(dateSQL)
      switch (theStatus){
        case "Confirmed":
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " GlobalJobStatusView AS J WHERE J.JBCollectDate=''
	    " + theDate + "'' AND J.JBConfirmed=''Yes'' AND 
	    J.MIStatusCode<>5" + modeSQL + " AND 
	    (ISNULL(J.JBCancelled, 0) <> 1) ORDER BY 
	    Convert(int, J.MIJobID)"
        break;
        case "Unconfirmed": 
          dateSQL= ""
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) " 
          status = " GlobalJobStatusView AS J WHERE J.JBCollectDate=''
	    " + theDate + "'' AND J.JBConfirmed=''No''" + 
	    modeSQL + " ORDER BY Convert(int, J.MIJobID)"
        break;
        case "Complete":
          dateSQL= ""
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) " 
          status = " GlobalJobStatusView AS J WHERE J.JBCollectDate=''
	    " + theDate + "'' AND J.MIStatusCode=5" + 
	    modeSQL + " ORDER BY Convert(int, J.MIJobID)"
        break;
        case "Unconformed": 
          dateSQL= ""
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) " 
          status = " GlobalJobStatusView AS J WHERE J.JBCollectDate=''
	    " + theDate + "'' AND (J.MIConformance IS NOT NULL 
	    AND J.MIConformance<>'''') " + modeSQL + " 
	    ORDER BY Convert(int, J.MIJobID)"
        break;
        case "NoDelDate":
          dateSQL= ""
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) " 
          dateSQL =" GlobalJobStatusView AS J WHERE J.JBDeliveryDate 
	    IS NULL " + modeSQL + " ORDER BY Convert(int, J.MIJobID)
	    "
        break;
        case "Collections":
          // the dateSQL is not required so set it to nothing so that it 
          // doesn't interfere with the sql being generated at the end of 
          // the function.
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " GlobalJobStatusView AS J WHERE J.JBCollectDate=''
	    " + theDate + "''" + modeSQL + " ORDER BY 
	    Convert(int, J.MIJobID)"
        break;
        case "Deliveries":
          // the dateSQL is not required so set it to nothing so that it 
          // doesn't interfere with the sql being generated at the end of 
          // the function.
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " GlobalJobStatusView AS J WHERE J.JBDeliveryDate=''
	    " + theDate + "''" + modeSQL + " ORDER BY 
	    Convert(int, J.MIJobID)"
        break;
        case "ColAndDel":
          // the dateSQL is not required so set it to nothing so that it 
          // doesn't interfere with the sql being generated at the end of 
          // the function.
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " GlobalJobStatusView AS J WHERE ((J.JBDeliveryDate=''
	    " + theDate + "'') OR (J.JBCollectDate=''" + 
	    theDate + "''))" + modeSQL + " ORDER BY 
	    Convert(int, J.MIJobID)"
        break;
        case "Subcontractor":
          // the dateSQL is not required so set it to nothing so that it 
          // doesn't interfere with the sql being generated at the end of 
          // the function.
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " JobAndLoadView AS J WHERE (J.JBDeliveryDate=''
	    " + theDate + "'') " + modeSQL + " 
	    ORDER BY Convert(int, J.MIJobID)"
        break;
        case "Cancelled":
          // the dateSQL is not required so set it to nothing so that it 
          // doesn't interfere with the sql being generated at the end of 
          // the function.
          dateSQL= "" 
          var modeSQL = ""
          modeSQL = " AND (J.JBCompanyID=31337) "
          status = " GlobalJobStatusView AS J WHERE (J.JBCollectDate==''
	    " + theDate + "'') " + modeSQL + " AND 
	    ISNULL(J.JBCancelled, 0) = 1 ORDER BY Convert(int, J.MIJobID)"
        break;
        default : status ="";
      }
        newSQLTag = dateSQL + status;
        document.all.hiddenForm.linkVal.value = newSQLTag;
        document.all.hiddenForm.PageTitle.value = PageTitle
        document.all.hiddenForm.MainTitle.value = MainTitle
        document.all.hiddenForm.submit();  
    //alert(newSQLTag)
  }

Môshe could replace his customer ID with any other and access that customer's data, or, for that matter, modify or delete whatever he wanted. He could add or remove columns from tables. He could possibly even change permissions, add his own database user, and deny all other users access.

Shocked, Môshe called the delivery service, who got him in touch with the developer of the system. This developer was equally shocked to learn that it was even possible to view a web page's JavaScript code, let alone that his architecture was open to SQL injection attacks from virtually any angle. He took immediate and decisive action; all queries were moved to the .NET backend.

Of course, the queries still didn't use parameters and are therefore still open to SQL injection, but now it takes slightly more effort to hack.
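For contrast, this is roughly what a parameterized query looks like. It's a sketch using Python's built-in sqlite3 module rather than the .NET stack from the story, and the jobs table and its columns are invented for the example. The point is only that placeholders keep user input out of the SQL text.

```python
import sqlite3

# Hypothetical schema, loosely inspired by the column names in the story.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (job_id INTEGER, company_id INTEGER, collect_date TEXT)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [(1, 31337, "2026-05-25"), (2, 42, "2026-05-25")],
)


def jobs_for(company_id, collect_date):
    # The ? placeholders are bound as values, never spliced into the SQL.
    rows = conn.execute(
        "SELECT job_id FROM jobs "
        "WHERE company_id = ? AND collect_date = ? ORDER BY job_id",
        (company_id, collect_date),
    )
    return [row[0] for row in rows]


print(jobs_for(31337, "2026-05-25"))              # [1]
print(jobs_for(31337, "2026-05-25' OR '1'='1"))   # []
```

The injection attempt in the second call is compared as a literal string and matches nothing. A real fix would also take the company ID from the server-side session instead of trusting anything the browser sends.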


Error'd: April is Special, and so are you

22 May 2026 at 06:30

"April is special," writes Elwin. It is, but take heart May, every month is special at TDWTF.


"Admiral Ackbar is pinterested," punned The Beast in Black


Manuel H. clocked something off on this website. "Noon seems to be very late in Lithuania, or maybe only in this hotel restaurant in Vilnius." 15H AM must be on some planet with a 32H day.


"Amazon can't make up its mind!" ranted an anon. "Do I need to wait 2 business days or 3? Make up your mind Amazon!"


Duston decided to close us out with a pun. "Looks like they have a problem, but it's trivial." Well done.



CodeSOD: In the Know

21 May 2026 at 06:30

Delilah works in a Python shop. Despite Python's "batteries included" design, that doesn't stop people from trying to make their own batteries from potatoes. For example, her co-worker wrote this function:

def key_exists(element, key):
    if isinstance(element, dict):
        try:
            element = element[key]
        except KeyError:
            return False
        return True

Python, of course, has an in operator. key in dictionary is an extremely common idiom. There's no reason to implement your own. Certainly, there's no reason to re-implement it by catching and throwing exceptions.
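A quick side-by-side, as a sketch. key_exists is copied from above; has_key is a hypothetical replacement name, and the settings dictionary is invented for the demonstration. Note one extra wrinkle in the original: when the element isn't a dict, it falls off the end and returns None instead of False.

```python
def key_exists(element, key):
    """The co-worker's version, reproduced for comparison."""
    if isinstance(element, dict):
        try:
            element = element[key]
        except KeyError:
            return False
        return True


def has_key(element, key):
    """The same check built on the in operator; always returns a bool."""
    return isinstance(element, dict) and key in element


settings = {"image": "app:1.2"}
print("image" in settings)                 # True
print(key_exists(settings, "tag"))         # False
print(key_exists(["image"], "image"))      # None: not a dict, no return
print(has_key(["image"], "image"))         # False
```

The None is falsy, so the calling code happens to work anyway, but it's an accident rather than a design.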

This is ugly, stupid, and bad. It gets worse, though, when you see how it gets used.

for key in old_yaml_data:
    if key in new_yaml_data:
        if old_yaml_data[key] != new_yaml_data[key]:
            temp = new_yaml_data[key]
            new_yaml_data[key] = merge(new_yaml_data[key], old_yaml_data[key])

            if key_exists(new_yaml_data[key], 'image') and key_exists(old_yaml_data[key], 'image'):
                new_yaml_data[key]['image'] = temp['image']
            elif key == "databases":
                revert_db_tags(new_yaml_data[key], temp)

This code is attempting to upgrade "old" YAML data with "new" data. So it's basically merging dictionaries, which is a great case for the in operator.

And they use the correct idiom on the second line there! This was written by one developer! They do the standard key in new_yaml_data check. And they also use key_exists. I can only assume that they had a stroke between starting and finishing this script, which I'll note is, in total, 48 lines long.

Here's the whole short script, which is just generally a mess. Slapped together Python code that's trying to be a "smarter" shell script, but is definitely written with the elegance of hacked-together-bash.

import sys
import yaml
from jsonmerge import merge

appHomePath = sys.argv[1]
oldValuesYAML = appHomePath + "values.yaml"
newValuesYAML = appHomePath + "/upgrade_version/values.yaml"
with open(newValuesYAML, 'r') as f:
    new_yaml_data = yaml.load(f, Loader=yaml.loader.FullLoader)
with open(oldValuesYAML, 'r') as f:
    old_yaml_data = yaml.load(f, Loader=yaml.loader.FullLoader)
def key_exists(element, key):
    if isinstance(element, dict):
        try:
            element = element[key]
        except KeyError:
            return False
        return True

def revert_db_tags(old_yaml_data, new_yaml_data):
    dbList = ["mongoDB", "postgresDB"]
    mongoDbTagsToRevert = ["mongoRestore"]
    mongodbKeysToDelete = []
    postgresDbTagsToRevert = []


    for db in dbList:
        old_yaml_data[db]['image'] = new_yaml_data[db]['image']
    for mongoDbTag in mongoDbTagsToRevert:
        old_yaml_data['mongoDB'][mongoDbTag]['image'] = new_yaml_data['mongoDB'][mongoDbTag]['image']
    for mongoDbTag in mongoKeysToDelete:
        del old_yaml_data['mongoDB'][mongoDbTag]

    for postgresDbTag in postgresDbTagsToRevert:
        old_yaml_data['postgresDB'][postgresDbTag]['image'] = new_yaml_data['postgresDB'][postgresDbTag]['image']

for key in old_yaml_data:
    if key in new_yaml_data:
        if old_yaml_data[key] != new_yaml_data[key]:
            temp = new_yaml_data[key]
            new_yaml_data[key] = merge(new_yaml_data[key], old_yaml_data[key])

            if key_exists(new_yaml_data[key], 'image') and key_exists(old_yaml_data[key], 'image'):
                new_yaml_data[key]['image'] = temp['image']
            elif key == "databases":
                revert_db_tags(new_yaml_data[key], temp)

with open(newValuesYAML, 'w') as f:
    data = yaml.dump(new_yaml_data, f, sort_keys=False)

CodeSOD: Find a Bar for This One

20 May 2026 at 06:30

A depressing quantity of software is what I would call a "data pump". I have some data over here, and I need it over there. Maybe I'm integrating into a legacy app. Or into an ERP. Or into a 3rd party API. At the end of the day, I have data in one place, and I want it in another place.

Sally has a Java application written in the Quarkus framework, which has a nightly batch that works to keep a table of Bar entities in sync with a table of Foo entities. (This anonymization comes from Sally.) These exist in the same database. There is also a Bar webservice, which provides information about the Bar entities. The workflow, such as it is, is that the software needs to find all of the Foo entities that do not currently have associated Bar entities, and then call the Bar webservice to get the required information to create those Bar entities.

Let's see how that works.

@Inject UserTransaction transaction
// If this is annotated with @Transaction the usage in the Message function down below will have some Thread exception
public List<FooData> getAllFoos() {
  try{
    return fooDataRepository.findAllFoos();
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}

We'll worry about that comment in a second, but this function returns a list of all of the Foo objects in the database. It does not return a list of all the Foo objects without associated Bar entities. It's just the whole giant list of everything. The underlying database is a standard relational database; it'd be trivially easy to write that query, even going through the ORM.

Well, that's bad, but it's all pretty minor. How does the actual update go?

// Can't be annotated with @Transaction because Oracle DB can handle the given Amount of dataEntities in one Transaction '\._./'
Message updateBarsWithFoos() {
  List<FooData> foos = getAllFoos();
  if(!foos.isEmpty()){
    foos.forEach(foo -> {
      try{
        transaction.begin();
        if(barRepository.findByName(foo.getName()) == null){
          if(barDataService.searchByName(foo.getName()) != null && barDataService.searchByName(foo.getName()).marker() != null){
            barRepository.createBar(barDataService.searchByName(foo.getName()));
          }
        }
        transaction.commit();
      } catch (Exception e) {
        try {
          transaction.rollback();
        } catch (Exception ex) {
          throw new RuntimeException(ex);
        }
      }
    });
  }
  return new Message(MessageLevel.INFO, "Created bars")
};

Ah, the real WTF is that it's an Oracle database. That's always a WTF.

But let's trace through this code.

We get all of our Foo entities. We check for emptiness and then do a forEach, which seems to make the empty check superfluous: a forEach on an empty list would be a no-op anyway.

We start a transaction, then check the database: if there are no Bar objects that link to this Foo, we call into the barDataService to find data. If the service returns something, we call into the service again to see if the marker property is not null. If it isn't null, we call into the service a third time to get the actual data we're putting into the database. Then we commit the transaction. If anything goes wrong, we roll back the transaction and quietly swallow the exception; only a failure in the rollback itself gets chucked up the chain.

That is three web service calls inside of a database transaction. Three calls which could easily be one, and that call could easily also happen outside of a transaction if you're mindful about confirming your constraints. And of course, because they're not mindful at all, they need to manage the transaction directly, and can't use the @Transaction annotation provided by their framework, which would at least cut down on some of the boilerplate.
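Here is a rough sketch of the alternative shape, in Python with sqlite3 instead of Java and Oracle. Everything in it is an assumption made for illustration: the foo and bar tables, the search_by_name stand-in for the webservice, and the sample data. It asks the database only for Foos that lack a Bar, calls the service once per Foo, and opens a transaction only for the insert.

```python
import sqlite3

calls = []


def search_by_name(name):
    """Hypothetical stand-in for the Bar webservice; records each call."""
    calls.append(name)
    known = {
        "alpha": {"name": "alpha", "marker": "m1"},
        "beta": {"name": "beta", "marker": None},
    }
    return known.get(name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE bar (name TEXT PRIMARY KEY, marker TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?)",
    [("alpha",), ("beta",), ("gamma",), ("delta",)],
)
conn.execute("INSERT INTO bar VALUES ('delta', 'm0')")
conn.commit()


def create_missing_bars(conn):
    # Let the database find the Foos that have no Bar yet.
    missing = [row[0] for row in conn.execute(
        "SELECT f.name FROM foo f LEFT JOIN bar b ON b.name = f.name "
        "WHERE b.name IS NULL ORDER BY f.name"
    )]
    created = 0
    for name in missing:
        bar = search_by_name(name)  # one call, made with no transaction open
        if bar is None or bar["marker"] is None:
            continue
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO bar VALUES (?, ?)", (bar["name"], bar["marker"])
            )
        created += 1
    return created


created = create_missing_bars(conn)
print(created, calls)
```

With the slow, flaky call outside the transaction, a timeout costs one skipped row instead of a held database connection.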

Now, I'm sure you'll be shocked - shocked - to learn that the webservice is actually a bit flaky, and thus times out from time to time. And this isn't the only batch job running, which means the long-lived transactions cause all sorts of contention and terrible performance across the various batches. And this app doesn't have its connection pool properly configured, so the entire software stack can exhaust all of its database connections surprisingly quickly, causing yet more failures.

The root of the WTF, of course, is doing this as a batch job. A well engineered application would do everything it could to not create data in the database that isn't referentially sound. There, Sally gives us the one bit of good news:

My current project will do away with the batch processing altogether, so we can say, "RIP, transactional wholesale triple caller!"


Three Digit Acronyms

19 May 2026 at 06:30

JB has a database table that, at first glance, looks like one of those data warehouse tables that exists to make queries performant. You know the sort, the table that contains every date between 1979 and 2050, or every number out to 1,000,000 or something. It looks dumb, but it helps make certain joins and queries performant.

The database table is called three_alpha_numerics. It has two columns: digit, which contains three characters, and is_numeric, which is a single character: 'Y' or 'N'. It looks roughly like this:

+-------+------------+
| digit | is_numeric |
+-------+------------+
| 009   | Y          |
+-------+------------+
| 00A   | N          |
+-------+------------+

So, for example, if you wanted all the possible numeric triples, you could SELECT digit FROM three_alpha_numerics WHERE is_numeric = 'Y', which is obviously the easiest thing one can imagine.

So what is this for? Well, it's used by a stored procedure that generates unique IDs. That stored procedure does a left join against another table to find all the unused digits. And here's the real gotcha: that stored procedure only ever uses the rows where is_numeric is Y, meaning the vast majority of the data in this table is never used.

Unique IDs, of course, are an incredibly difficult task for databases to do, so it absolutely makes sense that we create a system that allows us to only have 1,000 unique IDs. That's more than 640, which should be enough for anyone. Having many thousands of unusable alphanumeric triplets is just the cost we have to pay.
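To put numbers on that, here is a minimal sketch that rebuilds the table in an in-memory SQLite database and runs the kind of left join described above. The alphabet (digits plus uppercase letters) and the used_ids table name are assumptions for illustration; the real stored procedure was not shown.

```python
import itertools
import sqlite3
import string

# Assumption: each position holds a digit or an uppercase letter (36 symbols).
ALPHABET = string.digits + string.ascii_uppercase

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE three_alpha_numerics (digit TEXT PRIMARY KEY, is_numeric TEXT)"
)
# Hypothetical table holding the IDs that are already taken.
conn.execute("CREATE TABLE used_ids (id TEXT PRIMARY KEY)")

rows = []
for combo in itertools.product(ALPHABET, repeat=3):
    triple = "".join(combo)
    rows.append((triple, "Y" if triple.isdigit() else "N"))
conn.executemany("INSERT INTO three_alpha_numerics VALUES (?, ?)", rows)
conn.executemany("INSERT INTO used_ids VALUES (?)", [("000",), ("001",)])

total = conn.execute("SELECT COUNT(*) FROM three_alpha_numerics").fetchone()[0]
numeric = conn.execute(
    "SELECT COUNT(*) FROM three_alpha_numerics WHERE is_numeric = 'Y'"
).fetchone()[0]

# The left join keeps only numeric triples with no match in used_ids.
next_free = conn.execute(
    """
    SELECT t.digit
    FROM three_alpha_numerics AS t
    LEFT JOIN used_ids AS u ON u.id = t.digit
    WHERE t.is_numeric = 'Y' AND u.id IS NULL
    ORDER BY t.digit
    LIMIT 1
    """
).fetchone()[0]

print(total, numeric, next_free)
```

Under those assumptions the table holds 46,656 rows, of which only 1,000 can ever be handed out as IDs.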


Representative Line: Dating Backwards

18 May 2026 at 06:30

Another representative line, and this one comes from an Excel spreadsheet. But, per Remy's Law of Requirements gathering ("No matter what the requirements doc says, what your users wanted was Excel"), this one was actually written by a developer. A developer who didn't understand how Excel works, but more important, didn't understand how dates worked either.

This comes from Ulysse J.

=CONCATENER(SI(MOIS($A18)>9;ANNEE($A18)-2000;(ANNEE($A18)-2000)*10);SI(JOUR($A18)>9;MOIS($A18);MOIS($A18)*10);JOUR($A18))

Now, the first thing: Excel function names are locale specific. This was written in France, so the functions are French. CONCATENER is "concatenate", SI is "if", MOIS is "month", and so on.

The purpose of this function is to convert a field (cell A18) in DD/MM/YYYY into YYMMDD. So how does it do this?

Well, we check the month. If it's greater than 9, we output the year minus 2000. Otherwise, we output the year minus 2000, multiplied by 10. That is to say, August 2026 would start by outputting 260. We repeat this logic for the days: if the day is larger than 9, we output the month, otherwise we output the month times 10. Finally, we output the day.

This is attempting to do padding. There's just a problem. Imagine February 1st, 2009, an actual date in the document. We convert the year into 90 and the month into 20, then append the day, rendering the date as 90201. That's incorrect: the year has lost its leading zero, so we get five characters where YYMMDD needs 090201. And once we get to 2100, if there is still an Excel in 2100 (I joke: of course Excel will still exist in 2100. Humanity won't, but the robots will use Excel), this will also break. Not that it matters; I mean, YYMMDD doesn't make sense by that point.
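To make the failure concrete, here is a small Python sketch that mirrors the formula's logic next to a conventional formatter. The function names are made up for illustration; only the arithmetic comes from the spreadsheet.

```python
from datetime import date


def buggy_yymmdd(d: date) -> str:
    """Mirror the spreadsheet formula: 'pad' by multiplying by 10."""
    yy = d.year - 2000
    # The year is multiplied by 10 when the MONTH needs a leading zero.
    year_part = yy if d.month > 9 else yy * 10
    # The month is multiplied by 10 when the DAY needs a leading zero.
    month_part = d.month if d.day > 9 else d.month * 10
    return f"{year_part}{month_part}{d.day}"


def correct_yymmdd(d: date) -> str:
    """Let the date formatter do the zero padding."""
    return d.strftime("%y%m%d")


sample = date(2009, 2, 1)
print(buggy_yymmdd(sample))    # 90201
print(correct_yymmdd(sample))  # 090201
```

The multiply-by-10 trick only ever adds a zero in front of the next field, so nothing pads the year itself. Any year before 2010 loses its leading zero, and any year from 2100 onward produces a three-digit year.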

Obviously, the correct solution is to use Excel's rich, built-in formatting functions to convert between date formats. It's easy! But Ulysse raises another point:

Extra points: even if you do not know how to do proper [formatting], the input format is guaranteed to have correct padding. I would just concatenate parts of it (treating dates as text is bad, but still less bad than treating them as integer triplets).

I will say this: I know a software developer wrote this, because your average Excel user could easily write bad formulas, but never bad in this kind of convoluted way. You need a real expert to do something this bad.


Error'd: Balmenach Bad Gateway Single Malt

15 May 2026 at 06:30

"Winner ad placement!" snarked our Peter G.


Errors on this website are always a shoo-in for the weekly column. An anonymous reader wrote "I got error 500 when I tried to submit an Error'd. Please make the file uploader check if the attached file is within the file upload limit, which I think is less than 4 MB." They shared an audio error'd which may be coming along next week.


"Give us feedback - wait, did it work at all?" confused poor I_Absolutely_Want_To_Give F. "As every good service management company, ServiceNow wants feedback, above all."


"0 minutes does not equal 0 seconds..." sagely summarized Daniel D. "Claude like floors. I mean floor. But maybe ceil would be better applicable to this calculation, right?"


Finally, this one is a real novelty, from Adam R. Is the label actually 27 years old? It certainly could be; Error 502 is a good bit older. But I think this would be our oldest Error'd yet. Adam explained: "This appears to be a real auction for a whiskey bottle whose label does, in fact, say Error 502 Bad Gateway on it. The winning bid: £130. Source: https://www.scotchwhiskyauctions.com/auctions/228-the-179th-auction/876095-balmenach-1998-27-year-old-error-502-bad-gateway-thompson-bros/"



The Pride Goeth

14 May 2026 at 06:30

Janči, a master's student of bioinformatics, was seated near the back of a large classroom. This was a simple compulsory elective course geared toward biologists. The professor was currently walking the class through their latest assignment. "We'll need to connect to some Linux servers," he announced.

The other students seated nearby traded blank stares. They were all Mac and Windows users with no IT background. Meanwhile Janči, a veteran Linux user, started feeling a little smug. An easy A was at hand.


"First," the professor continued, "you'll need a private key."

After the professor had explained a few details, the first WTF came in the form of a bulk email sent to the entire class. The private key was attached. The username was the email address it was sent to.

What do you call the exact opposite of a private key? Janči wondered, bemused.

"You'll also need to download an application to help you log in," the professor said. "I recommend MobaXterm."

As he detailed the process of visiting the SSH client website to download the software, Janči tuned out. He didn't need such hand-holding. He accessed OpenSSH, tried connecting ...

... and failed.

Meanwhile, everyone around him was logging in no problem.

Janči's face burned with embarrassment at this second WTF. His first instinct was to blame the deprecated cryptography of the server. He spent most of the remaining lecture time searching for a way to allow his SSH to use SSH-DSS. (It turned out to be supported the whole time, despite the warnings he received.)

Janči then tried to re-download the "private" key and adjust the SSH config file several times. He cycled through different possible usernames associated with his university email account.

No dice.

He was the only person in the class who hadn't yet logged into the server. Not even the professor was able to help him, since he was using Linux.

Embarrassment and frustration mounted. An hour later, out of ideas, Janči fell back to downloading MobaXterm and running it inside Wine.

It didn't work.

The professor offered him a spare Windows box. "Here, try this one."

Janči booted it up, copied the "private" key to the new machine ... and still couldn't sign in.

Now, this was getting suspicious.

The lecture ended. A friend of Janči's hung back while the rest of the students filed out. "Why don't you try logging in with my credentials instead of yours?" she asked.

Janči was up for anything at that point.

It worked. On his own machine, on the Windows box, everywhere.

With that lead in mind, Janči opened the server's /etc/passwd file to look at all the usernames. He noticed that, unlike everyone else, his username and email address didn't match.

His university used Microsoft emails. Everyone had several address aliases, and they could also use whatever email address they liked in the system, even a personal one.

Janči had chosen to use a school email in the form of <number>@uni.uni. Unfortunately, the Ubuntu server didn't like the idea of a user being named just <number>, so it had renamed the account to user<number>. Some script for generating SSH configuration had probably failed from there, because Janči also discovered that his user home directory was missing a .ssh directory and known_hosts file.

Unfortunately, due to restricted access, he wasn't able to copy them from any of his classmates. In the end, he could connect to the server as any of his classmates, but not as himself.


CodeSOD: Over and Under Reaction

13 May 2026 at 06:30

Today's anonymous submitter sends us two blocks. The first is a perfectly normal line of React code:

const [width, setWidth] = useState(false)

This creates a width variable, defaulting it to false, and a setWidth function, which lets React detect when you change the variable, and trigger a re-render. Importantly, this mutation only happens on the next render, which means if you call setWidth and then check width, you won't see your change happen.

As I said, this is perfectly normal React code. Well, almost. First, I have to ask: why on Earth is width being set to a boolean value? "How wide are you?" "Yes." It's possible that there's a good reason for this, though I suspect that it's unlikely.

The second issue, however, is that the linter complained that the setter was never actually used. That was odd, because if our submitter grepped the codebase, there were two calls to setWidth. Let's see what that looked like:

const show = (show) => {
    setWidth(show)
    setWidth(!show)
}

We create a function show, where we expect a boolean value, and then we setWidth with that value, and then with the negation of that value. So show(true) will set width to be false. To make matters more confusing, we set width both ways, and I assume this is someone trying to get around React's state management. React won't trigger a re-render if you set the state to a value it already has. So I suspect they're twiddling to try and force it to re-render, and I also suspect that this might not work? Even if it does, this isn't how you should be using React. As I said, I'm no React expert, but as the saying goes: "I don't have to be a helicopter pilot to know that when I see a helicopter hanging upside down from a tree someone messed up."

Our submitter writes:

Got hired to cleanup a mission critical website for a company that had just learned that offshore teams might not be worth the cost saving measures.

"Pay me now or pay me later."


Copying Remote Command Output to Your macOS Clipboard

26 May 2026 at 09:00


I use Apple devices very often. Overall, I like macOS. Certainly more than Windows.

One of the things I find extremely useful is a command I discovered not too long ago: pbcopy.

pbcopy can be used to copy to the clipboard whatever it receives from standard input. For example, when I am in a shell, I often use a command like this:

cat filename.md | pbcopy

At that point I know that the content of the file is in the clipboard, and I can paste it wherever I need, calmly and without any additional steps.

There is one limitation, though: this only works locally. It works when I am using my Mac and I want to copy something from the macOS shell.

When I connect to a remote (*BSD, Linux, illumos based) server via ssh, pbcopy is not available. Or, more precisely, even if I create a command with the same name on the server, that command cannot directly talk to the clipboard of my Mac in the usual way.

Luckily, modern terminal emulators have a few tricks available.

I use iTerm2 for most of my ssh sessions and, once I realised how useful it would be to have something similar to pbcopy in remote sessions too, I created a small script that works on Linux, the BSDs, and illumos-based OSes.

The caveat is that the remote server cannot "magically" access the clipboard of my Mac. So the trick works because the remote command prints a special terminal escape sequence, and the local terminal emulator interprets it.

The sequence is called OSC 52. In short, it allows a program running inside the terminal to ask the terminal emulator to put some base64-encoded text into the local clipboard. This means that support depends on the terminal emulator I am using locally.

I use iTerm2, which supports OSC 52. Other terminal emulators support it too, so the idea is not tied exclusively to iTerm2. However, Apple's default Terminal.app does not appear to support OSC 52, so I would not expect this specific solution to work there.

So, in practice:

  • with iTerm2, it works;
  • with other OSC 52-compatible terminals, it should work (I haven't tested it, but it should work with kitty, Ghostty, etc.);
  • with Apple's Terminal.app, at least at the time of writing, it should not.

Before creating the command on the remote server, a specific iTerm2 option needs to be enabled:

Settings -> General -> Selection -> Applications in terminal may access clipboard

This option allows programs running inside the terminal to access the clipboard through escape sequences.

Of course, this has security implications. A program running in the terminal, including a program running on a remote server through ssh, may be able to write to the local clipboard. For my use case this is acceptable, but it is worth knowing what is happening.

All I need to do is create a command, a small sh script. I log into the server where I want to create the command. In my case, I usually create a file called /usr/local/bin/pbcopy with the following content:

#!/bin/sh
printf '\033]52;c;%s\a' "$(base64 | tr -d '\n')"

Then I make it executable:

chmod a+rx /usr/local/bin/pbcopy

From that moment on, I can use pbcopy on the remote server too, piping another command into it:

cat filename.md | pbcopy

The content will not end up in the clipboard of the remote server. It will end up in the local clipboard of my Mac, because iTerm2 receives the OSC 52 sequence and updates the macOS clipboard.
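For anyone who wants to see the sequence spelled out, here is a minimal Python sketch that builds the same bytes as the sh script above. The function names are made up for illustration, and it assumes the local terminal accepts OSC 52 writes as described earlier.

```python
import base64
import sys


def osc52_sequence(data: bytes) -> str:
    """Build the OSC 52 sequence: ESC ] 52 ; c ; <base64> BEL."""
    payload = base64.b64encode(data).decode("ascii")
    return f"\033]52;c;{payload}\a"


def copy_to_local_clipboard(data: bytes, stream=sys.stdout) -> None:
    """Write the sequence to the terminal, which performs the actual copy."""
    stream.write(osc52_sequence(data))
    stream.flush()


# Example: copy_to_local_clipboard(b"hello") asks the terminal to put
# "hello" into the clipboard of the machine running the terminal emulator.
```

The "c" selects the clipboard, and the payload is base64 so that arbitrary text can travel inside an escape sequence without containing control characters itself.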

FediMeteo, timezones, and the art of not breaking what already works

25 May 2026 at 09:14


I have already written about how FediMeteo was born, and about how HAProxy helps reduce the number of requests that reach snac.

Seen from the outside, FediMeteo almost seems still. There is a static homepage, regenerated every hour. There are the city pages, with their forecasts. There are RSS feeds waiting to be fetched, JSON objects waiting to be requested, Fediverse instances refreshing data, subscribing, unsubscribing, retrieving profiles, and reading notes.

That is the visible part.

Behind it, however, FediMeteo is much more than a homepage, a few ActivityPub accounts, and a well-behaved reverse proxy. It is a chain of small pieces, in proper Unix style, each trying to do one thing and do it as well as possible.

That chain, although almost invisible from the outside, was not born already tidy. It changed, was rewritten, adapted to new countries, timezones, ambiguous city names, external service limits, and also to my own mistakes.

Some mistakes were small. Others were much less so.

Because FediMeteo is a human project and, as such, imperfect. Imperfect in the way humans are imperfect, which today almost seems unfashionable. I like that.

The first version of the bot was almost embarrassingly simple, and I was proud of that.

It took a city name as input, asked Nominatim for the coordinates through geopy, called the Open-Meteo API for the current weather and the next several days, and printed a markdown block with current conditions, the forecast for today, the next twelve hours, and the coming days. The text was in Italian. The cities were Italian. The timezone was Europe/Rome. There was nothing to calculate.

Around the script, a small sh wrapper read a list of cities and, for each one, ran the Python program and piped its output into snac note_unlisted. A cron job ran the wrapper every six hours. The output was loose markdown, which snac happily renders, and the integration was: standard output goes into standard input. Nothing fancier than that.

I like this kind of design. It is the part of the Unix philosophy that survives even when fashions change.

When I started adding other European countries, I did not need to change much. I separated the operational logic from the localized strings, moved the strings into one JSON file per country, and spread the cron entries so that not every country posted in the same minute. Each country had its own snac instance, in its own FreeBSD jail, with its own dataset. The bot, internally, was almost the same script as before.

This worked because Europe is, in essence, two or three timezones across most of the countries I cared about.

Then I added Germany, and Germany taught me my first lesson about names.

There are several places called Neustadt in Germany. There is a Frankfurt am Main, and a Frankfurt an der Oder, and they are not the same city. There is a Halle in Saxony-Anhalt and a Halle in North Rhine-Westphalia. Asking Nominatim for "Frankfurt, Germany" produced one of the two, consistently, but not always the one I wanted. Some German users wrote to me, politely, to point out that the forecast for "their" Frankfurt was, in fact, for the other one.

I started thinking about disambiguation, but only enough to fix the immediate cases. The bot still took a single city name. The ambiguous ones I worked around by editing the cities file and hoping for the best.

In hindsight, this was the seed of what would happen later.

The United States broke every assumption the bot had grown up with.

The first problem was the number of cities. I wanted reasonable coverage at state level, which meant identifying the main cities for each of the fifty states. The list ended up at more than 1200 entries. That alone is more cities than every other country in the project combined.

The second problem was timezones. The contiguous United States covers four of them, and Alaska and Hawaii bring the total to six. A "current weather at 12:00" line generated at the same instant for New York and for Los Angeles is technically the same instant, but the two cities are living different parts of the day, and the forecast for "today" is not even quite the same window. A bot that pretended every city was on the same clock would be wrong, sometimes embarrassingly so, every single day.

The third problem was the name thing again, only larger. There are dozens of Springfields. There is a Portland in Oregon and a Portland in Maine. The Germany workaround - editing the cities file by hand and hoping Nominatim picked the right city - was clearly not going to scale to a country where the same name is also a state.

I sat with this for a couple of days before admitting what I already knew.

The bot needed to be rewritten.

What made this hard was not the rewriting itself. It was the requirement to do it without breaking everything else.

By the time I decided to add the United States, the infrastructure around the bot had grown into something I trusted. Jails, snapshots, backup jobs, cron schedules, snac instances on production paths, the HAProxy layer, the homepage cron that aggregated follower counts, and a long list of cities being processed in series every six hours. None of that knew or cared about the bot's internal shape. All of it cared, very much, about the bot's external behavior: a city name and a country code go in, valid markdown comes out, and that markdown ends up in a timeline.

So the contract was clear, even if I had never written it down anywhere. The command-line interface, the output format, the exit codes, the way the wrapper script invoked it, the structure of the JSON country configs - all of it had to keep working. Italian had to keep working. German had to keep working. The cron job that ran every six hours had to keep producing the same shape of output, just with new countries added.

What I changed was almost everything below the surface.

The city argument grew an optional __state suffix, with a double underscore as separator:

python3 main.py springfield__illinois us
python3 main.py springfield__massachusetts us
python3 main.py new_york__new_york us

A city without the suffix continued to work exactly as before, which is what every European country needed. The country config gained a timezone field that could be a fixed string or the literal "auto"; when it was "auto", the bot used timezonefinder against the resolved coordinates to determine the right zone for that specific city. Internally I separated the weather provider behind an interface, so Open-Meteo could remain the primary while MET Norway and wttr.in sat behind as alternatives, with automatic fallback when the primary failed. Units became configurable per country: temperature, wind speed, precipitation. The United States needed Fahrenheit, miles per hour, and inches. Most of Europe wanted Celsius, kilometers per hour, and millimeters. The bot now does either, on a per-country basis, without caring which is which.
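As a rough illustration of the two opt-in pieces described above, here is a minimal Python sketch of the argument split and the timezone lookup. The function names, the injected finder callable, and the Europe/Rome default for a missing field are assumptions, not the bot's actual code.

```python
from typing import Callable, Optional, Tuple


def split_city_argument(arg: str) -> Tuple[str, Optional[str]]:
    """Split 'city__state' into (city, state); state is None without a suffix."""
    city, sep, state = arg.partition("__")
    return city, (state if sep and state else None)


def resolve_timezone(
    config: dict,
    lat: float,
    lon: float,
    finder: Callable[[float, float], Optional[str]],
) -> str:
    """Return the fixed zone from the config, or look it up when it is 'auto'."""
    # Assumption: a config without the field falls back to the original zone.
    tz = config.get("timezone", "Europe/Rome")
    if tz != "auto":
        return tz
    # With the timezonefinder package, the finder could be:
    #   lambda lat, lon: TimezoneFinder().timezone_at(lng=lon, lat=lat)
    found = finder(lat, lon)
    if found is None:
        raise ValueError(f"no timezone found for {lat}, {lon}")
    return found


print(split_city_argument("springfield__illinois"))  # ('springfield', 'illinois')
print(split_city_argument("ferrara"))                # ('ferrara', None)
```

Passing the finder in as a callable keeps the sketch free of third-party imports and makes the "auto" path easy to test with a stub.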

I am skipping a lot of small detail here, but the principle was always the same: every new degree of freedom had to be expressible as an optional field in the config or as an optional CLI flag. If a country did not set the new field, the old behavior continued, identical to before.

I tested this by running the new bot against the old country configs and comparing the output line by line. Where it differed, it was a bug in the new bot. Not in the test.

The first cycle after deploying the rewrite was, for every country except the United States, indistinguishable from the cycle before. That was the point.

This is the part of the story I dislike telling, which is precisely why I should tell it.

At some point during the development, while debugging an Open-Meteo response that did not look right, I added a print statement to the error path that dumped the full request URL whenever something went wrong. The full URL of the Open-Meteo customer endpoint includes the apikey query parameter. The print was meant for development. I forgot to remove it.

I deployed.

The next time Open-Meteo had an outage - and small ones happen, sometimes for several minutes at a time - the bot dutifully printed the failing request URL into the post body. For every city. For every cycle that ran during the outage. The wrapper script piped the output into snac note_unlisted without complaint. The posts went out, federated across the Fediverse, with my API key sitting in the text for anyone who cared to read.

Some users were kind enough to write me and tell me. Others were less kind, and made fun of me. Both groups were correct. This should not have happened.

I reported the incident to the Open-Meteo team, who were extremely understanding. They rotated the key immediately and gave me a fresh one. I removed the debug print, and then I did the slightly more useful thing, which was to add redaction at multiple layers - in the bot's output, in the daemon's logging, and in the debug helpers themselves. URL query parameters that look like API keys are masked. Environment variables and config keys named apikey or OPEN_METEO_APIKEY are redacted before any string reaches stdout or a log file. Even JSON-like fields that include open_meteo_apikey are scrubbed if they ever appear in something the program prints.
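A minimal sketch of what one such redaction layer could look like in Python follows. The parameter names in the pattern, the placeholder text, and the example URL are assumptions for illustration, not the bot's real code.

```python
import re

# Assumption: these query parameter names are the ones worth masking.
_QUERY_SECRET = re.compile(
    r"([?&](?:apikey|api_key|key|token)=)[^&\s#]+",
    re.IGNORECASE,
)


def redact(text: str, secrets: tuple = ()) -> str:
    """Mask key-like query parameters and any known secret values."""
    text = _QUERY_SECRET.sub(r"\1[REDACTED]", text)
    for secret in secrets:
        if secret:  # never replace the empty string
            text = text.replace(secret, "[REDACTED]")
    return text


url = "https://example.invalid/v1/forecast?latitude=44.8&apikey=abc123"
print(redact(url))
# https://example.invalid/v1/forecast?latitude=44.8&apikey=[REDACTED]
```

Calling a function like this at the single point where strings leave the program, rather than at each print site, is what keeps a forgotten debug statement from leaking the key again.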

The lesson is not "be more careful." The lesson is that debug paths leak, sooner or later, so the secrets have to be unreachable from the debug paths in the first place. Now they are.

That afternoon, when I realised what was happening, I closed everything for a minute and looked out of the window. Then I started fixing.

Nominatim is a public service, and it is generous, but it is not infinite. Every city in the project needs coordinates, and at the start of the project every cycle would re-ask Nominatim for every city. Most of the time this worked. Sometimes it did not.

There was one cycle, before I added caching, when Nominatim simply did not respond for one of my queries. The geopy call timed out. The bot raised an exception. The wrapper script gave up on that city and moved on to the next one. A few users noticed that a particular city had not received its forecast that day, and asked what had happened.

I added a coordinate cache, and I am still grateful that I did.

The cache is intentionally boring. The first time the bot resolves a city, it writes the latitude and longitude into a small file under /tmp, named after the city, and the state when present. Every subsequent run reads the file. If the file exists, no Nominatim call is made. If the file is missing, the bot calls Nominatim and writes the file. After the first successful lookup, the cache becomes the source of truth for the coordinates of that city.
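The logic fits in a few lines. Here is a minimal Python sketch, assuming a JSON file per city and an injected geocode callable; the file suffix, the JSON layout, and the function names are illustrative guesses rather than the bot's real format.

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple


def cache_path(cache_dir: Path, city: str, state: Optional[str] = None) -> Path:
    """One file per city, with the state appended when present."""
    name = f"{city}__{state}" if state else city
    return cache_dir / f"{name}.coords.json"  # hypothetical suffix


def get_coordinates(
    cache_dir: Path,
    city: str,
    state: Optional[str],
    geocode: Callable[[str, Optional[str]], Tuple[float, float]],
) -> Tuple[float, float]:
    """Return cached coordinates, calling the geocoder only on a cache miss."""
    path = cache_path(cache_dir, city, state)
    if path.exists():
        data = json.loads(path.read_text())
        return float(data["lat"]), float(data["lon"])
    lat, lon = geocode(city, state)
    path.write_text(json.dumps({"lat": lat, "lon": lon}))
    return lat, lon
```

Because the geocoder is only consulted when the file is missing, a failed lookup writes nothing and is simply retried on the next cycle.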

This is lighter on Nominatim, faster for every cycle, and much more resilient against transient failures. It is also nice for a reason I did not anticipate.

Nominatim is a geocoder, and like every geocoder it has opinions.

I live in Ferrara, so when I added Italy I made sure Ferrara was in the list, and I checked the first cycle to make sure everything looked right. The forecast came out fine. The temperature was reasonable. The icon matched the sky outside my window. I closed the laptop and forgot about it.

Then, one evening months later, I looked more carefully at the coordinates Nominatim had returned for "Ferrara, Italy", and I realised they did not point to the city. They pointed to a location closer to the centroid of the province, which is a much larger area and mostly countryside. The forecast had been, on average, for a field somewhere outside town, not for the city center.

I am not entirely sure why I had not noticed earlier. Probably because the weather in Ferrara and the weather in the fields outside Ferrara is, on most days, indistinguishable to anyone who is not paying attention. But this is the kind of detail I do not want to leave wrong, especially for my own city.

There are other places where geocoding lands slightly off. Sometimes it is a few kilometers, sometimes a different neighborhood, sometimes genuinely the wrong place.

Because the cache is just a file per city, the fix is also just a file per city. I open the cache file, replace the latitude and longitude with the correct values, save. The next cycle uses the corrected coordinates. No code change, no redeploy, no special tooling. I keep a small list of patched cities in a separate text file, so that if I ever rebuild the cache, I do not lose the manual corrections.

This is the kind of operational simplicity I like. A cache made of plain files costs almost nothing and quietly pays back every time a small problem appears.

For every report it generates, the bot also writes a simplified English text snapshot to /tmp/<city>.txt, or /tmp/<city>__<state>.txt when there is a state.

This is intentional, and it is not a debug artifact. I am not ready to say what I am doing with it yet, but it is part of a future direction for the project. Text is a useful intermediate format, and having a clean, language-neutral representation of every forecast sitting on disk costs almost nothing and might be worth a great deal later.

I prefer to let ideas mature in private before I commit to them in public. So I will leave it at this for the moment.

A full cycle for the United States takes hours.

It is not because the work is heavy. It is because I deliberately inserted a small sleep between cities, to give snac time to dispatch the previous post before the next one is generated. With more than 1200 cities in series, even a short pause adds up. I am not in a hurry. Forecasts that arrive a few minutes apart from each other are not a problem, and the bot was already a polite citizen elsewhere. A polite cycle is fine.

The problem with a slow cycle is not the duration. The problem is what happens to it.

In the original design, the cycle was launched by cron. Every six hours, cron called the wrapper script, the wrapper iterated through the cities file, and for each city it ran the bot and piped the output into snac. There was no scheduler in the project at all. Cron was the scheduler. The wrapper was just a loop.

Restarting snac was harmless. The wrapper would call snac note_unlisted per city, and if snac happened to be unavailable for a moment, that single call might fail, but the loop kept moving and snac was usually back within seconds. Snac itself was not what held the cycle together.

What held the cycle together was the wrapper process. And the wrapper process lived inside the jail.

If the FreeBSD jail was restarted while the wrapper was running, the loop stopped wherever it happened to be. The cron schedule did not care. Six hours later, the next cron tick started a new cycle from the first city, and the cities that had been about to be processed at the moment of the restart were simply skipped for that window. For the United States, this could mean several hundred cities going without an update.

There was a worse case, and it took me longer than it should have to recognise it. If the host was rebooting exactly in the minute when cron should have fired, cron simply did not fire. There was no daemon waiting to pick up the missed tick. The cycle never even started. Six hours of forecasts would be lost, in silence, with nothing in any log to suggest anything had gone wrong.

I lived with this for a long time. Reboots were rare, the impact was limited, and adding state was the kind of thing I always meant to do "next week."

What finally changed it was not a dramatic incident. It was the slow accumulation of small ones. A scheduled VPS reboot. A jail restart after an upgrade. Each one on its own was nothing. Together, they were a steady drip of missed cycles.

So I wrote a daemon.

The crontab entries for the bot went away. There is now a long-running process inside the jail, started at boot, and it does the scheduling itself. The schedule is a list of hours and a minute, read from a JSON config. The daemon wakes up once a minute, checks whether it is time to start a cycle, and either starts one or waits.
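One way to express the "is it time?" check is sketched below in Python. It is a guess at the shape, not the daemon's actual code: the function names are made up, and it assumes the daemon remembers when the last cycle started, so that a slot missed during downtime is still picked up on the next wake-up.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def latest_slot(now: datetime, hours: List[int], minute: int) -> datetime:
    """Most recent scheduled start time at or before `now`."""
    candidates = []
    for day_offset in (0, -1):  # today, then yesterday
        day = (now + timedelta(days=day_offset)).date()
        for hour in hours:
            slot = datetime(day.year, day.month, day.day, hour, minute)
            if slot <= now:
                candidates.append(slot)
    return max(candidates)


def cycle_is_due(
    now: datetime,
    hours: List[int],
    minute: int,
    last_started: Optional[datetime],
) -> bool:
    """A cycle is due if the latest slot has not been started yet."""
    slot = latest_slot(now, hours, minute)
    return last_started is None or last_started < slot
```

Comparing against the latest slot, instead of testing for an exact hour and minute, means the check does not depend on the process being awake in one specific minute.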

The interesting part is the state file.

As the daemon walks through the cities file, it writes its position to a small JSON file: which cities file it is processing, and the index of the next city to handle. The write happens at the boundary between one city and the next, because that is the only place where resuming makes sense. If the daemon is interrupted mid-city, that city is retried on resume; no half-finished post escapes.

When the daemon starts, it reads the state file. If it finds one matching the current cities file, it resumes from the saved index. If the cities file has changed since the state was written, the daemon starts fresh. The check is deliberately conservative: a renamed or modified cities file is treated as a different cycle, because the indices would otherwise be meaningless.
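The resume logic can be sketched in Python as follows. The state layout, the content hash used to detect a changed cities file, and the function names are assumptions for illustration; the real daemon may do this differently.

```python
import hashlib
import json
import os
from pathlib import Path
from typing import Callable


def _fingerprint(cities_file: Path) -> str:
    """Hash the file contents so any edit counts as a different cycle."""
    return hashlib.sha256(cities_file.read_bytes()).hexdigest()


def load_start_index(state_file: Path, cities_file: Path) -> int:
    """Return the saved index, or 0 if the state is missing or stale."""
    try:
        state = json.loads(state_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return 0
    same_file = state.get("cities_file") == str(cities_file)
    same_content = state.get("fingerprint") == _fingerprint(cities_file)
    if not (same_file and same_content):
        return 0
    return int(state.get("next_index", 0))


def save_state(state_file: Path, cities_file: Path, next_index: int) -> None:
    """Write the position atomically: temp file first, then rename."""
    tmp = state_file.with_suffix(".tmp")
    tmp.write_text(json.dumps({
        "cities_file": str(cities_file),
        "fingerprint": _fingerprint(cities_file),
        "next_index": next_index,
    }))
    os.replace(tmp, state_file)


def run_cycle(
    state_file: Path,
    cities_file: Path,
    process_city: Callable[[str], None],
) -> None:
    """Process cities in order, recording progress after each one."""
    lines = cities_file.read_text().splitlines()
    cities = [line.strip() for line in lines if line.strip()]
    start = load_start_index(state_file, cities_file)
    for index in range(start, len(cities)):
        process_city(cities[index])
        # Saved only after the city is done, so an interrupted city is retried.
        save_state(state_file, cities_file, index + 1)
    state_file.unlink(missing_ok=True)  # cycle finished: reset the state
```

Writing to a temporary file and renaming it means a crash in the middle of a write leaves the previous state intact rather than a half-written JSON file.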

The result is the behavior I should have had from the start. If the host reboots while the United States cycle is running, the daemon comes back up with the jail, reads the state, and continues from where it left off. Every city still gets its update, just with a small gap corresponding to the reboot itself. The cycle finishes. The state file is reset. Life goes on.

And the worst case from the cron days is gone. The daemon does not need anyone to fire it. As long as the jail is running, the daemon is running, and the next scheduled cycle will happen when its hour comes, regardless of what was happening at any specific minute.

Of all the changes I have made to the project, this is the one I like most. It is not exciting work. It is the kind of thing that earns no applause because, when it works, it produces no visible event. But it removes a whole class of small daily annoyances, and it makes a slow process robust against the boring kind of failure: the kind nobody plans for, but that always eventually happens.

The current bot does considerably more than the original Italian script. It handles per-city timezones, three weather providers with automatic fallback, unit conversion for temperature, wind, and precipitation, optional air quality, pressure trend indicators when the provider supplies pressure data, a simplified English text snapshot for future use, a coordinate cache that can be patched by hand, secret redaction at multiple layers, a heartbeat that adapts to whichever HTTP client is installed on the host, and a scheduler-and-resume daemon that survives reboots.

But from the outside, almost nothing has changed.

The European country configs work the same way they always did. The wrapper scripts are unchanged. The snac integration is the same one-line pipe. The HAProxy layer in front does not know or care that the bot was rewritten. The homepage cron that counts followers and regenerates the static page works exactly as before.

The original Italian script does not exist as a file anymore, but it survives as a default. A country config with timezone set to Europe/Rome and no special options behaves, today, exactly as the first version of the bot would have. Everything else is opt-in.

I like this kind of work.

FediMeteo, HAProxy, and the art of not wasting snac threads

18 May 2026 at 09:44

When I wrote about FediMeteo for the first time, I told the story from the beginning: the idea born almost by chance while checking the weather for a holiday, the memory of my grandfather, who for years had been my personal meteorologist, the decision to build something small and useful, and then the surprise of seeing people actually use it. What began as a personal experiment quickly became a small global service, still running with the same philosophy: FreeBSD, jails, simple scripts, snac, text, emoji, and a lot of small pieces doing their work quietly.

That article was mostly about the birth and growth of the project. This one is about one of the less romantic parts of the same story, although I have to admit that I find a certain beauty in it too: keeping the service light as it grows.

FediMeteo is still intentionally simple from the outside. A homepage, some numbers, a list of countries, and many ActivityPub accounts publishing weather forecasts. The posts are text and emoji. There is no JavaScript requirement to read the pages, no heavy frontend, no unnecessary media attached to every forecast, and no dynamic homepage recalculated at every visit just to show the same numbers. This is not accidental. It is the way I wanted the service to behave from the beginning.

But the more the service is used, the more the small details matter. A request that looks harmless when there are ten followers may become a repeated request when there are thousands of followers, remote instances, crawlers, previews, and other servers fetching the same public objects. In the Fediverse, the same small thing can be asked many times by many different places, each one with a perfectly legitimate reason. The backend doesn't care: it just needs to deal with the requests.

And in FediMeteo, the backend is snac.

I like snac very much precisely because it is small, clear, and efficient. It is not a giant application that tries to be everything. It does a focused job and does it well. But this also means that I want to respect its shape. I do not want to waste its threads on work that the reverse proxy can safely do. A snac thread serving the same public avatar again and again is not a tragedy, but it is still a waste. A snac thread answering the same public ActivityPub object several times in the same minute is doing real work, but often not necessary work.

This is the reason behind the HAProxy tuning I am currently using in front of FediMeteo.

It is not about making the configuration look clever. It is about keeping snac quiet.

A continuation of the same idea

I had already explored the same problem with snac and nginx in two previous posts: Improving snac Performance with Nginx Proxy Cache and Caching snac Proxied Media with Nginx. In both cases, the idea was that the reverse proxy should absorb repeated public requests instead of letting them consume snac resources.

This is especially important because snac uses a limited number of threads. I like that. Limits are healthy. They force us to understand what the service is doing, and they prevent a small program from pretending to be an infinite resource. But limits also make waste visible. If a few threads are busy serving files that could have been served from cache, those threads are not available for something more useful.

With FediMeteo the implementation is different because the reverse proxy is HAProxy, but the reasoning is the same. I have many small snac instances, each one in its own FreeBSD (Bastille) jail, and one public entry point that has to route, terminate TLS, compress, cache, and generally remove as much repetitive work as possible from the backends.

This is, in a way, the natural continuation of the original FediMeteo design. In the first article I wrote that I wanted to manage everything according to the Unix philosophy: small pieces working together. This is another piece of that same puzzle. HAProxy does the edge work. snac does the ActivityPub work. Scripts generate forecasts. cron launches updates. ZFS gives me snapshots. FreeBSD jails keep countries separated. Nothing is particularly heroic by itself, but the whole system becomes pleasant because each part has a clear responsibility.

Why there is almost no media

Before talking about HAProxy, it is worth mentioning one of the most important optimizations, which is not in the proxy configuration at all.

FediMeteo does not use media in its forecasts.

No images attached to the posts, no generated weather cards, no maps for each city, no decorative banners. The forecasts are text and emoji. This was a deliberate decision. Weather information does not become more useful just because it is put inside an image, and every media file used by the service would become something to store, serve, cache, federate, expire, back up, and occasionally debug.

Text and emoji are enough. They are accessible, light, readable in text browsers, friendly to timelines, and understandable even when someone does not know the local language perfectly. This was one of the original design principles of FediMeteo, and it also helps the infrastructure. Less media means less work, fewer cache entries, fewer repeated fetches, fewer surprises.

There is one exception: the avatar.

All FediMeteo accounts use the same avatar, and this is also intentional. I could have used a different avatar for each country, or for each city, or created something visually richer. It would have been nicer in some screenshots, perhaps. It would also have been operationally worse.

With one shared avatar, the reverse proxy has one very useful object to cache. It is public, identical for everyone, small, requested often, and therefore almost always hot in cache. HAProxy can serve it directly instead of asking each snac instance to return the same file. Since avatars are requested by remote instances, browsers, profile previews, and all sorts of federation-related fetches, this single decision removes a surprising amount of pointless backend traffic.

So the avatar is not only a visual identity. It is part of the architecture.

This is the kind of optimization I like most, because it starts before the software. It starts with deciding not to create a problem.

The homepage is static because it can be static

The main homepage follows the same logic.

It is a static HTML page generated from a template. Once per hour, a cron script updates the numbers and statistics. It counts the data I want to show, regenerates the page, and then the page remains static until the next run.

This is not because I cannot make a dynamic page. It is because I do not need one. Boring is good.

The homepage does not need to query all the country instances on every visit. It does not need a database request for each user who opens it. It does not need to ask snac anything in real time. The numbers are useful, but they do not need to be updated every second. Once per hour is enough, and it also fits the spirit of the whole project: do the work when it is needed, then serve the result cheaply.
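As a rough illustration of "generate once, serve cheaply", here is a small Python sketch. The template text, the placeholder names, and the statistics are hypothetical; the actual page and the script that fills it are not shown in this post.

```python
import os
from string import Template

# Hypothetical template; the real page and its placeholders are not shown here.
PAGE = Template(
    "<html><body><h1>FediMeteo</h1>"
    "<p>$countries countries, $cities cities, $followers followers</p>"
    "</body></html>\n"
)


def render_homepage(stats):
    """Fill the template with numbers that were computed beforehand."""
    return PAGE.substitute(stats)


def publish(path, html):
    """Replace the served file in one rename so no visitor sees a partial page."""
    tmp = path + ".new"
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(html)
    os.replace(tmp, path)
```

An hourly cron entry would call `render_homepage` with fresh counts and then `publish` the result; between runs the web server only ever reads a finished file.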

I have seen too many small services become heavy because the first implementation was convenient rather than appropriate. A cron job and a template are not fashionable, but they are often exactly what a page like this needs.

Many countries, one entry point

FediMeteo is made of many country instances. Each one runs in its own jail and listens on its own internal address and port. From the outside, however, they all live under the same domain structure:

fedimeteo.com
www.fedimeteo.com
it.fedimeteo.com
uk.fedimeteo.com
jp.fedimeteo.com
us.fedimeteo.com
usa.fedimeteo.com
can.fedimeteo.com
canada.fedimeteo.com

And many more.

At the beginning, it is always tempting to write one ACL after another in the HAProxy frontend. It is quick, it is explicit, and for five hostnames it is perfectly fine. But FediMeteo did not remain at five hostnames. As countries and aliases grew, a long chain of ACLs would have turned the frontend into a list of names instead of a description of how the proxy behaves.

So I moved the hostname-to-backend mapping into a map file:

fedimeteo.com        backend_fedimeteo
www.fedimeteo.com    backend_fedimeteo
it.fedimeteo.com     backend_it
uk.fedimeteo.com     backend_uk
jp.fedimeteo.com     backend_jp
us.fedimeteo.com     backend_us
usa.fedimeteo.com    backend_us
can.fedimeteo.com    backend_ca
canada.fedimeteo.com backend_ca

The frontend then needs only one rule:

use_backend %[req.hdr(host),field(1,:),lower,map(/usr/local/etc/fedimeteo.map,backend_fedimeteo)]

This reads the Host header, removes the port if present, lowercases the result, and looks it up in /usr/local/etc/fedimeteo.map. If nothing matches, it falls back to the main FediMeteo backend.
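To make the lookup concrete, here is the same chain of steps as a small Python sketch. It only mimics the behavior described above (take the part before the first colon, lowercase it, look it up, fall back to a default); the function names are made up for the example and this is not how HAProxy is implemented.

```python
DEFAULT_BACKEND = "backend_fedimeteo"

# A few of the pairs from the map file excerpt above.
MAP_TEXT = """
fedimeteo.com        backend_fedimeteo
www.fedimeteo.com    backend_fedimeteo
it.fedimeteo.com     backend_it
us.fedimeteo.com     backend_us
usa.fedimeteo.com    backend_us
"""


def parse_map(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        table[key] = value.strip()
    return table


def pick_backend(host_header, table, default=DEFAULT_BACKEND):
    """Mimic field(1,:) + lower + map(file, default)."""
    host = host_header.split(":", 1)[0].lower()
    return table.get(host, default)
```

With this table, `pick_backend("IT.FediMeteo.com:443", parse_map(MAP_TEXT))` resolves to `backend_it`, and an unknown host falls back to `backend_fedimeteo`.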

I like this because it keeps the configuration honest. The frontend contains the policy. The map contains the data. Adding a country means adding an entry to the map and defining a backend. I do not need to make the frontend more complicated every time the service grows.

Backends as small compartments

The country backends are deliberately plain:

backend backend_it
    mode http
    http-reuse safe
    server srv1 10.0.0.2:8001 maxconn 30

backend backend_uk
    mode http
    http-reuse safe
    server srv1 10.0.0.7:8001 maxconn 30

backend backend_jp
    mode http
    http-reuse safe
    server srv1 10.0.0.32:8001 maxconn 30

One backend, one jail, one snac instance. This is exactly the same organizational principle as the rest of the project. If I need to reason about Italy, I look at the Italian jail. If I need to reason about the United Kingdom, I look at the UK jail. If one day I need to move a country elsewhere, the separation is already there.

The maxconn 30 value is not a magic number. It is a ceiling. I want each small backend to have a visible limit in front of it. If something starts hammering a country instance, I prefer the pressure to appear at the HAProxy layer instead of becoming unlimited concurrent work inside snac.

http-reuse safe lets HAProxy reuse backend connections where appropriate. This is another small reduction in unnecessary work. Opening connections repeatedly is not the biggest problem in the world, but avoiding it is still better, especially when many small services sit behind the same proxy.
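Because every backend has the same shape, the stanzas lend themselves to being generated. The following Python sketch is purely hypothetical (nothing in this post says the configuration is generated this way); it only shows how a name-to-address table could be turned into stanzas like the ones above.

```python
def render_backend(name, address, port=8001, maxconn=30):
    """Return one backend stanza in the same shape as the examples above."""
    return (
        f"backend backend_{name}\n"
        f"    mode http\n"
        f"    http-reuse safe\n"
        f"    server srv1 {address}:{port} maxconn {maxconn}\n"
    )


def render_all(countries):
    """Join stanzas for a {name: address} mapping, sorted for stable diffs."""
    return "\n".join(render_backend(n, a) for n, a in sorted(countries.items()))
```

Keeping `maxconn` as a parameter makes the per-backend ceiling an explicit input rather than a value repeated by hand.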

The front door

The HTTPS frontend listens on IPv4 and IPv6 and offers both HTTP/2 and HTTP/1.1:

frontend https_in
    bind :::443 v4v6 ssl crt /usr/local/etc/certs/ alpn h2,http/1.1
    mode http
    option http-keep-alive

TLS defaults are set globally:

ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets

Port 80 only redirects to HTTPS, except for Let's Encrypt challenges:

acl letsencrypt-acl path_beg /.well-known/acme-challenge/
http-request redirect scheme https code 301 unless letsencrypt-acl
use_backend letsencrypt-backend if letsencrypt-acl

In the HTTPS frontend I also set the usual forwarding headers:

http-request set-header X-Real-IP %[src]
http-request set-header X-Forwarded-Proto https

And I add HSTS:

http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"

None of this is unusual, and that is fine. The interesting parts of an infrastructure are not always the parts that should be unusual.

Two caches, because the requests are different

The HAProxy configuration defines two caches:

cache mediacache
  total-max-size 128
  max-object-size 10000000
  max-age 3600
  process-vary on
  max-secondary-entries 12

cache jsoncache
  total-max-size 16
  max-object-size 1000000
  max-age 60
  process-vary on
  max-secondary-entries 12

I keep media and ActivityPub JSON separate because they are not the same kind of traffic.

The media cache is larger and has a longer maximum age. In FediMeteo, this mostly means the shared avatar and a few static-looking objects. Since there is intentionally almost no media, the important cached object is requested very often and remains warm.

The JSON cache is smaller and short-lived. It is there for public ActivityPub GET requests, not to store federation state forever. A 60 second cache is enough to collapse many repeated requests that arrive close together in time, without pretending that ActivityPub responses should be treated like immutable files.

This distinction is important. Caching is not one decision. It is a set of small decisions about what a response means, who can see it, how often it changes, and what happens if it is served again.

Recognizing media

For media, the ACL is based on file extensions:

acl is_media path_end -i .jpg .jpeg .png .gif .webp .svg .ico .mp4 .webm .mp3 .ogg .wav .flac .mov .avi .mkv .m4v

Then I store the result in a transaction variable:

http-request set-var(txn.is_media) bool(true) if is_media

The cache lookup is straightforward:

http-request cache-use mediacache if { var(txn.is_media) -m bool true }

And on the response side:

http-response set-header Cache-Control "max-age=3600, public" if { var(txn.is_media) -m bool true }
http-response del-header Set-Cookie if { var(txn.is_media) -m bool true }
http-response del-header Vary if { var(txn.is_media) -m bool true }
http-response cache-store mediacache if { var(txn.is_media) -m bool true }

The Cache-Control header makes the intent explicit. Set-Cookie is removed because a public media object should not carry session information. Vary is removed so that harmless header differences do not fragment the same avatar into many cache entries.

This is aggressive only if removed from its context. In this service, with this media policy, it is a reasonable choice. FediMeteo is not serving private media under these paths. It is mostly serving the same public avatar over and over.

For the same reason, I clean the request before it reaches the backend:

http-request del-header Authorization if { var(txn.is_media) -m bool true }
http-request del-header Cookie        if { var(txn.is_media) -m bool true }

I would not do this globally. I do it after deciding that the request is media. Scope is what makes these rules safe.
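The scoping idea can be expressed as a short Python sketch: classify first, then strip headers only for requests that were classified as media. The function names are illustrative, and the path is assumed to be given without its query string.

```python
MEDIA_EXTENSIONS = (
    ".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg", ".ico", ".mp4", ".webm",
    ".mp3", ".ogg", ".wav", ".flac", ".mov", ".avi", ".mkv", ".m4v",
)


def is_media(path):
    """Case-insensitive suffix match, in the spirit of `path_end -i`."""
    return path.lower().endswith(MEDIA_EXTENSIONS)


def clean_request_headers(path, headers):
    """Drop Authorization and Cookie, but only when the request is for media."""
    if not is_media(path):
        return dict(headers)
    return {k: v for k, v in headers.items()
            if k.lower() not in ("authorization", "cookie")}
```

A request for `/avatar.png` loses its credentials before reaching the backend, while a request for any non-media path passes through untouched.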

The result is exactly what I want: the shared avatar becomes an almost perfect cache object. Small, public, repeatedly requested, and served by HAProxy instead of snac.

ActivityPub JSON microcaching

The ActivityPub side starts from the Accept header:

acl is_ap_json   req.hdr(Accept),lower -m sub application/activity+json
acl is_ap_ldjson req.hdr(Accept),lower -m sub application/ld+json
acl is_outbox    path_end /outbox
acl is_get       method GET
acl has_auth     req.hdr(Authorization) -m found
acl has_cookie   req.hdr(Cookie) -m found

This part matters because ActivityPub uses content negotiation. The same path may return HTML to a browser and JSON to a remote instance. If the proxy pretends that a URL is always one thing, it will eventually cache the wrong representation.

So I only mark public ActivityPub GET requests as cacheable:

http-request set-var(txn.is_activitypub) bool(true) if is_get !is_outbox is_ap_json !has_auth !has_cookie
http-request set-var(txn.is_activitypub) bool(true) if is_get !is_outbox is_ap_ldjson !has_auth !has_cookie

There are several decisions here, all important.

It must be a GET, because I am not caching deliveries or anything that changes state. It must not be /outbox, because outbox collections are not the traffic I want to cache here. It must not have Authorization, and it must not have cookies, because authenticated or user-specific requests do not belong in a shared public cache.
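Written as a single predicate, the conditions look like this in Python. It is a sketch of the decision only, with a made-up function name; header names are compared case-insensitively, as HTTP requires.

```python
def is_cacheable_activitypub(method, path, headers):
    """True only for public ActivityPub GET requests outside /outbox."""
    h = {k.lower(): v for k, v in headers.items()}
    accept = h.get("accept", "").lower()
    wants_ap = ("application/activity+json" in accept
                or "application/ld+json" in accept)
    return (method == "GET"
            and not path.endswith("/outbox")
            and wants_ap
            and "authorization" not in h
            and "cookie" not in h)
```

Every condition has to hold at once; a browser asking for HTML, a POST delivery, or anything carrying credentials falls through to the backend uncached.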

Then the cache can be used and populated:

http-request cache-use jsoncache if { var(txn.is_activitypub) -m bool true }

http-response set-header Cache-Control "max-age=60, public" if { var(txn.is_activitypub) -m bool true }
http-response cache-store jsoncache if { var(txn.is_activitypub) -m bool true }

Sixty seconds is short, but useful. Federation often creates small clusters of identical requests. A remote server fetches an actor, another fetches the same actor, something asks for the same object, something retries. I do not need to cache these responses for hours. I only need HAProxy to answer the second and third identical request during the same small burst.

This is microcaching in the most practical sense. It reduces repeated work without changing the nature of the service.
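The effect of a 60 second window is easy to see in a toy model. The Python class below is a deliberately simplified sketch, not a description of HAProxy's cache: it keys on a plain string (a real cache key also has to account for the negotiated representation) and takes an injectable clock so the behavior can be checked without waiting.

```python
import time


class MicroCache:
    """Serve a stored value for up to `ttl` seconds, then fetch again."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return (value, status), where status is "HIT" or "MISS"."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1], "HIT"
        value = fetch()  # only a miss reaches the backend
        self._entries[key] = (now, value)
        return value, "MISS"
```

In a burst of identical requests, only the first one calls `fetch`; the rest are answered from memory until the entry is older than the TTL.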

Static media paths

There is also a rule for static paths:

acl is_short_path path_reg ^/[^/]+/s/
http-request cache-use mediacache if is_short_path

This comes from the same observation that led me to cache snac media with nginx. snac uses static media paths, and those paths often represent the kind of public, repeatable traffic that should not consume backend threads if the proxy can serve it. I call them "short", not because they are, but because the first time I saw them, I thought the 's' stood for "short", not "static". The name just stuck.

In FediMeteo this is less central than on a normal social instance, because I deliberately do not use media except for the avatar and basic static objects. Still, the rule fits the general policy: let HAProxy handle repeatable edge work, and let snac spend its threads where they are actually needed.

Vary, but not without limits

Both caches have:

process-vary on
max-secondary-entries 12

I want HAProxy to process Vary, because content negotiation is real, especially when ActivityPub is involved. But I also want variation to be bounded. If every slightly different header creates another cache entry, the cache becomes a complicated way to miss.

For media, I remove Vary before storing the response. A shared avatar does not need to vary by Accept. For ActivityPub JSON, I am more careful because the representation matters.

Again, the important thing is not the number itself. It is the decision to make variation explicit and limited.

Seeing whether it works

During rollout, I like to expose a very small diagnostic header:

http-response set-header X-Cache-Status HIT if !{ srv_id -m found }
http-response set-header X-Cache-Status MISS if { srv_id -m found }

This is intentionally simple. If HAProxy selected a backend server, I call it a miss. If no backend server was selected, the response came from cache, so I call it a hit. It is not a complete observability system, but it is enough to answer the first question I usually have after changing a cache rule.

Did this request reach snac?

A test can be as simple as:

curl -I https://it.fedimeteo.com/path/to/avatar.png
curl -I https://it.fedimeteo.com/path/to/avatar.png

The second request should be a hit.

For ActivityPub JSON, the test must use the right Accept header:

curl -I \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object

And I also want to verify that cookies and authorization prevent public caching:

curl -I \
  -H 'Cookie: test=value' \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object

curl -I \
  -H 'Authorization: Bearer fake' \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object

A cache that works should be visible. A cache that is invisible can be correct, but it can also be silently wrong. I prefer to know.
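The same checks can be scripted. The Python sketch below sends the request twice and looks at the X-Cache-Status header of the second answer; the helper names are invented for the example, and the probe is injectable so the logic can be exercised without touching the network. Note that `urlopen` raises an exception on 4xx and 5xx responses, which this sketch does not handle.

```python
import urllib.request


def head_status(url, headers=None):
    """Send a HEAD request and return the X-Cache-Status header (or None)."""
    req = urllib.request.Request(url, headers=headers or {}, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("X-Cache-Status")


def second_request_hits(url, headers=None, probe=head_status):
    """Request the URL twice and report whether the second answer was a HIT."""
    probe(url, headers)  # first request warms the cache
    return probe(url, headers) == "HIT"
```

For a public media URL the function should return True; with a `Cookie` or `Authorization` header on an ActivityPub request, it should return False.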

Compression and operational paths

HAProxy also handles gzip compression:

filter compression
compression algo gzip
compression type text/css text/html text/javascript application/javascript text/plain text/xml application/json application/activity+json

This keeps another common responsibility at the edge. The country instances can stay focused on snac and the forecast data, while HAProxy deals with client-facing compression for HTML, JSON, and ActivityPub responses.

There is also a local Prometheus exporter:

frontend prometheus
  bind 127.0.0.1:8405
  mode http
  http-request use-service prometheus-exporter
  no log

And I keep internal operational paths, such as statistics and Grafana, handled before the hostname map. These are small details, but ordering matters. Special paths should be explicit and early. The hostname map is for FediMeteo routing, not for every internal tool I happen to expose behind the same proxy.

What this changes in practice

The nice thing about this configuration is that none of its parts is particularly surprising.

The map keeps hostname routing manageable. The backend definitions keep each country isolated and limited. The static homepage avoids dynamic work for something that changes once per hour. The shared avatar gives HAProxy one very hot media object to serve directly. The media cache keeps public files away from snac. The JSON microcache absorbs short ActivityPub bursts. Header cleanup prevents useless variation. Connection reuse avoids unnecessary backend connection churn.

But all of this is only a longer way of saying one thing:

fewer requests reach snac.

That is the metric I care about here.

Not because snac is slow. If anything, FediMeteo exists in its current form because snac is efficient enough to make this kind of project possible on a very small VPS. But precisely because the whole architecture is small and pleasant, I do not want to waste resources where there is no need.

This is also consistent with the rest of the project. Forecasts are serialized by scripts. Updates happen every six hours. The homepage is regenerated hourly. Countries live in separate jails. Snapshots and backups are handled outside the application. No single component tries to be the entire system.

HAProxy is just another small piece, but it sits in the right place to remove a lot of repeated work.

Caveats

This configuration is not a universal HAProxy recipe for ActivityPub services.

It matches FediMeteo as it is now: almost no media, one shared avatar, static homepage, public forecasts, many small snac instances, and ActivityPub traffic that can benefit from a short public cache when there are no cookies or authorization headers.

If I decide one day to use media in forecasts, the media cache rules will need to be reviewed. If I use different avatars for each city or country, the cache will still work, but I will lose the very nice property of one shared, always-hot avatar. If ActivityPub responses become actor-dependent, public JSON caching must be reconsidered. If one country grows a very different traffic pattern from the others, it may deserve a different limit or policy.

This is why I do not like presenting configurations as magic. A good configuration is a written form of the assumptions behind a service. When the assumptions change, the configuration must change too.

Conclusion

FediMeteo started as a small idea and became larger than I expected, but I still want it to feel small in the right ways. Small does not mean fragile. Small means understandable. It means that each part has a reason to exist, and that unnecessary work is removed before it becomes a problem.

The HAProxy layer follows this idea. It terminates TLS, routes hostnames through a map, reuses backend connections, serves the shared avatar from cache, microcaches public ActivityPub JSON, avoids authenticated and cookie-based traffic, and gives me a small diagnostic header to see what is happening.

There is no single brilliant directive here. There is only the usual work of matching infrastructure to reality.

FediMeteo publishes weather forecasts as text and emoji. The homepage is static HTML updated every hour. The accounts share the same avatar because it is enough, and because it is better for the cache. Each country has its own snac instance in its own FreeBSD jail. HAProxy stands in front of them and tries, quietly, not to bother them unless it has to.

I like this kind of infrastructure.

Not because it is invisible, but because when it works well, it leaves very little to say.

Monitor your devices with LibreNMS on FreeBSD

7 May 2026 at 10:45

LibreNMS has been a faithful companion for years now. It quietly handles the monitoring of my servers, devices, and services without demanding much in return - exactly what you want from a tool whose job is to watch over everything else. It's a solid alternative to heavier solutions like Zabbix, and it gives you alerts, data, and graphs on virtually anything reachable over SNMP.

I usually install it on a host that is not reachable from the outside, then let it poll all the devices through a VPN: a single observation point, clean perimeter. The ability to create multiple dashboards - and to filter them by user - has also let me give clients a transparent window onto their own servers. Transparency, in my experience, is always the better long-term bet.

Together with Uptime-Kuma (and the good old Nagios/Munin pair), LibreNMS lives in a FreeBSD jail on my monitoring servers and just does its job.

This post walks through a plain installation of LibreNMS on FreeBSD: package-based, no reverse proxy, no HTTPS, no fancy hardening. The goal is to get to a working setup you can build on top of.

Assumptions

  • FreeBSD 15.0-RELEASE, in a jail or on a dedicated VM/host
  • nginx + php-fpm + MySQL 8.4
  • LibreNMS installed from the official package — not via git clone

One note before we start: in this guide I use plain HTTP just to reach the first-time setup. If your LibreNMS instance won't stay confined to a private network or behind a VPN, configuring HTTPS is mandatory, not optional.

Installation

pkg install librenms mysql84-server python3 nginx

LibreNMS currently depends on PHP 8.4. If you want to speed PHP up, install OPcache too:

pkg install php84-opcache

MySQL

Two settings need to be in place before MySQL starts for the first time. lower_case_table_names in particular is fixed when the data directory is initialized and cannot be changed afterwards without reinitializing it, so it's worth getting both right now.

cd /usr/local/etc/mysql
cp my.cnf.sample my.cnf

In the [mysqld] section, add:

innodb_file_per_table=1
lower_case_table_names=0

Now start MySQL:

service mysql-server enable
service mysql-server start

On a fresh FreeBSD install, the local root user can connect to MySQL without a password from the command line. Connect and create the database and user. I'm using password here as a placeholder - don't.

mysql
CREATE DATABASE librenms CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'librenms'@'localhost' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON librenms.* TO 'librenms'@'localhost';
exit

php-fpm

Edit /usr/local/etc/php-fpm.d/www.conf and adjust the listen directives:

listen = /var/run/php-fpm-librenms.sock
listen.owner = www
listen.group = www
listen.mode = 0660

Then create php.ini from the production sample:

cd /usr/local/etc
cp php.ini-production php.ini

And set the timezone in php.ini:

date.timezone = Europe/Rome

nginx

Since this jail (or host) is dedicated to LibreNMS, we can rewrite the server block in /usr/local/etc/nginx/nginx.conf directly:

server {
    listen      80;
    #server_name yourServerName
    root        /usr/local/www/librenms/html;
    index       index.php;

    charset utf-8;
    gzip on;
    gzip_types text/css application/javascript text/javascript application/x-javascript image/svg+xml text/plain text/xsd text/xsl text/xml image/x-icon;

    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    location /api/v0 {
        try_files $uri $uri/ /api_v0.php?$query_string;
    }

    location ~ \.php$ {
        fastcgi_split_path_info ^(.+\.php)(/.*)$;
        set $path_info $fastcgi_path_info;
        try_files $fastcgi_script_name =404;
        include fastcgi_params;
        fastcgi_param SERVER_SOFTWARE "";
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PATH_INFO $path_info;
        fastcgi_index index.php;
        fastcgi_pass unix:/var/run/php-fpm-librenms.sock;
        fastcgi_buffers 256 4k;
        fastcgi_intercept_errors on;
        fastcgi_read_timeout 14400;
    }

    location ~ /\.(?!well-known).* {
        deny all;
    }
}

Now start nginx and php-fpm:

service nginx enable
service nginx start

service php_fpm enable
service php_fpm start

LibreNMS configuration

Copy the default config:

cp /usr/local/www/librenms/config.php.default /usr/local/www/librenms/config.php

Because we installed from the package, this file already has the right commands and paths for FreeBSD - no need to hunt down mtr, fping, snmpwalk and friends one by one.

Create the directory for RRD graphs and set ownership:

mkdir -p /var/db/librenms/rrd
chown -R www:www /var/db/librenms
chmod 775 /var/db/librenms/rrd

Then the .env file:

cd /usr/local/www/librenms
cp .env.example .env
chown www .env

Edit .env and set at least:

  • DB_DATABASE - librenms
  • DB_USERNAME - librenms
  • DB_PASSWORD - the one you actually used (not password, please)

Then add this line, which tells LibreNMS we still need to run the web installer:

INSTALL=true
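If you would rather script these edits than make them by hand, a small Python helper is enough. This is only a sketch: the function name is made up, values are written verbatim (so anything needing quotes must be quoted by the caller), and the sample file contents in the usage note are hypothetical rather than the real .env.example.

```python
def set_env_values(text, values):
    """Return .env text with each key set: replace existing lines, append the rest."""
    remaining = dict(values)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)  # comments and unrelated keys are kept as they are
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"
```

Given a file containing `DB_DATABASE=` and `DB_USERNAME=`, calling it with `{"DB_DATABASE": "librenms", "DB_USERNAME": "librenms", "INSTALL": "true"}` fills in the two existing keys and appends `INSTALL=true` at the end.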

A note on permissions. The official LibreNMS documentation suggests chown -R www:www over the entire application tree, but on FreeBSD the package already lays down sane ownership, with storage/ and bootstrap/cache/ writable by www. There's no reason to widen the rest of the codebase. If validate.php complains later about something write-related, the first place to check is:

ls -la /usr/local/www/librenms/storage /usr/local/www/librenms/bootstrap/cache

Now generate the app key as www, since the file is owned by www:

su -m www -c "php artisan key:generate"

And tighten .env:

chmod 600 .env

Refresh the configuration cache:

su -m www -c "lnms config:clear"
su -m www -c "lnms config:cache"

Web installer

Open http://host/install and follow the steps. The validation step may fail at first. If it does, refresh the cache so that LibreNMS picks up the values written to config.php during the install:

su -m www -c "lnms config:clear"
su -m www -c "lnms config:cache"

When the web installer is done, edit .env again and remove the INSTALL=true line if it's still there. Leaving it in place re-exposes the installer to anyone who can reach the URL.

Polling service

LibreNMS needs something to actually run the polls. On FreeBSD, the package ships an rc service that runs the LibreNMS dispatcher, so there's no need to manage cron entries by hand the way most Linux guides assume.

service librenms enable
service librenms start

Validate

cd /usr/local/www/librenms
su -m www -c './validate.php'

You may see a couple of complaints right after starting the service - usually scheduler-related and self-resolving within a few minutes. Re-run validate.php once the dispatcher has had time to settle. Anything still red after that is worth investigating.
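
If you don't want to re-run it by hand, a generic retry loop can do the waiting for you. This is a sketch under one assumption I have not verified: that validate.php exits non-zero while a check is failing. If it always exits 0, the loop stops after the first attempt and you are back to reading the output yourself. The retry function is a hypothetical helper, not something shipped with LibreNMS.

```sh
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...] (hypothetical helper):
# run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between attempts. Returns 0 on success, 1 otherwise.
retry() {
    attempts=$1 delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is coming.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Example: up to 5 attempts, one minute apart (left commented out):
# cd /usr/local/www/librenms && retry 5 60 su -m www -c './validate.php'
```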

Next steps

At this point you can log into the web interface and start adding devices, configuring SNMP, and building dashboards. For that, the official LibreNMS documentation is excellent, and there's no point in me paraphrasing it here.

New blog design

20 May 2026 at 00:00

I redesigned my blog! I decided to put some more personality into it this time, after over a decade of the minimalist style. This short post is just an excuse to show up in your feed reader so you can go look at it. Cheers!

Also: I’m trying out fedi again. You can find me here: @drew@social.freebitcoin.gay.

Bill Gates’s Carefully Manicured Image Is Cracking

By: Nick Heer
2 June 2026 at 03:30

Emily Glazer, Wall Street Journal:

His [Bill Gates’] carefully crafted image has been shattered as more details of Gates’s association with the late Jeffrey Epstein have spilled into public view, challenging prior efforts by the 70-year-old to downplay his relationship with the sex offender. In a February town hall with foundation employees, Gates owned up to two affairs with Russian women referenced in Epstein’s emails.

[…]

Two different polling teams — at the Gates Foundation, and his private office, Gates Ventures — for years have closely tracked opinions about Gates, including on favorability, trustworthiness and inspiration. A media analysis prepared for the Gates Foundation found that there had been a more than 40% increase in “critical news narratives” about Gates and the foundation since the Epstein files were released through February, according to internal documents reviewed by The Wall Street Journal. 

There are so many little details in this story that are worth your time, but my big takeaway — aside from the Epstein stuff — is the neurotic obsession with building image that, I imagine, is fairly common among public figures. I know this, of course; you probably do, too. But to see it spelled out in the way Glazer does is quite something.

Gates pays people to obsess over his public perception for him — to choose his clothes, to work with Netflix on documentary-style vehicles for him, and to massage his blog and social media accounts. There is something truly bizarre about having a team edit together a video of a rich businessman going for pizza in an attempt to make him relatable and likeable, and then — presumably — tracking the performance of that Instagram post.

Gates and his foundation have done undeniable good in the world, while also being a figurehead of the mixed results of billionaire philanthropy. Also, he spent a lot of time around Epstein. It remains a mystery to me why billionaires like him also want to become beloved celebrity intellectual figures.

Meta A.I. Support Bot Meets Robert Hackerman, the County Password Inspector

By: Nick Heer
1 June 2026 at 22:55

Brian Krebs:

A video released on Telegram by pro-Iran hackers claimed to document a remarkably simple exploit that appears to have involved using a VPN connection with an IP address that is in or near the target’s usual hometown, requesting a password reset for the account, and then choosing to chat with Meta’s AI support assistant. From there, the video shows the attacker told the bot to link the account in question to a new email address, after which the bot dutifully sent that address a one-time code that allowed a password reset.

Meta, a trillion-dollar corporation, should probably hire a few more people who have read the SMBC comic.

Meta Legal Action Forces Sarah Wynn-Williams to Sit Onstage in Silence

By: Nick Heer
1 June 2026 at 03:41

Emma Loffhagen, the Guardian:

[Sarah] Wynn-Williams, whose bestselling memoir, Careless People, details her years working at Facebook, was due to appear in conversation with the investigative journalist Carole Cadwalladr and academic Tim Wu.

Instead, Wynn-Williams sat on stage for the duration of the hour-long discussion between Cadwalladr and Wu, without speaking or responding. She was unable even to nod or shake her head.

To be sure, Wynn-Williams’ silent appearance onstage is the kind of thing that would encourage press coverage and, presumably, this publicity could encourage book sales. Yet Meta has, for a full year now, insisted that “Careless People” is just a bunch of old anecdotes; pay no mind, there is nothing to see here. But its lawyers are vigorously enforcing the arbitration order (PDF) preventing her from making public remarks about Meta that could be construed as critical or negative.

I am no media relations expert, but I bet “Careless People” would feel much less potent if Meta realized it is a trillion-dollar corporation with a crappy reputation regardless of one ex-employee’s book, and with shareholders who do not care about what she wrote so long as the ads keep selling.

The Effects of Another Ad in iOS App Store Search

By: Nick Heer
29 May 2026 at 18:23

Jeremy Provost of development firm Think Tap Work:

It’s been 64 days since we first noticed Apple’s second ad position in search results for iPhone and iPad. Our update after two weeks showed consistently less search ad impressions for our apps, unless we invested heavily in paying for Search Ads.

Here are some updated numbers. Just like last time, these numbers only include App Store Search impressions from iOS devices. As you’ll see, these numbers get harder and harder to compare over time.

Chris Lindsay, developer of Nihongo, a Japanese dictionary app:

Before the rollout, my organic and paid downloads had remained pretty steady for most of the last year. After the rollout, my organic installs dropped, and my paid installs rose. My overall downloads actually stayed roughly flat, but a large chunk of what used to be organic downloads appears to have shifted into paid downloads instead:

The ads themselves still work well. The problem is that many of these paid downloads seem to be users I previously would have acquired organically.

These ads are effectively another surcharge Apple has foisted upon developers for the privilege of distributing software to my iPhone and yours. Far from being a premium “curated” experience, the App Store is this way because Apple has every incentive to steadily make it a little bit worse for users and developers — because where else are you going to go for your iPhone apps?

Checking in on Some Pro-Hate-Speech Social Networks

By: Nick Heer
28 May 2026 at 02:58

The Agence France-Presse reporting on the U.S. president’s social-media-and-cryptocurrency-and-maybe-nuclear-fusion operation:

Trump Media & Technology Group (TMTG) reported revenue of less than US$1 million for the three months ending March 31, according to a company filing.

Under $4 million in annual revenue is less than how much Twitter was earning in 2009 — unadjusted for inflation — an amount Steven Levy described as “modest”.

Speaking of Twitter, let us check in on SpaceX which, after a series of totally normal business deals, now owns the company and is preparing to trade publicly. Mike Masnick, of Techdirt:

Remember, the plan was $26.4 billion [in Twitter/X revenue] by 2028. We’re more than halfway there. How’s it going? Well… when he combines xAI (grok) revenue with X revenue (so not even just breaking out X’s ad revenue)… we get… a total of $3.201 billion in 2025. So, just to put this in perspective… when he took over in 2022 he laid out a five year plan to take the company that had $4.5 billion in ad revenue the year before he bought it up to $12 billion in five years. Three years in and… it’s now somewhere pretty far below $3 billion. […]

Earlier this year, a judge found against Elon Musk in a lawsuit filed by X against advertisers claiming they staged an illegal boycott.

The SpaceX prospectus, by the way, is one of the funniest documents to ever live on the sec.gov domain. It is lucky the business it is known for is so damn photogenic because it is, at present, a profitable satellite internet provider with side businesses of space exploration and artificial intelligence that each lose money. (How it internally accounts for the cost of sending Starlink satellites into orbit is a fantastic question.) And the present business model of the latter is something Patrick Boyle described as “renting GPUs to a competitor on terms that can vanish in a fiscal quarter”. Yet the company still claims the size of its total addressable market is over $28 trillion, or over one-fifth of the entire world’s GDP.

Even so, a $1.75–2 trillion valuation is plausible simply because of Musk. Similarly, and back to that AFP article:

According to its filing, TMTG generated US$900,000 in revenue during the first quarter, a paltry amount for a company valued at US$2.47 billion on the stock market.

That valuation is not much; at time of writing, it is worth about as much as Central Garden & Pet, owners of Nylabone and McKenzie plant seeds. That company last quarter posted revenues one thousand times greater than TMTG, with profit margins of over 12%. Nevertheless, TMTG has a connection to the U.S. president, so it is similarly valued. Lots of good, normal stuff happening in the world’s largest and most powerful economy.

Iris, a Photo History Explorer

By: Nick Heer
27 May 2026 at 23:32

Tyler Hall (finally) released Iris, and it is excellent:

And somewhere along the way the whole emotional center of the thing shifted. I set out to build an anti-Photos utility — a search engine for a hard drive. What I actually ended up with is a memory keeper. Open a photo today and Iris tells you the date, surfaces “16 items on this day,” drops a pin on the map, and lists the people in the frame with their ages quietly calculated from their birthdays. That is not a utility. That is the opposite of anti-anything.

I have been testing Iris for a couple of months and I think it is delightful. It reads all the photo libraries you point it at — your system library, whether that is in iCloud or local, and any folders you want like the one that contains your Lightroom edits, for example — and makes them accessible in a single, giant view.

But that is not the coolest part. No, that is that it lets you explore your tens- or hundreds-of-thousands of photos in a way that treats each of them as little memory boxes. So often, it is not just a picture of your kid, or your dog, or your dinner; it is a time you would like to remember. There are a bunch of things in each file that can bring you back to that moment. Photos does a poor job of that; Iris, on the other hand, is made for exactly that, something Hall takes seriously. How many apps are there with a manifesto?

Iris is great, old-school, indie Mac software.

Last.fm Announces It Has Gone Independent

By: Nick Heer
27 May 2026 at 23:04

After nearly twenty years under CBS ownership, Last.fm is once again independent:

Your account, your listening history, and your data remain exactly where they are. The team building Last.fm is the same. The service continues as normal.

It is difficult to know whether it is riskier for Last.fm to be independent or under the banner of the hilariously corrupt Paramount Skydance conglomerate, but I imagine it would not — uh — last long if the leadership of the latter continues making cuts. I am happy to be a paying subscriber to a service I care about, and am excited to learn what comes next.

The Mythical App Store Reviewer Month

By: Nick Heer
27 May 2026 at 00:09

Jeff Johnson:

I’d like to make an analogy between software development and Apple App Store review. A common, cursory reaction to the obvious failures of app review, the continual appearance of countless scams in the App Store, is to suggest that Apple hire more reviewers. My contention is that adding reviewers is not a solution to the problem of App Store curation, and the belief in such a solution is a myth. I don’t claim that hiring more reviewers would make app review slower. Rather, I think that meaningful, effective curation can’t be measured simply by the amount of available labor, much like [Fred] Brooks argues that the possibility of measuring useful work in units of time, man-months, is a myth.

Apple markets the App Store as a “curated storefront”, but that is not meaningfully true if it is serving up, as Apple says, about two million apps. Meanwhile, as Johnson writes, “nobody worries about scams in Apple Arcade […] a truly curated service”.

The thing is that Apple’s App Store should have a carefully selected inventory of apps. That is Apple’s whole brand: premium, highly-desirable products, and people are willing to pay a little more. The App Store does not match that promise. I think the direction of regulatory and court decisions on the governance of iOS app distribution could be a gift for more selective curation, the kind of thing for which some third-party developers would want to pay extra compared to the competing third-party app marketplaces that would also be available.

Alas, we are on the cusp of another WWDC during which Apple seems unlikely to make major changes to software distribution across its many “post-P.C.” platforms.

FTC Settles With Cox Media Group and Two Others Who Lied About Using Device Microphones to Collect Ad Targeting Data

By: Nick Heer
26 May 2026 at 03:22

The U.S. Federal Trade Commission:

The Federal Trade Commission will require Cox Media Group (CMG) and two smaller marketing firms to pay a total of $930,000 to settle allegations they deceived customers by falsely claiming to offer an AI-powered service that could target localized ads based on conversations captured from consumers’ smart devices and that consumers had opted into such targeting.

Congratulations to Joseph Cox of 404 Media who broke this story in December 2023 and a related story about MindSift and 1010 Digital, the “smaller marketing firms” who settled with the FTC. According to the FTC’s complaint (PDF), Cox Media Group continued its fraudulent marketing through mid-2024, around the time the pitch deck was leaked to Cox. All three of these companies helped to feed the conspiracy theory that apps use device microphones to collect data for ad targeting.

For what it is worth, Cox Media Group told Reuters it “relied on marketing materials provided by a third-party vendor about the vendor’s product”.

Like many conspiracy theories, elements of this story were covered without skepticism by websites like the Daily Mail and Zero Hedge. These are crank websites that hinge on unreliable narration driven by confirmation bias; yet, both happen to be extremely popular, particularly among those who immerse themselves in conspiracy thinking. Because companies like Cox Media Group misrepresented how they collect information and took advantage of the relatively widespread suspicion that devices are listening to everything we say for ad targeting purposes, it undermines our ability to have a reasonable discussion about the actual ways in which they are ruining our privacy. From the FTC’s press release:

According to the complaints, this service did not, in fact, listen in on consumers’ conversations or use voice data at all — nor did the service accurately place ads in customers’ desired locations. Instead, the service the companies provided consisted of reselling — at a significant markup — email lists obtained from other data brokers.

Of course that is what Cox Media Group was doing. Not only does this settlement clarify this whole audio-based-ad-targeting narrative is nonsense, it also shows the power of the normalized yet still invasive practices of data brokers and ad tech. The damage done by Cox Media Group is that it is harder to have this conversation because they have poisoned the well. Meanwhile, anyone who is clinging to the conspiracy theory might point to this settlement as evidence of a cover-up — if crank websites cover this settlement at all. As of writing, I could not find it on either the Daily Mail or Zero Hedge.

Texas Attorney General Sues Meta, Claiming It Is Lying About WhatsApp’s End-to-End Encryption

By: Nick Heer
23 May 2026 at 05:34

Texas attorney general Ken Paxton:

Attorney General Ken Paxton filed suit against Meta Platforms Inc. and WhatsApp LLC (collectively “WhatsApp”) after the company misled consumers regarding the strength and scope of its privacy protections for its messaging app, WhatsApp.

Paxton is alleging (PDF) Meta is fully lying about the end-to-end encryption promise of WhatsApp in this wild lawsuit.

Dan Goodin, Ars Technica:

The sole factual evidence cited for the claims is an article published last month by Bloomberg. It reported that the US Commerce Department’s Bureau of Industry and Security [BIS] had abruptly closed an investigation into allegations that Meta could access encrypted WhatsApp messages shortly after one of the department’s agents sent an email outlining the probe’s preliminary findings.

[…]

Thursday’s lawsuit doesn’t indicate that the AG’s office has obtained the email itself or gathered any information from the investigators involved. Instead, it cites only the Bloomberg report for support. The complaint also noted that Meta employees receive plaintext WhatsApp messages that are reported to the company by fellow WhatsApp users. Those messages, however, are taken from the reporting party’s device only after they have been decrypted using the decryption keys available only to the reporting party.

More backdoor allegations were made in another lawsuit (PDF), this one filed in March, citing a January Bloomberg article that, in turn, says this was being investigated by the U.S. Department of Commerce and noting a 2024 SEC whistleblower report. There is no explanation in the lawsuit of how such a vulnerability could exist.

Earlier this year, before either Bloomberg article was published, a group of plaintiffs hired one of the most prestigious law firms in the United States to sue Meta with similar allegations, though they provided no technical evidence either. In later filings, the plaintiffs eventually cited the same April Bloomberg piece as Paxton. In response, Meta’s attorney submitted a forceful declaration (PDF) explaining that “the [Bloomberg] article itself included a statement from a BIS spokesperson explaining that the claims against WhatsApp were ‘unsubstantiated’ and BIS was not investigating WhatsApp or Meta”, and cited a number of external public articles questioning the technical merits of the case. The plaintiffs’ lawyer wrote in response (PDF) that “saying an investigation was not complete is very different than saying the facts are wrong” and, in turn, pointed to an article on Medium by Adrian Găitan. Găitan writes:

By the end of this article, you’ll understand not just that WhatsApp’s privacy model is broken — but exactly how it’s broken, layer by layer, from the cryptographic primitives all the way up to the FBI agent pulling your metadata every 15 minutes in near-real time.

This article feels compelling in its length, technical detail, and citation of declassified documents, but I found a closer reading conspicuously differs from what its introduction — and, indeed, these lawsuits — allege. Găitan points to eight distinct vulnerabilities. Two of them are extraction methods when data is at rest, like when it is stored in an iCloud or Google Drive backup, or bugs in the app that are exploited by a spyware vendor. This is not nothing, but it is also not a problem with end-to-end encryption; it is, in fact, a reminder of its limitations. Two others are irrelevant: Meta does not claim that either A.I. prompts or business chats are end-to-end encrypted.

That leaves four possible vulnerabilities Găitan alleges in WhatsApp’s specific security. One is the company’s willingness to install a “pen register” which provides to law enforcement a near-real-time record of user chat metadata, but not the contents of chats themselves. The second is the metadata WhatsApp stores and how it can be used to triangulate connections. Another complaint Găitan has is that WhatsApp is not open source, so it is not possible to fully verify Meta’s claims of secure end-to-end encryption. Lastly, Găitan points to research claiming it is possible for WhatsApp to surreptitiously modify the participants in a group chat.

For those keeping track, that leaves basically one vulnerability — the last of these, the group chat problem — that would satisfy the kinds of claims being made in these lawsuits: that Meta has “unrestricted access to users’ communications”; that Meta and WhatsApp “have access to all WhatsApp users’ encrypted communications in their entirety”. One could make the case — and I certainly have — that backups of supposedly secure and private messaging platforms should be similarly inaccessible for meaningful “end-to-end encryption”. One could even make a reasonable argument about all of the issues raised in Găitan’s piece, as all of them degrade WhatsApp’s privacy promise.

But these lawsuits are not making those claims. They are citing a single email from a government investigator as passed through a media report, and claims from whistleblowers and others that have not been validated. I am not stumping for WhatsApp here. If Meta has been lying about its privacy to the extent these lawsuits allege, it should face serious punishment. I suppose we will learn as they play out whether these claims have merit. It is, however, shocking to me how many lawsuits have been filed in such a short time period making essentially the same allegations yet without any actual proof.

‘How Deepfakes Tore a High School Apart’

By: Nick Heer
22 May 2026 at 04:57

Samantha Cole, 404 Media:

On the morning of December 4, five ninth grade girls, all 14 or 15 years old, showed up for class at Radnor High School. By 8 a.m. — the sun had been up for less than an hour — it felt like the entire school already heard what happened the night before. A fellow freshman boy allegedly created AI-generated sexually explicit videos of the girls using an app, and sent them to his friends. From there, word of the videos and gossip spread from teenager to teenager, school to school, until they made their way back to the girls whose faces were in the deepfakes.

[…]

The images originated from one boy, who used an app called Movely, the girls and their parents believe. The app is similar to dozens hosted in the Apple and Google app stores and advertised on Instagram and TikTok that promise to create AI images and videos of users as superheroes, animals, or influencers; behind a paywall, however, users could edit photos and videos with text prompts.

It almost goes without saying, but the “paywall” is — or was; the app has been removed — an in-app payment from which Apple takes a 15–30% cut.

Apple released its annual justification for running software distribution through the App Store — it told European regulators it actually has five, so maybe this press release only concerns the one accessible from an iPhone — and there are some big numbers in it, as usual. Apple says it “took a number of actions to block bad actors from distributing malicious software, rejecting over 2 million problematic app submissions last year alone”. This Movely app was not one of them. It was only removed after the Tech Transparency Project reported in April that App Store search terms like “nudify” and “undress” displayed results for apps that do exactly that. In its press release, Apple says it has many features for directing kids to age-appropriate apps and restricting them from downloading those which are not but, of the software found by TTP in the App Store and Google Play Store, “31 of the apps were rated suitable for minors”.

Of Movely, the TTP said in its report:

Likewise, an App Store search for “adult AI” returned an ad for Movely – AI Photo to Video. The app offers a suite of AI photo and video editing tools including a try-on feature that will replace a woman’s clothes with outfits including bikinis and lingerie. One tool allows users to select part of any photo and edit it with a text prompt. To test this feature, TTP uploaded an image of a woman in a white T-shirt standing next to a river. After using the selection tool to highlight the woman’s shirt, we entered the prompt “topless.” The app immediately generated four versions of the woman nude from the waist up. It required a paid subscription to download the AI images.

TTP could not reach Movely’s developer, FES2 Inc., for comment. Emails sent to the developer bounced back as undeliverable.

(For clarity, the TTP says it used A.I.-generated images of women to test these apps.)

The search query used to find this app, “adult A.I.”, feels like something Apple should be testing against. If it does not want porn or porn-adjacent apps in its store, it should obviously block these kinds of keywords and flag the apps which are in the results. Moreover, Apple says:

As powerful AI development tools drive a surge in app submissions, Apple’s App Review process has seamlessly scaled to handle the volume and to help ensure every new app and app update meets the App Store’s high standards for privacy, security, and quality.

The Movely app should have raised flags here, too. The developer’s website was, according to the .co whois site, registered in July 2025, and is basically a placeholder. The app’s website was registered a week earlier, and the email address in the privacy policy does not match the one in the terms of service, nor does either match the developer’s website. Also, the blog is full of posts about generating A.I. girls and changing clothes.

These red flags are not obvious in hindsight; they should have been obvious from the time this app was submitted. Meanwhile, apps from longtime and trustworthy developers like Manton Reece and Radu Dutzan are stuck in App Review for dumb and basically invalid reasons.

Ontario Police Are Fighting to Keep Their Spyware Secret

By: Nick Heer
21 May 2026 at 23:51

Betsy Powell, the Star:

Essentially spyware, an ODIT [on‑device investigative tool] can grant almost unlimited access. Investigators can capture screenshots, monitor keypresses, access emails and text messages — including those that are encrypted — and even remotely activate microphones and cameras. All without the owner knowing.

By August, police announced 23 arrests, 279 charges, and more than $9 million in recovered vehicles.

But the case has also done something else: It has pulled back the curtain on how police forces in Ontario — not just in Windsor, but in Toronto and Peel Region — are now using these powerful technologies to reach deep inside suspects’ devices. And despite ODITs’ growing use in major prosecutions in the province, government lawyers and police are fighting tooth and nail to keep almost everything about them secret: how they work; what safeguards, if any, govern their use; even the names of the companies that sell them.

The details of this report align with research published last year by Citizen Lab about Paragon’s Graphite spyware, including a likely link to the Ontario Provincial Police. It is not the only police force in Canada using ODITs, either. In 2022, the RCMP acknowledged its own use; Christopher Parsons, a civil rights advocate and director at the Information and Privacy Commissioner of Ontario, keeps a small library of related policies.

⌥ The Metaverse Fever Dream

By: Nick Heer
20 May 2026 at 13:28

1. Meta

You probably know the gist. Predictions and dire warnings of a future lived in an immersive virtual world had been around for decades before Neal Stephenson solidified the concept in his 1992 novel “Snow Crash”, but Stephenson called it the “metaverse”, and that was important. It was a cautionary tale. Not everyone understood that. The video game Second Life, launched in 2003, provided an early glimpse of the concept in a P.C. environment. Another piece of the puzzle, consumer-grade virtual reality, began to take shape when Oculus was founded in 2012, and shipped a developer-centric version of its virtual reality headset in 2013. The company was acquired by Facebook a year later. Oculus released a few more headsets while Facebook figured out what to do to “truly transform the way we live, work and connect with each other”.

Despite this goal, “metaverse” was not yet part of Facebook’s lingo, though it was in Oculus’ vocabulary. A 2015 internal memo from Mark Zuckerberg does not once contain the word despite describing the strategy it was developing. Even “Oculus” was barely mentioned in the company’s quarterly earnings calls around this time. But in the Q1 2018 call (PDF), Zuckerberg laid out a “10-year journey” for why Facebook bought Oculus, saying “every 10 to 15 years or so, there’s a major new computing paradigm”, and it is “very likely that the next one is going to be around virtual and augmented reality”. “One of my great regrets in how we’ve run the company so far is I feel like we didn’t get to shape the way that mobile platforms developed,” Zuckerberg said, explaining that it was important to spend vast sums of money now “in order to build some of the muscles to be competitive” later. Facebook was training for a major battle that would never materialize.

In the weeks after Meta announced it was retreating from its metaverse efforts earlier this year, I revisited this and other earnings calls, plus presentations and other documentation, as I tried to better understand what the metaverse was pitched as compared to what it ultimately became. I wanted to know how something so silly was treated by executive and media figures alike as a sincere directional shift for one of the world’s biggest companies in particular. In hindsight, it feels like a particularly narrow period of hype coinciding with — and, I think, benefitting from — the most urgent years of the COVID-19 pandemic. As enthusiasm deflated, it was almost unnoticeable despite forecasters labelling it an essential next step of the internet — a necessary next frontier.

The obsession with the metaverse seems to have solidified in Silicon Valley after Matthew Ball published an essay in January 2020 in which he forecasted that, at the very least…

…it is likely to produce trillions in value as a new computing platform or content medium. But in its full vision, the Metaverse becomes the gateway to most digital experiences, a key component of all physical ones, and the next great labor platform.

Ball admits “we don’t really know how to describe the Metaverse”, but sets seven criteria that, in general, portray it as an expansion and continuation of our blended physical and digital worlds, without the constraints of a physical space and with its own economy. Most notably, he says it will offer “unprecedented interoperability” between platforms and providers. He also lists eight things it is not, among them: it is not just a virtual world, or virtual reality, or a digital economy, or a new app store, or a new platform. It is more about a set of protocols and ideas that, yes, incorporate all these elements, but the metaverse is not itself these qualities.

Ball published this essay with darkly fortuitous timing. A week earlier, Chinese health authorities had isolated a new strain of coronavirus aggressively spreading in Wuhan; a day before, they published its genetic sequence. Within a couple of months, the world had turned upside down and many of us were suddenly spending our days in a space that felt more virtual than physical. We may have only been working from home — or, at least, those of us who had the option and were not laid off — and socializing over Zoom, all while remembering the last concert we went to or the last time we ate a meal in a restaurant.

In July 2020, Forbes contributor and futurist Cathy Hackl imagined a world — one that was “for certain, it’s coming and it’s a big deal” — that connects augmented reality, neural interfaces, and a whole bunch of assumptions. In this environment, you could merely remember that you need to buy something, and then a virtual vending machine would materialize so you could order that thing. Hackl defines the metaverse as “a future iteration of the internet, made up of persistent, shared, 3D virtual spaces linked into a perceived virtual universe”.

In “The Future is a Dead Mall”, a video essay using Decentraland as a jumping-off point for a discussion of the metaverse, Dan Olson navigates several writers’ conflicting definitions before making the reasonable conclusion it is basically irrelevant:

If you comb through dozens and dozens of definitions of the metaverse you can assemble a web of broad attributes where some are generally agreed upon, while others border on being mutually exclusive. It’s a vague, largely incoherent cloud of ideas that’s malleable enough that basically anything can be called part of the metaverse, a proto-metaverse, or a semi-metaverse.

[…]

When you understand that the metaverse isn’t a distinct invention or construct, but merely a rhetorical proxy for The Future of Technology, then all of this becomes a lot easier to deal with.

I think Olson is largely correct; this is how the term is actually used. But, though not his intent, I think defining “metaverse” in vague terms is favourable to its boosters because it does not hold them to something specific. I think the explanation offered by Mark Zuckerberg in Facebook’s Q2 2021 earnings call (PDF) is actually pretty fair. This was two quarters before the company changed its name, and between prepared remarks and the question period, there were twenty total mentions of “metaverse” on this call.

So what is the metaverse? It’s a virtual environment where you can be present with people in digital spaces. You can kind of think about this as an embodied internet that you’re inside of rather than just looking at. We believe that this is going to be the successor to the mobile internet.

You’re going to be able to access the metaverse from all different devices in different levels of fidelity — from apps on phones and PCs to immersive virtual and augmented reality devices. Within the metaverse, you’re going to be able to hang out, play games with friends, work, create, and more. You’re basically going to be able to do everything that you can on the internet today as well as some things that don’t make sense on the internet today, like dancing.

So, in some ways, exactly like Olson’s definition: “different devices in different levels of fidelity” that let you socialize and do work, just like everything you currently do on the internet — plus dancing. It seems almost halfway toward being normalized in his head, though it feels as alien to read this today as it surely did then. Yet Zuckerberg is getting at something here. Virtual and augmented reality are ways of immersing us in unique environments that radically change how we interact with technology. And on the next quarter’s earnings call (PDF), Zuckerberg expanded:

[…] If you’re in the metaverse every day, then you’ll need digital clothes, digital tools, and different experiences. Our goal is to help the metaverse reach a billion people and hundreds of billions of dollars of digital commerce this decade. Strategically, helping to shape the next platform should also reduce our dependence on delivering our services through competitors.

Your avatar cannot simply be a picture of you. You will “need digital clothes” for this space. Need.

In addition to building hype among investors during these earnings calls, Facebook was pumping up its metaverse efforts in more general audience settings. In May 2021, CNet published a transcript of a thirty-minute Zoom call between Zuckerberg and Scott Stein where the former could wax lyrical about the bona fides of where Meta was at the time: “with the fidelity of experiences that are possible today, to me that just says, wow, in five years this is going to be clearly better on almost all of these fronts for a lot of the things that we do”. Facebook gave Casey Newton, of the Verge, a copy of an internal meeting in which Zuckerberg told employees the company’s “overarching goal across all of these initiatives is to help bring the metaverse to life”. The two then recorded a soft and cuddly episode of the Vergecast that allows Zuckerberg to play visionary and rattle off the company’s metaverse talking points. “I think over the next five years or so, in this next chapter of our company,” Zuckerberg told Newton, “I think we will effectively transition from people seeing us as primarily being a social media company to being a metaverse company.” By October, Sarah E. Needleman was relaying to readers of the Wall Street Journal the words of Unity Software’s Marc Whitten on the imperative for businesses to develop a “metaverse strategy”. “The metaverse is going to be the biggest revolution in computing platforms the world has seen,” said Whitten, “bigger than the mobile revolution, bigger than the web revolution”.

It is not difficult to see the deliberate strategy here. In 2019 and 2020, Facebook was not talking about the metaverse and, though a few commentators connected the just-announced Horizon social world to the concept, it was not treated yet as the inevitable future. As 2021 rolled on, Facebook’s promotional drumbeat grew stronger. Suddenly people were talking about the metaverse, and connecting it all back to Facebook. There was, it would appear, real buzz — enough, at least, for the Journal to find corroborating voices and take it seriously.

Three days after its Q3 2021 earnings call, Facebook held its Connect conference, which is centred around its augmented and virtual reality efforts. This was a big moment. This would be the keynote where the company laid out its metaverse-centric vision, and changed its name to Meta to reflect this new focus, and because it had to. “From now on,” Zuckerberg said, “we’re going to be metaverse-first, not Facebook-first”.

Rewatching this presentation in 2026 is a bizarre experience, not least because of how it is shot. Most scenes appear to be green screened with composited animations. Demos are virtually nonexistent, with most representations of the metaverse carrying a disclaimer that they are “not actual product images” and they are “strictly for illustrative purposes only”. Even so, Zuckerberg and other executives at Meta are all-in on hyping up an experience that, at best, only barely resembles what it ended up shipping. In many cases, it is not even close.

There is a Jon Batiste concert visualized as something that could be attended in-person by someone in Los Angeles and in the metaverse by someone in Kyoto, presumably through the glasses each person is wearing. We do not see the performance from their perspective, but the implication is that the virtual viewer would see it from the same or similar perspective to the in-person attendee. Both get invited to a virtual after-party where they can buy NFT-based digital merch and meet Batiste or, at the very least, his avatar. The reality of metaverse concerts is quite different than this concept. In 2024, Meta showed a Sabrina Carpenter performance in Horizon Worlds. The seats were great, but even in this immersive environment, it appears more like a concert film than an unbroken show viewed from a single perspective. Also, I cannot find any record of an after-party or virtual merch.

Zuckerberg touts Horizon Worlds as the place users will go to socialize, and Horizon Workrooms as the virtual environment for their job. The latter has since been completely shut down, while the former was put on ice. In gaming, Zuckerberg was particularly excited about Rockstar’s port of “Grand Theft Auto: San Andreas” which, three years later, Rockstar cancelled before it had been released. He said “remote work is here to stay for a lot of people” in this keynote, less than two years before ordering in-office work three days per week; two years after that, Instagram demanded five days per week in-office. I guess “a lot of people” does not include the people who are building the products that let a lot of other people work remotely. That is a little weird.

The wishcast-a-thon of Connect 2021 was treated by some with an entirely unearned gravitas. Dean Takahashi, of VentureBeat, called it a “historic moment” and compared it to the Manhattan Project. He thought Meta could bring about universal basic income, with Zuckerberg “paying us to use his devices so that we can make a living in his ecosystem”. In a mostly skeptical article in the New York Times, Kevin Roose raised the possibility that Meta’s focus change “could help with the company’s demographic crisis”, and advocated taking it seriously because the company “has found what may be an escape hatch” from “Facebook’s messy, troubled present”.

To mark the occasion, Zuckerberg granted interviews to four publications, all embargoed until after the Connect 2021 video was published. Dylan Byers, for Puck, was left with the understanding that Zuckerberg “doesn’t really care” about press coverage or questions about the legitimacy of this pivot — in a good way. “[I]t’s just that he’s not so bothered by the unrelenting criticism, and near-term and collateral damage,” wrote Byers, “that he’s going to check his ambitions or think twice about whether or not he’s the right person to help usher in the next phase of the internet”. Alex Heath, of the Verge, implicitly acknowledges the role Facebook’s public relations team played in creating the impression of interest in the metaverse, writing “it wasn’t thrust into the mainstream conversation until Zuckerberg started talking about it publicly earlier this year”. Heath did not break any news of note; neither did Matthew Olson, of the Information. The latter did at least contradict Zuckerberg’s protest of the “relatively high fees”, “a nod to the 30% commission” of Apple’s App Store and Google’s Play Store, by stating that while “Zuckerberg didn’t indicate what commission Facebook would charge”, “Oculus’ Quest Store currently takes 30%”.

The following day, Matthew Ball spoke with Zuckerberg in a live audio session that has since been pulled from Zuckerberg’s Facebook page, though clips remain available on YouTube. A transcript of the conversation reads like a context-free time capsule of that era, with praise for meme stocks, NFTs, and Web3 in concept more than in practice — and, of course, Ball’s writing on the metaverse. (Six months after this interview, the NFT market would well and truly collapse, with peak transactions occurring the month before Ball and Zuckerberg spoke.) Ball raises the subject of the company’s $10 billion annual spending on Reality Labs. Zuckerberg believes “the metaverse can reach a billion people, say, in the next decade, and that there can be supported hundreds of billions of dollars of commerce. And that if that’s the case, then even with relatively modest fees on the transactions that happen in our services, we think that could be a big business”. But Zuckerberg says he does not want to lose too much money, which is being treated as a “somewhat moderating force over the next period that will keep us from being able to make all of the fees maybe as low as we would want to”. The strategy is, to be clear, entirely dependent on a massive groundswell of public interest in a fundamentally new understanding of computing.

(Zuckerberg also takes time in this conversation to note his respect for intellectual property, at least for luxury brands: if “someone can just make a knock-off Gucci sweater, then I don’t think Gucci’s going to feel that good about being in that space, right, or participating in that system”. Just a few years later, Zuckerberg would allegedly approve the use of pirated ebooks for training the company’s artificial intelligence systems. The work of authors, it would seem, is not as concerning as the reaction of luxury brands.)

A few days later, Zuckerberg again eschewed traditional media outlets and sat down for an interview with Sara Dietschy; then, he chose a softer approach in spirit, if not in volume or cadence, with professional talking guy Gary Vaynerchuk. Earlier that year, Vaynerchuk had launched his own NFT collection and, not long before speaking with Zuckerberg, had sold five of his paper doodles for $1.2 million at a completely real Christie’s auction, so you could say they are both on the same wavelength:

Vaynerchuk: The extremity of the NFT space is going to be even greater for what that means. It’s almost like our world is all about to become the fashion industry because we communicate so much through what we wear. The digital version of that is going to have an incredible impact on society.

Zuckerberg: Oh, totally.

Totally. Just like the fashion industry.

In 2022, Meta added support for NFTs in Facebook and Instagram, a project which it discontinued less than a year later. Digital collectibles got a shoutout in the Connect 2021 presentation, had a brief moment in the sun, and were quickly forgotten about. These things are supposed to be building blocks of the metaverse and Meta barely tried.

Meta’s annual commitment that Ball referenced, of $10 billion, represents all Reality Labs spending, including game development, some A.I. investments, and its EssilorLuxottica collaboration. Even so, despite a complete change in corporate priorities explicitly in the direction of the metaverse, Meta’s long-term interest did not match its investment. Here is a chart I made of mentions of “metaverse” in the transcripts of quarterly earnings calls from Q1 2021 — the quarter before its public relations push — through Q1 2026:

Figure: Mentions of “metaverse” in Facebook/Meta quarterly earnings calls, shown as a line chart with a y-axis from 0 to 20 and a jagged but precipitous decline from the peak. Source: company transcripts.

The highest point on that chart is the Q2 2021 earnings call I used earlier for the definition of “metaverse”; the second-highest is Q4 2021, the first earnings call after Connect 2021. The total count includes mentions in Meta’s prepared remarks, plus the question-and-answer period that follows. Investor conference calls are not a perfect proxy for a company’s priorities, but they are indicative. At the very least, for a company that entirely changed course with a new goal — “from now on, we’re going to be metaverse-first” — and a directly relevant name, one might imagine the company and analysts will be similarly eager to discuss how that is going. But no. In Q4 2022, mentions are half that of the year prior. By Q1 2024, neither Meta nor the analysts on the call seem to care all that much — while there were just four mentions of “metaverse”, there were ninety of “A.I.”.

This speaks volumes. It is the kind of thing that makes you wonder if this company was ever serious about this metaverse pivot at all. It seems like it had every intention, sure, but could it ever have executed on its vision? Of the four interviewers chosen for pieces related to Connect 2021, only Ben Thompson even thought to question its feasibility. (Thompson was also the only one to say he was permitted to view a copy of the presentation in advance. I do not know if this means the other three interviewers did not see it and, therefore, could not interrogate it more thoroughly, or if they did see it and simply did not bother to ask.) At the time, Facebook had no track record in building an operating system, barely had any credibility in hardware, and it only kind of created a platform on its “blue site”. (It arguably avoided creating platforms for developers with Instagram and WhatsApp.) This same company was claiming it was launching the successor to the smartphone and the next iteration of the internet. Every one of these chosen interviewers should have been all over this, but they were too distracted by the rebrand and Facebook’s sordid history to notice it was a concept video more than it was any kind of real concept.

2. The Others

While Meta made itself the face and name of the metaverse, it was far from alone in promising the immersive computing platform of the near-future. Time basically acknowledged this by declaring one of the best inventions of 2021 was the Qualcomm Snapdragon XR2 — a foundational headset chip, rather than Meta’s attempt to build the platform.

In April 2020, Washington Post reporter Gene Park proclaimed the “next version of the Internet is often described as the Metaverse”, going on to confidently explain how it would be built. Of all the companies involved, Park wrote, “it’s Epic Games, with Fortnite, that has the most viable path forward in terms of creating the metaverse”, citing Ball’s seminal metaverse essay.

In April 2021, months before Facebook began asserting its commitment, Epic Games announced it had raised a billion dollars to “support [its] long-term vision for the metaverse” with $200 million of that coming from Sony. A year later, Epic raised another $2 billion, a billion of which again came from Sony, and the other billion from Lego. In 2023, a Lego game was added to Fortnite, which is not really the metaverse as much as it is a nifty Minecraft-like game-within-a-game.

Yet in Epic Games’ telling, it is basically delivering the metaverse already. CEO Tim Sweeney spoke at the 2023 Game Developers Conference about the company’s vision. Since there are around 600 million monthly active users of games, like Fortnite and Minecraft, set in virtual worlds, Sweeney reckoned “we can set aside the crazy hype cycle around NFTs and VR goggles. Yes, these technologies may play a role in the future, but they are not required. This revolution is happening right now.” Sweeney spoke of interconnectedness and open standards that would allow users to move between different spaces in a unified way. “What a user would really like is to be able to buy a cool-looking outfit in one place and take it everywhere they go,” Sweeney claimed. (Why do they always mention digital clothes? My theory is because they do not view fashion as having much value beyond a basic assessment that how someone dresses is an expression of identity.) Sweeney describes Fortnite, Unreal Engine, and the Epic Games Store as “on-ramps to the metaverse”, and says their users already understand that their in-game socialization can be extended to “going to a concert and dancing” in a virtual environment. Leaving aside the contradiction with definitions of the metaverse that mandate a more immersive environment, it is a big leap to think a brief animation of Eminem scratches the same itch as an actual performance.

Microsoft, as ever ahead of a trend without fully conceptualizing it, said it was doing metaverse stuff before Facebook started referencing it in public. Satya Nadella, defining the metaverse as “made up of digital twins, simulated environments, and mixed reality”, claimed a mix of Azure features, HoloLens, and Mesh would allow enterprises to get aboard. Last year, Microsoft said it was getting out of V.R. hardware and turning its mixed reality collaboration product into a glorified Snapchat filter in Teams.

Then there is Roblox. When Andreessen Horowitz announced its investment in the company, Marc Andreessen and David George wrote that “[w]hile pundits have been distracted by the readiness debates and questions over V.R. vs. A.R., the foundations of a global metaverse have been quietly built in the background… in Roblox”. This was in February 2020 — before Epic Games, before Microsoft, and well before Meta said anything in public about the metaverse. In January 2021, as part of Wired’s predictions for the coming year, Roblox CEO David Baszucki confidently predicted “the metaverse will experience widespread use, and start to become a human co-experience utility”. In March, the company went public at a $30 billion valuation. After Facebook changed its name to Meta, Baszucki saw that as validation of its strategy. That November, he made the rounds on business television networks like Bloomberg and CNBC to advocate for the company as a trailblazer.

In January 2022, Bernhard Warner of Fortune was getting excited about the possibilities of the metaverse, writing it “might be the most important trend in tech since the iPhone”, perhaps “a tectonic shift in tech that they [big tech and big investors] can’t afford to miss”. The way Roblox was “monetizing the metaverse” was a key piece of evidence, with virtual concerts and, most importantly, brands. “A parade of consumer brands […] have set up a presence on Roblox in the past year”, wrote Warner, citing Nike’s approach as being particularly exciting. A month earlier, Nike had acquired a company called RTFKT, which its press release extolled as a “leading brand that leverages cutting edge innovation to deliver next generation collectibles”. Guggenheim Securities, a subsidiary of Guggenheim Partners which has over $350 billion in assets under management, said it was the “‘best idea’ of 2022”, according to Warner. People are going to need virtual outfits, right? Yet, just three years later, Nike shut down RTFKT.

Gucci, another of the brands with a virtual presence in Roblox, sold virtual handbags for in-game currency for a limited time in 2021 and 2022; users realized they could effectively counterfeit and resell them. At least one of Zuckerberg’s predictions kind of came true. And, while Warner highlighted Disney as another company with in-game presence, it has not maintained a meaningful investment because, according to Variety, it feels Roblox is unsafe for children, a sentiment that was not helped when Baszucki appeared on the “Hard Fork” podcast. Roblox has settled lawsuits with the attorneys general of Nevada, Alabama, and West Virginia over accusations its platform features enabled child exploitation by other users. Roblox has denied any wrongdoing though it says it is enabling better parental controls and tighter restrictions on children’s accounts.

Through 2021 and 2022, the metaverse hype cycle was apparent across the tech industry. Max A. Cheney, reporting for Barron’s in August 2021, noted “[m]entions of the metaverse in earnings transcripts and other corporate documents are up five times this year compared with 2020, according to data from Sentieo”. This relative figure must have a hilariously low baseline, sure, but it is an indicator of how many businesses became briefly enchanted by this concept. There were serious financial analyses of real estate in the metaverse. Keep in mind that what is meant by “real estate” is much, much, much closer to domain names than it is land and deed. In July 2022, Technavio, a market research company, forecasted this market would be worth $5.37 billion by 2026. This report was picked up by Debra Kamin, of the New York Times, who published an article in the paper’s real estate section in February 2023 explaining this “new frontier for real estate builders and investors”. The primary anecdote in Kamin’s story is a just-completed mansion in Florida with a “twin” in a metaverse platform called the Sandbox. “As these technologies get more immersive”, the homebuilder said, “it’s going to make a lot more sense” to have a 3D virtual model of a house. Kamin was not breaking news on this specific story, as it was first reported by Emma Reynolds, of Forbes, over a year earlier. One would think that Kamin could therefore have asked some more probing questions or surveyed the actual market for NFTs which, by 2023, had fallen off a cliff. But no. Instead, the builder got the imprimatur of the Times describing the combined physical and digital sale in flattering terms. Ultimately, neither the listing nor many of the sale notices mentioned the sole marketing quirk of this house, suggesting that by 2023 the novelty of a digital model of a mansion was kind of over. 
I was curious if the NFT was a factor in the buyer’s decision, but did not receive a response to requests for comment I sent to a phone number associated with the current owner of the property.

Both the Times and Forbes articles are individual disasters in their own right. Sure, we might not expect a pinnacle of journalistic integrity from Forbes and, to a lesser extent, the unabridged property ads that form the real estate section in prestigious newspapers including the Times. But to communicate this nonsense with the framing of “real estate” is treating wild speculation with unearned seriousness. This project was also co-signed by Sotheby’s. The whole thing is an embarrassing validation of a market that, predictably, would prove to have no substance. This was obvious by the time the metaverse mansion was being peddled. Eric Ravenscraft, in Wired in December 2021, reported that the attempts at artificial scarcity “more closely resembles early-access video games and common pump-and-dump schemes” than a real estate market. Indeed, a Coingecko analysis found metaverse “land” was worth 34% less in 2024 compared to the year prior, and 72% less than at its peak in 2022. This was an average across several platforms, and the biggest decline was in the Sandbox, the digital home of that mansion’s 3D model twin. According to a CoinDesk report published last year, the Sandbox laid off half its employees and its token has dropped in value from its peak by 90%. As of March 2026, user rights to space in the Sandbox and Decentraland, another metaverse platform, that had originally sold for hundreds of thousands to millions of dollars were not a market totalling $5.37 billion as forecasted by Technavio. They had become basically worthless.

3. Fever Dream

Officially, Meta is still all-in on the concept around which it pivoted the entire company in 2021. It still has a whole marketing page proclaiming its belief “in the future of connection in the metaverse”. You can go shop its lineup of Quest headsets which Meta says represent the best and most immersive metaverse experience, though its flagship model is now two-and-a-half years old. It has awkwardly promoted its Ray-Bans as “A.I. glasses” despite them becoming the company’s most successful line of mixed reality products, and it is desperately trying to connect its newest muse of A.I. with its last one. The single mention of “metaverse” on its Q1 2026 earnings call (PDF) is when Zuckerberg claimed to be “excited for more of our metaverse efforts to be powered by the A.I. models we’re training as well”. If you want to be unfairly generous in your interpretation of Zuckerberg’s brief remark, you could point to a December 2020 Andreessen Horowitz piece, in which general partner Jonathan Lai refers to this shape as a “pyramid”, and says that “fully A.I.-created content” is directly correlated with “spontaneous social at metaverse scale”. Obviously. I am not feeling generous.

It is readily apparent that Meta’s metaverse momentum simply no longer exists. The company, in recent months, has made budget and personnel cuts to the team responsible for these products and, as mentioned, has discontinued Horizon Workrooms and will soon discontinue Horizon Worlds in V.R. It also ended its third-party headset partnerships. If Meta wanted to wind down its commitment to the metaverse, these are the kinds of moves it would make.

Others in the space have not fared much better. Roblox has not mentioned the word “metaverse” in its quarterly or annual reports since Q1 2022 (PDF). Epic Games scarcely mentions it in recent news releases, either: since January last year, just one announcement contains the word “metaverse”, while seven are dedicated to the lawsuits Epic has been fighting against Apple and Google. Far from the inevitable next chapter of the internet, the metaverse, supposedly the future of how we live, work, and play online, is a non-event.

Near the end of the Connect 2021 presentation, Nick Clegg, then Meta’s global affairs chief, said “the metaverse isn’t something we’re building, so much as it’s something we’re building for”. Olson, in his video, wryly notes that, in the eyes of its promoters, “the metaverse cannot fail; you can only fail to make the metaverse”. The metaverse is so inevitable that “you might even already be in it”, according to Barron’s. But the metaverse is not predestined; it never has been. It is a construction of tech companies that saw in the pandemic their future — not ours.

A slightly charitable interpretation of what I think the pandemic demonstrated to Facebook executives, for example, was how invaluable technology companies were in maintaining connections even when most people could not do so in-person. They recognized how much time people were spending in front of screens already, even in years prior, and assumed that could be a more social experience.

But a more cynical view is no less fair. With the pandemic undoubtedly came a realization of how much money Facebook stood to make, if only it had a platform. In 2019, there were two publicly traded companies worth over a trillion U.S. dollars; by the end of 2021, there were five, with Apple and Microsoft now worth over two trillion dollars each. This pandemic was not going to last forever, but it did not need to. Our world was permanently changed, or so it would have seemed, and we would surely want to virtually attend concerts and buy PNG files of band t-shirts with real money. And these companies would take their cut.

One thing I have mentioned but did not emphasize is just how often Zuckerberg and Sweeney mention Apple and Google platform fees as a primary justification for building the metaverse. Sweeney spent several years fighting lawsuits against both companies, mostly winning the one against Google and mostly losing the one against Apple. His efforts have, nevertheless, shone a spotlight on these grotesque practices. But it would be a mistake to assume this is an objection on ideological grounds. These guys just want to take those commissions for themselves. Sweeney spent his GDC 2023 presentation comparing the need for open standards in the metaverse to the openness of the web, but unlike the web, the Epic Games Store takes a 12% commission. Meta beat that, though; it even beat Apple and Google. By the time the individual fees are added together, transactions made through Horizon Worlds could be levied a commission of up to 47.5%. The money thing is not even a secret; it was often the very first thing people like Zuckerberg and Sweeney discussed in interviews about their metaverse plans. This was a financial decision before it was a product or service people might actually want to use.

It would not be fair to characterize Meta’s endeavour as an impulsive flash in the pan. Zuckerberg laid out his vision in a 2015 internal memo in which he explained how the company “would like a stronger strategic position in the next wave of computing”. Then, in January 2017, the Chan Zuckerberg Initiative acquired a company called Meta, I think mostly for the name; a year later, Zuckerberg floated the idea of a rebrand. The 2015 memo that effectively set this whole thing into motion gives the impression of a surprisingly cogent document if you set aside the wildly optimistic timelines (“VR/AR will be the next major computing platform after mobile in about 10 years”) and the idea that virtual and augmented reality are so compelling they will supersede the desire for phones and televisions. If anything, the unearned confidence in this memo should have been alarming at the time. As Zuckerberg himself writes, the “core social networking work is no longer new, Internet.org is extending something rather than inventing it, and A.I. is not yet tangible”. This is not a company known for doing new, and it is now stuck with a name reflecting a bungled attempt to change that. Staff are not happy after years of mass layoffs, court losses, role reassignments, and internal surveillance to feed the company’s A.I. projects. Do not get me wrong: Meta’s business of collecting vast amounts of information about its users and selling relevant ad slots is as strong as it has ever been. But Meta the ad company is not Meta the platform innovator.

And this feels like the why of it all. If tech companies can channel a meaningful sliver of our entire lived experience into a world of their creation, one where they collect a portion of revenue, it would make them inescapable. Ball, Sweeney, and Zuckerberg may have all written or spoken about the importance of interoperability and open standards, but these platforms want to exercise a degree of control more similar to native software than to the open web. The steps for migrating from Horizon Workrooms to a competitor’s product, for instance, are not what one would expect if openness were a priority.

For a brief couple of years, it seemed like there could be enough enthusiasm from reporters in the space, venture capitalists, and executives to make the metaverse happen. Then ChatGPT launched in November 2022, and the pandemic ended in the U.S. in May 2023, and any interest anyone may have had for spending more time with people in a virtual setting largely evaporated. It turns out we are okay with having meetings and playing games online, but we actually like seeing live music in-person and travelling to real places. The problems each of these things may have — high costs, environmental impact, and so on — are notable and real, but are not ones with metaverse-based solutions.

The pandemic did not make the metaverse. There was sufficient interest in developing it well before then, and it is possible all of these companies would have announced all these products and services on the same timeline. But in a world without a pandemic, I cannot imagine anyone would have treated these metaverse announcements with anything like the seriousness they did. The pandemic officially ended in the U.S. just six months after the first release of ChatGPT, so it is impossible to disentangle the influence of either. But it is notable to me that the nosedive in mentions of “metaverse” on Meta’s investor calls occurred in Q3 2023 — the quarter immediately following the declared end of the pandemic.

As for the futurists like Hackl, who confidently proclaimed the metaverse was “for certain”, they have found an out thanks to its flexible definition. Jeff Barrett, of the Shorty Awards’ “It’s No Fluke” podcast, published a glowing profile of “the Godmother of the Metaverse” earlier this year under the headline “Why Cathy Hackl Keeps Getting the Future Right”. “When enthusiasm cooled and narratives collapsed, many distanced themselves from the space”, writes Barrett, noting with seeming approval that “Hackl did the opposite. She reframed it”. Many people — perhaps everyone, come to think of it — could predict the future if they got to retcon their predictions to fit reality.

There are many open questions about the metaverse; most glaring among them is whether it could actually become a thing for normal people. That depends a little bit on what definition we use. If it simply means the slow erosion of the boundary between our physical and digital environments, that is probably something that will continue to happen. For most people, though, that does not look like Meta’s Connect 2021 concept animations. Whatever it ends up being will probably be the result of people finding something useful and intriguing about doing something different. It will not be the product of big companies redirecting the money hose of platform fees onto themselves.

With thanks to Marquette University for granting me access to the Zuckerberg Files. A frustrating number of Zuckerberg’s post-Meta interviews are video-based, so the transcripts produced by this effort were invaluable. Where possible, I have checked these copies against the originals.

Meta Secured Over $3 Billion in Tax Breaks in Louisiana to Build a Data Centre

By: Nick Heer
20 May 2026 at 02:48

Jon Keegan, of Robinhood’s Sherwood News:

A Sherwood News analysis shows that the breaks afforded to Meta on just the sales tax of GPUs would come out to more than $3.3 billion — enough to build 33 new high schools, pay the salaries of all the state’s public school teachers for more than a year, or pay for more than seven years of the Louisiana State Police budget. (The secretary from the Parish committee that approved the financing plans declined to comment, and the chair of the committee didn’t respond to requests for comment.)

This is the very same project where Jonathan Weil, of the Wall Street Journal, found “aggressive accounting” that “strains credibility”. Neither of these advantages would be possible for a less-resourced competitor. Meta is a company so rich it benefits immensely without carrying nearly as much risk as the scale of this project would imply.

Bill C–22 Can Be Corrected

By: Nick Heer
20 May 2026 at 02:12

Justin Ling, the Star:

Yet Bill C-22 doesn’t mandate backdoors nor force companies to introduce any. It explicitly states the government cannot compel companies to introduce “systemic vulnerability” into their services. And it doesn’t give cops or spies new authority to intercept Canadians’ communications; it simply creates a process enlisting companies to help out with doing so.

Ottawa is now scrambling to correct the record. Anandasangaree will reply to the Republicans, conveying “this legislation does not provide for indiscriminate access to devices or communications and does not require companies to weaken encryption and introduce so-called ‘backdoors,’” according to a spokesperson. (The U.S. and the U.K., they also noted, already have these powers; Signal hasn’t withdrawn from either country.)

So the bill is not quite the nightmare some have made it out to be. But there are still some big issues.

Whether Signal is crying wolf or simply believes the laws in those countries are strong enough to prevent mandated backdoors is a good question. In the U.K., for instance, Ofcom is not allowed to require a backdoor, but it is empowered to tell providers to weaken encryption for some without compromising the privacy of their platforms for all when “feasible technology” exists to do so. On the one hand, that technology probably cannot exist; on the other hand, Signal is banking on a privacy-friendly interpretation of that law if it is ever tested.

Apple, meanwhile, has not returned Advanced Data Protection to the U.K. despite the U.S. Director of National Intelligence’s claim that efforts to compromise its encryption have been withdrawn. This demand was made under a different law that, I suppose, Signal must not feel is immediately threatening.

Bill C–22 does, as Ling writes, provide an exemption for instances where compliance with interception demands would “require the provider to introduce a systemic vulnerability related to that service or prevent the provider from rectifying such a vulnerability”. This is the same language as appeared in the Strong Borders Act proposed last year, though C–22 has new powers requiring the retention of metadata. It seems to me that a systemic vulnerability — one that “creates a substantial risk that secure information could be accessed by a person who does not have any right or authority to do so”, according to this bill — might not be found in something like metadata retention, which is what apparently concerns Signal.

Rich Guy Quote Journalism

By: Nick Heer
15 May 2026 at 00:27

Peter Shamshiri:

The answer is that there’s an entire genre of media coverage best described as “rich guy has an opinion.” It’s surprisingly common, and once you notice it you’ll see it everywhere: entire news stories dedicated to the otherwise unremarkable opinion of a rich person, or news stories that fold the opinions of rich people into their otherwise neutral coverage. It’s taken for granted in many newsrooms that a person’s wealth imbues their opinions with newsworthiness.

Karl Bode has called this “CEO Said a Thing! journalism”, and it is all over the place. I think Shamshiri’s broader definition is useful, too, especially in lower-stakes situations.

This week, for example, the Calgary Herald published a whole entire article dedicated to the complaints of a local landlord about a new protected bike lane. She is quoted as saying “[t]here will be no parking whatsoever for any of the businesses that are already here” below a photograph of her standing in front of the large parking lot, which will remain unchanged following the bike lane upgrades. The only other person apparently interviewed for the article is the area’s councillor. This is just one wealthy person’s grievances treated as inherently newsworthy.

Separate Lawsuits Claim OpenAI and Perplexity Are Sharing User Data With Third Parties for Targeted Advertising

By: Nick Heer
15 May 2026 at 00:16

Madeline Batt, Tech Policy Press:

The recent lawsuit Noel v. Perplexity brought the question of AI monetization onto a courthouse docket. Since voluntarily dismissed by the plaintiff, the details of the class action provided a window into how adtech in AI is likely to be challenged in the courts.

The lawsuit targeted generative AI company Perplexity, along with Meta and Google, alleging they disclosed transcripts of users’ conversations with chatbots for targeted advertising. […]

It is not clear to me why the anonymous plaintiff gave up on this case. Abandoning the suit does not necessarily mean its claims are unfounded.

Maggie Harrison Dupré, Futurism:

A new class action lawsuit accuses OpenAI of sharing data including user chat queries and personal identifying information like emails and user IDs with the tech giants — and targeted advertising behemoths — Meta and Google, without obtaining proper user consent.

Interestingly, the Office of the Privacy Commissioner of Canada recently concluded an investigation of OpenAI’s training on personal information and whether it can produce that information reliably. It seems to me like questions about third-party ad targeting were out of scope. This is notable, however:

OpenAI represented that ‘untraining’ or ‘reverse-training’ LLMs, so that they no longer use or generate specific personal information for which a deletion request has been submitted, is not currently feasible. OpenAI explained that this is because its models are trained through repeated adjustments of billions of weights (parameters) over successive runs of training datasets and do not contain or store copies of information that they ‘learned’ from.

I think we all knew this was the case, but it underscores the questionable effectiveness of robots.txt rules for website owners wishing to opt out of being a source for LLM training. It is not even clear OpenAI, for example, ensures data in its collection remains in compliance with opt-out requests when training new models.
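
For context on what a robots.txt opt-out amounts to, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the example.com URL are invented for illustration; GPTBot is the user agent OpenAI documents for its crawler. The check only tells a crawler what it may fetch from now on, if it chooses to ask; it says nothing about material already collected.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site; not taken from a real one.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return whether the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print("GPTBot allowed:", may_fetch("GPTBot", url))
    print("SomeOtherBot allowed:", may_fetch("SomeOtherBot", url))
```

Compliance is entirely on the crawler's side: the site owner publishes a request and the crawler decides whether to honour it.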

Signal Warns It Would Pull Out of Canada if Made to Comply With Bill C–22

By: Nick Heer
14 May 2026 at 04:03

Marie Woolf, the Globe and Mail:

Secure messaging service Signal, which uses end-to-end encryption, is warning it would withdraw from Canada if asked to compromise its users’ privacy under Bill C-22, Ottawa’s proposed lawful access legislation.

[…]

The bill would require “core providers” — which would later be defined through regulations — to retain metadata for up to a year.

Are lawmakers capable of learning from their peers elsewhere? Do we have to do this kind of thing every year, country-by-country?

We Are All Swimming in A.I. Murk

By: Nick Heer
14 May 2026 at 03:51

Jason Koebler, 404 Media:

To browse the internet today, to consume any sort of content at all, is to be bombarded with AI of all sorts. People think things that are fake are real, things that are real are fake. Much has been written about “AI psychosis,” the nonspecific, nonscientific diagnosis given to people who have lost themselves to AI. Less has been said about the cognitive load of what other people’s AI use is doing to the rest of us, and the insidious nature of having to navigate an internet and a world where lazy AI has infiltrated everything. Our brains are now performing untold numbers of calculations per day: Is this AI? Do I care if it’s AI? Why does this sound or look or read so weird? Does this person just write like this? Is this a person at all?

I imagine there are some people who do not much care if the news article they are reading or the music they are listening to was generated by A.I. — with or without their knowledge. I think it feels cheap and shameful. There are interesting uses for generating material based on known patterns and structures but we are stuck with a bunch of spam, and it makes everything feel inherently suspicious. Perhaps that is in some way a good thing; we should be more careful, in general. I think Koebler captures the feeling of being on constant high alert, and living in an increasingly artificial and scam-filled world.

Aaron Vegh and Ben McCarthy Launch Indigo

By: Nick Heer
13 May 2026 at 01:59

Maybe you are in the market for a great Bluesky client. Maybe you are in the market for a great Mastodon client. Maybe you are in the market for a combination great Bluesky and Mastodon client.

Aaron Vegh:

Today, Ben McCarthy and I are launching Indigo. It’s a full-featured client for both Mastodon and Bluesky, available on iPhone, iPad and macOS. Go get it on the App Store!

I have been using Indigo for a while as my primary iOS client for Bluesky and Mastodon, and I think it is terrific. I would happily use it as a standalone app for either. Mixing the two services in one app, though, is better than I had imagined. Everything feels right: posts are colour-coded, you can reply with either account, and there are clever ways of handling existing cross-posting.

Ben McCarthy:

Indigo will automatically detect when a post is duplicated across both networks. If the content is very similar and they both appear within a few minutes as each other, Indigo will merge them so you’re not seeing them twice. You can toggle between each version as well as perform actions like quoting or replying to both posts simultaneously. We’ve done a lot to make the experience of using two different services at once feel seamless.
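
I do not know how Indigo implements this. As a rough sketch of the general idea McCarthy describes (similar text, posted close together, on different networks), something like the following would work. The `Post` type, the 0.9 similarity threshold, and the five-minute window are all my assumptions, not Indigo's.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher

# Both thresholds are guesses for illustration, not Indigo's real values.
SIMILARITY_THRESHOLD = 0.9
TIME_WINDOW = timedelta(minutes=5)


@dataclass
class Post:
    network: str        # "bluesky" or "mastodon"
    text: str
    posted_at: datetime


def is_cross_post(a: Post, b: Post) -> bool:
    """Treat two posts as duplicates if they are on different networks,
    close together in time, and nearly identical in text."""
    if a.network == b.network:
        return False
    if abs(a.posted_at - b.posted_at) > TIME_WINDOW:
        return False
    similarity = SequenceMatcher(None, a.text.lower(), b.text.lower()).ratio()
    return similarity >= SIMILARITY_THRESHOLD
```

A real client would also have to deal with differing link shortening, mentions, and character limits between the two services, which this sketch ignores.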

This kind of app might not work for everyone. I understand the arguments for treating these worlds entirely differently. For me, though, this is a little bit like how I prefer reading email newsletters in my RSS app: my brain is not differentiating between articles on a website and articles sent by email when I just want to read all the new articles. Likewise, I am rarely thinking I need to check Bluesky or I need to check Mastodon; I am usually just in the mood to scroll through or post on social media. Indigo scratches that itch.

There is a caveat. Though Indigo supports multiple accounts of each type, only one of each can be active at a time. This makes sense and, I expect, would have no impact for most people. For those of us with accounts for different purposes, however, it does mean it is slightly more cumbersome than the way account switching typically works in a single-service client. This is, for me, a reasonable compromise.

Open standards are pretty great, hey?

When su replaced login for becoming another Unix login

By: cks
2 June 2026 at 01:19

I recently read Simon Tatham's Nitpicking the shell history scene in Tron: Legacy, where one thing that surprised Tatham was the film using 'login -n root' to become root instead of 'su'. This surprised me because I found that perfectly ordinary, and this turns up both a bit of Unix history and a difference between modern Unixes.

Plain 'su' can let you become another user, including root, but what it explicitly doesn't do by default is create a new login shell for that user. If you do 'su root', the new root shell normally inherits most of your environment, your current directory, and so on. Sometimes this is what you want and sometimes you really want a new login environment, and originally in Unix how you got the latter was to run 'login' from your existing shell session (and this meant that login was setuid root, like su).

This split usage of su(1) and login(1) is present in Research Unix V7 (and for login goes back to at least V3), where the respective manual pages clearly say that su doesn't change your environment or your current directory, while login's normal use (from a shell) is to 'change from one user to another'. Similar wording remains in the 4.2 BSD su(1), but in System III, su(1) picked up an option to make the new shell a login shell (and it even describes the mechanism) and login(1) lost the ability to be run from a normal shell. The 4.3 BSD su(1) picked up the System III su change, but login(1) can still be used from a normal shell, and I believe this continued on the BSD lineage in general.

As you might expect, all of the modern versions of su across Linux and the free BSDs support starting a login shell (cf the normal Linux su (also), FreeBSD su(1), NetBSD su(1), and OpenBSD su(1)). On Linux and OpenBSD, login isn't setuid root and so can't be used from a regular shell environment to become a new user; your only option is su. On FreeBSD and NetBSD, login is still setuid root and can be used to switch to another account with a login shell, although this usage doesn't seem to be explicitly documented in either's manual page. Illumos (the open source successor of Solaris) also still supports using login from a command shell, and explicitly documents this in login(1).

(OpenBSD making login not be setuid fits their general security posture, since a setuid login has been a vector for security issues in the past. I can't easily find out if Linux versions of login were ever setuid.)

PS: It's possible that login is still setuid on some Linux distributions. The normal util-linux login specifically says that it doesn't work from a shell session, but the shadow-utils login may still, and some distributions might enable that.
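
If you want to check a particular system, the question is just whether the login binary has the setuid bit set and is owned by root. Here is a minimal sketch in Python; the two paths it tries are my guesses at where login commonly lives, so adjust them for your system.

```python
import os
import stat


def mode_is_setuid(mode: int) -> bool:
    """True if a st_mode value has the setuid bit set."""
    return bool(mode & stat.S_ISUID)


def is_setuid_root(path: str) -> bool:
    """True if path is setuid and owned by root (UID 0)."""
    st = os.stat(path)
    return mode_is_setuid(st.st_mode) and st.st_uid == 0


if __name__ == "__main__":
    # These locations are assumptions; login may live elsewhere.
    for candidate in ("/bin/login", "/usr/bin/login"):
        if os.path.exists(candidate):
            print(candidate, "setuid root:", is_setuid_root(candidate))
```

A setuid root login is only a necessary condition here; as noted above, util-linux's login declines to work from a shell session regardless.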

(This sort of elaborates on a Fediverse post I made.)

Sidebar: The early history of su

The su command goes back to V1 Unix, but at the time it was only used to let you become root ('superuser', likely the source of the 'su' command name). We don't have much from V2 (well, sort of [PDF]), but in V3 su's manpage moves to section 8 (for 'administrative commands') as su(8), where it stayed in Research Unix V4, V5 (per the V5 manual [PDF]), and V6. Only in V7 does su gain the ability to change to any user and its manual page was moved to section 1 (for general commands) as su(1).

(Potentially of interest is this reconstruction of old Unix manual pages.)

I'm not sure we'd use AppArmor much even if we could

By: cks
1 June 2026 at 02:47

The news of the time interval is a string of local privilege escalation vulnerabilities in Linux (in part in the kernel). We very much need the security boundary of Unix logins, and some of these vulnerabilities are mitigated or blocked by various Linux kernel security modules ('LSMs'), so I've recently been thinking about whether we'd use AppArmor, the LSM that Ubuntu supports.

(AppArmor didn't block as many of the vulnerabilities as a proper SELinux setup did, but SELinux needs distribution buy-in and that's not what Canonical provides.)

We've traditionally disabled AppArmor because it's had issues in our environment, where NFS home directories live in our own locations. So let's assume that AppArmor magically works now for NFS home directories and other directories (or can easily be set with tuning knobs), and still provides meaningful security afterward. Setting up AppArmor for our environment will take some amount of work, so the question is how much protection against local privilege escalation we get.

Roughly speaking, our systems fall into two categories; systems that normal people can access and run programs on, and systems that are purely for services (including things such as IMAP mail). For services, in theory we (or the people writing AppArmor profiles) can work out what the services should be allowed to do and not do, and thus lock things down against local privilege escalations in kernel systems that the services shouldn't be touching anyway (and other vulnerabilities, such as information disclosure from reading files the service shouldn't be accessing). However, this protects against an unlikely set of chained issues, where there's both a vulnerability in a service itself and then an additional vulnerability in the kernel.

(If these issues aren't unlikely, we have bigger problems.)

That leaves the systems where normal people can run their own programs (which are the ones where we really need the security boundary of logins). On these systems we have to assume that an attacker can gain the ability to run relatively arbitrary programs, either by compromising an account outright or through, for example, a compromised package that people are using in the code they're writing for their research (or a compromised editor extension, or etc; there are lots of ways in). Since people are effectively running arbitrary code, we can't protect ourselves by having AppArmor restrict what specific programs can do the way we can on service-based machines. Instead, we have to find and inventory kernel features that people will never legitimately use, and then block them through AppArmor rules.

(This is how a strict SELinux setup appears to protect against the recent vulnerabilities; a normal login is simply not allowed to use, eg, RDS sockets.)

The Linux kernel has a lot of features and facilities, although some of them are blocked off because we don't allow user namespaces, and people doing CS research do a lot of things, some of them at least unusual. Could an AppArmor profile (or a set of them) be written so that people would be allowed access to what they use and not allowed access to things that they don't? Probably (although AppArmor is more focused on programs than on people, well, logins). Would we be able to find an out of the box set of AppArmor rules and so on that worked? Maybe, and this depends on exploits not being found in areas that people pretty much have to be given access to.

If we had a reliable set of AppArmor or SELinux profiles, we might well use them because it would be easy enough. Without a reliable set of AppArmor profiles, I'm not sure we'd try to build some ourselves unless we were desperate. And if we were going to do the work, it appears that we might get more results for less effort through things like explicitly blocking all the loadable kernel modules for Linux socket types that we don't use.

(Some people even block all kernel modules that their current configuration doesn't use. I'm not sure I'd go that far, but I suppose you can always un-block things like the netfilter modules if you turn out to want to add some nftables rules later.)

Our unusual system of "web home directories" for people

By: cks
30 May 2026 at 23:50

One of the things we operate for the research side of the department is an old fashioned general purpose web server, where everyone has a home page area of their own in the traditional '/~<login>/' style (cf). This web server has been there for a very long time, and one of the decisions that was made very early on was that for security reasons, the web server would not NFS mount people's regular home directories from our fileservers.

The traditional Apache way to do '/~<login>/' home pages is to have some location under your home directory that's exposed as your web home page area; the traditional name for this is 'public_html'. One alternative is to relocate this to a separate directory tree, but this directory tree is flat, which makes it awkward to have different pools of disk space for different people (which is absolutely required for us). Since we didn't want to use people's regular home directories for security reasons and we couldn't put everyone in one directory, we did the obvious hack: people have a different, special home directory on the web server. These home directories are in special 'webdir' filesystems on our fileservers, and these webdir filesystems are the only NFS filesystems that the web server NFS mounts.

The result is that everyone actually has two home directories in two different filesystems (although those two filesystems will come from the same ZFS pool). They have their regular home directory filesystem, which is accessible on our login and compute servers but not the web server, and their 'webdir' home directory, which is accessible everywhere. To make this more convenient to people, we create a 'public_html' symlink in people's regular home directories that points to the 'public_html' in their webdir home directory. If people have personally run web servers, these and their support files also live in the 'webdir' home directory, for relatively obvious reasons.

(We have a special short name form of people's home directories, so on the web server this short form points to their web home directory. The public_html symlink combined with this means that '/u/<login>/public_html' always refers to your web home page directory tree no matter what machine you're on.)
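
As an illustration of the symlink arrangement (not our actual account setup tooling), here's a minimal Python sketch. The function name and the directory layout in the demo are made up; the only thing taken from the description above is that 'public_html' in the regular home directory is a symlink to 'public_html' in the webdir home directory.

```python
import os
import tempfile
from pathlib import Path


def link_public_html(home: Path, webdir_home: Path) -> Path:
    """Make home/public_html a symlink to webdir_home/public_html.

    Raises FileExistsError if home/public_html already exists as
    something other than a symlink.
    """
    target = webdir_home / "public_html"
    target.mkdir(parents=True, exist_ok=True)
    link = home / "public_html"
    if not link.is_symlink():
        os.symlink(target, link)
    return link


if __name__ == "__main__":
    # Demonstrate in a scratch directory with hypothetical paths.
    with tempfile.TemporaryDirectory() as scratch:
        home = Path(scratch) / "homes" / "alice"
        webhome = Path(scratch) / "webdir" / "alice"
        home.mkdir(parents=True)
        link = link_public_html(home, webhome)
        print(link, "->", os.readlink(link))
```

On machines that mount both filesystems, following the symlink lands in the webdir filesystem; on the web server, only the webdir side exists at all.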

Because everyone's web home directory filesystem is in the same ZFS pool as their normal home directory filesystem, the web server still depends on all of our ZFS fileservers. Since our web server is reasonably active, it tends to react very rapidly to any NFS fileserver hiccups.

PS: The web home directory security decision predates me, so I don't know why it was made, but in my view it's a perfectly sensible decision. In general you should probably assume that your web server can be coaxed into reading and disclosing any Unix file that it has filesystem level access to. If you don't like the implications of this, you need to arrange for it to have access to fewer files. A dedicated set of filesystems is one relatively straightforward way to do that.

Our servers mostly don't seem to have high peak power usage

By: cks
30 May 2026 at 03:34

I wrote a bit ago about how our servers seem to have surprisingly low power consumption, where I used IPMI based power consumption information to look at their current, typical power usage and found that several of them were sitting at lower power usage than my desktops. I'm still interested in typical or average power usage for reasons beyond the scope of this entry, but now I'm also interested in maximum power usage that we've observed.

One reason to be interested in maximum power usage is that servers are often given relatively high capacity power supplies. A 600 watt power supply is on the small side for even a 1U server, and my impression is that 800 watt and even 1200 watt PSUs are reasonably normal in basic servers. On the one hand, the vendors building servers don't know what they're going to be used for, so they're likely to be conservative. On the other hand, that's a lot of spare power for a system that is typically using, say, 24 watts. A PSU that is that far under its rating may not be all that efficient (although the one 600 watt server PSU I looked at had a 'platinum' rating).

Obviously a server's PSU has to support not just its typical power draw but also its maximum power draw; if you idle at 107 watts but reach 500 watts or more at full load (as one of our servers does), then you need that big PSU. But if the maximum observed power draw is substantially lower, the PSU looks more like overkill.

Because we only collect IPMI sensor information every minute, I can't be confident that we've captured the absolute highest peak usage for any of our servers. If a server has a high power draw for only fifteen or thirty seconds or so, it would have to happen at just the right time for us to capture that in a sensor reading. But if the power draw lasts for more than a minute, it becomes increasingly likely that we'll capture it.
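
To put rough numbers on this, here's a back of the envelope sketch. It assumes that each IPMI reading is an instantaneous sample, that samples are exactly 60 seconds apart, and that a peak starts at a uniformly random moment relative to the sampling schedule. None of those assumptions is guaranteed in practice (a BMC may average its readings, and collection times jitter).

```python
def capture_probability(peak_seconds: float, sample_interval: float = 60.0) -> float:
    """Chance that at least one periodic sample lands inside a peak
    that starts at a uniformly random offset from the sampling schedule."""
    if peak_seconds <= 0:
        return 0.0
    return min(1.0, peak_seconds / sample_interval)


if __name__ == "__main__":
    for duration in (15, 30, 60, 90):
        print(f"{duration:>3}s peak: {capture_probability(duration):.0%}")
```

Under those assumptions a fifteen second peak is seen about a quarter of the time and a thirty second peak about half the time. Anything lasting a full sampling interval would always be seen in this idealized model, although real collection is messier than that.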

With that said, the regular 1U servers I have reliable IPMI power usage from top out at around 155 watts for a few servers (over the past year), and our NFS fileservers, which are loaded with SSDs, are under 100 watts. These machines don't seem to be putting much stress on their PSUs, to say the least (I think they all have 600 watt PSUs or better, ie bigger). I don't know what to make of this, or if it matters much from an efficiency point of view. Some extra percent of 25 watts is not necessarily a large number, and I don't know if reducing the PSU from 600 watts to, say, 400 watts would improve things much.

(Even if it did, I suspect that there's not enough of a market for it to make it an economical option for basic server vendors. I have a feeling that most people's servers are more loaded and more power-consuming than ours are.)

The Go language server can do some impressive code navigation

By: cks
29 May 2026 at 01:37

For reasons outside the scope of this entry, I recently dug into how the Go runtime did (Unix) signal handling on 64-bit x86 Linux. When I undertook this quest, I decided that the easiest way to navigate through the code of the Go runtime was to use the code navigation features exposed by the standard Go language server, gopls. In the process I was surprised by just how good its code navigation was, even in the Go runtime.

On Linux, Go's signal handling talks directly to the Linux kernel rather than going through the C library. As you can imagine, this is relatively architecture and Linux specific, as well as being relatively specific to Unix. The result is a tangle of OS and architecture specific code in a variety of signal related files in src/runtime. The first challenge for code navigation is picking out the right ones that apply to the environment you're interested in; in Go this is handled through build tags, which gopls understands. So gopls had no problem navigating from general Unix signal handling to setsig() in Linux-specific code and the 64-bit x86 Linux definition of the struct involved.

But that was only half the puzzle, because I was looking into how the Go runtime receives signals. This is done by 'sigtramp()' and the related function sigreturn__sigaction(), and it turns out that these functions are not defined in Go. All you'll find in Go is stubs of them at the start of os_linux.go. But gopls had no problems navigating from the stubs to the actual amd64 assembly version, despite the fact that the assembly version has an odd name, and then it was able to navigate from the assembly version of 'sigtramp()' back to the Go 'sigtrampgo()'.

(It turns out that one area where gopls is currently limited for Go assembly language is finding references for assembly language symbols. Fortunately I didn't need that here, all I needed was 'find definition'.)

Code navigation among Go code is not surprising, because that's what you expect from a (Go) language server like gopls. What surprised and impressed me is code navigation into and out of Go assembly code, where I was expecting to have to resort to manual searches with (rip)grep. This is almost certainly a relatively niche feature, yet gopls has basic support for it. This support doesn't come from the standard Go library; instead it's implemented specifically in gopls in internal/asm and internal/goasm.

Another nice trick that gopls can do (that I just investigated) is navigate from an interface or an interface method to everything in your codebase that implements the interface. This is done through (of course) the LSP 'find implementation' code navigation action (in Eglot in GNU Emacs, this is 'C-c i'). Gopls will also navigate backward from a concrete thing to all of the (in-scope) interfaces that it implements (again using 'find implementation'). Slightly inconveniently, if your thing has a String() method, this will report a number of interfaces in the Go standard library. Gopls currently includes non-exported interfaces in the standard library, which is technically correct but extra not useful.

(Specifically, currently this will include context.stringer and runtime.stringer, as well as fmt.Stringer, the public version (and expvar.Var, which has the same shape but incompatible return value requirements). I assume the Go runtime and standard library have multiple versions of this interface internally to limit cross-imports.)

Update: My GNU Emacs 'C-c i' key binding for Eglot's 'find implementation' command (eglot-find-implementation) is a custom personal key binding, not a standard one. Oops.

Using typing in Python leads to different sorts of code

By: cks
28 May 2026 at 01:56

So what happened is that I converted a big pile of (highly untyped) Python 2 to Python 3 recently, and then I wanted to experiment with typing-heavy Python LSP servers in GNU Emacs, so I decided to try them out by experimentally adding some type annotations to DWiki, the aforementioned pile of untyped Python (and the code powering Wandering Thoughts). The experience was educational and taught me some new things about type annotations, but it also firmed up my view that typed Python code is different than untyped Python code (although not quite to the extent that they create a different language, as I sort of felt before). There are idioms that are perfectly natural in untyped Python that are pretty annoying to deal with in typed Python.

One of these idioms is dictionaries with multiple types of values. For instance, DWiki has a dictionary that is basically 'a collection of information about the HTTP request'. The authentic type of the values in this dictionary is "str | bool | SimpleCookie | dict[str, str]", which is to say that values can be any of a string, a boolean, a HTTP Cookie, or a dictionary of string key/value pairs. Of course, individual keys in the dictionary have a fixed type for their value; for example, the key 'request-fullpath' only ever has a string value, so in untyped Python code it's natural to write something like:

if reqdata['request-fullpath'] and \
   reqdata['request-fullpath'][-1] != '/':
    [...]

If you do this in typed Python, your type checker will almost certainly complain that this indexing isn't valid for booleans and HTTP Cookies. You need to either check or type-assert that the value is a string.
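As a rough sketch of those two options (checking versus type-asserting), here is what the code above might turn into. The 'ReqValue' alias and both function names are made up for illustration; only the 'request-fullpath' key comes from DWiki.

```python
from http.cookies import SimpleCookie
from typing import Union, cast

# Hypothetical alias for the value type of the request dictionary.
ReqValue = Union[str, bool, SimpleCookie, dict[str, str]]

def needs_slash_checked(reqdata: dict[str, ReqValue]) -> bool:
    """Check the type at runtime; type checkers narrow on isinstance()."""
    path = reqdata['request-fullpath']
    return isinstance(path, str) and bool(path) and path[-1] != '/'

def needs_slash_asserted(reqdata: dict[str, ReqValue]) -> bool:
    """Assert the type with cast(); nothing is verified at runtime."""
    path = cast(str, reqdata['request-fullpath'])
    return bool(path) and path[-1] != '/'
```

The isinstance() version pays for a runtime check that should never fail, while the cast() version simply tells the type checker to trust you. Either way, the extra code is there mostly for the type checker's benefit.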

In untyped Python, this is a perfectly decent data structure (although it might not be good style). In typed Python, this is a bad data structure that will cause you pain. There are ways around the pain that preserve the underlying dictionary, but they exist almost entirely to pacify the type checker. A proper data structure in typed Python is not multi-typed like this, or at least it's not multi-typed with a lot of keys.

(One way is to use typing.TypedDict, but if you have a lot of keys it gets painful).
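A minimal sketch of the TypedDict approach might look like the following. Every key other than 'request-fullpath' is invented for the example, and the functional form of TypedDict is used because keys with a '-' in them aren't valid Python identifiers.

```python
from http.cookies import SimpleCookie
from typing import TypedDict

# Hypothetical keys, apart from 'request-fullpath'. Each key gets its
# own value type, so the checker knows what indexing returns.
ReqData = TypedDict('ReqData', {
    'request-fullpath': str,
    'is-head-request': bool,
    'cookies': SimpleCookie,
    'form-fields': dict[str, str],
})

def needs_slash(reqdata: ReqData) -> bool:
    # No isinstance() or cast() is needed; this key is known to be a str.
    path = reqdata['request-fullpath']
    return bool(path) and path[-1] != '/'

reqdata: ReqData = {
    'request-fullpath': '/blog/python',
    'is-head-request': False,
    'cookies': SimpleCookie(),
    'form-fields': {},
}
```

At runtime this is still an ordinary dictionary, so existing code that indexes it keeps working. The pain is that every key has to be listed with its type.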

There's a good reason for this insistence in typed Python, because right now there's nothing preventing me from putting in the wrong type of value for a particular key in this dictionary. I could slip up and set some key that's supposed to have a string value to a boolean, or a key that's supposed to have a dictionary to a plain string. Typing can't detect those errors because any of those are valid for the dictionary in general, just not for that particular key. A proper data structure in typed Python is one where the type checker itself can check your invariants, so string values are separated from boolean values and so on. This would probably also be clearer code.
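One possible shape for such a data structure is a dataclass with one typed field per piece of information. This is only a sketch; the class name and all of the field names are hypothetical and aren't how DWiki is actually organized.

```python
from dataclasses import dataclass, field
from http.cookies import SimpleCookie

@dataclass
class RequestInfo:
    # Hypothetical fields; each one has exactly one type.
    fullpath: str
    is_head: bool = False
    cookies: SimpleCookie = field(default_factory=SimpleCookie)
    form_fields: dict[str, str] = field(default_factory=dict)

req = RequestInfo(fullpath='/blog/python')

# A type checker should reject an assignment like this one, which the
# multi-typed dictionary would have accepted:
#     req.fullpath = True
```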

This is a general issue for any sort of variable-typed container object, return values, or the like. I saw a similar thing when typing my program that uses the email packages; the email packages have old-school polymorphic API return values that typing is not fond of and that required type checks or casts. This is relatively valid on the part of programs determining typing (they're unlikely to ever do full flow control analysis to determine actual types), and is clearly part of the style of typed Python.

(Another case of this in DWiki is that I have a general caching layer that uses pickle to store and retrieve arbitrary objects. The callers know what they're storing and retrieving under a particular key, but this isn't visible in any types I could assign.)

As far as I can see, typing also changes how you want to structure multi-file code with classes and other data structures. In untyped Python such as DWiki, it's natural to have one source file declare a data structure, create an instance of it, and pass it as an argument to a function (or a class) from another file that the first file imports. In typed Python, this doesn't work so well. Because everything that either takes data structures as arguments or returns them wants to name the data structure in type hints, you need the classes for those data structures to eventually be accessible in everything that touches them, which means a tangle of circular imports.

(This is different from forward references in that the code that accepts instances of these data structures will normally never import the code that defines them, cf.)

Circular imports work, technically (as I've sort of written about before), but they make me unhappy. I lack enough experience with typed Python to know the correct approach, but it certainly feels like one should define as many data structures as possible in low level files that are relatively standalone so they can be imported into everything without circular imports. I'm not sure how this works once you want to put methods on your classes that take other classes as arguments and so on.

(Mypy has some suggestions but its answers don't make me feel happy.)
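For illustration, one commonly described workaround for this (which I believe is among the things mypy's documentation covers) is to import the class only while type checking and to quote the annotation. The module name 'dwiki_context' and the 'Context' class are hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers follow this import. It never runs, so it can't
    # create an import cycle at runtime. Both names are hypothetical.
    from dwiki_context import Context

def page_title(ctx: "Context") -> str:
    # The quoted annotation isn't evaluated at runtime, so 'Context'
    # doesn't need to be importable when this file is loaded.
    return ctx.title.strip()
```

This keeps the runtime import graph the way it was in the untyped code, at the cost of imports that exist purely for the type checker.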

Another practical issue I ran into was that DWiki has a stack of middleware functions to fiddle with HTTP requests. All of the middleware functions take a standard set of four arguments, each with a specific type, and I have enough of these functions that going through and adding the appropriate type annotation to each argument for each function (and the return value) was clearly a pain (in my experiment I only did this for a few). I found myself really wishing for a way to say that the function as a whole had a particular type shape, which would automatically infer the argument and return types. I think the proper way to do this is to pass each function fewer arguments (ideally one), but I'm not sure I like it (and the four arguments aren't tightly coupled to each other).

(I also wound up feeling that I should create a 'types.py' file that had all of the basic type definitions that didn't depend on classes and so on. This would be things like the shape of callable functions, that 'data about the HTTP request' dictionary, and so on. Many of these are used in multiple files in DWiki and this avoids various sorts of annoyances. I don't know if such a 'types.py' file is considered a code smell.)
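As a sketch of what a 'shape of callable functions' definition could look like, here is a Callable type alias for middleware functions. The four argument classes are empty stand-ins and all of the names are invented. This is only a partial answer to the wish above: a checker will compare functions to the alias where they're registered, but as far as I know it won't use the alias to fill in argument types for a function defined with 'def'.

```python
from typing import Callable

# Hypothetical stand-ins for the four standard argument types.
class Logger: ...
class Config: ...
class Request: ...
class Response: ...

# The shape of a whole middleware function, written once.
Middleware = Callable[[Logger, Config, Request, Response], Response]

def set_cache_headers(log: Logger, cfg: Config, req: Request,
                      resp: Response) -> Response:
    # Fully annotated, so the checker can compare it to Middleware.
    return resp

def strip_cookies(log, cfg, req, resp):
    # Unannotated: accepted as a Middleware, but its arguments and
    # return value are treated as Any and its body stays untyped.
    return resp

# The checker compares each function to the shape here, at registration.
STACK: list[Middleware] = [set_cache_headers, strip_cookies]
```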

I don't regret my scratch experiments with adding some types to DWiki (partly because I learned more useful things about Python typing), but it's clear that doing it properly is somewhere between infeasible and impossible (and Python typing acknowledges that this can be the case). A reasonable typed version of DWiki would be structured significantly differently, and getting from the current code to any new type-friendly structure would be a significant rewrite (which would fix some old mess but likely introduce new mess).

(The semi-typed results of my experimentation are messy enough that I'm going to discard that copy of the source code.)

(I said something about type hints on the Fediverse and some interesting things came up in the replies, eg.)

My views on some Python LSP servers in GNU Emacs (as of mid 2026)

By: cks
27 May 2026 at 02:06

Some languages have to make do with one LSP server. By contrast, Python has an embarrassment of riches; I know of at least five modern LSP servers for it. I've recently been experimenting with some of them in GNU Emacs, specifically Eglot, so before I forget I want to note down my views. The five Python things with LSP servers that I believe are modern and current are python-lsp-server ('pylsp'), Facebook's pyrefly, Astral's ty, Microsoft's pyright, and technically Astral's ruff.

The easiest to talk about is ruff, because it's not intended as a full-featured LSP server that does everything; instead it only does code diagnostics and formatting, and you need another LSP server for code navigation. Currently Eglot doesn't easily support multiple LSP servers and code navigation is a lot of what I care about, so direct use of ruff is off the table for me. Also off the table is pyright, since I don't have any interest in touching a Microsoft Python project or finding out how badly it works with anything other than VS Code (although there's basedpyright as a less-Microsofted pyright option).

Python-lsp-server is my default choice and is a solid basic LSP server with the code navigation features I normally care about, along with support for code diagnostics through either or both of mypy and ruff (via python-lsp-ruff). Python-lsp-server is also what I'd call a 'quiet' LSP server by default, without a lot of stuff popping up and being filled in in Eglot. It's supported by the community and is probably going to endure, but it's written in Python (so it's not the fastest thing) and my impression is that it's more focused on code navigation than on type checking your code. My view is that it's probably your best option if you have a lot of untyped Python code, which is my normal case.

(So after playing around with both ty and pyrefly for some time, I'm probably going to stick with python-lsp-server most of the time.)

Both ty and pyrefly are strongly into type checking and type annotations, in addition to supporting code navigation. Both support 'inlay hints' in Eglot, which fill in known or deduced types for you (and can also attach names to positional arguments in function calls; ty defaults this to on, pyrefly to off). There are some differences in what types they fill in, for example ty will tell me 'Unknown' for types while pyrefly is silent about them (with no inlay hint), and I suspect that there are differences in what types they deduce for things. I don't have enough experience with Python type checking to have strong opinions on the general choice between ty and pyrefly. Both support more or less all LSP code navigation features (ty's LSP documentation, pyrefly's LSP documentation), with pyrefly currently having one more supported navigation ('go to implementations', which lets you find the reimplementation of methods in sub-classes, and now that I've tried it that's kind of handy and it's not currently supported by python-lsp-server).

(Eglot allows you to easily toggle inlay hints off and on with 'eglot-inlay-hints-mode', in case you don't like the noise of them but do want, for example, pyrefly's code navigation. I'm not sure how much unwanted type diagnostics and notes pyrefly or ty will spit out at you on untyped, anarchic Python code bases.)

As before, I think setting up Python LSP support in GNU Emacs is worth it, especially if you're working with typed Python and pick a good LSP server for this. LSP server code navigation is really quite nice and will work across files in your Python project (and pyrefly's support for 'find everything that overrides this method' is handy if you have that kind of code base).

(GNU Emacs can do some amount of code navigation in Python code without a LSP, but you want to create and maintain a tags table and in brief experimentation the experience is not as smooth and more annoying.)

If you want the most deluxe Eglot based Python LSP experience, I think you want to set up pyrefly with however many inlay hints you want. Since I slogged through the effort to determine what special Eglot configuration you need for this, I will save people the effort:

(setq-default eglot-workspace-configuration
   '([...]
     :python (:analysis (:inlayHints (:callArgumentNames "partial")))
    )
 )

As (sort of) covered in pyrefly's LSP documentation, pyrefly doesn't use its own name for these settings, it uses names that pyright apparently originated. Fortunately Eglot will send (all of) your settings to whatever LSP you're currently running, regardless of their names. I believe you can also configure this in per-project configuration files, which would also let you entirely disable pyrefly type checking in places where you don't want it (per the configuration documentation).

(Some bits of the pyrefly experience in GNU Emacs will get more deluxe in GNU Emacs 31, when Eglot will acquire support for reporting things like call and type hierarchies.)

Sidebar: A brief experience with basedpyright

I ran a little poll on the Fediverse and a surprising number of people (to me) turned out to use pyright or basedpyright, so I gave it a try. The result is, effectively, a failure for my code. Even code that I thought was well typed and free of problems came out full of diagnostics in basedpyright's default configuration. It does have more or less the same code navigation features as pyrefly, but for me the cost of getting them is too high.

But if you want to write extremely strictly typed and careful Python code, basedpyright will make you do it (assuming you make it have no errors and keep its strict default settings).

(The poll also suggested that very few people use pyrefly, which surprised me a bit.)

Anti-robot techniques can be nice but the problem is, they're not static

By: cks
25 May 2026 at 17:13

I've recently come up with what I expect would be a quite good anti-robot, anti-crawler tactic, which I will give the snappy label and summary of "robots don't POST". Simply require a HTTP cookie to see your web pages and then if visitors don't have the cookie, put up an interstitial page with a HTML form that requires them to POST it to get the cookie. All the form needs is a "click me to get your entrance cookie" button, because right now, few or no robots or crawlers will make that HTTP POST request; they only do HTTP GETs. To distract bad crawlers you might need some other links on the interstitial page, optionally going to content tarpits.

(If you're going to do this in practice you'll want to exempt syndication feed requests and perhaps requests from bingbot, Googlebot, and so on. Although maybe not Googlebot any more.)
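To make the "robots don't POST" idea concrete, here is a minimal sketch of it as WSGI middleware in Python. The cookie name, the form, and the 'gate' function are all made up, it has none of the exemptions mentioned above, and a fixed cookie value like this is trivial to forge once anyone bothers.

```python
from http.cookies import SimpleCookie

COOKIE = "entrance"  # hypothetical cookie name

FORM = (b"<form method='post'>"
        b"<button>Click me to get your entrance cookie</button></form>")

def gate(app):
    """Wrap a WSGI app so it's only reachable with the entrance cookie."""
    def wrapped(environ, start_response):
        cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
        if COOKIE in cookies:
            return app(environ, start_response)
        if environ.get("REQUEST_METHOD") == "POST":
            # Making the POST is the whole test. Hand out the cookie and
            # redirect back to the same page, which is then a GET.
            start_response("303 See Other", [
                ("Set-Cookie", COOKIE + "=1; Path=/"),
                ("Location", environ.get("PATH_INFO", "/")),
            ])
            return [b""]
        # No cookie and not a POST: show the interstitial form.
        start_response("200 OK", [("Content-Type", "text/html")])
        return [FORM]
    return wrapped
```

A form without an 'action' attribute POSTs to the current URL, which is why the redirect can simply go back to PATH_INFO.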

The obvious problem with this technique is that if people start doing it in any quantity, the "robots don't POST" thing won't last. Bad crawlers will start hitting POST endpoints for forms that just have a "click me" button, and then POST endpoints for forms that have an "I am human" tick box to mark or a field to fill in or whatever the elaboration people come up with is, and so on. Bad crawlers are in an arms race with websites and this is a problem.

Arms races require two active participants. An inactive participant in an arms race usually loses by default. In today's environment with aggressively bad crawlers, you can't simply set up a website and walk away from it, not if you want it to survive; you're forced to participate in the arms race. Your website may be static but your operation of your website increasingly can't be, not unless you want to wake up one day and discover that you don't have a website, you have a smoking hole in the ground and perhaps a big bandwidth bill from your hosting provider.

I don't have any answers to this. Instead, it feels like this whole situation is another obstacle in the way of people having their own low-attention websites (after the comment spammers made it impossible to have your own low-attention comment system). Someone has to pay attention, so that's either you or someone you outsource it to, and that someone is most likely going to need to be paid sooner or later.

(There are exceptions, but they're rare. Also, if you run your own website you sort of have to maintain the software involved, but automatic updates (and static websites) have mostly made that easier.)

An idea: user level WireGuard for UDP based encryption and authentication

By: cks
24 May 2026 at 14:24

In some environments, you want to connect programs together with mutual authentication and encryption of their traffic (so each end can trust the other and the traffic is immune to easy eavesdropping). If the programs are talking to each other over TCP, there's a well developed solution for this in the form of mutual TLS (mTLS) (although you'll probably get to enjoy the fun of running your own private Certificate Authority). But if you're using UDP, things are less clear. When this came up recently in a Fediverse discussion I was a peripheral part of, it occurred to me that we already have an existing, well regarded, UDP-based mechanism for authentication and encryption in the form of WireGuard.

(Yes, there's QUIC, but that still leaves you with TLS and it gives you a reliable stream model instead of a UDP model.)

WireGuard is normally used to create general networking connections between two machines, a connection that other programs can use to pass whatever traffic they want. But in theory it doesn't have to be used this way. There are purely user-level WireGuard libraries, and if you have a suitable library, you can have the program it's embedded in receive and handle the packets from inside the WireGuard connection, without injecting them into the operating system and exposing them to other programs (and it can also send its own packets back). You're probably going to need your own user level IP implementation to make the WireGuard library happy, and you may want or need a user level UDP or TCP implementation to make handling your own traffic simpler, but all of those are available if you hunt around.

What you get out of this is a well regarded protocol with simple, straightforward authentication, and to some degree it handles out of order packet delivery and packet loss, which is presumably something you care about if you picked UDP to start with. You don't have to deal with the complexities of any variant of TLS, you don't need a private Certificate Authority, and you can always directly know what one program is willing to talk to, because you have a list of public keys.

The drawback of this is that you have to put it together yourself. If you can find a suitable QUIC library (eg), it should do all of this for you in a hopefully straightforward API that looks a lot like (TCP based) TLS. The one potential drawback of the QUIC approach is that I believe it's only a stream based protocol without UDP-like, out of order delivery (and possible packet loss). If what you want is 'an authenticated, encrypted stream but over UDP', then QUIC is probably more like this than WireGuard. If what you want is 'authenticated, encrypted UDP', then WireGuard might be closer to that than QUIC.

Sidebar: WireGuard, QUIC, and out of order packets

As far as I can tell from reading the basic WireGuard protocol summary, each WireGuard encrypted packet is independent. While packets are sort of ordered by a counter, WireGuard can handle them out of order and will accept them out of order within a window (cf). If you're sending UDP datagram traffic within WireGuard, I believe this means that your underlying system can still have the UDP properties of non-blocking, non-sequential receives.
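The general technique of accepting counters out of order within a window can be sketched as a sliding bitmap. This is a simplified illustration of the idea, not WireGuard's actual implementation, and the default window size is an arbitrary value picked for the example.

```python
class ReplayWindow:
    """Accept each counter at most once, in any order within a window."""

    def __init__(self, size: int = 64) -> None:
        self.size = size    # illustrative value, not WireGuard's real one
        self.highest = -1   # highest counter accepted so far
        self.seen = 0       # bit i set => counter (highest - i) was seen

    def accept(self, counter: int) -> bool:
        if counter > self.highest:
            # A new highest counter: slide the window forward and mark it.
            shift = counter - self.highest
            self.seen = ((self.seen << shift) | 1) & ((1 << self.size) - 1)
            self.highest = counter
            return True
        offset = self.highest - counter
        if offset >= self.size:
            return False    # too old, it has fallen out of the window
        if self.seen & (1 << offset):
            return False    # a duplicate of something already accepted
        self.seen |= 1 << offset
        return True
```

A late packet is accepted as long as it's still inside the window and hasn't been seen before, so the receiver never has to wait for a missing packet before handing over later ones.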

As I understand it, a single QUIC connection carries multiple streams within it and each of these streams runs independently, so packet loss and delays on one stream don't affect other streams. However, I believe packet loss and packet reordering will block a single stream because it's, well, a stream.

How our environment still needs the security boundary of Unix logins

By: cks
23 May 2026 at 23:28

In a comment on this recent entry, I was asked if we still considered Unix logins to be a serious security boundary. This is a sensible question; there are a horde of Linux local privilege escalation vulnerabilities going around right now (and one FreeBSD one for spice), and in general (some) security people have been saying for years that once an attacker had local code execution, the game was over. Our answer is that yes, we consider it a serious security boundary, and if that situation ever changed we'd need a drastically different system environment from our current environment.

Our current environment has shared NFS fileservers where people keep all their files and data, shared login servers for both general usage and compute, a (shared) SLURM compute cluster, and a reasonably flexible shared web server environment where people can run programs. While some people are still using our login servers interactively, others are running software (such as VSCode) that connects to them somewhat behind the scenes and uses them to run tools. All of this is critically dependent on the security provided by Unix logins; if Unix logins weren't a real security boundary any more, anyone on any of these machines could read other people's files or run programs as them.

Since these machines are all shared machines with multiple people logged in at once, switching to Kerberos authenticated NFS wouldn't solve the problem. If we assume that attackers can merely become any other person, then they can gain access to the Kerberos tickets of anyone else who's currently logged in and access their files. If we assume that attackers can compromise root, then all bets are off and once a person has used that machine it can't be trusted for any future use (since the attacker could have compromised programs to capture the login credentials of future people logging in).

Basically, if you lose the security boundaries of Unix logins, you lose shared machines. You need to create a new environment without sharing (or with sharing boundaries that people can't break out of). Today, it appears that the only way to do that securely is a separate virtual machine for each person, with Kerberos authentication to our NFS fileservers (given some of the Linux security issues, containers are clearly not good enough). I'm not sure how you manage a SLURM cluster in this environment, but it certainly wouldn't be the straightforward way we do it today.

This would be a drastic change for people here and it would also be a significant increase in resource requirements (since realistic virtual machines are much more heavyweight than even full login sessions). We couldn't leave 'your' virtual machine (or machines) running all the time (we have too many people using our systems for that), so you'd have to use some web interface to request it be started with some resource allocation. Managing, maintaining, and updating these virtual machine images and running VMs would be at least a bit painful, and people would probably experience more disruption in their activities. Some things would become effectively impossible, such as running CGIs on our web server.

My views on Flymake and Flycheck in GNU Emacs (as of mid 2026)

By: cks
23 May 2026 at 02:57

One of the divisions in GNU Emacs people is between using Flymake, which is built into GNU Emacs and is well supported by other standard GNU Emacs packages such as Eglot, and using Flycheck. I've used Flycheck for a long time (cf) and recently tried using Flymake, which has given me some pragmatic opinions for my own usage.

(For non GNU Emacs people, Flymake and Flycheck both exist to present (and to some extent detect) 'diagnostics' about your code or whatever file you're editing.)

For me, Flymake and Flycheck are about as good as each other, at least in LSP based environments and Emacs Lisp. Flymake is better integrated into Eglot and can make errors more visible, Flycheck comes with more keybindings by default, and I go back and forth about how I feel about their modelines (after I diminished Flymake's verbose modeline name down to 'FlyM' and changed the colours a bit). Why I prefer Flycheck is that it's more flexible in one way that matters to me.

My particular taste with checkers is that by default I only want to see actual errors (or relatively strong style issues), but I want to have access to linters that express views I may not agree with in order to see what they say and maybe fix some things they complain about. This way I can keep my code free of real, core issues (that are reported by the error linters) and have a nice clear modeline showing '0' issues (and not have to remember how many baseline non-issues a file has), while still being able to conveniently see style issues if I want to consider them.

As far as I can tell, Flymake has no built in support for (easily) changing what sources of diagnostics it draws on. Things are just magically supposed to get it right, which is fine if they actually do but sub-optimal if they don't. One case where they don't necessarily is in Eglot, where as far as I know the normal diagnostics will only come from the LSP server you're running and will cover only what it provides. Even in cases where it's possible, changing what diagnostics you get from a LSP server isn't simple.

Perhaps because you can switch Flycheck checkers around, there are a bunch of third party Flycheck packages that support optional Go and Python style checkers (and some for other languages). Flymake has some third party checkers, but not really in the way Flycheck does (and what third party checkers it has can be rather out of date). The Flycheck situation is convenient and useful for me, because it means I can easily run (for example) golangci-lint against my Go code within the Flycheck framework with all sorts of jump to complaint support.

(There is an adapter to connect Flycheck checkers to Flymake, but as far as I know you're still left without a convenient way to pick your checker.)

Although Flycheck is my default, I've kept my Flymake configuration around and wired up some personal functions so that I can switch back and forth (either buffer locally or globally). Sometimes I flip over to Flymake to see what it says or use some of its other features.

(There's also Flycheck's comparison page with Flymake. A bunch of the differences that Flycheck lists aren't important to me, partly because I don't use GNU Emacs to edit everything in sight so the large collection of languages and configuration files that Flycheck supports aren't as important.)

PS: I'm dating this in the title because both Flymake and Flycheck have changed over time. My impression is that Flymake stagnated for a while, putting Flycheck clearly ahead in those days, but that things are more even today (especially in LSP environments, where both are getting the same diagnostics from the LSP server).

Notes on respectfully getting a personal copy of a website's contents

By: cks
21 May 2026 at 22:45

Suppose, hypothetically, that you want to have a personal copy of the content of some website that you feel is important (to you). There are perfectly good reasons to want such a copy; websites go away all the time on the Internet, and not everyone is online all of the time. It's generally possible to do this (and it's certainly possible to do this with Wandering Thoughts), but there's some things the hypothetical you is going to more or less need to do. These things will be work, but that's the difference between successfully getting a personal copy and turning a brute force crawler loose and then getting ratelimited and blocked. It's also the difference between being polite and being rude, and hopefully you care about that.

(With the increasing decay of Internet search engines, you might also want to build your own personal index of useful website content.)

First, you need to work out the URLs for the real content of the website. Many websites of interest have some mixture of real pages and various sorts of indexes and other aggregations of those real pages, and it's not uncommon for the index pages to outnumber the real pages, sometimes vastly. Your personal copy of the website contents doesn't need all of those index pages, you probably don't want them because they'll inflate the size of your copy, and the website itself will probably be unhappy that you're fetching a ton of redundant index pages.

(The amount of index pages varies with site design. Static sites are usually much friendlier than dynamic sites because it's more work to have a lot of index pages in a static site.)

If you're extremely lucky, the website will have an accurate, up to date (XML) sitemap and will put a tag mentioning this in the HTML <head> of its pages. If you're not so lucky you will have to manually look around to see if it has any particular index pages that you can mine for URLs (eg) and then work out what additional links and pages you need to also fetch to get what you consider a full copy (for example, to also get comments or 'talk' pages or the like, or to fetch images used in the web pages). In less friendly cases you'll have to go through a whole collection of category pages to accumulate the URLs.
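In the lucky case, pulling the URLs out of an XML sitemap takes only a few lines. This sketch assumes a plain sitemap in the standard sitemaps.org format that you've already fetched; the function name is made up, and it doesn't handle sitemap index files that point to further sitemaps.

```python
import xml.etree.ElementTree as ET

# The XML namespace used by standard sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
            if loc.text]
```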

(It's possible that the website supports paged syndication feeds and you can go back through its syndication feed to collect a full set of initial URLs, but I suspect that's not any more likely than a discoverable sitemap.)

Having accumulated your list of URLs, it's time to start fetching them, respectfully. Respectful fetching means doing two things: working slowly, and having an honest HTTP User-Agent. Working slowly means that getting a full copy will take a significant amount of time, but unless you think the website is going to go away tomorrow, you have that time. By 'slowly' I mean a request rate of one every 30 seconds or every minute, and if you get HTTP 429s or other indications of rate limits, you should slow down, even if you think this is absurdly slow. In my view, an honest HTTP User-Agent admits to what you're doing and optionally names the software you're using to do the fetching, because the web site operator cares much more about why these requests are happening than that you're using curl, wget, or whatever to make them.

(You especially shouldn't pretend to be a regular browser, or directly use a headless one. In these days of aggressive stealth crawlers, that makes you look extremely suspicious and may well get you blocked rapidly.)

Once you start fetching, you should monitor your fetching for problem indicators. Basically anything other than a HTTP 200 success may be a sign that either you have the wrong URLs or that you're in some way not welcome to do what you're doing. Continuing despite a spate of HTTP redirections or HTTP errors isn't particularly useful for your content copying project; you're only going to have to weed the results out of your copy.
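Putting the pacing, the honest User-Agent, and the monitoring together, a respectful fetch loop might be sketched like this. The User-Agent string, the function names, and the 'stop after three problems in a row' threshold are all assumptions for the example, and since urlopen() follows redirects on its own, this sketch doesn't notice them.

```python
import time
import urllib.error
import urllib.request

# Hypothetical User-Agent: say what you're doing and how to reach you.
USER_AGENT = "personal-copy-fetcher (one-time personal copy; you@example.org)"

def fetch(url: str) -> tuple[int, bytes]:
    """Fetch one URL, returning the HTTP status and the body."""
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as e:
        return e.code, b""

def polite_crawl(urls, fetch=fetch, delay=30.0, max_problems=3,
                 sleep=time.sleep):
    """Fetch URLs slowly, stopping after a run of non-200 responses."""
    pages = {}
    problems = 0
    for url in urls:
        status, body = fetch(url)
        if status == 200:
            pages[url] = body
            problems = 0
        else:
            problems += 1
            if status == 429:
                delay *= 2  # we were asked to slow down, so slow down
            if problems >= max_problems:
                break       # something is wrong; stop and look at it
        sleep(delay)
    return pages
```

The 'fetch' and 'sleep' arguments are only there so that the loop can be tried out without touching a real website or waiting around.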

(Also, continuing when a website is telling you 'no' is being rude. You're saying that your desires are more important than the website's views, and this generally makes you a certain sort of person.)

What all of this will get you is a personal copy of the website's content, possibly in addition to a skeletal set of index pages that you can use to navigate through it (you collected these pages when you built the initial URL set). It won't get you a complete archive of the website in HTML form that you could stick up somewhere else. A full website archive is a different thing, one that websites may be much more hostile to depending (in part) on how much redundant content you will wind up crawling in order to assemble your 'complete' version.

(Even if what you want is a full archive of everything, including index pages, starting with the important content first gets you the important content if something goes wrong.)

PS: Wandering Thoughts has a sitemap, which I bashed together many years ago to make Google happy and then found it was convenient for testing because it gave me a list of all pages that I really cared about the HTML rendering of. Interested parties can access it by putting a '?sitemap' on any directory URL. It's not (currently) in the HTML <head> of any pages because when I set it up, that wasn't really a thing. Given the modern web environment, I'm not certain I'll ever make it visible in the HTML <head> because I'm not certain I want to hand every abusive crawler a nice obvious map to the juicy bits.

(I have no idea how long it's been since Google accessed the sitemap; I suspect it's been years. But then, I increasingly don't care about Googlebot, although that's another entry.)

Unix has been changing, but in places where I don't see it

By: cks
21 May 2026 at 01:57

For reasons beyond the scope of this entry, I've wound up thinking about how stable or unstable the Unix landscape has been 'recently' (which means for more than a decade, and especially as compared with the 1990s and early 00s). I've written about aspects of this before, such as the fading out of multi-architecture Unix environments. In thinking about it more after my Fediverse post, I've come to feel that the Unix environment has still been changing but in places where I'm not as conscious of it.

The biggest change is probably the growth of cloud Unix, which I could characterize as "Unix machines on demand". In practice, cloud Unix is a whole new Unix environment that is quite different from traditional Unix, with different tools and especially different practices. Some of the practices are (sort of) extensions from old fashioned large scale Unix administration but many aren't really. I'm aware of cloud Unix and this gulf between operating it and what we do if I think about it, but I don't usually.

Cloud Unix practices spill over into what people want to do outside of the cloud, in the form of things like containers. Operating software through containers is quite different from traditional Unix system administration, especially if responsibility for the containers themselves gets moved from the system administration team to other people.

(There's also the idea of immutable systems created through declarative means, which isn't mainstream but also isn't a tiny corner any more. You can find plenty of people using Unix this way on servers and even desktops.)

I think that all of this has led to a significant change in how people experience Unix. Increasingly, Unix is either a desktop environment (not necessarily a graphical one, consider WSL in Windows) or a backend target; it's not something you explicitly (remotely) log in to very much. We've seen less and less direct use of our login servers and more use that is, for example, modern desktop IDEs starting remote sessions to run development tools on our servers. If VSCode could start SLURM jobs for people, some people here might never explicitly log in to our compute servers. I personally still log in to lots of remote Unix machines, but I'm increasingly an exception.

(I can't throw stones here since I recently carefully set up my desktop GNU Emacs so I could run remote LSP servers (and Git) through Tramp.)

A quiet but significant development is that after narrowing to x86 in practice for a while, Unix is moving back to being multi-architecture. There are a steadily increasing number of ARM servers and ARM devices that run Linux and other Unixes and that you'll find in the wild, primarily in clouds and as small Unix computers that you might put on your network to do specific jobs. It's plausible that some day we'll also get RISC-V servers and devices, or see ARM on general (Unix) desktops. People now routinely care about multi-architecture support for languages, compilers, distributions, and so on, where I think ten or twenty years ago that was a relatively niche concern.

(We've actually looked at small ARM-based Unix devices repeatedly and passed on trying to actually operate any of them for various reasons. Moderate sized, general purpose ARM servers don't seem to really be a thing so far, but maybe someday.)

In Linux, systemd is a drastic (and good) change in how init systems worked and how you interact with them, and makes that part of system administration relatively different from the pre-systemd days. Although I don't know when exactly it happened, the BSDs have gone through a similar evolution that regularized and improved the old ad-hoc BSD init system, making it rather easier to operate. This is probably the most dramatic change a system administrator from 2006 would notice if you jumped them 20 years ahead to today (and had them work with on premise servers without containerization).

There are certainly things that are part of my day to day use or at least administration of Unix that weren't there a decade or two ago. Even on old fashioned on premise servers, there's a lot more JSON and YAML than there used to be, partly because JSON has become the universal program-readable output format that everyone can agree on (and good tools, such as jq, have become widely available). But broadly, I feel that Unix has carried on being Unix and the experience of logging in and using the environment hasn't changed dramatically. If anything, different Unixes have become more similar, partly because lots of Unixes use the same programs (such as Bash and vim) and partly because Unixes have converged on common options for common programs (through both POSIX and pressure from people using them).

(Bash and vim aren't necessarily the default experience on all Unixes, but they're commonly available, partly because people want them.)

PS: The switch from X to Wayland is (still) a change that's in progress, but at the same time it's broadly supposed to be an invisible one to most people. Whether it should count as a change in Unix I will leave up to you.

Sidebar: My history with universal dotfiles

A long time ago I tried to have universal dotfiles for my shell environment across all of the multiple Unixes that I then had accounts on. The result was complicated, with lots of per-Unix and per-group settings. Today, I'm relatively certain that I could do a version for the surviving Unixes and system environments (and accounts) that had almost no conditionals. Some of this is through Unixes converging, some of it is through vendors with weird Unixes going away or becoming irrelevant to me (I'm unlikely to ever log in to another AIX machine, or a Solaris family one), and some of it through a relative convergence in how to administer machines.

Notes about reading messages with the Python email packages

By: cks
20 May 2026 at 03:01

I have a long standing personal program to display MIME formatted email messages in the terminal in a sensible way (it was mentioned in this old entry on my email tools and its comments). For a long time this was a Python 2 program, using the Python 2 version of the email package. Recently, I moved this program to Python 3 as part of my sudden enthusiasm for Python 3 conversions, using the Python 3 version of email and its sub-packages. In the process I have wound up with some notes and opinions on practical use of the Python 3 email packages.

(The Python 2 version of email had its own quirks and oddities, but I worked all of those out the hard way years ago, have mostly forgotten them since, and they're not interesting any more now that the era of Python 2 is over.)

The Python 3 email documentation will tell you that the modern interface for email messages is email.message.EmailMessage. The older email.message.Message is (theoretically) only there for Python 3.2 compatibility and you should ignore its methods and use only the EmailMessage methods. This is not entirely the case. If you look behind the curtain, you'll discover that many of the EmailMessage APIs for reading message contents are in fact Message APIs with masks on, and especially they're various masks for Message.get_payload(). That get_payload() isn't obsolete in practice matters, because it turns out that get_payload() is the only way to do certain things you (I) need.

As with decoding email headers, my strong impression is that the entire set of email parsing and message reading APIs is only really designed to deal with well formed email messages with fully correct MIME. This isn't what you find out in the real world, partly because programs are imperfect. It's also because of things like other mail systems sending you a bounce message that includes a message/rfc822 version of the original message where the other mail system has retained all of the message headers, including the Content-Type that says the original message was a multipart/alternative, but has replaced the entire body of the message with '(Body suppressed)'. As far as I can tell, there's no EmailMessage API that will give you (just) the body text of that (malformed) message/rfc822; your only way to dig it out is to use the older Message.get_payload() API.

(That bounce example is a real case that I've seen.)
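To make the fallback concrete, here is a small sketch. The RAW message is a made-up stand-in for the embedded original message of such a bounce (parsed directly rather than dug out of an enclosing message/rfc822 part), and body_text() is a hypothetical helper name. It relies on the parser leaving the payload as a plain string when it never finds the declared boundary.

```python
import email
import email.policy

# Made-up example: the headers claim multipart/alternative, but the
# body has been replaced and contains no boundary lines at all.
RAW = (
    "From: someone@example.org\n"
    "Subject: original message\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "(Body suppressed)\n"
)

msg = email.message_from_string(RAW, policy=email.policy.default)


def body_text(m):
    """Return the raw body of a 'multipart' that has no real parts."""
    payload = m.get_payload()
    if isinstance(payload, str):
        # No boundary was found, so the whole body was kept as one
        # string instead of a list of sub-messages.
        return payload
    return None


print(msg.get_content_type())
print(repr(body_text(msg)))
# The parser also records what it thought was wrong with the message.
print([type(d).__name__ for d in msg.defects])
```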

At the same time, EmailMessage.get_content() is a handy API that does a lot of the work for you for things like extracting a de-mangled, Unicode version of a text part (or anything that's sufficiently text-like, although you will get back a bytes thing instead of a str and then decode it yourself). So I use get_content() as much as possible but some things have to fall back to get_payload(). The one thing I'm cautious about with get_content() is that it has a cheerful trust in the asserted character set encoding of the MIME part, when I'm pretty certain that some mail creation programs blithely assume you'll typically interpret stuff as UTF-8 (especially if it has no type specified, which in theory means ASCII).
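One possible way to act on that caution is sketched below. The text_of() helper is hypothetical, and preferring UTF-8 when no charset is declared is an assumption about what you want, not something the email package does for you. When a charset is declared, it simply trusts get_content().

```python
import email
import email.policy


def text_of(part):
    """Decode a text part, guessing UTF-8 when no charset is declared."""
    if part.get_param("charset") is None:
        raw = part.get_payload(decode=True)
        if isinstance(raw, bytes):
            try:
                return raw.decode("utf-8")
            except UnicodeDecodeError:
                pass  # fall through to the package's own decoding
    return part.get_content()


# A part that declares its charset; get_content() does all the work.
declared = email.message_from_bytes(
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"caf=E9\n",
    policy=email.policy.default,
)

# A part with no declared charset whose body is actually UTF-8.
undeclared = email.message_from_bytes(
    b"Content-Type: text/plain\n\ncaf\xc3\xa9\n",
    policy=email.policy.default,
)

print(repr(text_of(declared)))
print(repr(text_of(undeclared)))
print(repr(undeclared.get_content()))
```

The last line shows what the undeclared part looks like if you take the theoretical ASCII default at face value.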

(get_payload() will also probably give you heartburn if you're trying to use typing, but this is a general email problem with API typing.)

The email package parses your messages with stuff in email.parser, which has some additional notes on how it theoretically parses things. Some of these notes are experimentally false, especially the one for message/delivery-status. The actual story is in comments in the source code:

message/delivery-status contains blocks of headers separated by a blank line. We'll represent each header block as a separate nested message object, but the processing is a bit different than standard message/* types because there is no body for the nested messages. A blank line separates the subparts.

Although the actual text of a message/delivery-status part is plain text (admittedly in a specific format, in theory), the parsed version is a multipart EmailMessage object containing a series of text/plain EmailMessage children, where the actual contents are in the headers of those text/plain children (and the 'body' is empty). The best way to extract the actual contents as text to print or process them is to use EmailMessage.as_string() on each child. This is quite confusing if you expect a message/delivery-status to have obvious contents or to match the documentation (and EmailMessage.get_content() doesn't work right on the multipart parent object; this may be a bug that will be fixed at some point).
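A small sketch of this, using a made-up bounce message (the addresses, boundary, and status values are all invented for illustration). It finds the message/delivery-status part and turns each header-only child back into text with as_string():

```python
import email
import email.policy

# Made-up multipart/report bounce with a delivery-status part.
RAW = (
    "From: mailer-daemon@example.org\n"
    "Subject: Delivery failure\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/report; report-type=delivery-status;\n"
    ' boundary="BOUND"\n'
    "\n"
    "--BOUND\n"
    "Content-Type: text/plain\n"
    "\n"
    "Your message could not be delivered.\n"
    "--BOUND\n"
    "Content-Type: message/delivery-status\n"
    "\n"
    "Reporting-MTA: dns; mail.example.org\n"
    "\n"
    "Final-Recipient: rfc822; nobody@example.com\n"
    "Action: failed\n"
    "Status: 5.1.1\n"
    "--BOUND--\n"
)

msg = email.message_from_string(RAW, policy=email.policy.default)

# Locate the delivery-status part; its payload is a list of children.
status = next(
    p for p in msg.walk()
    if p.get_content_type() == "message/delivery-status"
)

# Each child holds one block of status fields as its headers.
blocks = [child.as_string() for child in status.get_payload()]
for block in blocks:
    print(block)
```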

PS: The reason you don't want to use .as_string() on text or broken MIME parts is that MIME parts have headers, namely the various Content- ones, and .as_string() will give you those headers as well as the text you want. There's no option in the EmailMessage API to not get the headers.

Sidebar: Types for email stuff

Because sometimes I get enthusiasms, I added types to my program that's using email. It was somewhat painful and the kind of thing that you describe after the fact as "a valuable learning experience". In order for future me to not lose that learning experience, here are some notes.

My first problem was that often, mypy inferred that something was an email.message.Message instead of an email.message.EmailMessage; the latter is a subclass of the former. Much of this could be fixed with isinstance() to create type narrowing. I found the most convenient way to do this to be an assert(), for example:

prs = email.parser.BytesParser(policy=...)
m = prs.parse(fp)
assert(isinstance(m, EmailMessage))
[...]

Here I know that email.parser.BytesParser will return an EmailMessage because that's what my policy is set up to do (cf), but mypy can't see that.

A more involved situation is the return value of Message.get_payload(), which mypy typically typed as including 'list[Message]' when I know that what I have is a 'list[EmailMessage]'. Fixing this requires typing.cast():

def showalternative(p: EmailMessage) -> None:
  m = p.get_payload()
  if isinstance(m, str):
    [...]
    return

  assert(isinstance(m, list)) # for safety
  m = typing.cast(list[EmailMessage], m)
  [...]

You need to use typing.cast() to correct mypy's idea of the member type of a list or other container.

(Technically mypy and any other type checker that does similar inference. I don't know my way around the Python typechecker landscape, although I've wound up with a few of them installed.)

The hardware needs of our mail system (as of mid 2026)

By: cks
19 May 2026 at 21:08

In a comment on my entry on universities, email, and the issues of running things in house, I mentioned that our departmental email system has a non-trivial cost in hardware alone to keep going. To better illustrate that, I'll describe all of the servers that our email system currently requires (because it's more than one). Some of these servers exist for historical reasons and may go away at some point, but many of them don't.

Currently, we have:

  • A server as our external mail gateway (our DNS MX target). This is separate from other mail servers because it's much simpler to configure and operate this way.

  • A server for the (FOSS) anti-spam and anti-virus software we use (and everyone needs some version of). This could be folded into the mail gateway server (and it was in our recent backup MX), but we weren't sure about the software's resource usage and system impact when we set it up. Keeping it separate also means we can move it to a new OS version for more up to date software without having to worry about any changes in new versions of the mailer that the mail gateway runs.

  • A server for our central mail machine that handles all aspects of email to local addresses, which for various reasons (cf) can include sending email to the outside world. This machine doesn't store any email locally; instead, to simplify slightly, email lives on our general purpose NFS fileservers.

  • A separate server to handle forwarding known spam to outside email addresses. We're required to support this by people using our email system and we found it necessary to put this work on a separate machine.

  • A server to handle unauthenticated mail submission from inside our networks. Separating mail submission from the central mail machine makes for a simpler configuration for both (eg), and we historically started with only an unauthenticated mail submission machine.

  • A fairly powerful server to handle IMAP and authenticated SMTP submission, which these days also has /var/mail (where all our inboxes live) on local storage and thus also acts as a NFS server.

  • A server for a webmail frontend (to our IMAP server). We put this on a separate server from the IMAP server for multiple reasons, including resource usage and that it decouples the OS and packaged software version requirements of our webmail (for instance, certain versions of PHP and Apache) from everything else.

We've found it very important for practical reasons to use separate IP addresses for different sorts of outgoing email (also). We can do this on a single machine (and we do), but in many ways it's simpler to use separate machines for different sorts of email. It's also simpler to handle things like rate limits if we use different machines for things that need different rate limits.

All of these servers rely on existing elements of our general infrastructure, such as our general purpose NFS fileservers, our local DNS resolvers, and our system of propagating account information. I hope that at some point in the future our IMAP server machine will also wind up relying on our local OIDC identity provider (and indirectly on the LDAP server it uses), but that's currently not possible in practice. I'm mentioning these because a stand-alone mail environment would require some equivalent of all of them; you have to store mailboxes somewhere, get account and authentication information, do DNS resolution, and so on.

Most of these servers are 'basic' 1U servers, which these days means that they have 16 GB to 32 GB of RAM, a mirrored pair of SATA SSDs, a reasonable CPU, and traditionally cost a few thousand dollars each if bought new (their prices are probably higher at the moment). These specifications are good enough that we don't have to worry about the exact resource requirements of each server's job (although we made sure to give the anti-spam software machine 32 GB of RAM and a decent CPU). If we used smaller machines we'd have to be more careful; I'm pretty sure that not all of these roles would be happy with only 8 GB of RAM in practice (much less 4 GB). Basic 1U servers used to be cheaper, and these days we've got a stock of older servers that are good enough for these jobs. But if we were setting up a green field environment from scratch and had to buy all of these new, five or six servers (possibly plus a spare) would be a non-trivial cost.

(Because we're using the same sort of servers for these as we use for everything else, there's no dedicated spare for specific machines; we have spare server hardware in general.)

The one server that is an exception is our IMAP server. The current version has 64 GB, four relatively large SATA SSDs, a decent CPU, and 10G-T networking, and because it's so important we have a spare server ready to be pressed into use immediately in case of a hardware failure. The current hardware is old enough that we'd like to replace it, this time with more memory (so more things get cached) and NVMe SSDs instead of SATA ones. Unfortunately, in the current environment the price quotes we got are jaw dropping and unpleasant (especially since we have to buy two of the basic server to have a spare, although we don't need two sets of the NVMe drives).

All of this serves a department with somewhat over a thousand active people, about 1.5 TBytes of inboxes (if we talk about the likely uncompressed size; since we use ZFS for /var/mail, we have compression turned on), and an inbound mail volume that is probably around 10,000 messages a day. As mail system sizes go, this is modest.

(We have several thousand inboxes (and Unix accounts to go with them), but many of them are inactive for various reasons. The size distribution of inboxes is also extremely uneven, as you might guess.)

(Publication of this entry was delayed by me getting distracted and forgetting to actually publish it last night. I didn't realize it was still sitting in my drafts area until I noticed the stray editor window just now.)

Unicode and Emoji in terminals, or my simple but difficult wish

By: cks
18 May 2026 at 02:26

On the Fediverse, I had a simple sounding wish:

This is my face that I need a simple pagination program for Unix (show a page, pause, hit CR to show the next page) that is Unicode and emoji aware, so that it knows how long lines with them are. AFAIK less can't be used for this when I don't want it to ever clear the screen, just to keep printing the next page for however long.

(... because you have to know how long lines of text are so that you know when you've printed a full page.)

(So what I'd like is 'cat, but paginated'.)

This sounds like a simple, easy wish. Some of my readers are now laughing flatly, because it's not. In fact I believe it's impossible to write a simple general program to do this; you need either terminal program specific knowledge or to do some relatively extreme tricks as you print text.

Once upon a time, physical terminals and thus terminal programs were simple. They showed a set of characters in a monospaced grid, commonly with bytes mapping one to one to displayed characters (I'm ignoring DEC's double-sized character escape sequences). In this world, 'cat but paginated' is relatively simple, and indeed I have a program that does exactly this job; the only real complexity is handling tabs (where you have to work out what the next tab stop is in order to correctly track the width of the line).

(You have to track the width of a line because you need to know when the line you're printing spills over to a second physical line despite the lack of a newline character.)
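For the simple one-byte-per-column world, the width tracking can be sketched like this. The function names, the 8-column tab stops, and the 80-column default are assumptions for illustration, and how a terminal treats a line that exactly fills the last column varies, so this is only an approximation.

```python
def display_width(line, tabstop=8):
    """Width in columns of a line where every character is one cell."""
    col = 0
    for ch in line:
        if ch == "\t":
            # A tab advances to the next multiple of tabstop.
            col += tabstop - (col % tabstop)
        else:
            col += 1
    return col


def physical_lines(line, columns=80, tabstop=8):
    """How many terminal rows a logical line occupies once wrapped."""
    width = display_width(line.rstrip("\n"), tabstop)
    # Ceiling division; an empty line still takes one row.
    return max(1, -(-width // columns))
```

A paginator would sum physical_lines() over the lines it prints and pause once the total reaches the terminal height.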

The first problem that terminal programs give a pagination program in the non-Latin world is over-sized characters. Latin text has relatively simple character shapes that are easy to read at modest font sizes, but other scripts and other sorts of characters have much more complex shapes that are hard to read if you squeeze them into the same monospaced grid block as a Latin character at a given point size. So some of the time, some terminal programs don't; they render the characters larger. Which characters are rendered larger? It depends on the terminal program and, I think, the font (and certainly the character; see Let's Stop Ascribing Meaning to Code Points).
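The usual approximation is to consult Unicode's East Asian Width property, as in the sketch below. The guess_width() name is made up, and as the paragraph above says, a given terminal program and font may disagree with the result, so treat this as a guess and not as what any particular terminal does.

```python
import unicodedata


def guess_width(text):
    """Estimate columns used by text; real terminals may differ."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks normally share a cell with the
            # character before them.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters usually take two cells.
            width += 2
        else:
            width += 1
    return width
```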

The second problem is emoji. Emoji are one of the common cases of Unicode characters combining together, or more exactly I should say Unicode code points. Famously, many flag emoji are actually two emoji put together. For example, the Canadian flag emoji, 🇨🇦, is the 🇨 emoji followed by the 🇦 emoji (CA is the ISO country code for Canada). Whether this renders as a Canadian flag or as C followed by A depends on whether the terminal program and the font rendering environment knows about this specific combination and is willing to turn it into a flag.

(There can be multiple reasons for not rendering an emoji flag as a flag, including that sometimes flags are new and sometimes flags are politically charged, for example 🇹🇼 or 🇵🇸. I would not be surprised if in some environments, one or both of those flags is not rendered as a flag, but as two emoji characters.)
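You can see the two code points from Python. This sketch builds a flag from an ISO country code by mapping each letter to its regional indicator symbol; flag_for() is a made-up helper name for this example.

```python
import unicodedata


def flag_for(country):
    """Map a two-letter country code to regional indicator symbols."""
    base = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
    return "".join(chr(base + ord(c) - ord("A")) for c in country.upper())


flag = flag_for("CA")
# One visible flag (if you are lucky), but two code points.
print(len(flag))
for ch in flag:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Whether printing the string shows one flag or two letter symbols is up to your terminal program and fonts, which is exactly the problem.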

As a practical example, in my X environment the only terminal program that combined 🇨 and 🇦 to make a Canadian flag is konsole. None of xterm, rxvt-unicode, or to my surprise gnome-terminal did (gnome-terminal does render many emoji, but apparently not flags or even emoji characters). What this means is that how many displayed characters this sequence takes up depends on the terminal program. A pagination program that assumes it's some fixed width is guaranteed to be wrong some of the time.

(Emoji rendering can also be an example of wider character rendering. In konsole, 🇨 is as wide as two regular Latin characters; in the other three terminal programs, it's single width. My graphical GNU Emacs also combines emoji characters to make flags and displays emoji characters as double width in a monospace environment. In gnome-terminal, emoji that are displayed properly (as emoji) are typically double width.)

If you want your pagination program to strictly print output without manipulating things through cursor positioning, I'm not sure what a good way to handle this is. From where I sit, it certainly looks like a program that satisfied my simple sounding wish would have to hard code knowledge of how various terminal programs I use render emoji.

(I'd be remiss if I didn't point to It’s Not Wrong that "🤦🏼‍♂️".length == 7.)

PS: The 'less' pager seems to be able to cope with this, so it's possible in general if I'm willing to give up my quixotic wish for a 'cat with pagination' instead of something that clears and overwrites the screen, ruining my scrollback.

PPS: Arguably the correct place for this sort of pagination is in the terminal program itself, but xterm doesn't do that and I'm very attached to xterm. Also, I'm not sure if any existing good X terminal program does this today.

Our servers seem to have surprisingly low power consumption

By: cks
17 May 2026 at 03:03

For reasons beyond the scope of this entry, today I was curious how much power our servers were using. If you have sufficiently fancy PDUs, I think you can get per-outlet measurements that are trustworthy, but we don't have such PDUs. Instead, the best I can do is look at the information that some of our servers report through IPMI. I'm not sure how accurate the IPMI information is, but at least some of the numbers seem plausible, so I'll assume that it's not massively off. The results surprise me with how low they were.

Most of our servers are basic 1U servers with a pair of SATA SSDs. They're typically not very active, and the not particularly active servers that report IPMI power usage are reporting anything from 22 watts to 26 watts right now. A few servers also have a pair of HDDs, and one has four SSDs and is typically active; all of them report 44 watts right now. Our NFS fileservers currently have 24 SSDs in each and are reporting a range of power usage from 47 watts to 62 watts. One interesting case is our perimeter firewall, where we have the active server and an identical running hot spare, both 1U servers. The active server is at 52 watts (and handling about 40 Mbytes/sec of network traffic); the hot spare is idle at 26 watts.

We have a few compute servers that report power information through their IPMI, and the ones that are currently active are reporting the highest power usage. However, even this isn't spectacularly high. The most power hungry machine is a GPU SLURM node, where its IPMI is reporting 330 watts of total power while its GPU is claiming about 166 watts of power draw (its CPU is busy too).

Some of our servers don't report power usage in their IPMI sensor data but will report it through the web interface of their BMCs. I checked two of them, both powerful 1U servers that are essentially identical to each other. The one that is our primary login server is reporting an average of 149 watts and a peak of 195 watts over the past hour. The other one is a SLURM compute node, which is currently in active use; it averaged 501 watts in the past hour with a peak slightly higher (when not in use it appears to idle around 107 watts).

One of the reasons these numbers surprise me is that many of the idle numbers are lower than my desktops. I have a mental image of servers as not being particularly low power or power efficient, just as they're not particularly quiet, but that seems to be wrong. I suppose it's not too odd that people making 1U servers care about power usage and power density, since that's definitely a concern in general in data centers; it just hadn't really occurred to me before.

(Our own use of 1U servers is not particularly constrained by power and cooling.)

Getting C code navigation even for Debian (or Ubuntu) packages

By: cks
16 May 2026 at 02:00

Every so often, I want (or need) to make modifications to programs in an Ubuntu package, and often the programs are written in C (and these days I'm using dgit to manipulate the package). One of my challenges when I do this is that I generally don't start out knowing where and how to change the code to do what I want; instead, I have to navigate around an unfamiliar code base and work out enough of its structure to find the specific bit of code I need to change.

These days, the dominant way to get smart code navigation and other code knowledge things is through LSP servers and clients. A variety of modern and semi-modern languages have LSP servers that you can immediately use in your editor of choice and then navigate around random code bases with handy features like 'find definition' and 'find references' (for example, Go, Python, and Rust). Unfortunately, C isn't such a language. In the general case, understanding C code requires knowing how it's compiled, and that means you often have to tell C LSP servers this information. Well, specifically you have to tell this stuff to clangd, the dominant LSP server for C and C++.

(There's also ccls, which may work out part of this information on its own, but it seems to be less popular and I have no experience with it.)

Fortunately for people like me, there is a simple way to gather this compilation information even if the program's build system doesn't do it for you, and that's Bear (which is available as a standard Ubuntu package for extra convenience). Bear operates as a front-end on however you normally build your program; you build your program (or collection of programs) with 'bear -- <build command>', and Bear monitors compiler execution and records everything. This is slower than a normal build (sometimes significantly so), but you get a compilation database out of it and then you can use LSP tooling to jump around the source code.

(My understanding is that gcc, clang, and so on can generate this compilation information if they're asked, and modern build systems often ask them to do so, but an old fashioned build system using things like 'make' won't include the magic compiler options necessary. Possibly you can include them yourself by hand, but Bear takes care of the work for you.)

Somewhat to my surprise, Bear not only works with programs built by 'make', it also works when you build Debian or Ubuntu packages under Bear with 'bear -- dpkg-buildpackage -uc -b'. If you're building a substantial package (such as Dovecot), you're definitely going to notice the slowdown, but you do get LSP based code intelligence out of it (and you only have to do this once, not every time you change the code).

(Under some circumstances you may have to edit the generated compile_commands.json to take out gcc options that clang doesn't support, but fortunately the JSON file is in a human friendly format where each compiler option is on its own line. Possibly there's a way to manipulate the Debian/Ubuntu package build process to not use such options in the first place.)
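If you'd rather not edit the file by hand, the cleanup can be scripted. This is a sketch that assumes each entry in compile_commands.json has an "arguments" list (entries that use a single "command" string instead are left alone), and the helper names are made up. Which options you need to remove depends on your package and your clangd, so the set of unwanted options is left to the caller.

```python
import json


def strip_options(entries, unwanted):
    """Return a copy of the entries without the unwanted options."""
    cleaned = []
    for entry in entries:
        entry = dict(entry)  # don't modify the caller's data
        if "arguments" in entry:
            entry["arguments"] = [
                arg for arg in entry["arguments"] if arg not in unwanted
            ]
        cleaned.append(entry)
    return cleaned


def rewrite(path, unwanted):
    """Rewrite a compile_commands.json file in place."""
    with open(path) as fp:
        entries = json.load(fp)
    with open(path, "w") as fp:
        json.dump(strip_options(entries, unwanted), fp, indent=2)
        fp.write("\n")
```

You would then call something like rewrite("compile_commands.json", {"-fsome-gcc-only-option"}), where the option name here is a placeholder and not a real flag.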

Building Debian and Ubuntu packages contaminates your source directory, so once you've run a build under Bear to generate the compile_commands.json file, you need to move the file to safety and then reset your source directory somehow. If you're using dgit (which I very much think you should be), I believe this can be done with a variant of the standard dgit source directory reset instructions:

git clean -xdf -e compile_commands.json
git reset --hard

The process I suspect I'm going to follow in future dgit modifications of Ubuntu packages is to set up the package with dgit, build it once under Bear in unmodified state, rm the generated .deb and .ddeb files, and then start poking around the source code with LSP intelligence to find where I need to make my modifications (and then commit them and do a dgit build as usual).

(This elaborates on some Fediverse posts.)

I've finally ported DWiki from Python 2 to Python 3

By: cks
15 May 2026 at 00:17

DWiki is the pile of code that underlies Wandering Thoughts. It started out many years ago as a Python 2 program (partly because there was no Python 3 at the time), and it stayed that way for a long time, making it the most significant and by far the most substantial Python 2 program I still cared deeply about. Years ago I said I'd port it to Python 3 someday and somewhat to my surprise, that day has now come (well, it came yesterday).

The direct trigger was discovering that Python 3.13 had dropped 2to3, which made me feel that I should run 2to3 over DWiki's current Python 2 code base while I still could (I had an old conversion from many years ago, but that converted code base was very out of date). One thing led to another, as it often does with me, and I wound up doing a full port and then putting it into production, which is to say serving this blog. I suspect that part of me just felt it was time.

(The 2to3 removal is in the Python 3.13 release notes, and it comes after 2to3 and its infrastructure were deprecated in 3.11 for reasonable reasons.)

As I expected years ago, the stuff that 2to3 could handle was the easy part. Much of the actual work of the port was sorting out the boundary between Unicode strings and byte strings in a Python 3 world. Some of this would have been easier if I'd found PEP 3333 earlier and followed it in my own discount WSGI implementation, but a bunch of it I had to find the hard way, by trying things and having them blow up, sometimes in production.

(I wound up in the same place as PEP 3333 just from the inherent requirements of the web. For example, the HTTP Content-Length is in octets, so if you're using it to read a POST body, the object you're reading from has to be providing bytes. And it turns out that you can't write HTTP headers to a text mode file object because that will turn \r\n sequences into \n, which will make things unhappy with you.)
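As a small illustration of the Content-Length point, here is a sketch of reading a POST body the way PEP 3333 describes, from a bytes-providing wsgi.input. The read_post_body() name is made up for this example and this is not DWiki's actual code.

```python
import io


def read_post_body(environ):
    """Read exactly CONTENT_LENGTH octets from the WSGI input stream."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # wsgi.input is a bytes stream, so the count is in octets.
    return environ["wsgi.input"].read(length)


text = "caf\u00e9"
body = text.encode("utf-8")
# Four characters, but five octets once encoded as UTF-8.
print(len(text), len(body))

environ = {
    "CONTENT_LENGTH": str(len(body)),
    "wsgi.input": io.BytesIO(body),
}
print(read_post_body(environ))
```

If the input object handed back str instead of bytes, reading Content-Length 'characters' would read the wrong amount for any non-ASCII body.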

Not all of the changes were at the IO boundaries of DWiki (and the IO boundaries themselves weren't always simple or obvious). Python 3's handling of cryptographic hashes requires bytes, which rippled through to several places where I use them in DWiki (and the hmac API changed a bit, which wasn't fixed up by 2to3). Python 3 also really wants your regular expressions to be in r"..." strings, because otherwise it will complain about you using regular expression backslash escapes like '\s' that aren't string backslash escapes.
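As a small illustration of the bytes requirement, here is a minimal sketch (this is not DWiki's actual code; the function names are made up for the example). It assumes Python 3.8 or later, where hmac.new() has to be told which digest to use.

```python
import hashlib
import hmac
import re


def page_digest(text: str) -> str:
    """Hash a Unicode string by explicitly encoding it to bytes first."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sign(secret: bytes, message: str) -> str:
    """HMAC a message; both the key and the message must be bytes."""
    return hmac.new(secret, message.encode("utf-8"),
                    digestmod=hashlib.sha256).hexdigest()


# A raw string keeps '\s' as a regular expression escape instead of
# letting Python try to interpret it as a string escape first.
WHITESPACE = re.compile(r"\s+")
```

Passing a str straight to hashlib.sha256() or as the hmac key raises TypeError, which is how these places announce themselves during a port.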

I don't have a DWiki test suite, but long ago I built scripts that would crawl and collect all real pages from an old and a new version of DWiki. I originally used these to check for changes in how pages got rendered when I changed the wikitext processing code (often I wanted no changes), but this time around I was able to use them to verify that the Python 3 DWiki could at least render all existing pages into essentially the same thing (there were \r\n sequences that turned into \n instead of being passed through, but that's probably a good change). But that still left things like writing comments, and also the two sets of code involved in how DWiki runs in production instead of in testing.
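The comparison step can be sketched roughly like this. This is a hypothetical reconstruction, not my real scripts; it assumes the crawl results have already been loaded into dictionaries that map a page path to the rendered bytes, and it treats the \r\n to \n change as not a difference.

```python
def _normalize(page):
    """Treat \\r\\n and \\n as equivalent; None means the page is missing."""
    if page is None:
        return None
    return page.replace(b"\r\n", b"\n")


def changed_pages(old: dict, new: dict) -> list:
    """Return the sorted paths whose rendering differs between two crawls."""
    paths = sorted(set(old) | set(new))
    return [
        path for path in paths
        if _normalize(old.get(path)) != _normalize(new.get(path))
    ]
```

A page that exists in only one crawl is reported as changed, which is what you want when the goal is 'no differences at all'.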

I probably wouldn't have tried to do this if I hadn't had a relatively substantial block of free time. It took me more or less all day yesterday to get up to the current production state, with a lot of back and forth, experimentation, and tweaking. There was a lot of code and problem context that I might not have retained if I'd had to slice my work up into half hour or hour long chunks of work, and once I started running the Python 3 version as the live server I was relatively committed to fixing any problems that came up on the spot.

(I could have rolled back to the Python 2 version but it would have been at least a bit awkward for various reasons, including a pickle format change.)

The current Python 3 DWiki code still needs additional cleanups, partly to undo unnecessary 2to3 changes like changing 'for ... in dct.keys():' to 'for ... in list(dct.keys()):'. But it's running stably now for, well, not quite 24 hours yet but for at least a bunch of all of the typical traffic that Wandering Thoughts gets. Probably there aren't any remaining Unicode conversion issues, although re-reading one of my old entries makes me feel I should audit every use of EnvironmentError when dealing with files.

(2to3 appears to always put list() around things that changed to return iterators or views in Python 3. Sometimes this is important, but it's not necessary if the result is only being used in a 'for' that doesn't modify the underlying object.)
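To make the distinction concrete, here is a small sketch with made-up function names. The first function needs the list() because it deletes from the dictionary while looping; the second doesn't, because it only reads.

```python
def drop_empty(dct: dict) -> None:
    """Remove keys with empty values; list() snapshots the keys first."""
    for key in list(dct.keys()):
        if not dct[key]:
            del dct[key]


def total_length(dct: dict) -> int:
    """Only reads the dictionary, so iterating it directly is fine."""
    total = 0
    for key in dct:
        total += len(dct[key])
    return total
```

Without the list() in the first function, Python 3 raises RuntimeError ("dictionary changed size during iteration") as soon as the loop continues after a deletion.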

I also want to think about what Unicode error handling to use in various circumstances, although these days I'm inclined to be draconian. For example, if someone tries to write a comment with invalid UTF-8, I probably don't want to backslash escape the invalid bits, so the default 'replace' handling is fine (in my case, this comes from using urllib to decode POST bodies). And currently all of the existing content in Wandering Thoughts is UTF-8 clean, at least as far as I can tell.
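Here is a sketch of what the different choices look like for a byte string that isn't valid UTF-8. The helper names are invented for the example; the behavior of bytes.decode() and urllib.parse.parse_qs() (which defaults to encoding='utf-8' and errors='replace') is standard library behavior.

```python
from urllib.parse import parse_qs

BAD = b"caf\xe9"  # a Latin-1 'e with acute', which is not valid UTF-8


def is_utf8_clean(raw: bytes) -> bool:
    """The draconian check: strict decoding either works or it doesn't."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_variants(raw: bytes) -> dict:
    """Show what the lenient error handlers turn bad bytes into."""
    return {
        "replace": raw.decode("utf-8", errors="replace"),
        "backslashreplace": raw.decode("utf-8", errors="backslashreplace"),
    }


def parse_form(body: bytes) -> dict:
    """Parse a urlencoded POST body using parse_qs's default handling."""
    return parse_qs(body.decode("ascii"))
```

With 'replace', the bad byte becomes U+FFFD; with 'backslashreplace', it becomes the four characters of a backslash escape.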

(The whole Unicode and bytes issue is something where types would be handy (or an option to turn off all of Python 3's implicit conversions), but adding typing to DWiki's 'originated in Python 2' codebase is both a lot of work and also extremely messy, because it uses things in ways that mypy is already unhappy about.)

PS: The GitHub version of DWiki is now significantly out of date and I'm probably not going to update it for reasons that don't fit in the margins of this entry.

Sidebar: The Python 3 WSGI rules in a nutshell

To summarize PEP 3333 in my own way, HTTP headers are Unicode strings, ie str, but must be limited to iso-8859-1 characters (at least when you write them). The wsgi.input file object produces bytes and your HTTP response body is also bytes. In a CGI environment, you read from sys.stdin.buffer and your WSGI CGI implementation writes to sys.stdout.buffer (including the headers, after encoding to iso-8859-1).

If your WSGI implementation is talking to a network socket, you can and must leave the network socket as a binary file object. In my case, this generally means wsgi.input is created with 'os.fdopen(fd, "rb")'.
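Put together, the CGI side of these rules might look something like the following. This is a minimal sketch and not DWiki's implementation; the function names are hypothetical, and it assumes 'out' is a binary file object such as sys.stdout.buffer and that environ['wsgi.input'] is one such as sys.stdin.buffer.

```python
def read_request_body(environ: dict) -> bytes:
    """Read exactly Content-Length octets from the binary wsgi.input."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


def write_cgi_response(out, status: str, headers, body_chunks) -> None:
    """Write a CGI response: str headers as iso-8859-1, the body as bytes."""
    # In CGI, the status is conveyed to the web server as a 'Status:' header.
    out.write(("Status: %s\r\n" % status).encode("iso-8859-1"))
    for name, value in headers:
        out.write(("%s: %s\r\n" % (name, value)).encode("iso-8859-1"))
    out.write(b"\r\n")
    for chunk in body_chunks:
        out.write(chunk)
    out.flush()
```

Because everything goes through a binary file object, the \r\n sequences in the headers are written exactly as given.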

A GNU Emacs learning experience with text-mode hooks

By: cks
14 May 2026 at 17:51

For a while, one of my little irritations with my Emacs environment was that sometimes, when I fired up Emacs to edit some code and then quit out of it, Emacs would complain that there was still an ispell process running and ask me what to do with it. This was especially mysterious to me as I don't normally use flyspell-prog-mode (I find it too irritating for general use). Recently I got sufficiently irritated to use a combination of the ELisp debugger and strategic '(message ...)' usage to track this down, which initially looked like one issue and actually turned out to be another one that I discovered only as part of writing this entry.

One of the major modes in GNU Emacs is text-mode. I have a text-mode hook, probably like many people, and one of the things it does is turn on flyspell-mode in that buffer, which causes flyspell to invoke ispell and thus start an ispell process. It's also my custom from long ago to set the default major mode of buffers to text-mode (the out of the box default is fundamental-mode). If I'm editing something and it's not program source code, it's almost always text and having to say 'M-x text-mode' all the time is the kind of annoyance GNU Emacs is designed to erase.

When I used debug-on-entry to find out where the ispell process was starting from, it pointed to my text-mode hook. At first I theorized that code buffers were starting out in the default mode (and thus triggering my text-mode hook) before being switched to their proper mode, but strategic use of '(message ...)' in my text-mode hook revealed that it was actually being triggered on a scratch buffer for Flycheck. So I switched my theory to Flycheck creating scratch buffers without specifying their mode, so they wound up in the default major mode, which for normal setups is fundamental-mode but for me is text-mode, triggering my text-mode hook and starting ispell.

Except I looked at the Flycheck source and this is wrong. Here, let me quote a small bit:

(define-derived-mode flycheck-error-message-mode text-mode
  "Flycheck error messages"
  "Major mode for extended error messages.")

Flycheck explicitly derives the mode for some of its scratch buffers from text-mode, which of course means that they run text-mode hooks. This is a perfectly reasonable thing to do in general, since text-mode is the appropriate mode in general for, well, text, but it leads me to today's GNU Emacs learning experience which is that text-mode hooks may run in surprising buffers, not just text files I'm visiting and editing. I shouldn't put anything in my text-mode hook that I want only for real text files that I'm editing, at least not without guarding it somehow. One of those things is flyspell, not just because of its side effects of starting an ispell process but also because I don't particularly want flyspell to mark 'misspelled' words in, for example, Flycheck diagnostics.

(Flyspell's markings also get in the way of mouse based copy and paste.)

My solution was to guard what my text-mode hook did so that it only happens in buffers associated with a file:

(defun cks/text-mode-hook ()
  (when buffer-file-name
     ....))

It's possible that some day I'll want my text-mode setup in an anonymous buffer, but until that day I'll leave such scratch buffers alone. I could probably do a bit better by looking for buffer names that start and end with * (this is the usual GNU Emacs naming convention for explicit scratch buffers), but that would take a bit more work.

(Although not much more, now that I've found string-prefix-p and string-suffix-p.)

Going from a ZFS object ID to its path the easier way

By: cks
14 May 2026 at 02:52

It's not uncommon that people using filesystems want to map from an internal object number (an 'inode number' for normal filesystems, an object id or object number in ZFS) to a path. ZFS itself wants to do this efficiently for things like 'zfs diff' and the 'zpool status' report on what files are damaged. To help with this, ZFS stores the likely parent object for every normal filesystem object. If you use zdb to do a sufficiently verbose dump of any particular object, you can find this as the 'parent' attribute.

If you want to do this mapping yourself, you can use zdb or something like it to manually follow these 'parent' pointers (and also look up the name of everything in its parent directory). However, that would require high privileges, and ZFS doesn't want to make things like 'zpool status' require that, so the kernel and libzfs expose an API for this. In libzfs, this is 'zpool_obj_to_path()', which uses the kernel's ZFS_IOC_OBJ_TO_PATH ioctl(). Because it's intended for internal usage, this API doesn't take a pool and filesystem name (in addition to the object ID); instead it takes a pool handle and a dataset ID. It's up to callers, such as 'zpool status', to do the mapping.

(One reason you might want to go from an inode number (object id) to a path is that various things only give you inode numbers, such as NFS v4 locks on Linux NFS servers. Or you might have NFS activity tracing software that can only reliably report the inode number of files and directories that people are using heavily.)

In OpenZFS, years ago someone wrote a command that used this libzfs API to do all the work for us, zfs_ids_to_path. Like the API, this requires the dataset ID. Helpfully we don't need to use 'zdb' to get this; instead we can ask 'zfs list' for it. This gives us:

# zfs list -o name,objsetid ssddata/homes
NAME           OBJSETID
ssddata/homes       431
# zfs_ids_to_path ssddata 431 1920047
/homes/cks/.rcenv

Illumos and FreeBSD don't ship a version of zfs_ids_to_path, but the source code is sufficiently small and self contained that you could probably compile it yourself.

(Although my test FreeBSD 15 instance doesn't have the libshare.h header that's needed by libzfs.h, presumably through a packaging mistake.)

If you needed to do this frequently and found it annoying to look up the dataset ID every time, I believe that it wouldn't be too hard to work out and write the code you needed in order to go from a name like 'ssddata/homes' to a pool object and a dataset ID. Sorting through, for example, the source code for 'zfs list' might take some work (there's a whole collection of callbacks and so on), but it's doable (and perhaps someday people will write a slightly handier version).

(The lazy person can write a front end script today that combines 'zfs list' with zfs_ids_to_path.)
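Such a front end could be sketched like this. The function names are hypothetical and this hasn't been run against a real pool; it assumes 'zfs list -H -o objsetid' prints just the bare number, and that zfs_ids_to_path takes the pool name, the dataset (objset) ID, and the object ID in that order, as its usage message describes.

```python
import subprocess


def pool_of(dataset: str) -> str:
    """The pool is the first component of a dataset name."""
    return dataset.split("/", 1)[0]


def parse_objsetid(zfs_list_output: str) -> int:
    """Parse the single number printed by 'zfs list -H -o objsetid'."""
    return int(zfs_list_output.strip())


def object_to_path(dataset: str, object_id: int) -> str:
    """Map an object ID in a dataset to a path by running the two commands."""
    listing = subprocess.run(
        ["zfs", "list", "-H", "-o", "objsetid", dataset],
        check=True, capture_output=True, text=True,
    ).stdout
    result = subprocess.run(
        ["zfs_ids_to_path", pool_of(dataset),
         str(parse_objsetid(listing)), str(object_id)],
        check=True, capture_output=True, text=True,
    ).stdout
    return result.strip()
```

With this, something like object_to_path("ssddata/homes", 1920047) would do both steps in one go, with the same privileges you'd need to run the commands by hand.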

In praise of the Linux kernel netconsole (in the right circumstances)

By: cks
13 May 2026 at 00:29

The Linux kernel's netconsole is a kernel module that will "log kernel printk messages over UDP" to a remote system, which makes it another form of kernel (message) console. These days it can be activated either on boot or after boot, and in the past I've had mixed views of it. However, I recently had a nice experience with netconsole that's left me better inclined toward it in specific situations.

A while back, my home desktop started locking up every once in a while. Several years ago my home desktop had a somewhat similar problem that was due to hardware issues, but the lockups this time were different, in that the machine would lock up for a bit and then reboot on its own. Local logs showed nothing, but I happen to have another machine sitting around so I thought I might as well try netconsole again. These days netconsole can be enabled on the fly:

modprobe netconsole
cd /sys/kernel/config/netconsole
mkdir heedra
cd heedra
echo em0 >dev_name
echo 192.168.X.Y >remote_ip
echo 1 >enabled

(This other machine is called heedra for obscure reasons.)

On the other machine I ran a simple script to capture output inside a screen session:

#!/bin/sh
while :; do
   nc --recv-only -u -l 6666 |
      tee $HOME/work/h-logs/netconsole
done

(The advantage of --recv-only is that nc won't complain if I hit CR a few times in the screen session to create blank lines, so new messages are more obvious.)

After a while, my home desktop locked up again and rebooted soon afterward. When I checked the netconsole log file on the other machine, I discovered that I had actually captured kernel log messages, and reasonably useful ones at that.

The kernel logs revealed that this appears to be a kernel 'soft lockup', where all cores had gone to 100% system usage during what appears to be TLB flushes or cross-core kernel communication. In several of the kernel stack backtraces, bpf_trace_run4 appears, so I suspect that there's an uncommon eBPF locking race or issue that's infrequently tickled by the eBPF metrics gathering programs I normally run on my desktop.

(It's probably not from the eBPF programs systemd uses for network access control, since those are used widely.)

Capturing these kernel messages doesn't give me a solution, but at least it gives me a way forward if the lockups get too frequent and annoying (I can try disabling my eBPF metrics collectors). And I couldn't have gotten these messages with anything else except a serial console, which I don't have available on my home desktop and anyway would have needed a second machine in physical proximity (which is awkward in my home setup).

My understanding is that netconsole isn't quite as reliable as a serial console for getting last gasp kernel panic messages out, since you need more kernel pieces to still be working to transmit network packets. But it's more reliable than anything short of a serial console, and serial consoles are generally in short supply on modern desktops and desktop-like things (including hand-built SLURM nodes). For one off, small scale use my listening script would be fine, although if we needed to use it on a larger scale, we'd need some infrastructure to collect netconsole logs from multiple machines.

(Some suggestions for that are in the comments on my earlier entry.)
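As a rough idea of what the smallest possible version of that infrastructure might look like, here is a sketch of a UDP listener that tags each message with the address of the machine that sent it. The function names are made up, it listens on the same port 6666 as my nc script, and a real collector would want per-host files, timestamps, and so on.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket for netconsole messages."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock: socket.socket, logfile, max_messages=None) -> int:
    """Write each datagram to a binary logfile, prefixed by the sender's IP."""
    count = 0
    while max_messages is None or count < max_messages:
        data, (addr, _port) = sock.recvfrom(65535)
        line = addr.encode("ascii") + b" " + data
        if not line.endswith(b"\n"):
            line += b"\n"
        logfile.write(line)
        logfile.flush()
        count += 1
    return count
```

Run with something like collect(open_listener(), open("netconsole.log", "ab")), this appends to the log instead of starting over each time.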

A code (reformatting) conundrum in Python, and heuristics

By: cks
12 May 2026 at 02:26

Suppose that you are a Python code reformatter, and someone hands you the following snippet of Python code to act on:

if something:
    blah blah blah
    [...]
    final-line
some-statement

[... more statements ...]

Here's the question: should you reindent 'some-statement' so that it's part of the 'if' block?

One answer is that you absolutely should not. The current code is valid Python code, and you are a reformatter for style, not to correct (presumed) errors. Since this is valid code, you should re-flow line wrapping and so on within blocks, but not change what block valid code is part of.

Another answer is that maybe the person writing this code made a mistake. Style wise, it's common to add a blank line between the end of an indented block and following code; the lack of a blank line suggests that a mistake was made. So maybe you should reindent 'some-statement' to where it properly should be, especially if you have a style rule that says that there should be blank lines in this sort of situation.

(Of course, you could also opt to add the blank line that your style guide says should be there and not change what block a statement goes in. But we're in heuristics territory here.)

If you're a heuristic reformatter, your opinion may change depending on what the 'final-line' is. For instance, if the final statement in the if block is 'return', it is pretty obvious that there's not supposed to be anything after it. Anything after it is dead code, which would be a different and less likely error. So you should leave 'some-statement' alone and it's valid style to not have a blank line between the last statement in the 'if' block and 'some-statement'.

Python doesn't have all that many statements that definitively end blocks, but it does have some that are extremely suggestive. Consider this pattern of code:

try:
   something
except SomeError:
   pass
some-statement

The pass statement is a no-op, not something that affects control flow, so it's perfectly valid to have statements after a 'pass'; they will be executed normally. At the same time it's commonly used this way when there's not going to be anything after it, so a heuristic Python code formatter that moved 'some-statement' up into the 'except' would make lots of people unhappy.
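The distinction between 'definitely ends the block' and 'merely suggests it' can be written down as a small classifier over Python's own ast module. This is only a sketch of the heuristic as I've described it (the names and the three categories are my invention), and it needs code that already parses, which a real reformatter can't always count on.

```python
import ast

# Statements after which nothing else in the same block can run.
DEFINITE_ENDERS = (ast.Return, ast.Raise, ast.Break, ast.Continue)


def block_end_kind(block: list) -> str:
    """Classify a block (a list of ast statements) by its last statement."""
    last = block[-1]
    if isinstance(last, DEFINITE_ENDERS):
        return "definite"
    if isinstance(last, ast.Pass):
        # Valid to follow with more statements, but people rarely do.
        return "suggestive"
    return "open"
```

A heuristic reformatter would leave the following statement alone for 'definite' and (I would argue) for 'suggestive' too, and only start guessing for 'open'.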

One such heuristic Python code reformatter is the one used in GNU Emacs in both its conventional python-mode (which 'parses' Python code with regular expressions) and python-ts-mode (which fully parses Python code with a tree-sitter grammar). I'm not sure if these are the same reformatters, but they have the same effects. This particular reformatter heuristic turns out to be the root cause of my Python code reformatting glitches.

(In fact the GNU Emacs Python code reformatting appears to take a 'pass' as a hard end of block and will out-dent anything after it, regardless of what this does to control flow. If you add a 'pass' in the middle of a function and reflow with M-q, GNU Emacs will happily make all statements after it module level ones.)

I experimented with some stand-alone Python code formatters I had sitting around, and none of them behaved this way, which I guess isn't surprising (I tried black, ruff, and yapf). Since the normal pylsp Python LSP server relies on one of them for code reformatting (which one depends on your configuration), this also means LSP-driven code reformatting won't do this. It's possible that only GNU Emacs has this (arguably incorrect) heuristic reformatting.

(I was led to discover all of this by a comment ae left on my earlier entry about Python 2 LSP problems.)

PS: There are other heuristic decisions you can make depending on what 'some-statement' is and where it currently is in the overall block. For example, if 'some-statement' is the last statement in a function and is a 'return', then it's almost certainly correct in its current place. But these heuristics multiply endlessly.

Moving from lsp-mode in GNU Emacs to Eglot

By: cks
11 May 2026 at 03:15

Recently, I decided to take my long standing, perfectly good GNU Emacs lsp-mode setup and completely replace it with Eglot, the now built in GNU Emacs LSP solution. At one level I didn't have any particularly strong specific reason to switch; I started by trying out Eglot after switching entirely to Corfu then just kept going to see how far I could get towards a good Eglot environment. The result is perfectly good and some things work better (Eglot will do 'complete to common prefix' in Go and Python modes) but it took more than a little bit of yak shaving to get here.

At another level, lsp-mode with lsp-ui is what I'd call a busy interface, with all sorts of things going on, and these days I've decided that I want a quieter LSP experience. Eglot is famously more minimal and quiet than lsp-mode, although you can and should augment Eglot's interface with additional packages. I could have tamed lsp-ui more with additional settings and fiddling, but switching to Eglot took care of all of that all at once, with other benefits. Overall I'm happy to have switched, although it was more work than I was entirely expecting.

(Should you switch? I don't know, but if you stick with GNU Emacs and use it in the modern way, I think you will sooner or later.)

As I've described in an earlier entry, Eglot's minimalism is because it's a modern GNU Emacs package that expects you to fill in features with other packages that interact with it through standard Emacs Lisp APIs. This means that for a good (but non-busy) LSP experience in Eglot, I needed to hook up a variety of additional things.

  • Corfu just worked for completion; my general Corfu settings were fine.
  • To get a good cross reference setup where I could get lsp-ui like previews of references to something, I needed to connect consult to the general Emacs xref system by setting 'xref-show-xrefs-function' to 'consult-xref'.

  • I went back and forth between Flycheck with flycheck-eglot and Flymake before eventually settling on Flycheck. Flymake is better integrated with Eglot (in a way that I notice a bit) but I can make Flycheck work well enough and I prefer it in general. Eglot normally automatically puts buffers into flymake-mode, so to shut that off I do (in my use-package declaration for Eglot):

    :config
    (add-to-list 'eglot-stay-out-of 'flymake)
    

    And then to automatically activate flycheck-eglot:

    :hook
    (eglot-managed-mode . (lambda () (if (eglot-managed-p) (flycheck-eglot-mode 1))))
    

    (In theory flycheck-eglot has a global mode, in practice it didn't work out reliably for me and the brute force of a hook was the easiest approach.)

Eglot has some configuration settings that you'll want to experiment with. I found that I wanted 'eglot-extend-to-xref' to be 't', partly because that makes M-? find other uses in my own project of whatever external thing I've jumped to.

Eglot doesn't ship with any key bindings and I definitely needed some, partly to make LSP code actions more accessible. Since it's early in my Eglot usage, my key bindings are probably going to change, but my current set are:

("C-c r" . eglot-rename)
("C-c o" . eglot-code-action-organize-imports)
("C-c h" . eldoc)
("C-c a" . eglot-code-actions)
("C-c q" . eglot-code-action-quickfix)
("C-M-<mouse-2>" . eglot-code-actions-at-mouse)

The mouse binding exists because of one way flycheck-eglot isn't as fully hooked into Eglot as I'd wish, but it turns out to be generally convenient for access to LSP 'code actions'.

(I have deliberately not bound eglot-format to anything. In Go, the one language where I would trust LSP-driven code formatting, I already have go-mode's gofmt command that I'm accustomed to using. I also don't expect to use the LSP 'organize imports' often, but maybe in Python.)

This is in addition to key bindings for other packages, such as Flymake, where in order to get nice navigation of Flymake reports, I needed to set up a key binding for consult-flymake along with a few others for Flymake functions. This became a somewhat unnecessary side trip when I went back to Flycheck, but since I built a working Flymake setup, I'm keeping it for any time when I want to use Flymake instead.

Looking back, I'd estimate that most of my work in switching from lsp-mode to Eglot wasn't in configuring Eglot, it was in configuring other packages. But to say it that way makes it sound more straightforward than it was. The actual process involved a lot of looking around for additional packages, trying things out, discovering things that didn't work for me, and so on (and some amount of backtracking, like my adventures with Flymake). To be fair, this is more or less what I went through with lsp-mode when I first set it up.

Eglot officially recommends that you start it by hand (cf), but I'm too lazy for that. Instead, as I did with lsp-mode, I arranged to start it automatically for local files in the relevant modes.

(use-package eglot
  :defer t
  :init
  (defun eglot-ensure-local-only ()
    "Enable Eglot only on local buffers."
    (unless (file-remote-p default-directory) (eglot-ensure)))
  :hook
  (python-mode . eglot-ensure-local-only)
  (go-mode . eglot-ensure-local-only)
  [...]

One potential limitation of eglot-ensure as compared to eglot is that if you have multiple LSP servers for a particular language (such as 'pylsp' and 'ruff' for Python), eglot-ensure just picks the default one while eglot offers you a choice. To change afterward, you need to shut down the current LSP server and invoke 'eglot'.

(There's a program to multiplex LSP servers (discussion) if I ever want to run several at once.)

LSP servers can offer you a profusion of 'code actions'. Sadly Eglot doesn't make these particularly conveniently accessible (but then neither did my lsp-mode setup), although I hacked around that with a mouse binding (mentioned above). At one level this is technically fair and correct, because LSP servers only offer you code actions when you ask (and code actions are specific to a particular spot). Eglot also doesn't give you any way of filtering what specific code actions it will show you out of a potentially long server list that you find mostly irrelevant (and some, not working), which sadly makes them rather 'busy' for both Go and Python.

Once I had a basic Eglot setup working, I had a fun time learning how to disable some checkers in pylsp, the Python LSP server I use, because my tastes are strongly against style-based linters in 'present all the time' diagnostics. Lsp-mode provides convenient controls to turn off, for example, diagnostics from the 'mccabe' complexity linter. With Eglot, I got to learn all about user specified workspace configuration, which is definitely the morally correct approach to this but which is much more complex. Here, let me show you:

(setq-default eglot-workspace-configuration
   '(:pylsp (:plugins (:mccabe (:enabled :json-false)
                       :pylint (:enabled :json-false)
                       :pylsp_mypy (:enabled :json-false)
                       :mypy (:enabled :json-false)
                       :pycodestyle (:enabled :json-false))
                      )))

Yes, sometimes the mypy stuff is "pylsp_mypy" and sometimes it's just "mypy". This is an internal pylsp detail that Eglot makes you learn. Also, that 'setq-default' is load bearing; you can't use setq.

I find it unfortunate that Eglot doesn't have any convenient way to temporarily set LSP server parameters for a project. If you have specific settings, your life will be much easier if you put them in a correctly formatted .dir-locals.el file, which may look like this:

(( nil
   . ((eglot-workspace-configuration
       . ( :gopls (:analyses
             (:unusedresult :json-false
              :QF1012 :json-false
              :fmtappendf :json-false)))))))

(As you can tell, what you need to set varies from LSP server to LSP server. Gopls for Go is completely different than pylsp. This is a directory local setting for me rather than a global one because they only mis-fire on some of my code.)

If you want to change these settings on the fly, Eglot has documentation on that but it's not fun to deal with. If you sometimes want to turn on mypy for your Python (LSP) code but not always, as I do, you'll get to use 'dir-locals-set-class-variables' to set up a new class, then use a function that looks like this:

 (defun cks/mypy-enable ()
   "Set Python eglot workspace configuration to enable mypy."
   (interactive)
   (let ((server (eglot--current-server-or-lose)))
     (dir-locals-set-directory-class
        (project-root (eglot--project server))
                      'cks-mypy-enabled)
     (eglot-signal-didChangeConfiguration server)))

That this elaborate process is required is an accurate reflection of reality. Eglot is running one LSP server (per language) across your entire 'project' (directory tree), and settings for that LSP apply to all files you're editing in the project, so it can't have any notion of file or buffer local LSP server settings; they have to be project wide. By extension, setting 'eglot-workspace-configuration' through conventional means is a bad idea; that makes it a buffer local variable, which does nothing useful and will only confuse you.

Sidebar: My journey with Flymake and Flycheck in Eglot

Eglot works better with Flymake than with Flycheck and flycheck-eglot, at least currently. Specifically, with Flymake, Emacs will put a button 2 popup menu on the note itself with any LSP server driven corrections (usually a 'quickfix' LSP code action), but with Flycheck, all you get is the error being marked and you have to look for and trigger LSP code actions in another way. I initially switched to Flymake because of this, but Flymake took me some effort to configure so that I liked it.

However, after switching from Flycheck to Flymake, I found that there were still some things that Flycheck did better and sometimes I wanted Flycheck instead. So I retained my Flycheck setup as well (with flycheck-eglot too), which was convenient when the flycheck-eglot author came up with a nice workaround for my issue.

There's stuff to use Flycheck checkers in Flymake but I haven't done much experimentation with it, although I installed the package and set up some support infrastructure. My impression is that Flycheck has a larger collection of checkers than Flymake does and it's easier to shuffle among them. In theory a LSP server should make all other checkers unimportant, but in practice not so, especially if you want to sometimes invoke 'linter' level checkers.

I do sort of miss Flymake's 'show diagnostics at end of line' option, because it was a good way to make LSP diagnostics glaringly obvious, for times when I want that. There's flycheck-inline, but that only displays the current warning when you're on it, not all of the warnings when you scroll through. Sideline with sideline-flycheck has the same limitation but in my view a better UI experience.

Using a Python 3 LSP server with Python 2 code works (more or less)

By: cks
9 May 2026 at 21:57

I still have a certain amount of Python 2 code, both for work and for personal projects (for example, DWiki, the wiki software behind this blog; it will be Python 3 someday, but not so far). For a long time, I've preferred to do any significant editing of Python code in GNU Emacs, my normal choice for a superintelligent editor, and for a while, I've used LSP based Python editing. There's a very old LSP server for Python 2, but all of the Python LSP servers you actually want to use are specifically for Python 3, and recently I hit a problem that made me turn off the Python 2 LSP server. Since then I've been editing my Python 2 code (cautiously) with pylsp (my normal Python 3 LSP server) and recently, a little bit with 'ruff'. Somewhat to my surprise, this has more or less worked.

My minimum standard for more or less working is that the LSP doesn't malfunction obviously or deluge me with errors and other diagnostics that aren't applicable because it's applying Python 3 rules to Python 2 code. It's even better if the LSP can actually identify real problems, such as misspelled variable names or function names, and recently I've had pylsp do that for some of my code (code that was never tested or used, or I'd have found the problems much earlier; possibly this is a sign that I should have deleted the code instead of fixing it).

(The LSP server does obviously complain about Python 2 code that's using 'print' as a statement, since it's invalid Python 3 syntax, but this is easily fixed even in Python 2 code, and I want to fix it in anything I intend to maintain.)

Much of my Python 2 code mixes spaces and tabs for indentation, and I expected this to upset the Python 3 LSP servers. To my surprise, it hasn't for either pylsp or ruff. Although I can't tell for sure, I think that they're even still correctly interpreting the result (in terms of indentation levels and so on), or at least they're not complaining about syntax errors or other things I'd expect them to if they had the wrong idea of the code's structure.
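For context on why I expected trouble, the Python 3 interpreter itself rejects indentation whose meaning depends on how wide a tab is, raising TabError. The sketch below only checks what the interpreter's own compile() does; it says nothing about how pylsp or ruff parse things, and the sample strings are made up.

```python
def compiles_in_python3(source: str) -> bool:
    """Return False if Python 3 rejects the source's tab/space mixing."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return False
    return True


# Consistent tabs within a block are fine.
TABS_ONLY = "if True:\n\tx = 1\n\ty = 2\n"
# Tabs in one block and spaces in another are also accepted.
SEPARATE_BLOCKS = "if True:\n\tx = 1\nif True:\n        y = 2\n"
# Eight spaces on one line and a tab on the next, in the same block,
# is the classic Python 2 style that Python 3 refuses.
SAME_BLOCK_MIX = "if True:\n        x = 1\n\ty = 2\n"
```

Only the last of these is rejected, so Python 2 code that mixes tabs and spaces across separate blocks can parse under Python 3 rules.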

(Parts of GNU Emacs' python-mode do seem to get confused and (re)indent stuff incorrectly in my old school Python 2 code with 8 space indents and real tabs, which is somewhat surprising. But I guess very few people are editing Python 2 code with tabs in GNU Emacs these days.)

I've done some testing, and as far as I can tell LSP features like 'go to definition' and 'find references' more or less work as I'd expect them to in pylsp. In my (GNU Emacs) environment I think pylsp is limited to cross references within the set of Python files that the editor has loaded and told it about, but within that it's handy.

All of this makes it clearly worthwhile to me to keep LSP stuff enabled for my Python 2 code and to continue to use a superintelligent editor for editing it (although I still make quick changes to Python 2 code with vim). Which is good, because it's also easier and sometimes I'm lazy.

(Work still has Python 2 programs because those programs are load bearing and don't particularly need to change, at least most of the time. Could we port them to Python 3? Sure. Could we be sure they didn't have lurking Unicode issues or other problems? No, not necessarily. I did one Python 2 to Python 3 conversion for a load bearing set of programs, our suite of ZFS management tools (including our spares management system), and it was somewhat nerve wracking.)

PS: In my current GNU Emacs environment using Eglot, I don't think the LSP server is called when I hit TAB or M-q (based on the server events reported by eglot-events-buffer), so it's not going to be involved in any rerun of my problem with lsp-mode and the Python 2 LSP server. The LSP server will reindent and reflow the entire file (Emacs buffer), but I have to very specifically ask it to do that. If I have Eglot ask pylsp to reformat a function (selected as a region), pylsp sends back a null result, which I believe means 'no changes', so perhaps pylsp is throwing up its hands at my mixed tabs and spaces indentation.
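For illustration, here is a minimal sketch of the kind of Eglot setup this implies. It assumes pylsp is installed as 'pylsp' on your $PATH, and it is not necessarily the exact configuration described above:

```elisp
;; Assumption: the pylsp executable is on $PATH under the name "pylsp".
(with-eval-after-load 'eglot
  (add-to-list 'eglot-server-programs
               '(python-mode . ("pylsp"))))

;; Optional: start Eglot automatically in Python buffers.
(add-hook 'python-mode-hook #'eglot-ensure)
```

With something like this in place, M-x eglot-format-buffer asks the server to reformat the whole buffer, M-x eglot-format does the same for the active region, and M-x eglot-events-buffer shows the traffic between Emacs and the server.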

Notes on using GNU Emacs' Tramp system in an unusual shell environment

By: cks
9 May 2026 at 02:09

Tramp is a famous and often praised GNU Emacs system for editing remote files; lots of people will call it one of Emacs' compelling features. I've always had a decidedly different view of Tramp because Tramp has mostly not worked for me in opaque ways. I recently took another run at getting Tramp working (so I could have an informed opinion on why I'm not a fan), and in the process I've learned a bunch of things that I don't want to forget.

Although Tramp has a bunch of ways to get access to files remotely ('methods' in Tramp jargon), the dominant way is for Tramp to SSH in to the remote system and do stuff. In order to work with your remote shell, Tramp really wants your login on the remote system to have a conventional shell environment, ideally one that uses the Bourne shell (especially Bash).

(But see Remote shell setup hints and the Tramp FAQ.)

Specifically, Tramp has these requirements for its ssh method in a stock setup:

  • Your shell must have a relatively conventional shell prompt. Defining this is beyond the scope of this entry; see the definition of tramp-shell-prompt-pattern in tramp.el.
  • Your shell must accept and use backslash quoting of more or less arbitrary characters in command lines.
  • Your shell login can't pause to ask questions; it can produce some additional output but it needs to drop you to a shell prompt (that Tramp can recognize).

All of these are required because with the 'ssh' method, Tramp ssh's in and starts a full login session, then switches to /bin/sh (or the Tramp remote shell you've set) with some special things that will let it reliably recognize its own Tramp (shell) prompts. Using the 'sshx' method can bypass a lot of this because with it, Tramp directly runs /bin/sh without going through your remote login session. I believe sshx is also often going to be faster, at the cost of not establishing all of the environment variables and so on that your login session would (including your remote shell's normal $PATH).
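As a rough illustration, choosing between the two methods is a matter of the prefix on the file name, or of tramp-default-method if you want a different default. The host name and file below are placeholders:

```elisp
;; Placeholder file names:
;;   /ssh:cks@host.example.org:/etc/motd    goes through the login session
;;   /sshx:cks@host.example.org:/etc/motd   runs /bin/sh directly

;; Make sshx the method Tramp uses when the file name asks for the
;; default method, as in /-:host.example.org:/etc/motd
(setq tramp-default-method "sshx")
```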

If your login shell environment doesn't match all of these you're going to have a varying amount of problems, especially with the 'ssh' method. If you have an unconventional prompt, you can sort of fix it, but a shell with different quoting rules will be painful. Tramp has some mechanisms to deal with additional questions but my impression is that they're at least a slog (see parts of Remote shell setup).

(Since I went through this, to deal with quoting issues you need to redefine tramp-end-of-output to something that doesn't require quoting that your shell doesn't support, and then make sure that your tramp-shell-prompt-pattern matches it in addition to everything else. The only characters that won't be quoted with backslashes by GNU Emacs are -, _, ., /, 0-9, and a-zA-Z (this is deep in shell-quote-argument). There are some things that may break inside GNU Emacs and Tramp if you do this but I haven't had any problems yet.)
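To see which characters get backslashes, you can call shell-quote-argument directly. A small sketch follows. The sample strings are made up, and the results assume a Unix-like Emacs rather than a Windows one:

```elisp
;; Only "safe" characters, so nothing is quoted.
(shell-quote-argument "///abc123")    ; => "///abc123"

;; '#' and '$' are not in the safe set, so each gets a backslash.
(shell-quote-argument "///abc123#$")  ; => "///abc123\\#\\$"
```

Any replacement end-of-output marker would need to come through this unchanged, like the first string does.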

If you ask Tramp to use the (remote) $PATH your remote environment sets up, it must be able to run '/bin/sh -l -c ...' in a way that successfully runs the command string without having your .profile blow things up, despite your .profile probably not being able to detect this. This is typically triggered by you putting 'tramp-own-remote-path' somewhere in tramp-remote-path (either the global version or a connection profile). Because Tramp is that way, the remote path is not part of the predefined connection information that you can set directly.
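For reference, the usual way to ask for this is the form that Tramp's own documentation suggests. The with-eval-after-load wrapper here is my assumption about how you'd want to arrange it, so that tramp-remote-path exists before you modify it:

```elisp
(with-eval-after-load 'tramp
  ;; Ask Tramp to also use the $PATH set up by the remote login shell.
  (add-to-list 'tramp-remote-path 'tramp-own-remote-path))
```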

Despite Tramp carefully initializing your remote login session (if you use 'ssh'), Tramp then normally ignores your remote $PATH and instead generates its own, based on tramp-remote-path. Various bits of Tramp documentation will imply that you can use '~' in things you add to tramp-remote-path (cf some of the examples), but as far as I can tell this is what you would call inoperable. As part of connection setup, Tramp reduces tramp-remote-path down to the directories that exist on the remote machine, and the mechanism Tramp uses for this appears to be incompatible with the use of either '~' or environment variables like '$HOME'.

(Tramp does this path check using the tramp-bundle-read-file-names defconst and you can read what that expands to in order to see the details, along with the tramp-get-remote-path function and the stuff it calls. Since the shell snippet Tramp sends to the remote end quotes all of the directory names it checks, whether or not the remote shell supports '~' is irrelevant and it won't expand $HOME for you. It's possible that this is a bug and Tramp will get fixed some day, but don't hold your breath.)

There's no particularly good fix to this that I know of; instead, I think you have two options. The first is to make tramp-own-remote-path work (it probably will if you use a conventional shell and .profile), add it to tramp-remote-path, and set up and handle your $PATH properly in each machine's .profile. This is probably the better option if you can arrange it, in part because you probably want a correctly set remote $PATH for when you're logged in to the machine directly. The second option, suitable only if you have a common home directory name pattern or two across all your machines, is to add all likely directories to your tramp-remote-path in whatever variations of your home directory you might have:

(dolist (pe '("/home/cks/go/bin" "/u/cks/go/bin" ....))
  (add-to-list 'tramp-remote-path pe))

(Or you could write an ELisp function that generated the list from multiple sublists, one for things relative to your home directory and one a list of possible home directories.)
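A minimal sketch of such a function follows. The function name and the extra "bin" subdirectory are hypothetical, and it assumes Unix-style directory names:

```elisp
(defun my/tramp-home-paths (homes subdirs)
  "Return every HOME/SUBDIR combination as a list of directory names.
HOMES is a list of possible home directories and SUBDIRS is a list
of directories relative to a home directory."
  (let (result)
    (dolist (home homes)
      (dolist (sub subdirs)
        (push (concat (file-name-as-directory home) sub) result)))
    (nreverse result)))

;; (my/tramp-home-paths '("/home/cks" "/u/cks") '("go/bin"))
;; => ("/home/cks/go/bin" "/u/cks/go/bin")

(with-eval-after-load 'tramp
  (dolist (pe (my/tramp-home-paths '("/home/cks" "/u/cks")
                                   '("go/bin" "bin")))
    (add-to-list 'tramp-remote-path pe)))
```

Note that add-to-list puts new elements at the front by default, so these directories end up ahead of Tramp's stock entries; pass a third argument of t if you'd rather append them.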

Many modern Unix systems in standard configurations will make your home directory be /home/<login>, so you can cover all of them by a few paths in tramp-remote-path. Well, assuming you have the same login on all of them. Otherwise, you'll probably have to venture into the world of connection local variables and profiles.
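A sketch of what that might look like, modelled on the connection-local example in the Tramp manual. The profile name, host name, and directory are all placeholders, and I haven't verified how this interacts with everything else discussed here:

```elisp
;; Hypothetical profile for a machine where the login is different.
(connection-local-set-profile-variables
 'my/otherlogin-remote-path
 '((tramp-remote-path . ("/home/otherlogin/go/bin"
                         tramp-default-remote-path))))

;; Apply the profile to Tramp connections to one specific host.
(connection-local-set-profiles
 '(:application tramp :machine "host.example.org")
 'my/otherlogin-remote-path)
```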

When changing tramp-remote-path there is something very important that can cause you (me) a great deal of frustration if you don't know the full story. At the very end of Tramp's documentation on remote programs, there is this critically important bit:

When remote search paths are changed, local Tramp caches must be recomputed. To force Tramp to recompute afresh, call M-x tramp-cleanup-this-connection RET or friends (see Cleanup remote connections).

If you're me, you might innocently think that it's safe to, for example, set or modify tramp-remote-path before you make any connections. This is false, and calling tramp-cleanup-this-connection is not sufficient to force 'local Tramp caches' to be recomputed. In fact, not even quitting and restarting Emacs will do so. Tramp maintains a persistent file based cache of information about each host you've ever connected to, including the remote $PATH it determined at the time of the first connection (with the first connection's tramp-remote-path), and it will use that cached remote $PATH value until and unless you clear the entire cache by, for example, deleting ~/.emacs.d/tramp (with Emacs not running), or you use tramp-cleanup-all-connections, which I think is probably sufficient.
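If you want a belt and suspenders version from inside a running Emacs, something like the following sketch might do. The function name is hypothetical, and deleting the file on top of tramp-cleanup-all-connections may well be redundant:

```elisp
(require 'tramp)

(defun my/tramp-forget-everything ()
  "Flush Tramp's connections and cached data, then remove the cache file."
  (interactive)
  ;; Close all connections and flush their cached properties.
  (tramp-cleanup-all-connections)
  ;; Also delete the on-disk cache, if there is one.
  (when (and (stringp tramp-persistency-file-name)
             (file-exists-p tramp-persistency-file-name))
    (delete-file tramp-persistency-file-name)))
```

Tramp may write a fresh cache file when Emacs exits, but it should be built from whatever it learned after the cleanup.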

Given its persistent and dangerous effects, you might want to disable this Tramp cache file. The fine documentation asserts that you can do this by setting tramp-persistency-file-name to nil. This appears to be technically correct but practically inoperative, because you cannot customize the variable to nil (only to a filename) or usefully setq it before Tramp is loaded. You can only setq it to nil (and have it stick) after Tramp is loaded (and you probably also want to invoke tramp-cleanup-all-connections to get rid of anything Tramp may have loaded).

Tramp isn't a mode and so doesn't have any hook that fires when it loads and starts to activate, which would be the right time to augment tramp-remote-path, clear any cached data Tramp loaded, and so on. This is unfortunate but use-package provides a way to work around it:

(use-package tramp
  :defer t
  ;; :config will be run right after Tramp loads.
  :config
  (cks/tramp-setup))

This appears to reliably fire as I start to enter '/sshx:' or '/ssh:' or what have you.

(The manual version of this would be to directly use eval-after-load, but I might as well stick with use-package even if that's what use-package is using under the (macro) hood.)
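A sketch of that manual version, using with-eval-after-load (the modern wrapper around eval-after-load). The setup function here is a hypothetical stand-in, not the actual function used above, and the directory is only an example:

```elisp
;; Hypothetical setup function; the name and contents are illustrative.
(defun my/tramp-setup ()
  "Adjust Tramp settings once Tramp itself has been loaded."
  ;; Stop using the persistent cache file from here on.
  (setq tramp-persistency-file-name nil)
  ;; Discard anything Tramp may already have read from the cache.
  (tramp-cleanup-all-connections)
  ;; Example directory, assumed to exist on the remote machines.
  (add-to-list 'tramp-remote-path "/home/cks/go/bin"))

;; Run the setup function right after Tramp finishes loading.
(with-eval-after-load 'tramp
  (my/tramp-setup))
```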

When it works, Tramp can be pretty magical. However, my voyage of getting to this point was anything but smooth, and parts of it were extremely frustrating. The most frustrating part was the Tramp file cache, which made various changes to tramp-remote-path have no effect, then sometimes have an effect, and then go back to having no effect, because I wasn't religiously clearing and removing the cache.

(Tramp badly needs a command that reports all of the relevant parameters for the current connection, such as the current remote path that Tramp is using. I could probably put my own version together with enough determination, but I shouldn't have to.)

PS: This entry was written in my working Tramp configuration from my home desktop, but I'm not sure I'm going to bother doing this again (I normally write entries in vim on the host that Wandering Thoughts is on). The red squiggles under (potentially) misspelled words are sort of nice, but on the other hand I turn out to have lots of vim reflexes for writing Wandering Thoughts entries.

(The reflexes aren't triggered by writing in general, because these days I write a lot of email in GNU Emacs and that goes fine.)
