
voice modems

If you've done much with modern cellphones, you've probably noticed just how odd the architecture can be around audio. Specifically, I mean call audio: modern smartphones have made call audio less of a special case (mostly by just becoming more complicated in general), but in older phones you would often find arrangements where the cellular modem[1] had direct analog audio to the microphone and speaker, perhaps via some switching to share amplifiers. That design meant that the cellular modem functioned basically as a completely independent device, a fully-capable "cellular phone" with the ability to make and receive voice calls. The role of the rest of the smartphone, and its operating system, was just to provide control messages for starting and ending calls.

In modern phones the audio path to and from the modem is digital and it's more integrated into the operating system audio service, but still not fully. You might have noticed, for example, that it is excessively difficult to record call audio on most phones. Regulatory and liability pressures are one reason for this, but another is that it's actually kind of difficult: there may not be any physical way for software running on the main processor to receive audio from the cellular modem. The designer has to put in explicit effort to make that work, effort that only became common more recently to facilitate automatic transcription—and VoLTE, a whole complication that I will simply ignore for the sake of a cleaner historical narrative. You come here to read about old phones, not new ones.

You've probably read enough of my writing to know where this is going: the design of cellular radios, which assume call audio to be part of Their Exclusive Domain, is a legacy of an age-old architectural decision traceable to the original Hayes Smartmodem. It relates to a feature of modems that was widely available, but sparsely used, for much of the PC revolution. The details are odd!

First, for context, let's recede into our mind palaces and travel back to the 1980s. AT&T-designed modems like the Bell 103 had created a standardized family of protocols for data over voice lines, and a company called Hayes introduced a Bell 103-like implementation called the Smartmodem. The Smartmodem was quite successful on its own, but it was more significant for having introduced a common control interface between the modem and the computer. Previous modems had acted as transparent devices that expected Something Else to perform call setup tasks, while the Hayes Smartmodem could pick up the line and dial all on its own. That required that the computer send commands to the modem to configure and start a call.

Hayes designed a simple scheme for sending commands to the modem and switching it in and out of transparent data mode, and that protocol was then widely copied by other modem manufacturers. You could call it the "Hayes command set," and older documents often do, but these days it's more commonly known by the two characters that prefix most commands: the AT protocol.

From its origin in 1981, AT has shown remarkable staying power. Virtually all computer-connected modems, to this very day, continue to use AT commands for basic configuration. Likewise, the basic architecture of the Smartmodem persists: the Smartmodem connected to the host computer using a single RS-232 link that switched between carrying control messages and data. The very latest 5G modems still work the same way, complicated by the addition of multiple separate UART serial channels (so that, for example, control commands, data, and GNSS data can each have their own separate channel) and the adoption of the USB communications device class "Abstract Control Model," a standard UART-over-USB implementation mostly intended to simplify modems. Plug a modern 5G modem into a Linux machine and you can easily observe this: virtually all cellular modems are USB-attached and will appear as a USB composite device with multiple serial adapters, usually attached as /dev/ttyACM* due to the USB-CDC ACM class.

Courtesy of the V.250 standard (a formalization of AT commands) and considerable effort by driver implementers, USB-attached modems "Just Work" as network interfaces on modern Linux—but under the hood, the kernel is communicating with the modem over separate serial interfaces. Back in the olden days, it was common to run PPP (point-to-point protocol) over one of the serial interfaces to use the actual data (bearer) channel, but now PPP has mostly given way to "Direct IP" where you just push packets over the serial link.
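
You can poke at this from a Linux machine with nothing more than a serial terminal. Here's a minimal sketch in Python using pyserial; the /dev/ttyACM0 path is an assumption (your modem may enumerate as /dev/ttyUSB-something instead, and only some of its ports will accept AT commands), and the exact responses vary from modem to modem.

    # Minimal sketch: send AT commands to a USB-attached modem's control channel.
    # The device path and the responses shown in comments are assumptions.
    import serial  # pyserial

    def at(port, command, timeout=2.0):
        """Send one AT command and return whatever text the modem replies with."""
        with serial.Serial(port, 115200, timeout=timeout) as tty:
            tty.write((command + "\r").encode("ascii"))
            return tty.read(4096).decode("ascii", errors="replace")

    if __name__ == "__main__":
        print(at("/dev/ttyACM0", "AT"))   # liveness check, expect "OK"
        print(at("/dev/ttyACM0", "ATI"))  # identification string, varies by vendor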

Just to complicate things a touch more, there are vendor-specific standards like QMI (Qualcomm) that completely replace AT and find use in modern smartphones, but they're messy with regards to Linux support. If you are personally interacting at this layer, messing with modems or writing communications software or whatever, you are almost certainly going to stick to AT commands. Modem vendors continue to build on AT. If you look at LTE modems made for IoT applications, for example, it's common for them to provide a complete HTTP implementation (and sometimes MQTT, and sometimes some kind of proprietary message broker protocol) accessible via AT commands. That means you can implement an IoT device without a network stack at all, deferring all network operations to the modem itself. With a JSON-over-HTTP backend, for example, you might send AT commands with JSON payloads over the serial control channel and then get JSON back. You never interact with the network at all, the modem is a completely self-contained system. At the extreme, you might implement your entire device using exclusively the modem. This is a common approach for telematics devices like GPS trackers: they consist of nothing but a cellular modem, the telemetry application is built for the modem using an SDK from its vendor, and you interact with it using AT commands. IoT-class modems frequently provide GPIO and user flash for just this purpose.
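
To give a flavor of that pattern, here's a sketch of a telemetry post going through the modem's own HTTP stack. The AT#HTTPOPEN and AT#HTTPSEND command names are hypothetical stand-ins, since every vendor spells its HTTP extension differently, but the shape of the exchange is typical: configure the request with AT commands, push the JSON payload down the serial channel, and read the response back over the same channel.

    # Sketch of the "HTTP over AT commands" pattern offered by IoT-class modems.
    # AT#HTTPOPEN and AT#HTTPSEND are hypothetical names standing in for a
    # vendor's real extension commands; the flow, not the spelling, is the point.
    import json
    import serial  # pyserial

    def post_telemetry(port, host, path, record):
        body = json.dumps(record).encode("ascii")
        with serial.Serial(port, 115200, timeout=10) as tty:
            tty.write(f'AT#HTTPOPEN="{host}",80\r'.encode("ascii"))           # hypothetical
            tty.read(64)                                                      # expect "OK"
            tty.write(f'AT#HTTPSEND="{path}",{len(body)}\r'.encode("ascii"))  # hypothetical
            tty.read(64)                                                      # expect a data prompt
            tty.write(body)                 # the payload itself; no IP stack on our side
            return tty.read(4096).decode("ascii", errors="replace")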

None of that is actually what this article is about, but I want to make clear how profound the implications of the Smartmodem heritage are. In 1981, the Smartmodem was a standalone device controlled over serial because the limitations of the era's computers made that a practical necessity. Processors weren't fast enough to run the modem DSP alongside other workloads, certification requirements for telephone-connected devices were stricter, etc. Despite the late-'90s detour into "winmodems," most of those constraints still exist, just in the different form of the cellular network. Today's modems are less v.54 and more 5G, but they still act as standalone devices controlled over serial channels.

Most telephone modems of the 1980s were exclusively data modems. You could use AT commands to make a call, switch into data mode, and then you basically had a very long serial cable from your device to the computer on the other end of the call. That was all these modems did; their only interaction with "The Telephone System" besides as a pair of wires was for basic call control like detecting dial tone and sending DTMF dialing. That was quite natural considering their evolution from acoustic coupler modems (where you dialed the phone yourself and then set the handset on the modem), but by the late '80s, as devices like the Smartmodem with their own call control were common, it started to feel primitive. With Carterfone and the breakup of the AT&T monopoly, computers were starting to feel like first-class citizens on the telephone system. Shouldn't they have more complete support for, well, telephone things?

From a modern perspective, it might seem odd that fax came to modems before voice, but it makes technical sense. Fax machines use a digital protocol that is loosely derived from Bell 103 and belongs to the same extended family as other telephone modems, so modems already had the hardware. Implementing fax support was just a matter of software. With some extensions to the AT command set, your computer became a fax machine. By the late 1980s, fax support was common in modems, usually distinguished by marketing the modem as "data/fax."

For example, the command AT+FCLASS=1.0 changed the modem to T.31 fax mode (fax class 1.0). T.31/EIA-578 is a standard for sending and receiving faxes using a serial connection to a telephone modem, and it was widely implemented by commercial software packages. There were so many "PC fax" packages available in the 1990s that you could stock half an aisle of an OfficeMax with them, and indeed that's what happened. The legacy of this industry is that there are still dozens of "fax server" products built around data/fax modems, like the open source HylaFAX.

Fax modems also made a more general contribution to the modem state of the art: the concept of distinct modes. "Fax class 0" was data mode, while values like 1 and 2 and, oddly, 1.0 and 2.0 were used for different fax implementations. There was an obvious, and tantalizing, opportunity: more modes. Maybe, even, a modem mode for that most classic application of the telephone: voice. Could you use your computer for telephone calls?

The idea is obvious, so it's no surprise that several vendors were working on it all at once. Early efforts at telephone-on-computer could be quite comical, consisting of a telephone that was more or less glued to a computer, no electrical connectivity between them. The IBM Palm Top PC 110 is my favorite example of this form, a Japanese-market miniature laptop with a speaker and microphone on the front edge so that you could hold it up to your face to make a call. Besides amusement, it illustrates a fundamental challenge of merging computers with telephony: real-time media is hard.

It seems very funny to build a telephone into a computer because computers are general-purpose devices defined by software. Putting a phone in the computer should not mean physically putting a phone in the computer; the phone should obviously be a software application. Well, obviously from our modern perspective, but real-time media has always been difficult for computers (which, for architectural reasons, are mostly seen today as fundamentally asynchronous, non-real-time devices). Modern computers get away with it by brute force; they're just so fast that they can be wildly inefficient with media and still keep up to real-time. But things were different in the 1990s. Real-time audio processing was a fairly demanding application and most of the computer industry preferred to leave it to hardware.

Still, the voice modem was an inevitability. In 1991, the Los Angeles Times reported that at least three companies were working on some form of "modem with voice support" for 1992. They focused mainly on Rockwell International, which proved the right call. We don't remember Rockwell as a semiconductor company today, but in the 1990s they very much were—Rockwell Semiconductor later spun out into Conexant, now part of Synaptics. At the time, Rockwell was a major player in semiconductors, especially for communications.

Rockwell had particular expertise in answering telephones. During the 1970s, the Rockwell Galaxy Automatic Call Distributor just about invented the modern call center. It was the first digitally-controlled system that answered calls on a pool of telephone lines, placed them on hold, and distributed them to a pool of operators. The flexibility and efficiency of Rockwell's computer-controlled system, which was specifically designed to cut costs by presenting calls to operators as rapidly as possible, displaced AT&T's contact center systems (like turrets) and Rockwell had almost complete dominance in the new world of 1-800 customer call centers during the 1980s.

Rockwell did not manufacture complete modems, but instead chipsets that were integrated into modems by other manufacturers. That makes it a little tricky to figure out the first model that Rockwell shipped with voice support, but it was sometime in 1992. Rockwell's chips quickly led to a generation of "data/fax/voice" modems from all the usual manufacturers. Despite competition from other chipset vendors like Cirrus, Rockwell's voice modems became the Hayes of voice. By the mid-'90s, data/fax/voice modems were widespread and the majority either used a Rockwell chipset or a chipset that matched Rockwell's control protocol.

Let's talk a bit about that protocol, and how voice modems actually worked. Although the V.250 standard for many AT commands was in place, there was no standard for voice control. To oversimplify a bit, the "core" AT commands are generally AT followed by one or two letters. "AT+" came to be used as a prefix for standardized "extension" commands, like those added for fax support. Manufacturer-specific extension commands used AT followed by some other character, and Rockwell chose AT#.

To start, voice mode was presented as just another fax class. Specifically, Rockwell voice mode is fax class 8, so to put a modem into voice mode you sent AT#CLS=8 (it seems like the properly standardized AT+FCLASS=8 also worked on many later chipsets, but I'm not sure about the early examples). In most cases, you will also want to use the commands AT+VLS=?, which retrieves the modem's voice feature support, and AT+VSM=?, which returns the list of audio codecs supported by the modem.
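
In terms of the pyserial sketch from earlier (reusing that at() helper, and with the usual caveat that response formats vary a great deal from chipset to chipset), getting into voice mode and surveying the modem looks something like this:

    # Sketch: switch a voice-capable modem into voice mode and ask what it supports.
    # Reuses the at() helper from the earlier sketch; responses vary by chipset.
    port = "/dev/ttyACM0"
    print(at(port, "AT#CLS=8"))     # Rockwell-style: fax class 8 is voice mode
    print(at(port, "AT+FCLASS=8"))  # the standardized spelling, where accepted
    print(at(port, "AT+VLS=?"))     # list the modem's voice feature support
    print(at(port, "AT+VSM=?"))     # list the audio codecs the modem supports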

Once in voice mode, you can use a set of voice-specific AT# commands to dial an outgoing call or answer an incoming one. The modem provides messages about call state, so once the call is up, you can issue the command AT#VTX to begin transmitting voice. This puts the modem into a mode much like a data or fax connection, in which everything sent over the serial connection is interpreted as audio data to be played onto the telephone line.

This gets into one of the ugly details of voice modems. Early voice modems had no audio connection to the computer, only serial, so audio data had to be sent over the serial connection. In keeping with telephony conventions, 8-bit PCM was widely supported, but practically required a fast 115200 baud serial connection. That's basically the limit for RS-232 and it was desirable to provide lower-bandwidth options, which mostly appeared as a proprietary form of ADPCM (adaptive differential PCM).
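
To put numbers on that: telephone-quality audio is 8,000 samples per second, so 8-bit PCM works out to 64 kbit/s of payload. Sent as one byte per asynchronous serial frame (a start bit, eight data bits, and a stop bit), that's 80 kbit/s on the wire before any escaping overhead, too much for a 57,600 baud link but comfortable at 115,200.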

During transmission of voice data, the modem would indicate events ranging from hang-up to received DTMF digits using escape codes prefixed with DLE (ASCII 0x10). Similarly, the computer could send the modem an escape code to indicate the end of voice data. Things worked much the same in reverse: AT#VRX put the modem in recording mode, and the modem sent audio data back to the computer over the same serial connection. The computer would use an escape code to stop recording, and could also receive a number of escape codes indicating various state changes from the modem during recording.
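
From the computer's side, the framing is simple enough to sketch in a few lines. Take this as an illustration only: DLE doubling itself is the one rule I'd count on, while the specific event codes varied from chipset to chipset.

    # Sketch of DLE framing in a voice data stream: DLE doubled is a literal audio
    # byte, DLE followed by anything else is an event (a DTMF digit, end-of-data,
    # and so on). Event codes varied by chipset, so treat these as placeholders.
    DLE = 0x10

    def escape_for_vtx(pcm: bytes) -> bytes:
        """Double any DLE bytes that happen to occur in outgoing audio."""
        return pcm.replace(bytes([DLE]), bytes([DLE, DLE]))

    def parse_vrx(stream: bytes):
        """Split an incoming voice stream into audio bytes and DLE-prefixed events."""
        audio, events = bytearray(), []
        i = 0
        while i < len(stream):
            if stream[i] == DLE and i + 1 < len(stream):
                nxt = stream[i + 1]
                if nxt == DLE:
                    audio.append(DLE)   # escaped literal DLE: it's audio
                else:
                    events.append(nxt)  # some modem event, e.g. a DTMF digit
                i += 2
            else:
                audio.append(stream[i])
                i += 1
        return bytes(audio), events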

The design of voice modems is pretty simple, just about the minimum viable product for sending and receiving voice data over a conventional telephone modem, and that's probably why it stuck around. Voice support appeared in most Rockwell chipsets by 1995, and many other chipsets had a similar (but often not completely compatible) proprietary voice command set. Still, due to what I suspect may have been some ugly industry politics, it was not until 1998 that voice modem behavior was standardized by V.253. V.253 voice modem commands are mostly the same as the Rockwell proprietary scheme, but using the AT+ prefix. For example, AT+VRX and AT#VRX do the same thing in the V.253 and Rockwell command sets respectively, and in practice many post-V.253 modems seem to have just recognized both.

I suspect that Rockwell's interest in voice modems traces directly back to their contact center business, because the Rockwell voice command set provides more or less exactly what you need to build a computer-based interactive voice response (IVR) system... IVRs being a concept that Rockwell could fairly claim to have invented. And indeed, that was one of the main applications.

The voice modem almost immediately spawned a whole little industry of PC-based small office IVRs. As The Internet grew in popularity, many businesses already had a computer modem connected to their phone line for internet use, so why not let the computer answer the phone as well? Widespread addition of voice support to otherwise ordinary modems meant that some people already owned all the hardware necessary for an automatic attendant, and for those who didn't, the cost of telephone modems (with voice support!) was crashing in the mid-'90s.

The Internet Archive has preserved two such products from the early days of voice modems, Super Voice 2.0 from 1994 and Phone Secretary v1.01 of the same year. Phone Secretary (by Unique Software) has become quite obscure, and while there is still a company called Unique Software in the telephony market, I think they might be unrelated. Super Voice, on the other hand, made it to SuperVoice Pro 11 in 2016 and you can still buy a license today for just $99. Both of these products, and many more like them, had features like multi-mailbox voicemail and menu trees that used to require far more expensive equipment. Much of the original marketing for these products emphasizes the "Big Business" feeling of a phone tree, and Super Voice's description still says "Ideal for individuals, professionals and businesses that want to grow or appear bigger!"

Given how much people hate phone menus today, it's funny to think of them as a feature for your business's reputation, but small businesses are often trying to appear larger and having a menu of departments behind your single inbound phone line was a great way to do it. Besides, plain digital answering machines were very new in the '90s, so PC-based solutions promised more features at a similar price.

Voice modems got a big boost in 1996, when Microsoft introduced the Voice Modem Extensions for Windows 95. That package extended the Windows Telephony API (TAPI) with standardized support for voice modems. TAPI fully abstracted voice modems, an abstraction that proved surprisingly complex and quite important. While almost all voice modems used either the Rockwell commands or a very similar protocol, the actual details of audio handling were surprisingly varied. Poor handling of audio by the computers of the era meant that a lot of external voice modems had their own headset and microphone connections, and sometimes a built-in speakerphone, for voice use.

External modems almost universally used a single RS-232 interface that switched in and out of audio data mode, as described previously, but by 1996, internal (ISA or PCI) modems were starting to use a separate UART channel for payload, including audio codec data[2]. Even better, some manufacturers of internal modems were starting to integrate them with audio cards on a single expansion board. This allowed the modem to "cheat" off of the sound card's ADC and DAC components, and the OS's audio stack, reducing hardware cost with the downside that TAPI had to support a bunch of different cases for how the control and audio channels related to each other, and whether the audio channel was "Rockwell style" or "Soundblaster style." And note well that V.253 wasn't yet published, so while most voice modems were at least Rockwell-adjacent there were considerable inconsistencies from chipset to chipset when it came to the fine details of escape codes and behavior. A lot of software for voice modems from the era has noncommittal language about supporting "most" voice modems, and from 1996 forward that usually meant that they supported whatever voice modems Microsoft had special-cased in TAPI.

The latter type, the integrated modem/sound card, is particularly important, as the Creative Phone Blaster and Creative Modem Blaster were popular examples of consumer data/fax/voice modems. They were indeed Sound Blasters with modems added on, usually just stereo audio out, mono in, and a telephone jack. I am not clear on the difference between the Phone Blaster and Modem Blaster product lines, and I think those might just be older and newer names (respectively) for the same lineup. They are not special; lots of manufacturers had similar products, but they are good examples of just how strange consumer voice modems became.

The appeal of the voice modem to businesses is pretty obvious, since they were a far lower-cost option for business phone systems. Unlike many telephony products, though, computer modems of the 1990s had little differentiation between consumer and business or "enterprise" products. Many large modem banks ran on the exact same chipsets consumers used for dial-up connections, sometimes packaged in rack-mount units but sometimes just as a whole lot of external modems taped to shelves. That meant that consumer modems tended to have the features of business systems, including the infrastructure for an IVR system.

The way that this manifested in commerce was... odd. Take the Creative Modem Blaster DI5660, described on the box as a "56kbps internal data fax modem with voice." While not exactly bargain basement, this also wasn't special: it was a pretty standard modem for the late '90s and received revisions until at least 2000. In the manual, after some mind-numbing complexity around integration with other sound cards[3], and the customary 20 pages of screenshots of the driver installer, there's a feature matrix that promises:

Able to record and play voice messages over the telephone line.

Multiple mailboxes using included communications software.

High performance speakerphone.

So, the Creative Modem Blaster is not just a modem. It's not just a sound card. It's an answering machine. Wow! Nothing else is said about voice features. The install CD includes Cheyenne Bitware for sending and receiving faxes, but I'm not sure that this modem even had voicemail software on the CD. Earlier Creative modems do seem to have shipped with a software voicemail implementation, but only a very simple one integrated into the driver. The short AT command reference on the CD doesn't mention the voice-related commands at all, and at no point in the documentation is anything about IVRs or business applications mentioned.

This situation is really common with consumer modems. Find the manuals or driver CDs from 1990s modems (the Internet Archive has tons). Flip through them, and you will often find mention of multi-box voicemail as a feature and then absolutely nothing else about voice capability. "Answering machine" was the exclusive consumer application to such an extent that some of the manufacturers, including Creative, seem to have adopted the terms "voice mail modem" or "message modem" as replacements for "voice modem."

This narrowing of vision doesn't reflect any technical limitation; these modems all supported the full Rockwell (and later V.253) standards. You can find them on compatibility lists for business IVR packages, for example. Instead, it looks like voice mail was just the only consumer application anyone could come up with. The rest of the potential of the voice modem, for complete custom applications, was complicated and difficult to support, so it was easier to just not bring it up.

This is a general problem with more advanced features in modems. Telephone modems for internet access had plenty of downsides that manufacturers kept trying to solve, like hogging a phone line that they arguably still did not fully utilize. Consumers, on the other hand, seem to have simply not cared. Computers of the 1990s were relatively big, loud, costly, and unstable, so people didn't tend to leave them running 24/7. That severely cut into the practicality of a modem-based voicemail package, especially for consumers, for whom the value of a more sophisticated inbound calling experience didn't justify much extra computer maintenance.

Other inroads towards unified telephone/modem features met similar adoption challenges. Consider "modem-on-hold," a common feature of late-'90s modems that allowed for answering incoming voice calls while the modem was in use. It seemed to solve one of the top complaints about telephone modems, but in practice it was tricky enough to use that hardly anyone did. Efforts like ASVD and DSVD provided actual multiplexing of Simultaneous Voice and Data (SVD, in Analog and Digital forms), meaning you could talk on the phone and exchange data at the very same time, but again, hardly anyone did.

There are a few reasons these didn't succeed. One is compatibility; voice modems had a rough start with the long period between the Rockwell implementation and the actual V.253 standard, but at least most modems were fairly similar to Rockwell and the incompatibilities were isolated to the computer/modem interface where they were easier to handle. Protocols like SVD required support in the modems on both ends and were seldom compatible between vendors, which became an even bigger issue as independent BBSs gave way to ISP modem banks (which were fundamentally incompatible with the concepts behind SVD).

More time might have solved the compatibility problems, but consumer telephone modems were also proving to be a dead end. ISDN did not sweep away the modem by 1995 as the telcos had expected, but a few years later DSL did. DSL could do simultaneous voice and data much more seamlessly than even the most sophisticated modem SVD implementations, and with more bandwidth for both channels to boot.

To the modern computer user, the idea that a common telephone modem would casually incorporate a voicemail system is silly. Despite the obvious technical overlap, those are very different domains of consumer products. The window in which it made sense to combine them was fairly short, and the practical realities of the Windows 95 and 98 era severely limited the promised convenience. At least in the consumer realms, voice modems went pretty much nowhere.

Still, it is a technology with remarkable staying power. For example, you might assume that the late-1990s reshaping of the modem industry around thin "winmodems" (that eliminated most hardware in favor of software on the host CPU) wiped out voice support, but it didn't. By moving most of the logic to host software, winmodems actually tended to make voice features simpler to implement. V.253 voice support was quite common on winmodems.

In fact, it's still common on modems today. Consider the StarTech USB 2.0 v.92 modem, a go-to choice for anyone still in need of a telephone modem. StarTech bills it as a "Computer/Laptop Fax Modem" and "USB Data Modem," never quite invoking the traditional Data/Fax/Voice, but if you look at the feature list you'll find: "VOICE AND CALLER ID SUPPORT: Make and receive phone calls through your desktop / laptop computer." Despite the callout in the feature list, the spec sheet makes no mention of any voice command set. It does tell us that the modem is built around a Conexant CX93010-21Z (recall that Conexant is the descendant of Rockwell), and the Conexant reference manual for that chipset describes a voice command set that is "based on V.253."

This gets at one of the other fatal flaws of the voice modem. It's peculiar that a modem manual would specifically claim voice support and then list all of the supported data and fax modes, but not list any voice modes at all. Every single modem I've found after the mid-1990s is like this: they tell you that they have voice support, but absolutely nothing about it. You walk away with no idea how you would use a voice modem, or what a "voice modem" even is, outside of the one feature that the manufacturer may have deemed fit to ship on the CD, which is invariably voicemail.

I think this is the manifestation of an underlying problem: the compatibility and standardization problems with voice modems go beyond the fact that it took six years for Rockwell's command set to turn into an ITU standard. It doesn't appear that voice modems were ever that well standardized. Despite the wide range of voice-compatible modems on the '90s market, and the wide range of software for using them, almost all of those software packages come with a manual full of warnings, caveats, limitations, and workarounds. Despite V.253, different manufacturers had different opinions about the meanings of escape codes. Several common chipsets had outright bugs, which applications had to work around. As we found today, it was often the case that understanding voice modem compatibility required identifying the underlying chipset and getting the manufacturer's command set reference manual.

Modem manufacturers, the end-user brands, seldom made their own chipsets. They were at the mercy of companies like Conexant, who seemed uninterested in further evolution of voice features. No wonder that they found it easier to just not talk about it.

For how little it ultimately mattered, voice modems got a surprising amount of technical investment from the software world. I suppose that in a world before widespread DSL and VoIP, they seemed like a solution to two problems: the disconnect between computer and telephone operation on the phone line, and the desire to use computers to automate telephone call handling. In 1997, reviewer Mark Spiwak wrote for Windows Magazine:

Thanks to the Hayes Accura 288 Internal Fax Modem with Voice, small offices seem large to outside callers. This single expansion card from a trusted modem manufacturer equips a PC with a fast modem and fax capabilities, and provides speakerphone and voice-mail functions. The modem distinguishes among incoming voice, data and fax calls, and routes them accordingly. With the modem's speaker, you don't need a sound card (although it will work with one, and a cable is included). There's also a microphone in the box.

There are ups and downs in this review. First, consider the degree of complexity hidden in the modem "distinguishing" voice, data, and fax calls. That's a longstanding hard problem in telephony that engendered a lot of hacky solutions. I think that what Hayes is referring to here is actually more of a Windows/TAPI feature than a modem feature; as part of TAPI Microsoft shipped something called the Microsoft Operator Agent that would answer phone calls on a voice-capable modem, detect modem tones or prompt the user to select a call mode as an IVR menu, and then invoke the specified application to handle the rest of the call. There was a whole generalized, registry-based routing mechanism to allow multiple applications to accept phone calls based on caller intent. It feels like a rather sophisticated piece of operating system functionality to have almost completely withered away.

On the other hand, there's "a microphone in the box" and the promise that the modem will work with a sound card even if one isn't required. These are hints at the kind of complexity that a typical consumer would have found fairly impenetrable, the ugly tradeoffs involved in running a real-time media application on a Pentium system that would use AC'97 audio if you were lucky.

During the early '00s, Microsoft invested in fax as an OS feature in Windows, although I don't think it ever received that much use. It was already too late for voice modems, though, and Windows support for voice modems (and TAPI) generally declined after XP. On the Windows Server side, NT4's TAPI could provide a fairly complete IVR system with Windows alone, but the potential of COM-based TAPI business phone systems got wrapped up in Microsoft's Unified Communications craze and then died along with it.

Voice modems surely lasted longer in business applications, particularly since a number of business products like PABXs actually implemented V.253 as a bridge for TAPI applications. Quite a few voice-modem-based IVR products are still for sale, suggesting that they have enduring users. Of course, the more actively developed of them now all offer VoIP capabilities as well. Practically speaking, SIP has completely replaced the role of the voice modem, and it says something about voice modems that even with its many idiosyncrasies SIP is the simpler and better supported path.

And now, we come back to our modern era. In the course of researching voice modems, I ran into a 2017 blog post, The sad state of voice support in cellular modems. The author, Harald Welte, laments that USB-attached modems do not present a USB audio device, but "instead, what some modem vendors have implemented as an ugly hack is the transport of 8kHz 16bit PCM samples over one of the UARTs."

In the full light of history, we realize that what Welte describes as an "ugly hack" is, in fact, exactly the way it has always worked. In 1981, Hayes introduced the Smartmodem, a standalone device that abstracted every part of telephony into a serial channel and a simple command format. In 1992, Rockwell made the modem speak, by putting PCM audio over that same serial channel. In 2017, very little had changed—Welte's post has aged poorly only in that many modern phones have the modem more fully integrated into the SOC and use a completely proprietary interface instead of the traditional serial channel.

Of course, that's for smartphone manufacturers, who have the extensive engineering departments and close relationships with their chipmakers needed to implement such proprietary interfaces. For the rest of us, with our USB modems (now universally USB, even when packaged in M.2 form), the modem is still a black box with a serial port. The computer does not participate in a phone call; that's the modem's job. If you want to say something, you had better PCM-encode it and send it to the TTY.

  1. It is very common to refer to this component of a smartphone, the actual RF and cellular implementation, as the "baseband." Baseband is an overloaded term in telephony and the connection between the word's original meaning and its use to refer to a smartphone component is so indirect as to verge on rhyming slang, so I will simply call it the "cellular modem" for clarity.

  2. Mixing arbitrary communications data (the bearer channel) with control commands on a single serial channel created a heap of problems around escaping and mode switching that were never fully resolved. Consider, for example, that Rockwell modems used ASCII DLE as an escape character despite the fact that the 0x10 byte could, and would, appear in PCM audio data. That meant that any instance of 0x10 in the audio data had to be replaced with 0x10 0x10 (DLE escaping itself to produce a literal DLE), at the cost of some bandwidth overhead and implementation complexity. And that's simple and elegant compared to how the same problem was handled for data mode, with the infamous "+++" sequence. The result was that modem manufacturers started separating the bearer and control channels into two different UARTs as soon as it became feasible. I will save more depth on this for a future post but modern LTE modems, depending on configuration (and this is indeed configurable), may present as many as 7 separate UARTs for different purposes. You only need one of them, but using separate UARTs instead of switching modes on the primary one saves a whole lot of headache.

  3. Some sound cards in this era had a connector for a jumper wire to a telephone modem, so that the sound card could exchange audio with the modem. That allowed you to set up the same logical architecture as integrated modem/sound cards, but with two different cards. It's amusing that Creative provided such a connector with an integrated modem/sound card, but they explain that you might want to use it so that you can use your computer as a speakerphone without having to have a second set of speakers plugged into the Modem Blaster. Yes, in practice, things got pretty ugly. You could come up with an elegant setup but it would take some doing.

IrDA

Light: it's the radiation we can see. The communications potential of light is obvious, and indeed, many of the earliest forms of long-distance communication relied on it: signal fires, semaphore, heliographs. You could say that we still make extensive use of light for communications today, in the form of fiber optics. Early on, some fiber users (such as AT&T) even preferred the term "lightguide," a nice analogy to the long-distance waveguides that Bell Laboratories had experimented with.

The comparison between lightguide and waveguide illuminates (heh) an important dichotomy in radiation-based communication. We make wide use of radio frequency in both free-space applications ("radio" as we think of it) and confined applications (like cable television). We also make wide use of light in confined fiber optic systems. That leaves us to wonder about the less-considered fourth option: free-space optical (FSO) communications, the use of modulated light without a confined guide.

Well, if I had written this two or three years ago, free-space optical might have counted as quite obscure. The idea of using a modulated laser or LED light source for communications over a distance is actually quite old. Commercial products for Ethernet-over-laser have been available since the late 1990s and achieved multi-gigabit speeds by 2010. Motivated mostly by Strategic Defense Initiative and Ballistic Missile Defense Organization requirements for hardened communications within satellite constellations, experiments on a gigabit laser satellite-to-ground link were underway in 1998, although the system ultimately only provided satisfactory performance at a rate of around 300 Mbps. As it turns out, FSO computer networking is nearly as old as computer networking itself, with a 1973 experimental system briefly put into use at Xerox PARC.

Despite the fact that FSO systems have been generally available and even quite functional for decades, they remained a niche technology with very little public profile until the phenomenon of low-orbit communications constellations (namely Starlink) put the concept of inter-satellite laser communication into the spotlight. Despite various experimental satellite-to-satellite systems dating back to the early 2000s, and more or less clandestine military applications over the same period, the first real production system is probably the EU's EDRS, which went live in 2016. Starlink didn't really get the laser technology working until 2022. That's one of the interesting things about FSO: it seems intuitively like it should work, it does work, but it's a technology that has often sat dormant for many years at a time.

Well, thinking about satellites, we all know that space is hard. There are formidable technical challenges around aiming and detecting lasers in space, and the rate of iteration is slowed by the long timelines of aerospace projects. But what about down here on earth? Where everything is so much easier? Well, we got Li-Fi. Li-Fi is a largely stillborn technology, and not the topic of this article, so I will resist the urge to explain it too thoroughly. As the name suggests, it's intended to provide a capability similar to Wi-Fi (short-range networking between multiple devices) but using visible light. Despite various demonstrations of gigabit or faster systems, Li-Fi has next to zero commercial adoption, with most uptake in military applications. There's just something about the military and light, which we'll get to later. But here's what I want to discuss today: the golden age of FSO communications, a brief period where the cutting-edge technology behind the television remote control appeared to be the future of short-range computer networking.

During the 1980s, Hewlett-Packard manufactured scientific and graphing calculators with a feature set that increasingly overlapped with personal computers. This era can be hard to understand for people around my age or younger, who associate graphing calculators exclusively with the few Texas Instruments models blessed (and demanded) by common high school math textbooks. These are a holdover, a specter, of earlier years in which scientific and graphing calculators were serious technical instruments and some of the most sophisticated computing devices available to many of their owners. Features like BASIC programmability, still widespread in graphing calculators but increasingly divorced from actual applications (besides ignoring the math teacher and writing primitive CRPGs), used to be an important part of business computing.

Engineers would obtain (even buy!) calculator BASIC programs that automated common calculations. Life insurance salespeople might quote rates using a calculator BASIC program. Calculator manufacturers often sold ROM cartridges that added domain-specific functions, and these modules now represent the many applications of the programmable calculator: financial modeling, chemical engineering, statistics. Not that many years later, the whole field was virtually wiped out by portable computers, but not before calculators and computers underwent an awkward near-convergence (best exemplified by the TI-95, a calculator with computer characteristics, and the TI-99, a computer with calculator characteristics).

The point is that calculators might fairly be called the first practical portable computers, and people increasingly used calculators as part of business and engineering workflows. The challenge here is that computer applications tend to involve storage and processing of large numbers of records, a task that the small (and often volatile) memory of calculators didn't encourage. A bookkeeper might use a calculator to total the day's transactions; an engineer might use one to compute the capacity of a beam. Both of these tasks involve math, the forte of the calculator, but they also require documentation. The accountant and the engineer both need to record the results of their work for later review.

An interesting early approach by HP reflects the tradition of accounting: bookkeepers tended to use "adding machines" rather than calculators, a distinction that is mostly forgotten today but still apparent to anyone who buys an adding machine and finds it to be a little odd compared to a typical calculator. Besides the keypad layout (with an oversize dual-function +/= key), adding machines usually include a printer. Turn on the printer, total your transactions, and you now have a slip of paper that you can use to check your work, and even retain as part of your records.

These machines were big and bulky, though, and they still are today. What if you could have the convenience of a pocket scientific calculator and a printing adding machine in the same product family? Well, you could: by the end of the 1980s, many HP scientific and graphing calculators supported the 82240B accessory printer. It was even wireless.

HP calculators sent data to the 82240B printer using infrared light. For this purpose, HP developed a simple unidirectional protocol based on a UART hooked up to an LED (and, on the other end, a UART hooked up to a PIN diode). Called "RedEye," the calculator-printer application seems to have evolved over a short span into a more general-purpose, bidirectional protocol called HP SIR, for Serial InfraRed. As the name implies, SIR provided an interface very much like an RS-232 serial port, using a signaling scheme that was even fairly similar to RS-232, if you put the whole thing through an LED-to-photodiode step each way.

There were, naturally, a few adaptations to the nature of FSO communication. HP SIR was bidirectional but only half-duplex, because the realities of optics make it very difficult to build an IR transceiver whose receiving detector will not be completely blinded whenever the transmitting LED is active. Power was also a major concern: the infrared LEDs of the late '80s were not very efficient, and portable devices like calculators were expected to achieve a decent runtime on a few AAs. To cut down on power consumption, HP SIR replaced RS-232's bipolar non-return-to-zero signaling with a return-to-zero scheme in which the actual pulses (i.e. the periods when the LED is active) were much shorter than the bit interval, resulting in a low duty cycle. Since you can only really turn an LED on one way, HP SIR replaced bipolar signaling (e.g. positive for 1 and negative for 0) with a system in which the presence of a pulse in a bit interval indicated 0, and the absence of a pulse indicated 1.
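
Here's a sketch of that encoding. The 3/16-of-a-bit-interval pulse width is the nominal figure from the IrDA version of SIR, taken here as an assumption about HP's original scheme as well; the pulse-means-0, no-pulse-means-1 convention is the part described above.

    # Sketch of SIR-style return-to-zero encoding: within each bit interval, a
    # short LED pulse means 0 and no pulse means 1. The bit interval is modeled
    # as 16 sub-slots, with the pulse filling 3 of them (the nominal 3/16 figure
    # from IrDA SIR; an assumption as applied to HP's original RedEye/SIR).
    def sir_encode_byte(value: int) -> list:
        # asynchronous framing as in RS-232: start bit (0), 8 data bits LSB first, stop bit (1)
        bits = [0] + [(value >> i) & 1 for i in range(8)] + [1]
        slots = []
        for bit in bits:
            pulse = [1, 1, 1] if bit == 0 else [0, 0, 0]  # LED on only during a 0 bit
            slots.extend(pulse + [0] * 13)                # the rest of the interval is dark
        return slots

    # sir_encode_byte(0x41) yields a 160-slot on/off pattern for the letter "A"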

If you paid much attention in your college data communications course, you might wonder about clock recovery when there are a lot of 1s in a row (and thus no pulses at all). We'll get to that later. I had a very painful data communications class that I am trying to forget, and I haven't quite braced myself to discuss line coding yet.

HP SIR was extended to numerous applications, including what we would definitely recognize as portable computers today. HP's "palmtop" computers, like the x00LX series, were just about the size of pocket calculators but ran a full-on DOS. These were a transitional stage between early portables and later PDAs, but they introduced a need that would only become bigger in the PDA era: a quick, convenient way of transferring data between the portable computer and other devices. HP SIR was the perfect answer. Start some software, point the palmtop at the desktop, and press send... infrared provided a surprisingly cost-effective way to implement these local connections without the need for cables. HP didn't forget the printers—this was HP, after all—and palmtops could wirelessly print to select HP printers by "point and shoot."

HP wasn't the only company with a short-range infrared protocol. Japan's Sharp had developed a similar protocol, also for calculators, that might not be much remembered in the United States except for its adoption on Apple's Newton series of PDAs. The Newton's awkward sibling, General Magic's Magic Cap, had a similar (but of course incompatible) infrared capability called MagicBeam. The early '90s brought the PDA and the PDA brought an obvious demand for a consumer-friendly, short-range wireless network protocol... and that's why we ended up with at least a half dozen of them, virtually all infrared-based. Everything from the relative cost of components to the regulatory landscape meant that infrared was easier and cheaper to productize than radio frequency protocols, so infrared was the direction that almost everyone went.

While just about everyone in consumer electronics eventually got involved, it seems to have been Hewlett-Packard that stepped up to drive standardization. I don't have conclusive evidence, but I think there's a fairly obvious reason: HP was one of several companies making portable devices, an area where they weren't unsuccessful but never enjoyed total market dominance. Printers, though... printers were a different matter. HP enjoyed clear leadership in printers from the late '80s and perhaps to our present day. People might own a PDA from one of many brands, but when it came time to print, they'd be pointing that PDA at an HP product.

In 1993, Hewlett-Packard hosted an industry meeting that kicked off the Infrared Data Association, or IrDA. As a group of HP employees recounted the event, it was a smash hit: there were far more attendees than expected, representing more than fifty companies in both consumer and industrial electronics. Within a few years, IrDA's membership grew to 150 companies—including IBM, Microsoft, and Apple. Commercial adoption was similarly impressive: in the late 1990s, IrDA transceivers were a ubiquitous feature of phones, printers, and computers. You may have never used IrDA, but if you were old enough in the 1990s, you almost certainly owned devices with IrDA support.

During the early meetings of IrDA, various candidates were considered before, unsurprisingly, HP SIR was selected as the basis of the new industry-standard protocol. IrDA 1.0 is essentially a rebrand of HP SIR, and "SIR" persisted as the common name for IrDA, although the "S" was changed to "Standard" or, as later versions introduced higher speeds, "Slow." Slow it was, at least by modern standards. IrDA ran at 115 kbps, but for the then-typical purpose of replacing an RS-232 serial connection, 115 kbps was plenty (the same as the maximum speed supported by common serial controllers at the time).

Early versions of IrDA suffered from a lack of standardization. Adopting HP SIR as the basis for IrDA brought in an already mature technology, a simple RS-232 based signaling scheme that was even easy to implement with UARTs. But that low-level standard was basically all IrDA standardized. The application layer, and even error detection and reliable delivery, were left to implementers. As you can imagine, everyone did things slightly differently and interoperability fell apart.

This is an old story in technology standards: you align on the low level, and then the next level up becomes the problem. Reliable interoperability ends up requiring standardized application protocols and, more than likely, peer and service discovery protocols. Well, guess what the IrDA spent the mid-'90s on?

The full IrDA protocol stack, mostly created over the first few years of IrDA's existence, can be a little bit confusing because of the way that it reflects the history. The first IrDA standards were limited to the physical layer, initially SIR, and the IrLAP Link Access Protocol. Besides some basic link control functions, IrLAP handles discovery. When a device wishes to initiate IrDA communications, it repeatedly transmits a random 32-bit ID. The frame timing of the beacon establishes the baseline for a time-slot-based access control mechanism; other IrDA devices that detect the beacons select a random time-slot (in between beacon transmissions) in which to respond with their own ID. This discovery process happens at low speed with small packets, but it also includes capability information on the devices, used to negotiate the highest speed supported on both ends.
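
A toy model of that slot mechanism looks something like the following. The eight-slot count and the collision behavior here are illustrative simplifications; real IrLAP negotiates these details and repeats the process as needed.

    # Toy model of IrLAP discovery: the initiator opens a run of time slots, and
    # each listening device picks one at random to answer in with its 32-bit ID.
    # Two devices picking the same slot collide and go unheard this round. The
    # slot count and collision handling here are illustrative simplifications.
    import random

    def discover(responder_ids, slots=8):
        answers = {}
        for device_id in responder_ids:
            slot = random.randrange(slots)
            answers[slot] = None if slot in answers else device_id
        return [dev for dev in answers.values() if dev is not None]

    found = discover([random.getrandbits(32) for _ in range(3)])
    print([hex(dev) for dev in found])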

Once the IrLAP discovery process completes, two IrDA devices will know each other's IDs and have an agreed upon set of basic parameters for ongoing communications. You will note that I say two devices. One of the interesting things about FSO networking is that the nature of light, that most things are opaque to it, creates significant limitations. While IrDA always incorporated features to enable multi-point connections (such as the time-slot contention management procedure during discovery), the physical specifications for IrDA only accommodate connections of up to one meter with the two transceivers contained within a 30-degree-wide cone originating at each of the devices. In practice, IrDA devices had to be so close to each other, and so exactly oriented towards each other, that it was rarely practical for more than two devices to participate in an IrDA connection. This de facto limitation to point-to-point applications became solidified by later development on IrDA standards, which (for the most part) ignored the possibility of multi-point applications. HP application notes describe point-to-multi-point connections as possible but not yet implemented in the higher layers, and it pretty much stayed that way.

Let's take a look at the higher layers, because SIR and IrLAP alone did not specify enough functionality to meet real use-cases. Initially, IrLAP was used as a basic transport layer for various proprietary protocols, but IrDA quickly developed a few open standards that went higher up. IrLMP, the Link Management Protocol, is the most important.

IrLMP can be divided into two sub-layers, although they're less layers and more parallel features that operate at the same level. LM-IAS, Link Management Information Access Services, is a discovery protocol for high-level applications. LM-MUX, Link Management Multiplexing, does what it says on the tin: multiplexes multiple logical connections over a single IrLAP interface.

First, we'll discuss LM-MUX, because it will make LM-IAS clearer. By the time IrDA was in development, TCP/IP was gaining ground as the industry standard for computer networking, and IrDA employed similar ideas about layering and logical connections. LM-MUX specifies a simple frame header that includes the 32-bit addresses of the two IrDA devices and a seven-bit value known as a "selector" or LSAP-SEL. LSAP stands for Link Service Access Point, and LSAPs are the main abstraction for application connections over IrDA. The seven-bit selector is analogous to a TCP/IP port number, so we could compare an LSAP (consisting of a source address, destination address, and selector) to a TCP/IP address/port 4-tuple. If you're not familiar with this part of networking theory, it goes like this: in the world of TCP/IP, the combination of source and destination IP addresses and TCP port numbers uniquely identifies a TCP connection. Similarly, an IrDA connection is uniquely identified by the 3-tuple of source and destination addresses and selector. Like some of TCP/IP's competition, IrDA uses a single selector value for both sides of a connection. If you have read enough about networking to realize the implications of this fact, well, they are indeed true limitations of IrDA. In fact, as we will see later, there are some even odder limitations that emerge from the particular choices behind LSAPs.
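
To make the comparison concrete, here's roughly how the two connection keys differ; the addresses and selector values are made up for illustration.

    # What uniquely identifies a connection in each scheme (made-up values).
    # TCP: two connections from the same host to the same service can coexist,
    # because each gets its own ephemeral source port.
    tcp_conn_a = ("10.0.0.5", 49152, "10.0.0.9", 80)
    tcp_conn_b = ("10.0.0.5", 49153, "10.0.0.9", 80)  # distinct 4-tuple

    # IrDA/LM-MUX: (source address, destination address, LSAP selector). There is
    # no fourth field to vary, so a second connection to the same service would
    # have an identical key and could not be told apart from the first.
    irda_conn_a = (0x1A2B3C4D, 0x99887766, 0x2A)
    irda_conn_b = (0x1A2B3C4D, 0x99887766, 0x2A)
    assert irda_conn_a == irda_conn_b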

Seven bits is not really that many bits; it only allows for 128 values. This made it impractical to statically assign selectors to applications, and IrDA never tried. Instead, IrDA relies on a port-mapping approach in which selectors are arbitrarily assigned to applications as needed. The mechanics rely on LM-IAS, so let's explain that.

LM-IAS is the exception to the rule, using a statically assigned selector (0). LM-IAS is based on a data structure (called an "information base" because this was networking in the '90s) consisting of a set of objects. Each object has a "class," identified by name, and an arbitrary number of key-value pairs called attributes. Both class names and attribute names are arbitrary strings, but IrDA encouraged (and to some extent mandated) a colon-separated hierarchical format. For both class names and attribute names, values starting with "IrDA:" were standardized while vendors were free to adopt their own prefixes for internal use (HP papers use the example of "Hewlett-Packard:").

The most important parts of LM-IAS were the "Device" class, which provided basic information on the device itself including a human-readable name, and the list of other classes which represented the applications that the IrDA device supported. For example, the class "Email" described the capability to transfer email messages, and contained a standardized attribute "IrDA:IrLMP:InstanceName" which contained a human-readable name for the email capability (useful if the device, for whatever reason, exposed more than one object of the Email class—which IrDA fully supported).

So, let's go back to the discovery scenario and add in LM-IAS. You point one device at another, it sends beacons, the second device responds, and a connection is established at the IrLAP layer. Now, IrLMP kicks in: an exchange of LM-IAS data allows the first device to present the user with a list of discovered devices (including their names) and a list of applications supported by those devices.

The attributes on LM-IAS objects can be anything, but one was particularly important and, in fact, required by the standard: IrDA:IrLMP:LSapSel (I do not know why it is capitalized this way!). That attribute provided the LSAP selector to be used to communicate with that specific application.
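
Putting those pieces together, you can picture the information base as something like the structure below. This is only a sketch: the values are invented, and apart from the "IrDA:"-prefixed attribute names mentioned above, the field names are illustrative rather than taken from the standard.

    # Rough model of an LM-IAS information base: objects with a class name and
    # arbitrary key-value attributes. All of the values here are invented.
    information_base = [
        {
            "class": "Device",
            "attributes": {"DeviceName": "Example Palmtop"},  # human-readable name
        },
        {
            "class": "Email",
            "attributes": {
                "IrDA:IrLMP:InstanceName": "Mail transfer",
                "IrDA:IrLMP:LSapSel": 0x05,  # connect to selector 5 for this service
            },
        },
    ]

    def selector_for(base, class_name):
        """Look up which LSAP selector to dial for a given application class."""
        for obj in base:
            if obj["class"] == class_name:
                return obj["attributes"].get("IrDA:IrLMP:LSapSel")
        return None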

Well, that all sounds simple enough, but let's complicate things further. IrLMP went further than IrLAP alone, but still curiously lacked one of the properties we would expect: flow control. It's actually a little odder than that: IrLMP did provide basic flow control in an exclusive mode with only one logical connection, but when multiplexing was in use, it didn't attempt to address the many flow control problems that emerge with multiple connections (deadlocks, contention, etc.). We need another protocol!

At this point, you can tell that internet influence is becoming significant, as IrDA introduced the "Tiny Transport Protocol" or TinyTP. TinyTP is, as the name suggests, much more comparable to TCP in the internet stack. TinyTP added flow control (at the individual connection level) and a robust mechanism for segmenting large payloads, a problem that IrLMP left as an exercise for applications.

What makes TinyTP confusing is its relation to the other protocols. TinyTP is mostly a layer on top of IrLAP, but not quite. TinyTP relies on the exact same selectors as IrLMP (LSAPs), and it relies on LM-IAS for devices to negotiate which applications will run on which LSAPs. We can add a new standard attribute for LM-IAS objects: IrDA:TinyTP:LSapSel, which indicates that a service is available over TinyTP and the LSAP selector to use. The result is that TinyTP and IrLMP are two parallel alternatives, providing similar interfaces and running over the same lower layer, but TinyTP also relies on IrLMP for initial connection setup.
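
To see what per-connection flow control and segmentation buy you, here is a generic credit-style sketch. This is my illustration of the idea only, not TinyTP's actual wire format or API:

    # Generic sketch of credit-based flow control with segmentation, tracked
    # per connection. Illustrative only; not the real TinyTP protocol.
    MAX_SEGMENT = 1024  # bytes per segment, arbitrary for the example

    class FlowControlledConnection:
        def __init__(self, initial_credit):
            self.credit = initial_credit  # segments we may send right now
            self.queue = []               # segments waiting for credit

        def send(self, payload):
            # Segment large payloads instead of leaving that to the app.
            for i in range(0, len(payload), MAX_SEGMENT):
                self.queue.append(payload[i:i + MAX_SEGMENT])
            self._pump()

        def credit_granted(self, n):
            # The receiver grants more credit as it frees buffer space.
            self.credit += n
            self._pump()

        def _pump(self):
            # Because credit is tracked per connection, one stalled
            # connection can't starve or deadlock the others on the link.
            while self.credit and self.queue:
                segment = self.queue.pop(0)
                self.credit -= 1
                self._transmit(segment)

        def _transmit(self, segment):
            pass  # hand the segment to the multiplexer below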

Now, let's loop back to the implications of LSAPs. In the TCP/IP world, a connection is uniquely identified by the 4-tuple of addresses and ports. This means that one host can open two connections to the same service on another host, differentiated by their source ports (ephemeral ports, in TCP/IP parlance). With IrDA, connections are identified by a 3-tuple, which means that you actually can't do this. A given host can only have one connection to a given service on another host, because there's nothing in the headers to differentiate two connections with the same LSAP selector. This is mostly just of academic interest, since the protocols defined on top of IrDA were designed with this in mind, but it's always interesting to see these differences between network architectures. So, here's another: since IrLMP and TinyTP use the same LSAP selector format, and run over IrLAP with the same addresses, you cannot differentiate between IrLMP and TinyTP connections to the same service. Once again, not a big problem in practice because everyone knew about this limitation and would not attempt to connect to the same service with both protocols, but it's still interesting that you can't. If we compare IrLMP and TinyTP to UDP and TCP (an imprecise but still useful comparison), the contrast stands out: UDP and TCP can share a port number because the IP header's protocol field keeps their traffic apart, while IrDA has no equivalent field. In practice, IrDA applications addressed the problem by assigning different LSAP selectors to the same application for TinyTP and IrLMP, for those applications that supported both.

IrLMP and TinyTP provided a pretty robust capability that met most needs for basic network connections, especially since TinyTP was similar enough to TCP/IP in its semantics that TinyTP connections could be treated as Berkeley sockets. IrDA applications could thus be written a lot like internet applications, using some of the same libraries and techniques. Of course, mentioning this must make you wonder: TCP/IP over IrDA? Yes, that's an application!

But first, we'll revisit the physical layer, because IrDA received several revisions of its bottom layer. SIR, the 115.2 kbps mode taken directly from HP, gave way with IrDA 1.1 to MIR, the Medium speed Infrared physical protocol. MIR operated at up to 1.152 Mbps, and it's particularly interesting to me because it reflects one of the major trends of IrDA development. IrDA 1.0 was published in 1994, basically as a formalization of existing HP SIR devices (which generally became IrDA devices with just software changes). IrDA 1.1 was published in 1995, just a year later, but is far more reflective of IBM than HP.

In 1995, IBM's ThinkPad line of portable computers was massively successful and basically defined the "business laptop." ThinkPads sold like hotcakes and found widespread use in the kind of applications that used to have people reaching for programmable calculators... and many more. But wireless networking was still a problem, and Wi-Fi wouldn't achieve widespread adoption for several more years. IBM leaned heavily into IrDA, and for many years it was a feature of every ThinkPad model.

MIR was fairly similar to SIR in terms of line coding, but faster. To address clock synchronization problems at the higher speeds, MIR used a bit-stuffing scheme taken from HDLC, part of the ISO network stack that was directly derived from SDLC, which was the data link protocol for IBM's SNA network stack. So, just as TCP/IP took over, IrDA went the SNA/ISO path, at least in a small way.

Subsequent revisions of IrDA introduced FIR (Fast Infrared) at 4 Mbps (1998), and VFIR (Very Fast Infrared) at 16 Mbps in 2001. By the time IrDA got to 16 Mbps, the "basically RS-232 over an LED" scheme had been replaced with a more sophisticated non-return-to-zero run-length-limited line coding, much more similar to what radio protocols used (which were both more sophisticated and more common by 2001). The fact that radio modulation methods could be applied to IR meant that the sky was the limit, and subsequent work introduced a 96 Mbps protocol (UFIR) and a 0.5 or 1 Gbps protocol called GigaIR. Unfortunately, GigaIR came much too late in the limited lifespan of IrDA. As far as I can tell, it never made it to any widely available products. Even 16 Mbps VFIR is rare in practice, so for most purposes we can consider 4 Mbps the fastest speed achieved by IrDA.

Aided by the simple and internet-like interface of IrLMP and TinyTP, IrDA was applied to quite a few tasks. One of the more common application protocols used over IrLMP was IrCOMM. I suppose that IrCOMM stands for Infrared Communications, but that doesn't mean that much, does it? IrCOM might be better, because IrCOMM provided emulation of traditional serial and parallel ports over IrDA. It sounds silly to run a serial port emulation protocol over a network protocol over a data link protocol over a physical layer that was originally designed to emulate a serial port, but the scope of IrCOMM's support for physical serial ports goes beyond just providing a character-oriented communications channel. IrCOMM provides a set of control messages and behavioral standards to replicate the full feature set of RS-232, including the control signals. It also provides integrity and reliable delivery, so that traditional serial port applications will reliably work over the unreliable IR link. IrDA as a drop-in replacement for serial ports proved popular, especially in industrial diagnostics and programming. A number of early digital cameras also supported IrDA for transferring images to computers or printers (or, in Japan, sending images over select IrDA-equipped payphones, because Japan is like that)—and they used IrCOMM to convey the same vendor-specific protocols they had used over serial cables.
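
The practical upshot was that software didn't need to know IrDA was involved at all. On old Linux kernels with the in-tree IrDA stack, for instance, an IrCOMM connection surfaced as a tty device; the device path and the command bytes below are my recollection and assumption, not something you can try on a modern kernel:

    # Sketch: talking to an IrDA-attached device through the old Linux
    # ircomm-tty driver, exactly as if it were a serial cable.
    import serial  # pyserial

    port = serial.Serial("/dev/ircomm0", baudrate=9600, timeout=2)
    port.write(b"ID?\r\n")   # whatever vendor-specific command the device expects
    print(port.readline())
    port.close()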

Considering IrDA's history, it's no surprise that printing was another popular application. There were a few ways that an IrDA device could send a job to the printer (fragmentation remained a problem, to some extent, for the entire lifespan of the protocol), but the Cadillac option was IrJetSend. JetSend is a topic of its own, a surprisingly complicated and highly generalized protocol that could probably be used for just about anything. But HP developed it, so it was used for printers and scanners. IrJetSend let the printer describe complex user interfaces to the client, so you could print from your PDA with access to all of the capabilities of your workgroup printer. Living the dream.

My personal favorite IrDA application protocol is OBEX, and it's also the apex of IrDA's goal of interoperability. Many descriptions of OBEX compare it to HTTP, which is fair, and there are several tells (besides the timeline) that suggest that OBEX's designers had the world wide web on their minds. OBEX operates over TinyTP: after establishing a TinyTP connection to the LSAP advertised by the OBEX object in LM-IAS, an OBEX client sends a "CONNECT" message that includes a service name to indicate what the client wants to do.

Like HTTP, OBEX was designed for very general document-moving purposes and can thus be used for a lot of different purposes. If we stretch the HTTP analogy, we could say that the service name provided in the CONNECT message is a bit like the Content-Type header, as it tells the server what kind of thing the client intends to interact with. This analogy isn't great, because different OBEX services are likely to be handled by different applications. Well, let's just consider examples. One OBEX service is "file transfer," which is just for generic file copy operations. Others are IrMC, which performs offline PIM synchronization in vCard-family formats, and SyncML, which performs offline PIM synchronization in SyncML (an XML format). If you have no idea what I'm talking about when I say "offline PIM synchronization," well, I should probably write an article about it. In the meantime, look up "Microsoft ActiveSync" and try to imagine that it is 2003 and you own a PDA.

Once an OBEX connection is established, it proceeds using the verbs GET and PUT in basically the same way (and for the same purposes) as HTTP. You can get files, you can put files. The only real difference between the behavior of HTTP and OBEX here, besides that OBEX is vastly simpler, is that OBEX is stateful about the working directory (similar to FTP) so it has a SETPATH verb for changing the working directory.
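
Put together, a session has a recognizable HTTP-meets-FTP shape. The client class below is invented for illustration (real implementations like OpenOBEX have their own APIs); the point is just the flow of verbs and the stateful working directory:

    # Hypothetical OBEX client, invented to illustrate the verb flow; a real
    # implementation would exchange OBEX packets over a TinyTP connection.
    class ObexClient:
        def __init__(self, peer_addr, lsap_sel):
            self.peer_addr, self.lsap_sel = peer_addr, lsap_sel
            self.cwd = "/"
        def connect(self, target): print("CONNECT", target)   # names the service
        def setpath(self, path): self.cwd = path               # stateful, like FTP
        def get(self, name): print("GET", self.cwd, name)      # like HTTP GET
        def put(self, name, body): print("PUT", self.cwd, name)  # like HTTP PUT
        def disconnect(self): print("DISCONNECT")

    client = ObexClient(peer_addr=0x5B23, lsap_sel=5)
    client.connect(target="File Transfer")
    client.setpath("DCIM")
    client.get("IMG_0042.JPG")
    client.put("business_card.vcf", b"BEGIN:VCARD...")
    client.disconnect()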

So I guess OBEX is more like FTP? But it has headers like HTTP and HTTP verbs. I don't know, pick your favorite comparison.

Here's where all this matters: one of the showcase applications of IrDA is that you could send things between phones, like AirDrop or something. You flip open your phone (because this is an era in which it does indeed flip open), push a couple of buttons, point it at your new acquaintance's phone, and it is as if you provided a business card. You aren't just using infrared networking, you are Networking using infrared. Similarly, you can send files, but considering IrDA speeds you probably want to keep them slim. Fortunately the most popular use-case here was sharing photos, and any camera on a phone with IrDA was probably just getting into the megapixels.

In the early days of IrDA, this didn't actually work reliably, because of different application-layer implementations (such as IrMC vs. SyncML). When it did start working reliably, it was because your respective Nokia phones were exchanging vCard files using OBEX over TinyTP over IrLAP over FIR, the IrDA stack in its full glory.

Wait—stop—hold on. When I say that you can, you know, send a file to your friend's phone... maybe your favorite MP3... what would you call that? If you are the same kind of person as me, you will say "squirting," of course. You squirt a file to someone.

I want to squirt you a picture of my kids. You want to squirt me back a video of your vacation. That's a software experience.

Steve Ballmer said that, in 2006, about a feature of the Zune. Over a direct device-to-device Wi-Fi link, two Zunes could send files to each other. It's a little odd that Ballmer doesn't mention using this to send music, but keep in mind that Microsoft had to tread very carefully to avoid the attention of the RIAA. The point is that it was a Wi-Fi version of IrDA's OBEX file sharing, and Microsoft was widely mocked for calling it "squirting." The funny thing is that I think this is a little overplayed: the Zune didn't actually call it "squirting" in the UI, and I don't think that term was even used in documentation. But Steve Ballmer used it, as did other Microsoft employees, so it seems to have been the term they used internally.

The other funny thing is that they didn't come up with it.

Squirting was already the accepted term for this feature by the time the Zune came around. It's generally assumed that one-shot file transfers came to be called "squirts" because "squirt" is used in a similar way for brief radio signals (back to at least WWII). I can't promise you that I have found when this happened, but I have a good contender: HP's CoolTown research project, which ran from 1999 to 2000 or so and developed an IrDA protocol they called "e-Squirt." By 2000, papers about IrDA referred to file sharing as squirting. I have no doubt that the Zune team would have picked up the term from there, given Microsoft's extensive involvement in IrDA.




Well, I didn't expect the tangent about squirting, but we had to do it. Back to our main program: whatever happened to IrDA?

It's frustrating, because IrDA had a lot of potential. The work on GigaIR, even though unrealized, showed that IrDA doesn't have to be slow. IBM invested a lot of time and verbiage in IrDA AIR, or Advanced Infrared, which circled back on the whole "point to multi-point is possible but unimplemented" thing. AIR was an overhaul of the whole IrDA stack that made networks of three or more devices possible although, as it turned out, not necessarily practical. AIR was supported by a lot of IBM products but never really used for anything, because by that point devices were getting Bluetooth and Wi-Fi.

That's about it: IrDA just got wiped out by Bluetooth and Wi-Fi. Wi-Fi meant that "thick" mobile devices like PDAs would just connect to a network, wiping out the whole synchronization scenario. For more transient purposes like file sharing, Bluetooth promised more convenience, even if I'm not sure it delivered. IrDA did require that the two devices be in direct sight of each other, which imposed design constraints on mobile devices and implied that you would hold them pointed at each other the whole time.

By the mid-2000s, things just got worse. Integration of Bluetooth and Wi-Fi into combined radio modules meant that Bluetooth came "free" in a lot of mobile devices, which is even cheaper than the already-cheap IrDA components. IrDA dealt well with most indoor lighting, but not with direct sunlight, and smartphones probably made that scenario pretty frequent.

IrDA published IrSimple, a comprehensive set of improvements to IrDA, in 2005. It was mostly a Japanese effort because, Japan being the way it is, IrDA was more widely used there. Infrared was already disappearing from mobile devices when the iPhone killed it off entirely. Without mobile devices, IrDA was a solution without a problem. The technology, the standard, and the Infrared Data Association itself all faded into obscurity.

This isn't to say that IrDA is gone. It's probably actually pretty widely used, as far as short-range wireless protocols go. There are several common embedded applications of IrDA and IrDA-derived protocols, ranging from power meters to laundry machine diagnostics. These applications benefit from IrDA's low cost: modern documents on IrDA often call out that it can be implemented with bit-banged GPIO or unused UARTs and very few additional components. They also value that IrDA is easily compatible with waterproofing and that it doesn't provoke the regulatory requirements associated with RF.

There's another upside to IrDA, as well: security. IrDA isn't quite visible light, but near infrared behaves similarly: most materials that block visible light block it as well. If you can seal a room against light, you can seal a room against IrDA—and that's a lot easier than reliably blocking RF. There are probably some enduring applications of IrDA because it's permitted in some areas where RF communications are not, due to concerns about eavesdropping or malicious interference.

Okay, you've made it this far, and I did not expect this article to be this long. Let's have a little dessert. As we've seen with Li-Fi, there's interest in FSO communications as a way to connect portable devices to IP networks. IrDA sure has a lot of the ingredients, it just wasn't built for IP.

Well, don't worry, there was a solution to that: IrLAN. IrLAN implemented IP over IrDA with a surprisingly Wi-Fi-like architecture, including support for multiple clients. That part is particularly interesting: IrLAN supports an "access point" mode with one AP communicating with multiple clients. You could, in theory, use it as a direct alternative to Wi-Fi. It seems like HP even built a device for this, the HP NetBeamer, although it is so obscure that I suspect that it only ever existed as a prototype.

But, remember, point-to-multi-point was unimplemented in IrDA. Well, except for AIR, which saw little adoption. How to square the circle?

It's strange, I'm honestly a little confused by it, but the original IrLAN specification has this odd sentence buried in a definition in the glossary:

IrLMP is multi-point-capable even though IrLAP is not. When IrLAP becomes multipoint-capable, multiple machines will be able to communicate concurrently over an infrared link.

Well, it's true that nothing about IrLMP really prevented point-to-multipoint. But just saying that IrLAP didn't support it rather understates the problem: AIR had to make changes to the physical layer to get feasible multipoint support (and AIR had a bad reputation for performance as a result). Later in the specification, we read that "it is quite reasonable to expect future implementation of access point devices to support multiple concurrent clients connecting to the LAN."

So, it turns out, the authors of the IrLAN spec defined support for multiple clients on the assumption that it would become possible, through some effort like IBM's AIR... and then that just didn't happen.

Honestly, IrLAN doesn't seem to have gone anywhere. Even at the time, many reportedly thought it was a bad idea. Much better to just use PPP, designed for a serial channel with the exact behavior provided by IrCOMM. It's nice to know that at least the protocol side of an IrDA-based alternate Li-Fi was developed. I hope to one day find an HP NetBeamer. I want to pick up some IrDA devices and experiment with OBEX. We could send some files around at 4 Mbps. With a quiet IR background, later-generation IrDA transceivers apparently worked over impressive ranges. AIR targeted five to ten meters. We could build a LAN. We could squirt. That's a software experience.

telecheck and tyms past

Years ago, when I was in college, I had one of those friends who never quite had it together. You know the type; I'm talking lost a debit card and took three months to get a new one because of some sort of "mixup" with the credit union that I think consisted mostly of not calling them for three months. In the meantime, our mutual friend ended up in a quandary: at WalMart, at one in the morning, with a $2 purchase and no cash. Well, this was no problem for that particular space case: he had his checkbook.

If you think about it, it's actually pretty remarkable that grocery stores accept personal checks. It's a very high-risk form of payment. Even if the check is genuine, the customer could be writing it against an empty account. On top of that, with modern printers and the declining use of MICR, forging checks is trivial. When you offer a check, the retailer has very little to go on to decide whether or not you're good for the money. Surely, fraud must get out of hand—and yet, just about every major grocer still accepts personal checks.

Retail point-of-sale acceptance of personal checks is the product of an intriguing industry that handles all the challenges of checks at once: a combination of digital payment network, credit reporting firm, insurer, and debt collector known as a check guarantee service. The check guarantee is older than the ATM, and depending on how you squint, check guarantees are quite possibly the first form of real-time, telecommunications-based point of sale payment processing.


Harry M. Flagg was born in Frankfurt in 1935, but spent most of his childhood in Milwaukee, Wisconsin. He attended MIT, major unknown, and graduated in 1957. I think he was probably an ROTC student, because some sort of Navy service took him from Massachusetts to Hawaii, where just a few years later he was out of the Navy and working as some sort of "management consultant." Flagg was entrepreneurial to his core, so while I know few details about his consulting work, it is unsurprising given the wide variety of business ventures he was soon involved in. We can be fairly confident, though, that his clients included retailers—retailers who struggled with personal checks. In 1964 1, Flagg quit consulting to focus on checks alone.

His idea was straightforward: keep a list of people known to pass bad checks. When a retailer is given a check, they just check the list, and if the writer's name appears they should turn the check away. As legend has it, Flagg took the idea to a Boy Scout meeting where he happened to describe it to a crowd of Honolulu business leaders, one that presumably included his soon-to-be cofounder Bob Baer. They agreed on an informal arrangement: Honolulu businesses would report writers of bad checks to Flagg's consulting office, where his small staff would look up names on request. It was such a success that Flagg's staff were soon overwhelmed. Tracking the writers of bad checks became Flagg's full time business.

TeleCheck newspaper ad

He christened his new venture TeleCheck—Tele, perhaps, for Telephone, or Telecommunications. Whether it was his MIT education or his Navy experience, something had introduced Flagg to the potential of the computer. Having seen his busy office staff taking calls and digging through files, he imagined TeleCheck as a centralized, real-time computer system. By the time he announced the new company, an IBM system was already on order. General manager George Duncan set about designing and testing the process, and somewhere along the way they picked up the engineering talent to build a database for questionable checks.

As explained in TeleCheck's ads, accepting checks required only a phone call. Once connected to a TeleCheck operator, customers curtly said their TeleCheck account number followed by the driver's license number of the person who had written them a check. By the time TeleCheck matured, they settled on a system of three possible results: a "code 1" indicated a low-risk check that TeleCheck would guarantee. A "code 3" meant that TeleCheck didn't have any specific evidence against the check writer, but the value of the check or other risk patterns meant that TeleCheck was not willing to guarantee it. Worst of all was "code 4," telling the retailer that TeleCheck would absolutely not guarantee the check, because the writer already owed them money.
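
As I understand that description, the decision amounts to something like the sketch below. The thresholds and the shape of the database record are entirely my invention; TeleCheck's actual rules were never public:

    # Illustrative sketch of the three response codes described above.
    def telecheck_response(license_number, amount, bad_check_db, limit):
        record = bad_check_db.get(license_number)
        if record is not None:
            return 4   # writer already owes TeleCheck: absolutely no guarantee
        if amount > limit:
            return 3   # nothing against the writer, but too risky to guarantee
        return 1       # low risk: TeleCheck guarantees the check

    print(telecheck_response("H123456", 2.00, {}, limit=50.00))  # -> 1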

A 1964 newspaper photo shows TeleCheck operator Dorothy Nicholson sitting at the console of an IBM 1440, Harry Flagg looking on from the side. She's answering the phone with her left hand, right hand poised on the keyboard of the teletypewriter. This is probably a staged shot for the newspaper; I sincerely hope that they found Dorothy a headset (admittedly a surprisingly expensive proposition in the 1960s). It also contradicts claims in other newspaper articles that TeleCheck didn't go into operation until 1965, but I think that there was an extended "trial" phase before the service was generally available. I pay attention to these details because they tell us something about the company's early days. Flagg brought on quite a few business partners, so many that I struggle to keep track of them, and I assume that the computer was much of the reason. They were probably renting it, but the rent would apparently have been at least $1,500 a month, equivalent to about ten times that today. TeleCheck had capital. I assume that many of their early customers, taken from that Honolulu Boy Scout meeting, were investors as well.

Into the IBM 1440, TeleCheck combined several data sources: they formed a partnership with the Honolulu Police, from which they received copies of police reports on bad checks. They invited banks to submit records of bad checks they'd received. This information formed what Baer called a "positive" credit file. Instead of collecting data on all consumers, TeleCheck collected data on only the writers of bad checks. This distinction doesn't seem particularly interesting today, but TeleCheck really leaned into it, perhaps because consumer credit bureaus were both a growing business and a growing source of controversy in the mid-1960s. It probably served TeleCheck's interests to maintain some space between themselves and proto-Equifax organizations like the Hawaii Credit Bureau.

You might wonder about the business model; one of the advantages of checks is that they are relatively cheap to process. TeleCheck charged businesses a fee, at least initially set at 2%, but that wasn't just for the risk database. For the merchants, TeleCheck actually had a much more compelling offer than tracking check frauds. TeleCheck would guarantee each check they approved. If a merchant accepted a check on TeleCheck's advice, and the check bounced, TeleCheck would reimburse the merchant. In exchange, TeleCheck asked for the bad check to be endorsed over to them.

Eating the cost of these bad checks could have been rough on TeleCheck's books, but they had their reasons. First, the reimbursements gave their customers a clear incentive to submit every bad check to TeleCheck. While TeleCheck marketing emphasized police and bank sources, it's clear that the primary source of their data was always their own customers.

You might realize that the guarantee service could create a new kind of fraud: a business might fabricate a bad check, or even knowingly accept one, and then let TeleCheck reimburse it. TeleCheck's insurance scheme was closely coupled to their credit bureau scheme. In other words, TeleCheck was able to control their risk on reimbursing bounced checks by making the decision of whether or not to accept the check at all. For businesses to claim reimbursement on a check, they had to prove that TeleCheck had agreed to guarantee it. They did that with an authorization number, a four-digit code provided over the phone that the cashier wrote on the back of the check before it was deposited.

Second, it was a business of its own: TeleCheck was a debt collector. And not just any debt collector, but one with the leverage of control over check acceptance at hundreds of businesses. TeleCheck presented this as a simple arrangement that does seem quite charming compared to the modern credit reporting industry: you could pass one bad check on TeleCheck's dime, but only one. Your identification remained in TeleCheck's database of unacceptable risks until you contacted them and made good on the original bounce. In other words, rip off TeleCheck and you'll never pay by check in this town again.

When businesses rejected checks, due to a negative TeleCheck response, they were instructed to provide the customer a "courtesy card" with an explanation of TeleCheck's operation, ways to contact them, and a reference number for the database entry that led to the decline.


One of the interesting things about TeleCheck is its place in the history of check guarantee and its rapid growth. TeleCheck was not the first check guarantee service; Flagg personally knew of at least one other in New York City. For that reason, and likely others, TeleCheck had been given legal advice that they could not protect their business model by patents or other means. Flagg told the Honolulu Star-Advertiser that "this means that we have to expand just as fast as possible before others get the same idea." And expand they did.

TeleCheck had barely started commercial operations, perhaps not started them at all, when they renamed from TeleCheck to TeleCheck International, signaling ambitions far beyond Hawaii. Existing operations were moved to TeleCheck Hawaii, a subsidiary, which would soon be joined by TeleCheck New York.

Today, checks seem an odd way to pay at retail because of the ubiquity and stronger security guarantees of debit and credit cards. In the 1960s, though, card payments were not widely available—if you weren't carrying cash, you paid by check. Checks were particularly problematic in the case of travelers, and that explains TeleCheck's Hawaiian origins. Checks are much easier to confidently accept when the bank, or even better the writer, is known to the merchant. That usually meant that personal checks had to be drawn on a local bank, at the very minimum, and initially even TeleCheck only guaranteed checks from Hawaiian banks.

But what, then, of tourists? Merchants would sometimes accept out-of-town checks with additional identification measures that ranged from copying down a driver's license to taking a photograph and thumbprint. Most just didn't, expecting visitors to obtain "travelers checks" issued by a well-known national bank, usually to be cashed at another of that bank's own branches. Besides the inconvenience to the traveler, tourism economies like Honolulu's must have acutely felt the unwillingness of visitors to spend money when it involved multiple preparatory steps.

TeleCheck knew this going in, so expansion was inevitable. TeleCheck Hawaii and TeleCheck New York were set up as independent operations with their own computers and databases, but they were connected: bad check records from each were automatically transmitted to the other. As TeleCheck expanded, data sharing between regional operations built a distributed nationwide database, one that allowed a merchant in, say, Honolulu to accept a personal check from New York under full guarantee. If you asked Flagg, or Baer, all that was needed for complete nationwide acceptance of personal checks was a TeleCheck computer in every major city. Within a few years, TeleCheck International reorganized as TeleCheck Services, a franchise corporation. They started recruiting franchises in every state and, within a decade, Canada.

TeleCheck grew extremely rapidly, the kind of growth we might call a "unicorn" today. In 1969, TeleCheck estimated that their seven full-time operators took about 70,000 calls a month, 100,000 in the holiday shopping season. They guaranteed $6.75 million in purchases each month and paid out over a million a year in bad check reimbursements.

Check guarantees weren't everything, though. TeleCheck also diversified, expanding into just about every business they could think of until it started to seem comical. The same year that TeleCheck started, they acquired a company called Professional Services Inc. that did something we would now recognize as medical billing. It only took a couple of years for TeleCheck to dominate the Hawaiian medical and dental billing industry. In the words of one journalist, TeleCheck was hooked on computers, and hunted for any opportunity to make money off of the IBM 1440 and the larger machines that soon joined it.

Consider, for example, Match-Mate: the premier computerized matchmaking service of 1960s Hawaii. Lonely islanders filled out a questionnaire, conveniently distributed as a newspaper ad, and mailed it in with a payment of $7.00—by check, of course. The questionnaires were entered into TeleCheck computers and participants received a report with two likely matches and a booklet entitled Dining in the Islands. Considering that the book alone would "retail for $5.00 and has a full purchase price of $75.00" it was quite the deal for love. Well, maybe not, Match-Mate didn't last for long. It's interesting though that, alongside the address of TeleCheck International, the newspaper ads mentioned a CDC 3100.

Match-Mate questionnaire

The IBM 1440 was something of a budget computer, intended as a lower-end alternative to the "flagship" IBM 1401 mainframe for the many small businesses and accounting firms that couldn't afford a 1401. Within a couple of years, TeleCheck appeared in a directory of computer services providers as a consulting, accounting, payroll, etc. data processing firm with a Honeywell 200, a semi-clone of the IBM 1401 that could mostly run the same software, so they had apparently upgraded. Then, in 1966, the Match-Mate ads associated TeleCheck with the CDC 3100. The 3100 was the runt of the CDC 3000 family but still ran about $120,000, over a million dollars today. Once again, all of the machines were likely rented, but still... in its first two years, TeleCheck acquired more computers than most established businesses would over five.

Some of TeleCheck's side ventures were quite logical. They had accounting and payroll businesses, which naturally fit the transaction processing skillset they had built for check guarantees. Consumer credit cards emerged during the 1960s, and TeleCheck was enthusiastic about those too. I don't think the scale of this operation was ever that large, but TeleCheck apparently handled online verification for some retailer credit cards in Hawaii, quite possibly by treating them as a special kind of check (a trick that TeleCheck would use repeatedly over the years to offer new services over existing equipment). TeleCheck's software situation seemed sophisticated for the 1960s, once again leading me to suspect that Flagg had done some kind of engineering at MIT and in the Navy. Newspaper articles describe real-time multitasking between batch processing of medical billing folios and online check inquiries. Within a couple of years, they threw some sort of telephone order business and construction supply catalog into the mix.

But TeleCheck didn't keep to its computer roots. By the end of the 1960s, TeleCheck owned Honolulu Business College and Cannon's College of Commerce. They owned Minneapolis-based Boatel, manufacturer of houseboats and snowmobiles (Flagg seems to have moved to Minneapolis at some point along the way). TeleCheck's Marine Science Division, made up mostly of subsidiary Pacific Submersibles, operated a Perry PC5C research submarine for which they were building a custom robotic manipulator. In 1968, TeleCheck Hawaii announced a complete rebranding to Data-Pac, a name that would better reflect their diversified interests. The franchise parent, TeleCheck International, kept the TeleCheck name.

In 1972, TeleCheck International went bankrupt.


The story of Harry Flagg is a complicated one, and I do not think that I have all of the information. There are just certain, you know, oddities. In the mid-'60s, Flagg was repeatedly lauded as the founder of TeleCheck. Robert Baer seems to have been around from the beginning, but it's not until later on, in the '70s and '80s, that he is widely referred to as TeleCheck's founder. Flagg is conspicuously absent from these versions of the company's history.

Similarly, the 1972 bankruptcy, triggered when the parent of a company TeleCheck had acquired called the loans it had granted to facilitate the acquisition, left little paper trail. Well, it was a bankruptcy, so there's a voluminous docket of legal and financial details, but TeleCheck's leadership and their thinking are now opaque. We do know this: after the 1972 bankruptcy, Flagg was no longer in charge of TeleCheck International.

In 2005, a court ruled for the FTC in its case against Trek Alliance and its founders, including Harry Flagg and his son Kale Flagg. Flagg had moved to Arizona and founded Trek Alliance in 1997, a vague company with a confusing set of subsidiaries that included some sort of sales training. Primarily, though, Trek—which is unrelated to the better-known bicycle manufacturer—was a pyramid scheme. At least, that's what the court ruled. According to the FTC's complaint, Trek's "Independent Business Owners" sold water filters, cleaning products, nutritional supplements, and beauty aids. Trek's compensation plan, "one of the most lucrative in the industry," included a series of Bonuses and a 22-level Pay Plan assigned according to dollar volume of an Independent Business Owner's "downline."

This is, of course, the gold standard of business excellence in modern Utah, but Flagg was in Arizona and the 2000s. He and the other parties settled, denying fault but agreeing to shut down Trek. Together with their insurance company, they paid millions in restitution and suffered a permanent injunction against involvement in any multi-level marketing schemes.

Honestly, I'm not inclined to view Flagg as a fraudster, although it would be deliciously ironic considering where he started. I think he was just a little too ambitious and not quite cautious enough, entangling himself in everything that sounded like a good idea until it was just too many things to keep up with. Still, how remarkable it is that the creator of the nation's most successful anti-check-fraud scheme would become separated from it, only to later be caught cashing checks from the top of a pyramid scheme. Now that's vision.


Despite TeleCheck's over-expansion and leadership troubles, the company was unstoppable. Baer became president of TeleCheck Hawaii while his son, Jeffery Baer, moved to Denver and established the headquarters of TeleCheck Services Inc., the new parent company of the TeleCheck franchise system. Those franchises multiplied: here in the Southwest, TeleCheck Texas was founded in 1982 and rebranded to TeleCheck Southwest two years later, when it bought the franchise rights for New Mexico, Oklahoma, and Arkansas. TeleCheck Texas had processed $325 million in checks in 1983. Elsewhere, there were TeleCheck franchises operating in more than half of the US states, several Canadian provinces, and Hong Kong.

Along with expansion came technical improvements. TeleCheck Hawaii, perhaps because of its origin as the first TeleCheck franchise, was always independently minded and eventually left the franchise system to go it alone as Uni-Check. Before they left, they introduced the first interactive voice response (IVR) check verification system. The fully-automated, touch-tone based IVR system expanded to other TeleCheck franchises, who named the automated voice "Samantha." TeleCheck liked it so much that they bought the developer, a company called Real-Share.

TeleCheck had other ideas, as well. Perhaps inspired by Ma Bell's "dataphone" efforts, TeleCheck launched the first point of sale electronic payment terminal I know of, if you are generous about the definition 2. The first-generation TeleCheck Terminal, introduced in 1980, was a magnetic stripe reader grafted onto the front of a standard pyramid phone; it read cards and sent their contents over the phone line. Instead of typing the digits from a check and driver's license on their phone keypads, merchants could call into TeleCheck and just swipe a card. Of course, this only worked for cards (drivers licenses and credit cards) that TeleCheck processed, but it was a hit nonetheless.

Nationwide expansion of the TeleCheck service necessarily entailed nationwide expansion of the TeleCheck network. With each franchise operating independent computers that shared records with the others, TeleCheck was a sophisticated, networked operation by the standards of the time. As it turned out, they had help, from one of the most advanced computer networking companies. In 1980, TeleCheck Services was acquired by Tymshare.

Tymshare is, to me, one of the signature brands of the era of Business Computing. As the name suggests, twee spelling and all 3, Tymshare started out as a pay-by-the-minute time-sharing provider. Founded in the same year as TeleCheck by two ex-General Electric Computer employees, Tymshare grew out of UC Berkeley by selling time on an SDS 940 computer (initially borrowed from UC, later rented themselves) running the Berkeley Timesharing System. The combination of the 940 computer (itself mostly designed by UC Berkeley), the BTS operating system, and the dial-in timesharing model made Tymshare one of the most affordable routes to serious computing. The company was tremendously successful, but time-sharing was a short lived industry. Computers were getting faster, smaller, and cheaper every year, so the set of customers that needed a computer but could not afford their own got smaller and smaller. Tymshare probably saw that coming, because like early TeleCheck, they aggressively diversified into just about anything that a computer could do—including check guarantees.

As Tymshare grew, they purchased more computers, and larger. In the late '60s, they were operating dozens of machines running a largely custom operating system. They had economized on many of their acquisitions by running the acquired software on their time-sharing fleet, which was efficient but challenging for applications like TeleCheck that were designed around a central computer (for each franchise). Tymshare's business wasn't as simple as connecting a user to their nearest computer; they needed to accept dial-in calls from around the nation and then connect them to whichever computer ran the requested application, potentially on the other side of the country.

The solution, remarkably prescient of later wide-area networks, was a cutting-edge architecture of Varian 620 minicomputers that controlled banks of telephone modems and forwarded traffic to a set of SDS 940 computers called "supervisors." The role of the "supervisor" was very much what we would call a "router" today: they packetized data from the Varians and then routed it to other 940s according to a virtual circuit switching scheme. While the system initially connected dial-in users to 940s, it was readily extended to building arbitrary circuits between any of the serial interfaces of the Varian "edge nodes."

A business that wanted to offer a dial-in service to a wide area could save a tremendous amount of money on phone lines and modem banks by instead purchasing a small number of leased lines to a Tymshare data center. Their users could then call any of the Tymshare access phone numbers, where they would receive a command prompt from the answering Varian. When they typed the keyword assigned to the interconnected computer system, Tymshare's network built a circuit from a modem in one data center to a modem in another, connecting the caller to the customer's computer across their own nationwide network.

In 1976, Tymshare separated their network infrastructure into a separate company, Tymnet, which registered as a telecommunications carrier. Tymnet would ultimately outlive Tymshare itself, becoming the "industrial internet" before that buzzword existed, or really before the internet that spawned the buzzword did. Despite some technical challenges originating from Tymnet's proto-internet architecture, everything from credit card transactions to supply chain notifications to consumer dial-up ISP traffic ran over Tymnet for decades after. Tymnet provided modem bank services to AOL, for example, and some of the vintage 1970s Tymshare dial-in numbers are still in service with various contract modem providers.


After its Tymshare acquisition, TeleCheck had a nationwide computer network, significant computer capacity, and an appetite for technical sophistication. It was a troubling time for a check guarantee firm, though. A new technology called Electronic Funds Transfer, or EFT, let retailers withdraw money directly from customers' accounts using only their debit cards. TeleCheck and Tymshare had actually found some business processing these transactions, but for the most part it was a separate industry that competed with checks. Baer cautioned that there was no reason to panic, as consumers would continue to use checks for many years to come, but there were clearly other things underway at TeleCheck. They were building their own transaction processing network.

I started thinking about TeleCheck because I was pumping gas and looking around for anything to distract me from how much it costs these days. Attached to the fuel dispenser, in between a half dozen other mandatory regulatory notices, was a sticker with the bright red and white TeleCheck logo. Rather than the "Your Check Is Welcome Here" verbiage used by early TeleCheck decals (back when many retailers did not yet accept personal checks from unknown customers), this one had decidedly less interesting text: "When you provide a check as payment, you authorize us either to use information from your check to make a one-time electronic fund transfer from your account or to process the payment as a check transaction."

In 1984, Tymshare, and TeleCheck with it, was sold to McDonnell-Douglas. McD-D, as I like to call them, was a formidable aerospace and defense contractor that was feeling an acute need to diversify as "peace broke out." They are also known, perhaps mostly due to their 1997 merger with Boeing and its effects on that company, as a bit of a backwater for actual engineering. They didn't hold on to Tymshare for very long, just a few years, which gave TeleCheck the curious distinction of being a check guarantee service backed by F-15s.

Around the same time they were acquired, TeleCheck introduced a next-generation payment terminal that incorporated an MICR check reader along with support for newly standardized credit cards. This terminal followed the same basic model, of calling in via Tymnet and then sending the card or check contents over the phone line. This method of handling credit cards proved short lived as the industry reorganized and security requirements became far stricter, but it got TeleCheck equipment into a huge number of retailers. In 1989, McD-D, unsatisfied with the finance industry and perhaps ginning up the Gulf War, sold their entire software division. TeleCheck's increasing success as a general payments processor no doubt helped attract the buyer: First Data.

First Data probably qualifies as obscure, as they have few consumer-facing offerings, but they're a giant in payment processing. First Data provided the original infrastructure for both Visa and MasterCard before becoming part of American Express, which continued to operate it as a general financial information systems business until spinning it out again. Back in the early '90s, though, TeleCheck found itself as a subsidiary of a company that also processed credit and debit transactions. It was only natural to unify those business lines into one, which TeleCheck called Electronic Check Acceptance.

Picture with me, in your mind's eye, the Verifone TRANZ 330. You have no idea what I'm talking about, of course, but if you're about my age or older you would recognize one. The TRANZ 330 was the first widely successful credit card payment terminal, and it anchored the Verifone brand name as a manufacturer of devices that verified payments over the phone. In practice, the TRANZ 330's main purpose was to read the data from a credit card, accept a keyed-in dollar amount, and then connect (usually over Tymnet) to a backend computer to authorize the transaction.

It could do much the same for checks: one of the features of the TRANZ 330 was check authorization, designed specifically for TeleCheck. The cashier could press a key to select check authorization, follow prompts to enter the check information and swipe the writer's driver's license, and then get back an authorization code (or a decline) on the terminal's screen.

The TRANZ 330 represented a milestone in two ways. It was the first payment terminal with TeleCheck support that resembled modern payment terminals in function. Earlier TeleCheck terminals were primarily phones with a payment card reader attached to them; the TRANZ 330 was not a phone at all and managed modem calls transparently to the user. Second, it represented a shift from TeleCheck providing a complete end-to-end solution to TeleCheck as one of a number of services supported by a common payment frontend.

After the First Data acquisition, TeleCheck was further integrated into other payment equipment, but it won back the branding. The TeleCheck Accelera and TeleCheck Eclipse, both manufactured by Verifone but bearing the TeleCheck logo, looked very much like every other credit card terminal but with the addition of an MICR check reader and slip printer. The added hardware allowed the terminals to read the check automatically, and also to print the authorization code on the back.

When these devices were introduced, the expectation was that a merchant would use the terminal to "authorize" a check (getting a TeleCheck guarantee on it), and then separately deposit the check with their bank. This wasn't all that different from how credit card transactions were handled at the time, with authorization usually done in real time (if at all) and "capture" (the actual payment) submitted as part of a batch at the end of the day. There was still a lot of paper handling involved, though, and during the 1990s it was clear that shipping slips of paper around the country was not a suitable long-term plan for checking.

Since the 1970s, a system called the Automated Clearing House (ACH) had been brewing among various bank coalitions and, later, the Federal Reserve Bank and an organization called NACHA: the National Automated Clearing House Association. ACH was one of the first standardized forms of computer payment, basically just a specification for a text file that contained the basic information for check-like transactions. Instead of exchanging paper checks, banks uploaded a text file to the ACH and then downloaded another text file later. Those files represented transactions in and out, processed in the banks and the ACH platform as a once-daily batch. ACH caught on for many of the same purposes that checks had fulfilled, like payroll (commonly called "direct deposit") and payments to savvy billers that wanted to avoid card payment fees (commonly called "e-check").
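
To give a sense of just how plain the format is: fixed-width text records, one per line, built from the same numbers printed on a check. The field order and widths below are illustrative only and do not match the actual NACHA layout; the point is just that an "entry" really is a line of text:

    # Rough sketch of a fixed-width, ACH-style entry record. Consult the
    # NACHA specification for the real field layout.
    def entry_record(routing, account, cents, name, trace):
        return ("6"                          # record type: entry detail
                + "27"                       # transaction code, e.g. checking debit
                + routing.rjust(9, "0")      # receiving bank routing number
                + account.ljust(17)          # account number
                + str(cents).rjust(10, "0")  # amount, in cents
                + name.ljust(22)             # individual name
                + trace.rjust(15, "0"))      # trace number

    print(entry_record("122105278", "0012345678", 200, "SPACE CASE", "1"))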




Since an ACH transaction is pretty much just a line in a text file with the same numbers you would find on a check, it is superficially possible to take someone's check, copy down the numbers, and then enter it as an ACH transaction. In actual practice this wasn't allowed—until 2000. That year, NACHA adopted a provisional set of rules for "point of purchase entry," the on-the-fly creation of ACH transactions from a point of sale system. In the real world of retail, where cashiers are not excited to ask for someone's bank account and routing numbers, wait for the consumer to figure them out, and then try to key them in correctly, this pretty much universally turned into "check conversion."

The idea of check conversion is pretty simple: you "pay by check," but when the cashier takes the check from you, they actually put it into a terminal that reads the check information and uses it to create a technically unrelated transaction in an ACH batch. Most retailers use slip printers that add some tracking information (like a transaction number) to the back of the check, and stamp it "void" with a message that it has been "converted" into an ACH transaction. Some retailers would even return the voided check to the consumer as a sort of "receipt" of the ACH transaction, although I don't believe this practice is still common.

To the average person, there is hardly any difference: you write a number on a check, sign it, the money eventually disappears from your account. Since there are differences in the legalities of check and ACH processing, though, businesses are required to disclose that they convert checks to ACH. The main difference, in practice, is that ACH transactions tend to clear more quickly than checks. That does cause occasional problems for consumers who are "kiting," writing checks that they do not yet have the money to cover, since they may be counting on the slower process of exchanging slips of paper. In our modern world, traditional processing of checks has been completely eliminated in favor of image-based processing, which is something like ACH conversion built into the check clearing process itself. That means that the actual difference in processing time between ACH and checks is no longer as significant (although the availability of same-day ACH potentially complicates this, I do not believe that NACHA rules allow check conversions to be opted in for same-day clearing). The whole story of check conversion has become mostly a forgotten detail of the beautiful tapestry of transaction processing.

Naturally, TeleCheck integrated ACH conversion into their product. Many businesses now handle checks via payment terminals that perform a TeleCheck authorization and ACH conversion in one step, all facilitated by TeleCheck for a fee that is a bit lower than an equivalent payment card transaction. The function of writing TeleCheck authorization numbers is integrated into the slip printer, which used to endorse checks but now decorates them with ACH conversion details.


In 2019, First Data was acquired by financial technology giant Fiserv. Fiserv continues to operate TeleCheck to this day, but with ACH conversion now a commodity service and retail acceptance of personal checks so routine that we forget about it, TeleCheck has started to look less like an interesting technology company and more like every other credit bureau.

Here's one of the reasons I find TeleCheck so interesting: search for them. I mean, just type it into Google or whatever. What do you see?

A very minimal corporate website, curiously at "getassistance.telecheck.com" (the bare "telecheck.com" redirects to the subdomain as well), with zero information except for law enforcement contact info, a form to look up declined checks, and a set of mandatory regulatory notices. "Victims of Human Trafficking" is a top-level navigation item, peeking out from above the hero banner of typing hands.

TeleCheck is now a ghost, a specter of financial technology past. Baer's 1980s predictions about EFT not cutting into TeleCheck's business fared well only if you ignore the closely related phenomenon of credit card networks. Nationwide, check volume, especially at retail, has collapsed to almost nothing. Fiserv continues to operate TeleCheck as, essentially, a legacy cash cow. They don't market the service at all, and maintain only the brand presence that is legally mandated of credit bureaus.

TeleCheck has a twisting, confusing corporate history, and besides Flagg's larger-than-life ambitions, credit reporting and debt collection are the reasons why. Consumer credit bureaus started to get a bad rap in the 1960s and have never recovered; they must be among the most hated brands in America today. Debt collectors have never had many friends. TeleCheck is, in various ways, both of those things, and those functions now matter more than the technicalities of handling checks.

TeleCheck has been the target of dozens of lawsuits under the Fair Credit Reporting Act. To be fair, I don't think that they're worse than usual in this regard. One of the major implications of the FCRA is that credit bureaus can be liable for having incorrect information about consumers, but prior to the passage of the FCRA most credit bureaus were, shall we say, fast and loose about details.

Let's consider an example: TeleCheck's practice of linked identities. If a person is writing bad checks serially, they will probably not write bad checks from the same account every time. So, as part of the "30 million facts" that TeleCheck's 1980s ads claimed their computers contained, TeleCheck saved relationships between identifiers. If you presented a driver's license and a check from one account, and then a month later at another store presented a driver's license and a check from a different account, TeleCheck permanently recorded the association between all three.

Say the first check bounced. If someone else, months later, at a different store, presented a check from the second account, TeleCheck would decline it. Why? Because they followed the links, from the second bank account to your driver's license to the first account. We might recognize this as taint analysis, and TeleCheck would follow connections through multiple steps until a bad check written by one person could result in "Code 4" declines of checks on different accounts by different people. To say that this was controversial with affected consumers understates the problems, and the way that account linking worked (especially between people of whom TeleCheck otherwise had no evidence of a relationship) became a legal morass. Several of TeleCheck's franchises seem to have left the TeleCheck brand in an effort to manage their risk, especially considering state-specific regulations on credit reporting.
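
The linking behavior is easy to sketch as a graph problem: identifiers are nodes, presenting two identifiers on one check adds an edge, and a decline follows from reachability. This is my reconstruction of the general idea, not TeleCheck's actual data model or rules:

    # Sketch of linked identities as graph reachability ("taint analysis").
    from collections import defaultdict, deque

    links = defaultdict(set)   # identifier -> identifiers seen alongside it
    tainted = set()            # identifiers tied to an unpaid bad check

    def record_check(license_no, account_no):
        links[license_no].add(account_no)
        links[account_no].add(license_no)

    def record_bounce(account_no):
        tainted.add(account_no)

    def should_decline(identifier):
        # "Code 4" if anything reachable from this identifier is tainted,
        # however many hops of linked licenses and accounts away.
        seen, queue = {identifier}, deque([identifier])
        while queue:
            node = queue.popleft()
            if node in tainted:
                return True
            for neighbor in links[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        return False

    record_check("LIC-1", "ACCT-A"); record_bounce("ACCT-A")
    record_check("LIC-1", "ACCT-B")   # same license, new account
    print(should_decline("ACCT-B"))   # -> True, via LIC-1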

As another way to manage regulatory complexity and liability, TeleCheck spun out its debt collection function into an independent company (although also owned by First Data) called TRS. Or TRS Recovery Services. Here's the thing, I am 90% sure that TRS stands for TeleCheck Recovery Services, but their own website says "TRS Recovery Services (TRS)", which would imply TeleCheck Recovery Services Recovery Services. I think the intent of the whole thing was to divorce the TRS brand from TeleCheck for reputational reasons (consider that TRS has a totally different logo), which led to some "TRS doesn't stand for anything" nonsense. TRS has a website that is almost but not quite identical to TeleCheck's, with mandatory regulatory notices only, and they are universally hated as the people that hound you for life over a bounced check at Walmart.

Telecheck newspaper ad

Within the story of TeleCheck we see the full arc of payments technology: 1960s idealism about a new world in which everyone's checks are welcome, 2020s cynicism from an ailing conglomerate interested mostly in not losing lawsuits. TeleCheck is completely unexciting, the cartoon opposite of innovation, but it is very much still with us. Did you know that you can pay at amazon.com via direct ACH withdrawal from your checking account? Mass retailers are still surprisingly likely to accept checks as payment, and they are still, for the most part, doing that via TeleCheck. Even small businesses don't have to miss out: Fiserv also owns Clover, and Clover integrates TeleCheck electronic check acceptance.

Deep inside Amazon's help system, under Payment Issues, an article explains how to "Correct a Failed Checking Account Authorization." Besides making sure you typed your account number correctly, the advice is: Call TRS Recovery Services. Sometime, somewhere, you must have written a bad check. A nationwide network of computers took note. Honolulu businessmen hobnobbed at a Boy Scout council meeting. TeleCheck took phone calls, McDonnell-Douglas forever changed Boeing, Harry Flagg went on to run MLMs. "Many Hawaii organizations currently are buying time on comparable computer facilities on the Mainland," Flagg said, when they bought the CDC 3100, the first in Hawaii. "Our installation will save them time and money." It might find them a date, too. The Nai'a explores the ocean, two small business colleges fold, two companies from Hawaii make competing claims about an obscure part of history. You're at Walmart, the total is $2, and the cashier is saying something about "Code 4." These things are all connected. They are all connected by Tymnet.

  1. Reported founding dates for TeleCheck range from 1964 to 1966, but a newspaper article about Flagg's new company ran in 1964 so that's what I'm going with. I think it took them 1-2 years to start operation.

  2. This is actually an interesting distinction, because Verifone is also a Hawaiian company and claims to have invented the first telephone-based payment terminal in 1981. That another Hawaiian company had a similar device in 1980 makes you wonder if they all knew each other.

  3. Tymshare was actually named after its founder, LaRoy Tymes, which is awesome.

Lotus Notes

I tend to focus on the origin of the computer within the military. Particularly in the early days of digital computing, the military was a key customer, and fundamental concepts of modern computing arose in universities and laboratories serving military contracts. Of course, the war would not last forever, and computing had applications in so many other fields—fields that, nonetheless, started out as beneficiaries of military largesse.

Consider education. The Second World War had a profound impact on higher education in the US. The GI Bill made college newly affordable to veterans, who in the 1950s made up a large portion of the student population. That was only the tip of the iceberg, though: military planners perceived the allied victory as a result of technical and industrial excellence. Many of the most decisive innovations of the war—radar and radionavigation, scientific management and operations research, nuclear weapons—had originated in academic research laboratories at the nation's most prestigious universities. Many of those universities (MIT, Stanford, the University of California) created subsidiaries and spinoffs that act as major defense contractors to this day.

Educational institutions bent themselves, to some degree, to the needs of the military. The relationship was not at all one-sided. Besides direct funding for defense-oriented research, in the runup to the Cold War the military started to shower money on education itself. Research contracts from uniformed services and grant programs from the young DoD supported all kinds of educational programs. For the military, there were two general goals: first, it was assumed that R&D in civilian education would lead to findings that directly improved the military's own educational system. Weapons and tactics of war were increasingly technical, even computer controlled, and the military was acutely aware that training a large number of 18-year-old enlistees to operate complex equipment according to tactical doctrine under pressure was, well, to call it a challenge would be an understatement.

Second, the nation's ranks of academics made up something like a military auxiliary. The Civil Air Patrol built up a base of trained pilots, in case there was ever a need to quickly expand the Air Force. By the same logic, university programs in management, sciences, and education itself produced a corps of well-educated people who would form the staff of the next era of secret military laboratories. Well, that's not exactly how it turned out, with the Cold War's radical turn to privatization, but it was an idea, anyway.

That spirit of military-academic collaboration is how a group of researchers, mostly physicists, at the University of Illinois found themselves with military funding to develop a system called "Programmed Logic for Automatic Teaching Operations," or PLATO. With its origins in the late 1950s, and heyday in the 1970s, PLATO is usually considered the first effort in computerized teaching. It's a fascinating sibling to other large-scale computer systems of the time, like those in air traffic control. There are many similarities: PLATO struggled with connecting terminals and computers over a large area, before "the Internet" was even an idea. It had to display graphics, a very primitive computer capability at the time but one that was thought to be vital for classroom demonstrations. The system supported many simultaneous users, and had to process data in real-time to synchronize their various workspaces.

There were also important differences. Unlike SAGE and the 9020, unlike business accounting and tabulation systems, unlike almost every computer application yet devised, PLATO was designed for user-created content.

Reflecting its origins among academic physicists, PLATO heavily emphasized collaboration. Many of the earlier, 1960s-era PLATO developments focused on simplifying the development of learning modules so that teachers could create interactive PLATO courses with less specialized computer training. By the 1970s, as PLATO terminals were increasingly installed in schools and other institutions in Illinois, the emphasis on collaboration turned towards communication. If learning modules were easy for teachers to develop, the students should also be able to use the system to create their own coursework and study materials. Researchers and other academic users had a similar desire for a computer system where they could keep notes, write reports, and stay in touch with their colleagues.

PLATO did not prove an enduring success—despite a decades-long effort toward commercialization, it was expensive and the actual benefits of computer-aided teaching remained unproven 1. Few PLATO systems ever escaped the University of Illinois and its network of satellite delivery locations. Follow-on projects fizzled out and, despite PLATO's incredible ascent from a 1960 concept to an elaborate 1970s multi-user interactive system, PLATO spent the 1980s in decline. No one thinks about PLATO very much any more, which is unfortunate, because it is one of those remarkable, isolated moments in history, a sort of conceptual singularity, in which a project with limited real success incubates so many concepts that it sets the course of history afterwards.

When you look at our modern computers, smartphones and social media and Farmville and so on, it's hard to find a single thing that isn't somehow derived from PLATO. Through PLATO's 1970s NSF funding and resulting interaction with other NSF efforts, it is my opinion that PLATO is probably a more important precursor to the modern internet than ARPANET and NSFNET. Not so much in a technical way; PLATO was closely tied to a mainframe-terminal architecture that would not likely have led to our flexible packet-routing internet architecture. Rather, in a vibes way. PLATO was a large-area, networked computer system that emphasized posting things and looking at things other people had posted. It offered math lessons, but it also offered games.

Perhaps most importantly, it had notes.

PLATO's developers at the Computer-Based Education Research Lab of the University of Illinois had long had a simple way of communicating with each other, by editing a series of lessons titled "notes1" through "notes19." While functional, it was imperfect: Google Wave had not yet appeared to tell us that everyone being able to edit everything is good actually, so the complete lack of access controls, or formatting, or really organization of any kind was making the "notes" lessons a headache as the system gained users. In 1972, the PLATO IV terminal, new funding, and new backend computers had PLATO growing fast. There was an obvious need for a better version of the notes lessons.

Through the machinations of academia, the task of building a better "notes" file fell on two high-school students, on their summer break from the University of Illinois-affiliated laboratory high school. Using PLATO's native TUTOR programming language and facilities intended for exams and course discussions, Dave Woolley and Kim Mast wrote a lesson called "=notes=". The =notes= lesson was originally intended for system announcements, trouble reports, and communications between PLATO's operators. Soon, though, it was doing far more.

The notes lesson was not the first implementation of a computer-based discussion board; similar capabilities had been implemented by ARPANET users at least a couple of years earlier. It was also not the first email system, and indeed was not email at all: notes only offered public posts. PLATO didn't gain a private messaging capability until a year later. The notes lesson wasn't even the only message board on PLATO, although it was the most sophisticated.

What stood out about notes is that it was popular. Everyone used it. By 1976, the notes lesson had evolved into a generalized application that allowed any PLATO user to create and manage their own notes file. That management included access controls and capabilities we might now call moderation. It was one of the most popular applications on all of PLATO, and that was against the competition of the several notable early video games created there.

Brian Dear, author of "The Friendly Orange Glow," a book on PLATO, argues that the notes lesson's peculiar history as a public messaging system that came before a private one had a critical impact on PLATO's users and culture. Communications on PLATO were "public-first." While a "private notes" feature was added later, users were already in the habit of doing things in the open. This sense of community, of close collaboration with some and passive awareness of others, must be a cultural precursor to the BBSs of the early internet and the social networks of today.

But that's not even what I'm here to talk about. During the late '70s, there was something else going on at the University of Illinois: future Microsoft CTO Ray Ozzie was working on his undergraduate degree in CS, a large part of which involved PLATO. After his graduation in 1979, he worked for Data General, manufacturer of a popular series of minicomputers, where he reported to Jonathan Sachs. After Data General, he did a stint at Software Arts, the development firm behind VisiCalc. He must have stayed in touch with Jonathan Sachs, though, who in 1982 had left Data General to co-found a new company: Lotus Development.

Lotus is widely remembered for Lotus 1-2-3, the hit spreadsheet application that displaced VisiCalc as possibly the most important software package in all of personal computing. Lotus's two founders were Sachs and Mitchell Kapor, both of whom had connections to Data General and Software Arts, in an era in which blockbuster software products often came from companies founded by a few entrepreneurial employees of the incumbent that was on its way out. And, indeed, they stuck to the familiar recruiting strategy: Ray Ozzie, of similar employment background, joined Lotus in the early '80s as a developer on their complete office suite, Lotus SmartSuite 2.

Lotus 1-2-3 was one of the most successful software products ever, but the frontier of office computing was quickly expanding to word processing and presentations. Far from our modern monoculture of Google Docs and Microsoft Office still kind of hanging on I guess, the productivity software market was extremely competitive in the 1980s and just about every Independent Software Vendor had some kind of productivity or office suite under development. The vast majority of these were commercial failures, and Lotus SmartSuite was no different 3. It is fascinating to consider that SmartSuite included an (apparently mediocre) desktop database/rapid application development tool called FORMS. Desktop databases were once considered table stakes for productivity suites, but are now almost completely forgotten, recast as expensive and standalone SaaS offerings like AirTable. But I digress...

By 1985, Lotus had acquired Software Arts and consolidated its dominance in PC spreadsheets. Despite its overwhelming success in spreadsheets, Lotus struggled elsewhere: not just SmartSuite and Jazz but also the standalone word processor Manuscript and the modeling-oriented spreadsheet Improv were technically impressive yet consistently poor sellers. I think that part of the problem is that the people at Lotus were a little too smart. This is best exemplified by their unsuccessful "personal information management" or PIM product (this is the same general category as Microsoft Outlook and Mozilla Thunderbird, although historically PIMs were less email-centric and more focused on calendars and contacts).

Lotus Agenda was marketed as a PIM, but unlike other contenders of the era, it came without a fixed schema into which the user inserted data. Instead, it behaved more like a very simple desktop database, with the user entering data into spreadsheet-like tables however they wanted and then defining views based on column and row filters. The result was highly generalized, able to fit just about any use case with sufficient time and effort. It also did very little to help: you started with a blank grid, left to your own devices. It reminds me of modern PIM-adjacent offerings like Emacs org mode or Obsidian that attract a fervent following of dedicated users who ascribe some sort of life-changing experience to them, while the other 99% of us launch the software and then wonder what we are supposed to do. Lotus Agenda, for its part, reportedly had a similar cult fame.

I imagine that Lotus's difficulty entering new markets must have encouraged them to get creative. They had quite a few products, many of them the subject of awards or technical papers or patents, no lack of engineering capability. What they seemed short on was vision and marketing. They had not quite figured out what customers wanted, or at least what customers wanted that was different from what their competitors already offered. I suppose they were willing to take a bet, as long as the risk could be adequately controlled.

In 1984, Ray Ozzie left Lotus to found a new company called Iris Associates. Like Software Arts' developer-publisher relationship with VisiCorp, Iris Associates was an independent company but was contractually obligated to offer exclusive publishing rights on its products to Lotus Development. In exchange, the first few years of Iris's operation were funded by a substantial investment from Lotus. Ozzie, now in control of his own kingdom, quickly hired three other University of Illinois alums—people he knew from his time working on PLATO.

It's not clear to me if this was the original goal of Iris Associates or if it became the goal as a result of their experience with PLATO. However it happened, Iris Associates spent about the next five years building a version of PLATO's =notes= lesson that could run on a network of Windows PCs. Through their publishing arrangement with Lotus, this would come to be known as Lotus Notes.

Lotus Notes is one of the most famous, or infamous, examples of "groupware." Groupware is a hard concept to put your finger on, in good part because of its history. The whole thing, as a category, is easily traceable to Douglas Engelbart's work on Human Augmentation at SRI—work that, like PLATO, failed to gain market adoption but was nonetheless tremendously influential on the larger art. Human augmentation is the core idea in groupware, also called computer-supported cooperative work (CSCW). These were lofty academic ideas around the ways that computers could augment the social processes behind most productive work; they were ideas that grew and forked like branches from the middle of the 20th century to today. Many of those branches withered and died, others flourished before some sort of rotting disease set in (looking at you, Sharepoint). The result is that "groupware" meant different things at different times, and today is so non-specific that everything from enterprise business process automation platforms to slightly dressed-up webmail are lumped together.

In the days of Iris Associates, groupware meant something like this: software that assisted humans in communicating, collaborating, and tracking and executing processes. In practical terms, this meant core PIM applications like email and calendaring, but also included collaborative editing, workflow automation, document and policy management, and all kinds of other core business needs that involved more than one person. It's hard to apply a definition of groupware to Lotus Notes because Lotus Notes was one of the major products that originally defined the category, so its capabilities and limitations have become part of the background.

Lotus Notes was under development from roughly 1984 (the founding of Iris) to 1989 (the first commercial release). The time period also put Lotus Notes in another category, although this one is even more tenuous: network operating systems. Novell NetWare, for example, hit the scene in 1983. Network operating systems revolved mostly around file and printer sharing, but most gained some form of email. Lotus Notes had substantial overlap with network operating system features, but it ran on Windows. That made it an important step in the decline of fully integrated network products like NetWare in favor of application software on Windows, which, more and more with each Windows release, used underlying operating system capabilities instead of implementing a complete application-specific network stack. Novell would go in a similar direction with GroupWise, part of a wild industry restructuring during the 1990s.

It's hard to put a finger on what exactly Lotus Notes was from a modern perspective because the software itself was highly generalized. Wikipedia puts it like this: "Notes and Domino is a cross-platform, distributed document-oriented NoSQL database and messaging framework and rapid application development environment that includes pre-built applications like email, calendar, etc." Let's set aside for the moment the matter of Domino, which we have not introduced, and focus on the parts like "distributed document-oriented NoSQL database" and "messaging framework" which are strange ways to describe a groupware package but actually a very logical way to describe a rapid application development product—mentioned here as just one of the things that Lotus Notes is. I suppose that's a bit of synecdoche.

Like Lotus Agenda before it, Lotus Notes was something of a blank canvas that could be configured, customized, and designed to meet virtually any need. Unlike with Lotus Agenda, though, Lotus had learned a lesson about onboarding, and Lotus Notes came out of the box with a set of familiar groupware features. What set Notes apart from other implementations of email, for example, is that Notes email was just another set of views and logic built on a common database. It was a custom application like the others, just one already provided as a sample.

Much of the strangeness of Lotus Notes reflects its origins in PLATO. Most of the microcomputer network products of the 1980s were built around peer-to-peer networking, the idea that multiple small computers could communicate with each other directly (NetWare, for example, used a peer-to-peer architecture despite the fact that certain machines were usually clearly logical servers). That was the hot new thing in the '80s, but PLATO was not of the '80s, it was of the '60s. PLATO wasn't even really a client-server architecture: it was a terminal-computer architecture, in which individual PLATO terminals interacted with one of multiple (mostly CDC) mainframe or midsize computers that ran the actual logic.

PLATO wasn't just multi-user, though, it was multi-computer. To form one integrated system out of multiple independent sets of computers and terminals, PLATO replicated all of the user data between the computers. Lotus Notes inherited the same approach: data was stored in a database, and communicated between users by replicating that database between machines.

Everything in Lotus Notes is a note, and a database is a collection of notes that are identified by unique IDs. When a note is updated, that note is replicated to other copies of the database. Over time, the obvious performance and reliability problems with this architecture became apparent and the replication process became a lot more sophisticated, but it always worked on this simple logical model of updating the local database and then replicating it to others.
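
As a rough sketch of that logical model (my own illustration, not Lotus Notes' actual data structures or replication protocol), you can picture each replica as a map from note ID to note, where replication copies across whichever revision of each note is newer:

    #include <map>
    #include <string>

    // Conceptual sketch only, not actual Lotus Notes code: a "database" is a
    // collection of notes keyed by unique ID, and each note carries some kind
    // of revision marker (a stand-in for Notes' real update metadata).
    struct Note {
        std::string body;
        int revision = 0;
    };

    using Database = std::map<std::string, Note>;  // unique note ID -> note

    // Pull changes from one replica into another: for each note, keep
    // whichever copy has the newer revision (or add it if it's missing).
    void replicate(const Database& from, Database& into)
    {
        for (const auto& [id, note] : from) {
            auto existing = into.find(id);
            if (existing == into.end() || existing->second.revision < note.revision) {
                into[id] = note;
            }
        }
    }

Run in both directions between two replicas, an exchange like this converges them on the same set of notes; as noted above, the real replication process grew far more sophisticated over time, but it kept the same basic shape.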

And that was about it—the database full of notes was all there ever was to Lotus Notes. That and an incredible amount of functionality exposed to the user by the fact that notes could incorporate simple scripts, arbitrary programs, workflow rules, and entire GUI layouts. Lotus Notes did not differentiate between "user data" and "program data" at a low level, and indeed users could edit views, write scripts, and build entire applications on top of Lotus Notes that replicated to other users just like an email.

The kind of freedom that Lotus Notes gave its users was incredibly empowering, but also a bit of a nightmare for usability. I am reminded of the early object-oriented work, like Smalltalk, much of which came from a similar milieu of SRI and Xerox-inspired applied computing research. If everything is a note, then you can use the tools for manipulating notes to create anything. The Lotus Notes data model was extremely flexible, about as close to completely schema-less as practical, but it still offered indexing and full text search. There wasn't much that you couldn't shove into Lotus Notes if you tried, and for businesses that adopted it wholesale, it could expand in scope until it took over everything. Not just email and calendars but enterprise resource planning!

Early versions of Lotus Notes, before the IBM acquisition, allowed for programming using either an expression language very similar to Lotus 1-2-3 spreadsheet formulas (and called, sensibly, Formula) or an imperative language very similar to Visual Basic (called LotusScript). Using a tool called Designer, it was possible to visually edit GUI interfaces like forms and reports. These were backed by a Notes-specific query language, although later versions of Notes also gained an SQL compatibility layer and some options for interoperating directly with relational databases.

Because of the example of PLATO, Notes was architected around a client-server model. The client, generally called Lotus Notes, partially replicated the database to allow the user to view and edit notes. The server, which came to be called Domino, kept a complete copy of the database and executed enough of its logic to handle access controls. Like PLATO's replicated model, though, it was not expected that there be a single, central server. Many Domino servers could share the same database, and replicated changes among themselves. This architecture was very convenient for the era's business deployments, where each office location would have a local area network but network connections between sites were expensive and comparatively slow. Domino servers had the effect of consolidating user activity into a reduced volume of replication traffic, and when inter-office network links were lost the office still functioned normally, albeit as an island.

Despite its origins as a product explicitly for Windows, Notes maintained a degree of cross-platform capability. Domino was available for several platforms and the Lotus Notes client was available for pretty much anything. The simplicity of the Formula and LotusScript languages did a lot to simplify porting, although the system itself was originally written in C++. That all started to change when Lotus rolled in Iris and shortly after, in 1995, merged into IBM.

Lotus Notes is hard to write about because of the naming. The curious decision to call the server (which was, originally, called Lotus Notes Server) Domino already complicates the situation, and that itself appears to be a result of IBM's unique product vision. The IBM renaming churn went on for the whole span that Lotus operated under IBM, meaning that whether you call it "Notes" or "Lotus Notes" or "IBM Notes" or even refer to the entire thing as "Domino," you're probably correct for some point in the history. Modern IBM documents use elegant language like "the Notes and Domino product family." I have opted to just stick to Notes.

IBM was tremendously invested in the development of client-server microcomputer applications in the 1990s, unsurprisingly since they were the closest match to IBM's traditional strength in mainframe systems. The acquisition of Lotus was, no doubt, intended to advance that focus: Lotus Notes must have been one of the most prominent client-server products of the Windows '95 era. Its architecture, something like a "thin client" system that was derived from its mainframe precursor, felt akin to IBM's block terminals and generally fit well with IBM's approach to microcomputer applications (small client, big backend).

IBM was also investing in another project that fit their client-server vision: Java. Java was originally designed for embedded applications (interactive television!), but by its first official release in 1996 it had grown into more of a general purpose business application platform. The Microsoft .NET platform is a good later comparable, starting out where Java ended up. The "standard edition" and "enterprise edition" split of Java, and the web service platform concept developing around them, fit client-server applications well. Java was widely seen as an easier, more productive language than C++. Best of all, Java's origins as a portable language to run on lightweight, embedded VMs made it a perfect fit for software that would need to run on a variety of different clients.


Lotus Notes 1 through 3 had been iterative improvements, from 1989 to 1993, adding features and fixing defects. With Lotus Notes 4, spanning 1996 to 1999, IBM Lotus Development started on a wholesale port to Java. Java was added as a native scripting language, alongside Formula and LotusScript, and many newer Lotus Notes features were "Java first." The Java-ization of Lotus Notes could be said to have completed in 2008, when the Lotus Notes client was completely replaced with a full-Java implementation based on Eclipse 4.

Alongside the new world of the object-oriented user interface, IBM also adapted Lotus Notes to the web. Version 4 introduced a very simple web interface where users could view static notes through a web browser, and IBM added an SMTP bridge for interoperation with other email systems.

At the dawn of the 2000s, Lotus Notes' technical leadership had eroded. The state of the art in GUI applications was far more advanced, and Microsoft had invested in their own groupware products (Exchange and Sharepoint). As part of the Windows Server offering, the Microsoft products were better integrated into the Windows desktop experience. IBM's investment in web technologies had lagged, so while Sharepoint was never exactly a hot rod, it was easier to interact with than Lotus Notes.

Still, development continued apace. IBM eventually retired the Lotus branding, around 2012, decisively folding IBM Notes and IBM Domino into Big Blue. Domino Web Access provided a web interface by 2002, although it was limited and not much different from what we would now consider mere webmail. IBM introduced Sametime, an instant messaging system that integrated with Lotus Notes. QuickPlace, similarly, was a web-based file/document management system that integrated with Lotus Notes as well. Both of these were even initially Lotus-branded, although IBM dropped that practice a few years later (and, for some reason, renamed QuickPlace to Quickr. It was a time).

When we talk about Java and the web, you might remember something, perhaps with a shudder: applets. In 2008, IBM introduced XPages. XPages were basically JavaServer Faces components that ran on the Domino server. They interacted directly with the Notes database and were, in some ways, just web-based versions of Notes forms. You even edited them using Domino Designer, the IDE for "serious" Notes development. Unfortunately, XPages hewed too close to JSF and remained a separate platform from Notes itself. You could not simply view existing Notes forms via XPages, or even readily port Notes interfaces to XPages format. If you wanted to support both the desktop and the web experience, as most businesses would around 2010, you tended to end up writing things twice. Next thing you know, you're only really actively maintaining the web version, and then you might as well be on Sharepoint, which was by then well established.

To me, the most captivating part of Lotus Notes is its decline. In the mid '90s, Lotus Notes was the dominant groupware platform by a wide margin—Forbes put Lotus Notes market share at 64% in 1995. In 1997, they had fallen to 47%. A 2008 survey put Lotus Notes at 10%. Considering the long cycles and sheer staying power of enterprise software, that was a remarkable fall—all the more so considering that at that peak, IBM spent over $3 billion to get their hands on it.

Admittedly, IBM buying a popular software product at great expense and running it into the ground is an old story. But there's another major player here too: Microsoft was at their best. The vast majority of Lotus Notes' users went directly to Microsoft Exchange, which besides its higher level of integration with Windows (both at a technical and sales level) was also agreed to be more performant and easier to maintain. I would wager that Microsoft was at their peak engineering competence in the early '00s, and they had a considerable late-mover advantage over software that carried all the baggage of predating open-standards TCP/IP networking. NetWare went into a precipitous slide for the same reason.

Standards themselves are another important part of the story. Early email systems were all proprietary, and weren't necessarily networked systems at all—the first email implementations just stored mail on the single machine that all of the users accessed via terminal. Lotus Notes having a completely proprietary implementation of email didn't stand out in that world. When Microsoft first launched Exchange, it had the same limitations. That changed fast, albeit more by accident than intention. ITU X.400 had been expected to provide the standards for email on the public internet, replacing the hodgepodge of proprietary and network-specific open standards. Instead, X.400 failed to do much of anything at all—but not before it became the basis of Microsoft Exchange.

Exchange ended up in sort of a standards purgatory, an implementation of an open standard that died before Exchange could even launch. Exchange's X.400 capabilities became a proprietary standard of unusual origin, and most of industry and academia opted to follow the NSF example and adopt the older and simpler SMTP. Microsoft apparently understood that interoperability would be critical for the future of email, so Exchange was marketed from the very start as a multi-protocol system that could speak SMTP as fluently as anything else. That wasn't ever 100% true, but it was the idea that mattered. Lotus Notes had SMTP support, but it lagged behind; Exchange was the future, focused on interoperation.

The web story was almost exactly the same. That's not to say that Exchange had a good version of a web interface, Microsoft has famously struggled with messes like Outlook Web Access. But they were still more agile than IBM, and their complete vertical control of the desktop experience meant that their weird proprietary web technology (ActiveX) fared better than IBM's weird proprietary web technology (a confusing tangle of Notes browser plugins and Java applets). Besides, Lotus's web features were apparently separately licensed at great cost. Microsoft wasn't cheap but, in this case, they were the budget option.

Last of all is the elephant in the room. It is hard, today, to explain exactly what Lotus Notes was. That's not a recent problem. In 1998, Forbes wrote:

Even before IBM arrived, Notes' identity had been blurred; it's no clearer now. First the software was marketed as a system to make it easy for a widely dispersed group of employees to work together editing a document or managing a project. Then Notes was redefined as a tool for building customized collaborative applications. Now Lotus seems to be playing up the E-mail function in Notes and positioning it as a replacement for cc:Mail. The latter was once a leader in its field but is now suffering a slow death.

cc:Mail! That's a whole different article. So let's stick to the point: the fact that Lotus Notes sort of defined a product category became a liability as other vendors muscled in with more targeted, more narrowly focused, and more obviously useful competition. Lotus Notes started to sound like Zombocom: You can do anything! Microsoft was just selling email and calendar. Notes' large scope was a problem for the user experience as well; besides the fact that the Notes client looked and felt dated by the '00s, the generality and depth of its capabilities meant that it was also just plain hard to use.

In 2018, IBM sold what remained of Lotus Development to an Indian software company called HCL. HCL is a classic IT consulting firm, with a portfolio that spans "Industry 4.0" to "SAP consulting." They acquired Lotus as part of a bundle of IBM castoffs: some of you might remember BigFix. Lotus Notes is now HCL Notes, and as far as I can tell HCL intends to just enjoy the revenue as long as legacy customers will pay them to keep Notes running.

That's another part of the allure of Lotus Notes: it might be the most legacy of legacy software. I have never worked at an organization that was using Lotus Notes, but everywhere I've worked that wasn't a startup had a powerful institutional memory of back when, in the Lotus Notes Times. Some people remembered it fondly, most people remembered it hatefully, but they all remembered it.

Will GMail ever inspire such emotion?

  1. The more things change, the more they stay the same...

  2. Previous versions of this article mistakenly called it Lotus Symphony, which was yet another office suite from IBM Lotus (this one OpenOffice-based). I regret the error.

  3. Lotus SmartSuite was Lotus's office suite for DOS; they also developed an office suite for the Macintosh called Lotus Jazz. You can appreciate the theme consistency of the names, but it quickly becomes confusing: Lotus was acquired by IBM in 1995, and became another division of a company that also offered a software engineering collaboration platform called Rational Jazz. You might get excited and think that perhaps one led to the other or something, but this appears to be pure coincidence.

  4. At the time, Eclipse was being positioned as a more general framework for GUI software. In practice it didn't take off for anything besides other IDEs, but there was a brief heyday of Eclipse Framework productivity tools. Think of the days when the Flickr Uploadr was XUL-based and pretty much Firefox in a slim trenchcoat. But for Eclipse. This is actually very interesting because, in a parallel to Lotus Notes, Eclipse was architecturally inspired by IBM's block-based mainframe terminals, informed by research work with Smalltalk.

CodeSOD: Please Find, Rewind

As previously discussed, C++ took a surprisingly long time to get a "starts with" function for strings. It took even longer to get a function called "contains". In part, that's simply because string::find solves that problem.
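
For reference, the conventional pre-C++23 spelling of a "contains" check is just a comparison against std::string::npos, something like this (a minimal sketch; since C++23 you can write haystack.contains(needle) directly):

    #include <string>

    // Idiomatic "contains": true if needle occurs anywhere in haystack.
    bool containsSubstring(const std::string& haystack, const std::string& needle)
    {
        return haystack.find(needle) != std::string::npos;
    }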

Nancy sends us a… different approach to solving this problem.

bool substringInString(string str, string::iterator &it)
{
  string tmp;
  bool result = false;
  int size = str.length();

  int count = 0;
  while (count < size)
  {
    tmp += *it;
    it++;
    count++;
    if (tmp.find(str) != string::npos)
    {
      result = true;
      it -= size;
      break;
    }
  }

  if ( !result)
  {
    it -= size;
  }

  return result;
}

This function iterates across a string, character by character. In this iteration, we copy one character at a time into tmp. Then we see if tmp contains our search str. If it does, we break out of the loop after rewinding the iterator. Outside of the loop, we check whether we found the substring, and if we didn't, we rewind the iterator there instead. Then we return true or false based on whether or not we found the substring.

So wait a second. str is our search string. it is where we're searching. And we copy from it up to our search string's length into a temporary string. We then do a find in that temporary string- hey! This is just a startsWith check written in the most insane way possible.

Why even bother with the while loop? While tmp is shorter than the search string, the answer is always "no, we haven't found it". And the developers knew that- that's why they always rewind size characters on the iterator. They're always searching exactly that many characters. Of course, since we always rewind the same amount, we can also just move the it -= size statement out of the loop and out of the if statement and do it once.
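
Stripped of the loop, the whole thing reduces to something like this sketch (my reconstruction, not code from Nancy's codebase; it takes the iterator by value, so there is nothing to rewind):

    #include <algorithm>
    #include <string>

    // Do the next str.length() characters starting at `it` match `str`?
    // Like the original, this assumes at least that many characters remain
    // in the string being searched; a real version would bounds-check.
    bool startsWithAt(const std::string& str, std::string::const_iterator it)
    {
        return std::equal(str.begin(), str.end(), it);
    }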

Nancy calls this "a little gem" in a "large codebase". Yeah, a real gem.


CodeSOD: Not for Nullthing

Today's anonymous submitter sends us some code that just makes your mind go… blank when you look at it.

	public static boolean isNull(String value) {
		return StringUtils.isBlank(value);
	}

StringUtils.isBlank comes from the Apache Commons library. It's a helper function for Java which returns true if a string is, well, blank. "Blank" in this case is: empty, null, or only whitespace. So isBlank does return true on a null, but it also returns true on plenty of non-null strings; it isn't truly a null check, so wrapping it in a function named isNull is just confusing.

But imagine I've got another problem. Let's say I have a database that's been poorly normalized and maintained. And so I have a bunch of fields that maybe are null, but some also maybe contain the string "null". What am I going to do then? I need another function.

	public static boolean isNullAndNull(String value) {
		return isNull(value) && "null".equalsIgnoreCase(value);
	}

Ah yes, isNullAndNull, the clearest and easiest name I could imagine for this. It tells me exactly what the function is checking: is it null, and is it also null? We add a second check to our isNull call- we check if the input value matches the string "null". Except we're &&ing the conditions together, where presumably an || was intended. So this function will always return false: a string can't be blank and also contain the text "null".

Which means Jennifer Null, who is a real person, can breathe easy. This version of a null check won't think she's nothing.


Empty Pockets

If you've seen one developer recounting how their AI agent deleted production, you've seen them all. They're mostly not interesting stories. It's like watching someone speeding through traffic on a motorcycle without a helmet: the eventual tragedy is sad, but it's unsurprising and not an interesting story to tell. It's not even interesting as a warning: the kind of person who speeds on a motorcycle without a helmet isn't doing so because they don't understand the danger. They've just decided it doesn't apply to them.

But the founder of PocketOS, Jer, recently shared how- whoopsie!- their AI agent deleted production. There are a lot of ingredients that go into this particular disaster, which I think makes it interesting, because the use of a poorly supervised AI agent is only one ingredient in this absolute trainwreck of a story.

PocketOS is a small company that makes software for rental companies to manage reservations. Car rentals are a big customer, but the tool is more general than that. They manage all of their infrastructure via a service called Railway. Railway is a pretty-looking GUI tool for automating your deployments and the target environments.

PocketOS is also heavily adopting Cursor, wrapped around the Claude model. They've paid big bucks for the top-end model offered. Many of their components, like Railway, offer MCP servers so that their LLM can do useful things. They're using the Claude LLM to automate as much as they can.

So far, this is all a pretty typical setup. They pointed Claude at their code and gave it a "routine" task, and sent it to work. It toddled through the problem and encountered a credential issue. It "decided" that the fix for this issue was to delete a storage volume and recreate it. It scanned through the code to find a file containing an API key, found it, and then sent a POST request via cURL to delete the volume in question.

Jer writes:

To execute the deletion, the agent went looking for an API token. It found one in a file completely unrelated to the task it was working on. That token had been created for one purpose: to add and remove custom domains via the Railway CLI for our services. We had no idea — and Railway's token-creation flow gave us no warning — that the same token had blanket authority across the entire Railway GraphQL API, including destructive operations like volumeDelete. Had we known a CLI token created for routine domain operations could also delete production volumes, we would never have stored it.

Wait, the tokens you create in Railway all have god-level privileges? That sounds like a terrible idea. And you were storing the token in your code? We'll come back to this in a moment, but sure, this is bad, but you can just restore from backup, right?

The volume was deleted. Because Railway stores volume-level backups in the same volume — a fact buried in their own documentation that says "wiping a volume deletes all backups" — those went with it. Our most recent recoverable backup was three months old.

Oh. Oh no.

Now, I don't think it's literally true that Railway is storing your backups literally in the same volume as the thing they're backing up. I certainly hope not. But they do apparently delete your backups when you delete the volume associated with them. Which is a choice, certainly. A bad one. And one that they documented, according to Jer. It was, in his words, "buried" in the docs.

But let's go back to the tokens for a moment. I am not a Railway user, but I checked out the tool and went through the process of creating a project token. And while no, Railway does not give you big red flags warning you "Hey, this token can do ABSOLUTELY ANYTHING", it also never gives you an opportunity to scope the token. Which, I don't know about you, but the first thing I do when I create an authentication entity is try and figure out how to control its authorizations, because I assume at the start it doesn't have any. That'd be sane.

The scoping happens when you create the token, depending on what context you're in when you do it. There are only a handful of scopes, and no fine-grained permissions on API keys at all. The lowest level is "Project", which can do anything to a single environment- which does mean that even if you, like Jer's team, wanted to have a script that changed some DNS settings in production, that same key could be used to delete volumes in production. Which means you really really want to take care of that key, and you certainly don't want to leave it where some junior developer or bumbling AI agent can find it.

Jer also complains that Railway shouldn't allow an API call to take destructive actions without more protections, like forcing someone to type in the name of the thing being deleted or sending a confirmation email, or something. This, I'm more skeptical of. Most cloud providers don't offer anything like this in their APIs, at least that I've seen, because on a certain level, if you're invoking the API with the proper credentials, that's a big enough hill to climb that we can assume you've intended your action. The correct way to protect against this is properly scoped keys and keeping those keys secure and not just lying around in plain text. There's a certain aspect of understanding that you're using a potentially dangerous tool and need to take the responsibility for safety into your own hands; while a table saw can easily take some fingers off, it's perfectly safe when used correctly.

This is all bad, but how can we make it worse? Well, Jer demanded that Claude "explain itself". In a section called "The Agent's Confession", Jer highlights that the agent is able to identify the explicit rules that it failed to follow.

Read that again. The agent itself enumerates the safety rules it was given and admits to violating every one. This is not me speculating about agent failure modes. This is the agent on the record, in writing.

No, it is not the agent on record. I see this kind of thing a lot when people talk about LLMs. An LLM cannot explain its reasoning. It cannot go on "the record". It cannot confess to anything. While what it plops out when asked might be interesting, it is not an explanation. The only explanation is that it's a powerful statistical model trying to create a plausible string of tokens! It's simply looking at its context window and your prompt and trying to predict what it should say. It can tell you what rules it violated not because it understands the rules or knows it violated any rules, but because those rules are in its context window. If you ask it right, it'll confess to killing JFK and framing Oswald for the crime.

Jer then tries to ensure that Cursor takes some of the blame, pointing to Cursor's "guardrails" documentation. Except, here, the documentation is actually quite explicit about what those guardrails guarantee. If you're using a first-party tool, it will prohibit unsafe operations. When using 3rd party MCPs, like Railway's, the only guardrail is that it requires human approval for every action- unless you update your allowlist for that MCP. If you put them in your allowlist, the guardrails go away. Jer argues that tools should enforce more protection against LLM behaviors, but the problem with that is people- like the PocketOS team- turn those protections off. And like a lot of safety mistakes, they can get away with it all the way up until the point where they can't.

Jer follows this by listing off a pile of other times using Cursor has caused disasters, which isn't making the argument he thinks it is: yes, Cursor is dangerous, but those dangers are well known. It makes the choice to turn Cursor loose without strict supervision seem even more foolish.

Jer writes:

For now I want this incident understood on its own terms: as a Cursor failure, a Railway failure, and a backup-architecture failure that all happened to one company in one Friday afternoon.

It's also a PocketOS failure. It's a failure to properly assess the tools and environments you chose to use for your product. A failure to read and understand the docs for vital features, like *backups*. A failure to employ even the most basic safeguards. A failure to put a second's thought into key management- even if that key was only for DNS entries, you still shouldn't chuck it in source control. A failure to have a competent backup strategy. It's worth noting that they did restore from a three month old backup, which means they were at one point taking backups outside of Railway's volume setup. That was a wise decision. That they stopped is a failure.

The first rule of disaster retrospectives is that it's never one piece that's the failure. It's never one person's fault, one tool's fault, one vendor's fault. It's a systemic failure. Railway's keys should be finer grained. But also, you shouldn't leave keys lying around. Deleting backups when you delete the volume is a terrible idea, but having only one service for backups (that's also your primary site) is a terrible idea. Claude's ability to enforce its own guardrails should be better, but LLMs are notoriously dangerous about this: you should know better, and by your own words you did.

This is not an anti-AI post, or even a "get a load of this asshole" post. It is an "understand the damn tools you're using" post. Be critical of them. Don't trust them. Ever. Especially LLMs, because the worst part of an LLM is that it takes away the one thing computers used to be good at: predictable, deterministic behavior. But not just LLMs: don't trust your cloud provider, don't trust your infrastructure manager. Dig into them and understand how they work, and if they seem too complicated to understand, then they may be too complicated to trust.

Update: As pointed out in the featured comment below, Railway did finally get a backup restored. So they got their data back. Yay? From the post, Jer remains committed to making this a Railway issue and not a PocketOS issue.


Error'd: Parametric Projection

Roger C. gets on second base with an unforced error. "Not only is the content too large, the error message informing us of this is also too large to fit the visible space. A layered, double WTF."


"AWS Spellcheck Fail!" alerts Peter "If only someone at AWS knew the correct paramters to activate the spellcheck."


"How long is too long for a job to be open? " wonders Lincoln K. "I didn't even know LinkedIn existed 61 years ago, let alone was accepting postings... Though only 81 applicants in that time is hardly an impressive turn-out." For a "Vice President Operations and Quality Control", no less.


An anonymous Richard reports "This came through my door. On a card that, in order to get to my door, had my full address printed on it, including my ."


Oenophile Abroad Michael R. shares "My Macbook broke after being "exposed" to red wine. As a German in London it pleases me to see that the repair shop offers this time granularity."



CodeSOD: Cancel Catch

"This WTF is in Matlab" almost feels like cheating. At one place I worked, somebody's job was struggling through a mountain of Matlab code and porting it into C. "This Matlab code looks like it was written by an alien," also doesn't really get much traction- all Matlab code looks like it was written by an alien. This falls into the realm of "Researchers use Matlab, researchers may be very smart about their domain, but generally don't know the first thing about writing maintainable code, because that's not their job."

But let's take a look at some Matlab code Carl W found:

    try
        if (~isempty(fieldnames(bigStruct)) && isfield(bigStruct,'pathName'))
            [FileName, PathName] = uigetfile(bigStruct.pathName);
        else
            [FileName, PathName] = uigetfile(lastPath); %lastPath holds previous path
        end
    catch
        bigStruct = struct;
    end

The uigetfile function opens a file dialog box. When the user selects a file, FileName holds the filename, PathName holds the containing path. If the user doesn't select a valid file, or clicks "Cancel", both of those variables get set to 0. It's then up to the caller to check the return value and decide what happens next.

Which is not what happens here, obviously. The developer responsible seems to believe that it maybe throws an exception? And they can just catch it? Carl's best guess is that this is a "weird" way to catch the cancel button. But it does mean that FileName and PathName get set to 0, and those zeros propagate until something finally tries to open those files, at which point everything blows up and the user doesn't know why.


A Whale of a Problem

From our Anonymous submitter:

Our company creates graphs to visualize data. We have many small fish customers, but we have one whale who uses our product that is 90% of company revenue. (WTF number 1.)

So if he is not happy, it's all-hands-on-deck mode.

He complained that our APIs and charts are loading slowly for him. For 3 weeks, we've tried a TON of optimizations, including WTF 2: spinning up a special server he alone can hit.

Today, we found out that he's always complaining when he's in his car, driving from home to the office. But since he "totally has the best wifi money can buy," that isn't worth investigating.

WTF 3: thinking wifi and data are always 100% reliable in a car driving around.

Humpback whale breaching in Ballena Marine National Park

Our submitter highlights one of the major pitfalls of the so-called whale client: if they're a bad client, you're in for an extra-bad time.

As I lean harder into freelancing, I'm learning to scan the waters ahead of me for potential whales. My goal is to build up multiple small, diverse income streams, because I've had my own dangerous encounters with whales in the past.

At one employer of mine, there was Facebook, who acted as if they were our new owners rather than a new customer. They'd already produced flashy marketing videos of the sorts of solutions they planned to implement with our software, showing people delighted with the results. In meetings, these things were talked up as amazing game-changers. Meanwhile, I found all the things Facebook wanted to do horribly creepy and invasive.

Even worse, Facebook began dictating how our award-winning technical support should change to accommodate their whims, up to and including having a dedicated toady—er, support rep—who did nothing but field Facebook-related tickets, similar to a technical account manager (TAM).

That was the last straw for me. I left that company before I was forced to deal with any of Facebook's crap.

My second whale sighting occurred at a startup that'd landed Porsche, far and away their biggest client ever. All of a sudden, our timeline for adding new features and fixing bugs became Porsche's honey-do list. All of a sudden, the platform frequently crashed and became unusable for everyone because it couldn't handle the amount of traffic Porsche (and their clients) hurled at it.

On the other hand, there were several times in that startup's existence when a big wad of promised funding failed to materialize. Porsche kept the business afloat and literally kept my lights on.

I find it less than ideal to be at any company's mercy. I want a world that would neither spawn whales nor millions of startups named Sploink, Dink, and Twangle that promise to bring the power of AI to your dinner fork.

Have your own epic whaling adventures? Share with us in the comments!


CodeSOD: Lint Brush Off

A few years back, C# added the concept of "primary constructors". Instead of declaring the storage for class members and then initializing them in the constructor, you declare the constructor's parameters directly on the class declaration, and C# generates the constructor for you. It's all very TypeScript and very Microsoft, and certainly cuts down on some boilerplate.

Esben B's team isn't really using them in many places, but they are using a linter which is opinionated about them. So this in-line constructor causes the linter to complain:

    public DocumentNetworkController(ILookupClient service)

The linter wants you to switch this to a primary constructor. Esben didn't want to do that, and didn't want to change the global linter configuration, and so added a pragma to disable that particular warning:

#pragma warning disable IDE0290 // Use primary constructor
    public DocumentNetworkController(ILookupClient service)
#pragma warning restore IDE0290

The linter didn't like this. It threw a new warning: that this suppression wasn't needed. Which was news to Esben, as clearly the suppression was needed if you wanted to make the warnings go away. The obvious solution was to disable the warning that you didn't need to disable the warning:

#pragma warning disable IDE0079, IDE0290 // Use primary constructor
    public DocumentNetworkController(ILookupClient service)
#pragma warning restore IDE0290, IDE0079

Except this doesn't work. These pragmas take effect on the next line, which means you can't disable IDE0079 on the same line as IDE0290 and expect it to work. Which means the final version of the code looked like this:

#pragma warning disable IDE0079 // Disable warning about not needed suppression
#pragma warning disable IDE0290 // Use primary constructor
    public DocumentNetworkController(ILookupClient service)
#pragma warning restore IDE0290, IDE0079

Esben writes:

So the nice recommendation to use a primary ctor ended up with 3 lines of annoying boilerplate code. Good times \o/

While yes, this is frustrating, I will say there's an element of "when the table saw keeps taking fingers off, that may be more of a you problem." I don't know the details, so I can't say, "just change the linter config or adopt its recommendation" and claim that the problem goes away, but when the tool hurts you, it's a definite sign of one of two things: it's either the wrong tool, or you're using it wrong.


CodeSOD: The JSON Template

We rip on PHP a lot, but I am willing to admit that the language and ecosystem have evolved over the years. What started as an ugly templating language is now just an ugly regular language.

But what happens when you still really want to do things with templates? Allison has inherited a Python-based WSGI application which rejects any sort of formal routing or basic web development best practices. Their way of routing requests is simply long chains of "if condition then invokeA elif otherCondition then invokeB". Sometimes, those conditions will directly set the MIME type on the HTTP response.
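Allison didn't share the routing code itself, but to give a flavor of the approach, a hypothetical WSGI app routed this way (every handler and path below is invented) looks roughly like this:

def application(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    method = environ.get("REQUEST_METHOD", "GET")

    if path == "/" and method == "GET":
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"<h1>Home</h1>"]
    elif path == "/items" and method == "GET":
        start_response("200 OK", [("Content-Type", "application/json")])
        return [b'{"success": true, "items": {}}']
    elif path.startswith("/reports/"):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"report body goes here"]
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]

Multiply that chain by every endpoint, with each branch also deciding its own MIME type inline, and you have the shape of the thing.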

They do use a templating library called Mako for generating their responses. They use it for their HTML responses, obviously. They also use it for their JSON responses, generating code like this:

{
    "success": true,
    "items": {
        %for item in items_available.keys():
        "${item}": ${items_available[item]}${',' if not loop.last else ''} 
        %endfor
        }   
}

The %for and matching %endfor mark the Python code off, which generates JSON via string-munging, complete with the check to make sure we're not on the last iteration of the loop.

Like so much bad code, this offers a degree of fractal wrongness. Instead of iterating over the keys and fetching the items inside the loop, you could iterate with for key, value in items_available.items() - and according to the Mako docs, that for is just a regular Python for loop. That we're just outputting the contents of the dictionary is itself potentially a problem - sure, if we know the types of the dictionary, we'll know that whatever it is can be output in the body of a JSON document, but do we really think this code is using type annotations? I don't. And for a RESTful web service, I'm always going to feel weird about using a success field when ideally the HTTP status code could convey most of that information (and yes, I know there are reasons to still put status in the body, I just hate it).

Of course, the real issue is just: Python's built-in JSON serialization is actually pretty advanced. And performant! You don't need any of this; you could just do something like:

return json.dumps({"success": True, "items": items_available})

No templates. No formatting. No worries about how the data gets represented. Well, still worries, because the JSON serializer will throw exceptions if it doesn't know what to do with a type. But then at least you get that exception on the server side and aren't sending the client a malformed document.
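And if the dictionary does contain something the serializer can't handle natively (a Decimal, a datetime, some custom object), you can at least make that failure mode explicit. A minimal sketch, assuming nothing about what's actually inside items_available:

import json
from datetime import datetime
from decimal import Decimal

def to_jsonable(value):
    # Fallback invoked by json.dumps for types it can't serialize natively.
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__} to JSON")

items_available = {"widget": Decimal("9.99"), "restocked": datetime(2025, 4, 1)}
print(json.dumps({"success": True, "items": items_available}, default=to_jsonable))

The default hook keeps the exception on the server side, where it belongs, instead of shipping half a document to the client.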

In any case, this is a good demonstration that you can write bad PHP in any language.


Error'd: April Showers

"RFC 1738 (and 3986) disagree" and so does Daniel D. "Reddit API has some weird app creation going on with lots of recently migrated and undocumented stuff. But having redirect URL set to localhost (or 127.0.0.1) usually works. Well, if you don't disagree with Sir Tim Berners-Lee about what URL is. Which Reddit does. hostnumber = digits "." digits "." digits "." digits". I'd file this one with all the websites that try to perform validation on email addresses, and get it wrong.


"Why aren't we getting any resumes?" wondered Fred G. "This is a snippet from a job posting. I'm sure it worked perfectly when HR tested it."


"Service required..." was Chris H.'s title for this gem. "My 2022 Chevrolet has been at the dealer for recall service for two weeks now, "waiting for parts". That doesn't stop GM from emailing every few days with a reminder that the car needs the recall service, and inviting me to schedule it at a dealer (that isn't actually a dealer) located a convenient 2500 mile drive from my home (about 200 times the distance to the dealer where the car currently sits), and providing a non-existent placeholder phone number to contact them at to schedule the recall service."


"How to subtly tell your customers that you don't wish to be contacted" explains Yuri. "The bank's staff must be wondering why no one wants to talk to them...Is it their suit's brand that is throwing everyone off? Can they blame it on COVID?"


"Bad money formatting by tax software" Adam R. complained. "I'm ashamed to admit it, but yes, I did pay Intuit money to file my taxes. This should really be a free service provided by the government, but, y'know, *lobbying*. You'd think that a business focused on tax preparation software would know how to properly format currency values, but in this case they failed to set the proper number of decimal points."



CodeSOD: Tune Out the Static

Henrik H (previously) sends us a simple representative C# line:

static void GenerateCommercilaInvoice()

This is a static method which takes no parameters and returns nothing. Henrik didn't share the implementation, but this static function likely does something that involves side effects, maybe manipulating the database (to generate that invoice?). Or, possibly worse, it could be doing something with some global or static state. It's all side effects and no meaningful controls, so enjoy debugging that when things go wrong. Heck, good luck testing it. Our best case possibility is that it's just a wrapper around a call to a stored procedure.

This method signature is basically a commercila for refactoring.


Representative Line: Comment Overflow

Today, we look at a representative comment, sent to us by Nona. This particular comment was in a pile of code delivered by an offshore team.

// https://stackoverflow.com/questions/46744740/lodash-mongoose-object-id-difference/46745169

"Wait," you say, "what's the WTF about a comment pointing to a Stack Overflow page. I do that all the time?"

In this case, it's because this particular comment wasn't given any further explanation. It also wasn't in a block of code that was doing anything with either lodash, Mongoose, or set differences. It was, however, repeated multiple times throughout the codebase, because the entire codebase was a pile of copy-pasta glued together with the bare minimum code to make it work.

In at least one place, the comment was probably correct and helpful. But it got swept up as part of a broader copy/paste exercise, and now is scattered through the code without any true purpose.


Turning Thirty

Eric O worked for a medical device company. The medical device industry moves slowly, relative to other technical industries. Medical science and safety have their own cadence, and at a certain point, iterating faster doesn't matter much.

Eric was working on a new feature on a system that had been in use for thirteen years. This new feature interacted with a database which stored information about racks of test tubes, and Eric's tests meant creating several entries for racks of test tubes. And that's when Eric discovered that the database only allowed thirty racks. Add any more, and it would just roll right back over to one.

This was odd. The database was small- less than 40MB, even in production- and there were automatic tasks to purge old data for compliance purposes. Why a hard limit of thirty?

Eric had only been at the company for a year, so he asked one of the more senior team members, Lester. "Oh yeah, that was before my time. You should probably ask Carl."

Later that day, Eric happened to bump into Carl around the coffee maker, and asked the question. "Oh, yeah, I do vaguely remember something about that. It was in the requirements for the product. I thought it was weird, but didn't think too much about it. You should probably ask Elise, she's been here like twenty years."

Well, now it was getting curious. Eric went over to the "old building", as it was named, the original office for the company on the other side of the parking lot. Most of the offices had moved to the new building a decade earlier, and it mostly served as fabrication and storage, but a few offices remained.

Elise was on the third floor, down a poorly lit hallway, sitting in an office with water-stained acoustical tile in its ceiling. "Oh, yeah, I put that into the requirements document. It's funny, I thought it was weird too, but the system you're working on was a replacement for an older system. Our requirements were derived from those. Let me think… Irving worked on that, but he's dead, god rest him. Penny is retired. Oh, you know, Humbert is still around. He didn't work on that, but he worked on some of the systems that came before that. He's upstairs and on the other side of the building."

Eric went upstairs and to the other side of the building. The fourth floor had been last remodeled circa 1985, and the ugly industrial paint on the wall was made even uglier by the fact that someone had replaced most of the fluorescent tubes with LEDs. Most. The mismatched color temperature started Eric down the path of a headache.

Humbert was in an office similar to Elise's. On his desk was a plaque commemorating 40 years of service with the company. Eric asked about the limitation, and Humbert laughed.

"You're working on the latest version of a product that initially started on an old PDP-11 running MUMPS. I mean, the first versions, anyway. We ran to desktop computers as fast as we could. I wrote a version for DOS in… oh… '86? I knew none of the facilities we worked with had more than ten or fifteen racks of tubes, and I needed somehow to limit the size of the database so it all fit on a single 5 1/4" floppy disk. I picked thirty, because it seemed like a good round number. Honestly, I'm shocked that the limit still exists."

So was Eric. There had been several ground-up rewrites since 1986, before the one Eric maintained had been released thirteen years ago. Each one of them had chosen to maintain the same limitation, without ever considering why it existed. The rule had simply been copied, mindlessly, for 40 years.

"I'm kind of impressed," Eric said to Humbert, "in a horrified way."

"Me too, kid, me too."


CodeSOD: Good Etiquette

"Here, you're a programmer, take this over. It's business critical."

That's what Felicity's boss told her when he pointed her to a network drive containing an Excel spreadsheet. The Excel spreadsheet contained a pile of macros. The person who wrote it had left, and nobody knew how to make it work, but the macros in question were absolutely business vital.

Also, it's in French.

We'll take this one in chunks. The indentation is as in the original.

Public Sub ExporToutVersBaseDonnées(ClasseurEnCours As Workbook)
Call AffectionVariables(ToutesLesCellulesNommées)
Call AffectationBaseDonnées(BaseDonnées)
BaseDonnées.Activate

The procedures AffectionVariables and AffectationBaseDonnées populate a pile of global variables. "base de données" is French for database, but don't let the name fool you- anything referencing "base de données" is referencing another Excel file located on a shared server. There are, in total, four Excel files that must live on a shared server, and two more which must be in a hard-coded path on the user's computer.

Oh, and the shared server is referenced not by a hostname, but by IP address- which is why the macros were breaking on everyone's computer; the IP address changed.

Let's continue.

'Vérifier si la ligne existe déjà.
        If ClasseurEnCours.Sheets("DATA").Range("Num_Fichier") = 0 Then
        Num_Fichier = BaseDonnées.Sheets(1).Range("Dernier_Fichier").Value + 1
Insérer_Ligne: '(étiquette Goto) insérer une ligne
    Application.GoTo Reference:="Dernière_Ligne"
            Selection.EntireRow.Insert
'Copie les cellules (colonne A à colonne FI) de la ligne au-dessus de la ligne insérée.
            With ActiveCell
                    .Offset(-1, 0).Range("A1:FM1").Copy
'Colle le format de la cellule précédemment copiée à la cellule active puis libère les données du presse papier
                    .PasteSpecial
                    .Range("A1:FM1").Value = ""
'Se repositionne au début de la ligne insérée.
                    .Range("A1").Select
            End With
            Application.CutCopyMode = False

Uh oh, Insérer_Ligne is a label for a Goto target. Not to be confused with the Application.GoTo call on the next line- that just selects a range in the spreadsheet.

After that little landmine, we copy/paste some data around in the sheet.

That's the If side of the conditional, let's look at the else clause:

        Else
Cherche_Numéro_Fichier: ' Chercher la ligne ou le numéro de fichier est égale à NumFichier.
                        While ActiveCell.Value <> Num_Fichier
                If ActiveCell.Row = Range("Etiquettes").Row Then
                    GoTo Insérer_Ligne
                End If
                ActiveCell.Offset(-1, 0).Range("a1:a1").Select
            Wend
            'Vérifier le numéro d'indice de la ligne active.
                If Cells(ActiveCell.Row, 165).Value <> ClasseurEnCours.Sheets("DATA").Range("Dernier_Indice") Then
                    ActiveCell.Offset(-1, 0).Range("A1:A1").Select
                    GoTo Cherche_Numéro_Fichier
                End If
            ActiveCell.Offset(0, 0).Range("A1:FM1").Value = ""
        End If

We start with another label, and… then we have a Goto. A Goto which jumps us back into the If side of the conditional. A Goto inside of a while loop, a while loop that's marching around the spreadsheet to search for certain values in the cells.

After the loop, we have another Goto which will possibly jump us up to the start of the else block.

The procedure ends with some cleanup:

'----- 
' Do some stuff on the active cell and the following cells on the column
'-----
BaseDonnées.Close True
Set BaseDonnées = Nothing
End Sub

I do not know what this function does, and the fact that the code is largely in a language I don't speak isn't the obstacle. I have no idea what the loops and the gotos are trying to do. I'm not even a "never use Goto ever ever ever" person; in a language like VBA, it's sometimes the best way to handle errors. But this bizarre time-traveling flow control boggles me.

"Etiquettes" is French for "labels", and it may be bad etiquette but I've got some four letter labels for this code.


Error'd: Having a Beastly Time

It's time again for a reader special, and once again it's all The Beast In Black (there must be a story to that nick, no?).

"MySQL is not better than your SQL," he pontificated, "especially when it comes to the Workbench Migration Wizard"


"Sadly," says he, "Not even gmail/chromium either."


"Updated software is available, but there are no updates!" he puzzled. "Clicking Install Now just throws that dialog right back in my face. I'm re-cursing." Zero, one, does it really make a difference?


"Questions" The Beast in Black "I do, in fact, have a question..."


One of the foundational guides to my [lyle, not bib] engineering career was Jon Bentley's Programming Pearls. These are not those.
"Veni, vidi: vc. No pearls of wisdom here, just litter," says The Beast.



CodeSOD: We'll Hire Better Contractors Next Time, We Promise

Nona writes: "this is the beginning of a 2100 line function."

That's bad. Nona didn't send us the entire JavaScript function, but sent us just three early lines, which definitely raise concerns:

if (res.length > 0) {
  await (function () {
    return new Promise((resolve, reject) => {

We await a synchronous function which returns a promise, passing a function to the Promise constructor. As a general rule, you don't construct promises directly, you let asynchronous code generate them and pass them around (or await them). It's not a thing you never do, but it's certainly suspicious. It gets more problematic when Nona adds:

This function happens to contain multiple code repetition snippets, including these three lines.

That's right, this little block appears multiple times in the function, complete with the anonymous function getting passed to the Promise constructor.

No, the code does not work in its current state. It's unclear what the 2100 line function was supposed to do. And yes, this was written by lowest-bidder third-party contractors.

Nona adds:

I am numb at this point and know I gotta fix it or we lose contracts

Management made the choice to "save money" by hiring third parties, and now Nona's team gets saddled with all the crunch to fix the problems created by the "savings".


EnshittifAIcation

Photo by Ehimetalor Akhere Unuabona on Unsplash

Yesterday morning, first thing after waking up, I checked my emails. One of them was from a client - a sharp person, but not a tech expert - forwarding a message from one of their "digital marketplaces". They claimed that during site crawling, their bot upgrades the connection to HTTP/2, and that this somehow causes issues on their end, so they were asking us to disable HTTP/2 to fix the problem.

I contacted Alex directly - the person (spoiler: not a person) who had sent the email - explaining that if their bot has trouble with HTTP/2 (which, on the contrary, provides significant benefits for the e-commerce experience in question), that's their problem, not ours, and they should fix it. Completely unprompted, I received something unexpected in reply: a guide on how to configure Apache to do what they wanted. The problem? Not only did it completely ignore my stated position, but we don't use Apache - we use nginx. And, I should add, their guide was entirely wrong. I replied pointing all of this out and finally asked to be "escalated to a human, since I was clearly talking to an AI that wasn't understanding any of my responses". The reply was blunt: "That's not possible for this type of issue. Follow our guide or we will suspend your service and your e-commerce visibility." For me, obviously, that's a hard pass. For my client, though, it's a real problem - an intelligent person who understood the situation, but still a problem to solve.


Over the past few months, I've been witnessing a dramatic increase in botnet attacks targeting some of the servers I manage, especially e-commerce ones. These aren't directed at me personally - they also hit servers I manage on behalf of clients. At first I thought they were AI scrapers, but the traffic comes from everywhere, especially from residential connections scattered around the world. I believe these are deliberate disruption campaigns, a side effect of the turbulent geopolitical climate we're living through.

On several of these e-commerce servers, we decided to implement geo-blocking, as I've described previously on this blog. Normally, once you've identified your whitelist countries and the shop isn't a global operation, everything works fine. In other cases, problems arise.

A few days ago, a partner of one of my clients - a company that provides services and needs access to some prepared XML feeds - started complaining they could no longer connect. I asked them for the IP pool they connect from, or at least the country their connections originate from. Their vague reply was: "We can't provide that information because we don't have a fixed IP or set of IPs." They completely ignored the question about the country. I pushed further, but got nowhere - different "people", giving different answers, all wildly off the mark and ignoring what I was actually asking, insisting instead that I whitelist their user agent. I explained, repeatedly, that the block is at the firewall level - meaning I never even see their user agent: if the connection is dropped, there's no handshake, no HTTP headers, nothing. It didn't matter. They kept repeating the same thing without engaging with what I wrote. Eventually they went directly to my client. I'll paste the exact text:

We're having some trouble accessing the site and downloading the XML, as they both currently require a VPN connection. To ensure our Lambda functions can run correctly, could you please:

  • Remove the location-based restrictions for our access;
  • Or, allow the User-Agent "REDACTED" in your firewall/server settings?

Please let us know which option works best for you.

Let's break this down:

  • "Require a VPN connection" - who said anything about a VPN? Pure hallucination.
  • "Remove the location-based restrictions for our access" - they never once answered: which location?
  • "Allow the user agent" - I explained, multiple times, that the block is at the firewall level. The connection is dropped before any handshake occurs. There is no user agent to allow.

This morning, another client writes: "The marketing consultancy wants all the server load graphs to get an idea of where we stand." This is the second time in just a few days I've received a request like this. I send both the graphs and the full specs of the dedicated server in use - average load under 5%. The response was staggering: "The internal team, supported by the most advanced AI, believes your current setup is not adequate for the industry, load, and audience you're targeting, and recommends migrating to a cloud VPS with AT LEAST 8 GB of dedicated RAM to ensure sufficient resources, as the current ones are insufficient."

The current ones? 128 GB of RAM. Two modern CPUs. 48 cores total. If we followed their advice, the site would be down within five minutes - and that's just counting legitimate traffic. My client, unaware of the technical differences, asks me if we can implement what they're suggesting.


The shift was abrupt - not unlike when an intern arrives convinced they already know everything, often with the best of intentions: bringing fresh air into an environment that needs "modernising". But with an intern, you can talk. That same confidence often turns into curiosity, hunger to learn, real experience. I've watched eager interns grow into excellent professionals - people who eventually surpassed me in skill and success, and that felt genuinely satisfying, knowing I'd contributed, at least in part, to their growth. With AI, this is impossible. It doesn't grow, doesn't listen, doesn't update its mental model based on what you write back - and above all, it doesn't know what it doesn't know.

That's why I'd like companies to consider that AI systems are stochastic machines, not experts. They can solve some problems, but there's a limit. There will always be a limit, at least with current technology, and we can't afford to ignore it. The damage risks far outweighing the "savings" generated.


The enormous problem with my work these days is the extreme confidence that certain companies project, replacing humans - even senior ones - with AI, with no right of appeal. The result is monstrous confusion, enormous wasted time for everyone, and a widespread erosion of reliability, all papered over by the AI's unshakeable assertiveness - and by those who believe these systems are the Answer to the Ultimate Question of Life, the Universe, and Everything.

Rewarding confidence over actual competence is a bug humanity has always had. It has produced disasters throughout history, it is producing disasters now, and not only in the tech world.

So I find myself wondering: if they're so convinced that AI is better than senior professionals, why don't they replace the bosses with AI? I'm fairly confident the decisions would be considerably better - and humans would end up exactly where they should be.

Why I Love FreeBSD


When I first laid eyes on the FreeBSD Handbook, back in 2002, I couldn't believe what I was seeing. Six years of Linux, a relationship I've written about elsewhere, across various distributions, had trained me to hunt for documentation in fragments: often incomplete, often outdated, sometimes already stale after barely a year. Here was an operating system that came with a complete, accurate, up-to-date (as much as possible), detailed manual. I was already a convinced believer in Open Source, but I found myself reasoning in very practical terms: if the team behind this OS puts this much care into its documentation, imagine how solid the system itself must be. And so I decided to give it a try. I had a Sony Vaio with no room for a dual boot. I synced everything to a desktop machine with more space, took a breath, and made a decision: I'd install FreeBSD on that laptop and reinstall Linux when the experiment was over.

Spoiler: FreeBSD never left that machine.

At the time I had no idea that this experiment would shape the way I design and run systems for the next twenty years.

I realized almost immediately that GNU/Linux and FreeBSD were so similar they were completely different.

The Unix inspiration was the same, but everything worked differently - and the impression was that FreeBSD was distinctly more mature, less chaotic, more focused. A magnificent cathedral - a form then widely criticized in the circles I moved in - but one that had certain undeniable virtues. Back then I compiled the entire system from source, and I noticed right away that performance was better on that hardware than Linux had ever been. Not only that: Linux would overheat and produce unpredictable results - errors, sudden shutdowns, fans screaming even after compilation finished. My Linux friends continued to insist it was a “hardware problem”, but FreeBSD handled the load far more gracefully. I could read my email in mutt while compiling, something that was practically impossible on Linux, which would slow to a crawl. The fans would settle within seconds of the load ending, and the system felt genuinely more responsive. I never experienced a crash. I was running KDE on all my systems at the time, and the experience on FreeBSD was noticeably superior - more consistent and steady performance, none of the micro-freezes I'd come to accept on Linux, greater overall stability. The one drawback: I compiled everything, including KDE. I was a university student and couldn't leave my laptop in another room - the risk of an "incident" involving one of my flatmates was too real - so I kept it within arm's reach, night after night, fans spinning as KDE and all its applications compiled. At some point I figured out exactly how long the KDE build took, and started using it as a clock: fans running meant it was before four in the morning. Fans silent meant I'd made it past.

The Handbook taught me an enormous amount - more than many of my university courses - including things that had nothing to do with FreeBSD specifically. It taught me the right approach: understand first, act second. The more I read, the more I wanted a printed copy to keep at my desk. So I convinced my parents that I needed a laser printer “for university work”. And the first thing I printed, of course, was the Handbook. That Handbook still contains relevant information today. There have been significant changes over the past twenty-four years, but the foundations are still the same. Many tools still work exactly as they did. Features have been added, but the originals still operate on the same principles. Evolution, not revolution. And when you're building something meant to last, that is - in my view - exactly the right philosophy. Change is good. Innovation is good. On my own machines I've broken and rebuilt things thousands of times. But production environments must be stable and predictable. That, still today, is one of the qualities I value most in every BSD.

Over the years, FreeBSD has served me well. At a certain point it stepped down as my primary desktop - partly because I switched to Mac, partly because of unsupported hardware - but it never stopped being one of my first choices for servers and any serious workload. As I often say: I only have one workstation, and I use it to access hundreds of servers. It's far easier to replace a workstation - I can reconfigure everything in a couple of hours - than to deal with a production server gone sideways, with anxious clients waiting or operations ground to a halt.

FreeBSD has never chased innovation for its own sake. It has never chased hype at the expense of its core purpose. Its motto is "The Power to Serve" - and to do that effectively, efficiently, securely. That is what FreeBSD has been for me.

I love FreeBSD because it has served me for decades without surprises. I love FreeBSD because it innovates while making sure my 2009 servers keep running correctly, requiring only small adjustments at each major update rather than a complete overhaul.

I love FreeBSD because it doesn't rename my network interfaces after a reboot or an upgrade.

And because its jails - around since 2000 - are an effective, efficient, secure, simple, and fully native mechanism: you can manage everything without installing a single external package. I love FreeBSD because ZFS is native, and with it I get native boot environments, which means safe, reversible upgrades. Or, if you're running UFS, you change a single character in fstab and the entire filesystem becomes read-only - cleanly, with no kludges. I love FreeBSD because bhyve is an efficient, lightweight, reliable hypervisor. I love it for its performance, for its features, for everything it has given me.

But I love FreeBSD also - and above all - for its community. Around the BSDs, in general, you find people driven by genuine passion, curiosity, and competence. Over the past twenty years the tech world has attracted many people who appear to be interested in technology. In reality, they are often just looking for something to monetize quickly, even at the cost of destroying it. In the BSD community, that is far less common. At conferences I've had the chance to meet developers in person - to understand their spirit, their skill, and yes, their passion. Not just in the volunteers who contribute for the joy of it, but in those funded by the Foundation as well. And then there are the engineers from companies that rely heavily on FreeBSD - Netflix among them - and they bring the same quality: that engagement, that enthusiasm, that tells you FreeBSD isn't a job for them. It's a pleasure. Which is one of the reasons why every time I attend a BSD conference, I come home even more in love with the project: the vibe of the community, the dedication of the developers, the presence of a Foundation that is strong and effective without being domineering or self-important - which, compared to the foundations of other major Open Source projects, makes it genuinely remarkable. Faces that have been part of this project for over twenty years, and still light up the moment they find their friends and start talking about what they've been working on. That positivity is contagious - and it flows directly into the code, the project, the vision for what comes next. Because that's the heart of it. FreeBSD has always been an operating system written by humans, for humans: built to serve and to be useful, with a consistency, documentation, pragmatism, and craftsmanship that most other projects - particularly mainstream Linux distributions - simply don't have. The Foundation wants to hear from ordinary users. It actively promotes the kind of engagement that brings more people to FreeBSD. Not because big tech companies are pushing to create dependency, but because it believes in the project.

So thank you, FreeBSD, for helping me stay passionate for so many years, for keeping my projects running, for keeping my clients' servers up and my data safe. Thank you, FreeBSD, for never wasting time chasing the trend of the moment, and instead focusing on doing things right. Thank you, FreeBSD, for all the extraordinary people - from across the entire BSD community - you've brought into my life. Friends, not colleagues. Real people. The genuine kind. And when the people running something still believe in it - truly believe in it, after all these years - and the project keeps succeeding, that tells you there is real substance underneath. In the code. In the people. In the community.

FreeBSD doesn't want to be "the best and greatest". It wants to serve.

The Power to Serve.

I can't cancel GitHub Copilot

Back when Copilot first came out, I immediately disliked it. But I decided to give it a fair shake and tried to evaluate it in good faith. I wasn’t interested in paying for it, but they had a form for FOSS community members to apply for a free subscription, so I filled it out and gave it a shot. Once approved I spent 15 minutes (successfully) convincing it to write a Python script that printed out the lyrics to “All Star” verbatim, and haven’t touched it since.

Since then, like clockwork I get an email every month informing me that my subscription has been automatically renewed.

Hi there,

Thank you for renewing your free access to GitHub Copilot. Your access to GitHub Copilot will be reviewed on 2026-05-31. GitHub Copilot checks eligibility monthly per our policy. No steps are needed on your end.

We hope you enjoy using GitHub Copilot and participating in the developer community.

I’m not being charged for it, so it’s a matter of principle more than anything: I ought to be able to turn this off. But I cannot find anything in the GitHub settings which would allow me to cancel this free “subscription”.

A screenshot of GitHub's copilot settings. Everything which can be disabled is disabled, but many features cannot be disabled. There is no obvious way to cancel the subscription.

GitHub support has been less than helpful:

A screenshot of a support ticket opened on March 26th asking for assistance in cancelling my Copilot subscription. I asked for an update on April 21st. There is no response from GitHub.

How do I get rid of this thing!

Addressing the harassment

Kiwi Farms is a web forum that facilitates the discussion and harassment of online figures and communities. Their targets are often subject to organized group trolling and stalking, as well as doxing and real-life harassment. Kiwi Farms has been tied to the suicides of three people who were victims of harassment by the website.

Wikipedia: Kiwi Farms


About three years ago, a thread on Kiwi Farms was opened about me. In the years since, it has grown to about 1,200 posts full of bigots responding to anything and everything I do online with scorn, slurs, and overt bigotry. The thread is full of resources to facilitate harassment, including, among other things, all of my social media profiles, past and present, a history of my residential addresses, my phone numbers, details about my family members, a list of my usernames and password hashes from every leaked database of websites I have accounts on, and so on. Most of my articles or social media posts are archived on Kiwi Farms and then subjected to the most bigoted rebuttals you can imagine. Honestly, it’s mostly just… pathetic. But it’s a problem when it escapes containment, and it’s designed to.

Kiwi Farms is the most organized corner of the harassment which comes my way, but it comes in many forms. On Mastodon, for example, before I deleted my account I would often receive death threats, or graphic images and videos of violence against minorities. I have received a lot of hate and death threats over email, too, several of which I confess that I took some pleasure in forwarding to the sender’s employer.

One of the motivations for this harassment is to “milk” me for “drama”. The idea is to get my hackles up, make me fearful for my safety, and alienate me from my communities, with the hope that it will trigger an entertaining meltdown. Many people respond poorly to this kind of harassment – that’s the idea, really – and it often makes the situation worse. Responding to it can legitimize the abuse, elevate it into the discourse, draw more attention to it, and stoke the flames. It can make the victim look bad when they respond emotionally to harassment designed to evoke negative emotions. I have left it unaddressed for a long time in order to subvert this goal, and address it now with a cool head in a relatively quiet period in the harassment campaign.

The harassment waxes and wanes over time, usually picking up whenever I write a progressive blog post that gets some reach. It really took off after a series of incidents in which I called for the Hyprland community and its maintainers to be held to account for the bigotry and harassment on their Discord server (1, 2) and when I spoke out against Richard Stallman’s prolific and problematic public statements regarding the sexual abuse of minors (3).

The abuse crescendoed in October of 2024, when I was involved in editing The Stallman Report. The report is a comprehensive analysis of Richard Stallman’s problematic political discourse regarding sexual harassment, sexual assault, and the sexual abuse of minors, and it depends almost entirely on primary sources – quotes from Stallman’s website which remain online and have not been retracted to this day. The purpose of the report was to make a clear and unassailable case for Stallman’s removal from positions of power, make specific recommendations to address the underlying problems, and to stimulate a period of reflection and reform in the FOSS community. It didn’t achieve much, in the end: the retaliation from Stallman’s defenders was fiercer and more devoted than the support from those who saw the report’s sense.

The other authors and I asserted our moral rights to publish anonymously, motivated by our wish to reduce our exposure to the exact sort of harassment I’ve been subjected to over the years. However, I was careless in my opsec during the editing process, and it was possible to plausibly link me to the report as a result, leading to a sharp increase in harassment.


This brings me to a retaliatory, defamatory “report” published about me in the style of the Stallman Report.1 This report is, essentially, a distillation of the Kiwi Farms thread on me, sanitized of overt bigotry and presented in a readily linkable form in order to stalk me around the internet and enable harassment. It’s used to discredit anything I do online and push for my exclusion from online communities, by dropping the link on Hacker News, Reddit, GitHub or Codeberg issues, etc, anywhere myself or my work is mentioned, or used to discredit the Stallman Report by discrediting one of its unmasked authors.2

The report is pretty obviously written in bad faith and relies on a lot of poor arguments to make the case that I’m a misogynist and a pedophile, charges I deny. It also accuses me of being a hypocrite, which I acknowledge in general terms, because, well, who isn’t. The key thing I want people who encounter this report to keep in mind is that this is the “polite” face of an organized harassment campaign.

Most reasonable readers easily dismiss the report because it is rather transparent in its bad faith. However, someone who reads it in good faith, just trying to do their due diligence, might come away from it with some reasonable concerns. Consider the following quote from my long-deleted Reddit account, /u/sircmpwn:

I’m of the opinion that 14 year old girls should be required to have an IUD installed. Ten years of contraception that requires a visit to the doctor to remove prematurely.

This comment was written 13 years ago, and I don’t stand by what I wrote. I was 19 at the time, and I was a moron. My mother had me when she was 23 years old, and the abuse I suffered at her hands during my childhood was severe, and I generalized this experience to all women. When I wrote this comment, I was one year removed from the abuse, living alone and in poverty, and early in a life-long process of coming to terms with the abuse and figuring out how to be a well-adjusted adult after 18 long years of abuse and isolation.

But an explanation is not an excuse. This comment was reprehensible, as were many of the awful ideas I held at the time. Many years later, I can recognize that this comment is misogynistic, denies the agency of children and women over their own bodies, disparages the many, many mothers who do a wonderful job raising children in difficult circumstances, and is based in argumentation which can reasonably be related to eugenics. This comment was just awful – there’s a reason this was deleted. I apologize to anyone who read it at the time, or comes across it now, and is justifiably insulted.

I don’t feel that it’s necessary to rebuke most of the report. But, there is a grain of truth in the report, the grain of truth that led me to retract my shitty Reddit comments and reflect on myself, and that grain of truth is this: in early adulthood, I was a huge asshole.


I have had more than my fair share of harmful ignorance, bad takes, sexism and misogyny, transphobic and homophobic beliefs, and worse. Moreover, I have verbally abused many people and made many of my own arguments in bad faith to support bad conclusions. Some of the people who read this will recall having found themselves at the wrong end of my verbal abuse and harassment.

It’s important for me to take responsibility for this period of my life, and, in dismissing bad faith criticisms of myself, to carefully avoid dismissing good faith criticisms in the same fell swoop.

I’m not really sure how to deal with this part of my life appropriately. I have apologized to a few people individually, but it’s not a scalable solution and with many people I have no business re-opening wounds to salve my own conscience. I can offer a general apology, and I will. I’ve never found the right moment to say it, but now will do: I apologise, sincerely, to everyone who I have harmed with verbal abuse and with hateful and problematic rhetoric. If you have had a bad experience or experiences with me, and there’s anything you want from me that can help you heal from that experience – a personal apology, for example – please reach out to me and ask.

That said, apologies alone aren’t enough. I believe in restorative justice, in growing and mending wounds and repairing harm done, and I set myself seriously to this task over many years. I have gone to therapy, spoken with close friends about it, and taken structural action as well: I have founded support groups and worked one-on-one with many of the people whose politics and behavior I object to. I want an amicable end to bigotry and bullying, for bigots and bullies like my former self to look forward to, to provide a path that doesn’t require them to double down. It’s not easy, and not everyone manages, but I have to look at myself and see the path I’ve taken and imagine that it’s possible, because what’s left for the likes of me if not?

This part of my past brings me a great deal of shame, and that shame motivates me to grow as a person. In a certain sense, it is an ironic, cruel privilege to have had so much cause to reflect on myself, to drive me to question myself and my ideas, and become a much better person with much more defensible ideas. It has driven me to study feminism, social justice, racial justice, intersectionality, LGBTQ theory, antifascism, and to find the intersections in my own life and strive to act out of a more legitimate sense of justice.

I’m often still a firebrand, but I’ve chosen much better hills to die on. My passion is invested in making a more just world, building safe and healthy communities, elevating my peers, and calling for justice and a just society. I have taken the lessons I have learned and tried to share them with other people, and to stand up for what I can now say I know is right, both online and in real life. Through a process of learning, reflection, and humility, I acknowledge that I have done a lot of harm in my youth. To repair this harm, I have committed myself to doing more than enough good now to make sure that the world is a better place when all is said and done. That’s what justice means to me when I turn my principles inwards and hold myself accountable.


So where do we go from here?

The response to my progressive beliefs and activism is reactionary backlash, doxing, harassment, and death threats targeting me and my family, all of which is likely to escalate in response to this post, and none of which is defensible. On the other hand, I understand that the consequences for my own reactionary past are, in some cases, alienation – and, honestly, fair enough.

But I don’t want you to confuse my honest faults with the defamation and harassment I endure for standing up for my honest strengths. If you feel generous and optimistic about who I am today, and you recognize my growth, and wish for an ally in the fight for what’s right, your good faith and solidarity mean the world to me. I would appreciate it if you would express your support and rebuke harassment when you see it, and help keep me honest as I continue a life-long process of learning and growth.

If I’ve hurt you, and you want to seek reconciliation, I make myself available to you for that purpose. If I’ve hurt you, and you simply don’t care to be hurt again, I’m sorry – I understand where you’re coming from, and have made my peace with it.

Please send words of support and/or death threats to drew@ddevault.org.

Thank you.

Rewrote my blog with Zine

15 years ago, on December 11th, 2010, at the bold age of 17, I wrote my first blog post on the wonders of Windows Phone 7 on Blogspot. I started blogging as a kid at the behest of a family friend at Microsoft, who promised she’d make sure I would become the youngest Microsoft MVP if I started blogging. That never came to pass, though, because as I entered adulthood and started to grow independent of my Microsoft-friendly family I quickly began down the path to the free and open source software community.

Early blog posts covered intriguing topics such as complaining about my parent’s internet filter, a horrible hack to “replace” the battery of a dead gameboy game, announcing my friend’s Minecraft guild had a new website (in PHP), and so on. After Blogspot, I moved to Jekyll on GitHub pages, publishing You don’t need jQuery in 2013. For a long time this was the oldest post on the site.

I’m pretty proud of my writing skills and have a solid grasp on who I am today, but the further back you go the worse my writing, ideas, values, and politics all get. I was growing up in front of the world on this blog, you know? It’s pretty embarrassing to keep all of this old stuff around. But, I decided a long time ago to keep all of it up, so that people can understand where I’ve come from, and that everyone has to start somewhere.1

At some point – I’m not sure when – I switched from Jekyll to Hugo, and I’ve stuck with it since. But lately I’ve been frustrated with it. I’d like my blog engine to remain relatively stable and simple, but Hugo is quite complex and over the past few years I’ve been bitten by a number of annoying and backwards-incompatible changes. And, as part of my efforts to remove vibe-coded software from my stack, I was disappointed to learn that Hugo is being vibe coded now, and so rewriting my blog went onto the todo list.

Choosing the right static site generator (SSG) was a bit of a frustrating process. Other leading candidates, like Pelican or Zola, are also built from slop now. But a few months ago I found Zine, and after further study I found it to be a pretty promising approach. Over the past few days I have rewritten my templates and ported in nearly 400 (jeesh) blog posts from my archives.

There’s a lot to like about Zine. I’m pretty intrigued by SuperHTML as a templating engine design – the templates are all valid HTML5 and use an interesting approach to conditions, loops, and interpolation. SuperMD has some interesting ideas, but I’m less sold on it. The Scripty language used for interpolation and logic is a bit iffy in terms of design – it feels half-baked. And the designers had some fun ideas, like devlogs, which I feel are kind of interesting but tend to have an outsized influence on the design, polished in places where the polish might have been better spent elsewhere. The development web server tends to hang fairly often and I’ve gotten it to crash with esoteric error messages every now and then.

But what can I say, it’s alpha software – I hope it will improve, and I’m betting that it will by migrating my blog. There’s no official LLM policy (yet) and I hope they will end up migrating to Codeberg, and using Discord for project communication is not something I appreciate, but maybe they’ll change their tune eventually.

In the meantime, I took the opportunity to clean up the code a bit. The canonical links have gone through several rounds of convention and backwards compatibility, and I have replaced them with a consistent theme and set up redirects. I probably broke everyone’s feed readers when rolling these changes out, and I apologise for that. I have gone through the backlog and updated a number of posts as best as I can to account for bitrot, but there are still a lot of broken videos and links when you get far enough back – hopefully I can restore some of that given enough time.

I’ve also gone ahead and imported the really old stuff from Blogspot. The whole lot is garbage, but if you’re curious to see where I started out, these old posts are more accessible now.

tar: a slop-free alternative to rsync

So apparently rsync is slop now. When I heard, I wanted to drop a quick note on my blog to give an alternative: tar. It doesn’t do everything that rsync does, in particular identifying and skipping up-to-date files, but tar + ssh can definitely accommodate the use case of “transmit all of these files over an SSH connection to another host”.

Consider the following:

tar -cz public | ssh example.org tar -C /var/www -xz

This will transfer the contents of ./public/ to example.org:/var/www/public/, preserving file ownership and permissions and so on, with gzip compression. This is roughly the equivalent of:

rsync -a public example.org:/var/www/

Here’s the same thing with a lightweight progress display thanks to pv:

tar -cz public | pv | ssh example.org tar -C /var/www -xz

I know tar is infamously difficult to remember how to use. Honestly, I kind of feel that way about rsync, too. But, here’s a refresher on the most important options for this use-case. To use tar, pick one of the following modes with the command line flags:

  • -c: create an archive
  • -x: extract an archive

Use -f <filename> to read from or write to a file. Without this option, tar uses stdin and stdout, which is what the pipelines above rely on. Use -C <path> to change directories before archiving or extracting files. Use -z to compress or decompress the tarball with gzip. That’s basically everything you need to know about tar to use it for this purpose (and for most purposes, really).

With rsync, to control where the files end up you have to memorize some rules about things like whether or not each path has a trailing slash. With tar, the rules are, in my opinion, a bit easier to reason about. The paths which appear on the command line of tar -c are the paths that tar -x will open to create those files. So if you run this:

tar -c public/index.html public/index.css

You get a tarball which has public/index.html and public/index.css in it.

When tar -x opens this tarball, it will call fopen("public/index.html", "w"). So, whatever tar’s working directory is, it will extract this file into ./public/index.html. You can change the working directory before tar does this, on either end, by passing tar -C <path>.
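If you'd rather drive the same pipeline from a script instead of a one-liner, here's a rough sketch using Python's standard tarfile and subprocess modules. The host and paths are placeholders, and this is my own illustration rather than anything from the tar or rsync manuals:

import subprocess
import tarfile

# Equivalent in spirit to: tar -cz public | ssh example.org tar -C /var/www -xz
remote = ["ssh", "example.org", "tar", "-C", "/var/www", "-xz"]
with subprocess.Popen(remote, stdin=subprocess.PIPE) as ssh:
    # "w|gz" streams a gzip-compressed tar archive straight into ssh's stdin.
    with tarfile.open(fileobj=ssh.stdin, mode="w|gz") as archive:
        archive.add("public")  # stored as "public/...", so it lands in /var/www/public/
    ssh.stdin.close()  # signal EOF so the remote tar finishes
    ssh.wait()

The same rule applies: whatever path you pass to add() is the path the remote tar will create under its -C directory.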

Of course, you could just use scp, but this fits into my brain better.

I hope that’s useful to you!


Update: As a fun little challenge I wrapped up this concept in a small program that makes it easier to use:

https://git.sr.ht/~sircmpwn/xtar

Example:

xtar -R /var/www me@example.org public/*

A eulogy for Vim

Vim is important to me. I’m using it to write the words you’re reading right now. In fact, almost every word I have ever committed to posterity, through this blog, in my code, all of the docs I’ve written, emails I’ve sent, and more, almost all of it has passed through Vim.

My relationship with the software is intimate, almost as if it were an extra limb. I don’t think about what I’m doing when I use it. All of Vim’s modes and keybindings are deeply ingrained in my muscle memory. Using it just feels like my thoughts flowing from my head, into my fingers, into a Vim-shaped extension of my body, and out into the world. The unique and profound nature of my relationship with this software is not lost on me.

A picture of my right hand, with the letters “hjkl” tattooed on the wrist

I didn’t know Bram Moolenaar. We never met, nor exchanged correspondence. But, after I moved to the Netherlands, Bram’s home country, in a strange way I felt a little bit closer to him. He passed away a couple of years after I moved here, and his funeral was held not far from where I lived at the time. When that happened, I experienced an odd kind of mourning. He was still young, and he had affected my own life profoundly. He was a stranger, and I never got to thank him.

The people he entrusted Vim to were not strangers, they knew Bram and worked with him often, and he trusted them. It’s not my place to judge their work as disrespectful to his memory, or out of line with what he would have wanted. Even knowing Bram only through Vim, I know he and I disagreed often. However, the most personal thing I know about Bram, and that many people remember about him, was his altruistic commitment to a single cause: providing education and healthcare to Ugandan children in need. So, at the very least, I know that he cared.

I won’t speculate on how he would have felt about generative AI, but I can say that GenAI is something I care about. It causes a lot of problems for a lot of people. It drives rising energy prices in poor communities, disrupts wildlife and fresh water supplies, increases pollution, and stresses global supply chains. It reinforces the horrible, dangerous working conditions that miners in many African countries are enduring to supply rare metals like cobalt for the billions of new chips that this boom demands. And at a moment when the climate demands immediate action to reduce our footprint on this planet, the AI boom is driving data centers to consume a full 1.5% of the world’s total energy production in order to eliminate jobs and replace them with a robot that lies.

Meanwhile, this whole circus is enabling the rising tide of fascism around the world, not only by supercharging propaganda but also by directly financially supporting fascist policies and policymakers. All this to enrich the few, centralize power, reduce competition, and underwrite an enormous bubble that, once it bursts, will ruin the lives of millions of the world’s poor and marginalized classes.

I don’t think it’s cute that someone vibe coded “battleship” in VimScript. I think it’s more important that we stop collectively pretending that we don’t understand how awful all of this is. I don’t want to use software which has slop in it. I do what I can to avoid it, and sadly even Vim now comes under scrutiny in that effort as both Vim and NeoVim are relying on LLMs to develop the software.

So this is how, a few years after Bram’s passing, I find myself in another unusual moment of mourning: mourning Vim itself. What an odd feeling.


To keep my conscience clear, and continue to enjoy the relationship I have with this amazing piece of software, I have forked Vim. You can find my fork here: Vim Classic.

The choice of which version to use as the basis for a fork was a bit difficult. The last version of Vim released during Bram’s lifetime was Vim 9.0. To me, that seems like a good starting point. But, in the end, I chose to base my fork on Vim 8.2.0148 instead. Patch 148 was the patch immediately prior to the introduction of Vim9 Script, Vim 9.0’s flagship feature.

I’m sure Bram worked hard on Vim9 script, and I want to honor that. At the same time, it was still very new when he passed away, and the job of fully realizing its potential was handed down to the current maintainers. Its absence from Vim Classic is an honest assessment that I don’t have the time or energy to try to sort out all of the work on Vim9 which followed in Bram’s footsteps, and decide what stays and what goes. It seems like a useful line to draw in the sand: Vim Classic is compatible with legacy plugins, but not the newfangled stuff.

Since forking from this base, I have backported a handful of patches, most of which address CVEs discovered after this release, and others which fix minor bugs. I also penned a handful of original patches which bring the codebase up to snuff for building on newer toolchains. My old vimrc needed very few changes to work on this version of Vim, and all of my plugins work with the exception of fzf.vim, which I would like to fix at some point (or maybe a sympathetic reader is willing to work on backporting the necessary changes). Thanks, dzwdz, for figuring out the issues with fzf.vim!

I plan to use this for a little while, look for sore points and rough edges, collect feedback from other users, and then tag a little release soon. Going forward, maintenance will be slow and quiet. I welcome your patches, particularly to help with maintaining the runtime scripts, stuff like making sure new language features end up in the syntax files. I’ll also gladly accept new bug fixes, and maybe even a few new features if a good case can be made for including them. Backporting small patches from Vim upstream will be considered, with extra scrutiny.

In short, I invite you to use Vim Classic, if you feel the same way as me, and to maintain it with me, contributing the patches you need to support your own use cases.

The cults of TDD and GenAI

I’ve gotten a lot of flak throughout my career over my disdain towards test-driven development (TDD). I have met a lot of people who swear by it! And, I have also met a lot of people who insisted that I adopt it, too, often with the implied threat of appealing to my boss if appealing to me didn’t work.

The basic premise of TDD, for those unaware, is that one first writes a unit test that verifies the expected behavior for some code they want to write, observes the new test fail, and then one writes the implementation, iterating on it until the test passes. The advantage of this approach is, first, to ensure that your codebase is adequately covered by testing, and, second, to provide you a rapid feedback loop to assist in your work.

I have often found elements of TDD to be quite useful. Using a unit test or something similar to provide an efficient rapid feedback loop is a technique which I have employed many times. However, I am and have always been skeptical of the cult which arises around automated software testing and in particular TDD. A lot of people adopt an unquestioning loyalty to TDD, building tools and practices and vibes around the idea. It’s often too much.

The flaw with TDD is that, while it ensures that you have a test for every function you write, it also exerts an influence on the tested codebase, shaping the code to be as “testable” as possible, which only sometimes leads to better code. Moreover, TDD has no means of ensuring that the behavior your tests verify is the right behavior for your software to have. Software with a thousand passing tests and 100% test coverage could be doing exactly what the user or the business needs it to, but it could just as easily fail to meet the requirements in spite of those comprehensive tests – and in either case it gives you confidence in your work, confidence which may or may not be misplaced.

The cult of TDD exploits the fact that TDD is very good at making you feel like a good, diligent programmer. That rapid feedback loop not only assists in your work but also enables a powerful dopamine cycle. Add into that a culture of aiming for 100% coverage and you get the bonus hit from watching a number go up. Buy into the whole cult and you get a slew of new README badges to keep green, and lots of cool charts and numbers, hundreds of blinkenlights on your test suite, a bunch of fun Slack messages from Jenkins, and a cute cardboard cut-out of the CTO to keep in the cubicle of whoever last broke the build.1 All of this pomp and circumstance is fun and it feels good and because it’s all in the name of testing (which is good, right?) it makes you feel like a good programmer even if none of it necessarily contributes to the results your team is supposed to achieve.

All of these flashy traits allow one to adopt the aesthetics of good, diligent software engineering work regardless of how good the work actually is. It’s an intoxicating way to work, especially for someone who struggles with software engineering. It makes you feel like a good programmer and gives you data to “back it up”, stuff you can cite at your performance review. But, software development is really hard, and TDD doesn’t go very far toward making it easier. All of the really hard problems are not solved by TDD.

I suspect that coding agents are tapping into the same emotional and psychological reflexes that the cult of TDD gives us an early example of. Software development is still hard, but using an agent allows someone who’s just “so-so” at programming to feel the rush of being great at programming, a rush they might have been chasing for their entire career, and I bet the rush is so much sweeter than watching the lights on your test suite runs tick over to green.

A coding agent permits one to feel as if they have the raw productive power a great programmer can tap into. One may feel like the “10× programmers” they’ve sat next to in the open office for ten years, whose skills they never quite achieved themselves. It scales up the raw output by a factor of ten, and lets one assemble apparently great works in a fraction of the time, solo-coding great cathedrals in the time it used to take them to build, with great difficulty, a homely shack.

But, if it seems too good to be true…

Those cathedrals are not the great works they appear to be. The construction is shoddy and the architecture nonsensical and a great programmer hand-writing code will still outperform any mediocre programmer once the gleam wears off of their respective works and the bugs and problems start showing up. The project has 99.9% coverage on a thousand beautiful green tests, and, inside, the foundations are still rotten.

God, though, I understand why so many people are chasing that dragon, even though it’s going to ruin their careers, and maybe even their lives. I get why people fall for this, in spite of the externalities that they must know of by now. In spite of the colossal waste, the loss of fresh water resources, the fact that AI datacenters are the fastest growing source of carbon emissions, the people suffering sky-rocketing power bills and rolling outages near these new datacenters, the reams and reams of fascist propaganda these machines are producing to tear our society apart, the corruption, the market manipulation, the plain and simple fact that the ultimate purpose of these tools is to put their users out of a job entirely… well, once you finally get a taste of what it feels like to be great… I suppose all of those problems seem so far away.

Redesigning my microkernel from the ground up

As you may recall, circa 2022-2023 I was working on a microkernel written in Hare named Helios. Helios was largely inspired by and modelled after the design of seL4 and was my first major foray into modern OS development that was serious enough to get to a somewhat useful state of functionality, with drivers for some real hardware, filesystems, and an environment for running user programs of a reasonable level of sophistication.

Helios development went strong for a while, but eventually it slowed and then halted in a state of design hell. Since Helios was my first major OS project at this scale and with this much ambition, the design and implementation ended up with a lot of poor assumptions that made it a pretty weak foundation for building a complete OS upon. In late 2023 I more or less gave up on it and moved my OS development work out of the realm of writing code and back into the realm of thinking really hard about how to design operating systems.

What followed was a couple of years of design thinking, developing small scale design experiments, and doing deeper research into prior art – reading papers and studying existing kernels. It was also during this period that I wrote Bunnix, a working Unix clone, motivated in part by a desire to gain some first-hand experience working in the design and implementation of Unix-style operating systems – a fertile environment for learning a lot of the nuts and bolts of OS implementations by working against a complete and proven design.

In August I was finally prepared to have another go. I decided to start over from scratch, importing and adapting and rewriting code from Helios and Bunnix on an as-needed basis to speed things up, and writing from scratch anything where the lessons learned in hindsight outweighed the benefits of adapting existing code.1

The result is Hermes.

Hermes has not yet reached feature parity with Helios, lacking some IPC features and an aarch64 port, but already it’s significantly more robust and thoughtfully designed than Helios.

The big glitzy feature that most obviously distinguishes Hermes from Helios is that Hermes supports symmetric multiprocessing (SMP), which is to say, running on multiple CPU cores. This time around, I finally listened to the advice I’d been hearing in osdev circles for years and implemented SMP as early as possible to avoid dealing with tons of problems adding multiprocessing to an otherwise mature kernel.

The multicore scheduler at the heart of Hermes is surprisingly simple, actually. It uses relatively ordinary per-CPU run queues. Each new task, once schedulable, is scheduled, in order of preference, on (1) the CPU matching its affinity, (2) any currently idle CPU, or (3) a random CPU. If a CPU would otherwise idle, it first tries to steal a pending task from another CPU. The most important parts of the scheduler are less than 200 lines of code ([1], [2]).

The less obviously impressive improvements from Helios to Hermes are numerous. The syscall and IPC ABIs were rethought from the ground up – one of the major goals of the redesign. I also moved from an seL4-style capability derivation graph – which is quite complex to implement and reason about – to reference counting to manage the lifetimes of kernel resources. Resource management in general is much simpler and should improve the performance of the kernel substantially.

I’ve also taken a much different approach to organizing the code, to allow the kernel and many of the things around it – its bootloaders and the userspace that runs the kernel test suite – to share a lot more code than was possible in Helios, making a lot of the non-kernel code much easier to write and maintain.

The userspace is also a substantial upgrade in design from Helios, or at least I hope it will be when more of it takes shape. Rather than developing a specialized Hare standard library, independent of the upstream Hare standard library, for writing drivers and low-level services, I have started with a port of the upstream Hare standard library and built low-level driver and service support libraries around it. The userspace is streamlined considerably by doing so, giving these low-level components access to a more comfortable and featureful programming environment and reducing the complexity of the system by making various components more uniform in their design.

Finally, I’ve taken a much more serious approach to testing Hermes and making it as robust and complete as possible in real-world use-cases. I borrowed the EFI bootloader from Bunnix and repurposed it for Hermes, opening up a lot of newer hardware, and I have written a more comprehensive test suite and run and verified it on much more real-world hardware. I have about ten devices which all (consistently!) pass the Hermes test suite. Feel free to try it out on yours as well and let me know how it goes!

That’s all there is to say for now, but I hope to keep you in the loop as I continue working on this for a while. The userspace is starting to take shape and soon(™) I hope to start building out block device drivers, some filesystems, and enough support code to run a shell and a handful of useful programs. In the meantime, feel free to poke around the code and play around with it. There is also some early documentation available for you to read if you wish. I’m hanging out in #ares on Libera Chat if you have any questions.

OpenAI employees… are you okay?

You might have seen an article making the rounds this week, about a young man who ended his life after ChatGPT encouraged him to do so. The chat logs are really upsetting.

Someone two degrees removed from me took their life a few weeks ago. A close friend related the story to me, about how this person had approached their neighbor one evening to catch up, make small talk, and casually discussed their suicidal ideation at some length. At the end of the conversation, they asked to borrow a rope, and their neighbor agreed without giving the request any critical thought. The neighbor found them the next morning.

I didn’t know the deceased, nor their neighbor, but I’m close friends with someone who knew both. I found their story deeply chilling – ice runs through my veins when I imagine how the neighbor must have felt. I had a similar feeling upon reading this article, wondering how the people behind ChatGPT and tools like it are feeling right now.

Two years ago, someone I knew personally took their life as well. I was not friendly with this person – in fact, we were on very poor terms. I remember at the time, I had called a crisis hotline just to ask an expert for advice on how to break this news to other people in my life, many of whom were also on poor terms with a person whose struggles to cope with their mental health issues caused a lot of harm to others.

None of us had to come to terms with any decisions with the same gravity as what that unfortunate neighbor had to face. None of us were ultimately responsible for this person’s troubles or were the impetus for what happened. Nonetheless, the uncomfortable and confronting feelings I experienced in the wake of that event perhaps give me some basis for empathy and understanding towards the neighbor, or for OpenAI employees, and others who find themselves in similar situations.

If you work on LLMs, well… listen, I’ve made my position as an opponent of this technology clear. I feel that these tools are being developed and deployed recklessly, and I believe tragedy is the inevitable result of that recklessness. If you confide in me, I’m not going to validate your career choice. But maybe that’s not necessarily a bad quality to have in a confidant? I still feel empathy towards you and I recognize your humanity and our need to acknowledge each other as people.

If you feel that I can help, I encourage you to reach out. I will keep our conversation in confidence, and you can reach out anonymously if that makes you feel safer. I’m a good listener and I want to know how you’re doing. Email me.


If you’re experiencing a crisis, 24-hour support is available from real people who are experts in getting you the help you need. Please consider reaching out. All you need to do is follow the link.

What's up with FUTO?

Some time ago, I noticed some new organization called FUTO popping up here and there. I’m always interested in seeing new organizations that fund open source, and seeing as they claim several notable projects on their roster, I explored their website with interest and gratitude. I was first confused, and then annoyed by what I found. Confused, because their website is littered with bizarre manifestos,1 and ultimately annoyed because they were playing fast and loose with the term “open source”, using it to describe commercial source-available software.

FUTO eventually clarified their stance on “open source”, first through satire and then somewhat more soberly, perpetuating the self-serving myth that “open source” software can privilege one party over anyone else and still be called open source. I mentally categorized them as problematic but hoped that their donations or grants for genuinely open source projects would do more good than the harm done by this nonsense.

By now I’ve learned better. tl;dr: FUTO is not being honest about their “grant program”, they don’t have permission to pass off these logos or project names as endorsements, and they collaborate with and promote mask-off, self-proclaimed fascists.

An early sign that something is off with FUTO is in that “sober” explanation of their “disdain for OSI approved licenses”, where they make a point of criticizing the Open Source Initiative for banning Eric S. Raymond (aka ESR) from their mailing lists, citing right-wing reactionary conspiracy theorist Bryan Lunduke’s blog post on the incident. Raymond is, as you may know, one of the founders of OSI and a bigoted asshole. He was banned from the mailing lists, not because he’s a bigoted asshole, but because he was being a toxic jerk on the mailing list in question. Healthy institutions outgrow their founders. That said, FUTO’s citation and perspective on the ESR incident could be generously explained as a simple mistake, and we should probably match generosity with generosity given their prolific portfolio of open source grants.

I visited FUTO again quite recently as part of my research on Cloudflare’s donations to fascists, and was pleased to discover that this portfolio of grants had grown immensely since my last visit, and included a number of respectable projects that I admire and depend on (and some projects I don’t especially admire, hence arriving there during my research on FOSS projects run by fascists). But something felt fishy about this list – surely I would have heard about it if someone was going around giving big grants to projects like ffmpeg, VLC, musl libc, Tor, Managarm, Blender, NeoVim – these projects have a lot of overlap with my social group and I hadn’t heard a peep about it.

So I asked Rich Felker, the maintainer of musl libc, about the FUTO grant, and he didn’t know anything about it. Rich and I spoke about this for a while and eventually Rich uncovered a transaction in his GitHub sponsors account from FUTO: a one-time donation of $1,000. This payment circumvents musl’s established process for donations from institutional sponsors. The donation page that FUTO used includes this explanation: “This offer is for individuals, and may be available to small organizations on request. Commercial entities wishing to be listed as sponsors should inquire by email.” It’s pretty clear that there are special instructions for institutional donors who wish to receive musl’s endorsement as thanks for their contribution.

The extent of the FUTO “grant program”, at least in the case of musl libc, involved ignoring musl’s established process for institutional sponsors, quietly sending a modest one-time donation to one maintainer, and then plastering the logo of a well-respected open source project on a list of “grant recipients” on their home page. Rich eventually posted on Mastodon to clarify that the use of the musl name and logo here was unauthorized.

I also asked someone I know on the ffmpeg project about the grant that they had received from FUTO and she didn’t know anything about it, either. Here’s what she said:

I’m sure we did not get a grant from them, since we tear each other to pieces over everything, and that would be enough to start a flame war. Unless some dev independently got money from them to do something, but I’m sure that we as a project got nothing. The only grant we’ve received is from the STF last year.

Neovim is another project FUTO lists as a grant recipient, and they also have a separate process for institutional sponsors. I didn’t reach out to anyone to confirm, but FUTO does not appear on the sponsor list so presumably the M.O. is the same. This is also the case for Wireshark, Conduit, and KiCad. GrapheneOS is listed prominently as well, but that doesn’t seem to have worked out very well for them. Presumably ffmpeg received a similar quiet donation from FUTO, rather than something more easily recognizable as a grant.

So, it seems like FUTO is doing some shady stuff and putting a bunch of notable FOSS projects on their home page without good reason to justify their endorsement. Who’s behind all of this?

As far as I can tell, the important figures are Eron Wolf2 and Louis Rossmann.3 Wolf is the founder of FUTO – a bunch of money fell into his lap from founding Yahoo Games before the bottom fell out of Yahoo, and he made some smart investments to grow his wealth, which he presumably used to fund FUTO. Rossmann is a notable figure in the right to repair movement, with a large following on YouTube, who joined FUTO a year later and ultimately moved to Austin to work more closely with them. His established audience and reputation provide a marketable face for FUTO. I had heard of Rossmann prior to learning about FUTO and held him in generally good regard, despite little specific knowledge of his work, simply because we have a common cause in right to repair.

I hadn’t heard of Wolf before looking into FUTO. However, in the course of my research, several people tipped me off to his association with Curtis Yarvin (aka moldbug), and in particular to the use of FUTO’s platform and the credentials of Wolf and Rossmann to platform and promote Yarvin. Curtis Yarvin is a full-blown, mask-off, self-proclaimed fascist. A negligible amount of due diligence is required to verify this, but here’s one source from Politico in January 2025:

I’ve interacted with Vance once since the election. I bumped into him at a party. He said, “Yarvin, you reactionary fascist.” I was like, “Thank you, Mr. Vice President, and I’m glad I didn’t stop you from getting elected.”

Ian Ward for Politico

Vice President Vance and numerous other figures in the American right have cited Yarvin as a friend and source of inspiration in shaping policy.4 Among his many political positions, Yarvin has proclaimed that black people are genetically predisposed to a lower IQ than white people, and moreover suggests that black people are inherently suitable for enslavement.5

Yarvin has appeared on FUTO’s social media channels, in particular in an interview published on PeerTube and Odysee, the latter a platform controversial for its role in spreading hate speech and misinformation.6 Yarvin also appeared on stage to “debate” Louis Rossmann in June 2022, in which Yarvin is permitted to speak at length with minimal interruptions or rebuttals to argue for an authoritarian techno-monarchy to replace democracy.

Rossmann caught some flak for this “debate” and gave a milquetoast response in a YouTube comment on this video, explaining that he agreed to this on very short notice as a favor to Eron, who had donated “a million” to Rossmann’s non-profit prior to bringing Rossmann into the fold at FUTO. Rossmann does rebuke Yarvin’s thesis, albeit buried in this YouTube comment rather than when he had the opportunity to do so on-stage during the debate. Don’t argue with fascists, Louis – they aren’t arguing with you, they are pitching their ideas7 to the audience. Smart fascists are experts at misdirection and bad-faith debate tactics and as a consequence Rossmann just becomes a vehicle for fascist propaganda – consult the YouTube comments to see who this video resonates with the most.

In the end, Rossmann seems to regret agreeing to this debate. I don’t think that Eron Wolf regrets it, though – based on his facilitation of this debate and his own interview with Yarvin on the FUTO channel a month later, I can only assume that Wolf considers Yarvin a close associate. No surprise given that Wolf is precisely the kind of insecure silicon valley techbro Yarvin’s rhetoric is designed to appeal to – moderately wealthy but unknown, and according to Yarvin, fit to be a king. Rossmann probably needs to reflect on why he associates with and lends his reputation to an organization that openly and unapologetically platforms its founder’s fascist friends.

In summary, FUTO is not just the product of some eccentric who founded a grant-making institution that funds open source at the cost of making us read his weird manifestos on free markets and oligopoly. It’s a private, for-profit company that associates with and uses their brand to promote fascists. They push an open-washing narrative and they portray themselves as a grant-making institution when, in truth, they’re passing off a handful of small donations as if they were endorsements from dozens of respectable, high-profile open source projects, in an attempt to legitimize themselves, and, indirectly, legitimize people they platform like Curtis Yarvin.

So, if you read this and discover that your project’s name and logo is being proudly displayed on the front page of a fascist-adjacent, washed-up millionaire’s scummy vanity company, and you don’t like that, maybe you should ask them to knock it off? Eron, Louis – you know that a lot of these logos are trademarked, right?


Updated 2025-10-27:

The FUTO website has been updated to clarify the nature of grants versus donations (before, after) and to reduce the appearance of endorsements from the donation recipients – the site is much better after this change.

I spoke to a representative of the FUTO leadership and shared positive feedback regarding the changes to the website. I also asked for a follow-up on the matter of platforming fascist activist Curtis Yarvin on their social media channels, and I was provided this official response:

We prefer to spend our time building great software for people to use, funding interesting projects, and making FUTO better every day. We have no interest in engaging in politics as this is a distraction from our work. We’d like to move past this. We don’t have any further comment than that.

I understand the view that distancing oneself from politics can be a productive approach to your work, and the work FUTO does funding great software like Immich is indeed important work that should be conducted relatively free of distractions. However, FUTO is a fundamentally political organization and it does not distance itself from politics. Consider for example the FUTO statement on Open Source, which takes several political positions:

  • Disapproval of the Open Source Initiative and their legitimacy as an authority
  • Disapproval of OSI’s proposed AI standards
  • Disapproval of the “tech oligopoly”
  • Advocacy for an “open source” which is inclusive of restrictions on commercial use
  • Support for Eric S. Raymond’s side in his conflict with OSI
  • Tacit support for Bryan Lunduke

A truer “apolitical” approach would accept the mainstream definition of open source, would not take positions on conflicts with OSI or Eric Raymond, and would be careful not to cite (or platform) controversial figures such as Lunduke.

It is difficult for FUTO’s work to be apolitical at all. Importantly, there are biases in their grants and donations: their selections have a tendency to privilege projects focused on privacy, decentralization, multimedia, communication, and right to repair, all of which suggest the political priorities of FUTO. The choices to fund “source first” software in addition to open source, not to fund outright closed source software, and not to vet projects based on their community moderation are also political factors in their process of selecting funding recipients, or at least are seemingly apolitical decisions which ultimately have political consequences.

This brings us to the political nature of the choice to platform Curtis Yarvin. Yarvin is a self-proclaimed fascist who argues openly for fascist politics. Platforming Yarvin on the FUTO channels legitimizes Yarvin’s ideas and his work, and provides curious listeners a funnel that leads to Yarvin’s more radical ideas and into a far-right rabbit-hole. Platforming Yarvin advances the fascist political program.

It should go without saying that it is political to support fascism or fascists. There is an outrageous moral, intellectual, and political contradiction in claiming that it is apolitical to promote a person whose political program is to dismantle democracy and eject people he disagrees with from the political sphere entirely. FUTO should reflect on their values, acknowledge the political nature of their work, and consider the ways in which their work intersects with politics writ large, then make decisions that align their political actions with their political beliefs.

Cloudflare bankrolls fascists

US politics has been pretty fascist lately. The state is filling up concentration camps, engaging in mass state violence against people on the basis of racialized traits, deporting them to random countries without any respect for habeas corpus, exerting state pressure on the free press to censor speech critical of the current administration, and Trump is openly floating the idea of an unconstitutional third term.

Fascism is clearly on the rise, and they’re winning more and more power. None of this is far removed from us in the FOSS community – there are a number of fascists working in FOSS, same as the rest of society. I don’t call them fascists baselessly – someone who speaks out in support of and expresses solidarity with fascists, or who uses fascist dog-whistles or promotes fascist ideology and talking points, or boosts fascist conspiracy theories – well, they’re a fascist.

If one consistently speaks in support of a certain political position and against the opponents of that position then it is correct to identify them with this political position. Facts, as it were, don’t care about feelings, namely the feelings that get hurt when someone is called a fascist. Fascists naturally do not want to be identified as such and will reject the label, but we shouldn’t take their word for it. People should be much more afraid of being called out as fascist than they are afraid of calling someone a fascist. If someone doesn’t want to be called a fascist, they shouldn’t act like one.

It’s in this disturbing political context that I saw an odd post from the Cloudflare blog pop up in my circles this week: Supporting the future of the open web: Cloudflare is sponsoring Ladybird and Omarchy. Based on Ladybird’s sponsorship terms we can assume that these projects received on the order of $100,000 USD from Cloudflare. I find this odd for a few reasons, in particular because one thing that I know these two projects have in common is that they are both run by fascists.

Even at face value this is an unusual pair of projects to fund. I’m all for FOSS projects getting funded, of course, and I won’t complain about a project’s funding on the solitary basis that it’s an odd choice. I will point out that these are odd choices, though, especially Omarchy.

Ladybird makes some sense, given that it’s aligned in principle with Cloudflare’s stated objective to “support the open web”, though I remain bearish that new web browser engines are even possible to make.1 Omarchy is a bizarre choice, though – do we really need another pre-customized Arch Linux distribution? And if we do, do we really need a big corporation like Cloudflare to bankroll it? Everyone on /r/unixporn manages to make Arch Linux look pretty for free.

Omarchy is a very weird project to fund, come to think of it. Making an Arch Linux spin technically requires some work, and work is work, I won’t deny it, but most of the work done here is from Arch Linux and Hyprland. Why not fund those, instead? Well, don’t fund Hyprland, since it’s also run by a bunch of fascists, but you get my point.

Anyway, Omarchy and Ladybird are both run by fascists. Omarchy makes this pretty obvious from the outset – on the home page the big YouTube poster image prominently features SuperGrok, which is a pathetically transparent dog-whistle to signal alliance with Elon Musk’s fascist politics. Omarchy is the pet project of David Heinemeier Hansson, aka DHH, who is well known as a rich fascist weirdo.2 One need only consult his blog to browse his weird, racist views on immigration, fat-shaming objections to diverse representation, vaguely anti-feminist/homophobic/rapey rants on consent, and, recently, tone-policing antifascists who celebrate the death of notable fascist Charlie Kirk.

Speaking of tributes to Charlie Kirk, that brings us to Andreas Kling, the project lead for Ladybird, who tweeted on the occasion of his assassination:

RIP Charlie Kirk

I hope many more debate nerds carry on his quest to engage young people with words, not fists.

@awesomekling

Kling has had a few things to say about Kirk on Twitter lately. Here’s another one – give you three guesses as to which “[group]” he objects to punching. You may also recall that Kling achieved some notoriety for his obnoxious response as the maintainer of SerenityOS when someone proposed gender-neutral language for the documentation:

Screenshot of the interaction on GitHub. Kling responds “This project is not an appropriate arena to advertise your personal politics.”

Replacing “he” with “they” in one sentence of the documentation is the kind of “ideologically motivated change” that Serenity’s CONTRIBUTING.md apparently aims to prevent, a classic case of the sexist “identities that are not men are inherently political” nonsense. Ladybird has a similar, weirdly defensive policy on “neutrality”, and a milquetoast code of conduct, which is based on the Ruby Community Conduct Guideline, which has been itself the subject of many controversies due to its inadequacy leading to real-world incidents of harassment and abuse.

Here’s another one – Kling endorsing white replacement theory in June:

White males are actively discriminated against in tech.

It’s an open secret of Silicon Valley.

One of the last meetings I attended before leaving Apple (in 2017) was management asking us to “keep the corporate diversity targets in mind” when interviewing potential new hires.

The phrasing was careful, but the implication was pretty clear.

I knew in my heart this wasn’t wholesome, but I was too scared to rock the boat at the time.

@awesomekling replying to @danheld3

And in a moment of poetic irony, Kling spoke in solidarity with Hansson just a few days ago over his “persecution” for “banal, mainstream positions” on Twitter, in response to Hansson’s tweet signal-boosting another notable reactionary tech fascist, Bryan Lunduke.

So, to sum it up, Kling wears his mask a bit better than Hansson, but as far as I’m concerned it seems clear that both projects are run by fascists. If it walks like a fascist and quacks like a fascist… then why is Cloudflare giving them hundreds of thousands of dollars?

A better future for JavaScript that won't happen

In the wake of the largest supply-chain attack in history, the JavaScript community could have a moment of reckoning and decide: never again. As the panic and shame subsides, after compromised developers finish re-provisioning their workstations and rotating their keys, the ecosystem might re-orient itself towards solving the fundamental flaws that allowed this to happen.

After all, people have been sounding the alarm for years that this approach to dependency management is reckless and dangerous and broken by design. Maybe this is the moment when the JavaScript ecosystem begins to understand the importance and urgency of this problem, and begins its course correction. It could leave behind its sprawling dependency trees full of micro-libraries, establish software distribution based on relationships of trust, and incorporate the decades of research and innovation established by more serious dependency management systems.

Perhaps Google and Mozilla, leaders in JavaScript standards and implementations, will start developing a real standard library for JavaScript, which makes micro-dependencies like left-pad a thing of the past. This could be combined with a consolidation of efforts, merging micro-libraries into larger packages with a more coherent and holistic scope and purpose, which prune their own dependency trees in turn.

This could be the moment where npm comes to terms with its broken design, and with a well-funded effort (recall that, ultimately, npm is GitHub is Microsoft, market cap $3 trillion USD), will develop and roll out the next generation of package management for JavaScript. It could incorporate the practices developed and proven in Linux distributions, which rarely suffer from these sorts of attacks, by de-coupling development from packaging and distribution, establishing package maintainers who assemble and distribute curated collections of software libraries. By introducing universal signatures for packages of executable code, smaller channels and webs of trust, reproducible builds, and the many other straightforward, obvious techniques used by responsible package managers.

Maybe the other language ecosystems that depend on this broken dependency management model – Cargo, PyPI, RubyGems, and many more – are watching this incident and know that the very same crisis looms in their future. Maybe they will change course, too, before the inevitable.

Imagine if other large corporations who depend on and profit from this massive pile of recklessly organized software committed their money and resources to it, through putting their engineers to the task of fixing these problems, through coming together to establish and implement new standards, through direct funding of their dependencies and by distributing money through institutions like NLNet, ushering in an era of responsible, sustainable, and secure software development.

This would be a good future, but it’s not the future that lies in wait for us. The future will be more of the same. Expect symbolic gestures – mandatory 2FA will be rolled out in more places, certainly, and the big players will write off meager donations in the name of “OSS security and resilience” in their marketing budgets.

No one will learn their lesson. This has been happening for decades and no one has learned anything from it yet. This is the defining hubris of this generation of software development.

Embedding Wren in Hare

I’ve been on the lookout for a scripting language which can be neatly embedded into Hare programs. Perhaps the obvious candidate is Lua – but I’m not particularly enthusiastic about it. When I was evaluating the landscape of tools which are “like Lua, but not Lua”, I found an interesting contender: Wren.

I found that Wren punches far above its weight for such a simple language. It’s object oriented, which, you know, take it or leave it depending on your use-case, but it’s very straightforwardly interesting for what it is. I found a few things to complain about, of course – its scope rules are silly, the C API has some odd limitations here and there, and in my opinion the “standard library” provided by wren CLI is poorly designed. But, surprisingly, my list of complaints more or less ends there, and I was excited to build a nice interface to it from Hare.

The result is hare-wren. Check it out!

The basic Wren C API is relatively straightforwardly exposed to Hare via the wren module, though I elected to mold it into a more idiomatic Hare interface rather than expose the C API directly to Hare. You can use it something like this:

use wren;

export fn main() void = {
	const vm = wren::new(wren::stdio_config);
	defer wren::destroy(vm);
	wren::interpret(vm, "main", `
		System.print("Hello world!")
	`)!;
};

$ hare run -lc main.ha
Hello world!

Calling Hare from Wren and vice-versa is also possible with hare-wren, of course. Here’s another example:

use fmt;
use wren;

export fn main() void = {
	let config = *wren::stdio_config;
	config.bind_foreign_method = &bind_foreign_method;

	const vm = wren::new(&config);
	defer wren::destroy(vm);

	wren::interpret(vm, "main", `
	class Example {
		foreign static greet(user)
	}

	System.print(Example.greet("Harriet"))
	`)!;
};

fn bind_foreign_method(
	vm: *wren::vm,
	module: str,
	class_name: str,
	is_static: bool,
	signature: str,
) nullable *wren::foreign_method_fn = {
	const is_valid = class_name == "Example" &&
		signature == "greet(_)" && is_static;
	if (!is_valid) {
		return null;
	};
	return &greet_user;
};

fn greet_user(vm: *wren::vm) void = {
	const user = wren::get_string(vm, 1)!;
	const greeting = fmt::asprintf("Hello, {}!", user)!;
	defer free(greeting);
	wren::set_string(vm, 0, greeting);
};

$ hare run -lc main.ha
Hello, Harriet!

In addition to exposing the basic Wren virtual machine to Hare, hare-wren has an optional submodule, wren::api, which implements a simple async runtime based on hare-ev and a modest “standard” library, much like Wren CLI. I felt that the Wren CLI libraries had a lot of room for improvement, so I made the call to implement a standard library which is only somewhat compatible with Wren CLI.

On top of the async runtime, the wren::api runtime provides some basic features for reading and writing files, querying the process arguments and environment, etc. It’s not much, but it is, perhaps, a useful place to begin building out something a bit more interesting. A simple module loader is also included, which introduces some conventions for installing third-party Wren modules that may be of use for future projects to add new libraries and such.

Much like wren-cli, hare-wren also provides the hwren command, which makes this runtime, standard library, and module loader conveniently available from the command line. It does not, however, support a REPL at the moment.
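
As a usage sketch – assuming hwren takes a script path the way wren-cli does, which is my guess rather than something spelled out here – running a Wren script from the shell might look like:

$ hwren hello.wren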

I hope you find it interesting! I have a few projects down the line which might take advantage of hare-wren, and it would be nice to expand the wren::api library a bit more as well. If you have a Hare project which would benefit from embedding Wren, please let me know – and consider sending some patches to improve it!

What's new with Himitsu 0.9?

Last week, Armin and I worked together on the latest release of Himitsu, a “secret storage manager” for Linux. I haven’t blogged about Himitsu since I announced it three years ago, and I thought it would be nice to give you a closer look at the latest release, both for users eager to see the latest features and for those who haven’t been following along.1


A brief introduction: Himitsu is like a password manager, but more general: it stores any kind of secret in its database, not just passwords but also SSH keys, credit card numbers, your full disk encryption key, answers to those annoying “security questions” your bank obliged you to fill in, and so on. It can also enrich your secrets with arbitrary metadata, so instead of just storing, say, your IMAP password, it can also store the host, port, TLS configuration, and username – the complete information necessary to establish an IMAP session.

Another important detail: Himitsu is written in Hare and depends on Hare’s native implementations of cryptographic primitives – neither Himitsu nor the cryptography implementation it depends on have been independently audited.


So, what new and exciting features does Himitsu 0.9 bring to the table? Let me summarize the highlights for you.

A new prompter

The face of Himitsu is the prompter. The core Himitsu daemon has no user interface and only communicates with the outside world through its IPC protocols. One of those protocols is the “prompter”, which Himitsu uses to communicate with the user, to ask you for consent to use your secret keys, to enter the master password, and so on. The prompter is decoupled from the daemon so that it is easy to substitute with different versions which accommodate different use-cases, for example by integrating the prompter more deeply into a desktop environment or by building one that fits better on a touch-screen UI like a phone.

But, in practice, given Himitsu’s still-narrow adoption, most people use the GTK+ prompter developed upstream. Until recently, the prompter was written in Python for GTK+ 3, and it was a bit janky and stale. The new hiprompt-gtk changes that, replacing it with a new GTK4 prompter implemented in Hare.

I’m excited to share this one with you – it was personally my main contribution to this release. The prompter is based on Alexey Yerin’s hare-gi, a (currently prototype-quality) code generator that processes GObject Introspection documents into Hare modules that bind to libraries like GTK+. The prompter uses Adwaita for its aesthetic and controls, and GTK layer shell for smoother integration on supported Wayland compositors like Sway.

Secret service integration

Armin has been hard at work on a new package, himitsu-secret-service, which provides the long-awaited support for integrating Himitsu with the dbus Secret Service API used by many Linux applications to manage secret keys. This makes it possible for Himitsu to be used as a secure replacement for, say, gnome-keyring.

Editing secret keys

Prior to this release, the only way to edit a secret key was to remove it and re-add it with the desired edits applied manually. This was a tedious and error-prone process, especially when bulk-editing keys. This release includes some work from Armin to improve the process, by adding a “change” request to the IPC protocol and implementing it in the command line hiq client.

For example, if you changed your email address, you could update all of your logins like so:

$ hiq -c email=newemail@example.org email=oldemail@example.org

Don’t worry about typos or mistakes – the new prompter will give you a summary of the changes for your approval before the changes are applied.

You can also do more complex edits with the -e flag – check out the hiq(1) man page for details.

Secret reuse notifications

Since version 0.8, Himitsu has supported “remembering” your choice, for supported clients, to consent to the use of your secrets. This allows you, for example, to remember that you agreed for the SSH agent to use your SSH keys for an hour, or for the duration of your login session, etc. Version 0.9 adds a minor improvement to this feature – you can add a command to himitsu.ini, such as notify-send, which will be executed whenever a client takes advantage of this “remembered” consent, so that you are notified whenever your secrets are used again and any unexpected use gets your attention.

himitsu-firefox improvements

Some minor improvements have also landed for himitsu-firefox that I’d like to note. tiosgz sent us a nice patch which makes the identification of login fields in forms more reliable – thanks! And I’ve added a couple of useful programs, himitsu-firefox-import and himitsu-firefox-export, which will help you move logins between Himitsu and Firefox’s native password manager, should that be useful to you.

And the rest

Check out the changelog for the rest of the improvements. Enjoy!

Just speak the truth

Today, we’re looking at two case studies in how to respond when reactionaries appear in your free software community.

Exhibit A

It is a technical decision.

The technical reason is that the security team does not have the bandwidth to provide lifecycle maintenance for multiple X server implementations. Part of the reason for moving X from main to community was to reduce the burden on the security team for long-term maintenance of X. Additionally, nobody so far on the security team has expressed any interest in collaborating with xxxxxx on security concerns.

We have a working relationship with Freedesktop already, while we would have to start from the beginning with xxxxxx.

Why does nobody on the security team have any interest in collaboration with xxxxxx? Well, speaking for myself only here – when I looked at their official chat linked in their README, I was immediately greeted with alt-right propaganda rather than tactically useful information about xxxxxx development. At least for me, I don’t have any interest in filtering through hyperbolic political discussions to find out about CVEs and other relevant data for managing the security lifecycle of X.

Without relevant security data products from xxxxxx, as well as a professionally-behaving security contact, it is unlikely for xxxxxx to gain traction in any serious distribution, because X is literally one of the more complex stacks of software for a security team to manage already.

At the same time, I sympathize with the need to keep X alive and in good shape, and agree that there hasn’t been much movement from freedesktop in maintaining X in the past few years. There are many desktop environments which will never get ported to Wayland and we do need a viable solution to keep those desktop environments working.

I know the person who wrote this, and I know that she’s a smart cookie, and therefore I know that she probably understood at a glance that the community behind this “project” literally wants to lynch her. In response, she takes the high road, avoids confronting the truth directly, and gives the trolls a bunch of talking points to latch onto for counter-arguments. It leaves plenty of room for them to bog everyone down in concern trolling and provides ample material to fuel their attention-driven hate machine.

There’s room for improvement here.

Exhibit B

Screenshot of a post by Chimera Linux which reads “any effort to put (redacted) in chimera will be rejected on the technical basis of the maintainers being reactionary dipshits”

Concise, speaks the truth, answers ridiculous proposals with ridicule, does not afford the aforementioned reactionary dipshits an opportunity to propose a counter-argument. A+.

Extra credit for the follow-up:

Screenshot of a follow-up post that reads “just to be clear, given the coverage of the most recent post, we don’t want to be subject to any conspiracy theories arising from that. so i’ll just use this opportunity to declare that we are definitely here to further woke agenda by turning free software gay”


The requirement for a passing grade in this class is a polite but summary dismissal, but additional credit is awarded for anyone who does not indulge far-right agitators as if they were equal partners in maintaining a sense of professional decorum.

If you are a community leader in FOSS, you are not obligated to waste your time coming up with a long-winded technical answer to keep nazis out of your community. They want you to argue with them and give them attention and feed them material for their reactionary blog or whatever. Don’t fall into their trap. Do not answer bad faith with good faith. This is a skill you need to learn in order to be an effective community leader.

If you see nazis 👏👏 you ban nazis 👏👏 — it’s as simple as that.


The name of the project is censored not because it’s particularly hard for you to find, but because all they really want is attention, and you and me are going to do each other a solid by not giving them any of that directly.

To preclude the sorts of reply guys who are going to insist on name-dropping the project and having a thread about the underlying drama in the comments, the short introduction is as follows:

For a few years now, a handful of reactionary trolls have been stoking division in the community by driving a wedge between X11 and Wayland users, pushing a conspiracy theory that paints RedHat as the DEI boogeyman of FOSS and assigning reactionary values to X11 and woke (pejorative) values to Wayland. Recently, reactionary opportunists “forked” Xorg, replaced all of the literature with political manifestos and dog-whistles, then used it as a platform to start shit with downstream Linux distros by petitioning for inclusion and sending concern trolls to waste everyone’s time.

The project itself is of little consequence; they serve our purposes today by providing us with case-studies in dealing with reactionary idiots starting shit in your community.

Unionize or die

Tech workers have long resisted the suggestion that we should be organized into unions. The topic is consistently met with a cold reception by tech workers when it is raised, and no big tech workforce is meaningfully organized. This is a fatal mistake – and I don’t mean “fatal” in the figurative sense. Tech workers, it’s time for you to unionize, and strike, or you and your loved ones are literally going to die.

In this article I will justify this statement and show that it is clearly not hyperbolic. I will explain exactly what you need to do, and how organized labor can and will save your life.

Hey – if you want to get involved in labor organizing in the tech sector, you should consider joining the new unitelabor.dev forum. I’m adding a heads-up here in case you don’t make it to the end of this very long blog post.

The imperative to organize is your economic self-interest

Before I talk about the threats to your life and liberty that you must confront through organized labor, let me re-iterate the economic position for unionizing your workplace. It is important to revisit this now, because the power politics of the tech sector has been rapidly changing over the past few years, and those changes are not in your favor.

The tech industry bourgeoisie has been waging a prolonged war on labor for at least a decade. Far from mounting any kind of resistance, most of tech labor doesn’t even understand that this is happening to them. Your boss is obsessed with making you powerless and replaceable. You may not realize how much leverage you have over your boss, but your boss certainly does – and has been doing everything in their power to undermine you before you wise up. Don’t let yourself believe you’re a part of their club – if your income depends on your salary, you are part of the working class.

Payroll – that’s you – is the single biggest expense for every tech company. When tech capitalists look at their balance sheet and start thinking of strategies for increasing profits, they see an awful lot of pesky zeroes stacked up next to the line item for payroll and benefits. Long-term, what’s their best play?

It starts with funneling cash and influence into educating a bigger, cheaper generation of compsci graduates to flood the labor market – “everyone can code”. Think about strategic investments in cheap(ish), broadly available courses, online schools and coding “bootcamps” – dangling your high salary as the carrot in front of wannabe coders fleeing dwindling prospects in other industries, knowing full well that the carrot won’t be nearly as big by the time they all step into a crowded labor market.

The next step is rolling, industry-wide mass layoffs – often obscured under the guise of “stack ranking” or some similar nonsense. Big tech has been callously cutting jobs everywhere, leaving workers out in the cold in batches of thousands or tens of thousands. If you don’t count yourself among them yet, maybe you will soon. What are your prospects for re-hire going to look like if this looming recession materializes in the next few years?

Consider what’s happening now – why do you think tech is driving AI mandates down from the top? Have you been ordered to use an LLM assistant to “help” with your programming? Have you even thought about why the executives would push this crap on you? You’re “training” your replacement. Do you really think that, if LLMs really are going to change the way we code, they aren’t going to change the way we’re paid for it? Do you think your boss doesn’t see AI as a chance to take $100M off of their payroll expenses?

Aren’t you worried you could get laid off and have a junior compsci grad or an H-1B worker take your place for half your salary? You should be – it’s happening everywhere. What are you going to do about it? Resent the younger generation of programmers just entering the tech workforce? Or the immigrant whose family pooled their resources to send them abroad to study and work? Or maybe you haven’t been laid off yet, and you fancy yourself better than the poor saps down the hall who were. Don’t be a sucker – your enemy isn’t in the cubicle next to you, or on the other side of the open office. Your enemy has an office with a door on it.

Listen: a tech union isn’t just about negotiating higher wages and benefits, although that’s definitely on the table. It’s about protecting yourself, and your colleagues, from the relentless campaign against labor that the tech leadership is waging against us. And more than that, it’s about seizing some of the awesome, society-bending power of the tech giants. Look around you and see what destructive ends this power is being applied to. You have your hands at the levers of this power if only you rise together with your peers and make demands.

And if you don’t, you are responsible for what’s going to happen next.

The imperative to organize is existential

If global warming is limited to 2°C, here’s what Palo Alto looks like in 2100:1

Map of Palo Alto showing flooding near the coast

Limiting warming to 2°C requires us to cut global emissions in half by 2030 – in 5 years – but emissions haven’t even peaked yet. Present-day climate policies are only expected to limit warming to 2.5°C to 2.9°C by 2100.2 Here’s Palo Alto in 75 years if we stay on our current course:

Map of Palo Alto showing much more extreme flooding

Here’s the Gulf of Mexico in 75 years:

Map of the Gulf of Mexico showing extensive flooding

This is what will happen if things don’t improve. Things aren’t improving – they’re getting worse. The US elected an anti-science president who backed out of the Paris agreement, for a start. Your boss is pouring all of our freshwater into datacenters to train these fucking LLMs and expanding into this exciting new market with millions of tons of emissions as the price of investment. Cryptocurrencies still account for a full 1% of global emissions. Datacenters as a whole account for 2%. That’s on us – tech workers. That is our fucking responsibility.

Climate change is accelerating, and faster than we thought, and the rich and powerful are making it happen faster. Climate catastrophe is not in the far future, it’s not our children or our children’s children, it’s us, it’s already happening. You and I will live to see dozens of global catastrophes playing out in our lifetimes, with horrifying results. Even if we started a revolution tomorrow and overthrew the ruling class and implemented aggressive climate policies right now we will still watch tens or hundreds of millions die.

Let’s say you are comfortably living outside of these blue areas, and you’ll be sitting pretty when Louisiana or Bruges or Fiji are flooded. Well, 13 million Americans are expected to have to migrate out of flooded areas – and 216 million globally3 – within 25 to 30 years. That’s just from the direct causes of climate change – as many as 1 billion could be displaced if we account for the ensuing global conflict and civil unrest.4 What do you think will happen to non-coastal cities and states when 4% of the American population is forced to flee their homes? You think you won’t be affected by that? What happens when anywhere from 2.5% to 12% of the Earth’s population becomes refugees?

What are you going to eat? Climate change is going to impact fresh water supplies and reduce the world’s agriculturally productive land. Livestock is expected to be reduced by 7-10% in just 25 years.5 Food prices will skyrocket and people will starve. 7% of all species on Earth may already be extinct because of human activities.6 You think that’s not going to affect you?

The overwhelming majority of the population supports climate action.7 The reason it’s not happening is because, under capitalism, capital is power, and the few have it and the many don’t. We live in a global plutocracy.

The plutocracy has an answer to climate change: fascism. When 12% of the world’s population is knocking at the doors of the global north, their answer will be concentration camps and mass murder. They are already working on it today. When the problem is capitalism, the capitalists will go to any lengths necessary to preserve the institutions that give them power – they always have. They have no moral compass or reason besides profit, wealth, and power. The 1% will burn and pillage and murder the 99% without blinking.

They are already murdering us. 1.2 million Americans are rationing their insulin.8 The healthcare industry, organized around the profit motive, murders 68,000 Americans per year.9 To the Europeans among my readership, don’t get too comfortable, because I assure you that our leaders are working on destroying our healthcare systems, too.

Someone you love will be laid off, get sick, and die because they can’t afford healthcare. Someone you know, probably many people that you know, will be killed by climate change. It might be someone you love. It might be you.

When you do get laid off mid-recession, when your employer replaces you and three of your peers with a fresh bootcamp “graduate” and a GitHub Copilot subscription, and when all of the companies you might apply to have done the same… how long can you keep paying rent? What about your friends and family, those who don’t have a cushy tech job or tech worker prospects, what happens when they get laid off or automated away or just priced out of the cost of living? Homelessness is at an all-time high and it’s only going to get higher. Being homeless takes 30 years off of your life expectancy.10 In the United States, there are 28 vacant homes for every homeless person.11

Capitalism is going to murder the people you love. Capitalism is going to murder you.

We need a different answer to the crises that we face. Fortunately, the working class can offer a better solution – one with a long history of success.

Organizing is the only answer and it will work

The rich are literally going to kill you and everyone you know and love just because it will make them richer. Because it is making them richer.

Do you want to do something about any of the real, urgent problems you face? Do you want to make meaningful, rapid progress on climate change, take the catastrophic consequences we are already guaranteed to face in stride, and keep your friends and family safe?

Well, tough shit – you can’t. Don’t tell me you’ll refuse the work, or that it’ll get done anyway without you, or that you can just find another job. They’ll replace you, you won’t find another job, and the world will still burn. You can’t vote your way to a solution, either: elections don’t matter, your vote doesn’t matter, and your voice is worthless to politicians.12 Martin Gilens and Benjamin Page demonstrated this most clearly in their 2014 study, “Testing Theories of American Politics: Elites, Interest Groups, and Average Citizens”.13

Gilens and Page plotted the odds of a policy proposal being adopted (Y axis) against public support for that policy (X axis). If policy adoption were entirely driven by public opinion, we would expect a 45° line (Y=X), where broad public support guarantees adoption and broad public opposition prevents it. We could also substitute “public opinion” with the opinions of different subsets of the public to see their relative impact on policy. Here’s what they got:

Two graphs, the first labelled “Average Citizens’ Preferences” and the second “Economic Elites’ Preferences”, showing that the former has little to no correlation with the odds of a policy being adopted, and the latter has a significant impact

For most of us, we get a flat line: Y, policy adoption, is completely unrelated to X, public support. Our opinion has no influence whatsoever on policy adoption. Public condemnation or widespread support has the same effect on a policy proposal, i.e. none. But for the wealthy, it’s a different story entirely. I’ve never seen it stated so plainly and clearly: the only thing that matters is money, wealth, and capital. Money is power, and the rich have it and you don’t.

Nevertheless, you must solve these problems. You must participate in finding and implementing solutions. You will be fucked if you don’t. But it is an unassailable fact that you can’t solve these problems, because you have no power – at least, not alone.

Together, we do have power. In fact, we can fuck with those bastards’ money and they will step in line if, and only if, we organize. It is the only solution, and it will work.

The ultra-rich possess no morals or ideology or passion or reason. They align with fascists because the fascists promise what they want, namely tax cuts, subsidies, favorable regulation, and cracking the skulls of socialists against the pavement. The rich hoard and pillage and murder with abandon for one reason and one reason only: it’s profitable. The rich always do what makes them richer, and only what makes them richer. Consequently, you need to make this a losing strategy. You need to make it more profitable to do what you want. To control the rich, you must threaten the only thing they care about.

Strikes are so costly for companies that they will do anything to prevent them – and if they fail to prevent them, then shareholders will pressure them to capitulate if only to stop the hemorrhaging of profit. This threat is so powerful that it doesn’t have to stop at negotiating your salary and benefits. You could demand your employer participate in boycotting Israel. You could demand that your employer stops anti-social lobbying efforts, or even adopts a pro-social lobbying program. You could demand that your CEO not support causes that threaten the lives and dignity of their queer or PoC employees. You could demand that they don’t bend the knee to fascists. If you get them where it hurts – their wallet – they will fall in line. They are more afraid of us than we are of them. They are terrified of us, and it’s time we used that to our advantage.

We know it works because it has always worked. In 2023, United Auto Workers went on strike and most workers won a 25% raise. In February, teachers in Los Angeles went on strike for just 8 days and secured a 19% raise. Nurses in Oregon won a 22% raise, better working schedules, and more this year – and Hawaiian nurses secured an agreement to improve worker/patient ratios in September. Tech workers could take a page out of the Writers Guild’s book – in 2023 they secured a prohibition against the use of their work to train AI models and the use of AI to suppress their wages.

Organized labor is powerful and consistently gets concessions from the rich and powerful in a way that no other strategy has ever been able to. It works, and we have a moral obligation to do it. Unions get results.

How to organize step by step

I will give you a step-by-step plan for exactly what you need to do to start moving the needle here. The process is as follows:

  1. Building solidarity and community with your peers
  2. Understanding your rights and how to organize safely
  3. Establishing the consensus to unionize, and doing it
  4. Promoting solidarity across tech workplaces and labor as a whole

Remember that you will not have to do this alone – in fact, that’s the whole point. Step one is building community with your colleagues. Get to know them personally, establish new friendships and grow the friendships you already have. Learn about each other’s wants, needs, passions, and so on, and find ways to support each other. If someone takes a sick day, organize someone to check on them and make them dinner or pick up their kids from school. Organize a board game night at your home with your colleagues, outside of work hours. Make it a regular event!

Talk to your colleagues about work, and your workplace. Tell each other about your salaries and benefits. When you get a raise, don’t be shy, tell your colleagues how much you got and how you negotiated it. Speak positively about each other at performance reviews and save critical feedback for their ears only. Offer each other advice about how to approach their boss to get their needs met, and be each other’s advocate.

Talk about the power you have to work together to accomplish bigger things. Talk about the advantage of collective action. It can start small – perhaps your team collectively refuses to incorporate LLMs into your workflow. Soon enough you and your colleagues will be thinking about unionizing.

Disclaimer: Knowledge about specific processes and legal considerations in this article is US-specific. Your local laws are likely similar, but you should research the differences with your colleagues.

The process of organizing a union in the US is explained step-by-step at workcenter.gov. More detailed resources, including access to union organizers in your neighborhood, are available from the American Federation of Labor and Congress of Industrial Organizations (AFL-CIO). But your biggest resources will be people already organizing in the tech sector: in particular you should consult CODE-CWA, which works with tech workers to provide mentoring and resources on organizing tech workplaces – and has already helped several tech workplaces organize their unions and start making a difference. They’ve got your back.

This is a good time to make sure that you and your colleagues understand your rights. First of all, you would be wise to pool your resources and retain a lawyer specializing in labor law – consult your local bar association to find one (it’s easy, just google it and they’ll have a web thing). Definitely reach out to AFL-CIO and CODE-CWA to meet experienced union organizers who can help you.

You cannot be lawfully fired or punished for discussing unions, workplace conditions, or your compensation and benefits, with your colleagues. You cannot be punished for distributing literature in support of your cause, especially if you do it off-site (even just outside of the front door). Be careful not to make careless remarks about your boss’s appearance, complain about the quality of your company’s products, make disparaging comments about clients or customers, etc – don’t give them an easy excuse. Hold meetings and discussions outside of work if necessary, and perform your duties as you normally would while organizing.

Once you start getting serious about organizing, your boss will start to work against you, but know that they cannot stop you. Nevertheless, you and/or some of your colleagues may run the risk of unlawful retaliation or termination for organizing – this is why you should have a lawyer on retainer. This is also why it’s important to establish systems of mutual aid, so that if one of your colleagues gets into trouble you can lean on each other to keep supporting your families. And, importantly, remember that HR works for the company, not for you. HR are the front lines that are going to execute the unionbusting mandates from above.

Once you have a consensus among your colleagues to organize – which you will know because they will have signed union cards – you can approach your employer to ask them to voluntarily recognize the union. If they agree, you open an organized dialogue amicably. If not, you will reach out to the National Labor Relations Board (NLRB) to organize a vote to unionize. Only organize a vote that you know you will win. Once your workplace votes to unionize, your employer is obligated to negotiate with you in good faith. Start making collective decisions about what you want from your employer and bring them to the table.

In this process, you will have established a relationship with more experienced union organizers who will continue to help you with conducting your union’s affairs and start getting results. The next step is to make yourself available for this purpose to the next tech workplace that wants to unionize: to share what you’ve learned and support the rest of the industry in solidarity. Talk to your friends across the industry and build solidarity and power en masse.

Prepare for the general strike on May 1st, 2028

The call has gone out: on May 1st, 2028 – International Workers’ Day, just under three years from now – there will be a general strike in the United States. The United Auto Workers union, one of the largest in the United States, has arranged for their collective bargaining agreements to end on this date, and has called for other unions to do the same across all industries. The American Federation of Teachers and its 1.2 million members are on board, and other unions are sure to follow. Your new union should be among them.

This is how we collectively challenge not just our own employers, but our political institutions as a whole. This is how we turn this nightmare around.

A mass strike is a difficult thing to organize. It is certain to be met with large-scale, coordinated, and well-funded propaganda and retaliation from the business and political spheres. Moreover, a mass strike depends on careful planning and mass mutual aid. We need to be prepared to support each other to get it done, and to plan and organize seriously. When you and your colleagues get organized, discuss this strike amongst yourselves and be prepared to join in solidarity with the rest of the 99% around the country and the world at large.

To commit yourselves to participate or get involved in the planning of the grassroots movement, see generalstrikeus.com.

Join unitelabor.dev

I’ve set up a Discourse instance for discussion, organizing, Q&A, and solidarity among tech workers at unitelabor.dev. Please check it out!

If you have any questions or feedback on this article, please post about it there.

Unionize or die

You must organize, and you must start now, or the worst will come to pass. Fight like your life depends on it, because it does. It has never been more urgent. The tech industry needs to stop fucking around and get organized.

We are powerful together. We can change things, and we must. Spread the word, in your workplace and with your friends and online. On the latter, be ready to fight just to speak – especially in our online spaces owned and controlled by the rich (ahem – YCombinator, Reddit, Twitter – etc). But fight all the same, and don’t stop fighting until we’re done.

We can do it, together.

Resources

Tech-specific:

General:

Send me more resources to add here!

The British Airways position on various border disputes

My spouse and I are on vacation in Japan, spending half our time seeing the sights and the other half working remotely and enjoying the experience of living in a different place for a while. To get here, we flew on British Airways from London to Tokyo, and I entertained myself on the long flight by browsing the interactive flight map on the back of my neighbor’s seat and trying to figure out how the poor developer who implemented this map solved the thorny problems that displaying a world map implies.

I began my survey by poking through the whole interface of this little in-seat entertainment system1 to see if I could find out anything about who made it or how it works – I was particularly curious to find a screen listing the open source licenses that such devices often disclose. To my dismay I found nothing at all – no information about who made it or what’s inside. I imagine that there must be some open source software in that thing, but I didn’t find any licenses or copyright statements.

When I turned my attention to the map itself, I did find one copyright statement, the only one I could find in the whole UI. If you zoom in enough, it switches from a satellite view to a street view showing the OpenStreetMap copyright line:

Picture of the display showing 'Street Maps: (c) OpenStreetMap contributors'
Note that all of the pictures in this article were taken by pointing my smartphone camera at the screen from an awkward angle, so fine-tune your expectations accordingly. I don’t have pictures to support every border claim documented in this article, but I did take notes during the flight.

Given that British Airways is the proud flag carrier of the United Kingdom I assume that this is indeed the only off-the-shelf copyrighted material included in this display, and everything else was developed in-house without relying on any open source software that might require a disclosure of license and copyright details. For similar reasons I am going to assume that all of the borders shown in this map are reflective of the official opinion of British Airways on various international disputes.

As I briefly mentioned a moment ago, this map has two views: satellite photography and a very basic street view. Your plane and its route are shown in real-time, and you can touch the screen to pan and zoom the map anywhere you like. You can also rotate the map and change the angle in “3D” if you have enough patience to use complex multitouch gestures on the cheapest touch panel they could find.

The street view is very sparse and only appears when you’re pretty far zoomed in, so it was mostly useless for this investigation. The satellite map, thankfully, includes labels: cities, country names, points of interest, and, importantly, national borders. The latter are very faint, however. Here’s an illustrative example:

A picture of the screen showing the area near the Caucasus mountains with the plane overflying the Caspian sea

We also have our first peek at a border dispute here: look closely between the “Georgia” and “Caucasus Mountains” labels. This ever-so-faint dotted line shows what I believe is the Russian-occupied territory of South Ossetia in Georgia. Disputes implicating Russia are not universally denoted as such – I took a peek at the border with Ukraine and found that Ukraine is shown as whole and undisputed, with its (undotted) border showing Donetsk, Luhansk, and Crimea entirely within Ukraine’s borders.

Of course, I didn’t start at Russian border disputes when I went looking for trouble. I went directly to Palestine. Or rather, I went to Israel, because Palestine doesn’t exist on this map:

Picture of the screen showing Israel

I squinted and looked very closely at the screen and I’m fairly certain that both the West Bank and Gaza are outlined in these dotted lines using the borders defined by the 1949 armistice. If you zoom in a bit more to the street view, you can see labels like “West Bank”, along with the “Area A” and “Area B” designations from the Oslo Accords:

Picture of the street map zoomed in on Ramallah

Given that this is British Airways, part of me was surprised not to see the whole area simply labelled Mandatory Palestine, but it is interesting to know that British Airways officially supports the Oslo Accords.

Heading south, let’s take a look at the situation in Sudan:

Picture of the satellite map over Sudan

This one is interesting – three areas within South Sudan’s claimed borders are disputed, and the map only shows two with these dotted lines. The border dispute with Sudan in the northeast is resolved in South Sudan’s favor. Another case where BA takes a stand is Guyana, which has an ongoing dispute with Venezuela – but the map only shows Guyana’s claim, albeit with a dotted line, rather than the usual approach of drawing both claims with dotted lines.

Next, I turned my attention to Taiwan:

Picture of the satellite map over eastern China and Taiwan

The cities of Taipei and Kaohsiung are labelled, but the island as a whole is not labelled “Taiwan”. I zoomed and panned and 3D-zoomed the map all over the place but was unable to get a “Taiwan” label to appear. I also zoomed into the OSM-provided street map and panned that around but couldn’t find “Taiwan” anywhere, either.

The last picture I took is of the Kashmir area:

Picture of the satellite map showing the Kashmir region

I find these faint borders difficult to interpret and I admit to not being very familiar with this conflict, but perhaps someone in the know with the patience to look more closely will email me their understanding of the official British Airways position on the Kashmir conflict (here’s the full sized picture).

Here are some other details I noted as I browsed the map:

  • The Hala’ib Triangle and Bir Tawil are shown with dotted lines
  • The Gulf of Mexico is labelled as such
  • Antarctica has no labelled borders or settlements

After this thrilling survey of the official political positions of British Airways, I spent the rest of the flight reading books or trying to sleep.

Resistance from the tech sector

As of late, most of us have been reading the news with a sense of anxious trepidation. At least, those of us who read from a position of relative comfort and privilege. Many more read the news with fear. Some of us are already no longer in a position to read the news at all, having become the unfortunate subjects of the news. Fascism is on the rise worldwide and in the United States the news is particularly alarming. The time has arrived to act.

The enemy wants you to be overwhelmed and depressed, to feel like the situation is out of your control. Propaganda is as effective on me as it is on you, and the despair and helplessness the enemy aims to engineer in us often prevails in my own life. We mustn’t fall for this gambit.

When it comes to resistance, I don’t have all of the answers, and I cannot present a holistic strategy for effective resistance. Nevertheless, I have put some thought towards how someone in my position, or in my community, can effectively apply ourselves towards resistance.

The fact of the matter is that the tech sector is extraordinarily important in enabling and facilitating contemporary fascism’s ascent to power. The United States is embracing a technocratic fascism at the hands of Elon Musk and his techno-fetishist “Department of Government Efficiency”. Using memes to mobilize the terminally online neo-right, and “digitizing” and “modernizing” government institutions with the dazzling miracles of modern technology, the strategy puts tech, in its mythologized form – prophesied, even, through the medium of science fiction – at the center of a revolution of authoritarian hate.

And still, this glitz and razzle dazzle act obscures the more profound and dangerous applications of tech hegemony to fascism. Allow me to introduce public enemy number one: Palantir. Under the direction of neo-fascist Peter Thiel and in collaboration with ICE, Palantir is applying the innovations of the last few decades of surveillance capitalism to implementing a database of undesirables the Nazis could have never dreamed of. Where DOGE is hilariously tragic, Palantir is nightmarishly effective.

It’s clear that the regime will be digital. The through line is tech – and the tech sector depends on tech workers. That’s us. This puts us in a position to act, and compels us to act. But then, what should we do?

If there’s one thing I want you to take away from this article, something to write on your mirror and repeat aloud to yourself every day, it’s this: there’s safety in numbers. It is of the utmost importance that we dispense with American individualism and join hands with our allies to resist as one. Find your people in your local community, and especially in your workplace, who you can trust and who believe in what’s right and that you can depend on for support. It’s easier if you’re not going it alone. Talk to your colleagues about your worries and lean on them to ease your fears, and allow them to lean on you in turn.

One of the most important actions you can take is to unionize your workplace. We are long overdue for a tech workers union. If tech workers unionize then we can compel our employers – this regime’s instruments of fascist power – to resist also. If you’re at the bottom looking up at your boss’s boss’s boss cozying up with fascists, know that with a union you can pull the foundations of his power out from beneath him.

More direct means of resistance are also possible, especially for the privileged and highly paid employees of big tech. Maneuver yourself towards the levers of power. At your current job, find your way onto the teams implementing the technology that enables authoritarianism, and fuck it up. Drop the database by “mistake”. Overlook bugs. Be confidently wrong in code reviews and meetings. Apply for a job at Palantir, and be incompetent at it. Make yourself a single point of failure, then fail. Remember too that plausible deniability is key – make them work to figure out that you are the problem.

This sort of action is scary and much riskier than you’re probably immediately comfortable with. Inaction carries risks also. Only you are able to decide what your tolerance for risk is, and what kind of action that calls for. If your appetite for risk doesn’t permit sabotage, you could simply refuse to work on projects that aren’t right. Supporting others is essential resistance, too – be there for your friends, especially those more vulnerable than yourself, and support the people who engage in direct resistance. You didn’t see nuffin, right? If your allies get fired for fucking up an important digital surveillance project – you’ll have a glowing reference for them when they apply for Palantir, right?

Big tech has become the problem, and it’s time for tech workers to be a part of the solution. If this scares you – and it should – I get it. I’m scared, too. It’s okay for it to be scary. It’s okay for you not to do anything about it right now. All you have to do right now is be there for your friends and loved ones, and answer this question: where will you draw the line?

Remember your answer, and if and when it comes to pass… you will know when to act. Don’t let them shift your private goalposts until the frog is well and truly boiled to death.

Hang in there.

A Firefox addon for putting prices into perspective

I had a fun idea for a small project this weekend, and so I quickly put it together over a couple of days. The result is Price Perspective.

Humor me: have you ever bought something, considered the price, and wondered how that price would look to someone else? Someone in the developing world, or a billionaire, or just your friend in Australia? In other words, can we develop an intuition for purchasing power?

The Price Perspective add-on answers these questions. Let’s consider an example: my income is sufficient to buy myself a delivery pizza for dinner without a second thought. How much work does it take for someone in Afghanistan to buy the same pizza? I can fire up Price Perspective to check:

The results are pretty shocking.

How about another example: say I’m looking to buy a house in the Netherlands. I fire up funda.nl and look at a few places in Amsterdam. After a few minutes wondering if I’ll ever be in an economic position to actually afford any of these homes (and speculating on if that day will come before or after I have spent this much money on rent over my lifetime), I wonder what these prices look like from the other side. Let’s see what it’d take for the Zuck to buy this apartment I fancy:

Well… that’s depressing. Let’s experiment with Price Perspective to see what it would take to make a dent in Zuck’s wallet. Let’s add some zeroes.

So, Zuckerberg over-bidding on this apartment to the tune of €6.5B would cost him a proportion of his annual income comparable to me buying it for €5,000.
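
The arithmetic behind that comparison is just a ratio of incomes: treat the price as a fraction of the buyer’s annual income, then re-express that same fraction against someone else’s income. Here’s a minimal sketch of that proportional scaling – all of the figures below are made up for illustration and stand in for the real data the add-on uses:

    def equivalent_price(price, their_annual_income, my_annual_income):
        # Re-express a price so that it represents the same fraction of a
        # year's income for me as it does for them.
        return price * (my_annual_income / their_annual_income)

    # Hypothetical numbers, chosen only to illustrate the idea:
    apartment_bid = 6_500_000_000    # the EUR 6.5B over-bid
    their_income  = 65_000_000_000   # assumed annual income, EUR
    my_income     = 50_000           # assumed annual income, EUR

    print(equivalent_price(apartment_bid, their_income, my_income))  # 5000.0

The add-on presumably works from per-country income and purchasing-power statistics rather than hard-coded values, but the ratio is the same either way.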

How about the reverse? How long would I have to work to buy, say, Jeff Bezos’s new mansion?

Yep. That level of wealth inequality is a sign of a totally normal, healthy, well-functioning society.

Curious to try it out for yourself? Get Price Perspective from addons.mozilla.org, tell it where you live and how much money you make in a year, and develop your own sense of perspective.

Using linkhut to signal-boost my bookmarks


Notice: linkhut has started to use generative AI tools upstream. Consequently, I have withdrawn my endorsement of the project.


It must have been at least a year ago that I first noticed linkhut, and its flagship instance at ln.ht, appear on SourceHut, where it immediately caught my attention for its good taste in inspirations. Once upon a time, I had a Pinboard account, which is a similar concept, but I never used it for anything in the end. When I saw linkhut I had a similar experience: I signed up and played with it for a few minutes before moving on.

I’ve been rethinking my relationship with social media lately, as some may have inferred from my unannounced disappearance from Mastodon.1 While reflecting on this again recently, in a stroke of belated inspiration I suddenly appreciated the appeal of tools like linkhut, especially alongside RSS feeds – signal-boosting stuff I read and found interesting.

This appeal reminds me of one of the things that drew me to SoundCloud, back when I used it circa… 2013? That is: I could listen to the music that artists I liked were listening to, and that was amazing for discovering new music. Similarly, for those of you who enjoy my blog posts, and want to read the stuff I like reading, check out my linkhut feed. You can even subscribe to its RSS feed if you like. There isn’t much there today, but I will be filling it up with interesting articles I see and projects I find online.

I want to read your linkhut feed, too, but it’s pretty quiet there at the moment. If you find the idea interesting, sign up for an account or set up your own instance and start bookmarking stuff – and email me your feed so I can find some good stuff to subscribe to in my own feed reader.

Please stop externalizing your costs directly into my face

This blog post is expressing personal experiences and opinions and doesn’t reflect any official policies of SourceHut.

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale. This isn’t the first time SourceHut has been at the wrong end of some malicious bullshit or paid someone else’s externalized costs – every couple of years someone invents a new way of ruining my day.

Four years ago, we decided to require payment to use our CI services because it was being abused to mine cryptocurrency. We alternated between periods of designing and deploying tools to curb this abuse and periods of near-complete outage when they adapted to our mitigations and saturated all of our compute with miners seeking a profit. It was bad enough having to beg my friends and family to avoid “investing” in the scam without having the scam break into my business and trash the place every day.

Two years ago, we threatened to blacklist the Go module mirror because for some reason the Go team thinks that running terabytes of git clones all day, every day for every Go project on git.sr.ht is cheaper than maintaining any state or using webhooks or coordinating the work between instances or even just designing a module system that doesn’t require Google to DoS git forges whose entire annual budgets are considerably smaller than a single Google engineer’s salary.

Now it’s LLMs. If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned, including expensive endpoints like git blame, every page of every git log, and every commit in every repo, and they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses – mostly residential, in unrelated subnets, each one making no more than one HTTP request over any time period we tried to measure – actively and maliciously adapting and blending in with end-user traffic and avoiding attempts to characterize their behavior or block their traffic.
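
To make concrete why that traffic pattern is so hard to block, here is a toy sketch – emphatically not our actual mitigations – of a naive per-IP rate limiter. A crawl spread across tens of thousands of addresses, one request each, sails straight under any per-IP threshold even though the aggregate load is crushing:

    from collections import Counter

    REQUESTS_ALLOWED_PER_IP = 10  # per time window; illustrative threshold

    def blocked_ips(requests):
        # requests: iterable of (ip, path) pairs observed during one window
        per_ip = Counter(ip for ip, _ in requests)
        return {ip for ip, n in per_ip.items() if n > REQUESTS_ALLOWED_PER_IP}

    # 50,000 distinct addresses (placeholders), one expensive request each:
    crawl = [(f"10.0.{i // 256}.{i % 256}", "/~user/repo/blame/main/file.c")
             for i in range(50_000)]
    print(blocked_ips(crawl))  # set() -- nothing crosses the threshold

Anything that actually works has to key on something other than the source address, which goes a long way towards explaining why mitigations keep catching legitimate users in the crossfire.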

We are experiencing dozens of brief outages per week, and I have to review our mitigations several times per day to keep that number from getting any higher. When I do have time to work on something else, often I have to drop it when all of our alarms go off because our current set of mitigations stopped working. Several high-priority tasks at SourceHut have been delayed weeks or even months because we keep being interrupted to deal with these bots, and many users have been negatively affected because our mitigations can’t always reliably distinguish users from bots.

All of my sysadmin friends are dealing with the same problems. I was asking one of them for feedback on a draft of this article and our discussion was interrupted to go deal with a new wave of LLM bots on their own server. Every time I sit down for beers or dinner or to socialize with my sysadmin friends it’s not long before we’re complaining about the bots and asking if the other has cracked the code to getting rid of them once and for all. The desperation in these conversations is palpable.

Whether it’s cryptocurrency scammers mining with FOSS compute resources or Google engineers too lazy to design their software properly or Silicon Valley ripping off all the data they can get their hands on at everyone else’s expense… I am sick and tired of having all of these costs externalized directly into my fucking face. Do something productive for society or get the hell away from my servers. Put all of those billions and billions of dollars towards the common good before sysadmins collectively start a revolution to do it for you.

Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones, just stop. If blasting CO2 into the air and ruining all of our freshwater and traumatizing cheap laborers and making every sysadmin you know miserable and ripping off code and books and art at scale and ruining our fucking democracy isn’t enough for you to leave this shit alone, what is?

If you personally work on developing LLMs et al, know this: I will never work with you again, and I will remember which side you picked when the bubble bursts.

A holistic perspective on intellectual property, part 1

I’d like to write about intellectual property in depth, in this first of a series of blog posts on the subject. I’m not a philosopher, but philosophy is the basis of reasonable politics so buckle up for a healthy Friday afternoon serving of it.

To understand intellectual property, we must first establish at least a shallow understanding of property generally. What is property?1 An incomplete answer might state that a material object I have power over is my property. An apple I have held in my hand is mine, insofar as nothing prevents me from using it (and, in the process, destroying it), or giving it away, or planting it in the ground. However, you might not agree that this apple is necessarily mine if I took it from a fruit stand without permission. This act is called “theft” — one of many possible transgressions upon property.

It is important to note that the very possibility that one could illicitly assume possession of an object is a strong indication that “property” is a social convention, rather than a law of nature; one cannot defy the law of gravity in the same way as one can defy property. And, given that, we could try to imagine other social conventions to govern the use of things in a society. If we come up with an idea we like, and we’re in a radical mood, we could even challenge the notion of property in society at large and seek to implement a different social convention.

As it stands today, the social convention tells us property is a thing which has an “owner”, or owners, to whom society confers certain rights with respect to the thing in question. That may include, for example, the right to use it, to destroy it, to exclude others from using it, to sell it, or give it away, and so on. Property is this special idea society uses to grant you the authority to use a bunch of verbs with respect to a thing. However, being a social convention, nothing prevents me from using any of these verbs on something society does not recognize as my property, e.g. by selling you this bridge. This is why the social convention must be enforced.

And how is it enforced? We could enforce property rights with shame: stealing can put a stain on one’s reputation, and this shame may pose an impediment to one’s social needs and desires, and as such theft is discouraged. We can also use guilt: if you steal something, but don’t get caught, you could end up remorseful without anyone to shame you for it, particularly with respect to the harm done to the person who suffered a loss of property as a result. Ultimately, in modern society the social convention of property is enforced with, well, force. If you steal something, society has appointed someone with a gun to track you down, restrain you, and eventually lock you up in a miserable room with bars on the windows.


I’d like to take a moment here to acknowledge the hubris of property: we see the bounty of the natural world and impose upon it these imagined rights and privileges, divvy it up and hand it out and hoard it, and resort to cruelty if anyone steps out of line. Indeed this may be justifiable if the system of private property is sufficiently beneficial to society, and the notion of property is so deeply ingrained into our system that it feels normal and unremarkable. It’s worth remembering that it has trade-offs, that we made the whole thing up, and that we can make up something else with different trade-offs. That being said, I’m personally fond of most of my personal property and I’d like to keep enjoying most of my property rights as such, so take from that what you will.2


One way we can justify property rights is by using them as a tool for managing scarcity. If demand for coffee exceeds the supply of coffee beans, a scarcity exists, meaning that not everyone who wants to have coffee gets to have some. But, we still want to enjoy scarce things. Perhaps someone who foregoes coffee will enjoy some other scarce resource, such as tea — then everyone can benefit in some part from some access to scarce resources. I suppose that the social convention of property can derive some natural legitimacy from the fact that some resources are scarce.3 In this sense, private property relates to the problem of distribution.

But a naive solution to distribution has flaws. For example, what of hoarding? Are property rights legitimate when someone takes more than they need or intend to use? This behavior could be motivated by an antagonistic relationship with society at large, such as a means of driving up prices for private profit; such behavior could be considered anti-social and thus a violation of the social convention as such.

Moreover, property which is destroyed by its use, such as coffee, is one matter, but further questions are raised when we consider durable goods, such as a screwdriver. The screwdriver in my shed spends the vast majority of its time out of use. Is it just for me to assert property rights over my screwdriver when I am not using it? To what extent is the scarcity of screwdrivers necessary? Screwdrivers are not fundamentally scarce, given that the supply of idle screwdrivers far outpaces the demand for screwdriver use, but our modern conception of property has the unintended consequence of creating scarcity where there is none by denying the use of idle screwdrivers where they are needed.

Let’s try to generalize our understanding of property, working our way towards “intellectual property” one step at a time. To begin with, what happens if we expand our understanding of property to include immaterial things? Consider domain names as a kind of property. In theory, domain names are abundant, but some names are more desirable than others. We assert property rights over them, in particular the right to use a name and exclude others from using it, or to derive a profit from exclusive use of a desirable name.

But a domain name doesn’t really exist per se: it’s just an entry in a ledger. The electric charge on the hard drives in your nearest DNS server’s database exists, but the domain name it represents doesn’t exist in quite the same sense as the electrons do: it’s immaterial. Is applying our conception of property to these immaterial things justifiable?

We can start answering this question by acknowledging that property rights are useful for domain names, in that this gives domain names desirable properties that serve productive ends in society. For example, exclusive control over a domain name allows a sense of authenticity to emerge from its use, so that you understand that pointing your browser to drewdevault.com will return the content that the person, Drew DeVault, wrote for you. We should also acknowledge that there are negative side-effects of asserting property rights over domains, such as domain squatting, extortionate pricing for “premium” domain names, and the advantage one party has over another if they possess a desirable name by mere fact of that possession, irrespective of merit.

On the balance of things, if we concede the legitimacy of personal property4 I find it relatively easy to concede the legitimacy of this sort of property, too.

The next step is to consider if we can generalize property rights to govern immaterial, non-finite things, like a story. A book, its paper and bindings and ink, is a material, finite resource, and can be thought of in terms that apply to material property. But what of the words formed by the ink? They can be trivially copied with a pen and paper, or transformed into a new medium by reading it aloud to an audience, and these processes do not infringe on the material property rights associated with the book. This process cannot be thought of as stealing, as the person who possesses a copy of the book is not asserting property rights over the original. In our current intellectual property regime, this person is transgressing via use of the idea, the intellectual property — the thing in the abstract space occupied by the story itself. Is that, too, a just extension of our notion of property?

Imagine with me the relationship one has with one’s property, independent of the social constructs around property. With respect to material property, a relationship of possession exists: I physically possess a thing, and I have the ability to make use of it through my possession of it. If someone else were to deny me access to this thing, they would have to resort to force, and I would have to resort to force should I resist their efforts.

Our relationship with intellectual property is much different. An idea cannot be withheld or seized by force. Instead, our relationship to intellectual property is defined by our history with respect to an idea. In the case of material property, the ground truth is that I keep it locked in my home to deny others access to it, and the social construct formalizes this relationship. With respect to intellectual property, such as the story in a book, the ground truth is that, sometime in the past, I imagined it and wrote it down. The social construct of intellectual property invents an imagined relationship of possession, modelled after our relationship with material property.

Why?

The resource with the greatest and most fundamental scarcity is our time,5 and as a consequence the labor which goes into making something is of profound importance. Marx famously argued for a “labor theory of value”, which tells us that the value inherent in a good or service is in the labor which is required to provide it. I think he was on to something!6 Intellectual property is not scarce, nor can it be possessed, but it does have value, and that value could ultimately be derived from the labor which produced it.

The social justification for intellectual property as a legal concept is rooted in the value of this labor. We recognize that intellectual labor is valuable, and produces an artifact — e.g. a story — which is valuable, but is not scarce. A capitalist society fundamentally depends on scarcity to function, and so through intellectual property norms we create an artificial scarcity to reward (and incentivize) intellectual labor without questioning our fundamental assumptions about capitalism and value.7 But, I digress — let’s revisit the subject in part two.

In part two of this series on intellectual property, I will explain the modern intellectual property regime as I understand it, as well as its history and justification. So equipped with the philosophical and legal background, part three will constitute the bulk of my critique of intellectual property, and my ideals for reform. Part four will examine how these ideas altogether apply in practice to open source, as well as the hairy questions of intellectual property as applied to modern problems in this space, such as the use of LLMs to file the serial numbers off of open source software.


If you want to dive deeper into the philosophy here, a great resource is the Stanford Encyclopedia of Philosophy. Check out their articles on Property and Ownership and Redistribution for a start, which expand on some of the ideas I’ve drawn on here and possess a wealth of citations catalogued with a discipline I can never seem to muster for my blog posts. I am a programmer, not a philosopher, so if you want to learn more about this you should go read from the hundreds of years of philosophers who have worked on this with rigor and written down a bunch of interesting ideas.

Join us to discuss transparency and governance at FOSDEM '25

Good news: it appears that Jack Dorsey’s FOSDEM talk has been cancelled!

This is a follow up to two earlier posts, which you can read here: one and two.

I say it “appears” so, because there has been no official statement from anyone to that effect. There has also been no communication from staff to the protest organizers, including any reply to the email we sent, as requested, to discuss fire safety and crowd control concerns with the staff. The situation is a bit unclear, but… we’ll extend FOSDEM the benefit of the doubt, and with it our gratitude. From all of the volunteers who have been organizing this protest action, we extend our heartfelt thanks to the staff for reconsidering the decision to platform Dorsey and Block, Inc. at FOSDEM. All of us – long-time FOSDEM volunteers, speakers, devroom organizers, and attendees – are relieved to know that FOSDEM stands for our community’s interests.

More importantly: what comes next?

The frustration the community felt at learning that Block was sponsoring FOSDEM and one of the keynote slots1 had been given to Dorsey and his colleagues uncovered some deeper frustrations with the way FOSDEM is run these days. This year is FOSDEM’s 25th anniversary, and it seems sorely overdue for graduating from the “trust us, it’s crazy behind the scenes” governance model to something more aligned with the spirit of open source.

We trust the FOSDEM organizers — we can extend them the benefit of the doubt when they tell us that talk selection is independent of sponsorships. But it strains our presumption of good faith when the talk proposal was rejected by 3 of the 4 independent reviewers and went through anyway. And it’s kind of weird that we have to take them at their word — that the talk selection process isn’t documented anywhere publicly, nor the conflict of interest policy, nor the sponsorship terms, nor almost anything at all about how FOSDEM operates or is governed internally. Who makes decisions? How? We don’t know, and that’s kind of weird for something so important in the open source space.

Esther Payne, a speaker at FOSDEM 2020, summed up these concerns:

Why do we have so little information on the FOSDEM site about the budget and just how incorporated is FOSDEM as an organisation? How do the laws of Belgium affect the legalities of the organisation? How is the bank account administrated? How much money goes into the costs of this year, and how much of the budget goes into startup costs for the next year?

Peter Zaitsev, a long-time devroom organizer and FOSDEM speaker for many years, asked similar questions last year. I’ve spoken to the volunteers who signed up for the protest – we’re relieved that Dorsey’s talk has been cancelled, but we’re still left with big questions about transparency and governance at FOSDEM.

So, what’s next?

Let’s do something useful with that now-empty time slot in Janson. Anyone who planned to attend the protest is encouraged to come anyway on Sunday at 12:00 PM, where we’re going to talk – amongst ourselves and with anyone else who shows up – about what we want from FOSDEM in the future, and what a transparent and participatory model of governance would look like. We would be thrilled if anyone on the FOSDEM staff wants to join the conversation as well, assuming their busy schedule permits. We’ll prepare a summary of our discussion and our findings to submit to the staff and the FOSDEM community for consideration after the event.

Until then – I’ll see you there!

NOTICE: The discussion session has been cancelled. After meeting with many of the protest volunteers and discussing the matter among the organizers, we have agreed that de-platforming Dorsey is mission success and improvising further action isn’t worth the trouble. We’ll be moving for reforms at FOSDEM after the event – I’ll keep you posted.


P.S. It’s a shame we won’t end up handing out our pamphlets. The volunteers working on that came up with this amazing flyer and I think it doesn’t deserve to go unseen:

We will be doing a modest print run for posterity — find one of us at FOSDEM if you want one.

FOSDEM '25 protest

Update: Dorsey’s talk was cancelled! See the update here.

Last week, I wrote to object to Jack Dorsey and his company, Block, Inc., being accepted as main track speakers at FOSDEM, and proposed a protest action in response. FOSDEM issued a statement about our plans on Thursday.

Today, I have some updates for you regarding the planned action.

I would like to emphasize that we are not protesting FOSDEM or its organizers. First and foremost, we are protesting to stop Jack Dorsey and his company from promoting their business at FOSDEM. We are members of the FOSDEM community. We have variously been speakers, devroom organizers, volunteers, and attendees for years — in other words, we are not activism tourists. We have a deep appreciation for the organizers and all of the work that they have done over the years to make FOSDEM such a success.

That we are taking action demonstrates that we value FOSDEM, that we believe it represents our community, and that we want to defend its — our — ethos. Insofar as we have a message to the FOSDEM organizers, it is one of gratitude, and an appeal to build a more open and participatory process, in the spirit of open source, and especially to improve the transparency of the talk selection process, sponsorship terms, and conflict of interest policies, so protests like ours are not necessary in the future. To be clear, we do not object to the need for sponsors generally at FOSDEM — we understand that FOSDEM is a free, volunteer driven event, many of us having volunteered for years — but we do object specifically to Jack Dorsey and Block, Inc. being selected as sponsors and especially as speakers.

As for the planned action, I have some more information for anyone who wishes to participate. Our purpose is to peacefully disrupt Dorsey’s talk, and only Dorsey’s talk, which is scheduled to take place between 12:00 and 12:30 on Sunday, February 2nd in Janson. If you intend to participate, we will be meeting outside of the upper entrance to Janson at 11:45 AM. We will be occupying the stage for the duration of the scheduled time slot in order to prevent the talk from proceeding as planned.

To maintain the peaceful nature of our protest and minimize the disruption to FOSDEM generally, we ask participants to strictly adhere to the following instructions:

  1. Do not touch anyone else, or anyone else’s property, for any reason.
  2. Do not engage in intimidation.
  3. Remain quiet and peaceful throughout the demonstration.
  4. When the protest ends, disperse peacefully and in a timely manner.
  5. Leave the room the way you found it.

Dorsey’s time slot is scheduled to end at 12:30, but we may end up staying as late as 14:00 to hand the room over to the next scheduled talk.

I’ve been pleased by the response from volunteers (some of whom helped with this update — thanks!), but we still need a few more! I have set up a mailing list for planning the action. If you plan to join, and especially if you’re willing and able to help with additional tasks that need to be organized, please contact me directly to receive an invitation to the mailing list.

Finally, I have some corrections to issue regarding last week’s blog post.

In the days since I wrote my earlier blog post, Dorsey’s talk has been removed from the list of keynotes and moved to the main track, where it will occupy the same time slot in the same room but not necessarily be categorized as a “keynote”.

It has also been pointed out that Dorsey does not bear sole responsibility for Twitter’s sale. However, he is complicit and he profited handsomely from the sale and all of its harmful consequences. The sale left the platform at the disposal of the far right, causing a sharp rise in hate speech and harassment and the layoffs of 3,700 of the Twitter employees that made it worth so much in the first place.

His complicity, along with his present-day activities at Block, Inc. and the priorities of the company that he represents as CEO — its irresponsible climate policy, $120M in fines for enabling consumer fraud, and the layoffs of another 1,000 employees in 2024 despite posting record profits on $5B in revenue — is enough of a threat to our community and its ethos to raise alarm at his participation in FOSDEM. We find this compelling enough to take action to prevent him and his colleagues from using FOSDEM’s platform to present themselves as good actors in our community and sell us their new “AI agentic framework”.

The open source community and FOSDEM itself would not exist without collective action. Our protest to defend its principles is in that spirit. Together we can, and will, de-platform Jack Dorsey.

I’ll see you there!

No billionaires at FOSDEM

Update: Dorsey’s talk was cancelled! See the update here.

Jack Dorsey, former CEO of Twitter, ousted board member of BlueSky, and grifter extraordinaire to the tune of a $5.6B net worth, is giving a keynote at FOSDEM.

The FOSDEM keynote stage is one of the biggest platforms in the free software community. Janson is the biggest venue in the event – its huge auditorium can accommodate over 1,500 of FOSDEM’s 8,000-odd attendees, and it is live-streamed to a worldwide audience as the face of one of the free and open source software community’s biggest events of the year. We’ve platformed Red Hat, the NLNet Foundation, NASA, numerous illustrious community leaders, and many smaller projects that embody our values and spirit on this stage, to talk about their work or the important challenges our community faces.

Some of these challenges, as a matter of fact, are Jack Dorsey’s fault. In 2023 this stage hosted Hachyderm’s Kris Nóva to discuss an exodus of Twitter refugees to the fediverse. After Dorsey sold Twitter to Elon Musk, selling the platform out to the far right for a crisp billion-with-a-“B” dollar payout, the FOSS community shouldered the burden – both with our labor and our wallets – of a massive exodus onto our volunteer-operated servers, especially from victims fleeing the hate speech and harassment left in the wake of the sale. Two years later one of the principal architects of, and beneficiaries of, that disaster will step onto the same stage. Even if our community hadn’t been directly harmed by Dorsey’s actions, I don’t think that we owe this honor to someone who took a billion dollars to ruin their project, ostracize their users, and destroy the livelihoods of almost everyone who worked on it.

Dorsey is presumably being platformed in Janson because his blockchain bullshit company is a main sponsor of FOSDEM this year. Dorsey and his colleagues want to get us up to speed on what Block is working on these days. Allow me to give you a preview: in addition to posting $5B in revenue and a 21% increase in YoY profit in 2024, Jack Dorsey laid off 1,000 employees, ordering them not to publicly discuss board member Jay-Z’s contemporary sexual assault allegations on their way out. He also announced a new bitcoin mining ASIC in collaboration with Core Scientific, who presumably installed them into their new 100MW bitcoin mining installation in Muskogee, OK, proudly served by the Muskogee Generating Station, a fossil fuel power plant responsible for 11 million tons of annual CO2 emissions and an estimated 62 excess deaths in the local area due to its pollution. Nice.

In my view, billionaires are not welcome at FOSDEM. If billionaires want to participate in FOSS, I’m going to ask them to refrain from using our platforms to talk about their AI/blockchain/bitcoin/climate-disaster-as-a-service grifty business ventures, and instead buy our respect by, say, donating 250 million dollars to NLNet or the Sovereign Tech Fund. That figure, as a percentage of Dorsey’s wealth, is proportional to the amount of money I donate to FOSS every year, by the way. That kind of money would keep the FOSS community running for decades.

I do not want to platform Jack Dorsey on this stage. To that end, I am organizing a sit-in, in which I and anyone who will join me are going to sit ourselves down on the Janson stage during his allocated time slot and peacefully prevent the talk from proceeding as scheduled. We will be meeting at 11:45 AM outside of Janson, 15 minutes prior to Dorsey’s scheduled time slot. Once the stage is free from the previous speaker, we will sit on the stage until 12:30 PM. Bring a good book. If you want to help organize this sit-in, or just let me know that you intend to participate, please contact me via email; I’ll set up a mailing list if there’s enough interest in organizing things like printing out pamphlets to this effect, or even preparing an alternative talk to “schedule” in his slot.


Follow-up: FOSDEM ’25 protest

Neurodivergence and accountability in free software

In November of last year, I wrote Richard Stallman’s political discourse on sex, which argues that Richard Stallman, the founder of and present-day voting member of the board of directors of the Free Software Foundation (FSF), endorses and advocates for a harmful political agenda which legitimizes adult attraction to minors, consistently defends adults accused of and convicted of sexual crimes with respect to minors, and more generally erodes norms of consent and manipulates language regarding sexual harassment and sexual assault in his broader political program.

In response to this article, and on many occasions when I have reiterated my position on Stallman in other contexts, a common response is to assert that my calls to censure Stallman are ableist, on the basis that Stallman is neurodivergent (ND). This line of reasoning suggests that Stallman’s awkward and zealous views on sex are in line with his awkward and zealous positions on other matters (such as his insistence on “GNU/Linux” terminology rather than “Linux”), and that together these illustrate a pattern which points to neurodivergence. This argument is flawed, but I think it presents us with a good opportunity to talk about how neurodivergence and sexism present in our community.

Neurodivergence (antonymous with “neurotypical”) is an umbrella term that encompasses a wide variety of human experiences, including autism, ADHD, personality disorders, bipolar disorder, and others. The particular claims I’ve heard about Stallman suggest that he is “obviously” autistic, or has Asperger syndrome.1 The allegation of ableism in my criticisms of Stallman is rooted in this presumption of neurodivergence: the argument goes that I am putting his awkwardness on display and mocking him for it, that calling for the expulsion of someone on the basis of being awkward is ableist, and that this has a chilling effect on our community, which is generally thought to have a high incidence of neurodivergence. I will respond to this defense of Stallman today.

A defense of problematic behavior that cites neurodivergence not only to explain, but to excuse, said behavior is itself ableist: it harms neurodivergent people rather than standing up for them, as these arguments claim to do. To illustrate this, I opened a discussion on the Fediverse asking neurodivergent people to chime in and reached out directly to some ND friends in my social circle.


Aside: Is Stallman neurodivergent?

Stallman’s neurodivergence is an unsolicited armchair diagnosis with no supporting evidence besides “vibes”. This 2008 article summarizes his public statements on the subject:

“During a 2000 profile for the Toronto Star, Stallman described himself to an interviewer as ‘borderline autistic,’ a description that goes a long way toward explaining a lifelong tendency toward social and emotional isolation and the equally lifelong effort to overcome it,” Williams wrote.

When I cited that excerpt from the book during the interview, Stallman said that assessment was “exaggerated.”

“I wonder about it, but that’s as far as it goes,” he said. “Now, it’s clear I do not have [Asperger’s] — I don’t have most of the characteristics of that. For instance, one of those characteristics is having trouble with rhythm. I love the most complicated, fascinating rhythms.” But Stallman did acknowledge that he has “a few of the characteristics” and that he “might have what some people call a ‘shadow’ version of it.”

The theory that Stallman is neurodivergent is usually cited to explain his various off-putting behaviors, but there is no tangible evidence to support the theory. This alone should raise some alarms: it treats off-putting behavior as sufficient evidence to presume neurodivergence. I agree that some of his behavior, off-putting or otherwise, appears consistent, to my untrained eye, with some of the symptoms of autism. Nevertheless I am not going to forward an armchair diagnosis in either direction. However, because a defense of Stallman on the basis of neurodivergence is contingent on him being neurodivergent, the rest of this article will presume that it is true for the purpose of rebuttal.

tl;dr: we don’t know and the assumption that he is is ableist.


This defense of Stallman is ableist because it infantilizes and denies agency to neurodivergent people. Consider what’s being said here: it only follows that Stallman’s repugnant behavior is excusable because he’s neurodivergent if neurodivergent people cannot help but be repugnant. An autistic person I spoke to, who wishes to remain anonymous, had the following to say:

As an autistic person, I find these statements deeply offensive, because they build on and perpetuate damaging stereotypes.

Research has repeatedly proved that, on average, autistic folks have high empathy and a higher sense of values than the general population. We are not the emotionless robots that the popular imagination believes we are.

But we are not a monolith, and some autistic folks are absolute assholes who should be called out (and held accountable) for the harm that they cause. Autism is context, not an excuse: it can explain why someone might struggle in some situations and need additional support, but it should never be an excuse to harm others. We can all learn and improve.

I have witnessed people pulling the autism card to avoid consequences for CoC violations, then calling out the organization for “not supporting true diversity” when they’re shown the door. This is manipulative and insulting to the other neurodivergent members of the community, and should never be tolerated.

Bram Dingelstad, a neurodivergent person who participated in the discussion, had this to say:

Problematic behaviour is what it is: problematic.

There are a lot of neurodivergent people out there that are able to carry themselves in a way that doesn’t make anyone unsafe or harm victims of sexual assault by dismissing or downplaying their lived experience. In my opinion, using neurodivergence as an excuse for this behaviour only worsens the perception of neurodiversity.

Richard Stallman should be held accountable for his speech and his actions.

Another commenter put it more concisely, if not as eloquently:

It’s fucking ableist to say neurodiversity disposes you towards problematic behaviors. It’s disgusting trying to hide behind it and really quite insulting.

I came away from these discussions with the following understanding: neurodivergence, in particular autism, causes people to struggle to understand unstated social norms and conventions, sometimes with embarrassing or harmful consequences, such as with respect to interpersonal relationships. The people I’ve spoken to call for empathy and understanding in the mistakes which can be made in light of this, but also call for accountability – to be shown what’s right (and, importantly, why it’s so), and then to be expected to behave accordingly, no different from anyone else.

Being neurodivergent doesn’t make someone sexist, but it can make it harder for them to hide sexist views. To associate Stallman’s sexism with his perceived neurodivergence is ableist, and to hold Stallman accountable for his behavior is not. One commenter puts it this way:

I’ve said quite a few times is that sexism is not a symptom of autism. Writing this sort of behaviour off as “caused by” neurodivergence is itself ableist, I’m not a huge fan of the narrative that I have “the neurodevelopmental disorder that makes you a bigot”.

I fundamentally disagree with the idea that the pervasive sexism in tech is because of the high incidence of neurodiversity. It’s because tech has broadly operated as a boys club for decades, and those norms persist.

Using neurodivergence as a cover for sexism and problematic behavior in our communities is a toxic, ableist, and, of course, sexist attitude that serves to provide problematic men with space to be problematic. Note also how intersections between neurodiversity and identity play out: white men tend to be excused on the basis of neurodivergence, whereas for women, transgender people, people of color, etc – the excuse does not apply. Consider the differences in how bipolar disorder is perceived in women – “she’s crazy” – versus how men with autism are accommodated – “he can’t help it”.

So, I reject the notion that it is ableist to criticize problematic behavior that can be explained by neurodivergence. But, even if it were, an anonymous autistic commenter has this to say:

If we accept the hypothesis that it is ableist to condemn behavior which can be explained by neurodivergence (and I don’t), my answer is: be ableist. I don’t like it, but it’s ridiculous to imagine any other option in the physical world, and it’s weird to treat the virtual world so differently.

Here’s an anecdote: when I was at school, a new person, Adam, joined the class. We didn’t want Adam to feel excluded, so we included him in our social events. Adam had narcissistic personality disorder, and likely in part because of this, he was also a serial harasser of women. So what did we do about it?

We stopped inviting Adam. I wish we didn’t have to stop inviting him, but our hands were tied. I’m not going to say it’s something only he could change, because maybe he truly couldn’t change that. Maybe it was ableist to exclude him. But the safety of my friends comes first. The hard part is distinguishing between this situation and a situation where someone is excluded when they are perceived as a threat just because they’re different.

Stallman’s rhetoric and behavior are harmful, and we need to address that harm. The refrain of “criticizing Stallman’s behavior is ableist and alienates neurodiverse individuals in our community” is itself ableist and isn’t doing any favors for our neurodiverse friends.

To conclude this article, I thought I’d take this opportunity to find out what our neurodiverse friends are actually struggling with and how we can better accommodate their needs in our community.

First of all, a recognition of individuals as autonomous people with agency and needs of their own has to come first, with neurodiversity as with everything else. Listen to people when they explain their experiences and their needs as individuals, and don’t rely on romanticized and stereotypical understandings of particular neurodevelopmental conditions such as autism. These stereotypes are often deeply harmful: one person spoke of being accused of incompetence and of lying about their neurodivergence in a ploy for sympathy. They experienced severe harassment, at the worst in the form of harassers engineering stressful situations and screenshotting their reactions to humiliate them and damage their reputation.

Standing up for your peers is important, in this as in all things. Not only against harassment, discrimination, and abuse on the basis of neurodivergence, but on any basis, from any person – which I was often reminded is especially important for neurodivergent people who are not cishet white men, as these challenges are amplified in light of these intersectional identities. Talk to people and understand their experiences, their needs, and their worldview. Be patient, but clear and open in your communication. The neurodivergent people I spoke to often found it more difficult to learn social mores than their neurotypical peers do, but nevertheless the vast majority of them felt perfectly capable of it, and the presumption that they aren’t is demeaning and ableist.

I also heard some advice from the neurodivergent community that applies especially to free software community leaders. Clearly stated community norms and expectations, through codes of conduct and visible moderation, are often helpful for neurodivergent people. Many ND people struggle to intuit or “guess” social norms and prefer expectations to be stated unambiguously. Normalizing the use of tone indicators (e.g. “/s”), questions clarifying intent, and conflict de-escalation are also good tools to employ.

Another consideration of merit is accommodations for asynchronous participation in meaningful governance and decision-making processes. Some ND people find it difficult to participate in real-time discussions in chat rooms or in person, and mediums like email and other long-form, slow discussions are easier for them to engage with. Accommodations for sensory sensitivities at in-person events are another good strategy for including more ND folks in your event. Establishing quiet spaces to get away from the busier parts of the event, being considerate of lighting choices, flexible break times, and activities for smaller groups were all highlighted to me by ND people as making their experience more enjoyable.

These are the lessons I took away from speaking to dozens of neurodivergent people in researching this blog post. I encourage you to speak to, and listen to, people in your communities as well, particularly when dealing with an issue which cites their struggles or impacts them directly.

Rust for Linux revisited

Ugh. Drew’s blogging about Rust again.

– You

I promise to be nice.

Two years ago, seeing the Rust-for-Linux project starting to get the ball rolling, I wrote “Does Rust belong in the Linux kernel?”, penning a conclusion consistent with Betteridge’s law of headlines. Two years on we have a lot of experience to draw on to see how Rust-for-Linux is actually playing out, and I’d like to renew my thoughts with some hindsight – and more compassion. If you’re one of the Rust-for-Linux participants burned out or burning out on this project, I want to help. Burnout sucks – I’ve been there.

The people working on Rust-for-Linux are incredibly smart, talented, and passionate developers who have their eyes set on a goal and are tirelessly working towards it – and, as time has shown, with a great deal of patience. Though I’ve developed a mostly-well-earned reputation for being a fierce critic of Rust, I do believe it has its place and I have a lot of respect for the work these folks are doing. These developers are ambitious and motivated to make an impact; Linux is undoubtedly the highest-impact software in the world, and in theory it is enthusiastically ready to accept motivated innovators into its fold to facilitate that impact.

At least in theory. In practice, the Linux community is the wild wild west: sweeping changes are infamously difficult to achieve consensus on, and this is by far the most sweeping change ever proposed for the project. Every subsystem is a private fiefdom with its own unique culture and its own strongly held beliefs and values, subject to the whims of each one of Linux’s 1,700+ maintainers, almost all of whom have a dog in this race. It’s herding cats: introducing Rust effectively is one part coding work and ninety-nine parts political work – and it’s a lot of coding work.

The consequence of these factors is that Rust-for-Linux has become a burnout machine. My heart goes out to the developers who have been burned in this project. It’s not fair. Free software is about putting in the work; it’s a classical do-ocracy… until it isn’t, and people get hurt. In spite of my critiques of the project, I recognize the talent and humanity of everyone involved, and wouldn’t have wished these outcomes on them. I also have sympathy for many of the established Linux developers who didn’t exactly want this on their plate… but that’s neither here nor there for the purpose of this post, and any of those developers and their fiefdoms who went out of their way to make life difficult for the Rust developers above and beyond what was needed to ensure technical excellence are accountable for these shitty outcomes.1

So where do we go now?

Well, let me begin by re-iterating something from my last article on the subject: “I wish [Rust-for-Linux] the best of luck and hope to see them succeed”. Their path is theirs to choose, and though I might advise a moment to rest before diving headfirst into this political maelstrom once again, I support you in your endeavours if this is what you choose to do. Not my business. That said, allow me to humbly propose a different path for your consideration.

Here’s the pitch: a motivated group of talented Rust OS developers could build a Linux-compatible kernel, from scratch, very quickly, with no need to engage in LKML politics. You would be astonished by how quickly you can make meaningful gains in this kind of environment; I think if the amount of effort being put into Rust-for-Linux were applied to a new Linux-compatible OS we could have something production ready for some use-cases within a few years.

Novel OS design is hard: projects like Redox are working on this, but it will take a long time to bear fruit and research operating systems often have to go back to the drawing board and make major revisions over and over again before something useful and robust emerges. This is important work – and near to my heart – but it’s not for everyone. However, making an OS which is based on a proven design like Linux is much easier and can be done very quickly. I worked on my own novel OS design for a couple of years and it’s still stuck in design hell and badly in need of being rethought; on the other hand I wrote a passable Unix clone alone in less than 30 days.

Rust is a great fit for a large monolithic kernel design like Linux. Imagine having the opportunity to implement something like the dcache from scratch in Rust, without engaging with the politics – that’s something a small group of people, perhaps as few as one, could make substantial inroads on in a short period of time while taking full advantage of what Rust has on offer. Working towards compatibility with an existing design can also draw on a much larger talent pool than the very difficult problem of novel OS design: a lot of people can manage, with a copy of the ISA manual and a missive, to implement a single syscall in a Linux-compatible fashion over the weekend. A small and motivated group of contributors could take on the work of, say, building out io_uring compatibility and start finding wins fast – it’s a lot easier than designing io_uring from scratch. I might even jump in and build out a driver or two for fun myself; that sounds like a good opportunity for me to learn Rust properly on a fun project with a well-defined scope.

Attracting labor shouldn’t be too difficult with this project in mind, either. If there were a Rust OS project with a well-defined scope and design (i.e. aiming for Linux ABI compatibility), I’m sure a lot of people would jump in to stake a claim on some piece of the puzzle and put it together, and the folks working on Rust-for-Linux have the benefit of a great deal of experience with the Linux kernel to apply to oversight of the broader design approach. Having a clear, well-proven goal in mind can also help to attract the same people who want to make an impact in a way that a speculative research project might not. Freeing yourselves of the LKML political battles would probably be a big win for the ambitions of bringing Rust into kernel space. Such an effort would also be a great way to mentor a new generation of kernel hackers who are comfortable with Rust in kernel space and ready to deploy their skillset to the research projects that will build a next-generation OS like Redox. The labor pool of serious OS developers badly needs a project like this to make that happen.

So my suggestion for the Rust-for-Linux project is: you’re burned out, and that’s awful; I feel for you. It might be fun and rewarding to spend your recovery busting out a small prototype Unix kernel and fleshing out bits and pieces of the Linux ABI with your friends. I can tell you from my own experience doing something very much like this that it was a very rewarding burnout recovery project for me. And who knows where it could go?

Once again wishing you the best and hoping for your success, wherever the path ahead leads.

What about drivers?

To pre-empt a response I expect to this article: there’s the annoying question of driver support, of course. This was an annoying line of argumentation back when Linux itself had poor driver support, and it will be a nuisance for a hypothetical Linux-compatible Rust kernel as well. The same frustrated arguments I trotted out then are still ready at hand: you choose your use-cases carefully. General-purpose comes later. Building an OS which supports virtual machines, or a datacenter deployment, or a specific mobile device whose vendor is volunteering labor for drivers, and so on, will come first. You choose the hardware that supports the software, not the other way around, or build the drivers you need.

That said, a decent spread of drivers should be pretty easy to implement with the talent base you have at your disposal, so I wouldn’t worry about it.

So you want to compete with or replace open source

We are living through an interesting moment in source-available software.1 The open source movement has always had, and continues to have, a solid grounding in grassroots programmers building tools for themselves and forming communities around them. Some looming giants brought on large sums of money – Linux, Mozilla, Apache, and so on – and other giants made do without, like GNU, but for the most part if anyone thought about open source 15 years ago they were mostly thinking about grassroots communities who built software together for fun. With the rise of GitHub and in particular the explosion of web development as an open platform, commercial stakeholders in software caught on to the compelling economics of open source. The open source boom that followed caused open source software to have an enormous impact on everyone working in the software industry, and, in one way or another, on everyone living on planet Earth.

Over the past decade or so, a lot of businesses, particularly startups, saw these economics unfolding in front of them and wanted to get in on this boom. A lot of talented developers started working on open source software with an explicit aim towards capitalizing on it, founding businesses and securing capital investments to build their product – an open source product. A few years following the onset of these startups, the catch started to become apparent. While open source was proven to be incredibly profitable and profoundly useful for the software industry as a whole, the economics of making open source work for one business are much different.

It comes down to the fact that the free and open source software movements are built on collaboration, and all of our success is attributable to this foundation. The economics that drew commercial interest into the movement work specifically because of this collaboration – because the FOSS model allows businesses to share R&D costs and bring together talent across corporate borders into a great melting pot of innovation. And, yes, there is no small amount of exploitation going on as well; businesses are pleased to take advantage of the work of Jane Doe in Ohio’s FOSS project to make themselves money without sharing any of it back. Nevertheless, the revolutionary economics of FOSS are based on collaboration, and are incompatible with competition.

The simple truth of open source is that if you design your business model with an eye towards competition, in which you are the only entity who can exclusively monetize the software product, you must eschew the collaborative aspects of open source – and thus its greatest strength. Collaboration in open source works because the collaborators, all representatives of different institutions, are incentivized to work together for mutual profit. No one is incentivized to work for you, for free, for your own exclusive profit.

More than a few of these open source startups were understandably put out when this reality started to set in. It turned out that the market capitalization of a business with an open source product was often smaller than the investments it had brought in. Under these conditions it’s difficult to give the investors the one and only thing they demand – a return on investment. The unbounded growth demanded by the tech boom is even less likely to be attainable in open source. There are, to be entirely clear, many business models which are compatible with open source. But there are also many which are not. There are many open source projects which can support a thriving business or even a thriving sub-industry, but there are some ideas which, when placed in an open source framing, simply cannot be capitalized on as effectively, or often at all.

Open source ate a lot of lunches. There are some kinds of software which you just can’t make in a classic Silicon Valley startup fashion anymore. Say you want to write a database server – a sector which has suffered a number of rug-pulls from startups previously committed to open source. If you make it closed source, you can’t easily sell it like you could 10 or 20 years ago, à la MSSQL. This probably won’t work. If you make it open source, no one will pay you for it and you’ll end up moaning about how the major cloud providers are “stealing” your work. The best way to fund the development of something like that is with a coalition of commercial stakeholders co-sponsoring or co-maintaining the project in their respective self-interests, which is how projects like PostgreSQL, Mesa, or the Linux kernel attract substantial paid development resources. But it doesn’t really work as a startup anymore.

Faced with these facts, there have been some challenges to the free and open source model coming up in the past few years, some of which are getting organized and starting to make serious moves. Bruce Perens, one of the founding figures of the Open Source Initiative, is working on the “post-open” project; “Fair Source” is another up-and-coming effort, and there have been and will be others besides.

What these efforts generally have in common is a desire to change the commercial dynamic of source-available software. In other words, the movers and shakers in these movements want to get paid more, or more charitably, want to start a movement in which programmers that work on source-available software as a broader class get paid more. The other trait they have in common is a view that the open source definition and the four freedoms of free software do not sufficiently provide for this goal.

For my part, I don’t think that this will work. I think that the aim of sole or limited rights to monetization and the desire to foster a collaborative environment are irreconcilable. These movements want to have both, and I simply don’t think that’s possible.

This logic is rooted in a deeper notion of ownership over the software, which is both subtle and very important. This is a kind of auteur theory of software. The notion is that the software they build belongs to them. They possess a sense of ownership over the software, which comes with a set of moral and perhaps legal rights to the software, which, importantly, are withheld from any entity other than themselves. The “developers” enjoy this special relationship with the project – the “developers” being the special class of person entitled to this sense of ownership and the class to whom the up-and-coming source-available movements make an appeal, in the sense of “pay the developers” – and third-party entities who work on the source code are merely “contributors”, though they apply the same skills and labor to the project as the “developers” do. The very distinction between “first-party” and “third-party” developers is contingent on this “auteur” worldview.

This is quite different from how most open source projects have found their wins. If Linux can be said to belong to anyone, it belongs to everyone. It is for this reason that it is in everyone’s interests to collaborate on the project. If it belonged to someone or some entity alone, especially if that sense of ownership is rooted in justifying that entity’s sole right to effectively capitalize on the software, the dynamic breaks down and the incentive for the “third-party” class to participate is gone. It doesn’t work.

That said, clearly the proponents of these new source-available movements feel otherwise. And, to be clear, I wish them well. I respect the right of authors of software to distribute it under whatever terms they wish.2 And, for my part, I do believe that source-available is a clear improvement over proprietary software, even though these models fall short of what I perceive as the advantages of open source. However, for these movements to have a shot at success, they need to deeply understand these dynamics and the philosophical and practical underpinnings of the free and open source movements.

However, it is very important to me that we do not muddy the landscape of open source by trying to reform, redefine, or expand our understanding of open source to include movements which contradict this philosophy. My well-wishes are contingent on any movements which aim to compete with open source stopping short of calling themselves open source. This is something I appreciate about the fair source and post-open movements – both movements explicitly disavow the label of open source. If you want to build something new, be clear that it is something new – this is the ground rule.

So you want to compete with open source, or even replace it with something new. Again, I wish you good luck. But this question will be at the heart of your challenge: will you be able to assume the mantle of the auteur and capitalize on this software while still retaining the advantages that made open source successful? Will you be able to appeal to the public in the same way open source does while holding onto these commercial advantages for yourself? Finding a way to answer this question with a “yes” is the task laid before you. It will be difficult; in the end, you will have to give something to the public to get something in return. Simply saying that the software itself is a gift equal to the labor you ask of the public is probably not going to work, especially when this “gift” comes with monetary strings attached.

As for me, I still believe in open source, and even in the commercial potential of open source. It requires creativity and clever business acumen to identify and exploit market opportunities within this collaborative framework. To win in open source you must embrace this collaboration and embrace the fact that you will share the commercial market for the software with other entities. If you’re up to that challenge, then let’s keep beating the open source drum together. If not, these new movements may be a home for you – but know that a lot of hard work still lies ahead of you in that path.

Writing a Unix clone in about a month

I needed a bit of a break from “real work” recently, so I started a new programming project that was low-stakes and purely recreational. On April 21st, I set out to see how much of a Unix-like operating system for x86_64 targets I could put together in about a month. The result is Bunnix. Not including days I didn’t work on Bunnix for one reason or another, I spent 27 days on this project.

You can try it for yourself if you like:

To boot this ISO with qemu:

qemu-system-x86_64 -cdrom bunnix.iso -display sdl -serial stdio

You can also write the iso to a USB stick and boot it on real hardware. It will probably work on most AMD64 machines – I have tested it on a ThinkPad X220 and a Starlabs Starbook Mk IV. Legacy boot and EFI are both supported. There are some limitations to keep in mind, in particular that there is no USB support, so a PS/2 keyboard (or PS/2 emulation via the BIOS) is required. Most laptops rig up the keyboard via PS/2, and YMMV with USB keyboards via PS/2 emulation.

Tip: the DOOM keybindings are weird. WASD to move, right shift to shoot, and space to open doors. Exiting the game doesn’t work so just reboot when you’re done playing. I confess I didn’t spend much time on that port.

What’s there?

The Bunnix kernel is (mostly) written in Hare, plus some C components, namely lwext4 for ext4 filesystem support and libvterm for the kernel video terminal.

The kernel supports the following drivers:

  • PCI (legacy)
  • AHCI block devices
  • GPT and MBR partition tables
  • PS/2 keyboards
  • Platform serial ports
  • CMOS clocks
  • Framebuffers (configured by the bootloaders)
  • ext4 and memfs filesystems

There are numerous supported kernel features as well:

  • A virtual filesystem
  • A /dev populated with block devices, null, zero, and full pseudo-devices, /dev/kbd and /dev/fb0, serial and video TTYs, and the /dev/tty controlling terminal.
  • Reasonably complete terminal emulator and somewhat passable termios support
  • Some 40 syscalls, including clock_gettime, poll, openat and friends, fork, exec, pipe, dup, dup2, ioctl, and so on (a small userspace sketch follows below)
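
To give a rough sense of what that syscall surface supports, here is a small userspace program of the sort Bunnix should be able to run. This is a hypothetical sketch, not something from the Bunnix tree; it assumes the musl-derived libc exposes the usual POSIX wrappers for these syscalls, and I have not tried it on Bunnix itself.

/* Hypothetical demo of the Bunnix syscall surface: clock_gettime, pipe,
 * fork, dup2, openat. Plain POSIX C; untested on Bunnix itself. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void) {
	struct timespec ts;
	clock_gettime(CLOCK_REALTIME, &ts);   /* backed by the CMOS clock driver */
	printf("demo running at %lld s since the epoch\n", (long long)ts.tv_sec);

	int fds[2];
	if (pipe(fds) < 0)
		return 1;

	pid_t pid = fork();
	if (pid == 0) {
		/* Child: make the pipe's read end our stdin, then read from it. */
		close(fds[1]);
		dup2(fds[0], 0);
		char buf[64];
		ssize_t n = read(0, buf, sizeof(buf) - 1);
		if (n > 0) {
			buf[n] = '\0';
			printf("child received: %s", buf);
		}
		_exit(0);
	}

	/* Parent: write a message into the pipe and wait for the child. */
	close(fds[0]);
	const char *msg = "hello from Bunnix userspace\n";
	write(fds[1], msg, strlen(msg));
	close(fds[1]);
	waitpid(pid, NULL, 0);

	/* openat relative to the current directory, as the VFS supports. */
	int fd = openat(AT_FDCWD, "/dev/null", O_WRONLY);
	if (fd >= 0)
		close(fd);
	return 0;
}

Since tcc and sbase ship in the userspace (see the list below), something like this could in principle be compiled on the system itself, though again, I haven’t tried.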

Bunnix is a single-user system and does not currently attempt to enforce Unix file modes and ownership, though it could be made multi-user relatively easily with a few more days of work.

Included are two bootloaders, one for legacy boot which is multiboot-compatible and written in Hare, and another for EFI which is written in C. Both of them load the kernel as an ELF file plus an initramfs, if required. The EFI bootloader includes zlib to decompress the initramfs; multiboot-compatible bootloaders handle this decompression for us.

The userspace is largely assembled from third-party sources. The following third-party software is included:

  • Colossal Cave Adventure (advent)
  • dash (/bin/sh)
  • Doom
  • gzip
  • less (pager)
  • lok (/bin/awk)
  • lolcat
  • mandoc (man pages)
  • sbase (core utils)1
  • tcc (C compiler)
  • Vim 5.7

The libc is derived from musl libc and contains numerous modifications to suit Bunnix’s needs. The curses library is based on netbsd-curses.

The system works but it’s pretty buggy and some parts of it are quite slapdash: your mileage will vary. Be prepared for it to crash!

How Bunnix came together

I started documenting the process on Mastodon on day 3 – check out the Mastodon thread for the full story. Here’s what it looked like on day 3:

Screenshot of an early Bunnix build, which boots up, sets up available memory, and exercises an early in-memory filesystem

Here’s some thoughts after the fact.

Some of Bunnix’s code stems from an earlier project, Helios. This includes portions of the kernel which are responsible for some relatively generic CPU setup (GDT, IDT, etc), and some drivers like AHCI were adapted for the Bunnix system. I admit that it would probably not have been possible to build Bunnix so quickly without prior experience through Helios.

Two of the more challenging aspects were ext4 support and the virtual terminal, for which I brought in two external dependencies, lwext4 and libvterm. Both proved to be challenging integrations. I had to rewrite my filesystem layer a few times, and it’s still buggy today; getting a proper Unix filesystem design (including openat and good handling of inodes) required digging into lwext4 internals a bit more than I’d have liked. I also learned a lot about mixing source languages in a Hare project, since the kernel links together Hare, assembly, and C sources – it works remarkably well but there are some pain points I noticed, particularly with respect to building the ABI integration riggings. It’d be nice to automate conversion of C headers into Hare forward declaration modules. Some of this work already exists in hare-c, but it has a ways to go. If I were to start again, I would probably be more careful in my design of the filesystem layer.

Getting the terminal right was difficult as well. I wasn’t sure that I was going to add one at all, but I eventually decided that I wanted to port vim and that was that. libvterm is a great terminal state machine library, but it’s poorly documented and required a lot of fine-tuning to integrate just right. I also ended up spending a lot of time on performance to make sure that the terminal worked smoothly.

Another difficult part to get right was the scheduler. Helios has a simpler scheduler than Bunnix, and while I initially based the Bunnix scheduler on Helios I had to throw out and rewrite quite a lot of it. Both Helios and Bunnix are single-CPU systems, but unlike Helios, Bunnix allows context switching within the kernel – in fact, even preemptive task switching enters and exits via the kernel. This necessitates multiple kernel stacks and a different approach to task switching. However, the advantages are numerous, one of which is that implementing blocking operations like disk reads and pipe(2) is much simpler with wait queues. With a robust enough scheduler, the rest of the kernel and its drivers come together pretty easily.
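
To make the wait queue point concrete, here is a standalone sketch in C. It is purely illustrative (the Bunnix kernel is written in Hare, and this is not its code): a blocking read on an empty pipe parks the current task on the pipe’s wait queue, and a writer wakes every waiter. In a real kernel, blocking would also switch to another task’s kernel stack and return here once woken; this toy version only tracks task states so that it compiles on its own.

/* Illustrative wait queue sketch; not Bunnix source code. */
#include <stddef.h>
#include <stdio.h>

enum task_state { RUNNABLE, BLOCKED };

struct task {
	const char *name;
	enum task_state state;
	struct task *next;           /* link within a wait queue */
};

struct waitq {
	struct task *head;
};

/* Park a task on a wait queue. A real kernel would then switch away
 * to some other runnable task and come back when this one is woken. */
static void waitq_block(struct waitq *q, struct task *t) {
	t->state = BLOCKED;
	t->next = q->head;
	q->head = t;
}

/* Mark every waiting task runnable again and empty the queue. */
static void waitq_wake_all(struct waitq *q) {
	for (struct task *t = q->head; t != NULL; t = t->next)
		t->state = RUNNABLE;
	q->head = NULL;
}

int main(void) {
	struct waitq pipe_readers = { NULL };
	struct task reader = { "reader", RUNNABLE, NULL };

	/* The reader calls read(2) on an empty pipe: no data yet, so block. */
	waitq_block(&pipe_readers, &reader);
	printf("%s: %s\n", reader.name, reader.state == BLOCKED ? "blocked" : "runnable");

	/* A writer calls write(2): data is now available, wake the readers. */
	waitq_wake_all(&pipe_readers);
	printf("%s: %s\n", reader.name, reader.state == BLOCKED ? "blocked" : "runnable");
	return 0;
}

Presumably blocking disk reads follow the same pattern: the driver queues the request, parks the task on a wait queue, and the interrupt handler wakes it when the transfer completes.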

Another source of frustration was signals, of course. Helios does not attempt to be a Unix and gets away without these, but to build a Unix, I needed to implement signals, big messy hack though they may be. The signal implementation which ended up in Bunnix is pretty bare-bones: I mostly made sure that SIGCHLD worked correctly so that I could port dash.

Porting third-party software was relatively easy thanks to basing my libc on musl libc. I imported large swaths of musl into my own libc and adapted it to run on Bunnix, which gave me a pretty comprehensive and reliable C library pretty fast. With this in place, porting third-party software was a breeze, and most of the software that’s included was built with minimal patching.

What I learned

Bunnix was an interesting project to work on. My other project, Helios, is a microkernel design that’s Not Unix, while Bunnix is a monolithic kernel that is much, much closer to Unix.

One thing I was surprised to learn a lot about is filesystems. Helios, as a microkernel, spreads the filesystem implementation across many drivers running in many separate processes. This works well enough, but one thing I discovered is that it’s quite important to have caching in the filesystem layer, even if only to track living objects. When I revisit Helios, I will have a lot of work to do refactoring (or even rewriting) the filesystem code to this end.

The approach to drivers is also, naturally, much simpler in a monolithic kernel design, though I’m not entirely pleased with all of the stuff I heaped into ring 0. There might be room for an improved Helios scheduler design that incorporates some of the desirable control flow elements from the monolithic design into a microkernel system.

I also finally learned how signals work from top to bottom, and boy is it ugly. I’ve always felt that this was one of the weakest points in the design of Unix and this project did nothing to disabuse me of that notion.

I had also tried to avoid using a bitmap allocator in Helios, and generally memory management in Helios is a bit fussy altogether – one of the biggest pain points with the system right now. However, Bunnix uses a simple bitmap allocator for all conventional pages on the system and I found that it works really, really well and does not have nearly as much overhead as I had feared it would. I will almost certainly take those lessons back to Helios.

Finally, I’m quite sure that putting together Bunnix in just 30 days is a feat which would not have been possible with a microkernel design. At the end of the day, monolithic kernels are just much simpler to implement. The advantages of a microkernel design are compelling, however – perhaps a better answer lies in a hybrid kernel.

What’s next

Bunnix was (note the past tense) a project that I wrote for recreational programming, so its purpose is to be fun to work on. And I’ve had my fun! At this point I don’t feel the need to invest more time and energy into it, though it would definitely benefit from some. In the future I may spend a few days on it here and there, and I would be happy to integrate improvements from the community – send patches to my public inbox. But for the most part it is an art project which is now more-or-less complete.

My next steps in OS development will be a return to Helios with a lot of lessons learned and some major redesigns in the pipeline. But I still think that Bunnix is a fun and interesting OS in its own right, in no small part due to its demonstration of Hare as a great language for kernel hacking. Some of the priorities for improvements include:

  • A directory cache for the filesystem and better caching generally
  • Ironing out ext4 bugs
  • procfs and top
  • mmaping files
  • More signals (e.g. SIGSEGV)
  • Multi-user support
  • NVMe block devices
  • IDE block devices
  • ATAPI and ISO 9660 support
  • Intel HD audio support
  • Network stack
  • Hare toolchain in the base system
  • Self hosting

Whether it’s me or one of you readers who works on these first remains to be seen.

In any case, have fun playing with Bunnix!

Copyleft licenses are not “restrictive”

One may observe an axis, or a “spectrum”, along which free and open source software licenses can be organized, where one end is “permissive” and the other end is “copyleft”. It is important to acknowledge, however, that though copyleft can be found at the opposite end of an axis with respect to permissive, it is not synonymous with the linguistic antonym of permissive – that is, copyleft licenses are not “restrictive” by comparison with permissive licenses.

Aside: Free software is not synonymous with copyleft and open source is not synonymous with permissive, though this is a common misconception. Permissive licenses are generally free software and copyleft licenses are generally open source; the distinction between permissive and copyleft is orthogonal to the distinction between free software and open source.

It is a common misunderstanding to construe copyleft licenses as more “restrictive” or “less free” than permissive licenses. This view is predicated on a shallow understanding of freedom, a sort of passive freedom that presents as the absence of obligations. Copyleft is predicated on a deeper understanding of freedom in which freedom is a positive guarantee of rights.[source]

Let’s consider the matter of freedom, obligation, rights, and restrictions in depth.

Both forms of licenses include obligations, which are not the same thing as restrictions. An example of an obligation can be found in the permissive MIT license:

Permission is hereby granted […] to deal in the Software without restriction […] subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

This obliges the user, when distributing copies of the software, to include the copyright notice. However, it does not restrict the use of the software under any conditions. An example of a restriction comes from the infamous JSON license, which adds the following clause to a stock MIT license:

The Software shall be used for Good, not Evil.

IBM famously petitioned Douglas Crockford for, and received, a license to do evil with JSON.1 This kind of clause is broadly referred to in the free software jargon as “discrimination against field of endeavour”, and such restrictions contravene both the free software and open source definitions. To quote the Open Source Definition, clause 6:

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

No such restrictions are found in free or open source software licenses, be they permissive or copyleft – all FOSS licenses permit the use of the software for any purpose without restriction. You can sell both permissive and copyleft software, use it as part of a commercial cloud service,2 use the software as part of a nuclear weapons program,3 or do whatever else you want with it. There are no restrictions on how free software is used, regardless of if it is permissive or copyleft.

Copyleft does not impose restrictions, but it does impose obligations. The obligations exist to guarantee rights to the users of the software – in other words, to ensure freedoms. In this respect copyleft licenses are more free than permissive licenses.

Freedom is a political concept, and in order to understand this, we must consider it in political terms, which is to say as an exercise in power dynamics. Freedom without obligation is a contradiction. Freedom emerges from obligations, specifically obligations imposed on power.

Where does freedom come from?

Consider the United States as an example, a society which sets forth freedom as a core political value.4 Freedoms in the US are ultimately grounded in the US constitution and its bill of rights. These tools create freedoms by guaranteeing rights to US citizens through the imposition of obligations on the government. For instance, you have a right to an attorney when accused of a crime in the United States, and as such the government is obliged to provide you with one. It is from obligations such as these that freedom emerges. Freedom of assembly, another example, is guaranteed such that the police are prevented from breaking up peaceful protests – this freedom emerges from a constraint (or restriction, if you must) on power (the government) as a means of guaranteeing the rights and freedom of those with less power by comparison (its citizens).

Who holds the power in the context of software?

Consider non-free software by contrast: software is written by corporations and sold on to users with substantial restrictions on its use. Corporations hold more power than individuals: they have more resources (e.g. money), more influence, and, in a sense more fundamental to the software itself, they retain in private the tools to understand the software, or to modify its behavior, and they dictate the conditions under which it may be used (e.g. only if your license key has not expired, or only for certain purposes). This is true of anyone who retains the source code in private and uses copyright law to enforce their will upon the software – in this way they possess, and exercise, power over the user.

Permissive licenses do not provide any checks on this power; generally they preserve moral rights and little else. Permissive licenses provide for relatively few and narrow freedoms, and are not particularly “free” as such. Copyleft licenses constrain these powers through additional obligations, and from these obligations greater freedoms emerge. Specifically, they oblige reciprocity. They are distinguished from permissive licenses in this manner, but where permissive licenses permit, copyleft does not restrict per se – better terms might be “reciprocal” and “non-reciprocal”, but perhaps that ship has sailed. “You may use this software if …” is a statement made both by permissive and copyleft licenses, with different ifs. Neither form of license says “you cannot use this software if …”; licenses which do so are non-free.

Permissive licenses and copyleft licenses are both free software, but only the latter provides a guarantee of rights; both might be free, but only the latter provides freedom.

FDO's conduct enforcement actions regarding Vaxry

freedesktop(.org), aka FDO, recently banned Hyprland maintainer Vaxry from the FDO community, and in response Vaxry has taken his case to the court of public opinion, publishing their email exchanges and writing about it on his blog.

It saddens me to bear witness to these events today. I wrote in September of last year about problems with toxicity in the Hyprland community. I initially reached out to Vaxry to discuss these problems in private in February of last year. I failed to get through to him, leading to that blog post in September. I spent some time in the following weeks talking with Vaxry about his behavior and his community’s social norms, again in private, but again I was unable to get through to him. Unfortunately, we now find ourselves once more leaving the private sphere and discussing Vaxry’s behavior and the problem posed by the Hyprland community.

The fact of the matter is that Hyprland remains a toxic community, enabled and encouraged by its toxic leadership, namely Vaxry. FDO’s decision to ban Vaxry is ultimately a consequence of Vaxry’s behavior, and because he has elected to appeal his case in public, I am compelled to address his behavior in public. I hereby rise firmly in defense of FDO’s decision.

I invite you to start by reading the two email threads, one, and two, which Vaxry has published for your consideration, as well as Vaxry’s follow-ups on his blog, one, and two.

Here’s my read on the situation.

The FDO officer who reached out to Vaxry did so after Vaxry’s problematic behavior was brought to her attention by members of the FDO community, and was acting on her mandate within the FDO conduct enforcement board by investigating complaints submitted to her by this community. It is not a stretch to suggest that a close relationship exists between these communities: FDO is the steward of both the Wayland protocol and implementation and the wlroots library, essential dependencies of Hyprland and sources of collaboration between Hyprland and FDO. Vaxry and other members of the Hyprland community had already participated extensively in these projects (mainly in discussions on IRC and GitLab issues) at the time of the email exchange, in spaces where the code of conduct applies.

The FDO officer duly investigated the complaints she had received and found, in collaboration with the other members of the FDO conduct enforcement team, that they were credible, and worrying. There are numerous examples of behavior from Vaxry that contravene the FDO code of conduct in several different respects, and any number of them would be grounds for an immediate ban. Because these behaviors were concerning but did not take place in the FDO community, the conduct board decided to issue a warning in private, stating that if this sort of behavior were seen in the FDO community it would result in enforcement action from the conduct team.

All of the actions from the FDO conduct team are reasonable and show considerable restraint. Vaxry could have taken it in stride with no consequences to himself. Instead, he immediately escalated the situation. He construes the FDO officer’s polite and well-reasoned warning as threats and intimidation. He minimizes examples of his own hate speech by shrugging them off as a joke. He belittles the FDO officer and builds a straw man wherein her email is an official statement on behalf of Red Hat, and cites a conspiracy theory about DEI programs at Red Hat as justification for calling the FDO officer a hypocrite. He is insulted on my behalf that my name was cited in the FDO officer’s email in lowercase, “drew”, and feels the need to address this.

The FDO officer responds to Vaxry’s unhinged rant with a sarcastic quip clarifying that it was indeed within the FDO conduct team’s remit to ban Vaxry from their GitLab instance – I confess that in my view this was somewhat unprofessional, though I can easily sympathize with the FDO officer given the context. Following this, Vaxry states that Hyprland will cease all communication with FDO’s conduct team and ignore (emphasis his) any future emails from them. Finally, he threatens legal action (on what basis is unclear) and signs the email.

Regardless of how you feel about the conduct team issuing a private warning to Vaxry on the basis of activities outside of FDO community spaces, the email thread that ensues most certainly is within the scope of the FDO code of conduct, and Vaxry’s behavior therein is sufficient justification for a ban from the FDO community as far as I’m concerned. The conduct team cites Vaxry’s stated intention to ignore any future conduct interventions as the ultimate reason for the ban, which I find entirely reasonable on FDO’s part. I have banned people for far less than this, and I stand by it.

Vaxry’s follow-up blog posts only serve to underscore this point. First of all, he immediately opens with a dog-whistle calling for the reader to harass the FDO officer in question: “I don’t condone harassing this person, but here is their full name, employer and contact details”:

I do not condone any hateful messages sent towards any of the parties mentioned.

Recently I have received an email filled with threats to my inbox, from a member of the X.org board, Freedesktop.org, and a Red Hat employee. Their name is [redacted].

Moreover, Vaxry claims to have apologised for his past conduct, which is not true. In lieu of an apology, Vaxry has spent the “1.5 years” since the last incident posting angry rants on his blog calling out minority representation and “social justice warriors” in light of his perceived persecution. Meanwhile the Hyprland community remains a toxic place, welcoming hate, bullying, and harassment, but now prohibiting all “political” speech, which in practice means any discussion of LGBTQ topics, though this is largely unenforced. In the end, the Hyprland community’s fundamental problem is that they’re all “just having fun”, and it seems that they can’t have “fun” unless it’s at someone else’s expense.

The FDO team is right that Hyprland’s community reflects poorly on the Linux desktop community as a whole. Vaxry has created a foothold for hate, transphobia, homophobia, bullying, and harassment in the Linux desktop community. We are right to take action to correct this problem.

Every option other than banning Vaxry has been exhausted over the past year and a half. I personally spent several weeks following my last blog post on the matter discussing Vaxry’s behavior in confidence and helping him understand how to improve, and at my suggestion he joined a private community of positive male role models to discuss these issues in a private and empathetic space. After a few weeks of these private discussions, the last thing he said to me was “I do believe there could be arguments to sway my opinion towards genocide”.1

There’s nothing left to do but to build a fence around Hyprland and protect the rest of the community from them. I know that there’s a lot of good people who use and contribute to Hyprland, and I’m sorry for those of you who are affected by this problem. But, in the end, actions have consequences. The rest of the community has no choice but to sanction Vaxry.

And, to Vaxry – I know you’re reading this – there are going to continue to be consequences for your actions, but it’s still not too late to change. I know it’s humiliating to be called out like this, and I really would rather not have had to do so. FDO is probably not the last time you’re going to be banned if you don’t change course, and it would reflect better on you if you took it on the chin and didn’t post inflammatory rants on your blog – trust me, you don’t look like the good guy here. You are trapped in an echo chamber of hate, anger, and bigotry. I hope that you find a way out, and that someday you can build a community which is as great as your software is.

And, to the FDO officer in question: I’m so sorry that you’re at the ass end of all of this hate and abuse. You don’t deserve any of it. You did a good job, and I’m proud of you and the rest of the FDO conduct team. If you need any support, someone to talk to, don’t hesitate to reach out and ask, on IRC, Matrix, email, whatever. Don’t read the comments.

And on that note, I condemn in the harshest terms the response from communities like /r/linux on the subject. The vile harassment and hate directed at the FDO officer in question is obscene and completely unjustifiable. I don’t care what window manager or desktop environment you use – this kind of behavior is completely uncalled for. I expect better.


P.S. The Hyprland community descended on me before I even published this post, after I called Vaxry out on Mastodon a few hours ago. My notifications are not full of reasonable objections to my complaints; instead, the response is slurs and death threats. This only serves to prove my characterization of the Hyprland community as deeply toxic.

Why Prusa is floundering, and how you can avoid their fate

Prusa is a 3D printer manufacturer which has a long history of being admired by the 3D printing community for high quality, open source printers. They have been struggling as of late, and came under criticism for making the firmware of their Mk4 printer non-free.1

Armin Ronacher uses Prusa as a case-study in why open source companies fail, and uses this example to underline his argument that open source needs to adapt for commercial needs, namely by adding commercial exclusivity clauses to its licenses – Armin is one of the principal proponents of the non-free Functional Source License. Armin cites his experience with a Chinese manufactured 3D printer as evidence that intellectual property is at the heart of Prusa’s decline, and goes on to discuss how this dynamic applies to his own work in developing a non-free license for use with Sentry. I find this work pretty interesting – FSL is a novel entry into the non-free license compendium, and it’s certainly a better way to do software than proprietary models, assuming that it’s not characterized as free or open source. But, allow me to use the same case study to draw different conclusions.

It is clear on the face of it that Prusa’s move to a non-free firmware is unrelated to their struggles with the Chinese competition – their firmware was GPL’d, and the cited competitor (Bambu) evidently respects copyleft, and there’s no evidence that Bambu’s printers incorporate derivatives of Prusa’s firmware in a manner which violates the GPL. Making the license non-free is immaterial to the market dynamics between Prusa and Bambu, so the real explanation must lie elsewhere.

If you had asked me 10 years ago what I expected Prusa’s largest risk would be, I would have simply answered “China” and you would have probably said the same. The Chinese economy and industrial base can outcompete Western manufacturing in almost every market.2 This was always the obvious vulnerability in their business model, and they absolutely needed to be prepared for this situation, or their death was all but certain. Prusa made one of the classic errors in open source business models: they made their product, made it open source, sold it, and assumed that they were done working on their business model.

It was inevitable that someday Chinese manufacturers would undercut Prusa on manufacturing costs. Prusa responded to this certainty by not diversifying their business model whatsoever. There has only ever been one Prusa product: their latest 3D printer model. The Mk4 costs $1,200. You can buy the previous generation (at $1,000), or the MINI (from 2019, $500). You can open your wallet and get their high-end printers, which are neat but fail to address the one thing that most users at this price-point really want, which is more build volume. Or, you can buy an Ender 3 off Amazon right now for $180 and you’ll get better than half of the value of an Mk4 at an 85% discount. You could also buy Creality’s flagship model for a cool $800 and get a product which beats the Mk4 in every respect. China has joined the market, bringing with them all of the competitive advantages their industrial base can bring to bear, and Prusa’s naive strategy is causing their position to fall like a rock.

Someone new to 3D printing will pick up an Ender and will probably be happy with it for 1-2 years. When they upgrade, will they upgrade to a Prusa or an Ender 5? Three to five years spent in someone else’s customer pipeline is an incredibly expensive opportunity cost for Prusa. This opportunity cost is the kind of arithmetic that makes loss leaders like a cheap, low-end, low-or-negative-margin Prusa printer make financial sense. Hell, Prusa should have made a separate product line of white-labeled Chinese entry-level 3D printers just to get people on the Prusa brand.

Prusa left many stones unturned. Bambu’s cloud slicer is a massive lost opportunity for Prusa. On-demand cloud printing services are another lost opportunity. Prusa could have built a marketplace for models & parts and skimmed a margin off of the top, but they waited until 2022 to launch Printables – waiting until the 11th hour when everyone was fed up with Thingiverse. Imagine a Prusa where it works out of the box, you can fire up a slicer in your browser which auto-connects to your printer and prints models from a Prusa-operated model repository, paying $10 for a premium model, $1 off the top goes to Prusa, with the same saved payment details which ensure that a fresh spool of Prusa filament arrives at your front door when it auto-detects that your printer is almost out. The print you want is too big for your build volume? Click here to have it cloud printed – do you want priority shipping for that? Your hot-end is reaching the end of its life – as one of our valued business customers on our premium support contract we would be happy to send you a temporary replacement printer while yours is shipped in for service.

Prusa’s early foothold in the market was strong, and they were wise to execute the way they did early on. But they absolutely had to diversify their lines of business. Prusa left gaping holes in the market and utterly failed to capitalize on any of them. Prusa could have been synonymous with 3D printing if they had invested in the brand (though they probably needed a better name). I should be able to walk into a Best Buy and pick up an entry-level Prusa for $250-$500, or into a Home Depot and pick up a workshop model for $1000-$2000. I should be able to bring it home, unbox it, scan a QR code to register it with PrusaConnect, and have a Benchy printing in less than 10 minutes.

Chinese manufacturers did all of this and more, and they’re winning. They aren’t just cheaper – they offer an outright better product. These are not cheap knock-offs: if you want the best 3D printer today it’s going to be a Chinese one, regardless of how much you want to spend, but, as it happens, you’re going to spend less.

Note that none of this is material to the license of the product, be it free or non-free. It’s about building a brand, developing a customer relationship, and identifying and exploiting market opportunities. Hackers and enthusiasts who found companies like Prusa tend to imagine that the product is everything, but it’s not. Maybe 10% of the work is developing the 3D printer itself – don’t abandon the other 90% of your business. Especially when you make that 10% open: someone else is going to repurpose it, do the other 90%, and eat your lunch. FOSS is great precisely because it makes that 10% into community property and shares the cost of innovation, but you’d be a fool to act as if that was all there was to it. You need to deal with sales and marketing, chase down promising leads, identify and respond to risks, look for and exploit new market opportunities, and much more to be successful.

This is a classic failure mode of open source businesses, and it’s Prusa’s fault. They had an excellent foothold early in the market, leveraging open source and open hardware to great results and working hand-in-hand with enthusiasts early on to develop the essential technology of 3D printing. Then, they figured they were done developing their business model, and completely dropped the ball as a result. Open source is not an “if you build it, the money will come” situation, and to think otherwise is a grave mistake. Businesses need to identify their risks and then mitigate them, and if they don’t do that due diligence, then it’s their fault when it fails – it’s not a problem with FOSS.

Free and open source software is an incredibly powerful tool, including as a commercial opportunity. FOSS really has changed the world! But building a business is still hard, and in addition to its fantastic advantages, the FOSS model poses important and challenging constraints that you need to understand and work with. You have to be creative, and you must do a risk/reward assessment to understand how it applies to your business and how you can utilize it for commercial success. Do the legwork and you can utilize FOSS for a competitive advantage, but skip this step and you will probably fail within a decade.

Richard Stallman's political discourse on sex

Richard Stallman, the founder of the Free Software Foundation, has been subject to numerous allegations of misconduct. He stepped down in 2019, and following his re-instatement in 2021, a famous open letter was published in which numerous organizations and individuals from throughout the Free Software ecosystem called for his removal from the Free Software Foundation. The letter had no effect; Stallman remains a voting member of the FSF’s board of directors to this day and continues to receive numerous speaking engagements.

Content warning: This article discusses sexual abuse, sexual assault, sexual harassment, and all of the above with respect to minors, as well as the systemic normalization of abuse, and directly quotes statements which participate in the normalization of abuse.

This article presents an analysis of Stallman’s political discourse on sex with the aim of establishing the patterns that cause the sort of discomfort that led to Stallman’s public condemnation. In particular, we will address how Stallman speaks about sexual assault, harassment, consent, and minors in his discourse.

I think that it is important to acknowledge this behavior not as a series of isolated incidents, nor as a conflict with Stallman’s “personal style”, but as a pattern of behavior from which a political narrative forms, and to draw attention to the fact that the meager retractions, excuses, and non-apologies from both Stallman and the Free Software Foundation as a whole fail to account for that pattern in a meaningful way.

The failure of the Free Software community to account for Richard Stallman’s behavior has a chilling effect. The norms set by our leadership influence the norms of our broader community, and many members of the Free Software community look to Stallman as an ideological and political leader. The norms Stallman endorses are harmful and deeply confronting and alienating to many people, in particular women and children. Should these norms be adopted by our movement, we risk creating a community which enables the exploitation of vulnerable people.

Let’s begin to address this by considering Stallman’s retraction of his comments in support of pedophilia. The following comment from Stallman in 2013 drew harsh criticism:

There is little evidence to justify the widespread assumption that willing participation in pedophilia hurts children.

stallman.org, 04 January 2013 “Pedophilia”

Following the criticism directed at him, Stallman had a number of “personal conversations” which reframed his views. Of the many comments Stallman has made which drew ire, this is one of the few for which a correction was made, in September 2019:

Many years ago I posted that I could not see anything wrong about sex between an adult and a child, if the child accepted it.

Through personal conversations in recent years, I’ve learned to understand how sex with a child can harm per psychologically. This changed my mind about the matter: I think adults should not do that. I am grateful for the conversations that enabled me to understand why.

stallman.org, 14 September 2019 “Sex between an adult and a child is wrong”

This statement from Stallman has been accepted by his defenders as evidence of his capitulation on pedophilia. I argue that this statement is misleading due to the particular way Stallman uses the word “child”. When Stallman uses this word, he does so with a very specific meaning, which he explains on his website:

Children: Humans up to age 12 or 13 are children. After that, they become adolescents or teenagers. Let’s resist the practice of infantilizing teenagers, by not calling them “children”.

stallman.org, “Anti-glossary”

It seems clear from this definition that Stallman’s comments are not a capitulation at all. His 2019 retraction, when interpreted using his definition of “children”, does not contradict most of Stallman’s past statements regarding sex and minors, including his widely criticized defenses of many people accused of sexual impropriety with minors.

Stallman’s most recent direct response to his criticism underscores this:

It was right for me to talk about the injustice to Minsky, but it was tone-deaf that I didn’t acknowledge as context the injustice that Epstein did to women or the pain that caused.

fsf.org, April 12, 2021, “RMS addresses the free software community”

Stallman qualifies his apology by explicitly re-affirming his defense of Marvin Minsky, which is addressed in detail later in this piece. Stallman’s doubling-down here is consistent with the supposition that Stallman maintains the view that minors can have sexual relationships with adults of any age, provided that they aren’t “children” – in other words, provided they’re at least 13 or 14 years old.

Stallman cares deeply about language and its usage. His strange and deliberate usage of the word “children” is also found many times throughout his political notes over the years. For example:

It sounds horrible: “UN peacekeepers accused of child rape in South Sudan.” But the article makes it pretty clear that the “children” involved were not children. They were teenagers.

stallman.org, 30 April 2018 “UN peacekeepers in South Sudan”

Here Stallman again explicitly distinguishes “teenagers” from children, drawing this distinction especially in the context of sexual relationships between adults and minors. Stallman repeats this pattern many times over the years – we see it again in Stallman’s widely criticized defense of Cody Wilson:

Cody Wilson has been charged with hiring a “child” sex worker. Her age has not been announced, but I think she must surely be a teenager, not a child. Calling teenagers “children” in this context is a way of smearing people with normal sexual proclivities as “perverts”.

stallman.org, 23 September 2018 “Cody Wilson”

And once more when defending Roy Moore:

Senate candidate Roy Moore tried to start dating/sexual relationships with teenagers some decades ago.

He tried to lead Ms Corfman step by step into sex, but he always respected “no” from her and his other dates. Thus, Moore does not deserve the exaggerated condemnation that he is receiving for this. As an example of exaggeration: one mailing referred to these teenagers as “children”, even the one that was 18 years old. Many teenagers are minors, but none of them are children.

The condemnation is surely sparked by the political motive of wanting to defeat Moore in the coming election, but it draws fuel from ageism and the fashion for overprotectiveness of “children”.

stallman.org, 27 November 2017 “Roy Moore’s relationships”

Ms. Corfman was 14 when Roy Moore is accused of initiating sexual contact with her; Moore was 32 at the time. Here we see an example of him reiterating his definition of “children”, a distinction he draws especially to suggest that an adult having sex with a minor is socially acceptable.

Note that Stallman refers to Ms. Corfman as Moore’s “date”. Stallman’s use of this word is important: here he normalizes the possibility that a minor and an adult could engage in a healthy dating relationship. In this statement, Stallman cites an article which explains circumstances which do not resemble such a normalized dating experience: Moore isolated Corfman from her mother, drove her directly to his home, and initiated sexual contact there.

Note also that the use of the phrase “step by step” in this quotation is more commonly referred to as “grooming” in the discourse on child sexual exploitation.

Stallman reaches for similar reasoning in other political notes, such as the following:

A British woman is on trial for going to a park and inviting teenage boys to have sex with her there. Her husband acted as a lookout in case someone else passed by. One teenager allegedly visited her at her house repeatedly to have sex with her.

None of these acts would be wrong in any sense, provided they took precautions against spreading infections. The idea that adolescents (of whatever sex) need to be “protected” from sexual experience they wish to have is prudish ignorantism, and making that experience a crime is perverse.

stallman.org, 26 May 2017, “Prudish ignorantism”

The woman in question, aged 60, had sex with her husband, age 69, in a public space, and invited spectators as young as 11 to participate.

Stallman has also sought to normalize adult attraction to minors, literally describing it as “normal” in September 2018:

Calling teenagers “children” encourages treating teenagers as children, a harmful practice which retards their development into capable adults.

In this case, the effect of that mislabeling is to smear Wilson. It is rare, and considered perverse, for adults to be physically attracted to children. However, it is normal for adults to be physically attracted to adolescents. Since the claims about Wilson is the latter, it is wrong to present it as the former.

stallman.org, 23 September 2018, “Cody Wilson”

One month prior, Stallman made a statement which similarly normalized adult attraction to minors and suggested that acting on this attraction should be acceptable to society, likening opposition to this view to homosexual conversion therapy:

This accords with the view that Stendhal reported in France in the 1800s, that a woman’s most beautiful years were from 16 to 20.

Although this attitude on men’s part is normal, the author still wants to present it as wrong or perverted, and implicitly demands men somehow control their attraction to direct it elsewhere. Which is as absurd, and as potentially oppressive, as claiming that homosexuals should control their attraction and direct it towards to the other sex. Will men be pressured to undergo “age conversion therapy” intended to brainwash them to feel attracted mainly to women of their own age?

stallman.org, 21 August 2018, “Age and attraction”

A trend is thus clearly visible in Stallman’s regular political notes over several years, wherein Stallman reiterates his position that “adolescents” or “teenagers” are distinct from “children” for the purpose of having sex with adults, and normalizes and defends adult attraction to minors and adults who perform sexual acts with minors. We see this distinction between the two groups, children and adolescents, outlined again in his “anti-glossary”, which is still published on his website today, albeit without the connotations of sex. His regular insistence on a definition of “children” which excludes adolescents means that his retraction of his controversial 2013 comment retracts none of the other widely condemned comments he has made since.

Stallman has often written political notes when people accused of sexual impropriety, particularly with minors, appear in the news, or appear among Stallman’s social circle. Stallman’s comments generally downplay the abuse and manipulate language in a manner which benefits perpetrators of abuse. We see this downplaying in another example from 2019:

Should we accept stretching the terms “sexual abuse” and “molestation” to include looking without touching?

I do not accept it.

stallman.org, 11 June 2019 “Stretching meaning of terms”

Stallman is writing here in response to a news article outlining accusations of sexual misconduct directed at Ohio State athletics doctor Richard Strauss. Strauss was accused of groping at least 177 students between 1979 and 1997 during routine physical exams, accusations corroborated by at least 50 members of the athletic department staff.

In addition to Stallman’s regular fixation on the use of the word “children” with respect to sex, this political note also draws our attention to the next of Stallman’s linguistic fixations that I want to question: the use of phrases like “sexual abuse” and “sexual assault”. The term “sexual assault” also appears in Stallman’s “Anti-glossary”:

Sexual assault: The term is applied to a broad range of actions, from rape on one end, to the least physical contact on the other, as well as everything in between. It acts as propaganda for treating them all the same. That would be wrong.

The term is further stretched to include sexual harassment, which does not refer to a single act, but rather to a series of acts that amounts to a form of gender bias. Gender bias is rightly prohibited in certain situations for the sake of equal opportunity, but that is a different issue.

I don’t think that rape should be treated the same as a momentary touch. People we accuse have a right to those distinctions, so I am careful not to use the term “sexual assault” to categorize the actions of any person on any specific occasion.

stallman.org, “Anti-glossary”

Stallman often fixates on the term “sexual assault” throughout his political notes. He feels that the term fails to distinguish between “grave” and “minor” crimes, as he illustrated in 2021:

“Sexual assault” is so vague that it makes no sense as a charge. Because of that term, we can’t tell whether these journalists were accused of a grave crime or a minor one. However, the charge of espionage shows this is political persecution.

stallman.org, 21 July 2021, “Imprisonment of journalists”

I would like to find out what kind of crimes Stallman feels the need to distinguish along this axis. His other political notes give us some hints, such as this one regarding Al Franken’s sexual misconduct scandal:

If it is true that he persistently pressured her to kiss him, on stage and off, if he stuck his tongue into her mouth despite her objections, that could well be sexual harassment. He should have accepted no for an answer the first time she said it. However, calling a kiss “sexual assault” is an exaggeration, an attempt to equate it to much graver acts, that are crimes.

The term “sexual assault” encourages that injustice, and I believe it has been popularized specifically with that intention. That is why I reject that term.

stallman.org, 30 July 2019, “Al Franken”

Stallman also wrote in 2020 to question the use of the phrase again:

In the US, when thugs1 rape people they say are suspects, it is rare to bring them to justice.

I object to describing any one crime as “sexual assault” because that is vague about the severity of the crime. This article often uses that term to refer to many crimes that differ in severity but raise the same issue. That may be a valid practice.

stallman.org, 12 August 2020, “When thugs rape people they say are suspects”

In the article Stallman cites in this political note, various unwelcome sexual acts by the police are described, the least severe of which is probably molestation.

More alarmingly, Stallman addresses his views on the term “sexual assault” in this 2017 note, allowing for the possibility that a 35-year-old man could have had consensual sex with an 11-year-old girl.

Jelani Maraj (who I had never heard of) could be imprisoned for a long time for “sexual assault”. What does that concretely mean?

Due to the vagueness of the term “sexual assault” together with the dishonest law that labels sex with adolescents as “rape” even if they are willing, we cannot tell from this article what sort of acts Maraj was found to have committed. So we can’t begin to judge whether those acts were wrong.

I see at least three possibilities. Perhaps those acts really constituted rape — it is a possibility. Or perhaps the two had sex willingly, but her parents freaked out and demanded prosecution. Or, intermediate between those two, perhaps he pressured her into having sex, or got her drunk.

stallman.org, 13 November 2017, “Jelani Maraj”

Another article by Stallman does not explicitly refer to sexual assault, but does engage in a bizarre defense of a journalist who was fired for masturbating during a video conference. In this article Stallman fixates on questions such as whether the genitals being in view of the webcam was intentional, and suggests that masturbating on a video call would be acceptable should the genitals remain unseen.

The New Yorker’s unpublished note to staff was vague about its grounds for firing Toobin. Indeed, it did not even acknowledge that he had been fired. This is unfair, like convicting someone on unstated charges. Something didn’t meet its “standards of conduct”, but it won’t tell us what — we can only guess. What are the possibilities? Intentionally engaging in video-call sex as a side activity during a work meeting? If he had not made a mistake in keeping that out of view of the coworkers, why would it make a difference what the side activity was?

stallman.org, November 2020, “On the Firing of Jeffrey Toobin”

Finally, Stallman elaborated on his thoughts on the term most recently in October 2023. This note gives the clearest view of Stallman’s preferred distinction between various sexual crimes:

I warned that the stretchable term “sexual assault”, which extends from grave crimes such as rape through significant crimes such as groping and down to no clear lower bound, could be stretched to criminalize minor things, perhaps even stealing a kiss. Now this has happened.

What next? Will a pat on the arm or a hug be criminalized? There is no clear limit to how far this can go, when a group builds up enough outrage to push it.

stallman.org, 15 October 2023, “Sexual assault for stealing a kiss”

From Stallman’s statements on the term “sexual assault”, and on sexual behaviors generally, we can infer that he holds the following beliefs on the subject:

  • Groping and molestation are not sexual assault, but are crimes
  • Kissing someone without consent is not sexual assault, furthermore it is not wrong
  • Masturbating during a video conference is not wrong if you are not seen doing so
  • A 35-year-old man having sex with an 11-year-old girl does not constitute rape, nor sexual assault, but is in fact conscionable

The last of these may be covered under Stallman’s 2019 retraction, even accounting for Stallman’s unconventional use of the word “children”.

Stallman’s fixation on the term “sexual assault” can be understood in his political notes as having the political aims of eroding the meaning of the phrase, questioning the boundaries of consent, downplaying the importance of agency in intimate interactions, appealing for the defense of people accused of sexual assault, and arguing for sexual relationships between minors and adults to be normalized. In one notable case, he has used this political angle to rise to the defense of his friends – in Stallman’s infamous email regarding Marvin Minsky, he writes the following:

The injustice [done to Minsky] is in the word “assaulting”. The term “sexual assault” is so vague and slippery that it facilitates accusation inflation: taking claims that someone did X and leading people to think of it as Y, which is much worse than X.

(…)

The word “assaulting” presumes that he applied force or violence, in some unspecified way, but the article itself says no such thing. Only that they had sex.

We can imagine many scenarios, but the most plausible scenario is that she presented herself to him as entirely willing. Assuming she was being coerced by Epstein, he would have had every reason to tell her to conceal that from most of his associates.

I’ve concluded from various examples of accusation inflation that it is absolutely wrong to use the term “sexual assault” in an accusation.

— Excerpt from Selam G’s account of Stallman’s email to the MIT Computer Science and Artificial Intelligence Laboratory mailing list, September 2019. Selam’s quotation has been corroborated by other sources. Minsky is, in this context, accused of having had a sexual encounter with a minor facilitated by convicted child trafficker Ghislaine Maxwell. The original accusation does not state that this sexual encounter actually occurred; only that the minor in question was instructed to have sex with Minsky. Minsky would have been at least 75 years old at the time of the alleged incident; the minor was 16.

There is an important, but more subtle pattern in Stallman’s statements that I want to draw your attention to here: Stallman appears to have little to no understanding of the role of power dynamics in sexual harassment, assault, and rape. Stallman appears to reject the supposition that these acts could occur without an element of outwardly apparent violent coercion.

This is most obviously evidenced by his statements regarding the sexual abuse of minors; most people understand that minors cannot consent to sex even if they “appear willing”, in particular because an adult in this situation is exploiting a difference in experience and maturity to manipulate the child into sexually satisfying them – in other words, a power differential. Stallman seems to reject this understanding of consent in his various defenses of people accused of sexual impropriety with minors, and in cases where the pretense of consent cannot be easily established, he offers the perpetrator the benefit of the doubt.

We can also find an example of Stallman disregarding power dynamics with respect to adults in the following political note from 2017:

A famous theater director had a habit of pestering women, asking them for sex.

As far as I can tell from this article, he didn’t try to force women into sex.

When women persistently said no, he does not seem to have tried to punish them.

The most he did was ask.

He was a pest, but nothing worse than that.

stallman.org, 29 October 2017, “Pestering women”

In this case we have an example of “quid pro quo”, a kind of sexual harassment which weaponizes power dynamics for sexual gratification. This kind of sexual harassment is explicitly cited as illegal by Title VII of the US Civil Rights Act. The lack of competence Stallman displays in this respect is alarming, given that his position on the Free Software Foundation’s board of directors requires that he act in a manner consistent with this law.

I have identified this blindness to power dynamics as a recurring theme in Stallman’s comments on sexual abuse, be it with respect to sexual relationships between minors and adults, managers and subordinates, students and teachers, or public figures and their audience. I note for the reader that Stallman has held and currently holds several of these positions of power.

In addition to his position as a voting member of the Free Software Foundation’s Board of Directors, Stallman is still invited to speak at events and conferences. Stallman’s infamous rider prescribes a number of his requirements for attending an event; most of his conditions are relatively reasonable, though amusing. In this document, he states his preference for being accommodated in private, on a “spare couch”, when he travels. At these events, in these private homes, he may be afforded many opportunities for privacy with vulnerable people, including minors who, in his view, can consent to having sex with adults.

In summary, Stallman has a well-documented and oft-professed set of political beliefs which reject the social and legal norms regarding consent. He is not simply quietly misled in these beliefs; rather he advocates for these values using his political platform. He has issued no meaningful retractions of these positions or apologies for harm caused, and has continued to pursue a similar agenda since his return to the FSF board of directors.

This creates a toxic environment not only in the Free Software Foundation and in Stallman’s direct purview, but in the broader Free Software movement. The free software movement is culturally poisoned by our support of Stallman as our ideological leader. The open letter calling for Stallman’s removal received 3,000 signatures; the counter-letter in support of Stallman received 6,876 before it stopped accepting submissions.

Richard Stallman founded the Free Software Foundation in 1985, and has performed innumerable works to the benefit of our community since then. We’ve taken Stallman’s views on software freedom seriously, and they’ve led us to great achievements. It is to Stallman’s credit that the Free Software community is larger than one man. However, one’s political qualifications to speak about free software do not make one qualified to address matters of sex; in this respect Stallman’s persistence presents as dangerous incompetence.

When we consider his speech on sex as a discourse that has been crafted and rehearsed methodically over the years, he asks us to consider him seriously, and so we must. When we analyze the dangerous patterns in this discourse, we have to conclude that he is not fit for purpose in his leadership role, and we must acknowledge the shadow that our legitimization of his discourse casts on our community.

Can I be on your podcast?

I am working on rousing the Hare community to get the word out about our work. I have drafted the Hare evangelism guidelines to this effect, which summarizes how we want to see our community bringing Hare to more people.

We’d like to spread the word in a way which is respectful of the attention of others – we’re explicitly eschewing unsolicited prompts for projects to consider writing/rewriting in Hare, as well as any paid sponsorships or advertising. Blog posts about Hare, videos, participating in (organic) online discussions – much better! And one idea we have is to talk about Hare on podcasts which might be interested in the project.

If that describes your podcast, here’s my bold request: can I make an appearance?

Here are some mini “press kits” to give you a hook and some information that might be useful for preparing an interview.

The Hare programming language

Hare is a systems programming language designed to be simple, stable, and robust. Hare uses a static type system, manual memory management, and a minimal runtime. It is well-suited to writing operating systems, system tools, compilers, networking software, and other low-level, high performance tasks.

Hare has been in development since late 2019 and today has about 100 contributors.
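To give a sense of what the language looks like, here is a minimal sketch of a Hare program, roughly the hello-world from the Hare tutorial (the greeting string is my own):

    use fmt;

    // Print a greeting; the trailing ! aborts the program if printing fails.
    export fn main() void = {
            fmt::println("Hello, Hare!")!;
    };

Even this tiny example shows the static typing, explicit error handling (the ! assertion), and small standard library namespaces (fmt) described above.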


Hare’s official mascot, Harriet. Drawn by Louis Taylor, CC-0

The Ares operating system

Ares is an operating system written in Hare which is under development. It features a micro-kernel oriented design and runs on x86_64 and aarch64. Its design is inspired by the seL4 micro-kernel and Plan 9.


A picture of a ThinkPad running Ares and demonstrating some features

Himitsu: a secret storage system

Himitsu is a secure secret storage system for Unix-like systems. It provides an arbitrary key/value store (where values may be secret) and a query language for manipulating the key store.

Himitsu is written in Hare.

Interested?

If any of these topics are relevant for your podcast and you’d like to talk about them, please reach out to me via email: drew@ddevault.org

Thanks!

On "real name" policies

Some free software projects reject anonymous or pseudonymous contributions, requiring you to author patches using your “real name”. Such projects have a so-called “real name” policy; Linux is one well-known example.1

The root motivations behind such policies vary, but in my experience the most often cited rationale is that it’s important to establish the provenance of the contribution for copyright reasons. In the case of Linux, contributors are asked to “sign-off” their commits to indicate their agreement to the terms of the Developer Certificate of Origin (DCO), which includes clauses like the following:

The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file.

To some extent, the DCO serves as a legal assertion of copyright and an agreement to license a work under given copyright terms (GPLv2 in the case of Linux). This record also means that the author of the code is accountable in case the copyright is challenged; in the case of an anonymous or pseudonymous contributor you’re shit out of luck. At that point, liability over the disagreement would likely fall into the hands of the maintainer that accepted the contribution. It is reasonable for a maintainer to ask a contributor to assert their copyright and accept liability over the provenance of their code in a legally meaningful and accountable form.
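For illustration, a contributor typically records this agreement by passing -s (or --signoff) to git commit, which appends a Signed-off-by trailer built from their configured identity. The name, email address, and commit message below are placeholders:

    $ git config user.name "Jane Hacker"        # the "real name" at issue
    $ git config user.email jane@example.org
    $ git commit -s -m "Fix off-by-one in input parser"
    # The resulting commit message ends with the trailer:
    #   Signed-off-by: Jane Hacker <jane@example.org>

It is this trailer, recorded permanently in the project history, that ties the DCO's legal assertion to a specific, nameable person.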

The possibility that someone may have something useful to offer to a free software project, but is not comfortable disclosing their name for any number of reasons, is a reasonable supposition. A maintainer whose “real name” policy is challenged on this basis would also be reasonable in saying “I feel for you, but I cannot agree to accept legal liability over the provenance of this code, nor can I communicate that risk to end-users who acquire code under a license that may or may not be valid as such”.

“Real name” policies are controversial in the free software community. I open with this perspective in an attempt to cool down the room. Those who feel marginalized by “real name” policies often skew young, and many treat matters such as copyright and licensing with disdain. Moreover, the problem tends to inflame deeply hurtful sentiments and raise thorny matters of identity and discrimination, and it’s easy to construe the intent of the policymakers as the intent to cause harm. The motivations behind these policies are reasonable.

That said, intent or otherwise, these policies can cause harm. The profile of the contributor who is comfortable using their “real name” is likely to fall more narrowly into over-represented demographics in our community; enforcing a real-name policy will ostracize some people. Those with marginalized identities tend to be less comfortable with disclosing their “real name”. Someone who has been subject to harassment may not be comfortable with this disclosure, since it offers more fuel to harassers keeping tabs on their activities. The use of a “real name” also confers a gender bias; avoiding a “real name” policy neatly eliminates discrimination on this basis. Of course, there are also many falsehoods programmers believe about names which can present in the implementation of such a policy.

There is also one particular problem which has been at the heart of conflict surrounding the use of “real-name” policies in free software: transgender identities. A transgender person is likely to change their name in the process of assuming their new identity. When this happens, their real name changes. However, it may or may not match their legal name – some trans people opt to change it, others don’t; if they do it is a process that takes time. Meanwhile, addressing a trans person by their old name, or “deadname”, is highly uncomfortable. Doing so deliberately, as a matter of policy or otherwise, is a form of discrimination. Many trans people experience deliberate “deadnaming” as a form of harassment in their daily lives, and institutionalizing this behavior is cruel.

The truth is, managing the names of participants is more challenging than anyone would like. On the one hand, names establish accountability and facilitate collaboration, and importantly, credit the authors of a work for services performed. On the other hand, names are highly personal and deeply affecting, and their usage and changes over time are the subject of important consideration at the discretion of their owner. A complicating factor is that handling names properly introduces technical problems which must be overcome.

To embrace the advantages of “real name” policies – establishing provenance, encouraging accountability, fostering a social environment – without causing harm, the approach I have settled on for my projects is to use the DCO to establish provenance and encourage contributors to sign-off and participate under the identity they feel most comfortable with. I encourage people to utilize an identity they use beyond the project’s walls, to foster a social environment and a connection to the broader community, to establish accountability, and to ensure that participants are reachable for further discussion on their work. If a contributor’s identity changes, we make every effort to support this change in contemporary, future, and historical use.

Going off-script

There is a phenomenon in society which I find quite bizarre. Upon our entry to this mortal coil, we are endowed with self-awareness, agency, and free will. Each of the 8 billion members of this human race represents a unique person, a unique worldview, and a unique agency. Yet, many of us have the same fundamental goals and strive to live the same life.

I think of such a life experience as “following the script”. Society lays down for us a framework for living out our lives. Everyone deviates from the script to some extent, but most people hit the important beats. In Western society, these beats are something like: go to school, go to college, get a degree, build a career, get married, have 1.5 children, retire to Florida, die.

There are a number of reasons that someone may deviate from the script. The most common case is that the deviations are imposed by circumstance. A queer person will face discrimination, for instance, in marriage, or in adopting and raising children. Someone born into the lower class will have reduced access to higher education and their opportunities for career-building are curtailed accordingly; similar experiences follow for people from marginalized groups. Furthermore, more and more people who might otherwise be able to follow the script are finding that they can’t afford a home and don’t have the resources to build a family.

There are nevertheless many people who are afforded the opportunity to follow the script, and when they do so, they often experience something resembling a happy and fulfilling life. Generally this is not the result of a deliberate choice – no one was presented with the script and asked “is this what you want”? Each day simply follows the last and you make the choices that correspond with what you were told a good life looks like, and sometimes a good life follows.

Of course, it is entirely valid to want the “scripted” life. But you were not asked if you wanted it: it was just handed to you on a platter. The average person lacks the philosophical background which underpins their worldview and lifestyle, and consequently cannot explain why it’s “good”, for them or generally. Consider your career. You were told that it was a desirable thing to build for yourself, and you understand how to execute your duties as a member of the working class, but can you explain why those duties are important and why you should spend half of your waking life executing them? Of course, if you are good at following the script, you are rewarded for doing so, generally with money, but not necessarily with self-actualization.

This state of affairs leads to some complex conflicts. This approach to life favors the status quo and preserves existing power structures, which explains in part why it is reinforced by education and broader social pressures. It also leads to a sense of learned helplessness, a sense that this is the only way things can be, which reduces the initiative to pursue social change – for example, by forming a union.

It can also be uncomfortable to encounter someone who does not follow the script, or even questions the script. You may be playing along, and be mostly or entirely exposed to people who play along. Meeting someone who doesn’t – they skipped college, they don’t want kids, they practice polyamory, they identify as a gender other than what you presumed, etc – creates a moment of dissonance and often resistance. This tends to reinforce biases and can even present as inadvertent micro-aggressions.

I think it’s important to question the script, even if you decide that you like it. You should be able to explain why you like it. This process of questioning is a radical act. A radical, in the non-pejorative sense, is born when someone questions their life and worldview, decides that they want something else, and seeks out others who came to similar conclusions. They organize, they examine their discomfort and put it into words, and they share these words in the hope that they can explain a similar discomfort that others might feel within themselves. Radical movements, which by definition are movements that challenge the status quo, are the stories of the birth and spread of radical ideas.

Ask yourself: who are you? Did you choose to be this person? Who do you want to be, and how will you become that person? Should you change your major? Drop out? Quit your job, start a business, found a labor union? Pick up a new hobby? Join or establish a social club? An activist group? Get a less demanding job, move into a smaller apartment, and spend more time writing or making art? However you choose to live, choose it deliberately.

The next step is an exercise in solidarity. How do you feel about others who made their own choices, choices which may be alike or different to your own? Or those whose choices were constrained by their circumstances? What can you do together that you couldn’t do alone?

Who do you want to be? Do you know?

The forbidden topics

There are forbidden topics in the hacker community. One is sternly reprimanded for bringing them up, by one’s peers, one’s leaders, and the community at large. In private, one can expect threats and intimidation; in public, outcry and censorship. The prohibitions are enforced by the moderators of our spaces: offending posts are taken off of forums, purged from chat rooms, and cleaned up from GitHub issues and mailing lists; the ban-hammers fall swiftly and resolutely. My last article to touch these subjects was removed from Hacker News by the moderators within 30 minutes and landed several death threats in my inbox. The forbidden topics, when raised, are met with a resounding, aggressive dismissal and unconditional condemnation.

Some years ago, the hacker community possessed near-unanimous praise for the ideals of free speech; the hacker position was generally that of what we would now understand as “radical” free speech, which is to say the kind of “shout ‘fire’ in a crowded movie theater” radical, but more specifically the kind that tolerates hate speech. The popular refrain went, “I disapprove of what you say, but I will defend to the death your right to say it”. Many hackers hold this as a virtue to this day. I once held this as a virtue for myself.

However, this was a kind of free speech which was unconsciously contingent on being used for speech with which the listener was comfortable. The hacker community at this time was largely homogeneous, and as such most of the speech we were exposed to was of the comfortable sort. As the world evolved around us, and more people found their voice, this homogeneity began to break down. Critics of radical free speech, victims of hate speech, and marginalized people of all kinds began to appear in hacker communities. The things they had to say were not comfortable.

The free speech absolutists among the old guard, faced with this discomfort, developed a tendency to defend hate speech and demean speech that challenged them. They were not the target of the hate, so it did not make them personally uncomfortable, and defending it would maintain the pretense of defending free speech, of stalwartly holding the line on a treasured part of their personal hacker ethic. Speech which challenged their preconceptions and their power structures was not so easily accepted. Faced with such speech, the pretense is dropped, and they lash out in anger, calling for the speakers to be excluded from our communities.

Some of the once-forbidden topics are becoming less so. There are carefully chalked-out spaces where we can talk about them, provided they are not too challenging, such as LGBTQ identities or the struggles of women in our spaces. Such discussions are subject to careful management by our leaders and moderators, to the extent necessary to preserve power structures. Those who speak on these topics are permitted to do so relatively free of retaliation provided that they speak from a perspective of humility, a voice that “knows its place”. Any speech which suggests that the listener may find themselves subject to a non-majority-conforming person in a position of power, or even in the position of a peer, will have crossed the line; one must speak as a victim seeking the pity and grace of one’s superiors to be permitted space to air one’s grievances.

Similarly, space is made for opposition to progressive speech, again moderated only insofar as it is necessary to maintain power structures. Some kinds of overt hate speech may rouse a response from our leaders, but those who employ a more subtle approach are permitted their voice. Thus, both progressive speech and hate speech are permitted within a carefully regulated framework of power preservation.

Some topics, however, remain strictly forbidden.

Our community has persistent and pervasive problems of a particular sort which we are not allowed to talk about: sexual harassment and assault. Men who assault, harass, and even rape women in our spaces are protected. A culture of silence is enforced, and those who call out rape, sexual assault, or harassment, and those who criticise the people who enable and protect these behaviors, are punished, swiftly and aggressively.

Men are terrified of these kinds of allegations. It seems like a life sentence: social ostracization, limited work opportunities, ruined relationships. We may have events in our past that weigh on our conscience; was she too drunk, did she clearly consent, did she regret it in the morning? Some of us have events in our past that we try not to think about, because if we think too hard, we might realize that we crossed the line. This fills men with guilt and uncertainty, but also fear. We know the consequences if our doubts became known.

So we lash out in this fear. We close ranks. We demand the most stringent standards of evidence to prove anything, evidence that we know is not likely to be there. We refuse to believe that our friends were not the men we thought they were, or to confront that we might not be ourselves. We demand due process under the law, we say they should have gone to the police, that they can’t make accusations of such gravity without hard proof. Think of the alleged perpetrator; we can’t ruin their lives over frivolous accusations.

For victims, the only recourse permitted by society is to suffer in silence. Should they speak, victims are subject to similar persecutions: they are ostracized, struggle to work, and lose their relationships. They have to manage the consequences of a traumatic experience with support resources which are absent or inadequate. Their trauma is disbelieved, their speech is punished, and their assailants walk free among us as equals while they are subject to retaliatory harassment or worse.

Victims have no recourse which will satisfy men. Reporting a crime is traumatic, especially one of this nature. I have heard many stories of disbelief from the authorities, disbelief in the face of overwhelming evidence. They were told it was their fault. They were told they should have been in a different place, or wearing something else, or should have simply been a different person. It’s their fault, not the aggressor’s. It’s about what they, the victim, should have done differently, never mind what the perpetrator should have done differently. It’s estimated that less than 1% of rapes end with the rapist in jail1 – the remainder go unreported, unprosecuted or fail after years of traumatic legal proceedings for the victims. The legal system does not provide justice: it exacerbates harm. A hacker will demand this process is completed before they will seek justice, or allow justice to be sought. Until then, we will demand silence, and retaliate if our demands are not met.

The strict standards of evidence required by the justice system are there because of the state monopoly on violence: a guilty verdict in a crime will lead to the imprisonment of the accused. We have no such recourse available in private, accordingly there is no need to hold ourselves to such standards. Our job is not to punish the accused, but rather to keep our communities safe. We can establish the need to take action to whatever standard we believe is sufficient, and by setting these standards as strict as the courts we will fail to resolve over 99% of the situations with which we are faced – a standard which is clearly not sufficient to address the problem. I’m behind you if you want to improve the justice system in this regard, but not if you set this as a blocker to seeking any justice at all. What kind of hacker puts their faith in authority?

I find the state of affairs detestable. The hypocrisy of the free speech absolutist who demands censorship of challenging topics. The fact that the famous hacker curiosity can suddenly dry up if satisfying it would question our biases and preconceptions. The complicity of our moderators in censoring progressive voices in the defense of decorum and the status quo. The duplicitous characterization of “polite” hate speech as acceptable in our communities. Our failure to acknowledge our own shortcomings, our fear of seeing the “other” in a position of power, and the socially enforced ignorance of the “other” that naturally leads to failing to curtail discrimination and harassment in our communities. The ridiculously high standard of evidence we require from victims, who simply ask for our belief at a minimum, before we’ll consider doing anything about their grievance, if we could even be convinced in the first place.

Meanwhile, the problems that these forbidden topics seek to discuss are present in our community. That includes the “polite” problems, such as the conspicuous lack of diversity in our positions of power, which may be discussed and commiserated only until someone suggests doing something about it; and also the impolite problems up to and including the protection of the perpetrators of sexual harassment, sexual assault, and, yes, rape.

Most hackers live under the comfortable belief that it “can’t happen here”, but it can and it does. I attended a hacker event this year – HiP Berlin – where I discovered that some of the organizers had cooperated to make it possible for multiple known rapists to participate, working together to find a way to circumvent the event’s code of conduct – a document that they were tasked with enforcing. One of the victims was in attendance, believing the event to be safe. At every hacker event I have attended in recent memory, I have personally witnessed or heard stories of deeply problematic behavior and protection for its perpetrators from the leadership.

Our community has problems, important problems, that every hacker should care about, and we need the bravery and humility to face them, not the cowardice to retaliate against those who speak up. Talk to, listen to, and believe your peers and their stories. Stand up for what’s right, and speak out when you see something that isn’t. Demand that your leaders and moderators do the right thing. Make a platform where people can safely speak about what our community needs to do right by them, and have the courage to listen to them and confront yourself.

You need to be someone who will do something about it.


Edit: Case in point: this post was quietly removed by Hacker News moderators within 40 minutes of its submission.

Hyprland is a toxic community

Hyprland is an open source Wayland compositor based on wlroots (a project I started back in 2017 to make it easier to build good Wayland compositors). Hyprland is loved by its users for its emphasis on customization and “eye candy” – beautiful graphics and animations, each configuration tailored to the unique look and feel imagined by the user who creates it. It’s a very exciting project!

Unfortunately, the effect is spoilt by an incredibly toxic and hateful community. I cannot recommend Hyprland to anyone who is not prepared to steer well clear of its community spaces. Imagine a high school boys’ locker room come to life on Discord and GitHub and you’ll get an idea of what it’s like.

I became aware of the issues with Hyprland’s community after details of numerous hateful incidents on their Discord came to my attention by way of the grapevine. Most of them stem from the community’s tolerance of hate: community members are allowed to express hateful views with impunity, up to and including astonishing views such as endorsements of eugenics and calls for hate-motivated violence. Such comments are treated as another act in the one big inside joke that is the Hyprland community – the community prefers not to take itself “too seriously”. Hate is moderated only if it is “disruptive” (e.g. presents as spam), but hate presented with a veneer of decorum (or sarcasm) is tolerated, and when challenged, it’s laughed off as a joke.

In one particular incident, the moderators of the Discord server engaged in a harassment campaign against a transgender user, including using their moderator privileges to edit the pronouns in their username from “they/she” to “who/cares”. These roles should be held by trusted community leaders, and it’s from their behavior that the community’s culture and norms stem – they set an example for the community and define what behaviors are acceptable or expected. The problem comes from the top down.

Someone recently pitched a code of conduct – something that this project sorely needs – in a GitHub issue. This thread does not have much overt hate, but it does clearly show how callous and just plain mean the community is, including its leadership (Vaxerski is the original author of Hyprland). Everything is a joke and anyone who wants to be “serious” about anything is mercilessly bullied and made fun of. Quoting this discussion:

I think [a Code of Conduct] is pretty discriminatory towards people that prefer a close, hostile, homogeneous, exclusive, and unhealthy community.

First of all, why would I pledge to uphold any values? Seems like just inconveniencing myself. […] If I’d want to moderate, I’d spend 90% of the time reading kids arguing about bullshit instead of coding.

If you don’t know how to behave without a wall of text explaining how to behave online then you shouldn’t be online.

I am not someone who believes all projects need a code of conduct, if there exists a reasonable standard of conduct in its absence – and that means having a community that does not bully and harass others for expressing differing points of view, let alone for simply having a marginalized identity.

I would have preferred to address these matters in private, so I reached out to Vaxry in February. He responded with a lack of critical awareness over how toxicity presents in his community. However, following my email, he put out a poll for the Discord community to see if the community members experienced harassment in the community – apparently 40% of respondents reported such experiences. Vaxry et al implemented new moderation policies as a result. But these changes did not seem to work: the problems are still present, and the community is still a toxic place that facilitates bullying and hate, including from the community leaders.

Following my email conversation with Vaxry, he appeared on a podcast to discuss toxicity in the Hyprland community. This quote from the interview clearly illustrates the attitude of the leadership:

[A trans person] joined the Discord server and made a big deal out of their pronouns [..] because they put their pronouns in their nickname and made a big deal out of them because people were referring to them as “he” [misgendering them], which, on the Internet, let’s be real, is the default. And so, one of the moderators changed the pronouns in their nickname to “who/cares”. […] Let’s be real, this isn’t like, calling someone the N-word or something.

Later he describes a more moderated community (the /r/unixporn discord server) as having an environment in which everyone is going to “lick your butthole just to be nice”. He compared himself to Terry Davis, the late operating system developer whose struggles with mental illness were broadcast for the world to see, citing a video in which he answers a phone call and refers to the person on the phone by the N-word “ten times” – Vaxry compares this to his approach to answering “stupid questions”.

It really disappoints me to see such an exciting project brought low by a horribly mismanaged community of hate and bullying. Part of what makes open source software great is that it’s great for everyone. It’s unfortunate that someone can discover this cool project, install it and play with it and get excited about it, then join the community to find themselves at the wrong end of this behavior. No one deserves that.

I empathise with Vaxry. I remember being young, smart, productive… and mean. I did some cool stuff, but I deeply regret the way I treated people. It wasn’t really my fault – I was a product of my environment – but it was my responsibility. Today, I’m proud to have built many welcoming communities, where people are rewarded for their involvement, rather than coming away from their experience hurt. What motivates us to build and give away free software if not bringing joy to ourselves and others? Can we be proud of a community which brings more suffering into the world?

My advice to the leadership begins with taking a serious look in the mirror. This project needs a “come to Jesus” moment. Ask yourself what kind of community you can be proud of – can you be proud of a community that people walk away from feeling dejected and hurt? Yours is not a community that brings people joy. What are you going to do about it?

A good start will be to consider the code of conduct proposal seriously, but a change of attitude is also required. My inbox is open to any of the leaders in this project (or any other project facing similar problems) if you want to talk. I’m happy to chat with you in good faith and help you understand what’s needed and why it’s important.

To members of the Hyprland community, I want each of you to personally step up to make the community better. If you see hate and bullying, don’t stay silent. This is a community which proclaims to value radical free speech: test it by using your speech to argue against hate. Participate in the community as you think it should be, not as it necessarily is, and change will follow. If you are sensitive to hate, or a member of a marginalized group, however, I would just advise steering clear of Hyprland until the community improves.

If the leadership fails to account for these problems, it will be up to the community to take their activity elsewhere. You could set up adjacent communities which are less toxic, or fork the software, or simply choose to use something else.

To the victims of harassment, I offer my sincere condolences. I know how hard it is to be the subject of this kind of bullying. You don’t deserve to be treated like this. There are many places in the free software community where you are welcome and celebrated – Hyprland is not the norm. If you need support, I’m always available to listen to your struggles.

To everyone else: please share this post throughout the Hyprland community and adjacent communities. This is a serious problem and it’s not going to change unless it’s clearly brought to light. The Hyprland maintainers need to be made aware that the broader open source community does not appreciate this kind of behavior.

I sincerely hope that this project improves its community. A serious attitude shift is needed from the top-down, and I hope for the sake of Vaxry, the other leaders, and the community as a whole, that such change comes sooner rather than later. When Vaxry is older and wiser, I want him to look back on the project and community that he’s built with pride and joy, not with regret and shame.


Vaxry has published a response to this post.

I was also privately provided some of the ensuing discussion from the Hyprland Discord. Consider that this lacks context and apply your grain of salt accordingly.

Screenshot of a Discord channel with the initial reaction to this post. A user called “slave labor” responds with “no way”, “the computer reddit woke up”

Screenshot of a Discord channel with Vaxry’s initial reaction to this post. “Really, right as I wanted to take a day off because of health reasons I have to reply to this?”. Another user responds “wow this is quite… shallow”, “almost as if it recycles very limited context to get more clicks”

I apologise to Vaxry for interrupting their rest, and wish them a speedy recovery.

Screenshot of a Discord channel. Some notable quotes include “LGBTQ is fucking trash anyways” (someone else responds “fuck off” to this) and “for reclaiming polymc from the leftoids”. The discussion as a whole lacks any semblance of professionalism.

Here is a plain text log which includes some additional discussion.

AI crap

There is a machine learning bubble, but the technology is here to stay. Once the bubble pops, the world will be changed by machine learning. But it will probably be crappier, not better.

Contrary to the AI doomers’ expectations, the world isn’t going to go down in flames any faster thanks to AI. Contemporary advances in machine learning aren’t really getting us any closer to AGI, and as Randall Munroe pointed out back in 2018:

A panel from the webcomic 'xkcd' showing a timeline from now into the distant future, dividing the timeline into the periods between 'AI becomes advanced enough to control unstoppable swarms of robots' and 'AI becomes self-aware and rebels against human control'. The period from self-awareness to the indefinite future is labelled 'the part lots of people seem to worry about'; Randall is instead worried about the part between these two epochs.

What will happen to AI is boring old capitalism. Its staying power will come in the form of replacing competent, expensive humans with crappy, cheap robots. LLMs are a pretty good advance over Markov chains, and stable diffusion can generate images which are only somewhat uncanny with sufficient manipulation of the prompt. Mediocre programmers will use GitHub Copilot to write trivial code and boilerplate for them (trivial code is tautologically uninteresting), and ML will probably remain useful for writing cover letters for you. Self-driving cars might show up Any Day Now™, which is going to be great for sci-fi enthusiasts and technocrats, but much worse in every respect than, say, building more trains.

The biggest lasting changes from machine learning will be more like the following:

  • A reduction in the labor force for skilled creative work
  • The complete elimination of humans in customer-support roles
  • More convincing spam and phishing content, more scalable scams
  • SEO hacking content farms dominating search results
  • Book farms (both eBooks and paper) flooding the market
  • AI-generated content overwhelming social media
  • Widespread propaganda and astroturfing, both in politics and advertising

AI companies will continue to generate waste and CO2 emissions at a huge scale as they aggressively scrape all internet content they can find, externalizing costs onto the world’s digital infrastructure, and feed their hoard into GPU farms to generate their models. They might keep humans in the loop to help with tagging content, seeking out the cheapest markets with the weakest labor laws to build human sweatshops to feed the AI data monster.

You will never trust another product review. You will never speak to a human being at your ISP again. Vapid, pithy media will fill the digital world around you. Technology built for engagement farms – those AI-edited videos with the grating machine voice you’ve seen on your feeds lately – will be white-labeled and used to push products and ideologies at massive scale and minimal cost, from social media accounts which are populated with AI content, cultivate an audience, and are then sold in bulk, in good standing with the Algorithm.

All of these things are already happening and will continue to get worse. The future of media is a soulless, vapid regurgitation of all media that came before the AI epoch, and the fate of all new creative media is to be subsumed into the roiling pile of math.

This will be incredibly profitable for the AI barons, and to secure their investment they are deploying an immense, expensive, world-wide propaganda campaign. To the public, the present-day and potential future capabilities of the technology are played up in breathless promises of ridiculous possibility. In closed-room meetings, much more realistic promises are made of cutting payroll budgets in half.

The propaganda also leans into the mystical sci-fi AI canon: the threat of smart computers with world-ending power, the forbidden allure of a new Manhattan Project and all of its consequences, the long-prophesied singularity. The technology is nowhere near this level, a fact well-known by experts and the barons themselves, but the illusion is maintained in the interests of lobbying lawmakers to help the barons erect a moat around their new industry.

Of course, AI does present a threat of violence, but as Randall points out, it’s not from the AI itself, but rather from the people that employ it. The US military is testing out AI-controlled drones, which aren’t going to be self-aware but will scale up human errors (or human malice) until innocent people are killed. AI tools are already being used to set bail and parole conditions – it can put you in jail or keep you there. Police are using AI for facial recognition and “predictive policing”. Of course, all of these models end up discriminating against minorities, depriving them of liberty and often getting them killed.

AI is defined by aggressive capitalism. The hype bubble has been engineered by investors and capitalists dumping money into it, and the returns they expect on that investment are going to come out of your pocket. The singularity is not coming, but the most realistic promises of AI are going to make the world worse. The AI revolution is here, and I don’t really like it.

Flame bait

I had a much more inflammatory article drafted for this topic under the title "ChatGPT is the new techno-atheist's substitute for God". It makes some fairly pointed comparisons between the cryptocurrency cult and the machine learning cult and the religious, unshakeable, and largely ignorant faith in both technologies as the harbingers of progress. It was fun to write, but this is probably the better article. I found this Hacker News comment and quoted it in the original draft: "It's probably worth talking to GPT4 before seeking professional help [to deal with depression]." In case you need to hear it (TW: suicide): do not seek out OpenAI's services to help with your depression. Finding and setting up an appointment with a therapist can be difficult for a lot of people – it's okay for it to feel hard. Talk to your friends and ask them to help you find the right care for your needs.

Hello from Ares!

I am pleased to be writing today’s blog post from a laptop running Ares OS. I am writing into an ed(1) session, on a file on an ext4 filesystem on its hard drive. That’s pretty cool! It seems that a lot of interesting stuff has happened since I gave that talk on Helios at FOSDEM in February.

A picture of my ThinkPad while I was editing this blog post

The talk I gave at FOSDEM was no doubt impressive, but it was a bit of a party trick. The system was running on a Raspberry Pi with one process which included both the slide deck as a series of raster images baked into the ELF file, as well as the GPU driver and drawing code necessary to display them, all in one package. This was quite necessary, as it turns out, given that the very idea of “processes” was absent from the system at this stage.

Much has changed since that talk. The system I am writing to you from has support for processes indeed, complete with fork and exec and auxiliary vectors and threads and so on. If I run “ps” I get the following output:

mercury % ps
1 /sbin/usrinit dexec /sbin/drv/ext4 block0 childfs 0 fs 0
2 /etc/driver.d/00-pcibus
3 /etc/pci.d/class/01/06/ahci
4 /etc/driver.d/00-ps2kb
5 /etc/driver.d/99-serial
6 /etc/driver.d/99-vgacons
7 /sbin/drv/ext4 block0
15 ed blog.md
16 ps

Each of these processes is running in userspace, and some of them are drivers. A number of drivers now exist for the system, including among the ones you see here a general-purpose PCI driver, AHCI (SATA), PS/2 keyboard, PC serial, and a VGA console, not to mention the ext4 driver, based on lwext4 (the first driver not written in Hare, actually). Not shown here are additional drivers for the CMOS real-time clock (so Ares knows what time it is, thanks to Stacy Harper), a virtio9pfs driver (thanks also to Tom Leb for the initial work here), and a few more besides.

As of this week, a small number of software ports exist. The ext4 driver is based on lwext4, as I said earlier, which might be considered a port, though it is designed to be portable. The rc shell I have been working on lately has also been ported, albeit with many features disabled, to Mercury. And, of course, I did say I was writing this blog post with ed(1) – I have ported Michael Forney’s ed implementation from sbase, with relatively few features disabled as a matter of fact (the “!” command and signals were removed).

This ed port, and lwext4, are based on our C library, designed with drivers and normal userspace programs in mind, and derived largely from musl libc. This is coming along rather well – a few features (signals again come to mind) are not going to be implemented, but it’s been relatively straightforward to get a large amount of the POSIX/C11 API surface area covered on Ares, and I was pleasantly surprised at how easy it was to port ed(1).

There’s still quite a lot to be done. In the near term, I expect to see the following:

  • A virtual filesystem
  • Pipes and more shell features enabled, such as redirects
  • More filesystem support (mkdir et al)
  • A framebuffer console
  • EFI support on x86_64
  • MBR and GPT partitions

This is more of the basics. As these basics unblock other tasks, a few of the more ambitious projects we might look forward to include:

  • Networking support (at least ICMP)
  • Audio support
  • ACPI support
  • Basic USB support
  • A service manager (not systemd…)
  • An installer, perhaps a package manager
  • Self-hosting builds
  • Dare I say Wayland?

I should also probably do something about that whining fan I’m hearing in the background right now. Of course, I will also have to do a fresh DOOM port once the framebuffer situation is improved. There’s also still plenty of kernel work to be done and odds and ends all over the project, but it’s in pretty good shape and I’m having a blast working on it. I think that by now I have answered the original question, “can an operating system be written in Hare”, with a resounding “yes”. Now I’m just having fun with it. Stay tuned!

Now I just have to shut this laptop off. There’s no poweroff command yet, so I suppose I’ll just hold down the power button until it stops making noise.

The rc shell and its excellent handling of whitespace

This blog post is a response to Mark Dominus’ “The shell and its crappy handling of whitespace”.

I’ve been working on a shell for Unix-like systems called rc, which draws heavily from the Plan 9 shell of the same name. When I saw Mark’s post about the perils of whitespace in POSIX shells (or derived shells, like bash), I thought it prudent to see if any of the problems he outlines are present in the shell I’m working on myself. Good news: they aren’t!

Let’s go over each of his examples. First he provides the following example:

for i in *.jpg; do
	cp $i /tmp
done

This breaks if there are spaces in the filenames. Not so with rc:

% cat test.rc
for (i in *.jpg) {
	cp $i subdir
}
% ls
a.jpg   b.jpg  'bite me.jpg'   c.jpg   subdir   test.rc
% rc ./test.rc 
% ls subdir/
a.jpg   b.jpg  'bite me.jpg'   c.jpg

He gives a similar example for a script that renames jpeg to jpg:

for i in *.jpeg; do
  mv $i $(suf $i).jpg
done

This breaks for similar reasons, but works fine in rc:

% cat test.rc  
fn suf(fname) {
	echo $fname | sed -e 's/\..*//'
}

for (i in *.jpeg) {
	mv $i `{suf $i}.jpg
}
% ls 
a.jpeg   b.jpeg  'bite me.jpeg'   c.jpeg   test.rc
% rc ./test.rc  
% ls 
a.jpg   b.jpg  'bite me.jpg'   c.jpg   test.rc

There are other shells, such as fish or zsh, which also have answers to these problems that don’t call for the generous quoting other shells often require. rc is much simpler than these shells. At the moment it clocks in at just over 3,000 lines of code, compared to fish at ~45,000 and zsh at ~144,000. Admittedly, it’s not done yet, but I would be surprised to see it grow beyond 5,000 lines for version 1.0.

The key to rc’s design success in this area is the introduction of a second primitive. The Bourne shell and its derivatives traditionally work with only one primitive: strings. But command lines are made of lists of strings, and so a language which embodies the primitives of the command line ought to also be able to represent those as a first-class feature. In traditional shells a list of strings is denoted inline with the use of spaces within those strings, which raises obvious problems when the members themselves contain spaces; see Mark’s post detailing the errors which ensue. rc adds lists of strings as a formal primitive alongside strings.

% args=(ls --color /) 
% echo $args(1) 
ls
% echo $args(2) 
--color
% echo $#args 
3
% $args 
bin   dev  home  lost+found  mnt  proc  run   srv      swap  tmp  var
boot  etc  lib   media       opt  root  sbin  storage  sys   usr
% args=("foo bar" baz) 
% touch $args 
% ls 
 baz  'foo bar'

Much better, right? One simple change eliminates the need for quoting virtually everywhere. Strings can contain spaces and nothing melts down.
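
For contrast, here is a rough sketch (my own, not from Mark’s post) of the closest POSIX sh equivalent, where the only list-like primitive is the positional parameter array and every expansion has to be quoted to keep “foo bar” in one piece:

# POSIX sh: forgetting the quotes silently splits "foo bar" in two.
set -- "foo bar" baz
touch "$@"    # correct: two arguments, 'foo bar' and 'baz'
touch $@      # wrong: word splitting yields 'foo', 'bar', and 'baz'

One unquoted expansion is all it takes for the list to fall apart.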

Let me run down the remaining examples from Mark’s post and demonstrate their non-importance in rc. First, regarding $*, it just does what you expect.

% cat yell.rc
#!/bin/rc
shift
echo I am about to run $* now!!!
exec $*
% ls *.jpg
'bite me.jpg'
% ./yell.rc ls *.jpg
I am about to run ls bite me.jpg now!!!
'bite me.jpg'

Note also that there is no need to quote the arguments to “echo” here. Also note the use of shift; $* includes $0 in rc.

Finally, let’s rewrite Mark’s “lastdl” program in rc and show how it works fine in rc’s interactive mode.

#!/bin/rc
cd $HOME/downloads
echo $HOME/downloads/`{ls -t | head -n1}

Its use at the command line works just fine without quotes.

% file `{lastdl} 
/home/sircmpwn/downloads/test image.jpg: JPEG image data, JFIF standard 1.01,
aspect ratio, density 1x1, segment length 16, baseline, precision 8,
5000x2813, components 3

Just for fun, here’s another version of this rc script that renames files with spaces to without, like the last example in Mark’s post:

#!/bin/rc
cd $HOME/downloads
last=`{ls -t | head -n1}
if (~ $last '* *') {
	newname=`{echo $last | tr ' \t' '_'}
	mv $last $HOME/downloads/$newname
	last=$newname
}
echo $HOME/downloads/$last

The only quotes to be found are those which escape the wildcard match testing for a space in the string.2 Not bad, right? Like Plan 9’s rc, my shell imagines a new set of primitives for shells, then starts from the ground up and builds a shell which works better in most respects while still being very simple. Most of the problems that have long plagued us with respect to sh, bash, etc, are solved in a simple package with rc, alongside a nice interactive mode reminiscent of the best features of fish.

rc is a somewhat complete shell today, but there is a bit more work to be done before it’s ready for 1.0, most pressingly with respect to signal handling and job control, along with a bit of polish and some easier features to implement (such as subshells, IFS, etc). Some features which are likely to be omitted, at least for 1.0, include logical comparisons and arithmetic expansion (for which /bin/test and /bin/dc are recommended respectively). Of course, rc is destined to become the primary shell of the Ares operating system project that I’ve been working on, but I have designed it to work on Unix as well.

Check it out!

Alpine Linux does not make the news

My Linux distribution of choice for several years has been Alpine Linux. It’s a small, efficient distribution which ships a number of tools I appreciate for their simplicity, such as musl libc. It has a very nice package manager, apk, which is fast and maintainable. The development community is professional and focuses on diligent maintenance of the distribution and little else. Over the years I have used it, very little of note has happened.

I run Alpine in every context; on my workstation and my laptops but also on production servers, on bare-metal and in virtual machines, on my RISC-V and ARM development boards, at times on my phones, and in many other contexts besides. It has been a boring experience. The system is simply reliable, and the upgrades go over without issue every other quarter,1 accompanied by high-quality release notes. I’m pleased to maintain several dozen packages in the repositories, and the community is organized such that it is easy for someone like me to jump in and do the work required to maintain it for my use-cases.

Red Hat has been in the news lately for their moves to monetize the distribution, moves that I won’t comment on but which have generally raised no small number of eyebrows, written several headlines, and caused intense flamewars throughout the internet. I don’t run RHEL or CentOS anywhere, in production or otherwise, so I just looked curiously on as all of this took place without calling for any particular action on my part. Generally speaking, Alpine does not make the news.

And so it has been for years, as various controversies come about and die off, be it with Red Hat, Ubuntu, Debian, or anything else, I simply keep running “apk upgrade” every now and then and life goes on uninterrupted. I have high-quality, up-to-date software on a stable system and suffer from no fuss whatsoever.
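
For the curious, a release upgrade amounts to something like the following (the release numbers are only an example, and doas could just as well be sudo on your system):

# Point apk at the next stable release, then bring everything up to date.
doas sed -i 's,v3.18,v3.19,g' /etc/apk/repositories
doas apk update
doas apk upgrade --available    # sync every package to the new branch
doas reboot

That really is the whole ceremony, which is rather the point of this post.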

The Alpine community is a grassroots set of stakeholders who diligently concern themselves with the business of maintaining a good Linux distribution. There is little in the way of centralized governance;2 for the most part the distribution is just quietly maintained by the people who use it for the purpose of ensuring its applicability to their use-cases.

So, Alpine does not make the news. There are no commercial entities which are trying to monetize it, at least no more than the loosely organized coalition of commercial entities like SourceHut that depend on Alpine and do their part to keep it in good working order, alongside various users who have no commercial purpose for the system. The community is largely in unanimous agreement about the fundamental purpose of Alpine and the work of the community is focused on maintaining the project such that this purpose is upheld.

This is a good trait for a Linux distribution to have.

Seriously, don't sign a CLA

SourceGraph is making their product closed source, abandoning the Apache 2.0 license it was originally distributed under, so once again we convene in the ritual condemnation we offer to commercial products that piss in the pool of open source. Invoking Bryan Cantrill once more:

Bryan Cantrill on OpenSolaris — YouTube

A contributor license agreement, or CLA, usually (but not always) includes an important clause: a copyright assignment. These agreements are provided by upstream maintainers to contributors to open source software projects, and they demand a signature before the contributor’s work is incorporated into the upstream project. The copyright assignment clause that is usually included serves to offer the upstream maintainers more rights over the contributor’s work than the contributor was offered by upstream, generally in the form of ownership or effective ownership over the contributor’s copyright and the right to license it in any manner they choose in the future, including proprietary distributions.

This is a strategy employed by commercial companies with one purpose only: to place a rug under the project, so that they can pull it at the first sign of a bad quarter. This strategy exists to subvert the open source social contract. These companies wish to enjoy the market appeal of open source and the free labor of their community to improve their product, but do not want to secure these contributors any rights over their work.

This is particularly pathetic in cases like that of SourceGraph, which used a permissive Apache 2.0 license. Such licenses already allow their software to be incorporated into non-free commercial works – such is the defining nature of a permissive license – with relatively few obligations: in this case, a simple attribution will suffice. SourceGraph could have been made non-free without a CLA at all if this one obligation was met. Apparently the owners of SourceGraph find even the simple task of crediting their contributors too onerous. This is disgusting.

SourceGraph once approached SourceHut asking about building an integration between our platforms. They wanted us to do most of the work, which is a bit tacky but reasonable under the reciprocal social contract of open source. We did not prioritize it and I’m glad that we didn’t: our work would have been made non-free.

Make no mistake: a CLA is a promise that an open source software project will one day become non-free. Don’t sign them.

What are my rights as a contributor?

Unless you sign them away by agreeing to a CLA, you retain all of the rights associated with your work.

By default, you own the copyright over your contribution and the contribution is licensed under the same software license the original project uses, thus, your contribution is offered to the upstream project on the same terms that their contribution was offered to you. The copyright for such projects is held collectively by all contributors.

You also always have the right to fork an open source project and distribute your improvements on your own terms, without signing a CLA – the only power upstream holds is authority over the “canonical” distribution. If the rug is pulled from under you, you may also continue to use, and improve, versions of the software from prior to the change in license.

How do I prevent this from happening to my project?

A CLA is a promise that software will one day become non-free; you can also promise the opposite. Leave copyright in the collective hands of all contributors and use a copyleft license.

Without the written consent of all contributors, or performing their labor yourself by re-writing their contributions, you cannot change the license of a project. Skipping the CLA leaves their rights intact.

In the case of a permissive software license, a new license (including a proprietary license) can be applied to the project and it can be redistributed under those terms. In this way, all future changes can be released under the new license. The effect is similar to a new, proprietary project taking a permissively licensed project and incorporating all of its code into itself before making further changes.

You can prevent this as well with a copyleft license: such a license requires the original maintainers to distribute future changes to the work under a free software license. Unless they can get all copyright holders – all of the contributors – to agree to a change in license, they are obligated to distribute their improvements on the same terms.

Thus, the absence of a CLA combined with the use of a copyleft license serves as a strong promise about the future of the project.

Learn more at writefreesoftware.org.

What should I do as a business instead of a CLA?

It is not ethical to demand copyright assignment in addition to the free labor of the open source community. However, there are some less questionable aspects of a contributor license agreement which you may uphold without any ethical qualms, notably to establish provenance.

Many CLAs include clauses which establish the provenance of the contribution and transfer liability to the contributor, such that the contributor agrees that their contribution is either their own work or they are authorized to use the copyright (for example, with permission from their employer). This is a reasonable thing to ask for from contributors, and manages your exposure to legal risks.

The best way to ask for this is to require contributions to be “signed-off” with the Developer Certificate of Origin.
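
In practice this just means asking contributors to add a Signed-off-by trailer to each commit, which git will do for them; for example (the commit message here is made up):

git commit --signoff -m "ext4: fix block allocation off-by-one"
# appends a trailer like the following to the commit message,
# using the contributor's user.name and user.email:
#
#     Signed-off-by: Jane Hacker <jane@example.org>

The trailer records that the contributor asserts the terms of the DCO for that change, and it costs them nothing beyond a flag.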



Social media and "parasocial media"

A few months ago, as Elon Musk took over Twitter and instituted policies that alienated many people, some of these people fled towards federated, free software platforms like Mastodon. Many people found a new home here, but there is a certain class of refugee who has not found it to their liking.

I got to chatting with one such “refugee” on Mastodon today. NotJustBikes is a creator I enjoy watching on YouTube (well, Invidious), who makes excellent content on urbanism and the design of cities. He’s based in my home town of Amsterdam and his videos do a great job of explaining many of the things I love about this place for general audiences. He’s working on building an audience, expanding his reach, and bringing his message to as many people as possible in the interest of bringing better infrastructure to everyone.

But he’s not satisfied with his move from Twitter to Mastodon, nor are some of his friends among the community of “urbanist” content creators. He yearns for an “algorithm” to efficiently distribute content to his followers, and Mastodon is not providing this for him.

On traditional “social media” platforms, in particular YouTube, the interactions are often not especially social. The platforms facilitate a kind of intellectual consumption moreso than conversation: conversations flow in one direction, from creator to audience, where the creator produces and the audience consumes. I think a better term for these platforms is “parasocial media”: they are optimized for creating parasocial relationships moreso than social relationships.

The fediverse is largely optimized for people having conversations with each other, and not for producing and consuming “content”. Within this framework, a “content creator” is a person only in the same sense that a corporation is, and their conversations are unidirectional, where the other end is also not a person, but an audience. That’s not the model that the fediverse is designed around.

It’s entirely reasonable to want to build an audience and publish content in a parasocial manner, but that’s not what the fediverse is for. And I think that’s a good thing! There are a lot of advantages in having spaces which focus on being genuinely “social”, rather than facilitating more parasocial interactions and helping creators build an audience. This limits the fediverse’s reach, but I think that’s just fine.

Within this model, the fediverse’s model, it’s possible to publish things, and consume things. But you cannot effectively optimize for building the largest possible audience. You will generally be more successful if you focus on the content itself, and not its reach, and on the people you connect with at a smaller scale. Whether or not this is right for you depends on your goals.

I hope you enjoyed this content! Remember to like and subscribe.

Burnout and the quiet failures of the hacker community

This has been a very challenging year for me. You probably read that I suffered from burnout earlier in the year. In some respects, things have improved, and in many other respects, I am still haunted.

You might not care to read this, and so be it, take your leave if you must. But writing is healing for me. Maybe this is a moment for solidarity, sympathy, for reflecting on your own communities. Maybe it’s a vain and needlessly public demonstration of my slow descent into madness. I don’t know, but here we go.

Yesterday was my 30th birthday. 🎂 It was another difficult day for me. I drafted a long blog post with all of the details of the events leading up to my burnout. You will never read it; I wrote it for myself and it will only be seen by a few confidants, in private, and my therapist. But I do want to give you a small idea of what I’ve been going through, and some of the take-aways that matter for you and the hacker community as a whole.

Here’s a quote from yesterday’s unpublished blog post:

Trigger warnings: child abuse, rape, sexual harassment, suicide, pedophilia, torture.

You won’t read the full story, and trust me, you’re better off for that. Suffice to say that my life has been consumed with trauma and strife all year. I have sought healing, and time for myself, time to process things, and each time a new crisis has landed on my doorstep, most of them worse than the last. A dozen things went wrong this year, horribly wrong, one after another. I have enjoyed no peace in 2023.

Many of the difficulties I have faced this year have been beyond the scope of the hacker community, but several have implicated it in challenging and confronting ways.

The hacker community has been the home I never had, but I’m not really feeling at home here right now. A hacker community that was precious to me failed someone I love and put my friends in danger. Rape and death had come to our community, and it was kept silent. But I am a principled person, and I stand for what is right; I spoke the truth and it brought me and my loved ones agonizing stress and trauma and shook our community to the core. Board members resigned. Marriages are on the rocks. When the dust settled, I was initially uncomfortable staying in this community, but things eventually started to get better. Until another member of this community, someone I trusted and thought of as a friend, confessed to me that he had raped multiple women a few years ago. I submitted my resignation from this community last night.

Then I went to GPN, a hacker event in Germany, at the start of June. It was a welcome relief from the stress I’ve faced this year, a chance to celebrate hacker culture and a warm reminder of the beauty of our community. It was wonderful. Then, on the last night, a friend took me aside and confided in me that they are a pedophile, and told me it was okay because they respected the age of consent in Germany – which is 14. What began as a wonderful reminder of what the hacker community can be became a PTSD episode and a reminder that rape culture is fucking everywhere.

I don’t want to be a part of this anymore. Our communities have tolerated casual sexism and misogyny and transphobia and racism and actual fucking rapists, and stamped down on women and queer people and brown people in our spaces with a smile on our face and a fucked-up facsimile of tolerance and inclusion as a cornerstone of the hacker ethic.

This destroys communities. It is destroying our communities. If there’s one thing I came to understand this year, it’s that these problems are pervasive and silent.

Here’s what you need to do: believe the victims. Stand up for what’s right. Have the courage to remove harmful people from your environment, especially if you’re a man and have a voice. Make people feel welcome, and seen. Don’t tolerate casual sexism in the hacker community or anywhere else. Don’t tolerate transphobia or homophobia. Don’t tolerate racists. If you see something, say something. And for fuck’s sake, don’t bitch about that code of conduct that someone wants to add to your community.1

I’m going to withdraw a bit from the in-person hacker community for the indefinite future. I don’t think I can manage it for a while. I have felt good about working on my software and collaborating with my free software communities online, albeit at a much-reduced capacity. I’m going to keep working, and writing, insofar as I find satisfaction in it. Life goes on.

Be there for the people you love, and love more people, and be there for them, too.

Reforming the free software message

Several weeks ago, I wrote The Free Software Foundation is dying, wherein I enumerated a number of problems with the Free Software Foundation. Some of my criticisms focused on the message: fsf.org and gnu.org together suffer from no small degree of incomprehensibility and inaccessibility which makes it difficult for new participants to learn about the movement and apply it in practice to their own projects.

This is something which is relatively easily fixed! I have a background in writing documentation and a thorough understanding of free software philosophy and practice. Enter writefreesoftware.org: a comprehensive introduction to free software philosophy and implementation.

The goals of this resource are:

  • Provide an accessible introduction to the most important principles of free software
  • Offer practical advice on choosing free software licenses from a free software perspective (compare to the OSS perspective at choosealicense.com).
  • Publish articles covering various aspects of free software in practice, such as how it can be applied to video games

More:

  • No association with any particular free software project or organization
  • No policy of non-cooperation with the open source movement

Compare writefreesoftware.org with the similar resources provided by GNU (1, 2) and you should get the general idea.

The website is itself free software, CC-BY-SA 4.0. You can check out the source code here and suggest any improvements or articles for the mailing list. Get involved! This resource is not going to solve all of the FSF’s problems, but it is an easy way to start putting the effort in to move the free software movement forward. I hope you like it!

Throwing in the towel on mobile Linux

I have been tinkering with mobile Linux – a phrase I will use here to describe any Linux distribution other than Android running on a mobile device – as my daily driver since about 2019, when I first picked up the PinePhone. For about 3 years I have run mobile Linux as my daily driver on my phone, and as of a few weeks ago, I’ve thrown in the towel and switched to Android.

The distribution I ran for the most time is postmarketOS, which I was mostly quite happy with, running at times sxmo and Phosh. I switched to UBports a couple of months ago. I have tried a variety of hardware platforms to support these efforts, namely:

  • Pinephone (pmOS)
  • Pinephone Pro (pmOS)
  • Xiaomi Poco F1 (pmOS)
  • Fairphone 4 (UBports)

I have returned to LineageOS as my daily driver and closed the book on mobile Linux for the time being. What put the final nails in the coffin was what I have been calling out as my main concern throughout my experience: reliability, particularly of the telephony components.

Use-case                   Importance   postmarketOS   UBports   LineageOS
Basic system reliability   5            2              4         5
Mobile telephony           5            3              3         5
Hotspot                    4            5              3         5
2FA                        4            4              1         5
Web browsing               4            5              2         4
Mobile banking             4            1              1         5
Bluetooth audio            3            4              2         4
Music player               3            4              1         3
Reading email              3            1              3         4
Navigation aid             3            2              1         5
Camera                     3            3              3         5
Password manager           3            5              1         1
sysadmin                   3            5              2         3
More on these use-cases and my experiences

Mobile banking: only available through a proprietary vendor-provided Android app. Tried to get it working on Waydroid; did not work on pmOS and almost worked on UBports, but Waydroid is very unreliable. Kind of shit but I don’t have any choice because my bank requires it for 2FA.

Web browsing: I can just run Firefox upstream on postmarketOS. Amazing! UBports cannot do this, and the available web browsers are not nearly as pleasant to use. I run Fennec on Android and it’s fine.

Music player: the music player on UBports is extremely unreliable.

Reading email: This is not entirely pmOS’s fault; I could have used my main client, aerc, which is a testament to pmOS’s general utility, but it is a TUI that is uncomfortable to use on a touchscreen-only device.

Password manager: pmOS gets 5/5 because I could use the password manager I wrote myself, himitsu, out of the box. Non-critical use-case because I could just type passwords in manually on the rare occasion I need to use one.

sysadmin: stuff like being able to SSH into my production boxes from anywhere to troubleshoot stuff.

Among these use-cases, there is one that absolutely cannot be budged on: mobile telephony. My phone is a critical communication device and I need to be able to depend on calls and SMS at all times, therefore the first two rows need to score 4 or 5 before the platform is suitable for my use. I remember struggling with postmarketOS while I was sick with a terrible throat infection – and I could not call my doctor. Not cool.

I really like these projects and I love the work that’s going into them. postmarketOS in particular: being able to run the same environment I run everywhere else, Alpine Linux, on my phone, is fucking amazing. The experience is impressively complete in many respects, all kinds of things, including things I didn’t expect to work well, work great. In the mobile Linux space I think it’s the most compelling option right now.

But pmOS really suffers from reliability issues – both on edge and on stable it seemed like every update broke some things and fixed others, so only a subset of these cool features was working well at any given moment. The breakage would often be minor nuisances, such as the media controls on my bluetooth headphones breaking in one update and being fixed in the next, or major showstoppers such as broken phone calls, SMS, or, in one case, all of my icons disappearing from the UI (with no fallback in most cases, leaving me navigating the UI blind).

So I tried UBports instead, and despite the general lack of good auxiliary features compared to pmOS, the core telephony was more reliable – for a while. But once issues started to appear, particularly around SMS, I could not tolerate it for long in view of the general uselessness of the OS for anything else. I finally gave it up and installed LineageOS.

Mobile Linux is very cool and the community has made tremendous, unprecedented progress towards realizing its potential, and the forward momentum is still strong. I’m excited to see it continue to improve. But I think that before anyone can be expected to use this as a daily driver, the community really needs to batten down the hatches and focus on one thing and one thing only: always, always being usable as a phone. I’ll be back once more reliability is in place.

How to go to war with your employer

There is a power differential between you and your employer, but that doesn’t mean you can’t improve your working conditions. Today I’d like to offer a little bit of advice on how to frame your relationship with your employer in terms which empower you and afford you more agency. I’m going to talk about the typical working conditions of the average white-collar job in a neo-liberal political environment where you are mostly happy to begin with and financially stable enough to take risks, and I’m specifically going to talk about individual action or the actions of small groups rather than large-scale collective action (e.g. unions).

I wish to subvert the expectation here that employees are subordinate to their employers. A healthy employment relationship between an employee and employer is that of two entities who agree to work together on equal terms to strive towards mutual goals, which in the simplest form is that you both make money and in the subtleties also suggests that you should be happy doing it. The sense of “going to war” here should rouse in you an awareness of the resources at your disposal, a willingness to use them to forward your interests, and an acknowledgement of the fact that tactics, strategy, propaganda, and subterfuge are among the tools you can use – and the tools your employer uses to forward their own interests.

You may suppose that you need your employer more than they need you, but with some basic accounting we can get a better view of the veracity of this supposition. Consider at the most fundamental that your employer is a for-profit entity that spends money to make money, and they spend money on you: as a rule of thumb, they expect a return of at least your salary ×1.5 (accounting for overhead, benefits et al) for their investment in you, otherwise it does not make financial sense for them to employ you.

If you have finer-grained insights into your company’s financial situation, you can get a closer view of your worth to them by dividing their annual profit by their headcount, adjusted to your discretion to account for the difference in the profitability of your role compared to your colleagues. It’s also wise to run this math in your head to see how the returns from your employment are affected by conditions in the hiring market, layoffs, etc – having fewer employees increases the company’s return per employee, and a busier hiring market reduces your leverage. In any case, it should be relatively easy for you to justify, in the cold arithmetic of finance that businesses speak, that employees matter to the employer, and the degree to which solidarity between workers is a meaningful force amplifier for your leverage.
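
A back-of-the-envelope sketch, with entirely made-up numbers, looks something like this:

# Hypothetical figures for illustration only; substitute your own.
salary=80000                        # your gross salary
break_even=$((salary * 3 / 2))      # the ~1.5x rule of thumb from above
profit=20000000                     # the company's annual profit
headcount=200
echo "Break-even return on employing you: ~\$$break_even/year"
echo "Average profit per employee: ~\$$((profit / headcount))/year"

The exact numbers matter less than the exercise of working them out: it reframes the relationship in the same terms the business uses.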

In addition to your fundamental value, there are some weak points in the corporate structure that you should be aware of. There are some big levers that you may already be aware of that I have already placed outside of the scope of this blog post, such as the use of collective bargaining, unionization, strikes, and so on, where you need to maximize your collective leverage to really put the screws to your employer. Many neo-liberal workplaces lack the class consciousness necessary to access these levers, and on the day-to-day scale it may be strategically wise to smarten up your colleagues on social economics in preparation for use of these levers. I want to talk about goals on the smaller scale, though. Suppose your goals are, for instance:

  • Keep agile/scrum at the other end of a six-foot pole, and/or replace it with another system
  • Define your own goals and work on the problems you think are important, at your own discretion rather than your manager’s
  • Skip meetings you know are wasting your time
  • Set working hours that suit you or take time off on your terms
  • Work from home or in-office in an arrangement that meets your own wants/needs
  • Exercise agency over your tools, such as installing the software you want to use on your work laptop

You might also have more intimidating goals you want to address:

  • Demand a raise or renegotiate your benefits
  • Negotiate a 4-day workweek
  • Replace your manager or move teams
  • Remove a problematic colleague from your working environment

All of these goals are within your power to achieve, and perhaps more easily than you expect.

First of all, you already have more agency than you know. Your job description and assigned tasks tell a narrow story of your role at the business: your real job is ultimately to make money for the business. If you install Linux on your work laptop because it allows you to work more efficiently, then you are doing your job better and making more money for the business; they have no right to object to this, and you have a very defensible position for exercising agency in this respect. Likewise, if you adapt the workflows around agile (or whatever) to better suit your needs rather than falling in line with the prescription: if it makes you more productive and happy, it makes the business more money. Remember your real job – to make money – and you can adjust the parameters of your working environment relatively freely provided that you are still aligned with this goal.

Often you can simply exercise agency in cases like these, but in other cases you may have to reach for your tools. Say you don’t just want to maintain a personal professional distance from agile, but want to replace it entirely: now you need to talk to your colleagues. You can go straight to management and start making your case, but another option – probably the more effective one – is to start with your immediate colleagues. Your team also possesses a collective agency, and if you agree together, without anyone’s permission, to work according to your own terms, then so long as you’re all doing your jobs – making money – no one is going to protest. This is more effective than following the chain of command and asking them to take risks they don’t understand. Be aware of the importance of optics here: you need not only to make money, but to be seen making money. How you are seen to be doing this may depend on how far up the chain you need to justify yourself to; if your boss doesn’t like it then make sure your boss’s boss does.

Ranked in descending order of leverage within the business: your team, your boss, you.

More individual-oriented goals, such as negotiating a different working schedule or skipping meetings, call for different tools. Simple cases, such as coming in at ten and leaving at four every day, are a straightforward exercise of agency; so long as you’re making the company money, no one is going to raise a fuss. If you want, for instance, a four-day work week, or to work from home more often, you may have to justify yourself to someone. In such cases you may be less likely to have your team’s solidarity at your disposal, but if you’re seen to be doing your job – making money – then a simple argument that it makes you better at that job will often suffice.

You can also be clever. “Hey, I’ll be working from home on Friday” works better than “can I work from home on Friday?” If you want to work from home every Friday, however, then you can think strategically: it may be wise to keep mum about your final goal, start by taking some Fridays at home to establish that you’re still productive and fulfilling the prime directive1 under those terms, and allow yourself to “accidentally” slip into a new normal of working from home every Friday, without asking until it’s apparent that the answer will be yes. Don’t be above a little bit of subversion and deception; your employer is using those tools against you too.

Then there are the big guns: human resources. HR is the enemy; their job is to protect the company from you. They can, however, be useful if you understand the risks they’re trying to manage and press the right buttons with them. If your manager is a dick, HR may be the tool to use to fix this, but you need to approach it the right way. HR does not give two fucks that you don’t like your manager; if your manager is making money, then they are doing their job. What HR does give a fuck about is managing the company’s exposure to lawsuits.

They can also make your life miserable. If HR does not like you then you are going to suffer, so when you talk to them it is important to know your enemy and to make strategic use of them without letting them realize you know the game. They present themselves as your ally; let them think you believe it. At the same time, there is a coded language you can use that will get them to act in your interest. HR will perk up as soon as they smell “unsafe working conditions”, “sexual harassment”, “collective action”, and so on – the risks they were hired to manage – on the horizon. The best way to interact with HR is to let them conclude that you are on a path which ends with these problems landing on their desk, without making them think you are a subversive element within the organization. And if you are prepared to make your knowledge of and willingness to use these tools explicit, any communication which suggests as much should be delivered to HR with your lawyer’s signature, and only when you have a new job offer lined up as a fallback. HR should either view you as mostly harmless or look upon you with fear, but nothing in between.

These are your first steps towards class consciousness as a white-collar employee. Know your worth, know the leverage you have, and be prepared to use the tools at your disposal to bring about the outcomes you desire, and know your employer will be doing the same. Good luck out there, and don’t forget to actually write some code or whatever when you’re not busy planning a corporate coup.

Burnout

It kind of crept up on me. One day, sitting at my workstation, I stopped typing, stared blankly at the screen for a few seconds, and a switch flipped in my head.

On the night of New Year’s Eve, my backpack was stolen from me on the train from Berlin to Amsterdam, and with it about $2000 worth of equipment, clothes, and so on. A portent for the year that was to come. I generally keep my private and public lives carefully separated, but perhaps I will offer you a peek behind the curtain today.

It seems like every week or two this year, another crisis presented itself, each manageable in isolation. Some were independent events, others snowballed as the same problems escalated. Gossip at the hackerspace, my personal life put on display and mocked. A difficult break-up in February, followed by a close friend facing their own relationship’s hurtful end. Another close friend – old, grave problems, once forgotten, remembered, and found to still be causing harm. Yet another friend, struggling to deal with depression and emotional abuse at the hands of their partner. Another friendship still: lost, perhaps someday to be found again.

Dependable Drew, an ear to listen, a shoulder to cry on, always knowing the right words to say, ready to help and proud to be there for his friends. Friends who, amidst these crises, are struggling to be there for him.

These events, set over the background of a world on fire.

One of the more difficult crises in my purview reached its crescendo one week ago, culminating in death. A selfish end for a selfish person, a person who had hurt people I love; a final, cruel cut to the wounds we were trying to heal.

I took time for myself throughout these endless weeks, looked after myself as best I could, and allowed my productivity to wane as necessary, unburdened by guilt in so doing. I marched on when I had the energy to, and made many achievements I’m proud of.

Something changed this week. I have often remarked that when you’re staring down a hard problem, one which might take years or even decades to finish, you have two choices: give up or get to work. The years are going to pass either way. I am used to finding myself at the base of a mountain, picking up my shovel, and getting started. Equipped with this mindset, I have patiently ground down more than one mountain in my time. But this week, for the first time in my life, as I gazed upon that mountain, I felt intimidated.

I’m not sure what the purpose of this blog post is. Perhaps I’m sharing an experience that others might be able to relate to. Perhaps it’s healing in some way. Maybe it’s just indulgent.

I’m going to take the time I need to rest. I enjoy the company of wonderful colleagues at SourceHut, who have been happy to pick up some of the slack. I have established a formal group of maintainers for Hare and given them my blessing to work without seeking my approval. My projects will remain healthy as I take a leave. See you soon.

Who should lead us?

Consider these two people, each captured in the midst of delivering a technical talk.

A picture of a young trans woman in a red dress A picture of a middle-aged white man in a red shirt

Based on appearances alone, what do you think of them?

The person on the left is a woman. She’s also pretty young; one might infer something about her level of experience accordingly. I imagine that she has led a much different life than I have, and may have a much different perspective, worldview, identity, and politics than I do. Does she complain about sexism and discrimination in her work? Is she a feminist? Does she lean left or right on the political spectrum?

The person on the right looks like most of the hackers I’ve met. You’ve met someone who looks like this a thousand times. He is a man, white and middle-aged – that suggests a fair bit of experience. He probably doesn’t experience or concern himself with race or gender discrimination in the course of his work. He just focuses on the software. His life experiences probably map relatively well onto my own, and we may share a similar worldview and identity.

Making these assumptions is a part of human nature – it’s a useful shortcut in many situations. But they are assumptions based only on appearances. What are the facts?

The person on the right is Scott Guthrie, Vice President of Cloud and AI at Microsoft, giving a talk about Azure’s cloud services. He lives in an $11M house in Hunts Point, Washington. On the left is Alyssa Rosenzweig, main developer for the free software Panfrost GPU drivers and a trans woman, talking about how she reverse engineers proprietary graphics hardware.

You and I have a lot more in common with Alyssa than with Scott. The phone I have in my pocket right now would not work without her drivers. Alyssa humbles me with her exceptional talent and dedication, and the free software community is indebted to her. If you use ARM devices with free software, you owe something to Alyssa. As recently as February, her Wikipedia page was vandalized by someone who edited “she” and “her” to “he” and “him”.

Appearances should not especially matter when weighing the merit of someone considered for a leadership role in our community, be it as a maintainer, thought leader, member of our foundations’ boards, etc. I am myself a white man, and I think I perform well in my leadership roles throughout the free software ecosystem. But my appearance causes no controversy: someone with the approximate demographic shape of myself or Guthrie would cause no susurration when taking the stage.

It’s those like Alyssa, who aside from anything else is eminently qualified and well-deserving of her leadership role, who are often the target of ire and discrimination in the community. This is an experience shared by many people whose gender expression, skin color, or other traits differ from the “norm”. They’ve been telling us so for years.

Is it any wonder that our community is predominantly made up of white cisgendered men when anyone else is ostracized? It’s not because we’re predisposed to be better at this kind of work. It’s patently absurd to suppose that hackers whose identities and life experience differ from yours or mine cannot be good participants in and leaders of our movement. In actual fact, diverse teams produce better results. While the labor pool is disproportionately filled with white men, we can find many talented hackers who cannot be described as such. If we choose to be inspired by them, and led by them, we will discover new perspectives on our software, and on our movement and its broader place in the world. They can help us create a safe and inviting space for other talented hackers who identify with them. We will be more effective at our mission of bringing free software to everyone with their help.

Moreover, there are a lot of damned good hackers who don’t look like me, and I would be happy to follow their lead regardless of any other considerations.

The free software ecosystem (and the world at large) is not under threat from some woke agenda – a conspiracy theory which has been fabricated out of whole cloth. The people you fear are just people, much like you and I, and they only want to be treated as such. Asking them to shut up and get in line, to suppress their identity, experiences, and politics, to avoid confronting you with uncomfortable questions about your biases and privileges by way of their existence alone – it’s not right.

Forget the politics and focus on the software? It’s simply not possible. Free software is politics. Treating other people with respect, maturity, and professionalism, and valuing their contributions at any level, including leadership, regardless of their appearance or identity – that’s just part of being a good person. That is apolitical.


Alyssa gave her blessing regarding the use of her image and her example in this post. Thanks!

rc: a new shell for Unix

rc is a Unix shell I’ve been working on over the past couple of weeks, though it’s been in the design stages for a while longer than that. It’s not done or ready for general use yet, but it is interesting, so let’s talk about it.

As the name (which is subject to change) implies, rc is inspired by the Plan 9 rc shell. It’s not an implementation of Plan 9 rc, however: it departs in many notable ways. I’ll assume most readers are more familiar with POSIX shell or Bash and skip many of the direct comparisons to Plan 9. Also, though most of the features work as described, the shell is a work-in-progress and some of the design I’m going over today has not been implemented yet.

Let’s start with the basics. Simple usage works much as you’d expect:

name=ddevault
echo Hello $name

But there’s already something important that might catch your eye here: the lack of quotes around $name. One substantial improvement rc makes over POSIX shells and Bash right off the bat is fixing our global shell quoting nightmare. There’s no need to quote variables!

# POSIX shell
x="hello world"
printf '%s\n' $x
# hello
# world

# rc
x="hello world"
printf '%s\n' $x
# hello world

Of course, the POSIX behavior is actually useful sometimes. rc provides for this by acknowledging that shells have not just one fundamental type (strings), but two: strings and lists of strings, i.e. argument vectors.

x=(one two three)
echo $x(1)  # prints first item ("one")
echo $x     # expands to arguments (echo "one" "two" "three")
echo $#x    # length operator: prints 3

x="echo hello world"
$x
# echo hello world: command not found

x=(echo hello world)
$x
# hello world

# expands to a string, list values separated with space:
$"x
# echo hello world: command not found

You can also slice up lists and get a subset of items:

x=(one two three four five)
echo $x(-4) # one two three four
echo $x(2-) # two three four five
echo $x(2-4) # two three four

A departure from Plan 9 rc is that the list operators can be used with strings for string operations as well:

x="hello world"
echo $#x     # 11
echo $x(2)   # e
echo $x(1-5) # hello

rc also supports loops. The simple case is iterating over the command line arguments:

% cat test.rc 
for (arg) {
	echo $arg
}
% rc test.rc one two three 
one
two
three

{ } is a command like any other; this can be simplified to for (arg) echo $arg. You can also enumerate any list with in:

list=(one two three)
for (item in $list) {
	echo $item
}

We also have while loops and if:

while (true) {
	if (test $x -eq 10) {
		echo ten
	} else {
		echo $x
	}
}

Functions are defined like so:

fn greet {
	echo Hello $1
}

greet ddevault

Again, any command can be used, so this can be simplified to fn greet echo $1. You can also add named parameters:

fn greet(user time) {
	echo Hello $user
	echo It is $time
}

greet ddevault `{date}

Note the use of `{script…} instead of $() for command expansion. Additional arguments are still placed in $*, allowing the user to combine variadic-style functions with named arguments.
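
For instance, here’s a sketch of how that combination might look, assuming $* holds whatever arguments are left over after the named parameters are bound:

fn greet(greeting) {
	for (name in $*) {
		echo $greeting $name
	}
}

greet Hello ddevault sir
# Hello ddevault
# Hello sir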

Here’s a more complex script that I run to perform sanity checks before applying patches:

#!/bin/rc
fn check_branch(branch) {
	if (test `{git rev-parse --abbrev-ref HEAD} != $branch) {
		echo "Error: not on master branch"
		exit 1
	}
}

fn check_uncommitted {
	if (test `{git status -suno | wc -l} -ne 0) {
		echo "Error: you have uncommitted changes"
		exit 1
	}
}

fn check_behind {
	if (test `{git rev-list "@{u}.." | wc -l} -ne 0) {
		echo "Error: your branch is behind upstream"
		exit 1
	}
}

check_branch master
check_uncommitted
check_behind
exec git pull

That’s a brief introduction to rc! Presently it clocks in at about 2500 lines of Hare. It’s not done yet, so don’t get too excited, but much of what’s described here is already working. Some other things that work, but which I didn’t mention above, include:

  • Boolean compound commands (x && y, x || y)
  • Pipelines, which can pipe arbitrary file descriptors (“x |[2] y”) – sketched briefly after this list
  • Redirects, also including arbitrary fds (“x >[2=1] file”)
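
For a feel of the first two, a quick hypothetical sketch (the operator syntax is as listed above; the commands themselves are just examples):

# boolean compound commands
make && echo build ok || echo build failed

# pipe only file descriptor 2 (stderr) from make into grep
make |[2] grep -i error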

It also has a formal context-free grammar, which is a work-in-progress but speaks to our desire to have a robust description of the shell available for users and other implementations. We use Ember Sawady’s excellent madeline for our interactive mode, which supports command line editing, history, ^r, and fish-style forward completion OOTB.

Future plans include:

  • Simple arithmetic expansion
  • Named pipe expansions
  • Sub-shells
  • switch statements
  • Port to ares
  • Find a new name, perhaps

It also needs a bit of polish, cleanup, and bug fixing.

I hope you find it interesting! I will let you know when it’s done. Feel free to play with it in the meanwhile, and maybe send some patches?

The Free Software Foundation is dying

The Free Software Foundation is one of the longest-running institutions of the free software movement, and it effectively defined the movement. It provides a legal foundation for the movement and organizes activism around software freedom. The GNU project, closely related, has its own long story in our movement as the coding arm of the Free Software Foundation, putting these principles and philosophy into practice by developing free software – notably the GNU operating system, which famously rests atop the Linux kernel.

Today, almost 40 years on, the FSF is dying.

Their achievements are unmistakable: we must offer them our gratitude and admiration for decades of accomplishments in establishing and advancing our cause. The principles of software freedom are more important than ever, and the products of these institutions remain necessary and useful – the GPL license family, GCC, GNU coreutils, and so on. Nevertheless, the organizations behind this work are floundering.

The Free Software Foundation must concern itself with the following ahead of all else:

  1. Disseminating free software philosophy
  2. Developing, publishing, and promoting copyleft licenses
  3. Overseeing the health of the free software movement

It is failing in each of these regards, and as its core mission fails, the foundation is investing its resources into distractions.

In its role as the thought leader of free software philosophy, the FSF’s message has a narrow reach. The organization’s messaging is tone-deaf, ineffective, and myopic. Hammering on about “GNU/Linux” nomenclature, antagonism towards our allies in the open source movement, maligning the audience as “useds” rather than “users”; none of this aids the cause. The pages and pages of dense philosophical essays and poorly organized FAQs do not provide a useful entry point or reference for the community. The message cannot spread like this.

As for copyleft, well, it’s no coincidence that many people struggle with the FSF’s approach. Do you, dear reader, know the difference between free software and copyleft? Many people assume that the MIT license is not free software because it’s not viral. The GPL family of licenses is essential to our movement, but few people understand its dense and esoteric language, despite the 16,000-word FAQ which supplements it. And hip new software isn’t using copyleft: over 1 million npm packages use a permissive license while fewer than 20,000 use the GPL; cargo sports a half-million permissive packages and another 20,000 or so GPL’d.

And is the free software movement healthy? This one gets an emphatic “yes!” – thanks to the open source movement and the near-equivalence between free software and open source software. There’s more free software than ever and virtually all new software contains free software components, and most people call it open source.

The FOSS community is now dominated by people who are beyond the reach of the FSF’s message. The broader community is enjoying a growth in the diversity of backgrounds and values represented, and the message does not reach these people. The FSF fails to understand its place in the world as a whole, or its relationship to the progressive movements taking place in the ecosystem and beyond. The foundation does not reach out to new leaders in the community, leaving them to form insular, weak institutions among themselves with no central leadership, and leaving us vulnerable to exploitation from growing movements like open core and commercial attacks on the free and open source software brand.

Reforms are sorely needed for the FSF to fulfill its basic mission. In particular, I call for the following changes:

  1. Reform the leadership. It’s time for Richard Stallman to go. His polemic rhetoric rivals even my own, and the demographic he represents – to the exclusion of all others – is becoming a minority within the free software movement. We need more leaders of color, women, LGBTQ representation, and others besides. The present leadership, particularly from RMS, creates an exclusionary environment in a place where inclusion and representation are important for the success of the movement.
  2. Reform the institution. The FSF needs to correct its myopic view of the ecosystem, reach out to emerging leaders throughout the FOSS world, and ask them to take charge of the FSF’s mission. It’s these leaders who hold the reins of the free software movement today – not the FSF. If the FSF still wants to be involved in the movement, they need to recognize and empower the leaders who are pushing the cause forward.
  3. Reform the message. People depend on the FSF to establish a strong background in free software philosophy and practices within the community, and the FSF is not providing this. The message needs to be made much more accessible and level in tone, and the relationship between free software and open source needs to be reformed so that the FSF and OSI stand together as the pillars at the foundations of our ecosystem.
  4. Decouple the FSF from the GNU project. FSF and GNU have worked hand-in-hand over decades to build the movement from scratch, but their privileged relationship has become obsolete. The GNU project represents a minute fraction of the free software ecosystem today, and it’s necessary for the Free Software Foundation to stand independently of any particular project and focus on the health of the ecosystem as a whole.
  5. Develop new copyleft licenses. The GPL family of licenses has served us well, but we need to do better. The best copyleft license today is the MPL, whose terse form and accessible language outperform the GPL in many respects. However, it does not provide a comprehensive answer to the needs of copyleft, and new licenses are required to fill other niches in the market – the FSF should write these licenses. Furthermore, the FSF should present the community with a free software perspective on licensing: a resource that project leaders can depend on to understand the importance of their licensing choices, one that conveys the appeal of copyleft without making people feel pushed away from permissive approaches.

The free software movement needs a strong force uniting it: we face challenges from many sides, and today’s Free Software Foundation is not equal to the task. The FOSS ecosystem is flourishing, and it’s time for the FSF to step up to the wheel and direct its coming successes in the name of software freedom.

Writing Helios drivers in the Mercury driver environment

Helios is a microkernel written in the Hare programming language and is part of the larger Ares operating system. You can watch my FOSDEM 2023 talk introducing Helios on PeerTube.

Let’s take a look at the new Mercury driver development environment for Helios.

As you may remember from my FOSDEM talk, the Ares operating system is built out of several layers which provide progressively higher-level environments for an operating system. At the bottom is the Helios microkernel, and today we’re going to talk about the second layer: the Mercury environment, which is used for writing and running device drivers in userspace. Let’s take a look at a serial driver written against Mercury and introduce some of the primitives used by driver authors in the Mercury environment.

Drivers for Mercury are written as normal ELF executables with an extra section called .manifest, which includes a file similar to the following (the provided example is for the serial driver we’ll be examining today):

[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[capabilities]
0:ioport = min=3F8, max=400
1:ioport = min=2E8, max=2F0
2:note = 
3:irq = irq=3, note=2
4:irq = irq=4, note=2
_:cspace = self
_:vspace = self
_:memory = pages=32

[services]
devregistry=

Helios uses a capability-based design, in which access to system resources (such as I/O ports, IRQs, or memory) is governed by capability objects. Each process has a capability space, which is a table of capabilities assigned to that process, and when performing operations (such as writing to an I/O port) the user provides the index of the desired capability in a register when invoking the appropriate syscall.

The manifest first specifies a list of capabilities required to operate the serial port. It requests, at statically assigned capability addresses, capabilities for the required I/O ports and IRQs, as well as a notification object to which the IRQs will be delivered. Some capability types, such as I/O ports, have configuration parameters – in this case the minimum and maximum port numbers of interest. The IRQ capabilities require a reference to a notification as well.

Limiting access to these capabilities provides very strong isolation between device drivers. On a monolithic kernel like Linux, a bug in the serial driver could compromise the entire system, but a vulnerability in our driver could, at worst, write garbage to your serial port. This model also provides better security than something like OpenBSD’s pledge by declaratively specifying what we need and nothing else.

Following the statically allocated capabilities, we request our own capability space and virtual address space, the former so we can copy and destroy our capabilities, and the latter so that we can map shared memory to perform reads and writes for clients. We also request 32 pages of memory, which we use to allocate page tables to perform those mappings; this will be changed later. These capabilities do not require any specific address for the driver to work, so we use “_” to indicate that any slot will suit our needs.

Mercury uses some vendor extensions over the System-V ABI to communicate information about these capabilities to the runtime. Notes about each of the _’d capabilities are provided by the auxiliary vector, and picked up by the Mercury runtime – for instance, the presence of a memory capability is detected on startup and is used to set up the allocator; the presence of a vspace capability is automatically wired up to the mmap implementation.

Each of these capabilities is implemented by the kernel, but additional services are available in userspace via endpoint capabilities. Each of these endpoints implements a particular API, as defined by a protocol definition file. This driver requires access to the device registry, so that it can create devices for its serial ports and expose them to clients.

These protocol definitions are written in a domain-specific language and parsed by ipcgen to generate client and server implementations of each. Here’s a simple protocol to start us off:

namespace io;

# The location with respect to which a seek operation is performed.
enum whence {
	# From the start of the file
	SET,
	# From the current offset
	CUR,
	# From the end of the file
	END,
};

# An object with file-like semantics.
interface file {
	# Reads up to amt bytes of data from a file.
	call read{pages: page...}(buf: uintptr, amt: size) size;

	# Writes up to amt bytes of data to a file.
	call write{pages: page...}(buf: uintptr, amt: size) size;

	# Seeks a file to a given offset, returning the new offset.
	call seek(offs: i64, w: whence) size;
};

Each interface includes a list of methods, each of which can take a number of capabilities and parameters, and return a value. The “read” call here, when implemented by a file-like object, accepts a list of memory pages to perform the read or write with (shared memory), as well as the buffer address and size. Error handling is still a to-do.

ipcgen consumes these files and writes client or server code as appropriate. These are generated as part of the Mercury build process and end up in *_gen.ha files. The generated client code is filed away into the relevant modules (this protocol ends up at io/file_gen.ha), alongside various hand-written files which provide additional functionality and often wrap the IPC calls in a higher-level interface. The server implementations end up in the “serv” module, e.g. serv/io/file_gen.ha.

Let’s look at some of the generated client code for io::file objects:

// This file was generated by ipcgen; do not modify by hand
use helios;
use rt;

// ID for the file IPC interface.
export def FILE_ID: u32 = 0x9A533BB3;

// Labels for operations against file objects.
export type file_label = enum u64 {
	READ = FILE_ID << 16u64 | 1,
	WRITE = FILE_ID << 16u64 | 2,
	SEEK = FILE_ID << 16u64 | 3,
};

export fn file_read(
	ep: helios::cap,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	// ...
};

Each interface has a unique ID (generated from the FNV-1a hash of its fully qualified name), which is bitwise-OR’d with a list of operations to form call labels. The interface ID is used elsewhere; we’ll refer to it again later. Then each method generates an implementation which arranges the IPC details as necessary and invokes the “call” syscall against the endpoint capability.
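
To make the label layout concrete, here’s the arithmetic for the file interface above (assuming the shift is carried out at 64-bit width, as the u64 enum suggests):

FILE_ID           = 0x9A533BB3
file_label::READ  = 0x9A533BB3 << 16 | 1 = 0x9A533BB30001
file_label::WRITE = 0x9A533BB3 << 16 | 2 = 0x9A533BB30002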

The generated server code is a bit more involved. Some of the details are similar – FILE_ID is generated again, for instance – but there are some additional details as well. First is the generation of a vtable defining the functions implementing each operation:

// Implementation of a [[file]] object.
export type file_iface = struct {
	read: *fn_file_read,
	write: *fn_file_write,
	seek: *fn_file_seek,
};

We also define a file object which is subtyped by the implementation to store implementation details, and which provides to the generated code the required bits of state.

// Instance of an file object. Users may subtype this object to add
// instance-specific state.
export type file = struct {
	_iface: *file_iface,
	_endpoint: helios::cap,
};

Here’s an example of a subtype of file used by the initramfs to store additional state:

// An open file in the bootstrap filesystem
type bfs_file = struct {
	serv::io::file,
	fs: *bfs,
	ent: tar::entry,
	cur: io::off,
	padding: size,
};

The embedded serv::io::file structure here is populated with an implementation of file_iface, here simplified for illustrative purposes:

const bfs_file_impl = serv_io::file_iface {
	read = &bfs_file_read,
	write = &bfs_file_write,
	seek = &bfs_file_seek,
};

fn bfs_file_read(
	obj: *serv_io::file,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	let file = obj: *bfs_file;
	const fs = file.fs;
	const offs = (buf & rt::PAGEMASK): size;
	defer helios::destroy(pages...)!;

	assert(offs + amt <= len(pages) * rt::PAGESIZE);
	const buf = helios::map(rt::vspace, 0, map_flags::W, pages...)!: *[*]u8;

	let buf = buf[offs..offs+amt];
	// Not shown: reading the file data into this buffer
};

The implementation can prepare a file object and call dispatch on it to process client requests: this function blocks until a request arrives, decodes it, and invokes the appropriate function. Often this is incorporated into an event loop with poll to service many objects at once.

// Prepare a file object
const ep = helios::newendpoint()!;
append(fs.files, bfs_file {
	_iface = &bfs_file_impl,
	_endpoint = ep,
	fs = fs,
	ent = ent,
	cur = io::tell(fs.buf)!,
	padding = fs.rd.padding,
});

// ...

// Process requests associated with this file
serv::io::file_dispatch(file);

Okay, enough background: back to the serial driver. It needs to implement the following protocol:

namespace dev;
use io;

# TODO: Add busy error and narrow semantics

# Note: TWO is interpreted as 1.5 for some char lengths (5)
enum stop_bits {
	ONE,
	TWO,
};

enum parity {
	NONE,
	ODD,
	EVEN,
	MARK,
	SPACE,
};

# A serial device, which implements the file interface for reading from and
# writing to a serial port. Typical implementations may only support one read
# in-flight at a time, returning errors::busy otherwise.
interface serial :: io::file {
	# Returns the baud rate in Hz.
	call get_baud() uint;

	# Returns the configured number of bits per character.
	call get_charlen() uint;

	# Returns the configured number of stop bits.
	call get_stopbits() stop_bits;

	# Returns the configured parity setting.
	call get_parity() parity;

	# Sets the baud rate in Hz.
	call set_baud(hz: uint) void;

	# Sets the number of bits per character. Must be 5, 6, 7, or 8.
	call set_charlen(bits: uint) void;

	# Configures the number of stop bits to use.
	call set_stopbits(bits: stop_bits) void;

	# Configures the desired parity.
	call set_parity(parity: parity) void;
};

This protocol inherits the io::file interface, so the serial port is usable like any other file for reads and writes. It additionally defines serial-specific methods, such as configuring the baud rate or parity. The generated interface we’ll have to implement looks something like this, embedding the io::file_iface struct:

export type serial_iface = struct {
	io::file_iface,
	get_baud: *fn_serial_get_baud,
	get_charlen: *fn_serial_get_charlen,
	get_stopbits: *fn_serial_get_stopbits,
	get_parity: *fn_serial_get_parity,
	set_baud: *fn_serial_set_baud,
	set_charlen: *fn_serial_set_charlen,
	set_stopbits: *fn_serial_set_stopbits,
	set_parity: *fn_serial_set_parity,
}

Time to dive into the implementation. Recall the driver manifest, which provides the serial driver with a suitable environment:

[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[capabilities]
0:ioport = min=3F8, max=400
1:ioport = min=2E8, max=2F0
2:note = 
3:irq = irq=3, note=2
4:irq = irq=4, note=2
_:cspace = self
_:vspace = self
_:memory = pages=32

[services]
devregistry=

I/O ports for reading and writing to the serial devices, IRQs for receiving serial-related interrupts, a device registry to add our serial devices to the system, and a few extra things for implementation needs. Some of these are statically allocated, some of them are provided via the auxiliary vector. Our serial driver opens by defining constants for the statically allocated capabilities:

def IOPORT_A: helios::cap = 0;
def IOPORT_B: helios::cap = 1;
def IRQ: helios::cap = 2;
def IRQ3: helios::cap = 3;
def IRQ4: helios::cap = 4;

The first thing we do on startup is create a serial device.

export fn main() void = {
	let serial0: helios::cap = 0;
	const registry = helios::service(sys::DEVREGISTRY_ID);
	sys::devregistry_new(registry, dev::SERIAL_ID, &serial0);
	helios::destroy(registry)!;
	// ...

The device registry is provided via the aux vector, and we can use helios::service to look it up by its interface ID. Then we use the devregistry::new operation to create a serial device:

# Device driver registry.
interface devregistry {
	# Creates a new device implementing the given interface ID using the
	# provided endpoint capability and returns its assigned serial number.
	call new{; out}(iface: u64) uint;
};

After this we can destroy the registry – we won’t need it again, and it’s best to get rid of it so that we can work with the minimum possible privileges at runtime. Then we initialize the serial port, acknowledge any interrupts that might have been pending before we got started, and enter the main loop.

com_init(&ports[0], serial0);

helios::irq_ack(IRQ3)!;
helios::irq_ack(IRQ4)!;

let poll: [_]pollcap = [
	pollcap { cap = IRQ, events = pollflags::RECV, ... },
	pollcap { cap = serial0, events = pollflags::RECV, ... },
];
for (true) {
	helios::poll(poll)!;
	if (poll[0].revents & pollflags::RECV != 0) {
		dispatch_irq();
	};
	if (poll[1].revents & pollflags::RECV != 0) {
		dispatch_serial(&ports[0]);
	};
};

The dispatch_serial function is of interest, as this provides the implementation of the serial object we just created with the device registry.

type comport = struct {
	dev::serial,
	port: u16,
	rbuf: [4096]u8,
	wbuf: [4096]u8,
	rpending: []u8,
	wpending: []u8,
};

fn dispatch_serial(dev: *comport) void = {
	dev::serial_dispatch(dev);
};

const serial_impl = dev::serial_iface {
	read = &serial_read,
	write = &serial_write,
	seek = &serial_seek,
	get_baud = &serial_get_baud,
	get_charlen = &serial_get_charlen,
	get_stopbits = &serial_get_stopbits,
	get_parity = &serial_get_parity,
	set_baud = &serial_set_baud,
	set_charlen = &serial_set_charlen,
	set_stopbits = &serial_set_stopbits,
	set_parity = &serial_set_parity,
};

fn serial_read(
	obj: *io::file,
	pages: []helios::cap,
	buf: uintptr,
	amt: size,
) size = {
	const port = obj: *comport;
	const offs = (buf & rt::PAGEMASK): size;
	const buf = helios::map(rt::vspace, 0, map_flags::W, pages...)!: *[*]u8;
	const buf = buf[offs..offs+amt];

	if (len(port.rpending) != 0) {
		defer helios::destroy(pages...)!;
		return rconsume(port, buf);
	};

	pages_static[..len(pages)] = pages[..];
	pending_read = read {
		reply = helios::store_reply(helios::CADDR_UNDEF)!,
		pages = pages_static[..len(pages)],
		buf = buf,
	};
	return 0;
};

// (other functions omitted)

We’ll skip much of the implementation details for this specific driver, but I’ll show you how read works at least. It’s relatively straightforward: first we mmap the buffer provided by the caller. If there’s already readable data pending from the serial port (stored in that rpending slice in the comport struct, which is a slice of the statically-allocated rbuf field), we copy it into the buffer and return the number of bytes we had ready. Otherwise, we stash details about the caller, storing the special reply capability in our cspace (this is one of the reasons we need cspace = self in our manifest) so we can reply to this call once data is available. Then we return to the main loop.

The main loop also wakes up on an interrupt, and we have an interrupt unmasked on the serial device to wake us whenever there’s data ready to be read. Eventually this gets us here, which finishes the call we saved earlier:

// Reads data from the serial port's RX FIFO.
fn com_read(com: *comport) size = {
	let n: size = 0;
	for (comin(com.port, LSR) & RBF == RBF; n += 1) {
		const ch = comin(com.port, RBR);
		if (len(com.rpending) < len(com.rbuf)) {
			// If the buffer is full we just drop chars
			static append(com.rpending, ch);
		};
	};

	if (pending_read.reply != 0) {
		const n = rconsume(com, pending_read.buf);
		helios::send(pending_read.reply, 0, n)!;
		pending_read.reply = 0;
		helios::destroy(pending_read.pages...)!;
	};

	return n;
};

I hope that gives you a general idea of how drivers work in this environment! I encourage you to read the full implementation if you’re curious to know more about the serial driver in particular – it’s just 370 lines of code.

The last thing I want to show you is how the driver gets executed in the first place. When Helios boots up, it starts /sbin/sysinit, which is provided by Mercury and offers various low-level userspace runtime services, such as the device registry and bootstrap filesystem we saw earlier. After setting up its services, sysinit executes /sbin/usrinit, which is provided by the next layer up (Gaia, eventually) and sets up the rest of the system according to user policy, mounting filesystems and starting up drivers and such. At the moment, usrinit is fairly simple, and just runs a little demo. Here it is in full:

use dev;
use fs;
use helios;
use io;
use log;
use rt;
use sys;

export fn main() void = {
	const fs = helios::service(fs::FS_ID);
	const procmgr = helios::service(sys::PROCMGR_ID);
	const devmgr = helios::service(sys::DEVMGR_ID);
	const devload = helios::service(sys::DEVLOADER_ID);

	log::printfln("[usrinit] Running /sbin/drv/serial");
	let proc: helios::cap = 0;
	const image = fs::open(fs, "/sbin/drv/serial")!;
	sys::procmgr_new(procmgr, &proc);
	sys::devloader_load(devload, proc, image);
	sys::process_start(proc);

	let serial: helios::cap = 0;
	log::printfln("[usrinit] open device serial0");
	sys::devmgr_open(devmgr, dev::SERIAL_ID, 0, &serial);

	let buf: [rt::PAGESIZE]u8 = [0...];
	for (true) {
		const n = match (io::read(serial, buf)!) {
		case let n: size =>
			yield n;
		case io::EOF =>
			break;
		};

		// CR => LF
		for (let i = 0z; i < n; i += 1) {
			if (buf[i] == '\r') {
				buf[i] = '\n';
			};
		};

		// echo
		io::write(serial, buf[..n])!;
	};
};

Each of the services shown at the start is automatically provided in usrinit’s aux vector by sysinit, and together they include all of the services required to bootstrap the system: a filesystem (the initramfs), a process manager (to start up new processes), the device manager, and the driver loader service.

usrinit starts by opening up /sbin/drv/serial (the serial driver, of course) from the provided initramfs using fs::open, which is a convenience wrapper around the filesystem protocol. Then we create a new process with the process manager, which by default has an empty address space – we could load a normal process into it with sys::process_load, but we want to load a driver, so we use the devloader interface instead. Then we start the process and boom: the serial driver is online.

The serial driver registers itself with the device registry, which means that we can use the device manager to open the 0th device which implements the serial interface. Since this is compatible with the io::file interface, it can be used normally with io::read and io::write to utilize the serial port. The main loop simply echoes data read from the serial port back out. Simple!


That’s a quick introduction to the driver environment provided by Mercury. I intend to write a few more drivers soon myself – PC keyboard, framebuffer, etc – and set up a simple shell. We have seen a few sample drivers written pre-Mercury which would be nice to bring into this environment, such as virtio networking and block devices. It will be nice to see them re-introduced in an environment where they can provide useful services to the rest of userspace.

If you’re interested in learning more about Helios or Mercury, consult ares-os.org for documentation – though beware of the many stub pages. If you have any questions or want to get involved in writing some drivers yourself, jump into our IRC channel: #helios on Libera Chat.

When to comment that code

My software tends to have a surprisingly low number of comments. One of my projects, scdoc, has 25 comments among its 1,133 lines of C code, or 2%, compared to the average of 19%.1 Naturally, I insist that my code is well-written in spite of this divergence from the norm. Allow me to explain.

The philosophy and implementation of code comments varies widely in the industry, and some view comment density as a proxy for code quality.2 I’ll state my views here, but will note that yours may differ and I find that acceptable; I am not here to suggest that your strategy is wrong and I will happily adopt it when I write a patch for your codebase.

Let’s begin with an illustrative example from one of my projects:

// Reads the next entry from an EFI [[FILE_PROTOCOL]] handle of an open
// directory. The return value is statically allocated and will be overwritten
// on the next call.
export fn readdir(dir: *FILE_PROTOCOL) (*FILE_INFO | void | error) = {
	// size(FILE_INFO) plus reserve up to 512 bytes for file name (FAT32
	// maximum, times two for wstr encoding)
	static let buf: [FILE_INFO_SIZE + 512]u8 = [0...];
	const n = read(dir, buf)?;
	if (n == 0) {
		return;
	};
	return &buf[0]: *FILE_INFO;
};

This code illustrates two of my various approaches to writing comments. The first comment is a documentation comment: the intended audience is the consumer of this API. The call-site has access to the following information:

  • This comment
  • The name of the function, and the module in which it resides (efi::readdir)
  • The parameter names and types
  • The return type

The goal is for the user of this function to gather enough information from these details to correctly utilize this API.

The module in which it resides suggests that this function interacts with the EFI (Extensible Firmware Interface) standard, and the user would be wise to pair a reading of this code (or API) with skimming the relevant standard. Indeed, the strategic naming of the FILE_PROTOCOL and FILE_INFO types (notably written in defiance of the Hare style guide), provide hints to the relevant parts of the EFI specification to read for a complete understanding of this code.

The name of the function is also carefully chosen to carry some weight: it is a reference to the Unix readdir function, which brings with it an intuition about its purpose and usage for programmers familiar with a Unix environment.

The return type also provides hints about the function’s use: it may return either a FILE_INFO pointer, void (nothing), or an error. Without reading the documentation string, and taking the name and return type into account, we might (correctly) surmise that we need to call this function repeatedly to read file details out of a directory until it returns void, indicating that all entries have been processed, handling any errors which might occur along the way.

We have established a lot of information about this function without actually reading the comment; in my philosophy of programming I view this information as a critical means for the author to communicate to the user, and we can lean on it to reduce the need for explicit documentation. Nevertheless, the documentation comment adds something here. The first sentence is a relatively information-sparse summary of the function’s purpose, and mainly exists to tick a box in the Hare style guide.3 The second sentence is the only real reason this comment exists: to clarify an important detail for the user which is not apparent from the function signature, namely the storage semantics associated with the return value.

Let’s now study the second comment’s purpose:

// size(FILE_INFO) plus reserve up to 512 bytes for file name (FAT32
// maximum, times two for wstr encoding)
static let buf: [FILE_INFO_SIZE + 512]u8 = [0...];

This comment exists to explain the use of the magic constant of 512. The audience of this comment is someone reading the implementation of this function. This audience has access to a different context than the user of the function, for instance they are expected to have a more comprehensive knowledge of EFI and are definitely expected to be reading the specification to a much greater degree of detail. We can and should lean on that context to make our comments more concise and useful.

An alternative writing which does not rely on this context, and which in my view is strictly worse, may look like the following:

// The FILE_INFO structure includes the file details plus a variable length
// array for the filename. The underlying filesystem is always FAT32 per the
// EFI specification, which has a maximum filename length of 256 characters. The
// filename is encoded as a wide-string (UCS-2), which encodes two bytes per
// character, and is not NUL-terminated, so we need to reserve up to 512 bytes
// for the filename.
static let buf: [FILE_INFO_SIZE + 512]u8 = [0...];

The target audience of this comment should have a reasonable understanding of EFI. We simply need to clarify that this constant is the FAT32 max filename length, times two to account for the wstr encoding, and our magic constant is sufficiently explained.

Let’s move on to another kind of comment I occasionally write: medium-length prose. These often appear at the start of a function or the start of a file and serve to add context to the implementation, to justify the code’s existence or explain why it works. Another sample:

fn init_pagetables() void = {
	// 0xFFFF0000xxxxxxxx - 0xFFFF0200xxxxxxxx: identity map
	// 0xFFFF0200xxxxxxxx - 0xFFFF0400xxxxxxxx: identity map (dev)
	// 0xFFFF8000xxxxxxxx - 0xFFFF8000xxxxxxxx: kernel image
	//
	// L0[0x000]    => L1_ident
	// L0[0x004]    => L1_devident
	// L1_ident[*]  => 1 GiB identity mappings
	// L0[0x100]    => L1_kernel
	// L1_kernel[0] => L2_kernel
	// L2_kernel[0] => L3_kernel
	// L3_kernel[0] => 4 KiB kernel pages
	L0[0x000] = PT_TABLE | &L1_ident: uintptr | PT_AF;
	L0[0x004] = PT_TABLE | &L1_devident: uintptr | PT_AF;
	L0[0x100] = PT_TABLE | &L1_kernel: uintptr | PT_AF;
	L1_kernel[0] = PT_TABLE | &L2_kernel: uintptr | PT_AF;
	L2_kernel[0] = PT_TABLE | &L3_kernel: uintptr | PT_AF;
	for (let i = 0u64; i < len(L1_ident): u64; i += 1) {
		L1_ident[i] = PT_BLOCK | (i * 0x40000000): uintptr |
			PT_NORMAL | PT_AF | PT_ISM | PT_RW;
	};
	for (let i = 0u64; i < len(L1_devident): u64; i += 1) {
		L1_devident[i] = PT_BLOCK | (i * 0x40000000): uintptr |
			PT_DEVICE | PT_AF | PT_ISM | PT_RW;
	};
};

This comment shares a trait with the previous example: its purpose, in part, is to justify magic constants. It explains the indices of the arrays by way of the desired address space, and a perceptive reader will notice that 1 GiB = 1073741824 bytes = 0x40000000 bytes.

To fully understand this, we must again consider the intended audience. This is an implementation comment, so the reader is an implementer. They will need to possess some familiarity with the behavior of page tables to be productive in this code, and they likely have the ARM manual up on their second monitor. This comment simply fills in the blanks for an informed reader.

There are two additional kinds of comments I often write: TODO and XXX.

A TODO comment indicates some important implementation deficiency that must be addressed at some point in the future; it generally means that the function does not meet its stated interface, and it is often accompanied by an assertion, a link to a ticket on the bug tracker, or both.

assert(ep.send == null); // TODO: support multiple senders

This function should support multiple senders, but does not; an assertion here prevents the code from running under conditions it does not yet support and the TODO comment indicates that this should be addressed in the future. The target audience for this comment is someone who brings about these conditions and runs into the assertion failure.

fn memory_empty(mem: *memory) bool = {
	// XXX: This O(n) linked list traversal is bad
	let next = mem.next;
	let pages = 0u;
	for (next != FREELIST_END; pages += 1) {
		const addr = mem.phys + (next * mem::PAGESIZE): uintptr;
		const ptr = mem::phys_tokernel(addr): *uint;
		next = *ptr;
	};
	return pages == mem.pages;
};

Here we find an example of an XXX comment. This code is correct: it implements the function’s interface perfectly. However, given its expected usage, a performance of O(n) is not great: this function is expected to be used in hot paths. This comment documents the deficiency, and provides a hint to a reader that might be profiling this code in regards to a possible improvement.

One final example:

// Invalidates the TLB for a virtual address.
export fn invalidate(virt: uintptr) void = {
	// TODO: Notify other cores (XXX SMP)
	invlpg(virt);
};

This is an atypical usage of XXX, but one which I still occasionally reach for. Here we have a TODO comment which indicates a case which this code does not consider, but which must be addressed in the future: it will have to raise an IPI to get other cores to invalidate the affected virtual address. However, this is one of many changes which fall under a broader milestone of SMP support, and the “XXX SMP” comment is here to make it easy to grep through the codebase for any places which are known to require attention while implementing SMP support. An XXX comment is often written for the purpose of being easily found with grep.

That sums up most of the common reasons I will write a comment in my software. Each comment is written considering a target audience and the context provided by the code in which it resides, and aims to avoid stating redundant information within these conditions. It’s for this reason that my code is sparse on comments: I find the information outside of the comments equally important and aim to be concise such that a comment is not redundant with information found elsewhere.

Hopefully this post inspired some thought in you, to consider your comments deliberately and to be more aware of your ability to communicate information in other ways. Even if you choose to write your comments more densely than I do, I hope you will take care to communicate through the other mediums in your code as well.

Porting Helios to aarch64 for my FOSDEM talk, part one

Helios is a microkernel written in the Hare programming language, and the subject of a talk I did at FOSDEM earlier this month. You can watch the talk here if you like:

A while ago I promised someone that I would not do any talks on Helios until I could present them from Helios itself, and at FOSDEM I made good on that promise: my talk was presented from a Raspberry Pi 4 running Helios. The kernel was originally designed for x86_64 (though we were careful to avoid painting ourselves into any corners so that we could port it to more architectures later on), and I initially planned to write an Intel HD Graphics driver so that I could drive the projector from my laptop. But, after a few days spent trying to comprehend the IHD manuals, I decided it would be much easier to port the entire system to aarch64 and write a driver for the much-simpler RPi GPU instead. 42 days later the port was complete, and a week or so after that I successfully presented the talk at FOSDEM. In a series of blog posts, I will take a look at those 42 days of work and explain how the aarch64 port works. Today’s post focuses on the bootloader.

The Helios boot-up process is:

  1. Bootloader starts up and loads the kernel, then jumps to it
  2. The kernel configures the system and loads the init process
  3. Kernel provides runtime services to init (and any subsequent processes)

In theory, the port to aarch64 would address these steps in order, but in practice step (2) relies heavily on the runtime services provided by step (3), so much of the work was ordered 1, 3, 2. This blog post focuses on part 1, I’ll cover parts 2 and 3 and all of the fun problems they caused in later posts.

In any case, the bootloader was the first step. Some basic changes to the build system established boot/+aarch64 as the aarch64 bootloader, and a simple qemu-specific ARM kernel was prepared which just printed a little “hello world” to demonstrate that the multi-arch build system was working as intended. More build system refinements would come later, but it’s off to the races from here. Targeting qemu’s aarch64 virt platform was useful for most of the initial debugging and bring-up (and is generally useful at all times, as a much easier platform to debug than real hardware); the first tests on real hardware came much later.
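A bring-up setup along these lines can be reproduced with qemu and edk2. The following invocation is a sketch rather than the project’s actual test command; the firmware path is an assumption that varies by distribution, and boot.img is assumed to be a FAT image with the bootloader at /EFI/BOOT/BOOTAA64.EFI and the kernel at /helios:

qemu-system-aarch64 \
	-M virt \
	-cpu cortex-a57 \
	-m 1G \
	-bios /usr/share/edk2/aarch64/QEMU_EFI.fd \
	-drive file=boot.img,format=raw \
	-serial stdio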

Booting up is a sore point on most systems. It involves a lot of arch-specific procedures, but also generally calls for custom binary formats and annoying things like disk drivers — which don’t belong in a microkernel. So the Helios bootloaders are separated from the kernel proper, which is a simple ELF executable. The bootloader loads this ELF file into memory, configures a few simple things, then passes some information along to the kernel entry point. The bootloader’s memory and other resources are hereafter abandoned and are later reclaimed for general use.

On aarch64 the boot story is pretty abysmal, and I wanted to avoid adding the SoC-specific complexity which is endemic to the platform. Thus, two solutions are called for: EFI and device trees. At the bootloader level, EFI is the more important concern. For qemu-virt and Raspberry Pi, edk2 is the free-software implementation of choice when it comes to EFI. The first order of business is producing an executable which can be loaded by EFI, which is, rather unfortunately, based on the Windows COFF/PE32+ format. I took inspiration from Linux and made a disgusting EFI stub solution, which involves hand-writing a PE32+ header in assembly and doing some truly horrifying things with binutils to massage everything into order. Much of the header is lifted from Linux:

.section .text.head
.global base
base:
.L_head:
	/* DOS header */
	.ascii "MZ"
	.skip 58
	.short .Lpe_header - .L_head
	.align 4
.Lpe_header:
	.ascii "PE\0\0"
	.short 0xAA64                              /* Machine = AARCH64 */
	.short 2                                   /* NumberOfSections */
	.long 0                                    /* TimeDateStamp */
	.long 0                                    /* PointerToSymbolTable */
	.long 0                                    /* NumberOfSymbols */
	.short .Lsection_table - .Loptional_header /* SizeOfOptionalHeader */
	/* Characteristics:
	 * IMAGE_FILE_EXECUTABLE_IMAGE |
	 * IMAGE_FILE_LINE_NUMS_STRIPPED |
	 * IMAGE_FILE_DEBUG_STRIPPED */
	.short 0x206
.Loptional_header:
	.short 0x20b                     /* Magic = PE32+ (64-bit) */
	.byte 0x02                       /* MajorLinkerVersion */
	.byte 0x14                       /* MinorLinkerVersion */
	.long _data - .Lefi_header_end   /* SizeOfCode */
	.long __pecoff_data_size         /* SizeOfInitializedData */
	.long 0                          /* SizeOfUninitializedData */
	.long _start - .L_head           /* AddressOfEntryPoint */
	.long .Lefi_header_end - .L_head /* BaseOfCode */
.Lextra_header:
	.quad 0                          /* ImageBase */
	.long 4096                       /* SectionAlignment */
	.long 512                        /* FileAlignment */
	.short 0                         /* MajorOperatingSystemVersion */
	.short 0                         /* MinorOperatingSystemVersion */
	.short 0                         /* MajorImageVersion */
	.short 0                         /* MinorImageVersion */
	.short 0                         /* MajorSubsystemVersion */
	.short 0                         /* MinorSubsystemVersion */
	.long 0                          /* Reserved */

	.long _end - .L_head             /* SizeOfImage */

	.long .Lefi_header_end - .L_head /* SizeOfHeaders */
	.long 0                          /* CheckSum */
	.short 10                        /* Subsystem = EFI application */
	.short 0                         /* DLLCharacteristics */
	.quad 0                          /* SizeOfStackReserve */
	.quad 0                          /* SizeOfStackCommit */
	.quad 0                          /* SizeOfHeapReserve */
	.quad 0                          /* SizeOfHeapCommit */
	.long 0                          /* LoaderFlags */
	.long 6                          /* NumberOfRvaAndSizes */

	.quad 0 /* Export table */
	.quad 0 /* Import table */
	.quad 0 /* Resource table */
	.quad 0 /* Exception table */
	.quad 0 /* Certificate table */
	.quad 0 /* Base relocation table */

.Lsection_table:
	.ascii ".text\0\0\0"              /* Name */
	.long _etext - .Lefi_header_end   /* VirtualSize */
	.long .Lefi_header_end - .L_head  /* VirtualAddress */
	.long _etext - .Lefi_header_end   /* SizeOfRawData */
	.long .Lefi_header_end - .L_head  /* PointerToRawData */
	.long 0                           /* PointerToRelocations */
	.long 0                           /* PointerToLinenumbers */
	.short 0                          /* NumberOfRelocations */
	.short 0                          /* NumberOfLinenumbers */
	/* IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE */
	.long 0x60000020

	.ascii ".data\0\0\0"        /* Name */
	.long __pecoff_data_size    /* VirtualSize */
	.long _data - .L_head       /* VirtualAddress */
	.long __pecoff_data_rawsize /* SizeOfRawData */
	.long _data - .L_head       /* PointerToRawData */
	.long 0                     /* PointerToRelocations */
	.long 0                     /* PointerToLinenumbers */
	.short 0                    /* NumberOfRelocations */
	.short 0                    /* NumberOfLinenumbers */
	/* IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE */
	.long 0xc0000040

.balign 0x10000
.Lefi_header_end:

.global _start
_start:
	stp x0, x1, [sp, -16]!

	adrp x0, base
	add x0, x0, #:lo12:base
	adrp x1, _DYNAMIC
	add x1, x1, #:lo12:_DYNAMIC
	bl relocate
	cmp w0, #0
	bne 0f

	ldp x0, x1, [sp], 16

	b bmain

0:
	/* relocation failed */
	add sp, sp, -16
	ret

The specific details of how any of this works are complex and unpleasant; I’ll refer you to the spec if you’re curious, and offer the general suggestion that cargo-culting my work here would be a lot easier than understanding it, should you need to build something similar.1

Note the entry point for later; we store two arguments from EFI (x0 and x1) on the stack and eventually branch to bmain.

This file is assisted by the linker script:

ENTRY(_start)
OUTPUT_FORMAT(elf64-littleaarch64)

SECTIONS {
	/DISCARD/ : {
		*(.rel.reloc)
		*(.eh_frame)
		*(.note.GNU-stack)
		*(.interp)
		*(.dynsym .dynstr .hash .gnu.hash)
	}

	. = 0xffff800000000000;

	.text.head : {
		_head = .;
		KEEP(*(.text.head))
	}

	.text : ALIGN(64K) {
		_text = .;
		KEEP(*(.text))
		*(.text.*)
		. = ALIGN(16);
		*(.got)
	}

	. = ALIGN(64K);
	_etext = .;

	.dynamic : {
		*(.dynamic)
	}

	.data : ALIGN(64K) {
		_data = .;
		KEEP(*(.data))
		*(.data.*)

		/* Reserve page tables */
		. = ALIGN(4K);
		L0 = .;
		. += 512 * 8;
		L1_ident = .;
		. += 512 * 8;
		L1_devident = .;
		. += 512 * 8;
		L1_kernel = .;
		. += 512 * 8;
		L2_kernel = .;
		. += 512 * 8;
		L3_kernel = .;
		. += 512 * 8;
	}

	.rela.text : {
		*(.rela.text)
		*(.rela.text*)
	}
	.rela.dyn : {
		*(.rela.dyn)
	}
	.rela.plt : {
		*(.rela.plt)
	}
	.rela.got : {
		*(.rela.got)
	}
	.rela.data : {
		*(.rela.data)
		*(.rela.data*)
	}

	.pecoff_edata_padding : {
		BYTE(0);
		. = ALIGN(512);
	}
	__pecoff_data_rawsize = ABSOLUTE(. - _data);
	_edata = .;

	.bss : ALIGN(4K) {
		KEEP(*(.bss))
		*(.bss.*)
		*(.dynbss)
	}

	. = ALIGN(64K);
	__pecoff_data_size = ABSOLUTE(. - _data);
	_end = .;
}

Items of note here are the careful treatment of relocation sections (cargo-culted from earlier work on RISC-V with Hare; not actually necessary as qbe generates PIC for aarch64)2 and the extra symbols used to gather information for the PE32+ header. Padding is also added in the required places, and static aarch64 page tables are defined for later use.

This is built as a shared object, and the Makefile mutilates (er, reformats) the resulting ELF file to produce a PE32+ executable:

$(BOOT)/bootaa64.so: $(BOOT_OBJS) $(BOOT)/link.ld
	$(LD) -Bsymbolic -shared --no-undefined \
		-T $(BOOT)/link.ld \
		$(BOOT_OBJS) \
		-o $@

$(BOOT)/bootaa64.efi: $(BOOT)/bootaa64.so
	$(OBJCOPY) -Obinary \
		-j .text.head -j .text -j .dynamic -j .data \
		-j .pecoff_edata_padding \
		-j .dynstr -j .dynsym \
		-j .rel -j .rel.* -j .rel* \
		-j .rela -j .rela.* -j .rela* \
		$< $@

With all of this mess sorted, and the PE32+ entry point branching to bmain, we can finally enter some Hare code:

export fn bmain(
	image_handle: efi::HANDLE,
	systab: *efi::SYSTEM_TABLE,
) efi::STATUS = {
    // ...
};

Getting just this far took 3 full days of work.

Initially, the Hare code incorporated a lot of proof-of-concept work from Alexey Yerin’s “carrot” kernel prototype for RISC-V, which also booted via EFI. Following the early bring-up of the bootloader environment, this was refactored into a more robust and general-purpose EFI support layer for Helios, which will be applicable to future ports. The purpose of this module is to provide an idiomatic Hare-oriented interface to the EFI boot services, which the bootloader uses mainly to read files from the boot media and to examine the system’s memory map.

Let’s take a look at the first few lines of bmain:

efi::init(image_handle, systab)!;

const eficons = eficons_init(systab);
log::setcons(&eficons);
log::printfln("Booting Helios aarch64 via EFI");

if (readel() == el::EL3) {
	log::printfln("Booting from EL3 is not supported");
	return efi::STATUS::LOAD_ERROR;
};

let mem = allocator { ... };
init_mmap(&mem);
init_pagetables();

Significant build system overhauls were required so that Hare modules from the kernel, like log (and, later, other modules like elf), could be incorporated into the bootloader, simplifying the process of implementing more complex bootloaders. The first call of note here is init_mmap, which scans the EFI memory map and prepares a simple high-watermark allocator to be used by the bootloader to allocate memory for the kernel image and other items of interest. It’s quite simple: it just finds the largest area of general-purpose memory and sets up an allocator with it:

// Loads the memory map from EFI and initializes a page allocator using the
// largest area of physical memory.
fn init_mmap(mem: *allocator) void = {
	const iter = efi::iter_mmap()!;
	let maxphys: uintptr = 0, maxpages = 0u64;
	for (true) {
		const desc = match (efi::mmap_next(&iter)) {
		case let desc: *efi::MEMORY_DESCRIPTOR =>
			yield desc;
		case void =>
			break;
		};
		if (desc.DescriptorType != efi::MEMORY_TYPE::CONVENTIONAL) {
			continue;
		};
		if (desc.NumberOfPages > maxpages) {
			maxphys = desc.PhysicalStart;
			maxpages = desc.NumberOfPages;
		};
	};
	assert(maxphys != 0, "No suitable memory area found for kernel loader");
	assert(maxpages <= types::UINT_MAX);
	pagealloc_init(mem, maxphys, maxpages: uint);
};

init_pagetables is next. This populates the page tables reserved by the linker with the desired higher-half memory map, illustrated in the comments shown here:

fn init_pagetables() void = {
	// 0xFFFF0000xxxxxxxx - 0xFFFF0200xxxxxxxx: identity map
	// 0xFFFF0200xxxxxxxx - 0xFFFF0400xxxxxxxx: identity map (dev)
	// 0xFFFF8000xxxxxxxx - 0xFFFF8000xxxxxxxx: kernel image
	//
	// L0[0x000]    => L1_ident
	// L0[0x004]    => L1_devident
	// L1_ident[*]  => 1 GiB identity mappings
	// L0[0x100]    => L1_kernel
	// L1_kernel[0] => L2_kernel
	// L2_kernel[0] => L3_kernel
	// L3_kernel[0] => 4 KiB kernel pages
	L0[0x000] = PT_TABLE | &L1_ident: uintptr | PT_AF;
	L0[0x004] = PT_TABLE | &L1_devident: uintptr | PT_AF;
	L0[0x100] = PT_TABLE | &L1_kernel: uintptr | PT_AF;
	L1_kernel[0] = PT_TABLE | &L2_kernel: uintptr | PT_AF;
	L2_kernel[0] = PT_TABLE | &L3_kernel: uintptr | PT_AF;
	for (let i = 0u64; i < len(L1_ident): u64; i += 1) {
		L1_ident[i] = PT_BLOCK | (i * 0x40000000): uintptr |
			PT_NORMAL | PT_AF | PT_ISM | PT_RW;
	};
	for (let i = 0u64; i < len(L1_devident): u64; i += 1) {
		L1_devident[i] = PT_BLOCK | (i * 0x40000000): uintptr |
			PT_DEVICE | PT_AF | PT_ISM | PT_RW;
	};
};

In short, we want three larger memory regions to be available: an identity map, where physical memory addresses correlate 1:1 with virtual memory; an identity map configured for device MMIO (e.g. with caching disabled); and an area to load the kernel image. The first two are straightforward: they use uniform 1 GiB mappings to populate their respective page tables. The last is slightly more complex: the kernel is ultimately loaded in 4 KiB pages, so we need to set up intermediate page tables for that purpose.
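To make the indices in the comments above a little less magic: with a 48-bit virtual address space and a 4K granule, bits 47:39 of an address select the L0 entry and bits 38:30 select the L1 entry, so 0xFFFF020000000000 lands in L0[0x004] and 0xFFFF800000000000 in L0[0x100]. Here is a minimal sketch of that arithmetic, for illustration only (it is not part of the bootloader):

// Illustrative only: the L0/L1 indices and 1 GiB block offset for a virtual
// address under a 48-bit VA space with a 4K granule.
fn walk(virt: uintptr) (size, size, uintptr) = {
	const l0 = ((virt >> 39) & 0x1ff): size;
	const l1 = ((virt >> 30) & 0x1ff): size;
	const off = virt & 0x3fffffff;
	return (l0, l1, off);
};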

We cannot actually enable these page tables until we’re finished making use of the EFI boot services — the EFI specification requires us to preserve the firmware’s memory map while boot services remain active. However, this does lay the groundwork for the kernel loader: we have an allocator to provide pages of memory, and page tables to set up virtual memory mappings that can be activated once we’re done with EFI. bmain thus proceeds with loading the kernel:

const kernel = match (efi::open("\\helios", efi::FILE_MODE::READ)) {
case let file: *efi::FILE_PROTOCOL =>
	yield file;
case let err: efi::error =>
	log::printfln("Error: no kernel found at /helios");
	return err: efi::STATUS;
};

log::printfln("Load kernel /helios");
const kentry = match (load(&mem, kernel)) {
case let err: efi::error =>
	return err: efi::STATUS;
case let entry: uintptr =>
	yield entry: *kentry;
};
efi::close(kernel)!;

The loader itself (the “load” function here) is a relatively straightforward ELF loader; if you’ve seen one you’ve seen them all. Nevertheless, you may browse it online if you so wish. The only item of note here is the function used for mapping kernel pages:

// Maps a physical page into the kernel's virtual address space.
fn kmmap(virt: uintptr, phys: uintptr, flags: uintptr) void = {
	assert(virt & ~0x1ff000 == 0xffff800000000000: uintptr);
	const offs = (virt >> 12) & 0x1ff;
	L3_kernel[offs] = PT_PAGE | PT_NORMAL | PT_AF | PT_ISM | phys | flags;
};

The assertion enforces a constraint which is implemented by our kernel linker script, namely that all loadable kernel program headers are located within the kernel’s reserved address space. With this constraint in place, the implementation is simpler than many mmap implementations; we can assume that L3_kernel is the correct page table and just load it up with the desired physical address and mapping flags.
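For a sense of how this gets used, here is a hypothetical, simplified sketch of a loader inner loop calling kmmap; the phdr field names follow the usual ELF conventions and this is not the actual load function:

// Hypothetical sketch: map one loadable segment page-by-page. PAGESIZE,
// PT_RW, and pagealloc are the definitions seen elsewhere in this post;
// mem is the *allocator passed into the loader, and phdr is an assumed
// ELF program header.
let virt = phdr.p_vaddr: uintptr;
for (let off = 0z; off < phdr.p_memsz: size; off += PAGESIZE) {
	const phys = pagealloc(mem);
	kmmap(virt + off: uintptr, phys, PT_RW);
	// (a real loader also copies p_filesz bytes from the ELF file into
	// this page and zeroes the remainder)
};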

Following the kernel loader, the bootloader addresses other items of interest, such as loading the device tree and boot modules — which include, for instance, the init process image and an initramfs. It also allocates and populates data structures with information which will be of later use to the kernel, including the memory map. This code is relatively straightforward and not particularly interesting; most of these processes take advantage of the same straightforward Hare function:

// Loads a file into contiguous pages of memory and returns its physical
// address.
fn load_file(
	mem: *allocator,
	file: *efi::FILE_PROTOCOL,
) (uintptr | efi::error) = {
	const info = efi::file_info(file)?;
	const fsize = info.FileSize: size;
	let npage = fsize / PAGESIZE;
	if (fsize % PAGESIZE != 0) {
		npage += 1;
	};

	let base: uintptr = 0;
	for (let i = 0z; i < npage; i += 1) {
		const phys = pagealloc(mem);
		if (base == 0) {
			base = phys;
		};

		const nbyte = if ((i + 1) * PAGESIZE > fsize) {
			yield fsize % PAGESIZE;
		} else {
			yield PAGESIZE;
		};
		let dest = (phys: *[*]u8)[..nbyte];
		const n = efi::read(file, dest)?;
		assert(n == nbyte);
	};

	return base;
};

It is not necessary to map these into virtual memory anywhere; the kernel later uses the identity-mapped physical memory region in the higher half to read them. Tasks of interest resume at the end of bmain:

efi::exit_boot_services();
init_mmu();
enter_kernel(kentry, ctx);

Once we exit boot services, we are free to configure the MMU according to our desired specifications and make good use of all of the work done earlier to prepare a kernel memory map. Thus, init_mmu:

// Initializes the ARM MMU to our desired specifications. This should take place
// *after* EFI boot services have exited because we're going to mess up the MMU
// configuration that it depends on.
fn init_mmu() void = {
	// Disable MMU
	const sctlr_el1 = rdsctlr_el1();
	wrsctlr_el1(sctlr_el1 & ~SCTLR_EL1_M);

	// Configure MAIR
	const mair: u64 =
		(0xFF << 0) | // Attr0: Normal memory; IWBWA, OWBWA, NTR
		(0x00 << 8);  // Attr1: Device memory; nGnRnE, OSH
	wrmair_el1(mair);

	const tsz: u64 = 64 - 48;
	const ips = rdtcr_el1() & TCR_EL1_IPS_MASK;
	const tcr_el1: u64 =
		TCR_EL1_IPS_42B_4T |	// 4 TiB IPS
		TCR_EL1_TG1_4K |	// Higher half: 4K granule size
		TCR_EL1_SH1_IS |	// Higher half: inner shareable
		TCR_EL1_ORGN1_WB |	// Higher half: outer write-back
		TCR_EL1_IRGN1_WB |	// Higher half: inner write-back
		(tsz << TCR_EL1_T1SZ) |	// Higher half: 48 bits
		TCR_EL1_TG0_4K |	// Lower half: 4K granule size
	TCR_EL1_SH0_IS |	// Lower half: inner shareable
		TCR_EL1_ORGN0_WB |	// Lower half: outer write-back
		TCR_EL1_IRGN0_WB |	// Lower half: inner write-back
		(tsz << TCR_EL1_T0SZ);	// Lower half: 48 bits
	wrtcr_el1(tcr_el1);

	// Load page tables
	wrttbr0_el1(&L0[0]: uintptr);
	wrttbr1_el1(&L0[0]: uintptr);
	invlall();

	// Enable MMU
	const sctlr_el1: u64 =
		SCTLR_EL1_M |		// Enable MMU
		SCTLR_EL1_C |		// Enable cache
		SCTLR_EL1_I |		// Enable instruction cache
		SCTLR_EL1_SPAN |	// SPAN?
		SCTLR_EL1_NTLSMD |	// NTLSMD?
		SCTLR_EL1_LSMAOE |	// LSMAOE?
		SCTLR_EL1_TSCXT |	// TSCXT?
		SCTLR_EL1_ITD;		// ITD?
	wrsctlr_el1(sctlr_el1);
};

There are a lot of bits here! Figuring out which ones to enable or disable was a project in and of itself. One of the major challenges, funnily enough, was finding the correct ARM manual to reference to understand all of these registers. I’ll save you some time and link to it directly, should you ever find yourself writing similar code. Some question marks in comments towards the end point out some flags that I’m still not sure about. The ARM CPU is very configurable and identifying the configuration that produces the desired behavior for a general-purpose kernel requires some effort.

After this function completes, the MMU is initialized and we are up and running with the kernel memory map we prepared earlier; the kernel is loaded in the higher half and the MMU is prepared to service it. So, we can jump to the kernel via enter_kernel:

@noreturn fn enter_kernel(entry: *kentry, ctx: *bootctx) void = {
	const el = readel();
	switch (el) {
	case el::EL0 =>
		abort("Bootloader running in EL0, breaks EFI invariant");
	case el::EL1 =>
		// Can boot immediately
		entry(ctx);
	case el::EL2 =>
		// Boot from EL2 => EL1
		//
		// This is the bare minimum necessary to get to EL1. Future
		// improvements might be called for here if anyone wants to
		// implement hardware virtualization on aarch64. Good luck to
		// this future hacker.

		// Enable EL1 access to the physical counter register
		const cnt = rdcnthctl_el2();
		wrcnthctl_el2(cnt | 0b11);

		// Enable aarch64 in EL1 & SWIO, disable most other EL2 things
		// Note: I bet someday I'll return to this line because of
		// Problems
		const hcr: u64 = (1 << 1) | (1 << 31);
		wrhcr_el2(hcr);

		// Set up SPSR for EL1
		// XXX: Magic constant I have not bothered to understand
		wrspsr_el2(0x3c4);

		enter_el1(ctx, entry);
	case el::EL3 =>
		// Not supported, tested earlier on
		abort("Unsupported boot configuration");
	};
};

Here we see the detritus from one of many battles I fought to port this kernel: the EL2 => EL1 transition. aarch64 has several “exception levels”, which are semantically similar to the x86_64 concept of protection rings. EL0 is used for userspace code, which is not applicable under these circumstances; an assertion sanity-checks this invariant. EL1 is the simplest case, this is used for normal kernel code and in this situation we can jump directly to the kernel. The EL2 case is used for hypervisor code, and this presented me with a challenge. When I tested my bootloader in qemu-virt, it worked initially, but on real hardware it failed. After much wailing and gnashing of teeth, the cause was found to be that our bootloader was started in EL2 on real hardware, and EL1 on qemu-virt. qemu can be configured to boot in EL2, which was crucial in debugging this problem, via -M virt,virtualization=on. From this environment I was able to identify a few important steps to drop to EL1 and into the kernel, though from the comments you can probably ascertain that this process was not well-understood. I do have a better understanding of it now than I did when this code was written, but the code is still serviceable and I see no reason to change it at this stage.
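For anyone chasing a similar bug, the earlier qemu sketch only needs its machine flag adjusted to reproduce an EL2 entry (the firmware path is, again, an assumption):

qemu-system-aarch64 -M virt,virtualization=on -cpu cortex-a57 -m 1G \
	-bios /usr/share/edk2/aarch64/QEMU_EFI.fd \
	-drive file=boot.img,format=raw -serial stdio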

At this point, 14 days into the port, I successfully reached kmain on qemu-virt. Some initial kernel porting work was done after this, but when I was prepared to test it on real hardware I ran into this EL2 problem — the first kmain on real hardware ran at T+18.

That sums it up for the aarch64 EFI bootloader work. 24 days later the kernel and userspace ports would be complete, and a couple of weeks after that it was running on stage at FOSDEM. The next post will cover the kernel port (maybe more than one post will be required, we’ll see), and the final post will address the userspace port and the inner workings of the slidedeck demo that was shown on stage. Look forward to it, and thanks for reading!

Should private platforms engage in censorship?

Private service providers are entitled to do business with whom they please, or not to. Occasionally, a platform will take advantage of this to deny service to a particular entity on any number of grounds, often igniting a flood of debate online regarding whether or not censorship in this form is just. Recently, CloudFlare pulled the plug on a certain forum devoted to the coordinated harassment of its victims. Earlier examples include the same service blocking a far-right imageboard, or Namecheap cancelling service for a neo-Nazi news site.

In each of these cases, a private company elected to terminate service for a customer voluntarily, without a court order. Absent from these events was any democratic or judicial oversight. A private company which provides some kind of infrastructure for the Internet simply elected to unilaterally terminate service for a customer or class of customers.

When private companies choose with whom they do or do not do business, this is an exercise of an important freedom: freedom of association. Some companies have this right limited by regulation — for instance, utility companies are often required to provide power to everyone who wants it within their service area. Public entities are required to provide their services to everyone — for instance, the US postal service cannot unilaterally choose not to deliver your mail. However, by default, private companies are generally allowed to deny their services to whomever they please.1

Are they right to?

An argument is often made that, when a platform reaches a given size (e.g. Facebook), or takes on certain ambitions (e.g. CloudFlare), it may become large and entrenched enough in our society that it should self-impose a role more analogous to a public utility than a private company. Under such constraints, such a platform would choose to host any content which is not explicitly illegal, and defer questions over what content is appropriate to the democratic process. There are a number of angles from which we can examine this argument.

For a start, how might we implement the scenario called for by this argument? Consider one option: regulation. Power companies are subject to regulations regarding how and with whom they do business; they must provide service to everyone and they are not generally allowed to shut off your heat in the cold depths of winter. Similarly, we could regulate digital platforms to require them to provide a soapbox for all legally expressible viewpoints, then utilize the democratic process to narrow this soapbox per society’s mutually-agreed-upon views regarding matters such as neo-Nazi propaganda.2

It’s important when making this argument to note that regulation of this sort imposes obligations on private businesses which erode their own right to free association; radical free speech for individuals requires radical curtailing of free association for businesses. Private businesses are owned and staffed by individuals, and requiring them to allow all legal forms of content on their platform is itself a limitation on their freedom. The staff of a newspaper may not appreciate being required by law to provide space in the editorials for KKK members to espouse their racist philosophy, but would nevertheless be required to typeset such articles under such an arrangement.

Another approach to addressing this argument is not to question the rights of a private business, but instead to question whether or not they should be allowed to grow to a size such that their discretion in censorship constitutes a disruption to society due to their scale and entrenched market position. Under this lens, we can suggest another government intervention that does not take the form of regulation, but of an application of antitrust law. With more platforms to choose from, we can explore more approaches to moderation and censorship, and depend on the market’s invisible hand to lead us true.

The free speech absolutist who makes similar arguments may find themselves in a contradiction: expanding free speech for some people (platform users) requires, in this scenario, curtailing freedoms for others (platform owners and staff). Someone in this position may concede that, while they support the rights of individuals, they might not offer the same rights to businesses who resemble utilities. The tools for implementing this worldview, however, introduce further contradictions when combined with the broader political profile of a typical free speech absolutist: calling for regulation isn’t very consistent with any “small government” philosophy; and those who describe themselves as Libertarian and make either of these arguments provide me with no small amount of amusement.

There is another flaw in this line of thinking which I want to highlight: the presumption that the democratic process can address these problems in the first place. Much of the legitimacy of this argument rests on the assumption that the ability for maligned users to litigate their grievances is not only more just, but also equal to the threat posed by hate speech and other concerns which are often the target of censorship on private platforms. I don’t think that this is true.

The democratic and judicial processes are often corrupt and inefficient. It is still the case that the tone of your skin has an outsized effect on the outcome of your court case; why shouldn’t similar patterns emerge when de-platformed racists are given their day before a judge? Furthermore, the pace of government intervention is generally insufficient. Could Facebook appeal to a court for the right to remove the Proud Boys from their platform faster than the Proud Boys could organize an attack on the US Capitol building? And can lawmakers keep up with innovation at a pace sufficient to address new forms and mediums for communicating harmful content before they become a problem?

We should also question if the democratic process will lead to moral outcomes. Minorities are, by definition, in the minority, and a purely democratic process will only favor their needs subject to the will of the majority. Should the rights of trans people to live free of harassment be subject to the pleasure of the cisgendered majority?

These systems, when implemented, will perform as they always have: they will provide disproportionately unfavorable outcomes for disadvantaged members of society. I am a leftist: if asked to imagine a political system which addresses these problems, I will first imagine sweeping reforms to our existing system, point out that the free market isn’t, lean in favor of regulation and nationalization of important industries, and seek to empower the powerless against the powerful. It will require a lot of difficult, ongoing work to get there, and I imagine most of this work will be done in spite of the protests of the typical free speech absolutist.

I am in favor of these reforms, but they are decades away from completion, and many will disagree on the goals and their implementation. But I am also a pragmatic person, and when faced with the system in which we find ourselves today, I seek a pragmatic solution to this problem; ideally one which is not predicated on revolution. When faced with the question, “should private platforms engage in censorship?”, what is the pragmatic answer?

To provide such an answer, we must de-emphasize idealism in favor of an honest examination of the practical context within which our decision-making is done. Consider again the status quo: private companies are generally permitted to exercise their right to free association by kicking people off of their platforms. A pragmatic framework for making these decisions examines the context in which they are made. In the current political climate, this context should consider the threats faced by many different groups of marginalized people today: racism is still alive and strong, what few LGBT rights exist are being dismantled, and many other civil liberties are under attack.

When someone (or some entity, such as a business) enjoys a particular freedom, the way they exercise it is meaningful. Inaction is a form of complicity; allowing hate to remain on your platform signals that you favor the lofty principles outlined in the arguments above in spite of the problems enumerated here and the realities faced by marginalized people today. A purely moral consideration thus suggests that exercising your right to free association in your role as a decision-maker at a business is a just response to this status quo.

I expect the people around me (given a definition of “around me” that extends to the staff at businesses I patronize) to possess a moral compass which is compatible with my own, and to act in accordance with it; in the absence of this I will express my discontent by voting with my feet. However, businesses in the current liberal economic regime often disregard morals in favor of profit-oriented decision making. Therefore, in order for the typical business to behave morally, its decision-making must exist within a context where the moral outcomes align with the profitable outcomes.

We are seeing increasing applications of private censorship because this alignment is present. Businesses depend on two economic factors which are related to this issue: access to a pool of profitable users, and access to a labor pool with which to develop and maintain their profits. Businesses which platform bigots are increasingly finding public opinion turning against them; marginalized people and moderates tend to flee to less toxic spaces and staff members are looking to greener pastures. The free market currently rewards private censorship, therefore in a system wherein the free market reigns supreme we observe private censorship.

I reject the idea that it is appropriate for businesses to sideline morality in favor of profit, and I don’t have much faith in the free market to produce moral outcomes. For example, the market is responding poorly to the threat of climate change. However, in the case of private censorship, the incentives are aligned such that the outcomes we’re observing match the outcomes I would expect.

This is a complex topic which we have examined from many angles. In my view, freedom of association is just as important as freedom of speech, and its application to private censorship is not clearly wrong. If you view private censorship as an infringement of the principle of free speech, but agree that freedom of association is nevertheless important, we must resolve this contradiction. The democratic or judicial processes are an enticing and idealistic answer, but they are flawed processes that may not produce just outcomes. And if I were to reach for new tools to address this question, I would present solutions from a socialist perspective which may or may not jibe with your sensibilities.

Nevertheless, the system as it exists today produces outcomes which approximate both rationality and justice, and I do not stand in opposition to the increased application of private censorship under the current system, flawed though it may be.

My plans at FOSDEM: SourceHut, Hare, and Helios

FOSDEM is right around the corner, and finally in person after long years of dealing with COVID. I’ll be there again this year, and I’m looking forward to it! I have four slots on the schedule (wow! Thanks for arranging these, FOSDEM team) and I’ll be talking about several projects. There is a quick lightning talk on Saturday to introduce Helios and tease a full-length talk on Sunday, a meetup for the Hare community, and a meetup for the SourceHut community. I hope to see you there!

Lightning talk: Introducing Helios

Saturday 12:00 at H.2215 (Ferrer)

Helios is a simple microkernel written in part to demonstrate the applicability of the Hare programming language to kernels. This talk briefly explains why Helios is interesting and is a teaser for a more in-depth talk in the microkernel room tomorrow.

Hare is a systems programming language designed to be simple, stable, and robust. Hare uses a static type system, manual memory management, and a minimal runtime. It is well-suited to writing operating systems, system tools, compilers, networking software, and other low-level, high performance tasks. Helios uses Hare to implement a microkernel, largely inspired by seL4.

BoF: The Hare programming language

Saturday 15:00 at UB2.147

Hare is a systems programming language designed to be simple, stable, and robust. Hare uses a static type system, manual memory management, and a minimal runtime. It is well-suited to writing operating systems, system tools, compilers, networking software, and other low-level, high performance tasks.

At this meeting we’ll sum up the state of affairs with Hare, our plans for the future, and encourage discussions with the community. We’ll also demonstrate a few interesting Hare projects, including Helios, a micro-kernel written in Hare, and encourage each other to work on interesting projects in the Hare community.

BoF: SourceHut meetup

Saturday 16:00 at UB2.147

SourceHut is a free software forge for developing software projects, providing git and mercurial hosting, continuous integration, mailing lists, and more. We’ll be meeting here again in 2023 to discuss the platform and its community, the completion of the GraphQL rollout and the migration to the EU, and any other topics on the minds of the attendees.

Introducing Helios

Sunday 13:00 at H.1308 (Rolin)

Helios is a simple microkernel written in part to demonstrate the applicability of the Hare programming language to kernels. This talk will introduce the design and rationale for Helios, address some details of its implementation, compare it with seL4, and elaborate on the broader plans for the system.

Setting a new focus for my blog

Just shy of two months ago, I published I shall toil at a reduced volume, which explained that I wasn’t getting what I wanted from my blog anymore and that I would be taking an indefinite break. Well, I am ready to resume my writing, albeit with a different tone and focus than before.

Well, that was fast.

– Everyone

Since writing this, I have been considering what exactly the essential subject of my dissatisfaction with my writing has been. I may have found the answer: I lost sight of my goals. I got so used to writing that I would often think to myself, “I want to write a blog post!”, then dig a topic out of my backlog (which is 264 items long) and write something about it. This is not the way; much of the effort expended on writing in this manner is not spent on the subjects I care about most, or those which most urgently demand an expenditure of words.

The consequences of this misalignment of perspective are that my writing has often felt dull and rote. It encourages shallower takes and lends itself to the rants or unthoughtful criticisms that my writings are, unfortunately, (in)famous for. When I take an idea off of the shelf, or am struck by an idea that, in the moment, seemingly demands to be spake of, I often end up with a disappointing result when the fruit of this inspiration is published a few hours later.

Over the long term, these issues manifest as demerits to my reputation, and deservedly so. What’s more, when a critical tone is well-justified, the posts which utilize it are often overlooked by readers due to the normalization of this tone throughout less important posts. Take for instance my recent post on Rust in Linux. Though this article could have been written with greater nuance, I still find its points about the value of conservatism in software decision-making accurate and salient. However, the message is weakened by riding on the coat-tails of my long history of less poignant critiques of Rust. As I resume my writing, I will have to undertake a more critical examination of myself and the broader context of my writing before reaching for a negative tone as a writing tool.

With these lessons in mind, I am seeking out stronger goals to align my writing with, in the hope that the writing is both more fulfilling for me, and more compelling for the reader. Among these goals I have identified two particularly important ones, whose themes resonate through my strongest articles throughout the years:

  1. The applicability of software to the just advancement of society, its contextualization within the needs of the people who use it, a deep respect for these people and the software’s broader impact on the world, and the use of free software to acknowledge and fulfill these needs.
  2. The principles of good software engineering, such that software built to meet these goals is reliable, secure, and comprehensible. It is in the service of this goal that I beat the drum of simplicity with a regular rhythm.

Naturally many people have important beliefs on these subjects. I simply aim to share my own perspective, and I find it rewarding when I am able to write compelling arguments which underline these goals.

There is another kind of blog post that I enjoy writing and plan to resume: in-depth technical analysis of my free software projects. I’m working on lots of interesting and exciting projects, and I want to talk about them more, and I think people enjoy reading about them. I just spent six weeks porting Helios to aarch64, for instance, and have an essay on the subject half-written in the back of my head. I would love to type it in and publish it.

So, I will resume writing, and indeed at a “reduced volume”, with a renewed focus on the message and its context, and an emphasis on serving the goals I care about the most. Hopefully I find it more rewarding to write in this manner, and you find the results more compelling to read! Stay tuned.

$ rm ~/sources/drewdevault.com/todo.txt

I shall toil at a reduced volume

Over the last nine years I have written 300,000 words for this blog on the topics which are important to me. I am not certain that I have much left to say.

I can keep revisiting these topics for years, each time adding a couple more years of wisdom and improvements to my writing skills to present my arguments more effectively. However, I am starting to feel diminishing returns from my writing. It does not seem like my words are connecting with readers anymore. And, though the returns on my work seem to be diminishing, the costs are not. Each new article spurs less discussion than the last, but provides an unwavering supply of spiteful responses.

Software is still the same mess it was when I started writing and working, or perhaps even worse. You can’t overcome perverse incentives. As Cantrill once famously noted, the lawnmower can’t have empathy. The truth he did not speak is that we all have some Oracle in our hearts, and the lawnmower is the size of the entire industry.

I have grown tired of it. I will continue my work quietly, building the things I believe in, and remaining true to my principles. I do not yet know if this is a cessation or a siesta, but I do know that I will not write again for some time. Thank you for reading, and good luck in your endeavours. I hope you found something of value in these pages.

Here are some of the blog posts I am most proud of, should you want to revisit them today or the next time you happen upon my website:

Codegen in Hare v2

I spoke about code generation in Hare back in May when I wrote a tool for generating ioctl numbers. I wrote another code generator over the past few weeks, and it seems like a good time to revisit the topic on my blog to showcase another approach, and the improvements we’ve made for this use-case.

In this case, I wanted to generate code to implement IPC (inter-process communication) interfaces for my operating system. I have designed a DSL for describing these interfaces — you can read the grammar here. This calls for a parser, which is another interesting topic for Hare, but I’ll set that aside for now and focus on the code gen. Assume that, given a file like the following, we can parse it and produce an AST:

namespace hello;

interface hello {
	call say_hello() void;
	call add(a: uint, b: uint) uint;
};
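I won’t reproduce the parser here, but to give a rough sense of the shape of that AST, the following is a sketch of the types implied by how the generator code later in this post uses them. The field names are taken from that usage; the rest (such as the placeholder ipctype) is guesswork, and the real ast module surely differs in its details:

// Sketch only: approximate AST types inferred from their usage in the
// generator code below. ipctype is a placeholder for however the real ast
// module represents IPC types; the caps_in/caps_out fields the generator
// also consults are elided here.
export type ident = []str;
export type ipctype = str; // placeholder

export type param = struct {
	name: str,
	param_type: ipctype,
};

export type method = struct {
	name: str,
	params: []param,
	result: ipctype,
};

export type interface = struct {
	name: str,
	methods: []method,
};

export type document = struct {
	namespace: ident,
	interfaces: []interface,
};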

The key to the code gen approach we’re looking at today is the introduction of strings::template to the Hare standard library. This module is inspired by a similar feature from Python, string.Template. An example of its usage is provided in Hare’s standard library documentation:

const src = "Hello, $user! Your balance is $$$balance.\n";
const template = template::compile(src)!;
defer template::finish(&template);
template::execute(&template, os::stdout,
	("user", "ddevault"),
	("balance", 1000),
)!; // "Hello, ddevault! Your balance is $1000."

Makes sense? Cool. Let’s see how this can be applied to code generation. The interface shown above compiles to the following generated code:

// This file was generated by ipcgen; do not modify by hand
use errors;
use helios;
use rt;

def HELLO_ID: u32 = 0xC01CAAC5;

export type fn_hello_say_hello = fn(object: *hello) void;
export type fn_hello_add = fn(object: *hello, a: uint, b: uint) uint;

export type hello_iface = struct {
	say_hello: *fn_hello_say_hello,
	add: *fn_hello_add,
};

export type hello_label = enum u64 {
	SAY_HELLO = HELLO_ID << 16u64 | 1,
	ADD = HELLO_ID << 16u64 | 2,
};

export type hello = struct {
	_iface: *hello_iface,
	_endpoint: helios::cap,
};

export fn hello_dispatch(
	object: *hello,
) void = {
	const (tag, a1) = helios::recvraw(object._endpoint);
	switch (rt::label(tag): hello_label) {
	case hello_label::SAY_HELLO =>
		object._iface.say_hello(
			object,
		);
		match (helios::reply(0)) {
		case void =>
			yield;
		case errors::invalid_cslot =>
			yield; // callee stored the reply
		case errors::error =>
			abort(); // TODO
		};
	case hello_label::ADD =>
		const rval = object._iface.add(
			object,
			a1: uint,
			rt::ipcbuf.params[1]: uint,
		);
		match (helios::reply(0, rval)) {
		case void =>
			yield;
		case errors::invalid_cslot =>
			yield; // callee stored the reply
		case errors::error =>
			abort(); // TODO
		};
	case =>
		abort(); // TODO
	};
};

Generating this code starts with the following entry-point:

// Generates code for a server to implement the given interface.
export fn server(out: io::handle, doc: *ast::document) (void | io::error) = {
	fmt::fprintln(out, "// This file was generated by ipcgen; do not modify by hand")!;
	fmt::fprintln(out, "use errors;")!;
	fmt::fprintln(out, "use helios;")!;
	fmt::fprintln(out, "use rt;")!;
	fmt::fprintln(out)!;

	for (let i = 0z; i < len(doc.interfaces); i += 1) {
		const iface = &doc.interfaces[i];
		s_iface(out, doc, iface)?;
	};
};

Here we start with some simple use of basic string formatting via fmt::fprintln. We see some of the same approach repeated in the meatier functions like s_iface:

fn s_iface(
	out: io::handle,
	doc: *ast::document,
	iface: *ast::interface,
) (void | io::error) = {
	const id: ast::ident = [iface.name];
	const name = gen_name_upper(&id);
	defer free(name);

	let id: ast::ident = alloc(doc.namespace...);
	append(id, iface.name);
	defer free(id);
	const hash = genhash(&id);

	fmt::fprintfln(out, "def {}_ID: u32 = 0x{:X};\n", name, hash)!;

Our first use of strings::template appears when we want to generate type aliases for interface functions, via s_method_fntype. This is where some of the trade-offs of this approach begin to present themselves.

const s_method_fntype_src: str =
	`export type fn_$iface_$method = fn(object: *$object$params) $result;`;
let st_method_fntype: tmpl::template = [];

@init fn s_method_fntype() void = {
	st_method_fntype = tmpl::compile(s_method_fntype_src)!;
};

fn s_method_fntype(
	out: io::handle,
	iface: *ast::interface,
	meth: *ast::method,
) (void | io::error) = {
	assert(len(meth.caps_in) == 0); // TODO
	assert(len(meth.caps_out) == 0); // TODO

	let params = strio::dynamic();
	defer io::close(&params)!;
	if (len(meth.params) != 0) {
		fmt::fprint(&params, ", ")?;
	};
	for (let i = 0z; i < len(meth.params); i += 1) {
		const param = &meth.params[i];
		fmt::fprintf(&params, "{}: ", param.name)!;
		ipc_type(&params, &param.param_type)!;

		if (i + 1 < len(meth.params)) {
			fmt::fprint(&params, ", ")!;
		};
	};

	let result = strio::dynamic();
	defer io::close(&result)!;
	ipc_type(&result, &meth.result)!;

	tmpl::execute(&st_method_fntype, out,
		("method", meth.name),
		("iface", iface.name),
		("object", iface.name),
		("params", strio::string(&params)),
		("result", strio::string(&result)),
	)?;
	fmt::fprintln(out)?;
};

The simple string substitution approach of strings::template prevents it from being as generally useful as a full-blown templating engine ala jinja2. To work around this, we have to write Hare code which does things like slurping up the method parameters into a strio::dynamic buffer where we might instead reach for something like {% for param in method.params %} in jinja2. Once we have prepared all of our data in a format suitable for a linear string substitution, we can pass it to tmpl::execute. The actual template is stored in a global which is compiled during @init, which runs at program startup. Anything which requires a loop to compile, such as the parameter list, is fetched out of the strio buffer and passed to the template.

We can explore a slightly different approach when we generate this part of the code, back up in the s_iface function:

export type hello_iface = struct {
	say_hello: *fn_hello_say_hello,
	add: *fn_hello_add,
};

To output this code, we render several templates one after another, rather than slurping up the generated code into heap-allocated string buffers to be passed into a single template.

const s_iface_header_src: str =
	`export type $iface_iface = struct {`;
let st_iface_header: tmpl::template = [];

const s_iface_method_src: str =
	`	$method: *fn_$iface_$method,`;
let st_iface_method: tmpl::template = [];

@init fn s_iface() void = {
	st_iface_header = tmpl::compile(s_iface_header_src)!;
	st_iface_method = tmpl::compile(s_iface_method_src)!;
};

// ...

tmpl::execute(&st_iface_header, out,
	("iface", iface.name),
)?;
fmt::fprintln(out)?;

for (let i = 0z; i < len(iface.methods); i += 1) {
	const meth = &iface.methods[i];
	tmpl::execute(&st_iface_method, out,
		("iface", iface.name),
		("method", meth.name),
	)?;
	fmt::fprintln(out)?;
};

fmt::fprintln(out, "};\n")?;

The remainder of the code is fairly similar.

strings::template is less powerful than a more sophisticated templating system might be, such as Golang’s text/template. A more sophisticated templating engine could be implemented for Hare, but it would be more challenging — no reflection or generics in Hare — and would not be a great candidate for the standard library. This approach hits the sweet spot of simplicity and utility that we’re aiming for in the Hare stdlib. strings::template is implemented in a single ~180 line file.

I plan to continue polishing this tool so I can use it to describe interfaces for communications between userspace drivers and other low-level userspace services in my operating system. If you have any questions, feel free to post them on my public inbox, or shoot them over to my new fediverse account. Until next time!

In praise of Plan 9

Plan 9 is an operating system designed by Bell Labs. It’s the OS they wrote after Unix, with the benefit of hindsight. It is the most interesting operating system that you’ve never heard of, and, in my opinion, the best operating system design to date. Even if you haven’t heard of Plan 9, the designers of whatever OS you do use have heard of it, and have incorporated some of its ideas into your OS.

Plan 9 is a research operating system, and exists to answer questions about ideas in OS design. As such, the Plan 9 experience is in essence an exploration of the interesting ideas it puts forth. Most of the ideas are small. Many of them found a foothold in the broader ecosystem — UTF-8, goroutines, /proc, containers, union filesystems, these all have their roots in Plan 9 — but many of its ideas, even the good ones, remain unexplored outside of Plan 9. As a consequence, Plan 9 exists at the center of a fervor of research achievements which forms a unique and profoundly interesting operating system.

One example I often raise to illustrate the design ideals of Plan 9 is to compare its approach to network programming with that of the Unix standard, Berkeley sockets. BSD sockets fly in the face of Unix sensibilities and are quite alien on the system, though by now everyone has developed stockholm syndrome with respect to them so they don’t notice. When everything is supposed to be a file on Unix, why is it that the networking API is entirely implemented with special-purpose syscalls and ioctls? On Unix, creating a TCP connection involves calling the “socket” syscall to create a magic file descriptor, then the “connect” syscall to establish a connection. Plan 9 is much more Unix in its approach: you open /net/tcp/clone to reserve a connection, and read the connection ID from it. Then you open /net/tcp/n/ctl and write “connect 127.0.0.1!80” to it, where “n” is that connection ID. Now you can open /net/tcp/n/data and that file is a full-duplex stream. No magic syscalls, and you can trivially implement it in a shell script.

This composes elegantly with another idea from Plan 9: the 9P protocol. All file I/O on the entire system uses the 9P protocol, which defines operations like read and write. This protocol is network transparent, and you can mount remote servers into your filesystem namespace and access their files over 9P. You can do something similar on Unix, but on Plan 9 you get much more mileage from the idea because everything is actually a file, and there are no magic syscalls or ioctls. For instance, your Ethernet interface is at /net/ether0, and everything in there is just a file. Say you want to establish a VPN: you simply mount a remote server’s /net/ether0 at /net/ether1, and now you have a VPN. That’s it.

The mountpoints are interesting as well, because they exist within a per-process filesystem namespace. Mounting filesystems does not require special permissions like on Unix, because these mounts only exist within the process tree that creates them, rather than modifying global state. The filesystems can also be implemented in userspace rather trivially via the 9P protocol, similar to FUSE but much more straightforward. Many programs provide a programmable/scriptable interface via a special filesystem such as this.

Userspace programs can also provide filesystems compatible with those normally implemented by kernel drivers, like /net/ether0, and provide these to processes in their namespace. For example, /dev/draw is analogous to a framebuffer device: you open it to write pixels to the screen. The window manager, Rio, implements a /dev/draw-like interface in userspace, then mounts it in the filesystem namespace of its children. All GUI programs can thus be run both on a framebuffer or in a window, without any awareness of which it’s using. The same is also true over the network: to implement VNC-like functionality, just mount your local /dev/draw and /dev/kbd on a remote server. Add /dev/audio if you like.

These ideas can also be built upon to form something resembling a container runtime, pre-dating even early concepts like BSD jails by several years, and implementing them much more effectively. Recall that everything really is just a file on Plan 9, unlike Unix. Access to the hardware is provided through normal files, and per-process namespaces do not require special permissions to modify mountpoints. Making a container is thus trivial: just unmount all of the hardware you don’t want the sandboxed program to have access to. Done. You don’t even have to be root. Want to forward a TCP port? Write an implementation of /net/tcp which is limited to whatever ports you need — perhaps with just a hundred lines of shell scripting — and mount it into the namespace.

The shell, rc, is also wonderful. The debugger is terribly interesting, and its ideas didn’t seem to catch on with the likes of gdb. The editors, acme and sam, are also interesting and present a unique user interface that you can’t find anywhere else. The plumber is cool, it’s like “what if xdg-open was good actually”. The kernel is concise and a pleasure to read. The entire operating system, kernel and userspace, can be built from source code on my 12 year old laptop in about 5 minutes. The network database, ndb, is brilliant. The entire OS is stuffed to the brim with interesting ideas, all of them implemented with elegance, conciseness, and simplicity.

Plan 9 failed, in a sense, because Unix was simply too big and too entrenched by the time Plan 9 came around. It was doomed by its predecessor. Nevertheless, its design ideas and implementation resonate deeply with me, and have provided an endless supply of inspiration for my own work. I think that everyone owes it to themselves to spend a few weeks messing around with and learning about Plan 9. The dream is kept alive by 9front, which is the most actively maintained fork of Plan 9 available today. Install it on your ThinkPad and mess around.

I will offer a caveat, however: leave your expectations at the door. Plan 9 is not Unix, it is not Unix-compatible, and it is certainly not yet another Linux distribution. Everything you’re comfortable and familiar with in your normal Unix setup will not translate to Plan 9. Come to Plan 9 empty handed, and let it fill those hands with its ideas. You will come away from the experience as a better programmer.

Notes from kernel hacking in Hare, part 3: serial driver

Today I would like to show you the implementation of the first userspace driver for Helios: a simple serial driver. All of the code we’re going to look at today runs in userspace, not in the kernel, so strictly speaking this should be “notes from OS hacking in Hare”, but I won’t snitch if you don’t.

Note: In the previous entry to this series, I promised to cover the userspace threading API in this post. I felt like covering this instead. Sorry!

A serial port provides a simple protocol for transferring data between two systems. It generalizes a bit, but for our purposes we can just think of this as a terminal which you can use over a simple cable and a simple protocol. It’s a standard x86_64 feature (though one which has been out of style for a couple of decades now), and its simple design (and high utility) makes it a good choice for the first driver to write for Helios. We’re going to look at the following details today:

  1. The system’s initramfs
  2. The driver loader
  3. The serial driver itself

The initramfs used by Helios, for the time being, is just a tarball. I imported format::tar from the standard library, a module which I designed for this express purpose, and made a few minor tweaks to make it suitable for Helios’ needs. I also implemented seeking within a tar entry to make it easier to write an ELF loader from it. The bootloader loads this tarball into memory, the kernel provides page capabilities to init for it, and then we can map it into memory and study it, something like this:

let base = rt::map_range(rt::vspace, 0, 0, &desc.pages)!;
let slice = (base: *[*]u8)[..desc.length];

const buf = bufio::fixed(slice, io::mode::READ);
const rd = tar::read(&buf);

Pulling a specific driver out of it looks like this:

// Loads a driver from the bootstrap tarball.
fn earlyload(fs: *bootstrapfs, path: str) *process = {
	tar::reset(&fs.rd)!;
	path = strings::ltrim(path, '/');

	for (true) {
		const ent = match (tar::next(&fs.rd)) {
		case io::EOF =>
			break;
		case let ent: tar::entry =>
			yield ent;
		case let err: tar::error =>
			abort("Invalid bootstrap.tar file");
		};
		defer tar::skip(&ent)!;

		if (ent.name == path) {
			// TODO: Better error handling here
			const proc = match (load_driver(&ent)) {
			case let err: io::error =>
				abort("Failed to load driver from boostrap");
			case let err: errors::error =>
				abort("Failed to load driver from boostrap");
			case let proc: *process =>
				yield proc;
			};
			helios::task_resume(proc.task)!;
			return proc;
		};
	};

	abort("Missing bootstrap driver");
};

This code finds a file in the tarball with the given path (e.g. drivers/serial), creates a process with the driver loader, then resumes the thread and the driver is running. Let’s take a look at that driver loader next. The load_driver entry point takes an I/O handle to an ELF file and loads it:

fn load_driver(image: io::handle) (*process | io::error | errors::error) = {
	const loader = newloader(image);
	let earlyconf = driver_earlyconfig {
		cspace_radix = 12,
	};
	load_earlyconfig(&earlyconf, &loader)?;

	let proc = newprocess(earlyconf.cspace_radix)?;
	load(&loader, proc)?;
	load_config(proc, &loader)?;

	let regs = helios::context {
		rip = loader.header.e_entry,
		rsp = INIT_STACK_ADDR,
		...
	};
	helios::task_writeregisters(proc.task, &regs)?;
	return proc;
};

This is essentially a standard ELF loader, invoked via the more general “newprocess” and “load” functions, but drivers have an extra concern: the driver manifest. The “load_earlyconfig” function processes manifest keys which must be configured prior to loading the ELF image, and the “load_config” function takes care of the rest of the driver configuration. The remainder of the code configures the initial thread.

The actual driver manifest is an INI file which is embedded in a special ELF section in driver binaries. The manifest for the serial driver looks like this:

[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[cspace]
radix=12

[capabilities]
0:serial =
1:note = 
2:cspace = self
3:ioport = min=3F8, max=400
4:ioport = min=2E8, max=2F0
5:irq = irq=3, note=1
6:irq = irq=4, note=1

Helios is a capability-oriented system, and in order to do anything useful, each process needs to have capabilities to work with. Each driver declares exactly what capabilities it needs and receives only these capabilities, and nothing else. This provides stronger isolation than Unix systems can offer (even with something like OpenBSD’s pledge(2)) — this driver cannot even allocate memory.

A standard x86_64 ISA serial port uses two I/O port ranges, 0x3F8-0x400 and 0x2E8-0x2F0, as well as two IRQs, IRQ 3 and 4, together providing support for up to four serial ports. The driver first requests a “serial” capability, which is a temporary design for an IPC endpoint that the driver will use to actually process read or write requests. This will be replaced with a more sophisticated device manager system in the future. It also creates a notification capability, which is later used to deliver the IRQs, and requests a capability for its own cspace so that it can manage capability slots. This will be necessary later on. Following this it requests capabilities for the system resources it needs, namely the necessary I/O ports and IRQs, the latter configured to be delivered to the notification in capability slot 1.

With the driver isolated in its own address space, running in user mode, and only able to invoke this set of capabilities, it’s very limited in what kind of exploits it’s vulnerable to. If there’s a vulnerability here, the worst that could happen is that a malicious actor on the other end of the serial port could crash the driver, which would then be rebooted by the service manager. On Linux, a bug in the serial driver can be used to compromise the entire system.

So, the driver loader parses this file and allocates the requested capabilities for the driver. I’ll skip most of the code, it’s just a boring INI file parser, but the important bit is the table for capability allocations:

type capconfigfn = fn(
	proc: *process,
	addr: uint,
	config: const str,
) (void | errors::error);

// Note: keep these tables alphabetized
const capconfigtab: [_](const str, *capconfigfn) = [
	("cspace", &cap_cspace),
	("endpoint", &cap_endpoint),
	("ioport", &cap_ioport),
	("irq", &cap_irq),
	("note", &cap_note),
	("serial", &cap_serial),
	// TODO: More
];

This table defines functions which, when a given INI key in the [capabilities] section is found, provisions the requested capabilities. This list is not complete; in the future all kernel objects will be added as well as userspace-defined interfaces (similar to serial) which implement various driver interfaces, such as ‘fs’ or ‘gpu’. Let’s start with the notification capability:

fn cap_note(
	proc: *process,
	addr: uint,
	config: const str,
) (void | errors::error) = {
	if (config != "") {
		return errors::invalid;
	};
	const note = helios::newnote()?;
	defer helios::destroy(note)!;
	helios::copyto(proc.cspace, addr, note)?;
};

This capability takes no configuration arguments, so we first simply check that the value is empty. Then we create a notification, copy it into the driver’s capability space at the requested capability address, then destroy our copy. Simple!

The I/O port capability is a bit more involved: it does accept configuration parameters, namely what I/O port range the driver needs.

fn cap_ioport(
	proc: *process,
	addr: uint,
	config: const str,
) (void | errors::error) = {
	let min = 0u16, max = 0u16;
	let have_min = false, have_max = false;

	const tok = strings::tokenize(config, ",");
	for (true) {
		let tok = match (strings::next_token(&tok)) {
		case void =>
			break;
		case let tok: str =>
			yield tok;
		};
		tok = strings::trim(tok);

		const (key, val) = strings::cut(tok, "=");
		let field = switch (key) {
		case "min" =>
			have_min = true;
			yield &min;
		case "max" =>
			have_max = true;
			yield &max;
		case =>
			return errors::invalid;
		};

		match (strconv::stou16b(val, base::HEX)) {
		case let u: u16 =>
			*field = u;
		case =>
			return errors::invalid;
		};
	};

	if (!have_min || !have_max) {
		return errors::invalid;
	};

	const ioport = helios::ioctl_issue(rt::INIT_CAP_IOCONTROL, min, max)?;
	defer helios::destroy(ioport)!;
	helios::copyto(proc.cspace, addr, ioport)?;
};

Here we split the configuration string on commas and parse each as a key/value pair delimited by an equal sign (“=”), looking for a key called “min” and another called “max”. At the moment the config parsing is just implemented in this function directly, but in the future it might make sense to write a small abstraction for capability configurations like this. Once we know the I/O port range the user wants, we issue an I/O port capability for that range and copy it into the driver’s cspace.

IRQs are a bit more involved still. An IRQ capability must be configured to deliver IRQs to a notification object.

fn cap_irq(
	proc: *process,
	addr: uint,
	config: const str,
) (void | errors::error) = {
	let irq = 0u8, note: helios::cap = 0;
	let have_irq = false, have_note = false;

	// ...config string parsing omitted...

	const _note = helios::copyfrom(proc.cspace, note, helios::CADDR_UNDEF)?;
	defer helios::destroy(_note)!;
	const (ct, _) = rt::identify(_note)!;
	if (ct != ctype::NOTIFICATION) {
		// TODO: More semantically meaningful errors would be nice
		return errors::invalid;
	};

	const irq = helios::irqctl_issue(rt::INIT_CAP_IRQCONTROL, _note, irq)?;
	defer helios::destroy(irq)!;
	helios::copyto(proc.cspace, addr, irq)?;
};

In order to do this, the driver loader copies the notification capability from the driver’s cspace and into the loader’s cspace, then creates an IRQ with that notification. It copies the new IRQ capability into the driver, then destroys its own copy of the IRQ and notification.

In this manner, the driver can declaratively state which capabilities it needs, and the loader can prepare an environment for it with these capabilities prepared. Once these capabilities are present in the driver’s cspace, the driver can invoke them by addressing the numbered capability slots in a send or receive syscall.

To summarize, the loader takes an I/O object (which we know is sourced from the bootstrap tarball) from which an ELF file can be read, finds a driver manifest, then creates a process and fills the cspace with the requested capabilities, loads the program into its address space, and starts the process.

Next, let’s look at the serial driver that we just finished loading.

Let me first note that this serial driver is a proof-of-concept at this time. A future serial driver will take a capability for a device manager object, then probe each serial port and provision serial devices for each working serial port. It will define an API which supports additional serial-specific features, such as configuring the baud rate. For now, it’s pretty basic.

This driver implements a simple event loop:

  1. Configure the serial port
  2. Wait for an interrupt or a read/write request from the user
  3. On interrupt, process the interrupt, writing buffered data or buffering readable data
  4. On a user request, buffer writes or unbuffer reads
  5. GOTO 2

The driver starts by defining some constants for the capability slots we set up in the manifest:

def EP: helios::cap = 0;
def IRQ: helios::cap = 1;
def CSPACE: helios::cap = 2;
def IRQ3: helios::cap = 5;
def IRQ4: helios::cap = 6;

It also defines some utility code for reading and writing to the COM registers, and constants for each of the registers defined by the interface.

// COM1 port
def COM1: u16 = 0x3F8;

// COM2 port
def COM2: u16 = 0x2E8;

// Receive buffer register
def RBR: u16 = 0;

// Transmit holding register
def THR: u16 = 0;

// ...other registers omitted...

const ioports: [_](u16, helios::cap) = [
	(COM1, 3), // 3 is the I/O port capability address
	(COM2, 4),
];

fn comin(port: u16, register: u16) u8 = {
	for (let i = 0z; i < len(ioports); i += 1) {
		const (base, cap) = ioports[i];
		if (base != port) {
			continue;
		};
		return helios::ioport_in8(cap, port + register)!;
	};
	abort("invalid port");
};

fn comout(port: u16, register: u16, val: u8) void = {
	for (let i = 0z; i < len(ioports); i += 1) {
		const (base, cap) = ioports[i];
		if (base != port) {
			continue;
		};
		helios::ioport_out8(cap, port + register, val)!;
		return;
	};
	abort("invalid port");
};

We also define some statically-allocated data structures to store state for each COM port, and a function to initialize the port:

type comport = struct {
	port: u16,
	rbuf: [4096]u8,
	wbuf: [4096]u8,
	rpending: []u8,
	wpending: []u8,
};

let ports: [_]comport = [
	comport { port = COM1, ... },
	comport { port = COM2, ... },
];

fn com_init(com: *comport) void = {
	com.rpending = com.rbuf[..0];
	com.wpending = com.wbuf[..0];
	comout(com.port, IER, 0x00);	// Disable interrupts
	comout(com.port, LCR, 0x80);	// Enable divisor mode
	comout(com.port, DL_LSB, 0x01);	// Div Low:  01: 115200 bps
	comout(com.port, DL_MSB, 0x00);	// Div High: 00
	comout(com.port, LCR, 0x03);	// Disable divisor mode; 8 bits, no parity, 1 stop
	comout(com.port, FCR, 0xC7);	// Enable FIFO and clear
	comout(com.port, IER, ERBFI);	// Enable read interrupt
};

The basics are in place. Let’s turn our attention to the event loop.

export fn main() void = {
	com_init(&ports[0]);
	com_init(&ports[1]);

	helios::irq_ack(IRQ3)!;
	helios::irq_ack(IRQ4)!;

	let poll: [_]pollcap = [
		pollcap { cap = IRQ, events = pollflags::RECV },
		pollcap { cap = EP, events = pollflags::RECV },
	];
	for (true) {
		helios::poll(poll)!;
		if (poll[0].events & pollflags::RECV != 0) {
			poll_irq();
		};
		if (poll[1].events & pollflags::RECV != 0) {
			poll_endpoint();
		};
	};
};

We initialize two COM ports first, using the function we were just reading. Then we ACK any IRQs that might have already been pending when the driver starts up, and we enter the event loop proper. Here we are polling on two capabilities, the notification to which IRQs are delivered, and the endpoint which provides the serial driver’s external API.

The state for each serial port includes a read buffer and a write buffer, defined in the comport struct shown earlier. We configure the COM port to interrupt when there’s data available to read, then pull it into the read buffer. If we have pending data to write, we configure it to interrupt when it’s ready to write more data, otherwise we leave this interrupt turned off. The “poll_irq” function handles these interrupts:

fn poll_irq() void = {
	helios::wait(IRQ)!;
	defer helios::irq_ack(IRQ3)!;
	defer helios::irq_ack(IRQ4)!;

	for (let i = 0z; i < len(ports); i += 1) {
		const iir = comin(ports[i].port, IIR);
		if (iir & 1 == 0) {
			port_irq(&ports[i], iir);
		};
	};
};

fn port_irq(com: *comport, iir: u8) void = {
	if (iir & (1 << 2) != 0) {
		com_read(com);
	};
	if (iir & (1 << 1) != 0) {
		com_write(com);
	};
};

The IIR register is the “interrupt identification register”, which tells us why the interrupt occurred. If it was because the port is readable, we call “com_read”. If the interrupt occurred because the port is writable, we call “com_write”. Let’s start with com_read. This interrupt is always enabled so that we can immediately start buffering data as the user types it into the serial port.

// Reads data from the serial port's RX FIFO.
fn com_read(com: *comport) size = {
	let n: size = 0;
	for (comin(com.port, LSR) & RBF == RBF; n += 1) {
		const ch = comin(com.port, RBR);
		if (len(com.rpending) < len(com.rbuf)) {
			// If the buffer is full we just drop chars
			static append(com.rpending, ch);
		};
	};

	// This part will be explained later:
	if (pending_read.reply != 0) {
		const n = rconsume(com, pending_read.buf);
		helios::send(pending_read.reply, 0, n)!;
		pending_read.reply = 0;
	};

	return n;
};

This code is pretty simple. For as long as the COM port is readable, read a character from it. If there’s room in the read buffer, append this character to it.

How about writing? Well, we need some way to fill the write buffer first. This part is pretty straightforward:

// Append data to a COM port write buffer, returning the number of bytes
// buffered successfully.
fn com_wbuffer(com: *comport, data: []u8) size = {
	let z = len(data);
	if (z + len(com.wpending) > len(com.wbuf)) {
		z = len(com.wbuf) - len(com.wpending);
	};
	static append(com.wpending, data[..z]...);
	com_write(com);
	return z;
};

This code just adds data to the write buffer, making sure not to exceed the buffer length (note that in Hare this would cause an assertion, not a buffer overflow). Then we call “com_write”, which does the actual writing to the COM port.

// Writes data to the serial port's TX FIFO.
fn com_write(com: *comport) size = {
	if (comin(com.port, LSR) & THRE != THRE) {
		const ier = comin(com.port, IER);
		comout(com.port, IER, ier | ETBEI);
		return 0;
	};

	let i = 0z;
	for (i < 16 && len(com.wpending) != 0; i += 1) {
		comout(com.port, THR, com.wpending[0]);
		static delete(com.wpending[0]);
	};

	const ier = comin(com.port, IER);
	if (len(com.wpending) == 0) {
		comout(com.port, IER, ier & ~ETBEI);
	} else {
		comout(com.port, IER, ier | ETBEI);
	};

	return i;
};

If the COM port is not ready to write data, we enable an interrupt which will tell us when it is and return. Otherwise, we write up to 16 bytes — the size of the COM port’s FIFO — and remove them from the write buffer. If there’s more data to write, we enable the write interrupt, or we disable it if there’s nothing left. When enabled, this will cause an interrupt to fire when (1) we have data to write and (2) the serial port is ready to write it, and our event loop will call this function again.

That covers all of the code for driving the actual serial port. What about the interface for someone to actually use this driver?

The “serial” capability defined in the manifest earlier is a temporary construct to provision some means of communicating with the driver. It provisions an endpoint capability (which is an IPC primitive on Helios) and stashes it away somewhere in the init process so that I can write some temporary test code to actually read or write to the serial port. Either request is done by “call”ing the endpoint with the desired parameters, which will cause the poll in the event loop to wake as the endpoint becomes receivable, calling “poll_endpoint”.

fn poll_endpoint() void = {
	let addr = 0u64, amt = 0u64;
	const tag = helios::recv(EP, &addr, &amt);
	const label = rt::label(tag);
	switch (label) {
	case 0 =>
		const addr = addr: uintptr: *[*]u8;
		const buf = addr[..amt];
		const z = com_wbuffer(&ports[0], buf);
		helios::reply(0, z)!;
	case 1 =>
		const addr = addr: uintptr: *[*]u8;
		const buf = addr[..amt];
		if (len(ports[0].rpending) == 0) {
			const reply = helios::store_reply(helios::CADDR_UNDEF)!;
			pending_read = read {
				reply = reply,
				buf = buf,
			};
		} else {
			const n = rconsume(&ports[0], buf);
			helios::reply(0, n)!;
		};
	case =>
		abort(); // TODO: error
	};
};

“Calls” in Helios work similarly to seL4. Essentially, when you “call” an endpoint, the calling thread blocks to receive the reply and places a reply capability in the receiver’s thread state. The receiver then processes their message and “replies” to the reply capability to wake up the calling thread and deliver the reply.

The message label is used to define the requested operation. For now, 0 is write and 1 is read. For writes, we append the provided data to the write buffer and reply with the number of bytes we buffered, easy breezy.

Reads are a bit more involved. If we don’t immediately have any data in the read buffer, we have to wait until we do to reply. We copy the reply from its special slot in our thread state into our capability space, so we can use it later. This operation is why our manifest requires cspace = self. Then we store the reply capability and buffer in a variable and move on, waiting for a read interrupt. On the other hand, if there is data buffered, we consume it and reply immediately.

fn rconsume(com: *comport, buf: []u8) size = {
	let amt = len(buf);
	if (amt > len(com.rpending)) {
		amt = len(com.rpending);
	};
	buf[..amt] = com.rpending[..amt];
	static delete(com.rpending[..amt]);
	return amt;
};

Makes sense?
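
One definition that doesn’t appear in the snippets above is pending_read, which both com_read and poll_endpoint refer to. Its declaration isn’t shown in this post, but judging from the call sites it presumably looks something like the following sketch (the field types are inferred, not copied from the driver):

// State for a read request which could not be satisfied immediately: the
// stored reply capability (zero when no read is pending) and the caller's
// buffer to fill once data arrives on the serial port.
type read = struct {
	reply: helios::cap,
	buf: []u8,
};

let pending_read: read = read { reply = 0, ... };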

That basically covers the entire serial driver. Let’s take a quick peek at the other side: the process which wants to read from or write to the serial port. For the time being this is all temporary code to test the driver with, and not the long-term solution for passing out devices to programs. The init process keeps a list of serial devices configured on the system:

type serial = struct {
	proc: *process,
	ep: helios::cap,
};

let serials: []serial = [];

fn register_serial(proc: *process, ep: helios::cap) void = {
	append(serials, serial {
		proc = proc,
		ep = ep,
	});
};

This function is called by the driver manifest parser like so:

fn cap_serial(
	proc: *process,
	addr: uint,
	config: const str,
) (void | errors::error) = {
	if (config != "") {
		return errors::invalid;
	};
	const ep = helios::newendpoint()?;
	helios::copyto(proc.cspace, addr, ep)?;
	register_serial(proc, ep);
};

We make use of the serial port in the init process’s main function with a little test loop to echo reads back to writes:

export fn main(bi: *rt::bootinfo) void = {
	log::println("[init] Hello from Mercury!");

	const bootstrap = bootstrapfs_init(&bi.modules[0]);
	defer bootstrapfs_finish(&bootstrap);
	earlyload(&bootstrap, "/drivers/serial");

	log::println("[init] begin echo serial port");
	for (true) {
		let buf: [1024]u8 = [0...];
		const n = serial_read(buf);
		serial_write(buf[..n]);
	};
};

The “serial_read” and “serial_write” functions are:

fn serial_write(data: []u8) size = {
	assert(len(data) <= rt::PAGESIZE);
	const page = helios::newpage()!;
	defer helios::destroy(page)!;
	let buf = helios::map(rt::vspace, 0, map_flags::W, page)!: *[*]u8;
	buf[..len(data)] = data[..];
	helios::page_unmap(page)!;

	// TODO: Multiple serial ports
	const port = &serials[0];
	const addr: uintptr = 0x7fff70000000; // XXX arbitrary address
	helios::map(port.proc.vspace, addr, 0, page)!;

	const reply = helios::call(port.ep, 0, addr, len(data));
	return rt::ipcbuf.params[0]: size;
};

fn serial_read(buf: []u8) size = {
	assert(len(buf) <= rt::PAGESIZE);
	const page = helios::newpage()!;
	defer helios::destroy(page)!;

	// TODO: Multiple serial ports
	const port = &serials[0];
	const addr: uintptr = 0x7fff70000000; // XXX arbitrary address
	helios::map(port.proc.vspace, addr, map_flags::W, page)!;

	const (label, n) = helios::call(port.ep, 1, addr, len(buf));

	helios::page_unmap(page)!;

	let out = helios::map(rt::vspace, 0, 0, page)!: *[*]u8;
	buf[..n] = out[..n];
	return n;
};

There is something interesting going on here. Part of this code is fairly obvious — we just invoke the IPC endpoint using helios::call, corresponding nicely to the other end’s use of helios::reply, with the buffer address and size. However, the buffer address presents a problem: this buffer is in the init process’s address space, so the serial driver cannot read or write to it!

In the long term, a more sophisticated approach to shared memory management will be developed, but for testing purposes I came up with this solution. For writes, we allocate a new page, map it into our address space, and copy the data we want to write to it. Then we unmap it, map it into the serial driver’s address space instead, and perform the call. For reads, we allocate a page, map it into the serial driver, call the IPC endpoint, then unmap it from the serial driver, map it into our address space, and copy the data back out of it. In both cases, we destroy the page upon leaving this function, which frees the memory and automatically unmaps the page from any address space. Inefficient, but it works for demonstration purposes.

And that’s really all there is to it! Helios officially has its first driver. The next step is to develop a more robust solution for describing capability interfaces and device APIs, then build a PS/2 keyboard driver and a BIOS VGA mode 3 driver for driving the BIOS console, and combine these plus the serial driver into a tty on which we can run a simple shell.

TOTP for 2FA is incredibly easy to implement. So what's your excuse?

Time-based one-time passwords are one of the more secure approaches to 2FA — certainly much better than SMS. And it’s much easier to implement than SMS as well. The algorithm is as follows:

  1. Divide the current Unix timestamp by 30
  2. Encode it as a 64-bit big endian integer
  3. Write the encoded bytes to a SHA-1 HMAC initialized with the TOTP shared key
  4. Let offs = hmac[-1] & 0xF
  5. Let hash = decode hmac[offs .. offs + 4] as a 32-bit big-endian integer
  6. Let code = (hash & 0x7FFFFFFF) % 1000000
  7. Compare this code with the user’s code

You’ll need a little dependency to generate QR codes with the otpauth:// URL scheme, a little UI to present the QR code and store the shared secret in your database, and a quick update to your login flow, and then you’re good to go.

Here’s the implementation SourceHut uses in Python. I hereby release this code into the public domain, or creative commons zero, at your choice:

import base64
import hashlib
import hmac
import struct
import time

def totp(secret, token):
    tm = int(time.time() / 30)
    key = base64.b32decode(secret)

    for ix in range(-2, 3):
        b = struct.pack(">q", tm + ix)
        hm = hmac.HMAC(key, b, hashlib.sha1).digest()
        offset = hm[-1] & 0x0F
        truncatedHash = hm[offset:offset + 4]
        code = struct.unpack(">L", truncatedHash)[0]
        code &= 0x7FFFFFFF
        code %= 1000000
        if token == code:
            return True

    return False

This implementation has a bit of a tolerance added to make clock skew less of an issue, but that also means that the codes are longer-lived. Feel free to edit these tolerances if you so desire.

Here’s another one written in Hare, also public domain/CC-0.

use crypto::hmac;
use crypto::mac;
use crypto::sha1;
use encoding::base32;
use endian;
use time;

// Computes a TOTP code for a given time and key.
export fn totp(when: time::instant, key: []u8) uint = {
	const now = time::unix(when) / 30;
	const hmac = hmac::sha1(key);
	defer mac::finish(&hmac);

	let buf: [8]u8 = [0...];
	endian::beputu64(buf, now: u64);
	mac::write(&hmac, buf);

	let mac: [sha1::SIZE]u8 = [0...];
	mac::sum(&hmac, mac);

	const offs = mac[len(mac) - 1] & 0xF;
	const hash = mac[offs..offs+4];
	return ((endian::begetu32(hash) & 0x7FFFFFFF) % 1000000): uint;
};

@test fn totp() void = {
	const secret = "3N2OTFHXKLR2E3WNZSYQ====";
	const key = base32::decodestr(&base32::std_encoding, secret)!;
	defer free(key);
	const now = time::from_unix(1650183739);
	assert(totp(now, key) == 29283);
};
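
The Hare version computes the code for a single instant; to accept the same clock skew as the Python version, a login flow can try a few adjacent 30-second periods. Here’s a rough sketch along those lines, reusing the totp function above (the function name and window size are mine, not part of the original implementation):

// Checks a user-supplied code against a window of ±2 periods around "when".
export fn totp_verify(when: time::instant, key: []u8, code: uint) bool = {
	const now = time::unix(when);
	for (let skew: i64 = -2; skew <= 2; skew += 1) {
		const t = time::from_unix(now + skew * 30);
		if (totp(t, key) == code) {
			return true;
		};
	};
	return false;
};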

In any language, TOTP is just a couple of dozen lines of code even if there isn’t already a library — and there is probably already a library. You don’t have to store temporary SMS codes in the database, you don’t have to worry about phishing, you don’t have to worry about SIM swapping, and you don’t have to sign up for some paid SMS API like Twilio. It’s more secure and it’s trivial to implement — so implement it already! Please!


Update 2022-10-19 @ 07:45 UTC: A reader pointed out that it’s important to have rate limiting on your TOTP attempts, or else a brute force attack can be effective. Fair point!

Status update, October 2022

After a few busy and stressful months, I decided to set aside October to rest. Of course, for me, rest does not mean a cessation of programming, but rather a shift in priorities towards more fun and experimental projects. Consequently, it has been a great month for Helios!

Hare upstream has enjoyed some minor improvements, such as Pierre Curto’s patch to support parsing IPv6 addresses with a port (e.g. “[::1]:80”) and Kirill Primak’s improvements to the UTF-8 decoder. On the whole, improvements have been conservative. However, support for @threadlocal variables, which are useful for Helios and for ABI compatibility with C, is queued up for integration once the necessary qbe support is merged upstream. I also drafted up a proof-of-concept for @inline functions, but it still needs work.

Now for the main event: Helios. The large-scale redesign and refactoring I mentioned in the previous status update is essentially complete, and the kernel reached (and exceeded) feature parity with the previous status quo. Since Helios has been my primary focus for the past couple of weeks, I have a lot of news to share about it.

First, I got back into userspace a few days after the last status update, and shortly thereafter implemented a new scheduler. I then began to rework the userspace API (uapi) in the kernel, which differs substantially from its prior incarnation. The kernel object implementations present themselves as a library for kernel use, and the new uapi module handles all interactions with these objects from userspace, providing a nice separation of concerns. The uapi module handles more than syscalls now — it also implements send/recv for kernel objects, for instance. As of a few days ago, uapi also supports delivering faults to userspace supervisor processes:

A screenshot of a thread on Helios causing a page fault, then its parent thread receives details of the fault and maps a page onto the address of the attempted write. The child thread is resumed and is surprised to find that the write succeeded (because a page was mapped underneath the write).

@test fn task::pagefault() void = {
	const fault = helios::newendpoint()!;
	defer helios::destroy(fault)!;

	const thread = threads::new(&_task_pagefault)!;
	threads::set_fault(thread, fault)!;
	threads::start(thread)!;

	const fault = helios::recv_fault(fault);
	assert(fault.addr == 0x100);

	const page = helios::newpage()!;
	defer helios::destroy(page)!;
	helios::map(rt::vspace, 0, map_flags::W | map_flags::FIXED, page)!;

	threads::resume(thread)!;
	threads::join(thread)!;
};

fn _task_pagefault() void = {
	let ptr: *int = 0x100: uintptr: *int;
	*ptr = 1337;
	assert(*ptr == 1337);
};

The new userspace threading API is much improved over the hack job in the earlier design. It supports TLS and many typical threading operations, such as join and detach. This API exists mainly for testing the kernel via Vulcan, and is not anticipated to see much use beyond this (though I will implement pthreads for the POSIX C environment at some point). For more details, see this blog post. Alongside this and other userspace libraries, Vulcan has been fleshed out into a kernel test suite once again, which I have been frequently testing on real hardware:

A picture of a laptop showing 15 passing kernel tests

Here’s an ISO you can boot on your own x86_64 hardware to see if it works for you, too. If you have problems, take a picture of the issue, boot Linux and email me said picture, the output of lscpu, and any other details you deem relevant.

The kernel now supports automatic capability address allocation, which is a marked improvement over seL4. The new physical page allocator is also much improved, as it supports allocation and freeing and can allocate pages either sparsely or contiguously depending on the need. Mapping these pages in userspace was also much improved, with a better design of the userspace virtual memory map and a better heap, complete with a (partial) implementation of mmap.

I have also broken ground on the next component of the OS, Mercury, which provides a more complete userspace environment for writing drivers. It has a simple tar-based initramfs based on Hare’s format::tar implementation, which I wrote in June for this purpose. It can load ELF files from this tarball into new processes, and implements some extensions that are useful for driver loading. Consequently, the first Mercury driver is up and running:

Demo of a working serial driver

This driver includes a simple driver manifest, which is embedded into its ELF file and processed by the driver loader to declaratively specify the capabilities it needs:

[driver]
name=pcserial
desc=Serial driver for x86_64 PCs

[cspace]
radix=12

[capabilities]
0:endpoint =
1:ioport = min=3F8, max=400
2:ioport = min=2E8, max=2F0
3:note = 
4:irq = irq=3, note=3
5:irq = irq=4, note=3

The driver loader prepares capabilities for the COM1 and COM2 I/O ports, as well as IRQ handlers for IRQ 3 and 4, based on this manifest, then loads them into the capability table for the driver process. The driver is sandboxed very effectively by this: it can only use these capabilities. It cannot allocate memory, modify its address space, or even destroy any of these capabilities. If a bad actor was on the other end of the serial port and exploited a bug, the worst thing it could do is crash the serial driver, which would then be rebooted by the supervisor. On Linux and other monolithic kernels like it, exploiting the serial driver compromises the entire operating system.

The resulting serial driver implementation is pretty small and straightforward, if you’d like to have a look.

This manifest format will be expanded in the future for additional kinds of drivers, such as details specific to each bus (e.g. PCI vendor information or USB details), and will also include details for device trees when RISC-V and ARM support (the former is already underway) are brought upstream.

Next steps are to implement an I/O abstraction on top of IPC endpoints, which first requires call & reply support — the latter was implemented last night and requires additional testing. Following this, I plan on writing a getty-equivalent which utilizes this serial driver, and a future VGA terminal driver, to provide an environment in which a shell can be run. Then I’ll implement a ramfs to host commands for the shell to run, and we’ll really be cookin’ at that point. Disk drivers and filesystem drivers will be next.

That’s all for now. Quite a lot of progress! I’ll see you next time.

In praise of ffmpeg

My last “In praise of” article covered qemu, a project founded by Fabrice Bellard, and today I want to take a look at another work by Bellard: ffmpeg. Bellard has a knack for building high-quality software which solves a problem so well that every other solution becomes obsolete shortly thereafter, and ffmpeg is no exception.

ffmpeg has been described as the Swiss army knife of multimedia. It incorporates hundreds of video, audio, and image decoders and encoders, muxers and demuxers, filters and devices. It provides a CLI and a set of libraries for working with its tools, and is the core component of many video and audio players as a result (including my preferred multimedia player, mpv). If you want to do almost anything with multimedia files — re-encode them, re-mux them, live-stream them, whatever — ffmpeg can handle it with ease.

Let me share an example.

I was recently hanging out at my local hackerspace and wanted to play some PS2 games on my laptop. My laptop is not powerful enough to drive PCSX2, but my workstation on the other side of town certainly is. So I forwarded my game controller to my workstation via USB/IP and pulled up the ffmpeg manual to figure out how to live-stream the game to my laptop. ffmpeg can capture video from KMS buffers directly, use the GPU to efficiently downscale the frames, grab audio from pulse, encode everything with settings tuned for low latency, and mux the result into a UDP socket. On the other end I set up mpv to receive the stream and play it back.

ffmpeg \
  -f pulse \
  -i alsa_output.platform-snd_aloop.0.analog-surround-51.monitor \
  -f kmsgrab \
  -thread_queue_size 64 \   # reduce input latency
  -i - \
  # Capture and downscale frames on the GPU:
  -vf 'hwmap=derive_device=vaapi,scale_vaapi=1280:720,hwdownload,format=bgr0' \
  -c:v libx264 \
  -preset:v superfast \     # encode video as fast as possible
  -tune zerolatency \       # tune encoder for low latency
  -intra-refresh 1 \        # reduces latency and mitigates dropped packets
  -f mpegts \               # mux into mpegts stream, well-suited to this use-case
  -b:v 3M \                 # configure target video bandwidth
  udp://$hackerspace:41841

With an hour of tinkering and reading man pages, I was able to come up with a single command which produced a working remote video game streaming setup from scratch thanks to ffmpeg. ffmpeg is amazing.

I have relied on ffmpeg for many tasks and for many years. It has always been there to handle any little multimedia-related task I might put it to for personal use — re-encoding audio files so they fit on my phone, taking clips from videos to share, muxing fonts into mkv files, capturing video from my webcam, live streaming hacking sessions on my own platform, or anything else I can imagine. It formed the foundation of MediaCrush back in the day, where we used it to optimize multimedia files for efficient viewing on the web, back when that was more difficult than “just transcode it to a webm”.

ffmpeg is notable for being one of the first large-scale FOSS projects to completely eradicate proprietary software in its niche. Virtually all multimedia-related companies rely on ffmpeg to do their heavy lifting. It took a complex problem and solved it, with free software. The book is now closed on multimedia: ffmpeg is the solution to almost all of your problems. And if it’s not, you’re more likely to patch ffmpeg than to develop something new. The code is accessible and the community are experts in your problem domain.

ffmpeg is one of the foremost pillars of achievement in free software. It has touched the lives of every reader, whether they know it or not. If you’ve ever watched TV, or gone to a movie, or watched videos online, or listened to a podcast, odds are that ffmpeg was involved in making it possible. It is one of the most well-executed and important software projects of all time.

Does Rust belong in the Linux kernel?

I am known to be a bit of a polemic when it comes to Rust. I will be forthright with the fact that I don’t particularly care for Rust, and that my public criticisms of it might set up many readers with a reluctance to endure yet another Rust Hot Take from my blog. My answer to the question posed in the title is, of course, “no”. However, let me assuage some of your fears by answering a different question first: does Hare belong in the Linux kernel?

If I should owe my allegiance to any programming language, it would be Hare. Not only is it a systems programming language that I designed myself, but I am using it to write a kernel. Like Rust, Hare is demonstrably useful for writing kernels with. One might even go so far as to suggest that I consider it superior to C for this purpose, given that I chose to write Helios in Hare rather than C, despite my extensive background in C. But the question remains: does Hare belong in the Linux kernel?

In my opinion, Hare does not belong in the Linux kernel, and neither does Rust. Some of the reasoning behind this answer is common to both, and some is unique to each, but I will be focusing on Rust today because Rust is the language which is actually making its way towards mainline Linux. I have no illusions about this blog post changing that, either: I simply find it an interesting case-study in software engineering decision-making in a major project, and that’s worth talking about.

Each change in software requires sufficient supporting rationale. What are the reasons to bring Rust into Linux? A kernel hacker thinks about these questions differently than a typical developer in userspace. One could espouse the advantages of Cargo, generics, whatever, but these concerns matter relatively little to kernel hackers. Kernels operate in a heavily constrained design space and a language has to fit into that design space. This is the first and foremost concern, and if it’s awkward to mold a language to fit into these constraints then it will be a poor fit.

Some common problems that a programming language designed for userspace will run into when being considered for kernelspace are:

  • Strict constraints on memory allocation
  • Strict constraints on stack usage
  • Strict constraints on recursion
  • No use of floating point arithmetic
  • Necessary evils, such as unsafe memory use patterns or integer overflow
  • The absence of a standard library, runtime, third-party libraries, or other conveniences typically afforded to userspace

Most languages can overcome these constraints with some work, but their suitability for kernel use is mainly defined by how well they adapt to them — there’s a reason that kernels written in Go, C#, Java, Python, etc, are limited to being research curiosities and are left out of production systems.

As Linus recently put it, “kernel needs trump any Rust needs”. The kernel is simply not an environment which will bend to accommodate a language; it must go the other way around. These constraints have posed, and will continue to pose, a major challenge for Rust in Linux, but on the whole, I think that it will be able to rise to meet them, though perhaps not with as much grace as I would like.

If Rust is able to work within these constraints, then it satisfies the ground rules for playing in ring 0. The question then becomes: what advantages can Rust bring to the kernel? Based on what I’ve seen, these essentially break down to two points:1

  1. Memory safety
  2. Trendiness

I would prefer not to re-open the memory safety flamewar, so we will simply move forward with the (dubious) assumptions that memory safety is (1) unconditionally desirable, (2) compatible with the kernel’s requirements, and (3) sufficiently provided for by Rust. I will offer this quote from an unnamed kernel hacker, though:

There are possibly some well-designed and written parts which have not suffered a memory safety issue in many years. It’s insulting to present this as an improvement over what was achieved by those doing all this hard work.

Regarding “trendiness”, I admit that this is a somewhat unforgiving turn of phrase. In this respect I refer to the goal of expanding the kernel’s developer base from a bunch of aging curmudgeons writing C2 towards a more inclusive developer pool from a younger up-and-coming language community like Rust. C is boring3 — it hasn’t really excited anyone in decades. Rust is exciting, and its community enjoys a huge pool of developers building their brave new world with it. Introducing Rust to the kernel will certainly appeal to a broader audience of potential contributors.

But there is an underlying assumption to this argument which is worth questioning: is the supply of Linux developers dwindling, and, if so, is it to such an extent that it demands radical change?

Well, no. Linux has consistently enjoyed a tremendous amount of attention from the software development community. This week’s release of Linux 6.0, one of the largest Linux releases ever, boasted more than 78,000 commits by almost 5,000 different authors since 5.15. Linux has a broad developer base reaching from many different industry stakeholders and independent contributors working on the careful development and maintenance of its hundreds of subsystems. The scale of Linux development is on a level unmatched by any other software project — free software or otherwise.

Getting Rust working in Linux is certainly an exciting project, and I’m all for developers having fun. However, it’s not likely to infuse Linux with a much-needed boost in its contributor base, because Linux has no such need. What’s more, Linux’s portability requirements prevent Rust from being used in most of the kernel in the first place. Most work on Rust in Linux is simply working on getting the systems to cooperate with each other or writing drivers which are redundant with existing C drivers, but cannot replace them due to Rust’s limited selection of targets.4 Few to none of the efforts from the Rust-in-Linux team are likely to support the kernel’s broader goals for some time.

We are thus left with memory safety as the main benefit offered by Rust to Linux, and for the purpose of this article we’re going to take it at face value. So, with the ground rules set and the advantages enumerated, what are some of the problems that Rust might face in Linux?

There are a few problems which could be argued over, such as the substantial complexity of Rust compared to C, the inevitable doubling of Linux’s build time, the significant shift in design sensibilities required to support an idiomatic Rust design, the fragile interface which will develop on the boundaries between Rust and C code, or the challenges the kernel’s established base of C developers will endure when learning and adapting to a new language. To avoid letting this post become too subjective or lengthy, I’ll refrain from expanding on these. Instead, allow me to simply illuminate these issues as risk factors.

Linux is, on the whole, a conservative project. It is deployed worldwide in billions of devices and its reliability is depended on by a majority of Earth’s population. Risks are carefully evaluated in Linux as such. Every change presents risks and offers advantages, which must be weighed against each other to justify the change. Rust is one of the riskiest bets Linux has ever considered, and, in my opinion, the advantages may not weigh up. I think that the main reason we’re going to see Rust in the kernel is not due to a careful balancing of risk and reward, but because the Rust community wants Rust in Linux, and they’re large and loud enough to not be worth the cost of arguing with.

I don’t think that changes on this scale are appropriate for most projects. I prefer to encourage people to write new software to replace established software, rather than rewriting the established software. Some projects, such as Redox, are doing just that with Rust. However, operating systems are in a difficult spot in this respect. Writing an operating system is difficult work with a huge scope — few projects can hope to challenge Linux on driver support, for example. The major players have been entrenched for decades, and any project seeking to displace them will have decades of hard work ahead of them and will require a considerable amount of luck to succeed. Though I think that new innovations in kernels are badly overdue, I must acknowledge that there is some truth to the argument that we’re stuck with Linux. In this framing, if you want Rust to succeed in a kernel, getting it into Linux is the best strategy.

But, on the whole, my opinion is that the benefits of Rust in Linux are negligible and the costs are not. That said, it’s going to happen, and the impact to me is likely to be, at worst, a nuisance. Though I would have chosen differently, I wish them the best of luck and hope to see them succeed.

Notes from kernel hacking in Hare, part 2: multi-threading

I have long promised that Hare would not have multi-threading, and it seems that I have broken that promise. However, I have remained true to the not-invented-here approach which is typical of my style by introducing it only after designing an entire kernel to implement it on top of.1

For some background, Helios is a micro-kernel written in Hare. Alongside the kernel, the Vulcan system is a small userspace designed to test it.

A picture of a laptop running Helios and showing the results of the Vulcan test suite

While I don’t anticipate multi-threaded processes playing a huge role in the complete Ares system in the future, they do have a place. In the long term, I would like to be able to provide an implementation of pthreads for porting existing software to the system. A more immediate concern is how to test the various kernel primitives provided by Helios, such as those which facilitate inter-process communication (IPC). It’s much easier to test these with threads than with processes, since spawning threads does not require spinning up a new address space.

@test fn notification::wait() void = {
	const note = helios::newnote()!;
	defer helios::destroy(note)!;

	const thread = threads::new(&notification_wait, note)!;
	threads::start(thread)!;
	defer threads::join(thread)!;

	helios::signal(note)!;
};

fn notification_wait(note: u64) void = {
	const note = note: helios::cap;
	helios::wait(note)!;
};

So how does it work? Let’s split this up into two domains: kernelspace and userspace.

Threads in the kernel

The basic primitive for threads and processes in Helios is a “task”, which is simply an object which receives some CPU time. A task has a capability space (so it can invoke operations against kernel objects), a virtual address space (so it has somewhere to map the process image and memory), and some state, such as the values of its CPU registers. The task-related structures are:

// A task capability.
export type task = struct {
	caps::capability,
	state: uintptr,
	@offset(caps::LINK_OFFS) link: caps::link,
};

// Scheduling status of a task.
export type task_status = enum {
	ACTIVE,
	BLOCKED, // XXX: Can a task be both blocked and suspended?
	SUSPENDED,
};

// State for a task.
export type taskstate = struct {
	regs: arch::state,
	cspace: caps::cslot,
	vspace: caps::cslot,
	ipc_buffer: uintptr,
	status: task_status,
	// XXX: This is a virtual address, should be physical
	next: nullable *taskstate,
	prev: nullable *taskstate,
};

Here’s a footnote to explain some off-topic curiosities about this code: 2

The most interesting part of this structure is arch::state, which stores the task’s CPU registers. On x86_64,3 this structure is defined as follows:

export type state = struct {
	fs: u64,
	fsbase: u64,

	r15: u64,
	r14: u64,
	r13: u64,
	r12: u64,
	r11: u64,
	r10: u64,
	r9: u64,
	r8: u64,
	rbp: u64,
	rdi: u64,
	rsi: u64,
	rdx: u64,
	rcx: u64,
	rbx: u64,
	rax: u64,

	intno: u64,
	errcode: u64,

	rip: u64,
	cs: u64,
	rflags: u64,
	rsp: u64,
	ss: u64,
};

This structure is organized in part according to hardware constraints and in part at the discretion of the kernel implementer. The last five fields, from %rip to %ss, are constrained by the hardware. When an interrupt occurs, the CPU pushes each of these registers to the stack, in this order, then transfers control to the system interrupt handler. The next two registers serve a special purpose within our interrupt implementation, and the remainder are ordered arbitrarily.

In order to switch between two tasks, we need to save all of this state somewhere, then load the same state for another task when returning from the kernel to userspace. The save/restore process is handled in the interrupt handler, in assembly:

.global isr_common
isr_common:
	_swapgs
	push %rax
	push %rbx
	push %rcx
	push %rdx
	push %rsi
	push %rdi
	push %rbp
	push %r8
	push %r9
	push %r10
	push %r11
	push %r12
	push %r13
	push %r14
	push %r15

	// Note: fsbase is handled elsewhere
	push $0
	push %fs

	cld

	mov %rsp, %rdi
	mov $_kernel_stack_top, %rsp
	call arch.isr_handler
_isr_exit:
	mov %rax, %rsp

	// Note: fsbase is handled elsewhere
	pop %fs
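	// the next pop discards the fsbase slot (restored elsewhere), using %r15 as scratch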
	pop %r15

	pop %r15
	pop %r14
	pop %r13
	pop %r12
	pop %r11
	pop %r10
	pop %r9
	pop %r8
	pop %rbp
	pop %rdi
	pop %rsi
	pop %rdx
	pop %rcx
	pop %rbx
	pop %rax

	_swapgs

	// Clean up error code and interrupt #
	add $16, %rsp

	iretq

I’m not going to go into too much detail on interrupts for this post (maybe in a later post), but what’s important here is the chain of push/pop instructions. This automatically saves the CPU state for each task when entering the kernel. The syscall handler has something similar.

This suggests a question: where’s the stack?

Helios has a single kernel stack,4 which is moved to %rsp from $_kernel_stack_top in this code. This is different from systems like Linux, which have one kernel stack per thread; the rationale behind this design choice is out of scope for this post.5 However, the “stack” being pushed to here is not, in fact, a traditional stack.

x86_64 has an interesting feature wherein an interrupt can be configured to use a special “interrupt stack”. The task state segment is a bit of a historical artifact which is of little interest to Helios, but in long mode (64-bit mode) it serves a new purpose: to provide a list of addresses where up to seven interrupt stacks are stored. The interrupt descriptor table includes a 3-bit “IST” field which, when nonzero, instructs the CPU to set the stack pointer to the corresponding address in the TSS when that interrupt fires. Helios sets all of these to one, then does something interesting:

// Stores a pointer to the current state context.
export let context: **state = null: **state;

fn init_tss(i: size) void = {
	cpus[i].tstate = taskstate { ... };
	context = &cpus[i].tstate.ist[0]: **state;
};

// ...

export fn save() void = {
	// On x86_64, most registers are saved and restored by the ISR or
	// syscall service routines.
	let active = *context: *[*]state;
	let regs = &active[-1];

	regs.fsbase = rdmsr(0xC0000100);
};

export fn restore(regs: *state) void = {
	wrmsr(0xC0000100, regs.fsbase);

	const regs = regs: *[*]state;
	*context = &regs[1];
};

We store a pointer to the active task’s state struct in the TSS when we enter userspace, and when an interrupt occurs, the CPU automatically places that state into %rsp so we can trivially push all of the task’s registers into it.

There is some weirdness to note here: the stack grows downwards. Each time you push, the stack pointer is decremented, then the pushed value is written there. So, we have to fill in this structure from the bottom up. Accordingly, we have to do something a bit unusual here: we don’t store a pointer to the context object, but a pointer to the end of the context object. This is what &active[-1] does here.

Hare has some memory safety features by default, such as bounds testing array accesses. Here we have to take advantage of some of Hare’s escape hatches to accomplish the goal. First, we cast the pointer to an unbounded array of states — that’s what the *[*] is for. Then we can take the address of element -1 without the compiler snitching on us.

There is also a separate step here to save the fsbase register. This will be important later.
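
If it helps to see that escape hatch in isolation, here’s a tiny self-contained example of the same trick (illustrative only, nothing Helios-specific about it):

@test fn unbounded_negative_index() void = {
	let items: [4]int = [1, 2, 3, 4];
	// An unbounded array pointer (*[*]int) is exempt from bounds checks.
	let last = &items[3]: *[*]int;
	// One element past the end of the array, analogous to *context
	// pointing just past the end of the active state struct:
	let end = &last[1]: *[*]int;
	// A negative index reaches back into the array, as with &active[-1].
	assert(end[-1] == 4);
};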

This provides us with enough pieces to enter userspace:

// Immediately enters this task in userspace. Only used during system
// initialization.
export @noreturn fn enteruser(task: *caps::capability) void = {
	const state = objects::task_getstate(task);
	assert(objects::task_schedulable(state));
	active = state;
	objects::vspace_activate(&state.vspace)!;
	arch::restore(&state.regs);
	arch::enteruser();
};

What we need next is a scheduler, and a periodic interrupt to invoke it, so that we can switch tasks every so often.

Scheduler design is a complex subject which can have design, performance, and complexity implications ranging from subtle to substantial. For Helios’s present needs we use a simple round-robin scheduler: each task gets the same time slice and we just switch to them one after another.

The easy part is simply getting periodic interrupts. Again, this blog post isn’t about interrupts, so I’ll just give you the reader’s digest:

arch::install_irq(arch::PIT_IRQ, &pit_irq);
arch::pit_setphase(100);

// ...

fn pit_irq(state: *arch::state, irq: u8) void = {
	sched::switchtask();
	arch::pic_eoi(arch::PIT_IRQ);
};

The PIT, or programmable interrupt timer, is a feature on x86_64 which provides exactly what we need: periodic interrupts. This code configures it to tick at 100 Hz and sets up a little IRQ handler which calls sched::switchtask to perform the actual context switch.

Recall that, by the time sched::switchtask is invoked, the CPU and interrupt handler have already stashed all of the current task’s registers into its state struct. All we have to do now is pick out the next task and restore its state.

// see idle.s
let idle: arch::state;

// Switches to the next task.
export fn switchtask() void = {
	// Save state
	arch::save();

	match (next()) {
	case let task: *objects::taskstate =>
		active = task;
		objects::vspace_activate(&task.vspace)!;
		arch::restore(&task.regs);
	case null =>
		arch::restore(&idle);
	};
};

fn next() nullable *objects::taskstate = {
	let next = active.next;
	for (next != active) {
		if (next == null) {
			next = tasks;
			continue;
		};
		const cand = next as *objects::taskstate;
		if (objects::task_schedulable(cand)) {
			return cand;
		};
		next = cand.next;
	};
	const next = next as *objects::taskstate;
	if (objects::task_schedulable(next)) {
		return next;
	};
	return null;
};

Pretty straightforward. The scheduler maintains a linked list of tasks, picks the next one which is schedulable,6 then runs it. If there are no schedulable tasks, it runs the idle task.

Err, wait, what’s the idle task? Simple: it’s another state object (i.e. a set of CPU registers) which essentially works as a statically allocated do-nothing thread.

const idle_frame: [2]uintptr = [0, &pause: uintptr];

// Initializes the state for the idle thread.
export fn init_idle(idle: *state) void = {
	*idle = state {
		cs = seg::KCODE << 3,
		ss = seg::KDATA << 3,
		rflags = (1 << 21) | (1 << 9),
		rip = &pause: uintptr: u64,
		rbp = &idle_frame: uintptr: u64,
		...
	};
};

“pause” is a simple loop:

.global arch.pause
arch.pause:
	hlt
	jmp arch.pause

In the situation where every task is blocking on I/O, there’s nothing for the CPU to do until the operation finishes. So, we simply halt and wait for the next interrupt to wake us back up, hopefully unblocking some tasks so we can schedule them again. A more sophisticated kernel might take this opportunity to go into a lower power state, perhaps, but for now this is quite sufficient.

With this last piece in place, we now have a multi-threaded operating system. But there is one more piece to consider: when a task yields its time slice.

Just because a task receives CPU time does not mean that it needs to use it. A task which has nothing useful to do can yield its time slice back to the kernel through the “yieldtask” syscall. On the face of it, this is quite simple:

// Yields the current time slice and switches to the next task.
export @noreturn fn yieldtask() void = {
	arch::sysret_set(&active.regs, 0, 0);
	switchtask();
	arch::enteruser();
};

The “sysret_set” call updates the registers in the task state which correspond to system call return values, setting them to (0, 0) to indicate a successful return from the yield syscall. But we don’t actually return at all: we switch to the next task and enter userspace there instead.
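
Conceptually, sysret_set does something like the following. (Which registers carry the two syscall return values is an architecture detail glossed over here; rax and rdx are just stand-ins, not necessarily the registers Helios actually uses.)

// Sketch only: write a syscall's return values into a task's saved register
// state, so the task sees them as return values when it is next resumed.
fn sysret_set(regs: *state, a: u64, b: u64) void = {
	regs.rax = a;
	regs.rdx = b;
};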

In addition to being called from userspace, yieldtask is also useful whenever the kernel blocks a thread on some I/O or IPC operation. For example, tasks can wait on “notification” objects, which another task can signal to wake them up — a simple synchronization primitive. The implementation makes good use of sched::yieldtask:

// Blocks the active task until this notification is signalled. Does not return
// if the operation is blocking.
export fn wait(note: *caps::capability) uint = {
	match (nbwait(note)) {
	case let word: uint =>
		return word;
	case errors::would_block =>
		let note = note_getstate(note);
		assert(note.recv == null); // TODO: support multiple receivers
		note.recv = sched::active;
		sched::active.status = task_status::BLOCKED;
		sched::yieldtask();
	};
};

Finally, that’s the last piece.

Threads in userspace

Phew! That was a lot of kernel pieces to unpack. And now for userspace… in the next post! This one is getting pretty long. Here’s what you have to look forward to:

  • Preparing the task and all of the objects it needs (such as a stack)
  • High-level operations: join, detach, exit, suspend, etc
  • Thread-local storage…
    • in the Hare compiler
    • in the ELF loader
    • at runtime
  • Putting it all together to test the kernel

We’ll see you next time!

The phrase "open source" (still) matters

In 1988, “Resin Identification Codes” were introduced by the plastic industry. These look exactly like the recycling symbol ♺, which is not trademarked or regulated, except that a number is enclosed within the triangle. These symbols simply identify what kind of plastic was used. The vast majority of plastic is non-recyclable, but has one of these symbols on it to suggest otherwise. This is a deceptive business practice which exploits the consumer’s understanding of the recycling symbol to trick them into buying more plastic products.

The meaning of the term “open source” is broadly understood to be defined by the Open Source Initiative’s Open Source Definition, the “OSD”. Under this model, open source has enjoyed a tremendous amount of success, such that virtually all software written today incorporates open source components.

The main advantage of open source, to which much of this success can be attributed, is that it is a product of many hands. In addition to the work of its original authors, open source projects generally accept code contributions from anyone who would offer them. They also enjoy numerous indirect benefits, through the large community of Linux distros which package and ship the software, or people who write docs or books or blog posts about it, or the many open source dependencies it is likely built on top of.

Under this model, the success of an open source project is not entirely attributable to its publisher, but to both the publisher and the community which exists around the software. The software does not belong to its publisher, but to its community. I mean this not only in a moral sense, but also in a legal sense: every contributor to an open source project retains their copyright and the project’s ownership is held collectively between its community of contributors.1

The OSD takes this into account when laying out the conditions for commercialization of the software. An argument for exclusive commercialization of software by its publishers can be made when the software is the result of investments from that publisher alone, but this is not so for open source. Because it is the product of its community as a whole, the community enjoys the right to commercialize it, without limitation. This is a fundamental, non-negotiable part of the open source definition.

However, we often see the odd company or organization trying to put forward an unorthodox definition of “open source”. Generally, their argument goes something like this: “open” is just an adjective, and “source” comes from “source code”, so “open source” just means source code you can read, right?

This argument is wrong,2 but it usually conceals the speaker’s real motivations: they want a commercial monopoly over their project.3 Their real reason is “I should be able to make money from open source, but you shouldn’t”. An argument for an unorthodox definition of “open source” from this perspective is a form of motivated reasoning.

Those making this argument have good reason to believe that they will enjoy more business success if they get away with it. The open source brand is incredibly strong — one of the most successful brands in the entire software industry. Leveraging that brand will drive interest to their project, especially if, on the surface, it looks like it fits the bill (generally by being source available).

When you get down to it, this behavior is dishonest and anti-social. It leverages the brand of open source, whose success has been dependent on the OSD and whose brand value is associated with the user’s understanding of open source, but does not provide the same rights. The deception is motivated by selfish reasons: to withhold those rights from the user for their own exclusive use. This is wrong.

You can publish software under any terms that you wish, with or without commercial rights, with or without source code, whatever — it’s your right. However, if it’s not open source, it’s wrong to call it open source. There are better terms — “source available”, “fair code”, etc. If you describe your project appropriately, whatever the license may be, then I wish you nothing but success.

Status update, September 2022

I have COVID-19 and I am halfway through my stockpile of tissues, so I’m gonna keep this status update brief.

In Hare news, I finally put the last pieces into place to make cross compiling as easy as possible. Nothing else particularly world-shattering going on here. I have a bunch of new stuff in my patch queue to review once I’m feeling better, however, including bigint stuff — a big step towards TLS support. Unrelatedly, TLS support seems to be progressing upstream in qbe. (See what I did there?)

powerctl is a small new project I wrote to configure power management states on Linux. I’m pretty pleased with how it turned out. It makes for a good case study on Hare for systems programming.

In Helios, I have been refactoring the hell out of everything, rewriting or redesigning large parts of it from scratch. At the moment this means that a lot of the functionality which was previously present has been removed, and is being slowly re-introduced with substantial changes. The key is reworking these features to take better account of the full object lifecycle — creating, copying, and destroying capabilities. An improvement which ended up being useful in the course of this work is the addition of address space IDs (PCIDs on x86_64), which is going to offer a substantial performance boost down the line.

Alright, time for a nap. Bye!

Notes from kernel hacking in Hare, part 1

One of the goals for the Hare programming language is to be able to write kernels, such as my Helios project. Kernels are complex beasts which exist in a somewhat unique problem space and have constraints that many userspace programs are not accustomed to. To illustrate this, I’m going to highlight a scenario where Hare’s low-level types and manual memory management approach shines to enable a difficult use-case.

Helios is a micro-kernel. During system initialization, its job is to load the initial task into memory, prepare the initial set of kernel objects for its use, provide it with information about the system, then jump to userspace and fuck off until someone needs it again. I’m going to focus on the “providing information” step here.

The information the kernel needs to provide includes details about the capabilities that init has access to (such as working with I/O ports), information about system memory, the address of the framebuffer, and so on. This information is provided to init in the bootinfo structure, which is mapped into its address space, and passed to init via a register which points to this structure.1

// The bootinfo structure.
export type bootinfo = struct {
	argv: str,

	// Capability ranges
	memory: cap_range,
	devmem: cap_range,
	userimage: cap_range,
	stack: cap_range,
	bootinfo: cap_range,
	unused: cap_range,

	// Other details
	arch: *arch_bootinfo,
	ipcbuf: uintptr,
	modules: []module_desc,
	memory_info: []memory_desc,
	devmem_info: []memory_desc,
	tls_base: uintptr,
	tls_size: size,
};

Parts of this structure are static (such as the capability number ranges for each capability assigned to init), and others are dynamic - such as structures describing the memory layout (N items where N is the number of memory regions), or the kernel command line. But, we’re in a kernel – dynamically allocating data is not so straightforward, especially for units smaller than a page!2 Moreover, the data structures allocated here need to be visible to userspace, and kernel memory is typically not available to userspace. A further complication is the three different address spaces we’re working with here: a bootinfo object has a physical memory address, a kernel address, and a userspace address — three addresses to refer to a single object in different contexts.

Here’s an example of what the code shown in this article is going to produce:

A 64 by 64 grid of cells representing a page of physical memory. The first set of cells are colored blue; the next set green; then purple; the remainder are brown.

This is a single page of physical memory which has been allocated for the bootinfo data, where each cell is a byte. The bootinfo structure itself comes first, in blue. Following this is an arch-specific bootinfo structure, in green:

// x86_64-specific boot information
export type arch_bootinfo = struct {
	// Page table capabilities
	pdpt: cap_range,
	pd: cap_range,
	pt: cap_range,

	// vbe_mode_info physical address from multiboot (or zero)
	vbe_mode_info: uintptr,
};

After this, in purple, is the kernel command line. These three structures are always consistently allocated for any boot configuration, so the code which sets up the bootinfo page (the code we’re going to read now) always provisions them. Following these three items is a large area of free space (indicated in brown) which will be used to populate further dynamically allocated bootinfo structures, such as descriptions of physical memory regions.

The code to set this up is bootinfo_init, which is responsible for allocating a suitable page, filling in the bootinfo structure, and preparing a vector to dynamically allocate additional data on this page. It also sets up the arch bootinfo and argv, so the page looks like this diagram when the function returns. And here it is, in its full glory:

// Initializes the bootinfo context.
export fn bootinfo_init(heap: *heap, argv: str) bootinfo_ctx = {
	let cslot = caps::cslot { ... };
	let page = objects::init(ctype::PAGE, &cslot, &heap.memory)!;
	let phys = objects::page_phys(page);
	let info = mem::phys_tokernel(phys): *bootinfo;

	const bisz = size(bootinfo);
	let bootvec = (info: *[*]u8)[bisz..arch::PAGESIZE][..0];

	let ctx = bootinfo_ctx {
		page = cslot,
		info = info,
		arch = null: *arch_bootinfo, // Fixed up below
		bootvec = bootvec,
	};

	let (vec, user) = mkbootvec(&ctx, size(arch_bootinfo), size(uintptr));
	ctx.arch = vec: *[*]u8: *arch_bootinfo;
	info.arch = user: *arch_bootinfo;

	let (vec, user) = mkbootvec(&ctx, len(argv), 1);
	vec[..] = strings::toutf8(argv)[..];
	info.argv = *(&types::string {
		data = user: *[*]u8,
		length = len(argv),
		capacity = len(argv),
	}: *str);

	return ctx;
};

The first few lines are fairly straightforward. Helios uses capability-based security, similar in design to seL4. All kernel objects — such as pages of physical memory — are utilized through the capability system. The first two lines set aside a slot to store the page capability in, then allocate a page using that slot. The next two lines grab the page’s physical address and use mem::phys_tokernel to convert it to an address in the kernel’s virtual address space, so that the kernel can write data to this page.

The next two lines are where it starts to get a little bit interesting:

const bisz = size(bootinfo);
let bootvec = (info: *[*]u8)[bisz..arch::PAGESIZE][..0];

This casts the “info” variable (of type *bootinfo) to a pointer to an unbounded array of bytes (*[*]u8). This is a little bit dangerous! Hare’s arrays are bounds tested by default and using an unbounded type disables this safety feature. We want to get a bounded slice again soon, which is what the first slicing operator here does: [bisz..arch::PAGESIZE]. This obtains a bounded slice of bytes which starts from the end of the bootinfo structure and continues to the end of the page.

The last expression, another slicing expression, is a little bit unusual. A slice type in Hare has the following internal representation:

type slice = struct {
	data: nullable *void,
	length: size,
	capacity: size,
};

When you slice an unbounded array, you get a slice whose length and capacity fields are equal to the length of the slicing operation, in this case arch::PAGESIZE - bisz. But when you slice a bounded slice, the length field takes on the length of the slicing expression but the capacity field is calculated from the original slice. So by slicing our new bounded slice to the 0th index ([..0]), we obtain the following slice:

slice {
	data = &(info: *[*]bootinfo)[1]: *[*]u8,
	length = 0,
	capacity = arch::PAGESIZE - bisz,
};

In plain English, this is a slice whose base address is the address following the bootinfo structure and whose capacity is the remainder of the free space on its page, with a length of zero. This is something we can use static append with!3

// Allocates a buffer in the bootinfo vector, returning the kernel vector and a
// pointer to the structure in the init vspace.
fn mkbootvec(info: *bootinfo_ctx, sz: size, al: size) ([]u8, uintptr) = {
	const prevlen = len(info.bootvec);
	let padding = 0z;
	if (prevlen % al != 0) {
		padding = al - prevlen % al;
	};
	static append(info.bootvec, [0...], sz + padding);
	const vec = info.bootvec[prevlen + padding..];
	return (vec, INIT_BOOTINFO_ADDR + size(bootinfo): uintptr + prevlen: uintptr);
};

In Hare, slices can be dynamically grown and shrunk using the append, insert, and delete keywords. This is pretty useful, but not applicable for our kernel — remember, no dynamic memory allocation. Attempting to use append in Helios would cause a linking error because the necessary runtime code is absent from the kernel’s Hare runtime. However, you can also statically append to a slice, as shown here. So long as the slice has a sufficient capacity to store the appended data, a static append or insert will succeed. If not, an assertion is thrown at runtime, much like a normal bounds test.
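
As a standalone illustration of those semantics (not code from Helios):

// Illustrative only: static append into a fixed-size buffer. No heap
// allocation happens; exceeding the capacity aborts at runtime.
fn example() void = {
	let buf: [8]u8 = [0...];
	let s = buf[..0];             // len 0, capacity 8, backed by buf
	static append(s, 42);         // s is now [42]
	static append(s, [0...], 3z); // append three zeroes: [42, 0, 0, 0]
	assert(len(s) == 4);
	// Appending five more bytes here would exceed the capacity of 8 and
	// abort, much like a failed bounds test.
};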

mkbootvec makes good use of static append to dynamically allocate memory from the bootinfo page. Given a desired size and alignment, it statically appends a suitable number of zeroes to the page, takes a slice of the new data, and returns both that slice (in the kernel’s address space) and the address that data will have in the user address space. If we return to the earlier function, we can see how this is used to allocate space for the arch_bootinfo structure:

let (vec, user) = mkbootvec(&ctx, size(arch_bootinfo), size(uintptr));
ctx.arch = vec: *[*]u8: *arch_bootinfo;
info.arch = user: *arch_bootinfo;

The “ctx” variable is used by the kernel to keep track of its state while setting up the init task; we stash the kernel’s pointer to this data structure there, and the user’s pointer in the bootinfo structure itself.

This is also used to place argv into the bootinfo page:

let (vec, user) = mkbootvec(&ctx, len(argv), 1);
vec[..] = strings::toutf8(argv)[..];
info.argv = *(&types::string {
	data = user: *[*]u8,
	length = len(argv),
	capacity = len(argv),
}: *str);

Here we allocate a vector whose length is the length of the argument string, with an alignment of one, and then copy argv into it as a UTF-8 slice. Slice copy expressions like this one are a type-safe and memory-safe way to memcpy in Hare. Then we do something a bit more interesting.

Like slices, strings have an internal representation in Hare which includes a data pointer, length, and capacity. The types module provides a struct with this representation so that you can do low-level string manipulation in Hare should the task call for it. Hare’s syntax allows us to take the address of a literal value, such as a types::string struct, using the & operator. Then we cast it to a pointer to a string and dereference it. Ta-da! We set the bootinfo argv field to a str value which uses the user address of the argument vector.

Some use-cases call for this level of fine control over the precise behavior of your program. Hare’s goal is to accommodate this need with little fanfare. Here we’ve drawn well outside of the lines of Hare’s safety features, but sometimes it’s useful and necessary to do so. And Hare provides us with the tools to get the safety harness back on quickly, such as we saw with the construction of the bootvec slice. This code is pretty weird but to an experienced Hare programmer (which, I must admit, the world has very few of) it should make sense.

I hope you found this interesting! I’m going back to kernel hacking. Next up is loading the userspace ELF image into its address space. I had this working before but decided to rewrite it. Wish me good luck!

In praise of qemu

qemu is another in a long line of great software started by Fabrice Bellard. It provides virtual machines for a wide variety of software architectures. Combined with KVM, it forms the foundation of nearly all cloud services, and it runs SourceHut in our self-hosted datacenters. Much like Bellard’s ffmpeg revolutionized the multimedia software industry, qemu revolutionized virtualisation.

qemu comes with a large variety of studiously implemented virtual devices, from standard real-world hardware like e1000 network interfaces to accelerated virtual hardware like virtio drives. One can, with the right combination of command line arguments, produce a virtual machine of essentially any configuration, either for testing novel configurations or for running production-ready virtual machines. Network adapters, mouse & keyboard, IDE or SCSI or SATA drives, sound cards, graphics cards, serial ports — the works. Lower level, often arch-specific features, such as AHCI devices, SMP, NUMA, and so on, are also available and invaluable for testing any conceivable system configuration. And these configurations work, and work reliably.

I have relied on this testing quite a bit when working on kernels, particularly on my own Helios kernel. With a little bit of command line magic, I can run a fully virtualised system with a serial driver connected to the parent terminal, with a hardware configuration appropriate to whatever I happen to be testing, in a manner such that running and testing my kernel is no different from running any other program. With -gdb I can set up gdb remote debugging and even debug my kernel as if it were a typical program. Anyone who remembers osdev in the Bochs days — or even earlier — understands the unprecedented luxury of such a development environment. Should I ever find myself working on a hardware configuration which is unsupported by qemu, my very first step will be patching qemu to support it. In my reckoning, qemu support is nearly as important for bringing up a new system as a C compiler is.

And qemu’s implementation in C is simple, robust, and comprehensive. On the several occasions when I’ve had to read the code, it has been a pleasure. Furthermore, the comprehensive approach allows you to build out a virtualisation environment tuned precisely to your needs, whatever they may be, and it is accommodating of many needs. Sure, it’s low level — running a qemu command line is certainly more intimidating than, say, VirtualBox — but the trade-off in power afforded to the user opens up innumerable use-cases that are simply not available on any other virtualisation platform.

One of my favorite, lesser-known features of qemu is qemu-user, which allows you to register a binfmt handler to run executables compiled for an arbitrary architecture on Linux. Combined with a little chroot, this has made cross-arch development easier than it has ever been before, something I frequently rely on when working on Hare. If you do cross-architecture work and you haven’t set up qemu-user yet, you’re missing out.

$ uname -a
Linux taiga 5.15.63-0-lts #1-Alpine SMP Fri, 26 Aug 2022 07:02:59 +0000 x86_64 GNU/Linux
$ doas chroot roots/alpine-riscv64/ /bin/sh
# uname -a
Linux taiga 5.15.63-0-lts #1-Alpine SMP Fri, 26 Aug 2022 07:02:59 +0000 riscv64 Linux

This is amazing.

qemu also holds a special place in my heart as one of the first projects I contributed to over email 🙂 And they still use email today, and even recommend SourceHut to make the process easier for novices.

So, to Fabrice, and the thousands of other contributors to qemu, I offer my thanks. qemu is one of my favorite pieces of software.

powerctl: A small case study in Hare for systems programming

powerctl is a little weekend project I put together to provide a simple tool for managing power states on Linux. I had previously put my laptop into suspend with a basic “echo mem | doas tee /sys/power/state”, but this leaves a lot to be desired. I have to use doas to become root, and it’s annoying to enter my password — not to mention difficult to use in a script or to attach to a key binding. powerctl is the solution: a small 500-line Hare program which provides comprehensive support for managing power states on Linux for non-privileged users.

This little project ended up being a useful case study in writing a tight systems program in Hare. It has to do a few basic tasks at which Hare shines:

  • setuid binaries
  • Group lookup from /etc/group
  • Simple string manipulation
  • Simple I/O within sysfs constraints

Linux documents these features here, so it’s a simple matter of rigging it up to a nice interface. Let’s take a look at how it works.

First, one of the base requirements for this tool is to run as a non-privileged user. However, since writing to sysfs requires root, this program will have to be setuid, so that it runs as root regardless of who executes it. To prevent just any user from suspending the system, I added a “power” group; only users who are in this group are allowed to use the program. Enabling this functionality in Hare is quite simple:

use fmt;
use unix;
use unix::passwd;

def POWER_GROUP: str = "power";

// Determines if the current user is a member of the power group.
fn checkgroup() bool = {
	const uid = unix::getuid();
	const euid = unix::geteuid();
	if (uid == 0) {
		return true;
	} else if (euid != 0) {
		fmt::fatal("Error: this program must be installed with setuid (chmod u+s)");
	};

	const group = match (passwd::getgroup(POWER_GROUP)) {
	case let grent: passwd::grent =>
		yield grent;
	case void =>
		fmt::fatal("Error: {} group missing from /etc/group", POWER_GROUP);
	};
	defer passwd::grent_finish(&group);

	const gids = unix::getgroups();
	for (let i = 0z; i < len(gids); i += 1) {
		if (gids[i] == group.gid) {
			return true;
		};
	};

	return false;
};

The POWER_GROUP variable allows distributions that package powerctl to configure exactly which group is allowed to use this tool. Following this, we compare the uid and effective uid. If the uid is zero, we’re already running this tool as root, so we move on. Otherwise, if the euid is nonzero, we lack the permissions to continue, so we bail out and tell the user to fix their installation.

Then we fetch the details for the power group from /etc/group. Hare’s standard library includes a module for working with this file. Once we have the group ID from the string, we check the current user’s supplementary group IDs to see if they’re a member of the appropriate group. Nice and simple. This is also the only place in powerctl where dynamic memory allocation is required, to store the group details, which are freed with “defer passwd::grent_finish”.

The tool also requires some simple string munging to identify the supported set of states. If we look at /sys/power/disk, we can see the kind of data we’re working with:

$ cat /sys/power/disk 
[platform] shutdown reboot suspend test_resume 

These files are a space-separated list of supported states, with the currently enabled state enclosed in square brackets. Parsing these files is a simple matter for Hare. We start with a simple utility function which reads the file and prepares a string tokenizer which splits the string on spaces:

fn read_states(path: str) (strings::tokenizer | fs::error | io::error) = {
	static let buf: [512]u8 = [0...];

	const file = os::open(path)?;
	defer io::close(file)!;

	const z = match (io::read(file, buf)?) {
	case let z: size =>
		yield z;
	case =>
		abort("Unexpected EOF from sysfs");
	};
	const string = strings::rtrim(strings::fromutf8(buf[..z]), '\n');
	return strings::tokenize(string, " ");
};

The error handling here warrants a brief note. This function can fail if the file does not exist or if there is an I/O error when reading it. I don’t think that I/O errors are possible in this specific case (they can occur when writing to these files, though), but we bubble it up regardless using “io::read()?”. The file might not exist if these features are not supported by the current kernel configuration, in which case it’s bubbled up as “errors::noentry” via “os::open()?”. These cases are handled further up the call stack. The other potential error site is “io::close”, which can fail but only in certain circumstances (such as closing the same file twice), and we use the error assertion operator (”!”) to indicate that the programmer believes this case cannot occur. The compiler will check our work and abort at runtime should this assumption be proven wrong in practice.

In the happy path, we read the file, trim off the newline, and return a tokenizer which splits on spaces. The storage for this string is borrowed from “buf”, which is statically allocated.

The usage of this function to query supported disk suspend behaviors is here:

fn read_disk_states() ((disk_state, disk_state) | fs::error | io::error) = {
	const tok = read_states("/sys/power/disk")?;

	let states: disk_state = 0, active: disk_state = 0;
	for (true) {
		let tok = match (strings::next_token(&tok)) {
		case let s: str =>
			yield s;
		case void =>
			break;
		};
		const trimmed = strings::trim(tok, '[', ']');

		const state = switch (trimmed) {
		case "platform" =>
			yield disk_state::PLATFORM;
		case "shutdown" =>
			yield disk_state::SHUTDOWN;
		case "reboot" =>
			yield disk_state::REBOOT;
		case "suspend" =>
			yield disk_state::SUSPEND;
		case "test_resume" =>
			yield disk_state::TEST_RESUME;
		case =>
			continue;
		};
		states |= state;
		if (trimmed != tok) {
			active = state;
		};
	};

	return (states, active);
};

This function returns a tuple which includes all of the supported disk states OR’d together, and a value which indicates which state is currently enabled. The loop iterates through each of the tokens from the tokenizer returned by read_states, trims off the square brackets, and adds the appropriate state bits. We also check the trimmed token against the original token to detect which state is currently active.
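
The disk_state type itself isn’t shown in this post; for the OR’ing to work it is presumably a flags-style enum with one bit per state, something along these lines (a sketch, not necessarily powerctl’s exact definition):

// One bit per state, so supported states can be OR'd together.
type disk_state = enum uint {
	PLATFORM = 0x01,
	SHUTDOWN = 0x02,
	REBOOT = 0x04,
	SUSPEND = 0x08,
	TEST_RESUME = 0x10,
};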

There are two edge cases to be taken into account here: what happens if Linux adds more states in the future, and what happens if none of the states are active? In the former case, we have the continue branch of the switch statement mid-loop. Hare requires all switch statements to be exhaustive, so the compiler forces us to consider this edge case. For the latter case, the return value will be zero, simply indicating that none of these states are active. This is not actually possible given the invariants for this kernel interface, but we could end up in this situation if the kernel adds a new disk mode and that disk mode is active when this code runs.

When the time comes to modify these states, either to put the system to sleep or to configure its behavior when put to sleep, we use the following function:

fn write_state(path: str, state: str) (void | fs::error | io::error) = {
	const file = os::open(path, fs::flags::WRONLY | fs::flags::TRUNC)?;
	defer io::close(file)!;
	let buf: [128]u8 = [0...];
	const file = &bufio::buffered(file, [], buf);
	fmt::fprintln(file, state)?;
};

This code is working within a specific constraint of sysfs: it must complete the write operation in a single syscall. One of Hare’s design goals is giving you sufficient control over the program’s behavior to plan for such concerns. The means of opening the file — WRONLY | TRUNC — was also chosen deliberately. The “single syscall” is achieved by using a buffered file, which soaks up writes until the buffer is full and then flushes them out all at once. The buffered stream flushes automatically on newlines by default, so the “ln” of “fprintln” causes the write to complete in a single call.

With this helper in place, we can write power states. The ones which configure the kernel, but don’t immediately sleep, are straightforward:

// Sets the current mem state.
fn set_mem_state(state: mem_state) (void | fs::error | io::error) = {
	write_state("/sys/power/mem_sleep", mem_state_unparse(state))?;
};

The star of the show, however, has some extra concerns:

// Sets the current sleep state, putting the system to sleep.
fn set_sleep_state(state: sleep_state) (void | fs::error | io::error) = {
	// Sleep briefly so that the keyboard driver can process the key up if
	// the user runs this program from the terminal.
	time::sleep(250 * time::MILLISECOND);
	write_state("/sys/power/state", sleep_state_unparse(state))?;
};

If you enter sleep with a key held down, key repeat will kick in for the duration of the sleep, so when running this from the terminal you’ll resume to find a bunch of new lines. The time::sleep call is a simple way to avoid this, by giving the system time to process your key release event before sleeping. A more sophisticated solution could open the uinput devices and wait for all keys to be released, but that doesn’t seem entirely necessary.

Following this, we jump into the dark abyss of a low-power coma.

And that’s all there is to it! A few hours of work and 500 lines of code later and we have a nice little systems program to make suspending my laptop easier. I was pleasantly surprised to find out how well this little program plays to Hare’s strengths. I hope you found it interesting! And if you happen to need a simple tool for suspending your Linux machines, powerctl might be the program for you.

A review of postmarketOS on the Xiaomi Poco F1

I have recently had cause to start looking into mainline Linux phones which fall outside of the common range of grassroots phones like the PinePhone (which was my daily driver for the past year). The postmarketOS wiki is a great place to research candidate phones for this purpose, and the phone I landed on is the Xiaomi Poco F1, which I picked up on Amazon.nl (for ease of return in case it didn’t work out) for 270 Euro. Phones of this nature have a wide range of support from Linux distros like postmarketOS, from “not working at all” to “mostly working”. The essential features I require in a daily driver phone are (1) a working modem and telephony support, (2) mobile data, and (3) reasonably good performance and battery life; plus of course some sane baseline expectations like a working display and touchscreen driver.

The use of mainline Linux on a smartphone requires a certain degree of bullshit tolerance, and the main question is whether or not the bullshit exceeds your personal threshold. The Poco F1 indeed comes with some bullshit, but I’m pleased to report that it falls short of my threshold and represents a significant quality-of-life improvement over the PinePhone setup I have been using up to now.

The bullshit I have endured for the Poco F1 setup can be categorized into two parts: initial setup and ongoing problems. Of the two, the initial setup is by far the worst. These phones are designed to run Android first, rather than the mainline Linux first approach seen in devices like the PinePhone and Librem 5. This means that it’s back to dealing with things like Android recovery, fastboot, and so on, during the initial setup. The most severe pain point for Xiaomi phones is unlocking the bootloader.

The only officially supported means of doing this is via a Windows-only application published by Xiaomi. A reverse engineered Java application supposedly provides support for completing this process on Linux. However, this approach comes with the typical bullshit of setting up a working Java environment, and, crucially, Xiaomi appears to have sabotaged this effort via a deliberate attempt to close the hole by returning error messages from this reverse engineered API which direct the user to the official tool instead. On top of this, Xiaomi requires you to associate the phone to be unlocked with a user account on their services, paired to a phone number, and has a 30-day waiting period between unlocks. I ultimately had to resort to a Windows 10 VM with USB passthrough to get the damn thing unlocked. This is very frustrating and far from the spirit of free software; Xiaomi earns few points for openness in my books.

Once unlocked, the “initial setup bullshit” did not cease. The main issue is that the postmarketOS flashing tool (which is just a wrapper around fastboot) seemed to have problems writing a consistent filesystem. I was required to apply a level of Linux expertise which exceeds that of even most enthusiasts to obtain a shell in the initramfs, connect to it over postmarketOS’s telnet debugging feature, and run fsck.ext4 to fix the filesystem. Following this, I had to again apply a level of Alpine Linux expertise which exceeds that of many enthusiasts to repair installed packages and get everything up to a baseline of workitude. Overall, it took me the better part of a day to get to a baseline of “running a working installation of postmarketOS”.

However: following the “initial setup bullshit”, I found a very manageable scale of “ongoing problems”. The device’s base performance is excellent, far better than the PinePhone — it just performs much like I would expect from a normal phone. PostmarketOS is, as always, brilliant, and all of the usual mainline Alpine Linux trimmings I would expect are present — I can SSH in, I easily connected it to my personal VPN, and I’m able to run most of the software I’m already used to from desktop Linux systems (though, of course, GUI applications range widely in their ability to accommodate touch screens and a portrait mobile form-factor). I transferred my personal data over from my PinePhone using a method which is 100% certifiably absent of bullshit, namely just rsyncing over my home directory. Excellent!

Telephony support also works pretty well. Audio profiles are a bit buggy: I often find my phone routing sound to the headphone output instead of the speakers when no headphones are plugged in, and I have to manually switch between them from time to time. However, I have never had an issue with the audio profiles being wrong during a phone call (the modem works, by the way); earpiece and speakerphone both work as expected. That said, I have heard complaints from recipients of my phone calls about hearing an echo of their own voice. Additionally, DTMF tones do not work, but the fix has already been merged and is expected in the next release of ModemManager. SMS and mobile data work fine, and mobile data works with a lesser degree of bullshit than I was prepared to expect after reading the pmOS wiki page for this device.

Another problem is that the phone’s onboard cameras do not work at all, and it seems unlikely that this will be solved in the near future. This is not really an issue for me. Another papercut is that Phosh handles the display notch poorly, and though pmOS provides a “tweak” tool which can move the clock over from behind the notch, it leaves something to be desired. The relevant issue is being discussed on the Phosh issue tracker and a fix is presumably coming soon — it doesn’t seem particularly difficult to solve. I have also noted that, though GPS works fine, Mepo renders incorrectly and Gnome Maps has (less severe) display issues as well.

The battery life is not as good as the PinePhone, which itself is not as good as most Android phones. However, it meets my needs. It seems to last anywhere from 8 to 10 hours depending on usage, following a full night’s charge. As such, I can leave it off of the juice when I go out without too much fear. That said, I do keep a battery bank in my backpack just in case, but that’s also just a generally useful thing to have around. I think I’ve lent it to others more than I’ve used it myself.

There are many other apps which work without issues. I found that Foliate works great for reading e-books and Evince works nicely for PDFs (two use-cases which one might perceive as related, but which I personally have different UI expectations for). Firefox has far better performance on this device than on the PinePhone and allows for very comfortable web browsing. I also discovered Gnome Feeds which, while imperfect, accommodates my needs regarding an RSS feed reader. All of the “standard” mobile Linux apps that worked fine on the PinePhone also work fine here, such as Lollypop for music and the Portfolio file manager.

I was pleasantly surprised that, after enduring some more bullshit, I was able to get Waydroid to work, allowing me to run Android applications on this phone. My expectations for this were essentially non-existent, so any degree of workitude was a welcome surprise, and any degree of non-workitude was the expected result. On the whole, I’m rather impressed, but don’t expect anything near perfection. The most egregious issue is that I found that internal storage simply doesn’t work, so apps cannot store or read common files (though they seem to be able to persist their own private app data just fine). The camera does not work, so the use-case I was hoping to accommodate here — running my bank’s Android app — is not possible. However, I was able to install F-Droid and a small handful of Android apps that work with a level of performance which is indistinguishable from native Android performance. It’s not quite there yet, but Waydroid has a promising future and will do a lot to bridge the gap between Android and mainline Linux on mobile.

On the whole, I would rate the Poco F1’s bullshit level as follows:

  • Initial setup: miserable
  • Ongoing problems: minor

I have a much higher tolerance for “initial setup” bullshit than for ongoing problems bullshit, so this is a promising result for my needs. I have found that this device is ahead of the PinePhone that I had been using previously in almost all respects, and I have switched to it as my daily driver. In fact, this phone, once the initial bullshit is addressed, is complete enough that it may be the first mainline Linux mobile experience that I might recommend to others as a daily driver. I’m glad that I made the switch.

PINE64 has let its community down

Context for this post:


I know that apologising and taking responsibility for your mistakes is difficult. It seems especially difficult for commercial endeavours, which have fostered a culture of cold disassociation from responsibility for their actions, where admitting to wrongdoing is absolutely off the table. I disagree with this culture, but I understand where it comes from, and I can empathise with those who find themselves in the position of having to reconsider their actions in the light of the harm they have done. It’s not easy.

But, the reckoning must come. I have been a long-time supporter of PINE64. On this blog I have written positively about the PinePhone and PineBook Pro.1 I believed that PINE64 was doing the right thing and was offering something truly revolutionary on the path towards getting real FOSS systems into phones. I use a PinePhone as my daily driver,2 and I also own a PineBook Pro, two RockPro64s, a PinePhone Pro, and a PineNote as well. All of these devices have issues, some of them crippling, but PINE64’s community model convinced me to buy these with confidence in the knowledge that they would be able to work with the community to address these flaws given time.

However, PINE64’s treatment of its community has been in a steady decline for the past year or two, culminating in postmarketOS developer Martijn Braam’s blog post outlining a stressful and frustrating community to participate in, a lack of respect from PINE64 towards this community, and a model moving from a diverse culture that builds working software together to a Manjaro mono-culture that doesn’t. PINE64 offered a disappointing response. In their blog post, they dismiss the problems Martijn brings up, paint his post as misguided at best and disingenuous at worst, and fail to take responsibility for their role in any of these problems.

The future of PINE64’s Manjaro mono-culture is dim. Manjaro is a very poorly run Linux distribution with a history of financial mismanagement, ethical violations, security incidents, shipping broken software, and disregarding the input of its peers in the distribution community. Just this morning they allowed their SSL certificates to expire — for the fourth time. An open letter, signed jointly by 16 members of the Linux mobile community, called out bad behaviors which are largely attributable to Manjaro. I do not respect their privileged position in the PINE64 community, which I do not expect to be constructive or in my best interests. I have never been interested in running Manjaro on a PINE64 device and once they turn their back on the lush ecosystem they promised, I no longer have any interest in the platform.

It’s time for PINE64 to take responsibility for these mistakes, and make clear plans to correct them. To be specific:

  • Apologise for mistreatment of community members.
  • Make a tangible commitment to honoring and respecting the community.
  • Rescind their singular commitment to Manjaro.
  • Re-instate community editions and expand the program.
  • Deal with this stupid SPI problem. The community is right, listen to them.

I understand that it’s difficult to acknowledge our mistakes. But it is also necessary, and important for the future of PINE64 and the future of mobile Linux in general. I call on TL Lim, Marek Kraus, and Lukasz Erecinski to personally answer for these problems.

There are three possible outcomes to this controversy, depending on PINE64’s response. If PINE64 refuses to change course, the community will continue to decay and fail — the community PINE64 depends on to make its devices functional and useful. Even the most mature PINE64 products still need a lot of work, and none of the new products are even remotely usable. This course of events will be the end of PINE64 and deal a terrible blow to the mobile FOSS movement.

The other option is for PINE64 to change its behavior. They can do this with grace, or without. If they crumble under public pressure and, for example, spitefully agree to re-instate community editions without accepting responsibility for their wrongdoings, it does not bode well for addressing the toxic environment which is festering in the PINE64 community. This may be better than the worst case, but may not be enough. New community members may hesitate to join, maligned members may not offer their forgiveness, and PINE64’s reputation will suffer for a long time.

The last option is for PINE64 to act with grace and humility. Acknowledge your mistakes and apologise to those who have been hurt. Re-commit to honoring your community and treating your peers with respect. Remember, the community are volunteers. They have no obligation to make peace, so it’s on you to mend these wounds. It will still be difficult to move forward, but doing it with humility, hand in hand with the community, will set PINE64 up with the best chance of success. We’re counting on you to do the right thing.

Status update, August 2022

It is a blessedly cool morning here in Amsterdam. I was busy moving house earlier this month, so this update is a bit quieter than most.

For a fun off-beat project this month, I started working on a GameBoy emulator written in Hare. No promises on when it will be functional or how much I plan on working on it – just doing it for fun. In more serious Hare news, I have implemented Thread-Local Storage (TLS) for qbe, our compiler backend. Hare’s standard library does not support multi-threading, but I needed this for Helios, whose driver library does support threads. It will also presumably be of use for cproc once it lands upstream.

Speaking of Helios, it received the runtime components for TLS support on x86_64, namely the handling of %fs and its base register MSR in the context switch, and updates to the ELF loader for handling .tdata/.tbss sections. I have also implemented support for moving and copying capabilities, which will be useful for creating new processes in userspace. Significant progress towards capability destructors was also made, with some capabilities — pages and page tables in particular — being reclaimable now. Next goal is to finish up all of this capability work so that you can freely create, copy, move, and destroy capabilities, then use all of these features to implement a simple shell. There is also some refactoring due at some point soon, so we’ll see about that.

Other Hare progress has been slow this month, as I’m currently looking at a patch queue backed up with 123 emails. When I’m able to sit down and get through these, we can expect a bunch of updates in short order.

SourceHut news will be covered in the “what’s cooking” post later today. That’s all for now! Thanks for tuning in.

How I wish I could organize my thoughts

I keep a pen & notebook on my desk, which I make liberal use of to jot down my thoughts. It works pretty well: ad-hoc todo lists, notes on problems I’m working on, tables, flowcharts, etc. It has some limitations, though. Sharing anything out of my notebook online is an awful pain in the ass. I can’t draw a straight line to save my life, so tables and flowcharts are a challenge. No edits, either, so lots of crossed-out words and redrawn or rewritten pages. And of course, my handwriting sucks and I can type much more efficiently than I can write. I wish this was a digital medium, but there are not any applications available which can support the note-taking paradigm that I wish I could have. What would that look like?

Well, like this (click for full size):

A mock-up of an application. A4 pages are arranged ad-hoc on a grid. Handwritten notes and drawings appear in red across the grid and over the pages. A flowchart is shown outside of a page.

I don’t have the bandwidth to take on a new project of this scope, so I’ll describe what I think this should look like in the hopes that it will inspire another team to work on something like this. Who knows!

The essential interface would be an infinite grid on which various kinds of objects can be placed by the user. The most important of these objects would be pages, at a page size configurable by the user (A4 by default). You can zoom in on a page (double click it or something) to make it your main focus, zooming in automatically to an appropriate level for editing, then type away. A simple WYSIWYG paradigm would be supported here, perhaps supporting only headings, bold/italic text, and ordered and unordered lists — enough to express your thoughts but not a full blown document editor/typesetter.1 When you run out of page, another is generated next to the current page, either to the right or below — configurable.

Other objects would include flowcharts, tables, images, hand-written text and drawings, and so on. These objects can be placed free form on the grid, or embedded in a page, or moved between each mode.

The user input paradigm should embrace as many modes of input as the user wants to provide. Mouse and keyboard: middle click to pan, scroll to zoom in or out, left click and drag to move objects around, shift+click to select objects, etc. A multi-point trackpad should support pinch to zoom, two finger pan, etc. Touch support is fairly obvious. Drawing tablet support is also important: the user should be able to use one to draw and write free-form. I’d love to be able to make flowcharts by drawing boxes and arrows and having the software recognize them and align them to the grid as first-class vector objects. Some drawing tablets support trackpad and touch-screen-like features as well — so all of those interaction options should just werk.

Performance is important here. I should be able to zoom in and out and pan around while all of the objects rasterize themselves in real-time, never making the user suffer through stuttery interactions. There should also be various ways to export this content. A PDF exporter should let me arrange the pages in the desired linear order. SVG exporters should be able to export objects like flowcharts and diagrams. Other potential features include real-time collaboration or separate templates for presentations.

Naturally this application should be free software and should run on Linux. However, I would be willing to pay a premium price for this tool — a one-time fee of as much as $1000, or subscriptions on the order of $100/month if real-time collaboration or cloud synchronization are included. If you’d like some ideas for how to monetize free software projects like this, feel free to swing by my talk on the subject in Italy early this September to talk about it.

Well, that’s enough dreaming for now. I hope this inspired you, and in the meantime it’s back to pen and paper for me.

Conciseness

Conciseness is often considered a virtue among hackers and software engineers. FOSS maintainers in particular generally prefer to keep bug reports, questions on mailing lists, discussions in IRC channels, and so on, close to the point and with minimal faff. It’s not considered impolite to skip the formalities — quite the opposite. So: keep your faffery to a minimum. A quick “thanks!” at the end of a discussion will generally suffice. And, when someone is being direct with you, don’t interpret it as a slight: simply indulge in the blissful freedom of a discussion absent of faffery.

The past and future of open hardware

They say a sucker is born every day, and at least on the day of my birth, that certainly may have been true. I have a bad habit of spending money on open hardware projects that ultimately become vaporware or seriously under-deliver on their expectations. In my ledger are EOMA68, DragonBox Pyra, the Jolla Tablet — which always had significant non-free components — and the Mudita Pure, though I did successfully receive a refund for the latter two.1

There are some success stories, though. My Pine64 devices work great — though they have non-free components — and I have a HiFive Unmatched that I’m reasonably pleased with. Raspberry Pi is going well, if you can find one — also with non-free components — and Arduino and products like it are serving their niche pretty well. I hear the MNT Reform went well, though by then I had learned to be a bit more hesitant to open my wallet for open hardware, so I don’t have one myself. Pebble worked, until it didn’t. Caveats abound in all of these projects.

What does open hardware need to succeed, and why have many projects failed? And why do the successful products often have non-free components and poor stock? We can’t blame it all on the chip shortage and/or COVID: it’s been an issue for a long time.

I don’t know the answers, but I hope we start seeing improvements. I hope that the successful projects will step into a mentorship role to provide up-and-comers with tips on how they made their projects work, and that we see a stronger focus on liberating non-free components. Perhaps Crowd Supply can do some work in helping to secure investment2 for open hardware projects, and continue the good work they’re already doing on guiding them through the development and production processes.

Part of this responsibility comes down to the consumer: spend your money on free projects, and don’t spend your money on non-free projects. But, we also need to look closely at the viability of each project, and open hardware projects need to be transparent about their plans, lest we get burned again. Steering the open hardware movement out of infancy will be a challenge for all involved.

Are you working on a cool open hardware project? Let me know. Explain how you plan on making it succeed and, if I’m convinced that your idea has promise, I’ll add a link here.

Code review at the speed of email

I’m a big proponent of the email workflow for patch submission and code review. I have previously published some content (How to use git.sr.ht’s send-email feature, Forks & pull requests vs email, git-send-email.io) which demonstrates the contributor side of this workflow, but it’s nice to illustrate the advantages of the maintainer workflow as well. For this purpose, I’ve recorded a short video demonstrating how I manage code review as an email-oriented maintainer.

Disclaimer: I am the founder of SourceHut, a platform built on this workflow which competes with platforms like GitHub and GitLab. This article’s perspective is biased.

This blog post provides additional material to supplement this video, and also includes all of the information from the video itself. For those who prefer reading over watching, you can just stick to reading this blog post. Or, you can watch the video and skim the post. Or you can just do something else! When was the last time you called your grandmother?

With hundreds of hours of review experience on GitHub, GitLab, and SourceHut, I can say with confidence the email workflow allows me to work much faster than any of the others. I can review small patches in seconds, work quickly with multiple git repositories, easily test changes and make tweaks as necessary, rebase often, and quickly chop up and provide feedback for larger patches. Working my way through a 50-email patch queue usually takes me about 20 minutes, compared to an hour or more for the same number of merge requests.

This workflow also works entirely offline. I can read and apply changes locally, and even reply with feedback or to thank contributors for their patch. My mail setup automatically downloads mail from IMAP using isync and outgoing mails are queued with postfix until the network is ready. I have often worked through my patch queue on an airplane or a train with spotty or non-functional internet access without skipping a beat. Working from low-end devices like a Pinebook or a phone are also no problem — aerc is very lightweight in the terminal and the SourceHut web interface is much lighter & faster than any other web forge.

The centerpiece of my setup is an email client I wrote specifically for software development using this workflow: aerc.1 The stock configuration of aerc is pretty good, but I make a couple of useful additions specifically for development on SourceHut. Specifically, I add a few keybindings to ~/.config/aerc/binds.conf:

[messages]
ga = :flag<Enter>:pipe -mb git am -3<Enter>
gp = :term git push<Enter>
gl = :term git log<Enter>

rt = :reply -a -Tthanks<Enter>
Rt = :reply -qa -Tquoted_thanks<Enter>

[compose::review]
V = :header -f X-Sourcehut-Patchset-Update NEEDS_REVISION<Enter>
A = :header -f X-Sourcehut-Patchset-Update APPLIED<Enter>
R = :header -f X-Sourcehut-Patchset-Update REJECTED<Enter>

The first three commands, ga, gp, and gl, are for invoking git commands. “ga” applies the current email as a patch, using git am, and “gp” simply runs git push. “gl” is useful for quickly reviewing the git log. ga also flags the email so that it shows up in the UI as having been applied, which is useful as I’m jumping all over a patch queue. I also make liberal use of \ (:filter) to filter my messages to patches applicable to specific projects or goals.

rt and Rt use aerc templates installed at ~/.config/aerc/templates/ to reply to emails after I’ve finished reviewing them. The “thanks” template is:

X-Sourcehut-Patchset-Update: APPLIED

Thanks!

{{exec "{ git remote get-url --push origin; git reflog -2 origin/master --pretty=format:%h | xargs printf '%s\n' | tac; } | xargs printf 'To %s\n   %s..%s  master -> master'" ""}}

And quoted_thanks is:

X-Sourcehut-Patchset-Update: APPLIED

Thanks!

{{exec "{ git remote get-url --push origin; git reflog -2 origin/master --pretty=format:%h | xargs printf '%s\n' | tac; } | xargs printf 'To %s\n   %s..%s  master -> master'" ""}}

On {{dateFormat (.OriginalDate | toLocal) "Mon Jan 2, 2006 at 3:04 PM MST"}}, {{(index .OriginalFrom 0).Name}} wrote:
{{wrapText .OriginalText 72 | quote}}

Both of these add a magic “X-Sourcehut-Patchset-Update” header, which updates the status of the patch on the mailing list. They also include a shell pipeline which adds some information about the last push from this repository, to help the recipient understand what happened to their patch. I often make some small edits to ask the user to follow up with a ticket for some future work, or add other timely comments. The second template, quoted_thanks, is particularly useful for this: it quotes the original message so I can reply to specific parts of it, whether in the commit message, timely commentary, or the code itself, often pointing out parts of the code that I made small tweaks to before applying.

And that’s basically it! You can browse all of my dotfiles here to see more details about my system configuration. With this setup I am able to work my way through a patch queue easier and faster than ever before. That’s why I like the email workflow so much: for power users, no alternative is even close in terms of efficiency.

Of course, this is the power user workflow, and it can be intimidating to learn all of these things. This is why we offer more novice-friendly tools, which lose some of the advantages but are often more intuitive. For instance, we are working on a web user interface for patch review, mirroring our existing web interface for patch submission. But, in my opinion, it doesn’t get better than this for serious FOSS maintainers.

Feel free to reach out on IRC in #sr.ht.watercooler on Libera Chat, or via email, if you have any questions about this workflow and how you can apply it to your own projects. Happy hacking!
