
Go's choice of multiple return values was the simpler option

By: cks
21 March 2025 at 02:56

Yesterday I wrote about Go's use of multiple return values and Go types, in reaction to Mond's Were multiple return values Go's biggest mistake?. One of the things that I forgot to mention in that entry is that I think Go's choice to have multiple values for function returns and a few other things was the simpler and more conservative approach in its overall language design.

In a statically typed language that expects to routinely use multiple return values, as Go was designed to with the 'result, error' pattern, returning multiple values as a typed tuple means that tuple-based types are pervasive. This creates pressures on both the language design and the API of the standard library, especially if you start out (as Go did) being a fairly strongly nominally typed language, where different names for the same concrete type can't be casually interchanged. Or to put it another way, having a frequently used tuple container (meta-)type significantly interacts with and affects the rest of the language.

(For example, if Go had handled multiple values through tuples as explicit typed entities, it might have had to start out with something like type aliases (added only in Go 1.9) and it might have been pushed toward some degree of structural typing, because that probably makes it easier to interact with all of the return value tuples flying around.)

Having multiple values as a special case for function returns, range, and so on doesn't create anywhere near this additional influence and pressure on the rest of the language. There are a whole bunch of questions and issues you don't face because multiple values aren't types and can't be stored or manipulated as single entities. Of course you have to be careful in the language specification and it's not trivial, but it's simpler and more contained than going the tuple type route. I also feel it's the more conservative approach, since it doesn't affect the rest of the language as much as a widely used tuple container type would.

(As Mond criticizes, it does create special cases. But Go is a pragmatic language that's willing to live with special cases.)

Go's multiple return values and (Go) types

By: cks
20 March 2025 at 03:31

Recently I read Were multiple return values Go's biggest mistake? (via), which wishes that Go had full blown tuple types (to put my spin on it). One of the things that struck me about Go's situation when I read the article is exactly the inverse of what the article is complaining about, which is that because Go allows multiple values for function return types (and in a few other places), it doesn't have to have tuple types.

One problem with tuple types in a statically typed language is that they must exist as types, whether declared explicitly or implicitly. In a language like Go, where type definitions create new distinct types even if the structure is the same, it isn't particularly difficult to wind up with an ergonomics problem. Suppose that you want to return a tuple that is a net.Conn and an error, a common pair of return values in the net package today. If that tuple is given a named type, everyone must use that type in various places; merely returning or storing an implicitly declared type that's structurally the same is not acceptable under Go's current type rules. Conversely, if that tuple is not given a type name in the net package, everyone is forced to stick to an anonymous tuple type. In addition, this up front choice is now an API; it's not API compatible to give your previously anonymous tuple type a name or vice versa, even if the types are structurally compatible.

(Since returning something and error is so common an idiom in Go, we're also looking at either a lot of anonymous types or a lot more named types. Consider how many different combinations of multiple return values you find in the net package alone.)
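To make the contrast concrete, here's the multiple return value form as it exists today, using the real net.Dial; this is just a sketch, but notice that no tuple type appears anywhere, only the familiar 'result, error' pair:

package main

import (
  "fmt"
  "net"
)

func main() {
  // net.Dial returns (net.Conn, error) as two values; no tuple type
  // has to be named or declared anywhere for this to work.
  conn, err := net.Dial("tcp", "example.org:80")
  if err != nil {
    fmt.Println("dial failed:", err)
    return
  }
  defer conn.Close()
  fmt.Println("connected to", conn.RemoteAddr())
}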

One advantage of multiple return values (and the other forms of tuple assignment, and for range clauses) is that they don't require actual formal types. Functions have a 'result type', which doesn't exist as an actual type, but Go already needed to handle the same sort of 'not an actual type' thing for their 'parameter type'. My guess is that this let Go's designers skip a certain amount of complexity in Go's type system, because they didn't have to define an actual tuple (meta-)type or alternately expand how structs work to cover the tuple use case.

(Looked at from the right angle, structs are tuples with named fields, although then you get into questions of how nested structs act in tuple-like contexts.)

A dynamically typed language like Python doesn't have this problem because there are no explicit types, so there's no need to have different types for different combinations of (return) values. There's simply a general tuple container type that can be any shape you want or need, and can be created and destructured on demand.

(I assume that some statically typed languages have worked out how to handle tuples as a data type within their type system. Rust has tuples, for example; I haven't looked into how they work in Rust's type system, for reasons.)

I don't think error handling is a solved problem in language design

By: cks
18 March 2025 at 02:53

There are certain things about programming language design that are more or less solved problems, where we generally know what the good and bad approaches are. For example, over time we've wound up agreeing on various common control structures like for and while loops, if statements, and multi-option switch/case/etc statements. The syntax may vary (sometimes very much, as for example in Lisp), but the approach is more or less the same because we've come up with good approaches.

I don't believe this is the case with handling errors. One way to see this is to look at the wide variety of approaches and patterns that languages today take to error handling. There is at least 'errors as exceptions' (for example, Python), 'errors as values' (Go and C), and 'errors instead of results and you have to check' combined with 'if errors happen, panic' (both Rust). Even in Rust there are multiple idioms for dealing with errors; some Rust code will explicitly check its Result types, while other Rust code sprinkles '?' around and accepts that if the program sails off the happy path, it simply dies.
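For concreteness, the Go flavour of 'errors as values' looks like this; the function and file name here are made up for illustration:

package main

import (
  "fmt"
  "os"
)

// loadConfig illustrates the 'errors as values' style: every call
// site explicitly checks the returned error and either handles it or
// passes it up, here wrapped with a bit more context.
func loadConfig(path string) ([]byte, error) {
  data, err := os.ReadFile(path)
  if err != nil {
    return nil, fmt.Errorf("loading config %q: %w", path, err)
  }
  return data, nil
}

func main() {
  if _, err := loadConfig("/no/such/file"); err != nil {
    fmt.Println("error:", err)
  }
}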

If you were creating a new programming language from scratch, there's no clear agreed answer to what error handling approach you should pick, not the way we have more or less agreed on how for, while, and so on should work. You'd be left to evaluate trade offs in language design and language ergonomics and to make (and justify) your choices, and there probably would always be people who think you should have chosen differently. The same is true of changing or evolving existing languages, where there's no generally agreed on 'good error handling' to move toward.

(The obvious corollary of this is that there's no generally agreed on keywords or other syntax for error handling, the way 'for' and 'while' are widely accepted as keywords as well as concepts. The closest we've come is that some forms of error handling have generally accepted keywords, such as try/catch for exception handling.)

I like to think that this will change at some point in the future. Surely there actually is a good pattern for error handling out there and at some point we will find it (if it hasn't already been found) and then converge on it, as we've converged on programming language things before. But I feel it's clear that we're not there yet today.

Updating local commits with more changes in Git (the harder way)

By: cks
3 March 2025 at 03:34

One of the things I do with Git is maintain personal changes locally on top of the upstream version, with my changes updated via rebasing every time I pull upstream to update it. In the simple case, I have only a single local change and commit, but in more complex cases I split my changes into multiple local commits; my local version of Firefox currently carries 12 separate personal commits. Every so often, upstream changes something that causes one of those personal changes to need an update, without actually breaking the rebase of that change. When this happens I need to update my local commit with more changes, and often it's not the 'top' local commit (which can be updated simply).

In theory, the third party tool git-absorb should be ideal for this, and I believe I've used it successfully for this purpose in the past. In my most recent instance, though, git-absorb frustratingly refused to do anything in a situation where it felt it should work fine. I had an additional change to a file that was changed in exactly one of my local commits, which feels like an easy case.

(Reading the git-absorb readme carefully suggests that I may be running into a situation where my new change doesn't clash with any existing change. This makes git-absorb more limited than I'd like, but so it goes.)

In Git, what I want is called a 'fixup commit', and how to use it is covered in this Stack Overflow answer. The sequence of commands is basically:

# modify some/file with new changes, then
git add some/file

# Use this to find your existing commit ID
git log some/file

# with the existing commit ID
git commit --fixup=<commit ID>
git rebase --interactive --autosquash <commit ID>^

This will open an editor buffer with what 'git rebase' is about to do, which I can immediately exit out of because the defaults are exactly what I want (assuming I don't want to shuffle around the order of my local commits, which I probably don't, especially as part of a fixup).

I can probably also use 'origin/main' instead of '<commit ID>^', but that will rebase more things than is strictly necessary. And I need the commit ID for the 'git commit --fixup' invocation anyway.

(Sufficiently experienced Git people can probably put together a script that would do this automatically. It would get all of the files staged in the index, find the most recent commit that modified each of them, abort if they're not all the same commit, make a fixup commit to that most recent commit, and then potentially run the 'git rebase' for you.)

Go's behavior for zero value channels and maps is partly a choice

By: cks
25 February 2025 at 04:30

How Go behaves if you have a zero value channel or map (a 'nil' channel or map) is somewhat confusing (cf, via). When we talk about it, it's worth remembering that this behavior is a somewhat arbitrary choice on Go's part, not a fundamental set of requirements that stems from, for example, other language semantics. Go has reasons to have channels and maps behave as they do, but some of those reasons have to do with how channel and map values are implemented and some are about what's convenient for programming.

As hinted at by how their zero value is called a 'nil' value, channel and map values are both implemented as pointers to runtime data structures. A nil channel or map has no such runtime data structure allocated for it (and the pointer value is nil); these structures are allocated by make(). However, this doesn't entirely allow us to predict what happens when you use nil values of either type. It's not unreasonable for an attempt to assign an element to a nil map to panic, since the nil map has no runtime data structure allocated to hold anything we try to put in it. But you don't have to say that a nil map is empty and looking up elements in it gives you a zero value; I think you could have this panic instead, just as assigning an element does. However, this would probably result in less safe code that panicked more (and probably had more checks for nil maps, too).

Then there are nil channels, which don't behave like nil maps. It would make sense for receiving from a nil channel to yield the zero value, much like looking up an element in a nil map, and for sending to a nil channel to panic, again like assigning to an element in a nil map (although in the channel case it would be because there's no runtime data structure where your goroutine could metaphorically hang its hat waiting for a receiver). Instead Go chooses to make both operations (permanently) block your goroutine, with panicking on send reserved for sending to a non-nil but closed channel.

The current semantics of sending on a closed channel combined with select statements (and to a lesser extent receiving from a closed channel) means that Go needs a channel zero value that is never ready to send or receive. However, I believe that Go could readily make actual sends or receives on nil channels panic without any language problems. As a practical matter, sending or receiving on a nil channel is a bug that will leak your goroutine even if your program doesn't deadlock.
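Here's a small illustration of the behaviour as it stands; the commented-out lines are the operations that would panic or block:

package main

import "fmt"

func main() {
  // A nil map: lookups quietly return the zero value, but assigning
  // an element panics, since there's no runtime structure to put it in.
  var m map[string]int
  fmt.Println(m["missing"], len(m)) // prints: 0 0
  // m["x"] = 1                     // panics: assignment to entry in nil map

  // A nil channel: both send and receive block forever (outside a
  // select, this tiny program would simply deadlock).
  var ch chan int
  _ = ch
  // <-ch      // blocks forever
  // ch <- 1   // blocks forever

  // Panicking on send is reserved for a non-nil but closed channel;
  // receiving from a closed channel yields the zero value and false.
  c := make(chan int)
  close(c)
  v, ok := <-c
  fmt.Println(v, ok) // prints: 0 false
  // c <- 1          // panics: send on closed channel
}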

Similarly, Go could choose to allocate an empty map runtime data structure for zero value maps, and then let you assign to elements in the resulting map rather than panicking. If desired, I think you could preserve a distinction between empty maps and nil maps. There would be some drawbacks to this that cut against Go's general philosophy of being relatively explicit about (heap) allocations, and you'd want a clever compiler that didn't bother creating those zero value runtime map data structures when they'd just be overwritten by 'make()' or a return value from a function call or the like.

(I can certainly imagine a quite Go like language where maps don't have to be explicitly set up any more than slices do, although you might still use 'make()' if you wanted to provide size hints to the runtime.)

Sidebar: why you need something like nil channels

We all know that sometimes you want to stop sending or receiving on a channel in a select statement. On first impression it looks like closing a channel (instead of setting the channel to nil) could be made to work for this (it doesn't currently). The problem is that closing a channel is a global thing, while you may only want a local effect; you want to remove the channel from your select, but not close down other uses of it by other goroutines.

This need for a local effect pretty much requires a special, distinct channel value that is never ready for sending or receiving, so you can overwrite the old channel value with this special value, which we might as well call a 'nil channel'. Without a channel value that serves this purpose you'd have to complicate select statements with some other way to disable specific channels.
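As a concrete sketch of the pattern (the function and its names are mine, not from anything standard): receive from two channels until both are closed, disabling each one locally by overwriting our copy of it with nil once it's done.

// merge receives from a and b until both are closed, sending
// everything to out. Setting a finished channel variable to nil
// disables just that arm of the select for us, without affecting
// any other goroutines using the same channel.
func merge(a, b <-chan int, out chan<- int) {
  for a != nil || b != nil {
    select {
    case v, ok := <-a:
      if !ok {
        a = nil // a is done; stop selecting on it locally
        continue
      }
      out <- v
    case v, ok := <-b:
      if !ok {
        b = nil // likewise for b
        continue
      }
      out <- v
    }
  }
  close(out)
}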

(I had to work this out in my head as part of writing this entry so I might as well write it down for my future self.)

Build systems and their effects on versioning and API changes

By: cks
2 February 2025 at 21:52

In a comment on my entry on modern languages and bad packaging outcomes at scale, sapphirepaw wrote about backward and forward compatibility within language ecologies. I'm going to quote from it because it's good (but go read the whole comment):

I think there’s a social contract that has broken down somewhere.

[...]

If a library version did break things, it was generally considered a bug, and developers assumed it would be fixed in short order. Then, for the most part, only distributions had to worry about specific package/library-version incompatibilities.

This all falls apart if a developer, or the ecosystem of libraries/language they depend on, ends up discarding that compatibility-across-time. That was the part that made it feasible to build a distribution from a collection of projects that were, themselves, released across time.

I have a somewhat different view. I think that the way it was in the old days was less a social contract and more an effect of the environment that software was released into and built in, and now that the environment has changed, the effects have too.

C famously has a terrible story around its (lack of a) build system and dependency management, and for much of its life you couldn't assume pervasive and inexpensive Internet connectivity (well, you still can't assume the latter globally, but people have stopped caring about such places). This gave authors of open source software a strong incentive to be both backward and forward compatible. If you released a program that required the features of a very recent version of a library, you reduced your audience to people who already had the recent version (or better) or who were willing to go through the significant manual effort to get and build that version of the library, and then perhaps make all of their other programs work with it, since C environments often more or less forced global installation of libraries. If you were a library author releasing a new minor version or patch level that had incompatibilities, people would be very slow to actually install and adopt that version because of those incompatibilities; most of their programs using your libraries wouldn't update on the spot, and there was no good mechanism to use the old version of the library for some programs.

(Technically you could make this work with static linking, but static linking was out of favour for a long time.)

All of this creates a quite strong practical and social push toward stability. If you wanted your program or its new version to be used widely (and you usually did), it had better work with the old versions of libraries that people already had; requiring new APIs or new library behavior was dangerous. If you wanted the new version of your library to be used widely, it had better be compatible with old programs using the old API, and if you wanted a brand new library to be used by people in programs, it had better demonstrate that it was going to be stable.

Much of this spilled over into other languages like Perl and Python. Although both of these developed central package repositories and dependency management schemes, for a long time these mostly worked globally, just like the C library and header ecology, and so they faced similar pressures. Python only added fully supported virtual environments in 2012, for example (in Python 3.3).

Modern languages like Go and Rust (and the Node.js/NPM ecosystem, and modern Python venv based operation) don't work like that. Modern languages mostly use static linking instead of shared libraries (or the equivalent of static linking for dynamic languages, such as Python venvs), and they have build systems that explicitly support automatically fetching and using specific versions of dependencies (or version ranges; most build systems are optimistic about forward compatibility). This has created an ecology where it's much easier to use a recent version of something than it was in C, and where API changes in dependencies often have much less effect because it's much easier (and sometimes even the default) to build old programs with old dependency versions.

(In some languages this has resulted in a lot of programs and packages implicitly requiring relatively recent versions of their dependencies, even if they don't say so and claim wide backward compatibility. This happens because people would have to take explicit steps to test with their stated minimum version requirements and often people don't, with predictable results. Go is an exception here because of its choice of 'minimum version selection' for dependencies over 'maximum version selection', but even then it's easy to drift into using new language features or new standard library APIs without specifically requiring that version of Go.)

One of the things about technology is that technology absolutely affects social issues, so different technology creates different social expectations. I think that's what's happened with social expectations around modern languages. Because they have standard build systems that make it easy to do it, people feel free to have their programs require specific version ranges of dependencies (modern as well as old), and package authors feel free to break things and then maybe fix them later, because programs can opt in or not and aren't stuck with the package's choices for a particular version. There are still forces pushing towards compatibility, but they're weaker than they used to be and more often violated.

Or to put it another way, there was a social contract of sorts for C libraries in the old days but the social contract was a consequence of the restrictions of the technology. When the technology changed, the 'social contract' also changed, with unfortunate effects at scale, which most developers don't care about (most developers aren't operating at scale, they're scratching their own itch). The new technology and the new social expectations are probably better for the developers of programs, who can now easily use new features of dependencies (or alternately not have to update their code to the latest upstream whims), and for the developers of libraries and packages, who can change things more easily and who generally see their new work being used faster than before.

(In one perspective, the entire 'semantic versioning' movement is a reaction to developers not following the expected compatibility that semver people want. If developers were already doing semver, there would be no need for a movement for it; the semver movement exists precisely because people weren't. We didn't have a 'semver' movement for C libraries in the 1990s because no one needed to ask for it, it simply happened.)

Sometimes print-based debugging is your only choice

By: cks
20 January 2025 at 04:20

Recently I had to investigate a mysterious issue in our Django based Python web application. This issue happened only when the application was actually running as part of the web server (using mod_wsgi, which effectively runs as an Apache process). The only particularly feasible way to dig into what was going on was everyone's stand-by, print based debugging (because I could print into Apache's error log; I could have used any form of logging that would surface the information). Even if I might have somehow been able to attach a debugger to things to debug an HTTP request in flight, using print based debugging was a lot easier and faster in practice.

I'm a long time fan of print based debugging. Sometimes this is because print based debugging is easier if you only dip into a language every so often, but that points to a deeper issue, which is that almost every environment can print or log. Print or log based 'debugging' is an almost universal way to extract information from a system, and sometimes you have no other practical way to do that.

(The low level programming people sometimes can't even print things out, but there are other very basic ways to communicate things.)

As in my example, one of the general cases where you have very little access other than logs is when your issue only shows up in some sort of isolated or encapsulated environment (a 'production' environment). We have a lot of ways of isolating things these days, things like daemon processes, containers, 'cattle' (virtual) servers, and so on, but they all share the common trait that they deliberately detach themselves away from you. There are good reasons for this (which often can be boiled down to wanting to run in a controlled and repeatable environment), but it has its downsides.

Should print based debugging be the first thing you reach for? Maybe not; some sorts of bugs cause me to reach for a debugger, and in general if you're a regular user of your chosen debugger you can probably get a lot of information with it quite easily, easier than sprinkling print statements all over. But I think that you probably should build up some print debugging capabilities, because sooner or later you'll probably need them.

Realizing why Go reflection restricts what struct fields can be modified

By: cks
10 January 2025 at 04:19

Recently I read Rust, reflection and access rules. Among other things, it describes how a hypothetical Rust reflection system couldn't safely allow access to private fields of things, and especially how it couldn't allow code to set them through reflection. My short paraphrase of the article's discussion is that in Rust, private fields can be in use as part of invariants that allow unsafe operations to be done safely through suitable public APIs. This brought into clarity what had previously been a somewhat odd seeming restriction in Go's reflect package.

Famously (for people who've dabbled in reflect), you can only set exported struct fields. This is covered in both the Value.CanSet() package documentation and The Laws of Reflection (in passing). One of the uses of reflection is going between JSON and structs, and because of this restriction encoding/json only works on exported struct fields, so you'll find a lot of such fields in lots of code. This requirement can be a bit annoying. Wouldn't it be nice if you didn't have to make your fields public just to serialize them easily?

(You can use encoding/json and still serialize non-exported struct fields, but you have to write some custom methods instead of just marking struct fields the way you could if they were exported.)

Go has this reflect restriction, presumably, for the same reason that reflection in Rust wouldn't be able to modify private fields. Since private fields in a Go struct may be used by functions and methods in the package to properly manage the struct, modifying those fields yourself is unsafe (in the general sense). The reflect package will let you see the fields (and their values) but not change their values. You're allowed to change exported fields because (in theory) arbitrary Go code can already change the value of those fields, and so code in the struct's package can't count on them having any particular value. It can at least sort of count on private fields having approved values (or the zero value, I believe).
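Here's a quick demonstration of the restriction; the struct and its fields are made up for illustration:

package main

import (
  "fmt"
  "reflect"
)

type item struct {
  Name   string // exported: reflection is allowed to set this
  secret int    // unexported: reflection can read it but not set it
}

func main() {
  it := item{Name: "a", secret: 1}
  v := reflect.ValueOf(&it).Elem()

  fmt.Println(v.Field(0).CanSet()) // true
  fmt.Println(v.Field(1).CanSet()) // false

  v.Field(0).SetString("b")     // fine
  fmt.Println(v.Field(1).Int()) // reading is allowed: prints 1
  // v.Field(1).SetInt(2)       // would panic: unexported field
}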

(I understand why the reflect documentation doesn't explain the logic of not being able to modify private fields, since package documentation isn't necessarily the right place for a rationale. Also, perhaps it was considered obvious.)

Good union types in Go would probably need types without a zero value

By: cks
3 December 2024 at 04:00

One of the classic big reasons to want union types in Go is so that one can implement the general pattern of an option type, in order to force people to deal explicitly with null values. Except this is not quite true on either side. The compiler can already enforce null value checks before use, and union and option types by themselves don't fully protect you against null values. Much like people ignore error returns (and the Go compiler allows this), people can skip over the fact that they can't extract an underlying value from their Result value and just return a zero value from their 'get a result' function.

My view is that the power of option types is what they do in the rest of the language, but they can only do this if you can express their guarantees in the type system. The important thing you need for this is non-nullable types. This is what lets you guarantee that something is a proper value extracted from an error-free Result or whatever. If you can't express this in your types, everyone has to check, one way or another, or you risk a null sneaking in.

Go doesn't currently have a type concept for 'something that can't be null', or for that matter a concept that is exactly 'null'. The closest Go equivalent is the general idea of zero values, of which nil pointers (and nil interfaces) are a special case (but you can also have zero value maps and channels, which also have special semantics; the zero value of slices is more normal). If you want to make Result and similar types particularly useful in Go, I believe that you need to change this, somehow introducing types that don't have a zero value.

(Such types would likely be a variation of existing types with zero values, and presumably you could only use values or assign to variables of that type if the compiler could prove that what you were using or assigning wasn't a zero value.)

As noted in a comment by loreb on my entry on how union types would be complicated, these 'union' or 'sum' types in Go also run into issues with their zero value, and as Ian Lance Taylor's issue comment says, zero values are built quite deeply into Go. You can define semantics for union types that allow zero values, but I don't think they're really particularly useful for anything except cramming some data structures into a few less bytes in a somewhat opaque way, and I'm not sure that's something Go should be caring about.

Given that zero values are a deep part of Go and the Go developers don't seem particularly interested in trying to change this, I doubt that we're ever going to get the powerful form of union types in Go. If anything like union types appears, it will probably be merely to save memory, and even then union types are complicated in Go's runtime.

Sidebar: the simple zero value allowed union type semantics

If you allow union types to have a zero value, the obvious meaning of a zero value is something that can't have a value of any type successfully extracted from it. If you try the union type equivalent of a type assertion you get a zero value and 'false' for all possible options. Of course this completely gives up on the 'no zero value' type side of things, but at least you have a meaning.

This makes a zero value union very similar to a nil interface, which will also fail all type assertions. At this point my feeling is that Go might as well stick with interfaces and not attempt to provide union types.

Union types ('enum types') would be complicated in Go

By: cks
2 December 2024 at 04:31

Every so often, people wish that Go had enough features to build some equivalent of Rust's Result type or Option type, often so that Go programmers could have more ergonomic error handling. One core requirement for this is what Rust calls an Enum and what is broadly known as a Union type. Unfortunately, doing a real enum or union type in Go is not particularly simple, and it definitely requires significant support by the Go compiler and the runtime.

At one level we can easily do something that looks like a Result type in Go, especially now that we have generics. You make a generic struct that has private fields for an error, a value of type T, and a flag that says which one is valid, and then give it some methods to set and get values and to ask which it currently contains. If you ask for the sort of value that's not currently valid, it panics. However, this struct necessarily has space for all three fields, whereas Rust enums (and union types generally) act more like C unions, only needing space for the largest type possible in them plus sometimes a marker of what type is in the union right now.

(The Rust compiler plays all sorts of clever tricks to elide the enum marker if it can store this information in some other way.)
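Here's a minimal sketch of that sort of hand-built Result type; the names and API are invented for illustration, and note that it carries the error, the value, and the flag side by side rather than overlapping them:

// Result holds either a value of type T or an error, with a flag
// saying which one is currently valid. All three fields always exist.
type Result[T any] struct {
  value T
  err   error
  isErr bool
}

func Ok[T any](v T) Result[T] { return Result[T]{value: v} }

func Err[T any](err error) Result[T] { return Result[T]{err: err, isErr: true} }

// IsErr reports which of the two the Result currently contains.
func (r Result[T]) IsErr() bool { return r.isErr }

// Value panics if the Result actually holds an error.
func (r Result[T]) Value() T {
  if r.isErr {
    panic("Result holds an error, not a value")
  }
  return r.value
}

// Err panics if the Result actually holds a value.
func (r Result[T]) Err() error {
  if !r.isErr {
    panic("Result holds a value, not an error")
  }
  return r.err
}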

To understand why we need deep compiler and runtime support, let's ask why we can't implement such a union type today using Go's unsafe package to perform suitable manipulation of a suitable memory region. Because it will make the discussion easier, let's say that we're on a 64-bit platform and our made up Result type will contain either an error (which is an interface value) or a [2]int64 array. On a 64-bit platform, both of these types occupy 16 bytes, since an interface value is two pointers in a trenchcoat, so it looks like we should be able to use the same suitably-aligned 16-byte memory area for each of them.

However, now imagine that Go is performing garbage collection. How does the Go runtime know whether or not our 16-byte memory area contains two live pointers, which it must follow as part of garbage collection, or two 64-bit integers, which it definitely cannot treat as pointers and follow? If we've implemented our Result type outside of the compiler and runtime, the answer is that garbage collection has no idea which it currently is. In the Go garbage collector, it's not values that have types, but storage locations, and Go doesn't provide an API for changing the type of a storage location.

(Internally the runtime can set and change information about what pieces of memory contain pointers, but this is not exposed to the outside world; it's part of the deep integration of runtime memory allocation and the runtime garbage collector.)

In Go, without support from the runtime and the compiler the best you can do is store an interface value or perhaps an unsafe.Pointer to the actual value involved. However, this probably forces a separate heap allocation for the value, which is less efficient in several ways than the compiler-supported version that Rust has. On the positive side, if you store an interface value you don't need to have any marker for what's stored in your Result type, since you can always extract that from the interface with a suitable type assertion.

The corollary to all of this is that adding union types to Go as a language feature wouldn't be merely a modest change in the compiler. It would also require a bunch of work in how such types interact with garbage collection, Go's memory allocation systems (which in the normal Go toolchain allocate things with pointers into separate memory arenas than things without them), and likely other places in the runtime.

(I suspect that Go is pretty unlikely to add union types given this, since you can have much of the API that union types present with interface types and generics. And in my view, union types like Result wouldn't be really useful without other changes to Go's type system, although that's another entry.)

PS: Something like this has come up before in generic type sets.

Two API styles of doing special things involving text in UIs

By: cks
20 November 2024 at 04:43

A lot of programs (or applications) that have a 'user interface' mostly don't have a strongly graphical one; instead, they mostly have text, although with special presentation (fonts, colours, underlines, etc) and perhaps controls and meaning attached to interacting with it (including things like buttons that are rendered as text with a border around it). All of these are not just plain text, so programs have to create and manipulate all of them through some API or collection of APIs. Over time, there have sprung up at least two styles of APIs, which I will call external and inline, after how they approach the problem.

The external style API is the older of the two. In the external API, the program makes distinct API calls to do anything other than plain text (well, it makes API calls for plain text, but you have to do something there). If you want to make some text italic or underlined, you have a special API call (or perhaps you modify the context of a 'display this text' API). If you want to attach special actions to things like clicking on a piece of text or hovering the mouse pointer over it, again, more API calls. This leads to programs that make a lot of API calls in their code and are very explicit about what they're doing in their UI. Sometimes this is bundled together with a layout model in the API, where the underlying UI library will flexibly lay out a set of controls so that they accommodate your variously sized and styled text, your buttons, your dividers, and so on.

In the inline style API, you primarily communicate all of this by passing in text that is in some way marked up, instead of plain text that is rendered literally. One form of such inline markup is HTML (and it is popularly styled by CSS). However, there have been other forms, such as XML markup, and even with HTML, you and the UI library will cooperate to attach special meanings and actions to various DOM nodes. Inline style APIs are less efficient at runtime because they have to parse the text you pass in to determine all of this, instead of your program telling the UI library directly through API calls. At the same time, inline style APIs are quite popular at a number of levels. For example, it's popular in UI toolkits to use textual formats to describe your program's UI layout (sometimes this is then compiled into a direct form of UI API calls, and sometimes you hand the textual version to the UI library for it to interpret).

Despite it being potentially less efficient at runtime, my impression is that plenty of programmers prefer the inline style to the external style for text focused applications, where styled text and text based controls are almost all of the UI. My belief is also that an inline style API is probably what's needed for an attractive text focused programming environment.

The missing text focused programming environment

By: cks
17 November 2024 at 03:59

On the Fediverse, I had a hot take:

Hot take: the enduring popularity of writing applications in a list of environments that starts with Emacs Lisp and goes on to encompass things like Electron shows that we've persistently failed to create a good high level programming system for writing text-focused applications.

(Plan 9's Acme had some good ideas but it never caught on, partly because Plan 9 didn't.)

(By 'text focused' here I mean things that want primarily to display text and have some controls and user interface elements; this is somewhat of a superset of 'TUI' ideas.)

People famously have written a variety of what are effectively applications inside GNU Emacs; there are multiple mail readers, the Magit Git client, at least one news reader, at least one syndication feed reader, and so on. Some of this might be explained by the 'I want to do everything in GNU Emacs' crowd writing things to scratch their itch even if the result is merely functional enough, but several of these applications are best in class, such as Magit (among the best Git clients as far as I know) and MH-E (the best NMH based mail reading environment, although there isn't much competition, and a pretty good Unix mail reading environment in general). Many of these applications could in theory be stand alone programs, but instead they've been written in GNU Emacs Lisp to run inside an editor even if they don't have much to do with Emacs in any regular sense.

(In GNU Emacs, many of these applications extensively rebind regular keys to effectively create their own set of keyboard commands that have nothing to do with how regular Emacs behaves. They sometimes still do take advantage of regular Emacs key bindings for things like making selections, jumping to the start and end of displayed text, or searching.)

A similar thing goes on with Electron-based applications, a fair number of which are fairly text-focused things (especially if you extend text focused things to cover emojis, a certain amount of images, and so on). For a prominent example, VSCode is a GUI text editor and IDE, so much of what it deals with is text, although sometimes somewhat fancied up text (with colours, font choices, various line markings, and so on).

On the Internet, you can find a certain amount of people mocking these applications for the heavy-weight things that they use as host environments. It's my hot take that this is an unproductive and backward view. Programmers don't necessarily like using such big, complex host environments and turn to them by preference; instead, that they turn to them shows that we've collectively failed to create better, more attractive alternatives.

It's possible that this use of heavy weight environments is partly because parts of what modern applications want and need to do are intrinsically complex. For example, a lot of text focused applications want to lay out text in somewhat complex, HTML-like ways and also provide the ability to have interactive controls attached to various text elements. Some of them need to handle and render actual HTML. Using an environment like GNU Emacs or Electron gets you a lot of support for this right away (effectively you get a lot of standard libraries to make use of), and that support is itself complex to implement (so the standard libraries are substantial).

However, I also think we're lacking text focused environments for smaller scale programs, the equivalent of shell scripts or BASIC programs. There have been some past efforts toward things that could be used for this, such as Acme and Tcl/Tk, but they didn't catch on for various reasons.

(At this point I think any viable version of this probably needs to be based around HTML and CSS, although hopefully we don't need a full sized browser rendering engine for it, and I certainly hope we can use a different language than JavaScript. Not necessarily because JavaScript is a bad language or reasonably performing JavaScript engines are themselves big, but partly because using JavaScript raises expectations about the API surface, the performance, the features, and so on, all of which push toward a big environment.)

Implementing some Git aliases indirectly, in shell scripts

By: cks
14 November 2024 at 04:10

Recently I wrote about two ways to (maybe) skip 'Dependabot' commits when using git log, and said at the end that I was probably going to set up Git aliases for both approaches. I've now both done that and failed to do that, at the same time. While I have Git aliases for both approaches, the actual git aliases just shell out to shell scripts.

The simpler and more frustrating case is for only seeing authors that aren't Dependabot:

git log --perl-regexp --author='^((?!dependabot\[bot]).*)$'

This looks like it should be straightforward as an alias, but I was unable to get the alias quoting right in my .gitconfig. No matter what I did it either produced syntax errors from Git or didn't work. So I punted by putting the 'git log ...' bit in a shell script (where I can definitely understand the quoting requirements and get them right) and making the actual alias be in the magic git-config format that runs an external program:

[alias]
  ....
  ndlog = !gitndeplog

The reason this case works as a simple alias is that all of the arguments I'd supply (such as a commit range) come after the initial arguments to 'git log'. This isn't the case for the second approach, with attempts to exclude go.mod and go.sum from file paths:

git log -- ':!:go.mod' ':!:go.sum'

The moment I started thinking about how to use this alias, I realized that I'd sometimes want to supply a range of commits (for example, because I just did a 'git pull' and want to see what the newly pulled commits changed). This range has to go in the middle of the command line, which means that a Git alias doesn't really work. And sometimes I might want to supply additional 'git log' switches, like '-p', or maybe supply a file or path (okay, probably I'll never do that). There are probably some sophisticated ways to make this work as an alias, especially if I assume that all of the arguments I supply will go before the '--', but the simple approach was to write a shell script that did the argument handling and invoke it via an alias in the same way as 'git ndlog' does.

Right now the scripts are named in a terse way as if I might want to run them by hand someday, but I should probably rename them both to 'git-<something>'. In practice I'm probably always going to run them as 'git ...', and a git-<something> name makes it clearer what's going on, and easier to find by command completion in my shell if I forget.

Maybe skipping 'Dependabot' commits when using 'git log'

By: cks
9 November 2024 at 04:15

I follow a number of projects written in Go that are hosted on Github. Many of these projects enable Github's "Dependabot" feature (also). This use of Dependabot, coupled with the overall Go ecology's habit of relatively frequent small updates to packages, creates a constant stream of Dependabot commits that update the project's go.mod and go.sum files with small version updates of some dependency, sometimes intermixed with people merging those commits (for example, the Cloudflare eBPF Prometheus exporter).

As someone who reads the commit logs of these repositories to stay on top of significant changes, these Dependabot dependency version bumps are uninteresting to me and, like any noise, they make it harder to see what I'm interested in (and more likely that I'll accidentally miss a commit I want to read about that's stuck between two Dependabot updates I'm skipping with my eyes glazed over). What I'd like to be able to do is to exclude these commits from what 'git log' or some equivalent is showing me.

There are two broad approaches. The straightforward and more or less workable approach is to exclude commits from specific authors, as covered in this Stack Overflow question and answer:

git log --perl-regexp --author='^((?!dependabot\[bot]).*)$'

However, this doesn't exclude the commits of people merging these Dependabot commits into the repository, which happens in some (but not all) of the repositories I track. A better approach would be to get 'git log' to ignore all commits that don't change anything other than go.mod and go.sum. I don't think Git can quite do this, at least not without side effects, but we can get close with some pathspecs:

git log -- ':!:go.mod' ':!:go.sum'

(I think this might want to be '!/' for full correctness instead of just '!'.)

For using plain 'git log', this is okay, but it has the side effect that if you use, eg, 'git log -p' to see the changes, any changes a listed commit makes to go.mod or go.sum will be excluded.

The approach of excluding paths can be broadened beyond go.mod and go.sum to include things like commits that update various administrative files, such as things that control various automated continuous integration actions. In repositories with a lot of churn and updates to these, this could be useful; I care even less about a project's use of CI infrastructure than I care about their Dependabot go.mod and go.sum updates.

(I suspect I'll set up Git aliases for both approaches, since they each have their own virtues.)

Quoting and not quoting command substitution in the Bourne shell

By: cks
22 October 2024 at 02:49

Over on the Fediverse, I said something:

Bourne shell trivia of the day:
  var=$(program ...)
is the same as
  var="$(program ...)"
so the quotes are unnecessary.

But:
  program2 $(program ...)
is not the same as:
  program2 "$(program ..)"
and often the quotes are vital.

(I have been writing the variable assignment as var="$(...)" for ages without realizing that the quotes were unnecessary.)

This came about because I ran an old shell script through shellcheck, which recommended replacing its use of var=`...` with var=$(...), and then I got to wondering why shellcheck wasn't telling me to write the second as var="$(...)" for safety against multi-word expansions. The answer is of course that multi-word expansion doesn't happen in this context; even if the $(...) produces what would normally be multiple words of output, they're all assigned to 'var' as a single word.

On the one hand, this is what you want; there's almost no circumstance where you want a command that produces multiple words of output to have the first word assigned to 'var' and then the rest interpreted as a command and its arguments. On the other hand, the Bourne shell is generally not known for being friendly about its quoting. It would be perfectly in character for the Bourne shell to require you to quote the '$(...)' even in variable assignment.

On the one hand, shellcheck doesn't complain about the quoted version and it's consistent with quoting $(...) in other circumstances (when it really does matter). On the other hand, you can easily forget or not know (as I did) that the quoting is unnecessary here, and then you can be alarmed when you see an unquoted 'var=$(...)' in the wild or have it suggested. Since I've mostly written the quoted version, I'll probably continue doing so in my scripts unless I'm dealing with a script that already has some unquoted examples, where I should probably make everything unquoted so that no one reading the script in the future ever thinks there's a difference between the two.

The Go module proxy and forcing Go to actually update module versions

By: cks
19 October 2024 at 03:05

Suppose, not hypothetically, that you have two modules, such as a program and a general module that it uses. Through working on the program, you realize that there are some bugs in the general module, so you fix them and then test them in the program by temporarily using a replace directive, or perhaps a workspace. Eventually you're satisfied with the changes to your module, so you commit them and push the change to the public repository. Now you want to update your program's go.mod to use the module version you've just pushed.

As lots of instructions will tell you, this is straightforward; you want some version of 'go get -u', perhaps 'go get -u .'. However, if you try this immediately, you may discover that Go is not updating the module's version. Nothing you do, not even removing the module from 'go.mod' and then go-get'ing it again, will make Go budge. As far as Go seems to be concerned, your module has not updated and the only available version is the previous one.

(It's possible that 'go get -u <module>@latest' will work here, I didn't think to try it when this happened to me.)

As far as I can tell, what is going on here is the Go module proxy. By default, 'go get' will consult the (public) Go module proxy, and the Go module proxy can have a delay between when you push an update to the public repositories and when the module proxy sees it. I assume that under the hood there's various sorts of rate limiting and other caching, since I expect neither the Go proxy nor the various forges out there want the Go proxy to query forges on every single request just in case an infrequently updated module has been updated this time around.

The blunt hammer way of defeating this is to force 'go get -u' to not use the Go module proxy, with 'GOPROXY=direct go get -u'. This will force Go to directly query the public source and so make it notice your just-pushed update.

PS: If you tagged a new version I believe you can hand edit your go.mod to have the new version. This is more difficult if your module is not officially released, has no version tags, and is using the 'v0.0.0-<git information>' format in go.mod.

PPS: Possibly there is another way you're supposed to do this. If so, it doesn't seem to be well documented.

Go's new small language features from 1.21 and 1.22 are nice

By: cks
3 October 2024 at 01:42

Recently I was writing some Go code involving goroutines. After I was done, I realized that I had used some new small language features added in Go 1.21 and Go 1.22, without really thinking about it, despite not having paid much attention when the features were added. Specifically, what I used are the new builtins of max() and min(), and 'range over integers' (and also a use of clear(), but only in passing).

Ranging over integers may have sounded a bit silly to me when I first read about it, but it turns out that there is one situation where it's a natural idiom, and that's spawning a certain number of goroutines:

for range min(maxpar, len(args)) {
   wg.Add(1)
   go func() {
     resolver()
     wg.Done()
   }()
}

Before Go 1.21, I would have wound up writing this as:

for i := 0; i < maxpar; i++ {
  [...]
}

I wouldn't have bothered writing and using the function equivalent of min(), because it wouldn't be worth the extra hassle for my small scale usage, so I'd always have started maxpar goroutines even if some of them would wind up doing nothing.

The new max() and min() builtins aren't anything earthshaking, and you could do them as generic functions, but they're a nice little ergonomic improvement in Go. Ranging over integers is something you could always do but it's more compact now and it's nice to directly see what the loop is doing (and also that I'm not actually using the index variable for anything in the loop).

(The clear() builtin is nice, but it also has a good reason for existing. I was only using it on a slice, though, where you can fully duplicate its effects.)
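As a small aside, here's a sketch of that slice case next to the hand-written pre-1.21 equivalent:

package main

import "fmt"

func main() {
  s := []int{1, 2, 3}
  clear(s)       // zeroes every element; length and capacity are unchanged
  fmt.Println(s) // prints: [0 0 0]

  // The pre-1.21 hand-written equivalent for a slice:
  for i := range s {
    s[i] = 0
  }

  // For a map, clear() deletes all entries, which previously took a
  // delete() loop or allocating a fresh map.
  m := map[string]int{"a": 1}
  clear(m)
  fmt.Println(len(m)) // prints: 0
}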

Go doesn't strictly need max(), min(), and range over integers (although the latter is obviously connected to ranging over functions, which is important for putting user container types closer to par with builtin ones). But adding them makes it nicer, and they're small (although growing the language and its builtins does have a quiet cost), and Go has never presented itself as a mathematically minimal language.

(Go will have to draw the line somewhere, because there are a lot of little conveniences that could be added to the language. But the Go team is generally conservative and they're broadly in a position to not do things, so I expect it to be okay.)

Go and my realization about what I'll call the 'Promises' pattern

By: cks
25 September 2024 at 03:23

Over on the Fediverse, I had a belated realization:

This is my face when I realize I have a situation that 'promises'/asynchronously waitable objects would be great for, but I would have to build them by hand in Go. Oh well.

(I want asynchronous execution but to report the results in order, as each becomes available. With promises as I understand them, generate all the promises in an array, wait for each one in order, report results from it, done.)

A common pattern with work(er) pools in Go and elsewhere is that you want to submit requests to a pool of asynchronous workers and you're happy to handle the completion of that work in any order. This is easily handled in Go with a pair of channels, one for requests and the other for completions. However, this time around I wanted asynchronous requests but to be able to report on completed work in order.

(The specific context is that I've got a little Go program to do IP to name DNS lookups (it's in Go for reasons), and on the one hand it would be handy to do several DNS lookups in parallel because sometimes they take a while, but on the other hand I want to print the results in command line order because otherwise it gets confusing.)

In an environment with 'promises' or some equivalent, asynchronous work with ordered reporting of completion is relatively straightforward. You submit all the work and get an ordered collection of Promises or the equivalent, and then you go through in order harvesting results from each Promise in turn. In Go, I think there are two plausible alternatives; you can use a single common channel for results but put ordering information in them, or you can use a separate reply channel for each request. Having done scratch implementations of both, my conclusion is that the separate reply channel version is simpler for me (and in the future I'm not going to be scared off by thoughts of how many channels it can create).

For the common reply channel version, your requests must include a sequence number and then the replies from the workers will also include that sequence number. You'll receive the replies in some random sequence and then it's on you to reassemble them into order. If you want to start processing replies in order before everything has completed, you have to do additional work (you may want, for example, a container/heap).

For the separate reply channel version, you'll be creating a lot of channels (one per request) and passing them to workers as part of the request; remember to give them a one element buffer size, so that workers never block when they 'complete' each request and send the answer down the request's reply channel. However, handling completed requests in order is simple once you've accumulated a block of them:

var replies []chan ...
for _, req := range worktodo {
  // 'pool' is your worker pool
  replies = append(replies, pool.submit(req))
}

for i := range replies {
  v := <-replies[i]
  // process v
}

If a worker has not yet finished processing request number X when you get to trying to use the reply, you simply block on the channel read. If the worker has already finished, it will have sent the reply into the (buffered, remember) channel and moved on, and the reply is ready for you to pick up immediately.

(In both versions, if you have a lot of things to process, you probably want to handle them in blocks, submitting and then draining N items, repeating until you've handled all items. I think this is probably easier to do in the separate reply channel version, although I haven't implemented it yet.)
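To round this out, here's a compact, self-contained sketch of the separate reply channel version; the worker pool, the pretend 'lookup' work, and all of the names are invented for illustration rather than taken from my actual program (it uses Go 1.22's range over integers):

package main

import "fmt"

func main() {
  hosts := []string{"a.example.org", "b.example.org", "c.example.org"}

  // A trivial worker pool: each worker pulls closures off 'work' and
  // runs them until the channel is closed.
  work := make(chan func())
  for range min(3, len(hosts)) {
    go func() {
      for job := range work {
        job()
      }
    }()
  }

  // Submit every request, collecting the per-request reply channels
  // in submission order. Each reply channel has a one element buffer
  // so a worker never blocks when it sends its answer.
  var replies []chan string
  for _, h := range hosts {
    ch := make(chan string, 1)
    replies = append(replies, ch)
    work <- func() { ch <- "pretend lookup of " + h }
  }
  close(work)

  // Harvest the results in order; we only block on requests that the
  // workers haven't finished yet.
  for i, ch := range replies {
    fmt.Println(hosts[i], "->", <-ch)
  }
}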

Open source maintainers with little time and changes

By: cks
19 September 2024 at 03:02

'Unmaintained' open source code represents a huge amount of value, value that shouldn't and can't be summarily ignored when considering issues like language backward compatibility. Some of that code is more or less unmaintained, but some of it is maintained by people spending a bit of time working on things to keep projects going. It is perhaps tempting to say that such semi-maintained projects should deal with language updates and so on. I maintain that this is wrong.

These people keeping the lights on in these projects often have limited amounts of time that they either can or will spend on their projects. They don't owe the C standard or anyone else any amount of that time, not even if the C standard people think it should be small and insignificant and easy. Backward incompatible changes imposed from outside (in anything) that force these people to spend their limited time just keeping up (or force them to spend more time) are, at the least, kind of rude.

(Such changes are also potentially ineffective or dangerous, in that they push people towards not updating at all and locking themselves to old compilers, old compiler settings, old library and package versions, and so on. Or abandoning the project entirely because it's too much work.)

Of course this applies to more than just backward incompatible language changes; especially it applies to API changes. Both language and API changes force project maintainers into a Red Queen's Race, where their effort doesn't improve their project, it just keeps it working. Does this mean that you can never change languages or APIs in ways that break backward compatibility? Obviously not, but it does mean that you should make sure that the change is worth the cost, and the more used your language or API is, the higher the cost. C is an extremely widely used language, so the cost of any break with backward compatibility in it (including in the C standard library) is quite high.

The corollary of this for maintainers is that if you want your project to not require much of your time, you can't depend on APIs that are prone to backward incompatible changes. Unfortunately this may limit the features you can provide or the languages that you want to use (depending not just on the rate of change in the language itself but also in the libraries that the language will force you to use).

(For example, as a pragmatic thing I would rather write a low maintenance TLS using program in Go than in anything else right now, because the Go TLS package is part of the core Go library and is covered by the Go 1 compatibility guarantee. C and C++ may be pretty stable languages and less likely to change than Go, but OpenSSL's API is not.)

Mercurial's extdiff extension and reporting filenames in diffs

By: cks
31 August 2024 at 02:49

We have a long standing Mercurial 'alias' (it's not an alias in the Git sense) called 'hg sdiff' that provides diffs in non-context form, because for system administrator usage we're often changing things where the context of standard (context) diffs isn't useful and we want the terseness of standard diffs. For a long time we've had a little irritation, where if you changed only one file in a Mercurial repository 'hg sdiff' wouldn't show you the file name, but if you changed several files, 'hg sdiff' would. Today I dug into what was going on and it is more peculiar than I expected.

Mercurial has no native 'hg diff' option to do non-context diffs, so we actually do this through the standard Extdiff extension, which allows you to use external programs to provide diffs. We configure our 'sdiff' custom extdiff command to run 'diff -Nr', which on our Ubuntu machines uses GNU diffutils. However, GNU Diff has no command line option to print filenames. Nor are the file names printed by the code of the Extdiff extension itself when it runs your external diff command.
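
For illustration, a minimal extdiff setup along these lines looks something like the following in an hgrc (a sketch, not our exact configuration):

[extensions]
extdiff =

[extdiff]
cmd.sdiff = diff
opts.sdiff = -Nr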

The clue to what is going on is in the '-r' argument to diff, or alternately this sentence in the Extdiff documentation:

The external diff programs are called with a configurable set of options and two non-option arguments: paths to directories containing snapshots of files to compare, or paths to the files themselves if only one file is modified.

If we run 'hg sdiff' and only one file has been changed in the repository, Extdiff will invoke 'diff -Nr file1.old file1' (to simplify the arguments a bit), and diff itself won't print any filenames. However, if we run 'hg sdiff' and we've changed two or more files, Extdiff will create two directories and invoke 'diff -Nr directory1 directory2', and then since it's running in recursive mode, diff will print the filenames along with the changes in the files.

(I believe things may be slightly more complex if you run Extdiff to compare two revisions, instead of comparing a revision to the working tree, but even then our 'hg sdiff' doesn't print the file name for single-file changes.)

As far as I can tell there's no Extdiff option to always create directories and do recursive diffs, which would do what we want here. Extdiff does have the '--per-file' option to do the reverse and always pass your external program two files. One could write a cover script for Extdiff usage that detects the two-files case and prints filenames appropriately, but we're going to just continue to live with the situation.

(We're not interested in switching to zero-context unified or context diffs, both of which would print the filenames but be more verbose and potentially confusing.)

It's not simple to add function keyword arguments to Go

By: cks
20 August 2024 at 02:30

I recently read An unordered list of things I miss in Go (via). One of those things is 'keyword and default arguments for functions', which in the article was not labeled as requiring a major revision to the language. In one sense this is true, but in another sense it's not, because adding keyword arguments to any compiled language raises ABI questions. This is especially the case in Go, which is generally supposed to be a low-overhead language in terms of how things are stored in memory and passed around in function calls (and Go went through an ABI change not too long ago to enable passing function arguments in registers instead of on the stack).

(I think that keyword arguments with default values don't really raise general API issues, assuming that keyword arguments can only be passed as keywords.)

If Go wants keyword arguments to be useful for small functions that will be called often and should be efficient, such as the article's example of strings.Replace() making the final 'n' argument be a keyword argument with a default value, then calling such a function needs to be roughly as efficient as calling a version without a keyword argument. This implies that such arguments should be passed more or less like non-keyword arguments are today, rather than assembled into a separate mechanism that would necessarily have more overhead; basically you'd treat them as regular arguments that can be specified by name instead of position.
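
(As an aside on the example itself, today Go handles the common 'replace everything' case not with a default argument but with a separate convenience function. A quick illustration, assuming fmt and strings are imported:

// strings.Replace takes an explicit count; -1 means 'replace all',
// and strings.ReplaceAll is the convenience wrapper for that case.
fmt.Println(strings.Replace("a,b,c", ",", ";", 1))  // a;b,c
fmt.Println(strings.ReplaceAll("a,b,c", ",", ";"))  // a;b;c

A keyword argument with a default of -1 would fold these into one function.)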

Making default values efficient is tricky and has implications for what sort of default values are allowed. If default values must be compile time constants, every call site can add them in for unspecified keyword arguments. If they can be values only established at runtime (as the initial value of variables can be), then you need some scheme to record these values and then either fetch them at the call sites or pass information on what keyword arguments need default arguments to the function being called. If default values are only determined at the time the function is called, you must do the last, but this will probably be the least efficient option.

All of these choices have implications for ABI stability, which affects what Go shared libraries can be used for and how. For instance, people using shared libraries for Go packages would probably like it if adding a new keyword argument to some function did not break existing compiled code that was calling that function. But it certainly would be simpler if all code had to be compiled together with exactly current information and there was no shared library ABI compatibility of the form that is common for C shared libraries.

(In C, adding an extra argument to a function is an API break, but as mentioned, this isn't necessarily true for adding a keyword argument with a default value. If it's not an API break, it would be convenient if it's not an ABI break either, but that's challenging, especially if you want calls to stay efficient even with keyword arguments.)

Generally, it would be best if the (hypothetical) Go language specification for function keyword arguments didn't preclude some of these options, even if the main Go compiler and toolchain was not going to use them. Go already has several implementations, so someday there might be an implementation that values C-like ABI stability. Nor do you want to preclude efficient implementations with no such ABI compatibility, although some of the choices for things like what default argument values are allowed affect that.

(Python doesn't have problems with all of this because it's not trying to be Go's kind of language. Python can define abstract semantics without worrying about efficient implementation or whether some of the language semantics require various inefficiencies. And even then, Python has well known traps in default argument values, among other issues.)

A downside or two of function keyword arguments (and default values)

By: cks
18 August 2024 at 19:28

Recently I read An unordered list of things I miss in Go (via). One of the things is 'keyword and default arguments for functions', to which I had a reaction:

Hot take: having keyword arguments and optional arguments (and default values) for function calls in your language encourages people to make functions that take too many arguments.

(I can see why Python did it, because on the one hand class/object construction and on the other hand, in a dynamic language it lets you change a function's API without having to hunt down absolutely everyone who calls it and now doesn't have that extra argument. Just make the new argument a keyword one, done.)

Technically you can have keyword arguments without supporting optional arguments or default values, but I don't think very many languages do. The languages I've seen with this generally tend to make keyword arguments optional and then sometimes let you set a default value if no value is supplied (otherwise an unsupplied argument typically gets a zero value specific to its type or the language; for example I believe Emacs Lisp makes them all nil).

I'm sure it's possible to make tasteful, limited use of keyword arguments to good benefit; the article suggests one (in Go) and I'm sure I've seen examples in Python. But it's far more common to create sprawling APIs with a dozen or more parameters, such as Python's subprocess.run() (or the even larger subprocess.Popen()). I'm personally not a fan of such APIs, although this is one of those taste issues that is hard to quantify.

(The usual excuse is that you don't normally use all that many of those keyword arguments. But if you do, things are messy, and the API is messy because all of them exist. There are other patterns that can be used for APIs that intrinsically have lots of options; some are common Go practice. These patterns are more verbose, but in my view that verbosity is not necessarily a bad thing.)
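
For example, one common Go pattern is functional options. A minimal sketch (with made up names, and needing the time package) looks like the following; its verbosity is exactly the tradeoff I mean:

type Server struct {
  addr string
  cfg  serverConfig
}

type serverConfig struct {
  timeout time.Duration
  retries int
}

type Option func(*serverConfig)

func WithTimeout(d time.Duration) Option {
  return func(c *serverConfig) { c.timeout = d }
}

func NewServer(addr string, opts ...Option) *Server {
  // the defaults, overridden by whatever options the caller passes
  cfg := serverConfig{timeout: 30 * time.Second, retries: 3}
  for _, o := range opts {
    o(&cfg)
  }
  return &Server{addr: addr, cfg: cfg}
}

// used as: NewServer("host:port", WithTimeout(5*time.Second))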

Another unfortunate aspect of optional keyword arguments with default values is that they enable a lazy way of expanding and changing function APIs (and method APIs and constructor APIs and so on). Rather than add or change a regular argument and have to update all of the call sites, you add a new keyword argument with a default value, and then only update or add call sites that need to use the new part of the API. I've done this myself because it was quick and easy, and I've also wound up with the end state of this that you sometimes get, where all of my call sites were using the new API with the new keyword argument, so I could have gotten rid of it as a keyword and made it a regular argument.

I also feel that keyword arguments encourage a certain bad way of evolving APIs (by making this way the easiest way to move forward). In one version, some bad behavior is allowed to linger because everyone is supposed to know to turn it off with a keyword argument. In another version, some incompatibility is eventually allowed to be added because everyone who doesn't want it is supposed to have used a keyword argument to disable it. In either situation, if you don't know about the magic trick with the keyword argument, you lose out.

What you can say for keyword arguments is that taking a lot of keyword arguments is better than taking a lot of non-keyword arguments, but to me this is not much of an endorsement. It's better not to have lots of arguments in general.

Maybe understanding Crowdstrike's argument arity problem

By: cks
8 August 2024 at 03:46

Crowdstrike recently released an "External Technical Root Cause Analysis" [PDF] (via) for their recent extreme failure. The writeup is rather unclear about what exactly happened, but I think I understand it and if I do, it's an uncomfortably easy programming mistake to make. So here is my version of the core programming issue.

Part of Crowdstrike's agent is a signature matching system, where they match signature patterns (templates) against what's going on. There are different types of patterns (template types) depending on what sort of activity the signature pattern is matching. Rather than have one big regular expression or equivalent in each pattern that matches against all data for the activity, all smashed together and encoded somehow, Crowdstrike's code breaks this out so pattern types have some number of fields (each of them a separate regular expression or equivalent) and are supplied with matching input values (or input data) extracted from the activity in order to do the field by field matching.

(You could invert this so that during the matching of each pattern (template), the matching code calls back to obtain the input values from the activity, but since you can have a lot of different templates for a given template type, it's more efficient to extract the data once and then reuse it across all of the templates of that type. You can still do this with the call back pattern, but it's more complicated.)

For whatever reason, the input values were passed not as separate arguments but as a single variable size array, the input data array (although this might not have been as a literal array but instead as varargs function arguments). It's possible that the core matching code was in fact generic; it was given a template with some number of fields and an input data array with some number of fields and it walked through matching them up. If all of the field matching genuinely is text regular expression matches, this wouldn't be a crazy way to structure the code. You'd have one general matching system plus a collection of template type specific code that was passed information about the activity and had the job of pulling data out of it and populating the input data array for its patterns.

The mismatch occurred when the IPC templates specified and matched against 21 fields, but the input data array only had 20 pieces of data. The issue had previously been masked because all previous IPC templates had a wildcard match for the 21st field and this skipped the actual comparison. In hand-waved Go code, one version of this might look like:

func (t *Template) DoesMatch(eventdata []SomeType) bool {
  for i := range t.Fields {
    if t.Fields[i] != MatchesAll &&
       !FieldMatches(t.Fields[i], eventdata[i]) {
      return false
    }
  }
  return true
}

(To be clear, I'm not claiming that this is what Crowdstrike's code was doing. All we know about the actual code from Crowdstrike's report is that it involved an arity mismatch that wasn't detected at compile time.)

Now, you would certainly like this code to start with a check that 'len(t.Fields) == len(eventdata)' so that it doesn't panic if there's a length mismatch. But you might not put that check in, depending on the surrounding context. And in general this code and design is, in my opinion, not particularly bad or unnatural; you might write something like it in all sorts of situations where you have a variable amount of data that needs to be provided to something (or various different things).

Compilers and language environments are broadly not all that good at statically detecting dynamic arity mismatches; it's a hard problem involving things like dataflow analysis. You can sometimes change the code to avoid being dynamic this way, but it might involve awkward restructuring that leads to other issues. For example, you might have a per template type 'myEventData' structure with named fields, insuring that you can never access an undefined field, but then you probably have to have per template type matching code so that you can create and fill in the right myEventData structure and call everything in a type-safe way (although remember to write tests that verify that every field in the structure is actually filled in). This would involve duplicating the logic of 'check all of these templates of the given type' across these functions, and that code might have complexities of its own.

(Modern language environments have done a lot of work to detect arity problems in things like printf() or its equivalents, but this is generally done by specific knowledge of how to determine the arity (and types) given a format string or the like.)

If you have a master source of truth for what the arity should be (how many fields or arguments are required), such as a data file definition, you can build additional tooling to check that the arity is correct or to automatically generate test cases that will verify that. But this requires having that source of truth and building the generally custom tooling to use it. You also have to insure that the test cases are doing what you think they're doing, which might have been another issue in play here (the people writing the IPC template type code and its tests may not have known that wildcard field matches were short circuited so that they didn't even look at the matching element of the input data array).

(IDLs also don't have a great reputation among programmers, partly because there have been a lot of bad IDLs over the years that people have been forced to deal with.)

There are also intermediate positions between the fully dynamic arity issue that Crowdstrike apparently had and the fully type safe version you could write with enough work. But they all involve some runtime dynamic behavior, which means at least a possibility for runtime mistakes, which would have to be carefully caught lest they crash your code. For instance, in Go you could have a per template type structure, pass it through the common code as an interface{}, and then type-assert it back to the real structure type in your per template type matcher function. But you'd have to handle the possibility of that type assertion failing, however unlikely it should be.
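
In hand-waved Go again (with hypothetical names such as ipcEventData and matchIPC, and fmt imported; this is not anyone's real code), that intermediate version might look like:

func matchIPCTemplates(templates []*Template, data interface{}) (bool, error) {
  ev, ok := data.(*ipcEventData)
  if !ok {
    // the mismatch has to be reported, not allowed to panic
    return false, fmt.Errorf("unexpected event data type %T", data)
  }
  for _, t := range templates {
    if t.matchIPC(ev) {
      return true, nil
    }
  }
  return false, nil
}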

PS: None of this should be taken as excusing Crowdstrike. They were writing software that ran in an extremely privileged and important situation, and they should have done that (much) better.

PPS: Neither Go nor Rust by themselves will save you from this specific sort of dynamic arity mistake; they merely change how your code will fail, from an invalid memory dereference crash to a panic in the language runtime or your (macro-expanded) code.

Backward compatibility, even for settings, has real costs

By: cks
28 July 2024 at 02:47

When I wrote about how I felt GNU Emacs configuration bankruptcy was inevitable in the longer term, one of the reasons was that Emacs configuration settings have changed over time (and will change more in the future), so that your old way of configuring something (like the C indentation style) would have to be updated periodically. One reaction to this is to say that open source software should keep backward compatibility for such settings. Unfortunately and as usual, such backward compatibility would have real costs; it would effectively act as a tax on development.

If you promise backward compatibility for settings, you must devote (programming) effort to mapping old settings to the new behavior and perhaps the new settings. Where there's no exact equivalent of the old setting's behavior, you may have to add code or additional (new) settings to synthesize that behavior, or materialize it if someone ever asks for it. Or you can only imitate the old setting value imperfectly, but then (some) people will complain. All of this takes work, especially if the setting is controlling some old behavior and old code that you're trying to move away from.

Open source software has finite time to spend on development (and increasing usage doesn't necessarily scale up the time available the way it can for commercial software). So the more backward compatibility you maintain, the more of your development time goes to that, and the less you're moving forward. And all of this is for something that you certainly hope fewer and fewer people are using over time, with new users and some number of old people moving to your current system of settings.

It's not surprising when large software like GNU Emacs doesn't preserve backward compatibility in settings, especially over the long term. In fact, I'd go further than that; it's a good thing when open source software doesn't attempt to do this if it's at all difficult. I'd much rather see limited resources going on improving the project and moving it forward rather than letting me not modify my .emacs for another year or three.

(Much of this applies to backward compatibility in general, at least in straightforward software that people use directly. Operating systems, libraries, and similar infrastructure things are somewhat different for reasons beyond the scope of this entry.)

That software forges are often better than email is unfortunate

By: cks
14 July 2024 at 03:09

Over on the Fediverse, there was a discussion of doing software development things using email and I said something:

My heretical opinion is that I would rather file a Github issue against your project than send you or your bug tracker email, because I do not trust you to safeguard my email against spammers, so I have to make up an entire new email address for you and carefully manage it. I don't trust Github either, but I have already done all of this email address handling for them.

(I also make up an email address for my Git commits. And yes, spammers have scraped it and spammed me.)

Github is merely a convenient example (and the most common one I deal with). What matters is that the forge is a point of centralization (so it covers a lot of projects) and that it does not require me to expose my email to lots of people. Any widely used forge-style environment has the same appeal (and conversely, small scale forges do not; if I am going to report issues to only one project per forge, it is not much different than a per-project bug tracker or bug mailing list).

That email is so much of a hassle today is a bad and sad thing. Email is a widely implemented open standard with a huge suite of tools that allows for a wide range of ways of working with it. It should be a great light-weight way of sending in issues, bug reports, patches, etc etc, and any centralized, non-email place to do this (like Github) has a collection of potential problems that should make open source/free software people nervous.

Unfortunately email has been overrun by spammers in a way that forges have not (yet) been, and in the case of email the problem is essentially intractable. Even my relatively hard to obtain Github-specific email address gets spam email, and my Git commit email address gets more. And demonstrating the problem with not using forges, the email address I used briefly to make some GNU Emacs bug reports about MH-E got spam almost immediately, which shows why I really don't want to have to send my issues by email to an exposed mailing list with public archives.

While there are things that might make the email situation somewhat better (primarily by hiding your email address from as many parties as possible), I don't think there's any general fix for the situation. Thanks to spam and abuse, we're stuck with a situation where setting yourself up on a few central development sites with good practices about handling your contact methods is generally more convenient than an open protocol, especially for people who don't do this all the time.

I think (GNU) Emacs bankruptcy is inevitable in the longer term

By: cks
8 July 2024 at 02:30

Recently I read Avoiding Emacs bankruptcy, with good financial habits (via). To badly summarize the article, it suggests avoiding third party packages and minimizing the amount of customization you do. As it happens, I have experience with more or less this approach, and in the end it didn't help. Because I built my old Emacs environment in the days before third party Emacs package management, it didn't include third party packages (although it may have had a few functions I'd gotten from other people). And by modern standards it wasn't all that customized, because I didn't go wildly rebinding keys or the like. Instead, I mostly did basic things like set indentation styles. But over the time from Emacs 18 to 2012, even that stuff stopped working. The whole experience has left me feeling that Emacs bankruptcy is inevitable over the longer term.

The elements pushing towards Emacs bankruptcy are relatively straightforward. First, Emacs wants personal customization in practice, so you will build up a .emacs for your current version of Emacs even if you don't use third party packages. Second, Emacs itself changes over time, or if you prefer the standard, built-in packages change over time to do things like handle indentation and mail reading better. This means that your customizations of them will need updating periodically. Third, the Emacs community changes over time in terms of what people support, talk about, recommend, and so on. If you use the community at all for help, guidance, and the like, what it will be able to help you with and what it will suggest will change over time, and thus so will what you want in your Emacs environment to go with it. Finally, both your options for third party packages and the third party packages themselves will change over time, again forcing you to make changes in your Emacs environment to compensate.

In addition, as the article implicitly admits, that a package is in the Emacs standard library doesn't mean that it can't have problems or effectively be abandoned with little or no changes and updates (for example, the state of Flymake for a long time). Sticking to the packages that come with Emacs can be limiting and restrictive, much like not customizing Emacs at all and accepting all of its defaults. You can work with Emacs that way (and people used to, back in the days before there was a vibrant ecology of third party packages), but you're being potentially hard on yourself in order to reduce the magnitude of something that's probably going to happen to you anyway.

(For instance, until recently not using third party packages would have meant that you did not have support for language servers.)

My view is that in practice, there's no way to leave your Emacs setup alone for a long time. You can go 'bankrupt' in small pieces of work every so often, or in big bangs like the one I went through (although the small pieces approach is more likely if you keep using Emacs regularly).

I don't think this is a bad thing. It's ultimately a choice on the spectrum between evolution and backward compatibility, where both GNU Emacs and the third party ecosystem would rather move forward (hopefully for the better) instead of freezing things once something is implemented.

(GNU) Emacs wants personal customization in practice

By: cks
25 June 2024 at 03:41

Recently I read Avoiding Emacs bankruptcy, with good financial habits (via), which sparked some thoughts. One of them is that I feel that GNU Emacs is an editor that winds up with personal customizations from people who use it, even if you don't opt to install any third party packages and stick purely with what comes with Emacs.

There are editors that you can happily use in their stock or almost stock configuration; this is most of how I use vim. In theory you can use Emacs this way too. In practice I think that GNU Emacs is not such an editor. You can use GNU Emacs without any customization and it will edit text and do a variety of useful things for you, but I believe you're going to run into a variety of limitations with the result that will push you towards at least basic customization of built in settings.

I believe that there are multiple issues, at least:

  • The outside world can have multiple options where you have to configure the choice (such as what C indentation style to use) that matches your local environment.

  • Emacs (and its built in packages) are opinionated and those opinions are not necessarily yours. If opinions clash enough, you'll very much want to change some settings to your opinions.

    (This drove a lot of my customization of GNU Emacs' MH-E mode, although some of that was that I was already a user of (N)MH.)

  • You want to (automatically) enable certain things that aren't on by default, such as specific minor modes or specific completion styles. Sure, you can turn on appealing minor modes by hand, but this gets old pretty fast.

  • Some things may need configuration and have no defaults that Emacs can provide, so either you put in your specific information or you don't get that particular (built in) package working.

Avoiding all of these means using GNU Emacs in a constrained way, settling for basic Emacs style text editing instead of the intelligent environment that GNU Emacs can be. Or to put it another way, Emacs makes it appealing to tap into its power with only a few minor settings through the built in customization system (at least initially).
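
As an illustrative sketch (generic examples, not anyone's actual configuration), the sort of 'few minor settings' I mean looks like:

(setq c-default-style "k&r"     ; match the local C indentation conventions
      c-basic-offset 4)
(savehist-mode 1)               ; turn on a built-in minor mode you want
(setq user-mail-address "you@example.org")  ; information only you can supply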

I believe that most people who pick GNU Emacs and stick with it want to use something like its full power and capability; they aren't picking it up as a basic text editor. Even without third party packages, this leads them to non-trivial customizations to their specific environment, opinions, and necessary choices.

(Perhaps this is unsurprising and is widely accepted within the GNU Emacs community. Or perhaps there is a significant sub-community that does use GNU Emacs only in its role as a basic text editor, without the various superintelligence that it's capable of.)

Go's 'range over functions' iterators and avoiding iteration errors

By: cks
18 June 2024 at 03:10

Go is working on allowing people to range-over function iterators, and currently this is scheduled to be in Go 1.23, due out this summer (see issue 61405 and issue 61897). The actual implementation is somewhat baroque and some people have been unhappy about that (for example). My view is that this is about bringing user-written container types closer to parity with the special language container types, but recently another view of this occurred to me.

As people have noted, what is most special about this proposal is not that it creates an officially supported iteration protocol in Go, but that this protocol gets direct language support. The compiler itself will transform 'for ... = range afunction' into different code that actively implements the iteration protocol that the Go developers have chosen. This direct language support is critical to making ranging over functions like 'for k,v := range map', but it also does another thing, which is that it implements all of the details of the iteration protocol for the person writing the 'for' loop.

(People seem to generally envision that the actual usage will be 'for ... = range generator(...)', where 'generator()' is a function that returns the actual function that is used for iteration. But I think you could use method values in some situations.)
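
As I understand the proposed protocol, a sketch of such a generator (with made up names) looks like the following; the function it returns is what the 'for' loop actually ranges over:

func Enumerate(items []string) func(yield func(int, string) bool) {
  return func(yield func(int, string) bool) {
    for i, s := range items {
      // stop if the loop body exited early (break, return, and so on)
      if !yield(i, s) {
        return
      }
    }
  }
}

// used as:
//   for i, s := range Enumerate(mylist) {
//     ...
//   }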

Iteration protocols are generally fairly complicated. They have to deal with setup, finalization, early exits from process of iteration, finalization in the face of early exits, and so on. The actual implementations of these protocols tends to be gnarly and somewhat subtle, with various potential mistakes and omissions that can be made, and some of these will not manifest in clear bugs until some special situation arises. Go could make everyone who wanted to use 'iterate over a function or special data structure' write out the explicit code needed to do this using the protocol, but if it did we know what the result would be; some of that code would be buggy and incomplete.

By embedding its chosen iteration protocol into the language itself, Go insures that most of that code won't have to be written by you (or by any of the plenty of people who might use user-written types and want to iterate over them). The compiler itself will take a straightforward 'for ... range' block and transform it to correctly and completely implement the protocol. In fact, the protocol is not even particularly accessible to you within the 'for' block you're writing.

People writing the iterator functions for their user-written types will have to care about the protocol, of course (although the Go protocol seems relatively simple in that regard too). But there are likely to be many fewer such iterator creators than there will be iterator users, much as Go assumes that there will be many more people using generic types than people creating them.

Reasons to not expose Go's choice of default TLS ciphers

By: cks
26 May 2024 at 02:13

When I wrote about the long-overdue problem people are going to have with go:linkname in Go 1.23, the specific case that caused me to notice this was something trying to access crypto/tls's 'defaultCipherSuitesTLS13' variable. As its name suggests, this variable holds the default cipher suites used by Go for TLS 1.3. One reaction to this specific problem is to ask why Go doesn't expose this information as part of crypto/tls's API.

One reason why not is contained in the documentation for crypto/tls.CipherSuites():

[...] Note that the default cipher suites selected by this package might depend on logic that can't be captured by a static list, and might not match those returned by this function.

In fact the TLS 1.3 cipher suites that Go uses may not match the ones in defaultCipherSuitesTLS13, because there is actually a second set of them, defaultCipherSuitesTLS13NoAES. As its name suggests, this set of cipher suites applies when the current machine doesn't have hardware support for AES GCM, or at least hardware support that Go recognizes. Well, even that is too simple a description; if Go is being used as a TLS server, whether the 'no AES GCM' version is used also depends on if the client connecting to the Go server appears to prefer AES GCM (likely signaling that the client has hardware support for it).
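
What the package does expose is the static list from crypto/tls.CipherSuites(), which as the documentation quoted above notes may not match the runtime defaults. A quick sketch of looking at its TLS 1.3 entries (needing crypto/tls, fmt, and slices imported):

func listTLS13Suites() {
  for _, cs := range tls.CipherSuites() {
    if slices.Contains(cs.SupportedVersions, tls.VersionTLS13) {
      fmt.Println(cs.Name)
    }
  }
}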

Today, Go can't expose a useful API for 'the default TLS 1.3 cipher suites' because there is no such straightforward thing; the actual default cipher suites used depend on multiple factors, some of which can't be used by even a top level function like CipherSuites(). If Go had exported such a variable or API in the past, Go's general attitude on backward compatibility might have forced it to freeze the logic of TLS 1.3 cipher suite choice so that it did respect this default list no matter what, much like the random number generation algorithm became frozen because people depended on it.

The Go 1 compatibility promise is a powerful enabler for Go. But because it is so strong and the Go developers interpret it broadly, it means that Go has to be really careful and certain about what APIs it exposes. As we've seen with math/rand, early decisions to expose certain things, even implicitly, can later constrain Go's ability to make important changes.

PS: Another reason to not expose this information from crypto/tls is that Go has explicitly decided to not make TLS 1.3 cipher suites something that you can control. As covered in the documentation for crypto/tls.Config, the 'CipherSuites' struct element is ignored for TLS 1.3; TLS 1.3 cipher suites are not configurable.

The long-overdue problem coming for some people in Go 1.23

By: cks
25 May 2024 at 02:06

Right now, if you try to build anything using the currently released version of github.com/quic-go/quic-go with the current development version of Go (such as this DNS query program), you will probably encounter the following error:

link: github.com/quic-go/quic-go/internal/qtls: invalid reference to crypto/tls.defaultCipherSuitesTLS13

Experienced Go developers may now be scratching their heads about how quic-go/internal/qtls is referring to crypto/tls.defaultCipherSuitesTLS13, since the latter isn't an exported identifier (in Go, all exported identifiers start with a capital letter). The simple answer is that the qtls package is cheating (in cipher_suite.go).

The official Go compiler has a number of special compiler directives. A few of them are widely known and used, for example '//go:noinline' is common in some benchmarking to stop the compiler from optimizing your test functions too much. One of the not well known ones is '//go:linkname', and for this I'll just quote from its documentation:

//go:linkname localname [importpath.name]

[...] This directive determines the object-file symbol used for a Go var or func declaration, allowing two Go symbols to alias the same object-file symbol, thereby enabling one package to access a symbol in another package even when this would violate the usual encapsulation of unexported declarations, or even type safety. For that reason, it is only enabled in files that have imported "unsafe".

Let me translate that: go:linkname allows you to access unexported variables and functions of other packages. In particular, it allows you to access unexported variables (and functions) from the Go standard library, such as crypto/tls.defaultCipherSuitesTLS13.
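
As a schematic illustration of the shape of the trick (this is not the actual qtls code), a third party package can write something like:

package qtlshack

import (
  _ "unsafe" // importing unsafe is what permits go:linkname
)

// Alias our local variable to crypto/tls's unexported one.
//
//go:linkname defaultCipherSuitesTLS13 crypto/tls.defaultCipherSuitesTLS13
var defaultCipherSuitesTLS13 []uint16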

The Go standard library uses go:linkname internally to access various unexported things from other packages and from the core runtime, which is perfectly fair; the entire standard library is developed by the same people, and they have to be very careful and conservative with the public API. However, go:linkname has also been used by a wide assortment of third party packages to access unexported pieces of the standard library that those packages found convenient or useful (such as Go's default cipher suites for TLS 1.3). Accessing unexported things from the Go standard library isn't covered by the Go 1 compatibility guarantee, for obvious reasons, but in practice the Go developers find themselves not wanting to break too much of the Go package ecosystem even if said ecosystem is doing unsupported things.

Last week, the Go developers noticed this (I believe not for the first time) and Russ Cox filed issue #67401: cmd/link: lock down future uses of linkname, where you can find a thorough discussion of the issue. The end result is that the current development version of Go, which will become Go 1.23, is now much more restrictive about go:linkname, requiring that the target symbol opt in to this usage. Starting from Go 1.23, you will not be able to 'go:linkname' to things in the standard library that have not specifically allowed this (and the rules are probably going to get stricter in future Go versions; in a few versions I wouldn't be surprised if you couldn't go:linkname into the standard library at all from outside packages).

So this is what is happening with github.com/quic-go/quic-go. It is internally using a go:linkname to get access to crypto/tls's defaultCipherSuitesTLS13, but in Go 1.23, defaultCipherSuitesTLS13 is not one of the symbols that has opted in to this use, so the build is now failing. The quic-go package is probably far from the only package that is going to get caught out by this, now and in the future.

(The Go developers have been adding specific opt-ins for sufficiently used internal identifiers, in files generally called 'badlinkname.go' in the packages. You can see the current state for crypto/tls in its badlinkname.go file.)

Go's old $GOPATH story for development and dependencies

By: cks
22 May 2024 at 03:37

As people generally tell the story today, Go was originally developed without support for dependency management. Various community efforts evolved over time and then were swept away in 2019 by Go Modules, which finally added core support for dependency management. I happen to feel that this story is a little bit incomplete and sells the original Go developers short, because I think they did originally have a story for how Go development and dependency management was supposed to work. To me, one of the fascinating bits in Go's evolution to modules is how that original story didn't work out. Today I'm going to outline how I see that original story.

In Go 1.0, the idea was that you would have one or more of what are today called multi-module workspaces. Each workspace contained one (or several) of your projects and all of its dependencies, in the form of cloned and checked-out repositories. With separate repositories, each workspace could have different (and independent) versions of the same packages if you needed that, and updating the version of one dependency in one workspace wouldn't update any other workspace. Your current workspace would be chosen by setting and changing $GOPATH, and the workspace would contain not just the source code but also precompiled build artifacts, built binaries, and so on, all hermetically confined under its $GOPATH.
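
As an illustrative sketch of working this way (the paths and package names are made up):

export GOPATH=$HOME/work/project1
go get github.com/someone/somedep   # cloned under $GOPATH/src/github.com/someone/somedep
go install ./...                    # binaries in $GOPATH/bin, build artifacts under $GOPATH/pkg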

This story of multiple $GOPATH workspaces allows each separate package or package set of yours to be wrapped up in a directory hierarchy that effectively has all of its dependencies 'vendored' into it. If you want to preserve this for posterity or give someone else a copy of it, you can archive or send the whole directory tree, or at least the src/ portion of it. The whole thing is fairly similar to a materialized Python virtual environment.

(The original version of Go did not default $GOPATH to $HOME/go, per for example the Go 1.1 release notes. It would take until Go 1.8 for this default to be added.)

This story broadly assumes that updates to dependencies will normally be compatible, because otherwise you really want to track the working dependency versions even in a workspace. While you can try to update a dependency and then roll it back (since you normally have its checked out repository with full history), Go won't help you by remembering the identity of the old, working version. It's up to you to dig this out with tools like the git reflog or your own memory that you were at version 'x.y.z' of the package before you updated it. And 'go get -u' to update all your dependencies at once only makes sense if their new versions will normally all work.

This story also leaves copying workspaces to give them to someone else (or to preserve them in their current state) as a problem for you, not Go. However, Go did add 'experimental' support for vendoring dependencies in Go 1.5, which allowed people to create self-contained objects that could be used with 'go get' or other simple repository copying and cloning. A package that had its dependencies fully vendored was effectively a miniature workspace, but this approach had some drawbacks of its own.

I feel this original story, while limited, is broadly not unreasonable. It could have worked, at least in theory, in a world where preserving API compatibility (in a broad sense) is much more common than it clearly is (or isn't) in this one.

My GNU Emacs MH mail folder completion in MH-E

By: cks
20 May 2024 at 03:38

When I wrote about understanding the orderless package, I mentioned that orderless doesn't work well with hierarchical completions such as file names, which are completed one component at a time. I also said this mattered to me because MH-E completed the names of mail folders in this part by part manner, but I didn't feel like rewriting MH-E's folder completion system to fix it. Well, you can probably guess what happened next.

In the GNU Emacs way, I didn't so much rewrite MH-E's mail folder completion as add a second folder completion system along side it, and then rebound some keys to use my system. Writing my system was possible because it turned out MH-E had already done most of the work for me, by being able to collect a complete list of all folder names (which it used to support its use of the GNU Emacs Speedbar).

To put the summary up front, I was pleasantly surprised by how easy it was to add my own completion stuff and make use of it within my MH-E environment. At the same time, reverse engineering some of MH-E's internal data structures was a bit annoying and it definitely feels like a bit of a hack (although one that's unlikely to bite me; MH-E is not exactly undergoing rapid and dramatic evolution these days, so those data structures are unlikely to change).

There are many sophisticated way to do minibuffer completion in GNU Emacs, but if your purpose is to work well with orderless, the simplest approach is to generate a list of all of your completion candidates up front and then provide this list to completing-read. This results in code that looks like this:

(defvar cks/mh-folder-history '() "History of MH folder targets.")
(defun cks/mh-get-folder (msg)
  (let ((cks/completion-category 'mh-e-folder-full))
    (completing-read msg (cks/mh-all-folders) nil t "+" 'cks/mh-folder-history)))

Here I've made the decision that this completion interface should require that I select an existing MH mail folder, to avoid problems. If I want to create a new mail folder I fall back to the standard MH-E functions, with their less convenient completion but greater freedom. I've also decided to give this completion a history, so I can easily re-use my recent folder destinations.

(The cks/completion-category stuff is for forcing the minibuffer completion category so that I can customize how vertico presents it, including listing those recent folder destinations first.)

This 'get MH folder' function is then used in a straightforward way:

(defun mh-refile-msg-full (range folder)
  (interactive (list (mh-interactive-range "Refile")
                     (intern (cks/mh-get-folder "Refile to folder? "))))
  (mh-refile-msg range folder))

This defers all of the hard work to the underlying MH-E command for refiling messages. This is one of the great neat tricks in GNU Emacs with the (interactive ...) form; when you make a function a command with (interactive ...), it's natural to end up with it callable from other ELisp code with the arguments you'd normally be prompted for interactively. So I can reuse the mh-refile-msg command non-interactively, sticking my own interactive frontend on it.

All of the hard work happens in cks/mh-all-folders. Naturally, MH-E maintains its own data structures in a way that it finds convenient, so its 'mh-sub-folders-cache' hash table is not structured as a list of all MH folder names but instead has hash entries storing all of the immediate child folders of a parent plus some information on each (at the root, the 'parent' is nil). So we start with a function to transform various combination of a hash key and a hash value into a MH folder name:

(defun cks/mh-hash-folder-name (key elem)
  (cond
   ((and key elem) (concat key "/" (car elem)))
   (key key)
   (elem (concat "+" (car elem)))))

And then we go over mh-sub-folders-cache using our mapping function with:

(cl-loop for key being the hash-keys of mh-sub-folders-cache
  using (hash-values v)
  collect (cks/mh-hash-folder-name key nil)
  append (cl-loop for sub in v
                  collect (cks/mh-hash-folder-name key sub)))

After getting this list we need to sort it alphabetically, and also remove duplicate entries just in case (and also a surplus nil entry), using the following:

(sort (remq nil (seq-uniq flist)) 'string-lessp)

(Here, 'flist' is the let variable I have stuck the cl-loop result into. My actual code then removes some folder names I don't want to be there cluttering up the completion list for various reasons.)

There are some additional complications because MH-E will invalidate bits of its sub-folders cache every so often, so we may need to force the entire cache to be rebuilt from scratch (which requires some hackery, but turns out to be very fast these days). I'm not putting those relatively terrible hacks down here (also, the whole thing is somewhat long).

(If I was a clever person I would split this into two functions, one of which generated the full MH mail folder list and the second of which filtered out the stuff I don't want in it. Then I could publish the first function for people's convenience, assuming that anyone was interested. However, my ELisp often evolves organically as I realize what I want.)

(Probably) forcing Git to never prompt for authentication

By: cks
15 April 2024 at 03:11

My major use of Git is to keep copies of the public repositories of various projects from various people. Every so often, one of the people involved gets sufficiently irritated with the life of being an open source maintainer and takes their project's repository private (or their current Git 'forge' host does it for other reasons). When this happens, on my next 'git pull', Git skids to a halt with:

; git pull
Username for 'https://gitlab.com':

This is not a useful authentication prompt for me. I have no special access to these repositories; if anonymous access doesn't work, there is nothing I can enter for a username and password that will improve the situation. What I want is for Git to fail with a pull error, the same way it would if the repository URL returned a 404 or the connection to the host timed out.

(Git prompts you here because sometimes people do deal with private repositories which they have special credentials for.)

As far as I know, Git unfortunately has no configuration option or command line option that is equivalent to OpenSSH's 'batch mode' for ssh, where it will never prompt you for password challenges and will instead just fail. The closest you can come is setting core.askPass to something that generates output (such as 'echo'), in which case Git will try to authenticate with that bogus information, fail, and complain much more verbosely, which is not the same thing (among other issues, it causes the Git host to see you as trying invalid login credentials, which may have consequences).

If you're running your 'git pull' invocations from a script, as I often am, you can have the script set 'GIT_TERMINAL_PROMPT=0' (and export it into the environment). According to the documentation, this causes Git to fail rather than prompting you for anything, including authentication. It seems somewhat dangerous to set this generally in my environment, since I have no idea what else Git might someday want to prompt me about (and obviously if you need to sometimes get prompted you can't set this). Apparently this is incomplete if you fetch Git repositories over SSH, but I don't do that for public repositories that I track.
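
In a pull script this looks roughly like:

#!/bin/sh
# fail instead of prompting if a repository now requires authentication
GIT_TERMINAL_PROMPT=0
export GIT_TERMINAL_PROMPT
git pull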

(I found this environment variable along with a lot of other discussion in this serverfault question and its answers.)

Some environments that run git behind the scenes, such as the historical 'go get' behavior, default to disabling git prompts. If you use such an environment it may have already handled this for you.

Don't require people to change 'source code' to configure your programs

By: cks
9 April 2024 at 02:16

Often, programs have build time configuration settings for features they include, paths they use, and so on. Some of the time, people suggest that the way to handle these is not through systems like 'configure' scripts (whether produced by Autoconf or some other means) but instead by having people edit their settings into things such as your Makefiles or header files ('source code' in a broad sense). As someone who has spent a bunch of time and effort building other people's software over the years, my strong opinion is that you should not do this.

The core problem of this approach is not that you require people to know the syntax of Makefiles or config.h or whatever in order to configure your software, although that's a problem too. The core problem is you're having people modify files that you will also change, for example when you release a new version of your software that has new options that you want people to be able to change or configure. When that happens, you're making every person who upgrades your software deal with merging their settings into your changes. And merging changes is hard and prone to error, especially if people haven't kept good records of what they changed (which they often won't if your configuration instructions are 'edit these files').

One of the painful lessons about maintaining systems that we've learned over the years is that you really don't want to have two people changing the same file, including the software provider and you. This is the core insight behind extremely valuable modern runtime configuration features such as 'drop-in files' (where you add or change things by putting your own files into some directory, instead of everything trying to update a common file). When you tell people to configure your program by editing a header file or a Makefile or indeed any file that you provide, you're shoving them back into this painful past. Every new release, every update they pull from your VCS, it's all going to be a source of pain for them.

A system where people maintain (or can maintain) their build time configurations entirely outside of anything you ship is far easier for people to manage. It doesn't matter exactly how this is implemented and there are many options for relatively simple systems; you certainly don't need GNU Autoconf or even CMake.

The corollary to this is that if you absolutely insist on having people configure your software by editing files you ship, those files should be immutable by you. You should ship them in some empty state and promise never to change that, so that people building your software can copy their old versions from their old build of your software into your new release (or never get a merge conflict when they pull from your version control system repository). If your build system can't handle even this restriction, then you need to rethink it.

GNU Autoconf is not replaceable in any practical sense

By: cks
7 April 2024 at 02:50

In the wake of the XZ Utils backdoor, which involved GNU Autoconf, it's been somewhat popular to call for Autoconf to go away. Over on the Fediverse I said something about that:

Hot take: autoconf going away would be a significant net loss to OSS, perhaps as bad as the net loss of the Python 2 to Python 3 transition, and for much the same reason. There are a lot of projects out there that use autoconf/configure today and it works, and they would all have to do a bunch of work to wind up in exactly the same place ('a build system that works and has some switches and we can add our feature checks to').

(The build system can never supply all needed tests. Never.)

Autoconf can certainly be replaced in general, either by one of the existing and more modern configuration and build systems, such as CMake, or by something new. New projects today often opt for one of the existing alternative build systems and (I believe) often find them simpler. But what can't be replaced easily is autoconf's use in existing projects, especially projects that use autoconf in non-trivial ways.

You can probably convert most projects to alternate build systems. However, much of this work will have to be done by hand, by each project that is converted, and this work (and the time it takes) won't particularly move the project forward. That means you're asking (or demanding) projects to spend their limited time to merely wind up in the same place, with a working build system. Further, some projects will still wind up running a substantial amount of their own shell code as part of the build system in order to determine and do things that are specific to the project.

(Although it may be an extreme example, you can look at the autoconf pieces that OpenZFS has in its config/ subdirectory. Pretty much all of that work would have to be done in any build system that OpenZFS used, and generally it would have to be significantly transformed to fit.)

There likely would be incremental security-related improvements even for such projects. For example, I believe many modern build systems don't expect you to ship their generated files the way that autoconf sort of expects you to ship its generated configure script (and the associated infrastructure), which was one part of what let the XZ backdoor slip files into the generated tarballs that weren't in their repository. But this is not a particularly gigantic improvement, and as mentioned it requires projects to do work to get it, possibly a lot of work.

You also can't simplify autoconf by declaring some standard checks obsolete and dropping everything to do with them. It may indeed be the case that few autoconf based programs today are actually going to cope with, for example, there being no string.h header file (cf), but that doesn't mean you can remove mentions of it from the generated header files and so on, since existing projects require those mentions to work right. The most you could do would be to make the generated 'configure' scripts simply assume a standard list of features and put them in the output those scripts generate.

(Of course it would be nice if projects using autoconf stopped making superstitious use of things like 'HAVE_STRING_H' and just assumed that standard headers are present. But projects generally have more important things to spend limited time on than cleaning up header usage.)

PS: There's an entire additional discussion that we could have about whether 'supply chain security' issues such as Autoconf and release tarballs that can't be readily reproduced by third parties are even the project's problem in the first place.

GNU Emacs and the case of special space characters

By: cks
4 April 2024 at 02:58

One of the things I've had to wrestle with due to my move to reading my email with MH-E in GNU Emacs is that any number of Emacs modes involved in this like to be helpful by reformatting and annotating your email messages in various ways. Often it's not obvious to an outsider what mode (or code) is involved. For what I believe are historical reasons, a lot of MIME handling code has wound up in GNUS (also), which was originally a news reader; some of the code and variables have 'gnus' prefixes while others have 'mm' or 'mml' prefixes. In MH-E (and I believe most things that use Emacs' standard GNUS-based MIME handling), by default you will get nominally helpful things like message fontisizing and maybe highlighting of certain whitespace that the code thinks you might care about. I mostly don't want this, so I have been turning it off where I saw it and could identify the cause.

(As far as message fontisizing goes, sometimes I don't object to it but I very much object to the default behavior of hiding the characters that triggered the fontisizing. I don't want bits of message text hidden on me so that I have to reverse engineer the actual text from visual appearance changes that I may or may not notice and understand.)

Recently I was reading an email message and there was some white space in it that Emacs had given red underlines, causing me to get a bit irritated. People who are sufficiently familiar with GNU Emacs have already guessed the cause, and in fact the answer was right there in what I saw from Leah Neukirchen's suggestion of looking at (more or less) 'C-u C-x ='. What I was seeing was GNU Emacs' default handling of various special space characters.

(I was going to say that this was a non-breaking space, but it turns out not to be; instead it was U+2002, 'en space'. A true non-breaking space is U+00A0.)

As covered in How Text Is Displayed, Emacs normally displays these special characters and others with the (Emacs) nobreak-space face, which (on suitable displays) renders the character as red with a (red) underline. Since all space variants have nothing to render, you get a red underline. As covered in the documentation, you can turn this off generally or for a buffer by setting nobreak-char-display to nil, which I definitely won't be doing generally but might do for MH-E mail buffers, since my environment generally maps special space characters to a plain space if I paste them into terminals and the like.

(A full list of Emacs font faces is in Standard Faces.)

Zero-width spaces (should I ever encounter any in email or elsewhere) are apparently normally displayed using Glyphless Character Display's 'thin-space' method, along with other glyphless characters, and are Unicode U+200B. It's not clear to me if these will display with a red underline in my environment (see this emacs.stackexchange question and answers). Some testing suggests that zero width spaces may hide out without a visual marker (based on using 'C-x 8 RET' aka 'insert-char' to enter a zero-width space, a key binding which I also found out about through this exercise). At this point I am too lazy to figure out how to force zero-width spaces to be clearly visible.

PS: Other spaces known by insert-char include U+2003 (em space), U+2007 (figure space), U+2005 (four per em space), U+200A (hair space), U+3000 (ideographic space), U+205F (medium mathematical space), U+2008 (punctuation space), U+202F (narrow non-breaking space), and more. It's slightly terrifying. Most of the spaces render in the same way. I probably won't remember any of these Unicode numbers, but maybe I can remember C-u C-x = and that 'nobreak-space' as an Emacs face is an important marker.

PPS: Having gone through all of this, it's somewhat tempting to write some ELisp that will let me flip back and forth between displaying these characters in some clearly visible escaped form and displaying them 'normally' (showing as (marked) spaces and so on). That way I could normally see them very clearly, but make them unobtrusive if I had to deal with something that full of them in a harmless way. This is one of the temptations of GNU Emacs (or in general any highly programmable environment).

When I reimplement one of my programs, I often wind up polishing it too

By: cks
21 March 2024 at 03:10

Today I discovered a weird limitation of some IP address lookup stuff on the Linux machines I use (a limitation that's apparently not universal). In response to this, I rewrote the little Python program that I had previously been using for looking up IP addresses as a Go program, because I was relatively confident I could get Go to work (although it turns out I couldn't use net.LookupAddr() and had to be slightly more complicated). I could have made the Go program a basically straight port of the Python one, but as I was writing it, I couldn't resist polishing off some of the rough edges and adding missing features (some of which the Python program could have had, and some which would have been awkward to add).
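As a rough illustration of the general shape of such a tool, here is a simplified Go sketch; this is not my actual program, and forcing the pure Go resolver plus this particular forward-validation step are just one plausible way to do it:

    package main

    import (
        "context"
        "fmt"
        "net"
        "os"
        "strings"
    )

    func main() {
        // Force the pure Go resolver as one way to sidestep platform
        // lookup oddities; the real program's workaround may differ.
        res := &net.Resolver{PreferGo: true}
        ctx := context.Background()

        for _, addr := range os.Args[1:] {
            names, err := res.LookupAddr(ctx, addr)
            if err != nil {
                fmt.Printf("%s: lookup failed: %v\n", addr, err)
                continue
            }
            for _, name := range names {
                // Validate the PTR result by resolving the name forward
                // and checking that it includes the original address.
                // (Plain string comparison is good enough for a sketch,
                // although IPv6 addresses can have several textual forms.)
                ips, herr := res.LookupHost(ctx, strings.TrimSuffix(name, "."))
                verified := false
                if herr == nil {
                    for _, ip := range ips {
                        if ip == addr {
                            verified = true
                            break
                        }
                    }
                }
                if verified {
                    fmt.Printf("%s: %s\n", addr, name)
                } else {
                    fmt.Printf("%s: %s (unverified)\n", addr, name)
                }
            }
        }
    }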

This isn't even the first time this particular program has been polished as part of re-doing it; it was one of the Python programs I added things to when I moved them to Python 3 and the argparse package. That was a lesser thing than the Go port and the polishing changes were smaller, but they were still there.

This 'reimplementation leads to polishing' thing is something I've experienced before. It seems that more often than not, if I'm re-doing something I'm going to make it better (or at least what I consider better), unless I'm specifically implementing something with the goal of being essentially an exact duplicate but in a faster environment (which happened once). It doesn't have to be a reimplementation in a different language, although that certainly helps; I've re-done Python programs and shell scripts and had it lead to polishing.

One trigger for polishing is writing new documentation and code comments. In a pattern that's probably familiar to many programmers, when I find myself about to document some limitation or code issue, I'll frequently get the urge to fix it instead. Or I'll write the documentation about the imperfection, have it quietly nibble at me, and then go back to the code so I can delete that bit of the documentation after all. But some of what drives this polishing is the sheer momentum of having the code open in my editor and already changing or writing it.

Why doesn't this happen when I write the program the first time? I think part of it is that I understand the problem and what I want to do better the second time around. When I'm putting together the initial quick utility, I have no experience with it and I don't necessarily know what's missing and what's awkward; I'm sort of building a 'minimum viable product' to deal with my immediate need (such as turning IP addresses into host names with validation of the result). When I come back to re-do or re-implement some or all of the program, I know both the problem and my needs better.

A realization about shell pipeline steps on multi-core machines

By: cks
9 March 2024 at 03:27

Over on the Fediverse, I had a realization:

This is my face when I realize that on a big multi-core machine, I want to do 'sed ... | sed ... | sed ...' instead of the nominally more efficient 'sed -e ... -e ... -e ...' because sed is single-threaded and if I have several costly patterns, multiple seds will parallelize them across those multiple cores.

Even when doing on the fly shell pipelines, I've tended to reflexively use 'sed -e ... -e ...' when I had multiple separate sed transformations to do, instead of putting each transformation in its own 'sed' command. Similarly I sometimes try to cleverly merge multi-command things into one command, although usually I don't try too hard. In a world where you have enough cores (well, CPUs), this isn't necessarily the right thing to do. Most commands are single threaded and will use only one CPU, but every command in a pipeline can run on a different CPU. So splitting up a single giant 'sed' into several may reduce a single-core bottleneck and speed things up.

(Giving sed multiple expressions is especially single threaded because sed specifically promises that they're processed in order, and sometimes this matters.)

Whether this actually matters may vary a lot. In my case, it only made a trivial difference in the end, partly because only one of my sed patterns was CPU-intensive (but that pattern alone made sed use all the CPU it could get and made it the bottleneck in the entire pipeline). In some cases adding more commands may add more in overhead than it saves from parallelism. There are no universal answers.

One of my lessons learned from this is that if I'm on a machine with plenty of cores and doing a one-time thing, it probably isn't worth my while to carefully optimize how many processes are being run as I evolve the pipeline. I might as well jam in more pipeline steps whenever and wherever they're convenient. If it's easy to move one step closer to the goal with one more pipeline step, do it. Even if it doesn't help, it probably won't hurt very much.

Another lesson learned is that I might want to look for single threaded choke points if I've got a long-running shell pipeline. These are generally relatively easy to spot; just run 'top' and look for what's using up all of one CPU (on Linux, this is 100% CPU time). Sometimes this will be as easy to split as 'sed' was, and other times I may need to be more creative (for example, if zcat is hitting CPU limits, maybe pigz can help a bit).

(If I have the fast disk space, possibly un-compressing the files in place in parallel will work. This comes up in system administration work more than you'd think, since we can want to search and process log files and they're often stored compressed.)

How to make your GNU Emacs commands 'relevant' for M-X

By: cks
27 February 2024 at 03:11

Today I learned about the M-X command (well, key binding) (via), which "[queries the] user for a command relevant to the current mode, and then execute it". In other words it's like M-x but it restricts what commands it offers to relevant ones. What is 'relevant' here? To quote the docstring:

[...] This includes commands that have been marked as being specially designed for the current major mode (and enabled minor modes), as well as commands bound in the active local key maps.

If you're someone like me who has written some Lisp commands to customize your experience in a major mode like MH-E, you might wonder how you mark your personal Lisp commands as 'specially designed' for the relevant major mode.

In modern Emacs, the answer is that this is an extended part of '(interactive ...)', the normal Lisp form you use to mark your Lisp functions as commands (things which will be offered in M-x and can be run interactively). As mentioned in the Emacs Lisp manual section Using interactive, 'interactive' takes additional arguments to label what modes your command is 'specially designed' for; more discussion is in Specifying Modes For Commands. The basic usage is, say, '(interactive "P" mh-folder-mode)'.

If your commands already take arguments, life is simple and you can just put the modes on the end. But not all commands do (especially for quick little things you do for yourself). If you have just '(interactive)', the correct change is to make it '(interactive nil mh-folder-mode)'; a nil first argument is how you tell interactive that there is no argument.

(Don't make my initial mistake and assume that '(interactive "" mh-folder-mode)' will work. That produced a variety of undesirable results.)

Is it useful to do this, assuming you have personal commands that are truly specific to a given mode (as I do for commands that operate on MH messages and the MH folder display)? My views so far are a decided maybe in my environment.

First, you don't need to do this if your commands have keybindings in your major mode, because M-X (execute-extended-command-for-buffer) will already offer any commands that have keybindings. Second, my assortment of packages already gives me quite a lot of selection power to narrow in on likely commands in plain M-x, provided that I've named them sensibly. The combination of vertico, marginalia, and orderless lets me search for commands by substrings, easily see a number of my options, and also see part of their descriptions. So if I know I want something to do with MH forwarding I can type 'M-x mh forw' and get, among other things, my function for forwarding in 'literal plaintext' format.

With that said, adding the mode to '(interactive)' isn't much work and it does sort of add some documentation about your intentions that your future self may find useful. And if you want a more minimal minibuffer completion experience, it may be more useful to have a good way to winnow down the selection. If you use M-X frequently and you have commands you want to be able to select in it in applicable modes without having them bound to keys, you really have no choice.

The Go 'range over functions' proposal and user-written container types

By: cks
25 February 2024 at 03:30

In Go 1.22, the Go developers have made available a "range over function" experiment, as described in the Go Wiki's "Rangefunc Experiment". Recently I read a criticism of this, Richard Ulmer's Questioning Go's range-over-func Proposal (via). As I read Ulmer's article, it questions the utility of the proposed range over func feature on the grounds that it isn't a significant enough improvement for standard library functions like strings.Split (which is given as an example in the "more motivation" section of the wiki article).

I'm not unsympathetic to this criticism, especially when it concerns standard library functionality. If the Go developers want to extend various parts of the standard library to support streaming their results instead of providing the results all at once, then there may well be better, lower-impact ways of doing so, such as developing a standard API approach or set of approaches for this and then using this to add new APIs. However, I think that extending the standard library into streaming APIs is by far the less important side of the "range over func" proposal (although this is what the "more motivation" section of the wiki article devotes the most space to).

Right from the beginning, one of the criticisms of Go was that it had some privileged, complex builtin types that couldn't be built using normal Go facilities, such as maps. Generics have made it mostly possible to do equivalents of these (generic) types yourself at the language level (although the Go compiler still uniquely privileges maps and other builtin types at the implementation level). However, these complex builtin types still retain some important special privileges in the language, and one of them is that they were the only types that you could write convenient 'range' based for loops over.

In Go today you can write, for example, a set type or a key/value type with some complex internal storage implementation and make it work even for user-provided element types (through generics). But people using your new container types cannot write 'for elem := range set' or 'for k, v := range kvstore'. The best you can give them is an explicit push or pull based iterator for your type (in a push iterator, you provide a callback function that is given each value; in a pull iterator, you repeatedly call some function to obtain the next value). The "range over func" proposal bridges this divide, allowing non-builtin types to be ranged over almost as easily as builtin types. You would be able to write types that let people write 'for elem := range set.Forward()' or 'for k, v := range kvstore.Walk()'.
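To make this concrete, here is a minimal sketch of a user-written set type under the rangefunc experiment. The type and method names are made up for illustration; only the iterator function signature, func(yield func(E) bool), comes from the experiment, and the range loop in main() only compiles with Go 1.22 built with GOEXPERIMENT=rangefunc:

    package main

    import "fmt"

    // Set is an illustrative generic set type with its own internal storage.
    type Set[E comparable] struct {
        members map[E]struct{}
    }

    func NewSet[E comparable](elems ...E) *Set[E] {
        s := &Set[E]{members: make(map[E]struct{})}
        for _, e := range elems {
            s.members[e] = struct{}{}
        }
        return s
    }

    // Forward returns a push-style iterator of the shape the experiment
    // ranges over: it calls yield for each element until yield returns false.
    func (s *Set[E]) Forward() func(yield func(E) bool) {
        return func(yield func(E) bool) {
            for e := range s.members {
                if !yield(e) {
                    return
                }
            }
        }
    }

    func main() {
        // This is the kind of loop the proposal makes possible for
        // user-written container types.
        for elem := range NewSet("a", "b", "c").Forward() {
            fmt.Println(elem)
        }
    }

Without the experiment (or eventual language support), callers are stuck invoking the function Forward() returns themselves, passing an explicit callback; that is exactly the extra and more awkward code I mean.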

This is an issue that can't really be solved without language support. You could define a standard API for iterators and iteration (and the 'iter' package covered in the wiki article sort of is that), but it would still be more code and somewhat awkward code for people using your types to write. People are significantly attracted to what is easy to program; the more difficult it is to iterate user types compared to builtin types, the less people will do it (and the more they will use builtin types even when they aren't a good fit). If Go wants to put user (generic) types on almost the same level (in the language) as builtin types, then I feel it needs some version of a "range over func" approach.

(Of course, you may feel that Go should not prioritize putting user types on almost the same level as builtin types.)
