Chris's Wiki :: blog

My blocking of some crawlers is an editorial decision unrelated to crawl volume

By: cks
30 May 2025 at 02:33

Recently I read a lobste.rs comment on one of my recent entries that said, in part:

Repeat after me everyone: the problem with these scrapers is not that they scrape for LLM’s, it’s that they are ill-mannered to the point of being abusive. LLM’s have nothing to do with it.

This may be some people's view but it is not mine. For me, blocking web scrapers here on Wandering Thoughts is partly an editorial decision of whether I want any of my resources or my writing to be fed into whatever they're doing. I will certainly block scrapers for doing what I consider an abusive level of crawling, and in practice most of the scrapers that I block come to my attention due to their volume, but I will block low-volume scrapers because I simply don't like what they're doing it for.

Are you a 'brand intelligence' firm that scrapes the web and sells your services to brands and advertisers? Blocked. In general, do you charge for access to whatever you're generating from scraping me? Probably blocked. Are you building a free search site for a cause (and with a point of view) that I don't particularly like? Almost certainly blocked. All of this is an editorial decision on my part on what I want to be even vaguely associated with and what I don't, not a technical decision based on the scraping's effects on my site.

I am not going to even bother trying to 'justify' this decision. It's a decision that needs no justification to some and to others, it's one that can never be justified. My view is that ethics matter. Technology and our decisions of what to do with technology are not politically neutral. We can make choices, and passively not doing anything is a choice too.

(I could say a lot of things here, probably badly, but ethics and politics are in part about what sort of a society we want, and there's no such thing as a neutral stance on that. See also.)

I would block LLM scrapers regardless of how polite they are. The only difference them being politer would make is that I would be less likely to notice (and then block) them. I'm probably not alone in this view.

Our Grafana and Loki installs have quietly become 'legacy software' here

By: cks
29 May 2025 at 03:00

At this point we've been running Grafana for quite some time (since late 2018), and (Grafana) Loki for rather less time and on a more ad-hoc and experimental basis. However, over time both have become 'legacy software' here, by which I mean that we (I) have frozen their versions and don't update them any more, and we (I) mostly or entirely don't touch their configurations any more (including, with Grafana, building or changing dashboards).

We froze our Grafana version due to backward compatibility issues. With Loki I could say that I ran out of enthusiasm for going through updates, but part of it was that Loki explicitly deprecated 'promtail' in favour of a more complex solution ('Alloy') that seemed to mostly neglect the one promtail feature we seriously cared about, namely reading logs from the systemd/journald complex. Another factor was that it became increasingly obvious that Loki was not intended for our simple setup and future versions of Loki might well work even worse in it than our current version does.

Part of Grafana and Loki going without updates and becoming 'legacy' is that any future changes in them would be big changes. If we ever have to update our Grafana version, we'll likely have to rebuild a significant number of our current dashboards, because they use panels that aren't supported any more and the replacements have a quite different look and effect, requiring substantial dashboard changes for the dashboards to stay decently usable. With Loki, if the current version stopped working I'd probably either discard the idea entirely (which would make me a bit sad, as I've done useful things through Loki) or switch to something else that had similar functionality. Trying to navigate the rapids of updating to a current Loki is probably roughly as much work (and has roughly as much chance of requiring me to restart our log collection from scratch) as moving to another project.

(People keep mentioning VictoriaLogs (and I know people have had good experiences with it), but my motivation for touching any part of our Loki environment is very low. It works, it hasn't eaten the server it's on and shows no sign of doing that any time soon, and I'm disinclined to do any more work with smart log collection until a clear need shows up. Our canonical source of history for logs continues to be our central syslog server.)

Intel versus AMD is currently an emotional decision for me

By: cks
28 May 2025 at 02:40

I recently read Michael Stapelberg's My 2025 high-end Linux PC. One of the decisions Stapelberg made was choosing an Intel (desktop) CPU because of better (ie lower) idle power draw. This is a perfectly rational decision to make, one with good reasoning behind it, and also as I read the article I realized that it was one I wouldn't have made. Not because I don't value idle power draw; like Stapelberg's machine but more so, my desktops spend most of their time essentially idle. Instead, it was because I realized (or confirmed my opinion) that right now, I can't stand to buy Intel CPUs.

I am tired of all sorts of aspects of Intel. I'm tired of their relentless CPU product micro-segmentation across desktops and servers, with things like ECC allowed in some but not all models. I'm tired of their whole dance of P-cores and E-cores, and also of having to carefully read spec sheets to understand the P-core and E-core tradeoffs for a particular model. I'm tired of Intel just generally being behind AMD and repeatedly falling on its face with desperate warmed over CPU refreshes that try to make up for its process node failings. I'm tired of Intel's hardware design failure with their 13th and 14th generation CPUs (see eg here). I'm sure AMD Ryzens have CPU errata too that would horrify me if I knew, but they're not getting rubbed in my face the way the Intel issue is.

At this point Intel has very little going for its desktop CPUs as compared to the current generation AMD Ryzens. Intel CPUs have better idle power levels, and may have better single-core burst performance. In absolute performance I probably won't notice much difference, and unlike Stapelberg I don't do the kind of work where I really care about build speed (and if I do, I have access to much more powerful machines). As far as the idle power goes, I likely will notice the better idle power level (some of the time), but my system is likely to idle at lower power in general than Stapelberg's will, especially at home where I'll try to use the onboard graphics if at all possible (so I won't have the (idle) power price of a GPU card).

(At work I need to drive two 4K displays at 60Hz and I don't think there are many motherboards that will do that with onboard graphics, even if the CPU's built in graphics system is up to it in general.)

But I don't care about the idle power issue. If or when I build a new home desktop, I'll eat the extra 20 watts or so of idle power usage for an AMD CPU (although this may vary in practice, especially with screens blanked). And I'll do it because right now I simply don't want to give Intel my money.

My GNU Emacs settings for the vertico package (as of mid 2025)

By: cks
27 May 2025 at 02:38

As covered in my Emacs packages, vertico is one of the third party Emacs packages that I have installed to modify how minibuffer completion works for me, or at least how it looks. In my experience, vertico took a significant amount of customization before I really liked it (eventually including some custom code), so I'm going to write down some notes about why I made various settings.

Vertico itself is there to always show me a number of the completion targets, as a help to narrowing in on what I want; I'm willing to trade vertical space during completion for a better view of what I'm navigating around. It's not the only way to do this (there's fido-vertical-mode in standard GNU Emacs, for example), but it's what I started with and it has a number of settings that let me control both how densely the completions are presented (and so how many of them I get to see at once) and how they're presented.

The first thing I do with vertico is override its key binding for TAB, because I want standard Emacs minibuffer tab completion, not vertico's default behavior of inserting the candidate that completion is currently on. Specifically, my key bindings are:

 :bind (:map vertico-map
             ("TAB" . minibuffer-complete)
             ;; M-v is taken by vertico
             ("M-g M-c" . switch-to-completions)
             ;; Original tab binding, which we want sometimes when
             ;; using orderless completion.
             ("M-TAB" . vertico-insert))

I normally work by using regular tab completion and orderless's completion until I'm happy, then hitting M-TAB if necessary and then RET. I use M-g M-c so rarely that I'd forgotten it until writing this entry. Using M-TAB is especially likely for a long filename completion, where I might use the cursor keys (or theoretically the mouse) to move vertico's selection to a directory and then hit M-TAB to fill it in so I can then tab-complete within it.

Normally, vertico displays a single column of completion candidates, which potentially leaves a lot of wasted space on the right; I use marginalia to add information about some sorts of completion targets (such as Emacs Lisp function names) in this space. For other sorts of completions where there's no particular additional information, such as MH-E mail folder names, I use vertico's vertico-multiform-mode to switch to a vertico-grid so that I fill the space with several columns of completion candidates and reduce the number of vertical lines that vertico uses (both are part of vertico's extensions).

(I also have vertico-mouse enabled when I'm using Emacs under X, but in practice I mostly don't use it.)

Another important change (for me) is to turn off vertico's default behavior of remembering the history of your completions and putting recently used entries first in the list. This sounds like a fine idea, but in practice I want my completion order to be completely predictable and I'm rarely completing the same thing over and over again. The one exception is my custom MH-E folder completion, where I do enable history because I may be, for example, refiling messages into one of a few folders. This is done through another extension, vertico-sort, or at least I think it is.

(When vertico is installed as an ELPA or MELPA package and then use-package'd, you apparently get all of the extensions without necessarily having to specifically enable them and can just use bits from them.)

My feeling is that effective use of vertico probably requires this sort of customization if you regularly use minibuffer completion for anything beyond standard things where vertico (and possibly marginalia) can make good use of all of your horizontal space. Beyond what key bindings and other vertico behavior you can stand and what behavior you have to change, you want to figure out how to tune vertico so that it's significantly useful for each thing you regularly complete, instead of mostly showing you a lot of empty space and useless results. This is intrinsically a relatively personal thing.

PS: One area where vertico's completion history is not as useful as it looks is filename completion or anything that looks like it (such as standard MH-E folder completion). This is because Emacs filename completion and thus vertico's history happens component by component, while you probably want your history to give you the full path that you wound up completing.

PPS: I experimented with setting vertico-resize, but found that the resulting jumping around was too visually distracting.

A thought on JavaScript "proof of work" anti-scraper systems

By: cks
26 May 2025 at 02:50

One of the things that people are increasingly using these days to deal with the issue of aggressive LLM and other web scrapers is JavaScript based "proof of work" systems, where your web server requires visiting clients to run some JavaScript to solve a challenge; one such system (increasingly widely used) is Xe Iaso's Anubis. One of the things that people say about these systems is that LLM scrapers will just start spending the CPU time to run this challenge JavaScript, and LLM scrapers may well have lots of CPU time available through means such as compromised machines. One of my thoughts is that things are not quite as simple for the LLM scrapers as they look.
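
As an illustration of the general shape of these challenges, here is a sketch of the hashcash-style idea in Python (this is not Anubis's actual scheme, and the function names are mine): the client has to burn CPU finding a nonce, while the server can verify the answer with a single hash.

import hashlib

def solve(challenge: str, difficulty_bits: int) -> int:
    # Find a nonce whose SHA-256 hash of "challenge:nonce" is below a
    # target; on average this takes about 2**difficulty_bits attempts.
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    # The server's side is cheap: one hash and one comparison.
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

The asymmetry is the point: the site can raise the difficulty for visitors it doesn't trust while still verifying answers cheaply.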

An LLM scraper is operating in a hostile environment (although its operator may not realize this). In a hostile environment, dealing with JavaScript proof of work systems is not as simple as just running the JavaScript, because you can't reliably tell a JavaScript proof of work system from JavaScript that does other things. Letting your scraper run JavaScript means that it can also run JavaScript for other purposes, for example for people who would like to exploit your scraper's CPU to do some cryptocurrency mining, or simply have you run JavaScript for as long as you'll let it keep going (perhaps because they've recognized you as an LLM scraper and want to waste as much of your CPU as possible).

An LLM scraper can try to recognize a JavaScript proof of work system but this is a losing game. The other parties have every reason to make themselves look like a proof of work system, and the proof of work systems don't necessarily have an interest in being recognized (partly because this might allow LLM scrapers to short-cut their JavaScript with optimized host implementations of the challenges). And as both spammers and cryptocurrency miners have demonstrated, there is no honor among thieves. If LLM scrapers dangle free computation in front of people, someone will spring up to take advantage of it. This leaves LLM scrapers trying to pick a JavaScript runtime limit that doesn't cut them off from too many sites, while sites can try to recognize LLM scrapers and increase their proof of work difficulty if they see a suspect.

(This is probably not an original thought, but it's been floating around my head for a while.)

PS: JavaScript proof of work systems aren't the greatest thing, but they're going to happen unless someone convincingly demonstrates a better alternative.

The length of file names in early Unix

By: cks
25 May 2025 at 01:33

If you use Unix today, you can enjoy relatively long file names on more or less any filesystem that you care to name. But it wasn't always this way. Research V7 had 14-byte filenames, and the System III/System V lineage continued this restriction until it merged with BSD Unix, which had significantly increased this limit as part of moving to a new filesystem (initially called the 'Fast File System', for good reasons). You might wonder where this unusual number came from, and for that matter, what the file name limit was on very early Unixes (it was 8 bytes, which surprised me; I vaguely assumed that it had been 14 from the start).

I've mentioned before that the early versions of Unix had a quite simple format for directory entries. In V7, we can find the directory structure specified in sys/dir.h (dir(5) helpfully directs you to sys/dir.h), which is so short that I will quote it in full:

#ifndef	DIRSIZ
#define	DIRSIZ	14
#endif
struct	direct
{
    ino_t    d_ino;
    char     d_name[DIRSIZ];
};

To fill in the last blank, ino_t was a 16-bit (two byte) unsigned integer (and field alignment on PDP-11s meant that this structure required no padding), for a total of 16 bytes. This directory structure goes back to V4 Unix. In V3 Unix and before, directory entries were only ten bytes long, with 8 byte file names.

(Unix V4 (the Fourth Edition) was when the kernel was rewritten in C, so that may have been considered a good time to do this change. I do have to wonder how they handled the move from the old directory format to the new one, since Unix at this time didn't have multiple filesystem types inside the kernel; you just had the filesystem, plus all of your user tools knew the directory structure.)

One benefit of the change in filename size is that 16-byte directory entries fit evenly in 512-byte disk blocks (or other powers-of-two buffer sizes). You never have a directory entry that spans two disk blocks, so you can deal with directories a block at a time. Ten byte directory entries don't have this property; eight-byte ones would, but then that would leave space for only six character file names, and presumably that was considered too small even in Unix V1.
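
A quick bit of arithmetic (in Python here, purely for illustration) shows the divisions involved:

for entry_size in (16, 10, 8):
    whole, leftover = divmod(512, entry_size)
    print(entry_size, whole, leftover)
# 16 bytes -> 32 entries per block, 0 left over
# 10 bytes -> 51 entries, 2 bytes left over (entries straddle blocks)
#  8 bytes -> 64 entries, 0 left over, but only 6 bytes for the name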

PS: That inode numbers in V7 (and earlier) were 16-bit unsigned integers does mean what you think it means; there could only be at most 65,536 inodes in a single classical V7 filesystem. If you needed more files, you had better make more filesystems. Early Unix had a lot of low limits like that, some of them quite hard-coded.

What keeps Wandering Thoughts more or less free of comment spam (2025 edition)

By: cks
24 May 2025 at 02:50

Like everywhere else, Wandering Thoughts (this blog) gets a certain amount of automated comment spam attempts. Over the years I've fiddled around with a variety of anti-spam precautions, although not all of them have worked out over time. It's been a long time since I've written anything about this, because one particular trick has been extremely effective ever since I introduced it.

That one trick is a honeypot text field in my 'write a comment' form. This field is normally hidden by CSS, and in any case the label for the field says not to put anything in it. However, for a very long time now, automated comment spam systems seem to operate by stuffing some text into every (text) form field that they find before they submit the form, which always trips over this. I log the form field's text out of curiosity; sometimes it's garbage and sometimes it's (probably) meaningful for the spam comment that the system is trying to submit.
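
A minimal sketch of the idea (this isn't the actual code behind Wandering Thoughts, and the field name is made up) looks something like this in Python:

from urllib.parse import parse_qs

def is_automated_spam(request_body: str) -> bool:
    form = parse_qs(request_body)
    # 'extrafield' is hidden with CSS and labelled "leave this empty";
    # people never fill it in, form-stuffing spam software almost always does.
    honeypot = form.get("extrafield", [""])[0]
    return honeypot.strip() != ""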

Obviously this doesn't stop human-submitted spam, which I get a small amount of every so often. In general I don't expect anything I can reasonably do to stop humans who do the work themselves; we've seen this play out in email and I don't have any expectations that I can do better. It also probably wouldn't work if I was using a popular platform that had this as a general standard feature, because then it would be worth the time of the people writing automated comment spam systems to automatically recognize it and work around it.

Making comments on Wandering Thoughts also has an additional small obstacle in the way of automated comment spammers, which is that you must initially preview your comment before you can submit it (although you don't have to submit the comment that you previewed, you can edit it after the first preview). Based on a quick look at my server logs, I don't think this matters to the current automated comment spam systems that try things here, as they only appear to try submitting once. I consider requiring people to preview their comment before posting it to be a good idea in general, especially since Wandering Thoughts uses a custom wiki-syntax and a forced preview gives people some chance of noticing any mistakes.

(I think some amount of people trying to write comments here do miss this requirement and wind up not actually posting their comment in the end. Or maybe they decide not to after writing one version of it; server logs give me only so much information.)

In a world that is increasingly introducing various sorts of aggressive precautions against LLM crawlers, including 'proof of work' challenges, all of this may become increasingly irrelevant. This could go either way; either the automated comment spammers die off as more and more systems have protections that are too aggressive for them to deal with, or the automated systems become increasingly browser-based and sidestep my major precaution because they no longer 'see' the honeypot field.

Fedora's DNF 5 and the curse of mandatory too-smart output

By: cks
23 May 2025 at 02:49

DNF is Fedora's high(er) level package management system, which pretty much any system administrator is going to have to use to install and upgrade packages. Fedora 41 and later have switched from DNF 4 to DNF 5 as their normal (and probably almost mandatory) version of DNF. I ran into some problems with this switch, and since then I've found other issues, all of which boil down to a simple issue: DNF 5 insists on doing too-smart output.

Regardless of what you set your $TERM to and what else you do, if DNF 5 is connected to a terminal (and perhaps if it isn't), it will pretty-print its output in an assortment of ways. As far as I can tell it simply assumes ANSI cursor addressability, among other things, and will always fit its output to the width of your terminal window, truncating output as necessary. This includes output from RPM package scripts that are running as part of the update. Did one of them print a line longer than your current terminal width? Tough, it was probably truncated. Are you using script so that you can capture and review all of the output from DNF and RPM package scripts? Again, tough, you can't turn off the progress bars and other things that will make a complete mess of the typescript.

(It's possible that you can find the information you want in /var/log/dnf5.log in un-truncated and readable form, but if so it's buried in debug output and I'm not sure I trust dnf5.log in general.)

DNF 5 is far from the only offender these days. An increasing number of command line programs simply assume that they should always produce 'smart' output (ideally only if they're connected to a terminal). They have no command line option to turn this off, and since they always use 'ANSI' escape sequences, they ignore the tradition of '$TERM' and especially 'TERM=dumb' to turn that off. Some of them can specifically disable colour output (typically with one of a number of environment variables, which may or may not be documented, and sometimes with a command line option), but that's usually the limit of their willingness to stop doing things. The idea of printing one whole line at a time as you do things, without progress bars, interleaved output, and so on, has increasingly become a non-starter for modern command line tools.
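
For contrast, the traditional checks that a well-behaved tool can make before switching to 'smart' output are simple; a sketch in Python (none of this is DNF's code) is:

import os
import sys

def want_fancy_output() -> bool:
    if not sys.stdout.isatty():
        return False          # output is going to a file or a pipe
    if os.environ.get("TERM", "") in ("", "dumb"):
        return False          # don't assume ANSI cursor addressing
    if "NO_COLOR" in os.environ:
        return False          # the informal 'no colours, please' convention
    return True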

(Another semi-offender is Debian's 'apt' and also 'apt-get' to some extent, although apt-get's progress bars can be turned off and 'apt' is explicitly a more user friendly front end to apt-get and friends.)

PS: I can't run DNF with its output directed into a file because it wants you to interact with it to approve things, and I don't feel like letting it run freely without that.

Thinking about what you'd want in a modern simple web server

By: cks
22 May 2025 at 02:14

Over on the Fediverse, I said:

I'm currently thinking about what you'd want in a simple modern web server that made life easy for sites that weren't purely static. I think you want CGI, FastCGI, and HTTP reverse proxying, plus process supervision. Automatic HTTPS of course. Rate limiting support, and who knows what you'd want to make it easier to deal with the LLM crawler problem.

(This is where I imagine a 'stick a third party proxy in the middle' mode of operation.)

What I left out of my Fediverse post is that this would be aimed at small scale sites. Larger, more complex sites can and should invest in the power, performance, and so on of headline choices like Apache and Nginx. And yes, one obvious candidate in this area is Caddy, but at the same time something that has "more scalable" (than alternatives) as a headline feature is not really targeting the same area as I'm thinking of.

This goal of simplicity of operation is why I put "process supervision" into the list of features. In a traditional reverse proxy situation (whether this is FastCGI or HTTP), you manage the reverse proxy process separately from the main webserver, but that requires more work from you. Putting process supervision into the web server has the goal of making all of that more transparent to you. Ideally, in common configurations you wouldn't even really care that there was a separate process handling FastCGI, PHP, or whatever; you could just put things into a directory or add some simple configuration to the web server and restart it, and everything would work. Ideally this would extend to automatically supporting PHP by just putting PHP files somewhere in the directory tree, just like CGI; internally the web server would start a FastCGI process to handle them or something.

(Possibly you'd implement CGI through a FastCGI gateway, but if so this would be more or less pre-configured into the web server and it'd ship with a FastCGI gateway for this (and for PHP).)

This is also the goal for making it easy to stick a third party filtering proxy in the middle of processing requests. Rather than having to explicitly set up two web servers (a frontend and a backend) with an anti-LLM filtering proxy in the middle, you would write some web server configuration bits and then your one web server would split itself into a frontend and a backend with the filtering proxy in the middle. There's no technical reason you can't do this, and even control what's run through the filtering proxy and what's served directly by the front end web server.

This simple web server should probably include support for HTTP Basic Authentication, so that you can easily create access restricted areas within your website. I'm not sure if it should include support for any other sort of authentication, but if it did it would probably be OpenID Connect (OIDC), since that would let you (and other people) authenticate through external identity providers.

It would be nice if the web server included some degree of support for more or less automatic smart in-memory (or on-disk) caching, so that if some popular site linked to your little server, things wouldn't explode (or these days, if a link to your site was shared on the Fediverse and all of the Fediverse servers that it propagated to immediately descended on your server). At the very least there should be enough rate limiting that your little server wouldn't fall over, and perhaps some degree of bandwidth limits you could set so that you wouldn't wake up to discover you had run over your outgoing bandwidth limits and were facing large charges.

I doubt anyone is going to write such a web server, since this isn't likely to be the kind of web server that sets the world on fire, and probably something like Caddy is more or less good enough.

(Doing a good job of writing such a server would also involve a fair amount of research to learn what people want to run at a small scale, how much they know, what sort of server resources they have or want to use, what server side languages they wind up using, what features they need, and so on. I certainly don't know enough about the small scale web today.)

PS: One reason I'm interested in this is that I'd sort of like such a server myself. These days I use Apache and I'm quite familiar with it, but at the same time I know it's a big beast and sometimes it has entirely too many configuration options and special settings and so on.

The five platforms we have to cover when planning systems

By: cks
21 May 2025 at 03:33

Suppose, not entirely hypothetically, that you're going to need a 'VPN' system that authenticates through OIDC. What platforms do you need this VPN system to support? In our environment, the answer is that we have five platforms that we need to care about, and they're the obvious four plus one more: Windows, macOS, iOS, Android, and Linux.

We need to cover these five platforms because people here use our services from all of those platforms. Both Windows and macOS are popular on laptops (and desktops, which still linger around), and there are enough people who use Linux that it's something we need to care about. On mobile devices (phones and tablets), obviously iOS and Android are the two big options, with people using either or both. We don't usually worry about the versions of Windows and macOS and suggest that people stick to supported ones, but that may need to change with Windows 10.

Needing to support mobile devices unquestionably narrows our options for what we can use, at least in theory, because there are certain sorts of things you can semi-reasonably do on Linux, macOS, and Windows that are infeasible to do (at least for us) on mobile devices. But we have to support access to various of our services even on iOS and Android, which constrains us to certain sorts of solutions, and ideally ones that can deal with network interruptions (which are quite common on mobile devices in Toronto, as anyone who takes our subways is familiar with).

(And obviously it's easier for open source systems to support Linux, macOS, and Windows than it is for them to extend this support to Android and especially iOS. This extends to us patching and rebuilding them for local needs; with various modern languages, we can produce Windows or macOS binaries from modified open source projects. Not so much for mobile devices.)

In an ideal world it would be easy to find out the support matrix of platforms (and features) for any given project. In this world, the information can sometimes be obscure, especially for what features are supported on what platforms. One of my resolutions to myself is that when I find interesting projects but they seem to have platform limitations, I should note down where in their documentation they discuss this, so I can find it later to see if things have changed (or to discuss with people why certain projects might be troublesome).

Python, type hints, and feeling like they create a different language

By: cks
20 May 2025 at 02:31

At this point I've only written a few, relatively small programs with type hints. At times when doing this, I've wound up feeling that I was writing programs in a language that wasn't quite exactly Python (but obviously was closely related to it). What was idiomatic in one language was non-idiomatic in the other, and I wanted to write code differently. This feeling of difference is one reason I've kept going back and forth over whether I should use type hints (well, in personal programs).

Looking back, I suspect that this is partly a product of a style where I tried to use typing.NewType a lot. As I found out, this may not really be what I want to do. Using type aliases (or just structural descriptions of the types) seems like it's going to be easier, since it's mostly just a matter of marking up things. I also suspect that this feeling that typed Python is a somewhat different language from plain Python is a product of my lack of experience with typed Python (which I can fix by doing more with types in my own code, perhaps revising existing programs to add type annotations).
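
To illustrate the difference (with invented names): a NewType is a distinct type that every value has to be explicitly wrapped into, while a type alias is just another spelling of the same type.

from typing import NewType

UserId = NewType("UserId", int)   # a distinct type to the checker
UserIdAlias = int                 # interchangeable with plain int

def lookup(uid: UserId) -> str:
    return f"user-{uid}"

lookup(UserId(42))   # fine
# lookup(42)         # a type checker flags this line, though it runs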

However, I suspect some of this feeling of difference is that you (I) want to structure 'typed' Python code differently than untyped code. In untyped Python, duck typing is fine, including things like returning None or some meaningful type, and you can to a certain extent pass things around without caring what type they are. In this sort of situation, typed Python has pushed me toward narrowing the types involved in my code (although typing.Optional can help here). Sometimes this is a good thing; at other times, I wind up using '0.0' to mean 'this float value is not set' when in untyped Python I would use 'None' (because propagating the type difference of the second way through the code is too annoying). Or to put it another way, typed Python feels less casual, and there are good and bad aspects to this.
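
A small example of that trade-off (again with made-up names): the honest typed version returns an Optional and forces every caller to handle None, while the sentinel version keeps the signature simple at the cost of 0.0 quietly meaning 'not set'.

from typing import Optional

def read_limit_strict(raw: str) -> Optional[float]:
    return float(raw) if raw else None    # callers must now check for None

def read_limit_casual(raw: str) -> float:
    return float(raw) if raw else 0.0     # 0.0 doubles as 'not set'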

Unfortunately, one significant source of Python code that I work on is effectively off limits for type hints, and that's the Python code I write for work. For that code, I need to stick to the subset of Python that my co-workers know and can readily understand, and that subset doesn't include Python's type hints. I could try to teach my co-workers about type hints, but my view is that if I'm wrestling with whether it's worth it, my co-workers will be even less receptive to the idea of trying to learn and remember them (especially when they look at my Python code only infrequently). If we were constantly working with medium to large Python programs where type hints were valuable for documenting things and avoiding irritating errors it would be one thing, but as it is our programs are small and we can go months between touching any Python code. I care about Python type hints and have active exposure to them, and even I have to refresh my memory on them from time to time.

(Perhaps some day type hints will be pervasive enough in third party Python code and code examples that my co-workers will absorb and remember them through osmosis, but that day isn't today.)

The lack of a good command line way to sort IPv6 addresses

By: cks
19 May 2025 at 02:54

A few years ago, I wrote about how 'sort -V' can sort IPv4 addresses into their natural order for you. Even back then I was smart enough to put in that 'IPv4' qualification and note that this didn't work with IPv6 addresses, and said that I didn't know of any way to handle IPv6 addresses with existing command line tools. As far as I know, that remains the case today, although you can probably build a Perl, Python, or other language program that does such sorting for you if you need to do this regularly.

Unix tools like 'sort' are pretty flexible, so you might innocently wonder why it can't be coerced into sorting IPv6 addresses. The first problem is that IPv6 addresses are written in hex without leading 0s, not decimal. Conventional sort will correctly sort hex numbers if all of the numbers are the same length, but IPv6 addresses are written in hex groups that conventionally drop leading zeros, so you will have 'ff' instead of '00ff' in common output (or '0' instead of '0000'). The second and bigger problem is the IPv6 '::' notation, which stands for the longest run of all-zero fields, ie some number of '0000' fields.

(I'm ignoring IPv6 scopes and zones for this, let's assume we have public IPv6 addresses.)

If IPv6 addresses were written out in full, with leading 0s on fields and all their 0000 fields, you could handle them as a simple conventional sort (you wouldn't even need to tell sort that the field separator was ':'). Unfortunately they almost never are, so you need to either transform them to that form, print them out, sort the output, and perhaps transform them back, or read them into a program as 128-bit numbers, sort the numbers, and print them back out as IPv6 addresses. Ideally your language of choice for this has a way to sort a collection of IPv6 addresses.

The very determined can probably do this with awk with enough work (people have done amazing things in awk). But my feeling is that doing this in conventional Unix command line tools is a Turing tarpit; you might as well use a language where there's a type of IPv6 addresses that exposes the functionality that you need.

(And because IPv6 addresses are so complex, I suspect that GNU Sort will never support them directly. If you need GNU Sort to deal with them, the best option is a program that turns them into their full form.)
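
As a sketch of both approaches, Python's standard ipaddress module compares IPv6Address objects as 128-bit numbers and can also produce the fully written out form for something like GNU sort (the little filter below is illustrative, not an existing tool):

import ipaddress
import sys

addrs = [ipaddress.IPv6Address(line.strip())
         for line in sys.stdin if line.strip()]
for a in sorted(addrs):
    print(a)          # use print(a.exploded) to emit the full form instead

Feeding it something like 'fe80::1', '::1', and '2001:db8::ff' on standard input gets them back in numeric order.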

PS: People have probably written programs to sort IPv6 addresses, but with the state of the Internet today, the challenge is finding them.

It's not obvious how to verify TLS client certificates issued for domains

By: cks
18 May 2025 at 02:24

TLS server certificate verification has two parts; you first verify that the TLS certificate is a valid, CA-signed certificate, and then you verify that the TLS certificate is for the host you're connecting to. One of the practical issues with TLS 'Client Authentication' certificates for host and domain names (which are on the way out) is that there's no standard for how you do the second part of this verification, or whether you even should. In particular, what host name are you validating the TLS client certificate against?

Some existing protocols provide the 'client host name' to the server; for example, SMTP has the EHLO command. However, existing protocols tend not to have explicitly standardized using this name (or any specific approach) for verifying a TLS client certificate if one is presented to the server, and large mail providers vary in what they send as a TLS client certificate in SMTP conversations. For example, Google's use of 'smtp.gmail.com' doesn't match any of the other names available, so its only meaning is 'this connection comes from a machine that has access to private keys for a TLS certificate for smtp.gmail.com', which hopefully means that it belongs to GMail and is supposed to be used for this purpose.

If there is no validation of the TLS client certificate host name, that is all that a validly signed TLS client certificate means; the connecting host has access to the private keys and so can be presumed to be 'part of' that domain or host. This isn't nothing, but it doesn't authenticate what exactly the client host is. If you want to validate the host name, you have to decide what to validate against and there are multiple answers. If you design the protocol you can have the protocol send a client host name and then validate the TLS certificate against the hostname; this is slightly better than using the TLS certificate's hostname as is in the rest of your processing, since the TLS certificate might have a wildcard host name. Otherwise, you might validate the TLS certificate host name against its reverse DNS, which is more complicated than you might expect and which will fail if DNS isn't working. If the TLS client certificate doesn't have a wildcard, you could also try to look up the IP addresses associated with the host names in the TLS certificate and see if any of the IP addresses match, but again you're depending on DNS.

(You can require non-wildcard TLS certificate names in your protocol, but people may not like it for various reasons.)
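
As a concrete sketch of the reverse DNS approach (this is only one possible policy, using the Python standard library; 'conn' is assumed to be a server-side SSL socket that asked for a client certificate with verify_mode set to CERT_OPTIONAL or CERT_REQUIRED):

import socket
import ssl

def client_cert_matches_rdns(conn: ssl.SSLSocket) -> bool:
    cert = conn.getpeercert()          # the verified client certificate, if any
    if not cert:
        return False
    names = [value for (kind, value) in cert.get("subjectAltName", ())
             if kind == "DNS"]
    client_ip = conn.getpeername()[0]
    try:
        rdns_name = socket.gethostbyaddr(client_ip)[0]
    except OSError:
        return False                   # no reverse DNS, one of the failure modes
    # This deliberately ignores wildcard names, one of the complications above.
    return rdns_name.lower() in (n.lower() for n in names)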

This dependency on DNS for TLS client certificates is different from the DNS dependency for TLS server certificates. If DNS doesn't work for the server case, you're not connecting at all since you have no target IPs; if you can connect, you have a target hostname to validate against (in the straightforward case of using a hostname instead of an IP address). In the TLS client certificate case, the client can connect but then the TLS server may deny it access for apparently arbitrary reasons.

That your protocol has to specifically decide what verifying TLS client certificates means (and there are multiple possible answers) is, I suspect, one reason that TLS client certificates aren't used more in general Internet protocols. In turn this is a disincentive for servers implementing TLS-based protocols (including SMTP) from telling TLS clients that they can send a TLS client certificate, since it's not clear what you should do with it if one is sent.

Let's Encrypt drops "Client Authentication" from its TLS certificates

By: cks
17 May 2025 at 02:53

The TLS news of the time interval is that Let's Encrypt certificates will no longer be usable to authenticate your client to a TLS server (via a number of people on the Fediverse). This is driven by a change in Chrome's "Root Program", covered in section 3.2, with a further discussion of this in Chrome's charmingly named Moving Forward, Together in the "Understanding dedicated hierarchies" section; apparently only half of the current root Certificate Authorities actually issue TLS server certificates. As far as I know this is not yet a CA/Browser Forum requirement, so this is all driven by Chrome.

In TLS client authentication, a TLS client (the thing connecting to a TLS server) can present its own TLS certificate to the TLS server, just as the TLS server presents its certificate to the client. The server can then authenticate the client certificate however it wants to, although how to do this is not as clear as when you're authenticating a TLS server's certificate. To enable this usage, a TLS certificate and the entire certificate chain must be marked as 'you can use these TLS certificates for client authentication' (and similarly, a TLS certificate that will be used to authenticate a server to clients must be marked as such). That marking is what Let's Encrypt is removing.
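
The marking in question is the certificate's Extended Key Usage extension. A sketch of inspecting it with the third-party Python 'cryptography' package (assuming you have a PEM certificate in hand) looks like:

from cryptography import x509
from cryptography.x509.oid import ExtendedKeyUsageOID

def allowed_tls_roles(pem_data: bytes) -> set[str]:
    cert = x509.load_pem_x509_certificate(pem_data)
    try:
        eku = cert.extensions.get_extension_for_class(x509.ExtendedKeyUsage).value
    except x509.ExtensionNotFound:
        return set()
    roles = set()
    if ExtendedKeyUsageOID.SERVER_AUTH in eku:
        roles.add("serverAuth")
    if ExtendedKeyUsageOID.CLIENT_AUTH in eku:
        roles.add("clientAuth")
    return roles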

This doesn't affect public web PKI, which basically never used conventional CA-issued host and domain TLS certificates as TLS client certificates (websites that used TLS client certificates used other sorts of TLS certificates). It does potentially affect some non-web public TLS, where domain TLS certificates have seen small usage in adding more authentication to SMTP connections between mail systems. I run some spam trap SMTP servers that advertise that sending mail systems can include a TLS client certificate if the sender wants to, and some senders (including GMail and Outlook) do send proper public TLS certificates (and somewhat more SMTP senders include bad TLS certificates). Most mail servers don't, though, and given that one of the best sources of free TLS certificates has just dropped support for this usage, that's unlikely to change. Let's Encrypt's TLS certificates can still be used by your SMTP server for receiving email, but you'll no longer be able to use them for sending it.

On the one hand, I don't think this is going to have material effects on much public Internet traffic and TLS usage. On the other hand, it does cut off some possibilities in non-web public TLS, at least until someone starts up a free, ACME-enabled Certificate Authority that will issue TLS client certificates. And probably some number of mail servers will keep sending their TLS certificates to people as client certificates even though they're no longer valid for that purpose.

PS: If you're building your own system and you want to, there's nothing stopping you from accepting public TLS server certificates from TLS clients (although you'll have to tell your TLS library to validate them as TLS server certificates, not client certificates, since they won't be marked as valid for TLS client usage). Doing the security analysis is up to you but I don't think it's a fatally flawed idea.

Classical "Single user computers" were a flawed or at least limited idea

By: cks
16 May 2025 at 02:33

Every so often people yearn for a lost (1980s or so) era of 'single user computers', whether these are simple personal computers or high end things like Lisp machines and Smalltalk workstations. It's my view that the whole idea of a 1980s style "single user computer" is not what we actually want and has some significant flaws in practice.

The platonic image of a single user computer in this style was one where everything about the computer (or at least its software) was open to your inspection and modification, from the very lowest level of the 'operating system' (which was more of a runtime environment than an OS as such) to the highest things you interacted with (both Lisp machines and Smalltalk environments often touted this as a significant attraction, and it's often repeated in stories about them). In personal computers this was a simple machine that you had full control over from system boot onward.

The problem is that this unitary, open environment is (or was) complex and often lacked resilience. Famously, in the case of early personal computers, you could crash the entire system with programming mistakes, and if there's one thing people do all the time, it's make mistakes. Most personal computers mitigated this by only doing one thing at once, but even then it was unpleasant, and the Amiga would let you blow multiple processes up at once if you could fit them all into RAM. Even on better protected systems, like Lisp and Smalltalk, you still had the complexity and connectedness of a unitary environment.

One of the things that we've learned from computing over the past N decades is that separation, isolation, and abstraction are good ideas. People can only keep track of so many things in their heads at once, and modularity (in the broad sense) is one large way we keep things within that limit (or at least closer to it). Single user computers were quite personal but usually not very modular. There are reasons that people moved to computers with things like memory protection, multiple processes, and various sorts of privilege separation.

(Let us not forget the great power of just having things in separate objects, where you can move around or manipulate or revert just one object instead of 'your entire world'.)

I think that there is a role for computers that are unapologetically designed to be used by only a single person who is in full control of everything and able to change it if they want to. But I don't think any of the classical "single user computer" designs are how we want to realize a modern version of the idea.

(As a practical matter I think that a usable modern computer system has to be beyond the understanding of any single person. There is just too much complexity involved in anything except very restricted computing, even if you start from complete scratch. This implies that an 'understandable' system really needs strong boundaries between its modules so that you can focus on the bits that are of interest to you without having to learn lots of things about the rest of the system or risk changing things you don't intend to.)

Two broad approaches to having Multi-Factor Authentication everywhere

By: cks
15 May 2025 at 03:05

In this modern age, more and more people are facing more and more pressure to have pervasive Multi-Factor Authentication, with every authentication your people perform protected by MFA in some way. I've come to feel that there are two broad approaches to achieving this and one of them is more realistic than the other, although it's also less appealing in some ways and less neat (and arguably less secure).

The 'proper' way to protect everything with MFA is to separately and individually add MFA to everything you have that does authentication. Ideally you will have a central 'single sign on' system, perhaps using OIDC, and certainly your people will want you to have only one form of MFA even if it's not all run through your SSO. What this implies is that you need to add MFA to every service and protocol you have, which ranges from generally easy (websites) through being annoying to people or requiring odd things (SSH) to almost impossible at the moment (IMAP, authenticated SMTP, and POP3). If you opt to set it up with no exemptions for internal access, this approach to MFA ensures that absolutely everything is MFA protected without any holes through which an un-MFA'd authentication can be done.

The other way is to create some form of MFA-protected network access (a VPN, a mesh network, a MFA-authenticated SSH jumphost, there are many options) and then restrict all non-MFA access to coming through this MFA-protected network access. For services where it's easy enough, you might support additional MFA authenticated access from outside your special network. For other services where MFA isn't easy or isn't feasible, they're only accessible from the MFA-protected environment and a necessary step for getting access to them is to bring up your MFA-protected connection. This approach to MFA has the obvious problem that if someone gets access to your MFA-protected network, they have non-MFA access to everything else, and the not as obvious problem that attackers might be able to MFA as one person to the network access and then do non-MFA authentication as another person on your systems and services.

The proper way is quite appealing to system administrators. It gives us an array of interesting challenges to solve, neat technology to poke at, and appealingly strong security guarantees. Unfortunately the proper way has two downsides; there's essentially no chance of it covering your IMAP and authenticated SMTP services any time soon (unless you're willing to accept some significant restrictions), and it requires your people to learn and use a bewildering variety of special purpose, one-off interfaces and sometimes software (and when it needs software, there may be restrictions on what platforms the software is readily available on). Although it's less neat and less nominally secure, the practical advantage of the MFA protected network access approach is that it's universal and it's one single thing for people to deal with (and by extension, as long as the network system itself covers all platforms you care about, your services are fully accessible from all platforms).

(In practice the MFA protected network approach will probably be two things for people to deal with, not one, since if you have websites the natural way to protect them is with OIDC (or if you have to, SAML) through your single sign on system. Hopefully your SSO system is also what's being used for the MFA network access, so people only have to sign on to it once a day or whatever.)

Using awk to check your script's configuration file

By: cks
14 May 2025 at 02:39

Suppose, not hypothetically, that you have a shell script with a relatively simple configuration file format that people can still accidentally get wrong. You'd like to check the configuration file for problems before you use it in the rest of your script, for example by using it with 'join' (where things like the wrong number or type of fields will be a problem). Recently on the Fediverse I shared how I was doing this with awk, so here's a slightly more elaborate and filled out version:

errs=$(awk '
         $1 ~ "^#" { next }
         NF != 3 {
            printf " line %d: wrong number of fields\n", NR;
            next }
         [...]
         ' "$cfg_file"
       )

if [ -n "$errs" ]; then
   echo "$prog: Errors found in '$cfg_file'. Stopping." 1>&2
   echo "$errs" 1>&2
   exit 1
fi

(Here I've chosen to have awk's diagnostic messages indented by one space when the script prints them out, hence the space before 'line %d: ...'.)

The advantage of having awk simply print out the errors it detects and letting the script deal with them later is that you don't need to mess around with awk's exit status; your awk program can simply print what it finds and be done. Using awk for the syntax checks is handy because it lets you express a fair amount of logic and checks relatively simply (you can even check for duplicate entries and so on), and it also gives you line numbers for free.

One trick with using awk in this way is to progressively filter things in your checks (by skipping further processing of the current line with 'next'). We start out by skipping all comments, then we report and otherwise skip every line with the wrong number of fields, and then every check after this can assume that at least we have the right number of fields so it can confidently check what should be in each one. If the number of fields in a line is wrong there's no point in complaining about how one of them has the wrong sort of value, and the early check and 'next' to skip the rest of this line's processing is the simple way.

If you're also having awk process the configuration file later you might be tempted to have it check for errors at the same time, in an all-in-one awk program, but my view is that it's simpler to split the error checking from the processing. That way you don't have to worry about stopping the processing if you detect errors or intermingle processing logic with checking logic. You do have to make sure the two versions have the same handling of comments and so on, but in simple configuration file formats this is usually easy.

(Speaking from personal experience, you don't want to use '$1 == "#"' as your comment definition, because then you can't just stick a '#' in front of an existing configuration file line to comment it out. Instead you have to remember to make it '# ', and someday you'll forget.)

PS: If your awk program is big and complex enough, it might make more sense to use a here document to create a shell variable containing it, which will let you avoid certain sorts of annoying quoting problems.

Our need for re-provisioning support in mesh networks (and elsewhere)

By: cks
13 May 2025 at 03:00

In a comment on my entry on how WireGuard mesh networks need a provisioning system, vcarceler pointed me to Innernet (also), an interesting but opinionated provisioning system for WireGuard. However, two bits of it combined made me twitch a bit; Innernet only allows you to provision a given node once, and once a node is assigned an internal IP, that IP is never reused. This lack of support for re-provisioning machines would be a problem for us and we'd likely have to do something about it, one way or another. Nor is this an issue unique to Innernet, as a number of mesh network systems have it.

Our important servers have fixed, durable identities, and in practice these identities are both DNS names and IP addresses (we have some generic machines, but they aren't as important). We also regularly re-provision these servers, which is to say that we reinstall them from scratch, usually on new hardware. In the usual course of events this happens roughly every two years or every four years, depending on whether we're upgrading the machine for every Ubuntu LTS release or every other one. Over time this is a lot of re-provisionings, and we need the re-provisioned servers to keep their 'identity' when this happens.

We especially need to be able to rebuild a dead server as an identical replacement if its hardware completely breaks and eats its system disks. We're already in a crisis, we don't want to have a worse crisis because other things need to be updated because we can't exactly replace the server but instead have to build a new server that fills the same role, or will once DNS is updated, configurations are updated, etc etc.

This is relatively straightforward for regular Linux servers with regular networking; there's the issue of SSH host keys, but there's several solutions. But obviously there's a problem if the server is also a mesh network node and the mesh network system will not let it be re-provisioned under the same name or the same internal IP address. Accepting this limitation would make it difficult to use the mesh network for some things, especially things where we don't want to depend on DNS working (for example, sending system logs via syslog). Working around the limitation requires reverse engineering where the mesh network system stores local state and hopefully being able to save a copy elsewhere and restore it; among other things, this has implications for the mesh network system's security model.

For us, it would be better if mesh networking systems explicitly allowed this re-provisioning. They could make it a non-default setting that took explicit manual action on the part of the network administrator (and possibly required nodes to cooperate and extend more trust than normal to the central provisioning system). Or a system like Innernet could have a separate class of IP addresses, call them 'service addresses', that could be assigned and reassigned to nodes by administrators. A node would always have its unique identity but could also be assigned one or more service addresses.

(Of course our other option is to not use a mesh network system that imposes this restriction, even if it would otherwise make our lives easier. Unless we really need the system for some other reason or its local state management is explicitly documented, this is our more likely choice.)

PS: The other problem with permanently 'consuming' IP addresses as machines are re-provisioned is that you run out of them sooner or later unless you use gigantic network blocks that are many times larger than the number of servers you'll ever have (well, in IPv4, but we're not going to switch to IPv6 just to enable a mesh network provisioning system).

How and why typical (SaaS) pricing is too high for university departments

By: cks
12 May 2025 at 02:48

One thing I've seen repeatedly is that companies that sell SaaS or SaaS-like things and offer educational pricing (because they want to sell to universities too) are setting (initial) educational pricing that is in practice much too high. Today I'm going to work through a schematic example to explain what I mean. All of this is based on how things work in Canadian and, I believe, US universities; other university systems may be somewhat different.

Let's suppose that you're a SaaS vendor and like many vendors, you price your product at $X per person per month; I'll pick $5 (US, because most of the time the prices are in USD). Since you want to sell to universities and other educational institutions and you understand they don't have as much money to spend as regular companies, you offer a generous academic discount; they pay only $3 USD per person per month.

(If these numbers seem low, I'm deliberately stacking the deck in favour of the SaaS company. Things get worse for your pricing as the numbers go up.)

The research and graduate student side of a large but not enormous university department is considering your software. They have 100 professors 'in' the department, 50 technical and administrative support staff (this is a low ratio), and professors have an average of 10 graduate students, research assistants, postdocs, outside collaborators, undergraduate students helping out with research projects, and so on around them, for a total of 1,000 additional people 'in' the department who will also have to be covered. These 1,150 people will cost the department $3,450 USD a month for your software, a total of $41,400 USD a year, which is a significant saving over what a commercial company would pay for the same number of people.
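
If you want to check the arithmetic, it's simple enough to do in shell, using the numbers above:

  echo $(( (100 + 50 + 1000) * 3 ))       # 3450 USD a month
  echo $(( (100 + 50 + 1000) * 3 * 12 ))  # 41400 USD a year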

Unfortunately, unless your software is extremely compelling or absolutely necessary, this cost is likely to be a very tough sell. In many departments, that's enough money to fund (or mostly fund) an additional low-level staff position, and it's certainly enough money to hire more TAs, supplement more graduate student stipends (these are often the same thing, since hiring graduate students as TAs is one of the ways that you support them), or pay for summer projects, all of which are likely to be more useful and meaningful to the department than a year of your service. It's also more than enough money to cause people in the department to ask awkward questions like 'how much technical staff time will it take to put together an inferior but functional enough alternative to this', which may well not be $41,000 worth of time (especially not every year).

(Of course putting together a complete equivalent of your SaaS will cost much more than that, since you have multiple full time programmers working on it and you've invested years in your software at this point. But university departments are already used to not having nice things, and staff time is often considered almost free.)

If you decide to make your pricing nicer by only charging based on the actual number of people who wind up using your stuff, unfortunately you've probably made the situation worse for the university department. One thing that's worse than a large predictable bill is an uncertain but possibly large bill; the department will have to reserve and allocate the money in its budget to cover the full cost, and then figure out what to do with the unused budget at the end of the year (or the end of every month, or whatever). Among other things, this may lead to awkward conversations with higher powers about how the department's initial budget and actual spending don't necessarily match up.

As we can see from the numbers, one big part of the issue is those 1,000 non-professor, non-staff people. These people aren't really "employees" the way they would be in a conventional organization (and mostly don't think of themselves as employees), and the university isn't set up to support their work and spend money on them in the way it is for the people it considers actual employees. The university cares if a staff member or a professor can't get their work done, and having them work faster or better is potentially valuable to the university. This is mostly not true for graduate students and many other additional people around a department (and almost entirely not true if the person is an outside collaborator, an undergraduate doing extra work to prepare for graduate studies elsewhere, and so on).

In practice, most of those 1,000 extra people will and must be supported on a shoestring basis (for everything, not just for your SaaS). The university as a whole and their department in particular will probably only pay a meaningful per-person price for them for things that are either absolutely necessary or extremely compelling. At the same time, often the software that the department is considering is something that those people should be using too, and they may need a substitute if the department can't afford the software for them. And once the department has the substitute, it becomes budgetarily tempting and perhaps politically better if everyone uses the substitute and the department doesn't get your software at all.

(It's probably okay to charge a very low price for such people, as opposed to just throwing them in for free, but it has to be low enough that the department or the university doesn't have to think hard about it. One way to look at it is that regardless of the numbers, the collective group of those extra people is 'less important' to provide services to than the technical staff, the administrative staff, and the professors, and the costs probably should work out accordingly. Certainly the collective group of extra people isn't more important than the other groups, despite having a lot more people in it.)

Incidentally, all of this applies just as much (if not more so) when the 'vendor' is the university's central organizations and they decide to charge (back) people within the university for something on a per-person basis. If this is truly cost recovery and accurately represents the actual costs to provide the service, then it's not going to be something that most graduate students get (unless the university opts to explicitly subsidize it for them).

PS: All of this is much worse if undergraduate students need to be covered too, because there are even more of them. But often the department or the university can get away with not covering them, partly because their interactions with the university are often much narrower than those of graduate students.

Using WireGuard seriously as a mesh network needs a provisioning system

By: cks
11 May 2025 at 02:45

One thing that my recent experience expanding our WireGuard mesh network has driven home to me is how (and why) WireGuard needs a provisioning system, especially if you're using it as a mesh networking system. In fact I think that if you use a mesh WireGuard setup at any real scale, you're going to wind up either adopting or building such a provisioning system.

In a 'VPN' WireGuard setup with a bunch of clients and one or a small number of gateway servers, adding a new client is mostly a matter of generating some critical information and giving it to the client. However, it's possible to more or less automate this and make it relatively easy for the people who want to connect to you. You'll still need to update your WireGuard VPN server too, but at least you only have one of them (probably), and it may well be the host where you generate the client configuration and provide it to the client's owner.

The extra problem with adding a new client to a WireGuard mesh network is that there are many more WireGuard nodes that need to be updated (and also the new client needs a lot more information; it needs to know about all of the other nodes it's supposed to talk to). More broadly, every time you change the mesh network configuration, every node needs to be updated with the new information. If you add a client, remove a client, or a client changes its keys for some reason (perhaps it had to be re-provisioned because the hardware died), all of these mean that nodes need updates (or at least the nodes that talk to the changed node). In the VPN model, only the VPN server node (and the new client) needed updates.

Our little WireGuard mesh is operating at a small scale, so we can afford to do this by hand. As you have more WireGuard nodes and more changes in nodes, you're not going to want to manually update things one by one, any more than you want to do that for other system administration work. Thus, you're going to want some sort of provisioning system, where at a minimum you can say 'this is a new node' or 'this node has been removed' and have all of your WireGuard configurations regenerated, propagated to the WireGuard nodes, reloaded there, and so on. Some amount of this can be relatively generic in your configuration management system, but not all of it.

(Many configuration systems can propagate client-specific files to clients on changes and then trigger client side actions when the files are updated. But you have to build the per-client WireGuard configuration.)
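
(As an illustration only, and not our actual tooling: here's a minimal sketch of the 'regenerate every node's peer list from one inventory file' idea. The flat 'peers' file of 'name public-key endpoint-IP mesh-IP' lines and the standard WireGuard port are assumptions on my part.)

  #!/bin/sh
  # Sketch: build a per-node list of [Peer] stanzas from a flat inventory.
  mkdir -p generated
  while read -r name pubkey endpoint meship; do
      out="generated/$name-peers.conf"
      : > "$out"
      # every node gets a [Peer] stanza for every *other* node
      while read -r pname ppub pend pmesh; do
          [ "$pname" = "$name" ] && continue
          {
              echo "[Peer]"
              echo "PublicKey = $ppub"
              echo "Endpoint = $pend:51820"
              echo "AllowedIPs = $pmesh/32"
              echo
          } >> "$out"
      done < peers
  done < peers

The real work is everything around a generator like this: getting the regenerated fragments onto the right nodes and triggering reloads, which is where your configuration management system comes in.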

PS: I haven't looked into systems that will do this for you, either as pure WireGuard provisioning systems or as bigger 'mesh networking using WireGuard' software, so I don't have any opinions on how you want to handle this. I don't even know if people have built and published things that are just WireGuard provisioning systems, or if everything out there is a 'mesh networking based on WireGuard' complex system.

Some notes on using 'join' to supplement one file with data from another

By: cks
10 May 2025 at 02:45

Recently I said something vaguely grumpy about the venerable Unix 'join' tool. As the POSIX specification page for join will unhelpfully tell you, join is a 'relational database operator', which means that it implements the rough equivalent of SQL joins. One way to use join is to add additional information for some lines in your input data.

Suppose, not entirely hypothetically, that we have an input file (or data stream) that starts with a login name and contains some additional information, and that for some logins (but not all of them) we have useful additional data about them in another file. Using join, the simple case of this is easy, if the 'master' and 'suppl' files are already sorted:

join -1 1 -2 1 -a 1 master suppl

(I'm sticking to POSIX syntax here. Some versions of join accept '-j 1' as an alternative to '-1 1 -2 1'.)

Our specific options tell join to join each line of 'master' and 'suppl' on the first field in each (the login) and print them, and also print all of the lines from 'master' that didn't have a login in 'suppl' (that's the '-a 1' argument). For lines with matching logins, we get all of the fields from 'master' and then all of the extra fields from 'suppl'; for lines from 'master' that don't match, we just get the fields from 'master'. Generally you'll tell apart which lines got supplemented and which ones didn't by how many fields they have.

If we want something other than all of the fields in the order that they are in the existing data source, in theory we have the '-o <list>' option to tell join what fields from each source to output. However, this option has a little problem, which I will show you by quoting the important bit from the POSIX standard (emphasis mine):

The fields specified by list shall be written for all selected output lines. Fields selected by list that do not appear in the input shall be treated as empty output fields.

What that means is that if we're also printing non-joined lines from our 'master' file, our '-o' still applies and any fields we specified from 'suppl' will be blank and empty (unless you use '-e'). This can be inconvenient if you were re-ordering fields so that, for example, a field from 'suppl' was listed before some fields from 'master'. It also means that you want to use '1.1' to get the login from 'master', which is always going to be there, not '2.1', the login from 'suppl', which is only there some of the time.

(All of this assumes that your supplementary file is listed second and the master file first.)

On the other hand, using '-e' we can simplify life in some situations. Suppose that 'suppl' contains only one additional interesting piece of information, and it has a default value that you'll use if 'suppl' doesn't contain a line for the login. Then if 'master' has three fields and 'suppl' two, we can write:

join -1 1 -2 1 -a 1 -e "$DEFVALUE" -o '1.1,1.2,1.3,2.2' master suppl

Now we don't have to try to tell whether or not a line from 'master' was supplemented by counting how many fields it has; everything has the same number of fields, it's just sometimes the last (supplementary) field is the default value.

(This is harder to apply if you have multiple fields from the 'suppl' file, but possibly you can find a 'there is nothing here' value that works for the rest of your processing.)
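
To make all of this concrete, here's a tiny made-up example, with both files sorted on the login and 'NONE' as the default value:

  # master has three fields, suppl has two:
  #   master: amy 1001 staff / bob 1002 guest / cks 1003 staff
  #   suppl:  amy gold / cks silver
  join -1 1 -2 1 -a 1 -e NONE -o '1.1,1.2,1.3,2.2' master suppl
  # output:
  #   amy 1001 staff gold
  #   bob 1002 guest NONE
  #   cks 1003 staff silver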

In Apache, using OIDC instead of SAML makes for easier testing

By: cks
9 May 2025 at 02:56

In my earlier installment, I wrote about my views on the common Apache modules for SAML and OIDC authentication, where I concluded that OpenIDC was generally easier to use than Mellon (for SAML). Recently I came up with another reason to prefer OIDC, one strong enough that we converted one of our remaining Mellon uses over to OIDC. The advantage is that OIDC is easier to test if you're building a new version of your web server under another name.

Suppose that you're (re)building a version of your Apache based web server with authentication on, for example, a new version of Ubuntu, using a test server name. You want to test that everything still works before you deploy it, including your authentication. If you're using Mellon, as far as I can see you have to generate an entirely new SP configuration using your test server's name and then load it into your SAML IdP. You can't use your existing SAML SP configuration from your existing web server, because it specifies the exact URL the SAML IdP needs to use for various parts of the SAML protocol, and of course those URLs point to your production web server under its production name. As far as I know, to get another set of URLs that point to your test server, you need to set up an entirely new SP configuration.

OIDC has an equivalent thing in its redirect URI, but the OIDC redirect URI works somewhat differently. OIDC identity providers typically allow you to list multiple allowed redirect URIs for a given OIDC client, and it's the client that tells the server what redirect URI to use during authentication. So when you need to test your new server build under a different name, you don't need to register a new OIDC client; you can just add some more redirect URIs to your existing production OIDC client registration to allow your new test server to provide its own redirect URI. In the OpenIDC module, this will typically require no Apache configuration changes at all (from the production version), as the module automatically uses the current virtual host as the host for the redirect URI. This makes testing rather easier in practice, and it also generally tests the Apache OIDC configuration you'll use in production, instead of a changed version of it.

(You can put a hostname in the Apache OIDCRedirectURI directive, but it's simpler to not do so. Even if you did use a full URL in this, that's a single change in a text file.)
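
(For illustration, the difference in the Apache configuration is just the form of this one directive; the '/oidc/redirect_uri' path here is a placeholder rather than anything you have to use:)

  # relative form: mod_auth_openidc derives the host from the current vhost
  OIDCRedirectURI /oidc/redirect_uri
  # absolute form: pins the redirect URI to one specific hostname
  #OIDCRedirectURI https://www.example.org/oidc/redirect_uri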

Choosing between "it works for now" and "it works in the long term"

By: cks
8 May 2025 at 02:50

A comment on my entry about how Netplan can only have WireGuard peers in one file made me realize one of my implicit system administration views (it's the first one by Jon). That is the tradeoff between something that works now and something that not only works now but is likely to keep working in the long term. In system administration this is a tradeoff, not an obvious choice, because what you want is different depending on the circumstances.

Something that works now is, for example, something that works because of how Netplan's code is currently written, where you can hack around an issue by structuring your code, your configuration files, or your system in a particular way. As a system administrator I do a surprisingly large amount of these, for example to fix or work around issues in systemd units that people have written in less than ideal or simply mistaken ways.

Something that's going to keep working in the longer term is doing things 'correctly', which is to say in whatever way the software wants you to do things and supports. Sometimes this means doing things the hard way when the software doesn't actually implement some feature that would make your life better, even if you could work around it with something that works now but isn't necessarily guaranteed to keep working in the future.

When you need something to work and there's no other way to do it, you have to take a solution that (only) works now. Sometimes you take a 'works now' solution even if there's an alternative because you expect your works-now version to be good enough for the lifetime of this system, this OS release, or whatever; you'll revisit things for the next version (at least in theory, workarounds to get things going can last a surprisingly long time if they don't break anything). You can't always insist on a 'works now and in the future' solution.

On the other hand, sometimes you don't want to do a works-now thing even if you could. A works-now thing is in some sense technical debt, with all that that implies, and this particular situation isn't important enough to justify taking on such debt. You may solve the problem properly, or you may decide that the problem isn't big and important enough to solve at all and you'll leave things in their imperfect state. One of the things I think about when making this decision is how annoying it would be and how much would have to change if my works-now solution broke because of some update.

(Another is how ugly the works-now solution is, including how big of a note we're going to want to write for our future selves so we can understand what this peculiar load bearing thing is. The longer the note, the more I generally wind up questioning the decision.)

It can feel bad to not deal with a problem by taking a works-now solution. After all, it works, and otherwise you're stuck with the problem (or with less pleasant solutions). But sometimes it's the right option and the works-now solution is simply 'too clever'.

(I've undoubtedly made this decision many times over my career. But Jon's comment and my reply to it crystalized the distinction between a 'works now' and a 'works for the long term' solution in my mind in a way that I think I can sort of articulate.)

Netplan can only have WireGuard peers in one file

By: cks
7 May 2025 at 02:43

We have started using WireGuard to build a small mesh network so that machines outside of our network can securely get at some services inside it (for example, to send syslog entries to our central syslog server). Since this is all on Ubuntu, we set it up through Netplan, which works but which I said 'has warts' in my first entry about it. Today I discovered another wart due to what I'll call the WireGuard provisioning problem:

Current status: provisioning WireGuard endpoints is exhausting, at least in Ubuntu 22.04 and 24.04 with netplan. So many netplan files to update. I wonder if Netplan will accept files that just define a single peer for a WG network, but I suspect not.

The core WireGuard provisioning problem is that when you add a new WireGuard peer, you have to tell all of the other peers about it (or at least all of the other peers you want to be able to talk to the new peer). When you're using Netplan, it would be convenient if you could put each peer in a separate file in /etc/netplan; then when you add a new peer, you just propagate the new Netplan file for the peer to everything (and do the special Netplan dance required to update peers).

(Apparently I should now call it 'Canonical Netplan', as that's what its front page calls it. At least that makes it clear exactly who is responsible for Netplan's state and how it's not going to be widely used.)

Unfortunately this doesn't work, and it fails in a dangerous way: Netplan only notices the WireGuard peers from one netplan file (at least on servers, using systemd-networkd as the backend). If you put each peer in its own file, only the first peer is picked up. If you define some peers in the file where you define your WireGuard private key, local address, and so on, and some peers in another file, only peers from whichever file is first will be used (even if the first file only defines peers, which isn't enough to bring up a WireGuard device by itself). As far as I can see, Netplan doesn't report any errors or warnings to the system logs on boot about this situation; instead, you silently get incomplete WireGuard configurations.

This is visibly and clearly a Netplan issue, because on servers you can inspect the systemd-networkd files written by Netplan (in /run/systemd/network). When I do this, the WireGuard .netdev file has only the peers from one file defined in it (and the .netdev file matches the state of the WireGuard interface). This is especially striking when the netplan file with the private key and listening port (and some peers) is second; since the .netdev file contains the private key and so on, Netplan is clearly merging data from more than one netplan file, not completely ignoring everything except the first one. It's just ignoring any peers encountered after the first set of them.
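
(A quick way to see what actually took effect on such a server is to count the peer stanzas in the generated .netdev file and compare that with the live interface; the '10-netplan-wg0' name here is what Netplan generates for an interface called 'wg0', so adjust it to your setup:)

  # how many peers did Netplan actually hand to systemd-networkd?
  grep -c '^\[WireGuardPeer\]' /run/systemd/network/10-netplan-wg0.netdev
  # and how many does the live interface have?
  wg show wg0 peers | wc -l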

My overall conclusion is that in Netplan, you need to put all configuration for a given WireGuard interface into a single file, however tempting it might be to try splitting it up (for example, to put core WireGuard configuration stuff in one file and then list all peers in another one).

I don't know if this is an already filed Netplan bug and I don't plan on bothering to file one for it, partly because I don't expect Canonical to fix Netplan issues any more than I expect them to fix anything else and partly for other reasons.

PS: I'm aware that we could build a system to generate the Netplan WireGuard file, or maybe find a YAML manipulating program that could insert and delete blocks that matched some criteria. I'm not interested in building yet another bespoke custom system to deal with what is (for us) a minor problem, since we don't expect to be constantly deploying or removing WireGuard peers.

I moved my local Firefox changes between Git trees the easy way

By: cks
6 May 2025 at 03:20

Firefox recently officially switched to Git, in a completely different Git tree than their old mirror. This presented me with a bit of a problem because I have a collection of local changes I make to my own Firefox builds, which I carry as constantly-rebased commits on top of the upstream Firefox tree. The change in upstream trees meant that I was going to have to move my commits to the new tree. When I wrote my first entry I thought I might try to do this in some clever way similar to rebasing my own changes on top of something that was rebased, but in the end I decided to do it the simple and brute force way that I was confident would either work or would leave me in a situation I could back out from easily.

This simple and brute force way was to get both my old tree and my new 'firefox' tree up to date, then export my changes with 'git format-patch' from the old tree and import them into the new tree with 'git am'. There were a few irritations along the way, of course. First I (re)discovered that 'git am' can't directly consume the directory of patches you create with 'git format-patch'. Git-am will consume a Maildir of patches, but git-format-patch will only give you a directory full of files with names like '00NN-<author>-<title>.patch', which is not a proper Maildir. The solution is to cat all of the .patch files together in order to some other file, which is now a mailbox that git-am will handle. The other minor thing is that git-am unsurprisingly has no 'dry-run' option (which would probably be hard to implement). Of course in my situation, I can always reset 'main' back to 'origin/main', which was one reason I was willing to try this.

(Looking at the 'git format-patch' manual page suggests that what I might have wanted was the '--stdout' option, which would have automatically created the mbox format version for me. On the other hand it was sort of nice to be able to look at the list of patches and see that they were exactly what I expected.)
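
(In concrete terms the whole process is only a few commands; the paths here are made up:)

  # in the old tree: export everything on top of origin/main as one mbox
  git format-patch --stdout origin/main..main >/tmp/local-changes.mbox
  # in the new tree: apply the whole stack
  cd ../firefox
  git am /tmp/local-changes.mbox
  # if it goes wrong: 'git am --abort', then 'git reset --hard origin/main'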

On the one hand, moving my changes in this brute force way (and to a completely separate new tree) feels like giving in to my unfamiliarity with git. There are probably clever git ways to do this move in a single tree without having to turn everything into patches and then apply them (even if most of that is automated). On the other hand, this got the job done with minimal hassles and time consumed, and sometimes I need to put a stop to my programmer's urge to be clever.

LLMs ('AI') are coming for our jobs whether or not they work

By: cks
5 May 2025 at 02:58

Over on the Fediverse, I said something about this:

Hot take: I don't really know what vibe coding is but I can confidently predict that it's 'coming for', if not your job, then definitely the jobs of the people who work in internal development at medium to large non-tech companies. I can predict this because management at such companies has *always* wanted to get rid of programmers, and has consistently seized on every excuse presented by the industry to do so. COBOL, report generators, rule based systems, etc etc etc at length.

(The story I heard is that at one point COBOL's English language basis was at least said to enable non-programmers to understand COBOL programs and maybe even write them, and this was seen as a feature by organizations adopting it.)

The current LLM craze is also coming for the jobs of system administrators for the same reason; we're overhead, just like internal development at (most) non-tech companies. In most non-tech organizations, both internal development and system administration are something like janitorial services; you have to have them because otherwise your organization falls over, but you don't like them and you're happy to spend as little on them as possible. And, unfortunately, we have a long history in technology that shows the long term results don't matter for the people making short term decisions about how many people to hire and who.

(Are they eating their seed corn? Well, they probably don't think it matters to them, and anyway that's a collective problem, which 'the market' is generally bad at solving.)

As I sort of suggested by using 'excuse' in my Fediverse post, it doesn't really matter if LLMs truly work, especially if they work over the long run. All they need to do in order to get senior management enthused about 'cutting costs' is appear to work well enough over the short term, and appearing to work is not necessarily a matter of substance. In sort of a flipside of how part of computer security is convincing people, sometimes it's enough to simply convince (senior) people and not have obvious failures.

(I have other thoughts about the LLM craze and 'vibe coding', as I understand it, but they don't fit within the margins of this entry.)

PS: I know it's picky of me to call this an 'LLM craze' instead of an 'AI craze', but I feel I have to both as someone who works in a computer science department that does all sorts of AI research beyond LLMs and as someone who was around for a much, much earlier 'AI' craze (that wasn't all of AI either, cf).

These days, Linux audio seems to just work (at least for me)

By: cks
4 May 2025 at 02:29

For a long time, the common perception was that 'Linux audio' was the punchline for a not particularly funny joke. I sort of shared that belief; although audio had basically worked for me for a long time, I had a simple configuration and dreaded having to make more complex audio work in my unusual desktop environment. But these days, audio seems to just work for me, even in systems that have somewhat complex audio options.

On my office desktop, I've wound up with three potential audio outputs and two audio inputs: the motherboard's standard sound system, a USB headset with a microphone that I use for online meetings, the microphone on my USB webcam, and (to my surprise) a HDMI audio output because my LCD displays do in fact have tiny little speakers built in. In PulseAudio (or whatever is emulating it today), I have the program I use for online meetings set to use the USB headset and everything else plays sound through the motherboard's sound system (which I have basic desktop speakers plugged into). All of this works sufficiently seamlessly that I don't think about it, although I do keep a script around to reset the default audio destination.
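
(A reset script like that can be as small as a couple of pactl commands; the sink name below is a placeholder, and you'd look up your own with 'pactl list short sinks':)

  #!/bin/sh
  # make the motherboard's sound system the default output again
  pactl set-default-sink alsa_output.pci-0000_00_1f.3.analog-stereo
  # and optionally drag any currently playing streams over to it
  pactl list short sink-inputs | while read -r id rest; do
      pactl move-sink-input "$id" @DEFAULT_SINK@
  done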

On my home desktop, for a long time I had a simple single-output audio system that played through the motherboard's sound system (plus a microphone on a USB webcam that was mostly not connected). Recently I got an outboard USB DAC and, contrary to my fears, it basically plugged in and just worked. It was easy to set the USB DAC as the default output in pavucontrol and all of the settings related to it stick around even when I put it to sleep overnight and it drops off the USB bus. I was quite pleased by how painless the USB DAC was to get working, since I'd been expecting much more hassles.

(Normally I wouldn't bother meticulously switching the USB DAC to standby mode when I'm not using it for an extended time, but I noticed that the case is clearly cooler when it rests in standby mode.)

This is still a relatively simple audio configuration because it's basically static. I can imagine more complex ones, where you have audio outputs that aren't always present and that you want some programs (or more generally audio sources) to use when they are present, perhaps even with priorities. I don't know if the Linux audio systems that Linux distributions are using these days could cope with that, or if they did would give you any easy way to configure it.

(I'm aware that PulseAudio and so on can be fearsomely complex under the hood. As far as the current actual audio system goes, I believe that what my Fedora 41 machines are using for audio is PipeWire (also) with WirePlumber, based on what processes seem to be running. I think this is the current Fedora 41 audio configuration in general, but I'm not sure.)

The HTTP status codes of responses from about 22 hours of traffic to here (part 2)

By: cks
3 May 2025 at 03:09

A few months ago, I wrote an entry about this topic, because I'd started putting in some blocks against crawlers, including things that claimed to be old versions of browsers, and I'd also started rate-limiting syndication feed fetching. Unfortunately, my rules at the time were flawed, rejecting a lot of people that I actually wanted to accept. So here are some revised numbers from today, a day when my logs suggest that I've seen what I'd call broadly typical traffic and traffic levels.

I'll start with the overall numbers (for HTTP status codes) for all requests:

  10592 403		[26.6%]
   9872 304		[24.8%]
   9388 429		[23.6%]
   8037 200		[20.2%]
   1629 302		[ 4.1%]
    114 301
     47 404
      2 400
      2 206

This is a much more balanced picture of activity than the last time around, with a lot less of the overall traffic being HTTP 403s. The HTTP 403s are from aggressive blocks, the HTTP 304s and HTTP 429s are mostly from syndication feed fetchers, and the HTTP 302s are mostly from things with various flaws that I redirect to informative static pages instead of giving HTTP 403s. The two HTTP 206s were from Facebook's 'externalhit' agent on a recent entry. A disturbing number of the HTTP 403s were from Bing's crawler, and almost 500 of them were from something claiming to be an Akkoma Fediverse server. 8.5% of the HTTP 403s were from something using Go's default User-Agent string.
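
(Status code breakdowns like this are easy to produce from an Apache-style access log; this assumes the status code is the ninth whitespace-separated field, which may not match your log format:)

  awk '{ print $9 }' access.log | sort | uniq -c | sort -rn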

The most popular User-Agent strings today for successful requests (of anything) were for versions of NetNewsWire, FreshRSS, and Miniflux, then Googlebot and Applebot, and then Chrome 130 on 'Windows NT 10'. Although I haven't checked, I assume that all of the first three were for syndication feeds specifically, with few or no fetches of other things. Meanwhile, Googlebot and Applebot can only fetch regular pages; they're blocked from syndication feeds.

The picture for syndication feeds looks like this:

   9923 304		[42%]
   9535 429		[40%]
   1984 403		[ 8.5%]
   1600 200		[ 6.8%]
    301 302
     34 301
      1 404

On the one hand it's nice that 42% of syndication feed fetches successfully did a conditional GET. On the other hand, it's not nice that 40% of them got rate-limited, or that there were clearly more explicitly blocked requests than there were HTTP 200 responses. On the sort of good side, 37% of the blocked feed fetches were from one IP that's using "Go-http-client/1.1" as its User-Agent (and which accounts for 80% of the blocks of that). This time around, about 58% of the requests were for my syndication feed, which is better than it was before but still not great.

These days, if certain problems are detected in a request I redirect the request to a static page about the problem. This gives me some indication of how often these issues are detected, although crawlers may be re-visiting the pages on their own (I can't tell). Today's breakdown of this is roughly:

   78%  too-old browser
   13%  too generic a User-Agent
    9%  unexpectedly using HTTP/1.0

There were slightly more HTTP 302 responses from requests to here than there were requests for these static pages, so I suspect that not everything that gets these redirects follows them (or at least doesn't bother re-fetching the static page).

I hope that the better balance in HTTP status codes here is a sign that I have my blocks in a better state than I did a couple of months ago. It would be even better if the bad crawlers would go away, but there's little sign of that happening any time soon.

The complexity of mixing mesh networking and routes to subnets

By: cks
2 May 2025 at 02:51

One of the in things these days is encrypted (overlay) mesh networks, where you have a bunch of nodes and the nodes have encrypted connections to each other that they use for (at least) internal IP traffic. WireGuard is one of the things that can be used for this. A popular thing to add to such mesh network solutions is 'subnet routes', where nodes will act as gateways to specific subnets, not just endpoints in themselves. This way, if you have an internal network of servers at your cloud provider, you can establish a single node on your mesh network and route to the internal network through that node, rather than having to enroll every machine in the internal network.

(There are various reasons not to enroll every machine, including that on some of them it would be a security or stability risk.)

In simple configurations this is easy to reason about and easy to set up through the tools that these systems tend to give you. Unfortunately, our network configuration isn't simple. We have an environment with multiple internal networks, some of which are partially firewalled off from each other, and where people would want to enroll various internal machines in any mesh networking setup (partly so they can be reached directly). This creates problems for a simple 'every node can advertise some routes and you accept the whole bundle' model.

The first problem is what I'll call the direct subnet problem. Suppose that you have a subnet with a bunch of machines on it and two of them are nodes (call them A and B), with one of them (call it A) advertising a route to the subnet so that other machines in the mesh can reach it. The direct subnet problem is that you don't want B to ever send its traffic for the subnet to A; since it's directly connected to the subnet, it should send the traffic directly. Whether or not this happens automatically depends on various implementation choices the setup makes.
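
(On node B you can at least see what's actually happening; with plain WireGuard the routes end up in the main routing table, so something like the following shows which path wins. The addresses are placeholders:)

  # which route does traffic to a host on the subnet actually use?
  ip route get 10.1.1.20
  # all routes for the subnet itself, with devices and metrics
  ip route show 10.1.1.0/24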

The second problem is the indirect subnet problem. Suppose that you have a collection of internal networks that can all talk to each other (perhaps through firewalls and somewhat selectively). Not all of the machines on all of the internal networks are part of the mesh, and you want people who are outside of your networks to be able to reach all of the internal machines, so you have a mesh node that advertises routes to all of your internal networks. However, if a mesh node is already inside your perimeter and can reach your internal networks, you don't want it to go through your mesh gateway; you want it to send its traffic directly.

(You especially want this if mesh nodes have different mesh IPs from their normal IPs, because you probably want the traffic to come from the normal IP, not the mesh IP.)

You can handle the direct subnet case with a general rule like 'if you're directly attached to this network, ignore a mesh subnet route to it', or by some automatic system like route priorities. The indirect subnet case can't be handled automatically because it requires knowledge about your specific network configuration and what can reach what without the mesh (and what you want to reach what without the mesh, since some traffic you want to go over the mesh even if there's a non-mesh route between the two nodes). As far as I can see, to deal with this you need the ability to selectively configure or accept (subnet) routes on a mesh node by mesh node basis.

(In a simple topology you can get away with accepting or not accepting all subnet routes, but in a more complex one you can't. You might have two separate locations, each with their own set of internal subnets. Mesh nodes in each location want the other location's subnet routes, but not their own location's subnet routes.)

Being reminded that Git commits are separate from Git trees

By: cks
1 May 2025 at 02:28

Firefox's official source repository has moved to Git, but to a completely new Git repository, not the Git mirror that I've used for the past few years. This led me to a lament on the Fediverse:

This is my sad face that Firefox's switch to using git of course has completely different commit IDs than the old not-official gecko-dev git repository, meaning that I get to re-clone everything from scratch (all ~8 GB of it). Oh well, so it goes in the land of commit hashes.

Then Tim Chase pointed out something that I should have thought of:

If you add the new repo as a secondary remote in your existing one and pull from it, would it mitigate pulling all the blobs (which likely remain the same), limiting your transfer to just the commit-objects (and possibly some treeish items and tags)?

Git is famously a form of content-addressed storage, or more specifically a tree of content addressed storage, where as much as possible is kept the same over time. This includes all the portions of the actual source tree. A Git commit doesn't directly include a source tree; instead it just has the hash of the source tree (well, its top level, cf).

What this means is that if you completely change the commits so that all of them have new hashes, for example by rebuilding your history from scratch in a new version of the repository, but you keep the actual tree contents the same in most or all of the commits, the only thing that actually changes is the commits. If you add this new repository (with its new commit history) as a Git remote to your existing repository and pull from it, most or all of the tree contents are the same across the two sets of commits and won't have to be fetched. So you don't fetch gigabytes of tree contents, you only fetch megabytes (one hopes) of commits.
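
(The mechanics of this are simple; the remote name and URL here are placeholders for wherever the new official repository lives:)

  cd gecko-dev                # the existing clone of the old mirror
  git remote add firefox https://example.org/new-firefox.git
  git fetch firefox
  # most tree and blob objects are already present locally, so this should
  # transfer little more than the new commit objects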

As I mentioned on the Fediverse, I was told this too late to save me from re-fetching the entire new Firefox repository from scratch on my office desktop (which has lots of bandwidth). I may yet try this on my home desktop, or alternately use it on my office desktop to easily move my local changes on top of the new official Git history.

(I think this is effectively rebasing my own changes on top of something that's been rebased, which I've done before, although not recently. I'll also want to refresh my understanding of what 'git rebase' does.)

The appeal of keyboard launchers for (Unix) desktops

By: cks
30 April 2025 at 03:41

A keyboard launcher is a big part of my (modern) desktop, but over on the Fediverse I recently said something about them in general:

I don't necessarily suggest that people use dmenu or some equivalent. Keyboard launchers in GUI desktops are an acquired taste and you need to do a bunch of setup and infrastructure work before they really shine. But if you like driving things by the keyboard and will write scripts, dmenu or equivalents can be awesome.

The basic job of a pure keyboard launcher is to let you hit a key, start typing, and then select and do 'something'. Generally the keyboard launcher will make a window appear so that you can see what you're typing and maybe what you could complete it to or select.

The simplest and generally easiest way to use a keyboard launcher, and how many of them come configured to work, is to use it to select and run programs. You can find a version of this idea in GNOME, and even Windows has a pseudo-launcher in that you can hit a key to pop up the Start menu and the modern Start menu lets you type in stuff to search your programs (and other things). One problem with the GNOME version, and many basic versions, is that in practice you don't necessarily launch desktop programs all that often or launch very many different ones, so you can have easier ways to invoke the ones you care about. One problem with the Windows version (at least in my experience) is that it will do too much, which is to say that no matter what garbage you type into it by accident, it will do something with that garbage (such as launching a web search).

The happy spot for a keyboard launcher is somewhere in the middle, where it does a variety of things that are useful for you but not without limits. The best keyboard launcher for your desktop is one that gives you fast access to whatever things you do a lot, ideally with completion so you type as little as possible. When you have it tuned up and working smoothly the feel is magical; I tap a key, type a couple of characters and then hit tab, hit return, and the right thing happens without me thinking about it, all fast enough that I can and do type ahead blindly (which then goes wrong if the keyboard launcher doesn't start fast enough).

The problem with keyboard launchers, and why they're not for everyone, is that everyone has a different set of things that they do a lot and that are useful for them to trigger entirely through the keyboard. No keyboard launcher will come precisely set up for what you do a lot in their default installation, so at a minimum you need to spend the time and effort to curate what the launcher will do and how it does it. If you're more ambitious, you may need to build supporting scripts that give the launcher a list of things to complete and then act on them when you complete one. If you don't curate the launcher and throw in the kitchen sink, you wind up with the Windows experience where it will certainly do something when you type things but perhaps not really what you wanted.

(For example, I routinely ssh to a lot of our machines, so my particular keyboard launcher setup lets me type a machine name (with completion) to start a session to it. But I had to build all of that, including sourcing the machine names I wanted included from somewhere, and this isn't necessarily useful for people who aren't constantly ssh'ing to machines.)
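
(The glue for something like that can be quite small; this sketch assumes a hypothetical 'machine-list' helper that prints one hostname per line, which is where all the real work is:)

  #!/bin/sh
  host=$(machine-list | dmenu -i -p 'ssh:') || exit 0
  [ -n "$host" ] && exec xterm -e ssh "$host"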

There are a variety of keyboard launchers for both X and Wayland, basically none of which I have any experience with. See the Arch Wiki section on application launchers. Someday I will have to get a Wayland equivalent to my particular modified dmenu, a thought that fills me with no more enthusiasm than any other part of replacing my whole X environment.

PS: Another issue with keyboard launchers is that sometimes you're wrong about what you want to do with them. I once built an entire keyboard launcher setup to select terminal windows and then later wound up abandoning it when I didn't use it enough.

Updating venv-based things by replacing the venv not updating it

By: cks
29 April 2025 at 03:01

These days, we have mostly switched over to installing third-party Python programs (and sometimes things like Django) in virtual environments instead of various past practices. This is clearly the way Python expects you to do things and increasingly problems emerge if you don't. One of the issues I've been thinking about is how we want to handle updating these programs when they release new versions, because there are two approaches.

One option would be to update the existing venv in place, through various 'pip' commands. However, pip-based upgrades have some long-standing issues, and also they give you no straightforward way to revert an upgrade if something goes wrong. The other option is to build a separate venv with the new version of the program (and all of its current dependency versions) and then swap the whole new venv into place, which works because venvs can generally be moved around. You can even work with symbolic links, creating a situation where you refer to 'dir/program', which is a symlink to 'dir/venvs/program-1.2.0' or 'dir/venvs/program-1.3.0' or whatever you want today.
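
(A minimal sketch of the symlink version, with made-up paths and program name; the 'mv -T' at the end atomically repoints the live symlink, and the old venv stays around for rollback:)

  python3 -m venv /opt/venvs/program-1.3.0
  /opt/venvs/program-1.3.0/bin/pip install 'program==1.3.0'
  # test it, then switch the live name over to the new venv
  ln -s /opt/venvs/program-1.3.0 /opt/program.new
  mv -T /opt/program.new /opt/program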

In practice we're more likely to have 'dir/program' be a real venv and just create 'dir/program-new', rename directories, and so on. The full scale version with always versioned directories is likely to only be used for things, like Django, where we want to be able to easily see what version we're running and switch back very simply.

Our Django version upgrades were always going to be handled by building entirely new venvs and switching to them (it's the venv version of what we did before). We haven't had upgrades of other venv-based programs until recently, and when I started thinking about it, I reached the obvious conclusion: we'll update everything by building a new venv and replacing the old one, because this deals with pretty much all of the issues at the small cost of yet more disk space for yet more venvs.

(This feels quite obvious once I'd made the decision, but I want to write it down anyway. And who knows, maybe there are reasons to update venvs in place. The one that I can think of is to only change the main program version but not any of the dependencies, if they're still compatible.)

The glass box/opaque box unit testing argument in light of standards

By: cks
28 April 2025 at 02:28

One of the traditional divides in unit testing is whether you should write 'glass box' or 'opaque box' tests (like GeePawHill I think I prefer those terms to the traditional ones), which is to say whether you should write tests exploiting your knowledge of the module's code or without it. Since I prefer testing inside my modules, I'm implicitly on the side of glass box tests; even if I'm testing public APIs, I write tests with knowledge of potential corner cases. Recently, another reason for this occurred to me, by analogy to standards.

I've read about standards (and read the actual standards) enough by now to have absorbed the lesson that it is very hard to write a (computer) standard that can't be implemented perversely. Our standards need good faith implementations and there's only so much you can do to make it hard for people implementing them in bad faith. After that, you have to let the 'market' sort it out (including the market of whether or not people want to use perverse implementations, which generally they don't).

(Of course some time the market doesn't give you a real choice. Optimizing C compilers are an example, where your only two real options (GCC and LLVM) have aggressively exploited arguably perverse readings of 'undefined behavior' as part of their code optimization passes. There's some recent evidence that this might not always be worth it [PDF], via.)

If you look at them in the right way, unit tests are also a sort of standard. And like standards, opaque box unit tests have a very hard time of completely preventing perverse implementations. While people usually don't deliberately create perverse implementations, they can happen by accident or by misunderstandings, and there can be areas of perverse problems due to bugs. Your cheapest assurance that you don't have a perverse implementation is to peer inside and then write glass box tests that in part target the areas where perverse problems could arise. If you write opaque box tests, you're basically hoping that you can imagine all of the perverse mistakes that you'll make.

(Some things are amenable to exhaustive testing, but usually not very many.)

PS: One way to get perverse implementations is 'write code until all of the tests pass, then stop'. This doesn't guarantee a perverse implementation but it certainly puts the onus on the tests to force the implementation to do things, much like with standards (cf).

Trying to understand OpenID Connect (OIDC) and its relation to OAuth2

By: cks
27 April 2025 at 02:34

The OIDC specification describes it as "a simple identity layer" on top of OAuth2. As I've been discovering, this is sort of technically true but also misleading. Since I think I've finally sorted this out, here's what I've come to understand about the relationship.

OAuth2 describes an HTTP-based protocol where a client (typically using a web browser) can obtain an access token from an authorization server and then present this token to a resource server to gain access to something. For example, your mail client works with a browser to obtain an access token from an OAuth2 identity provider, which it then presents to your IMAP server. However, the base OAuth2 specification is only concerned with the interaction between clients and the authorization server; it explicitly has nothing to say about issues like how a resource server validates and uses the access tokens. This is right at the start of RFC 6749:

The interaction between the authorization server and resource server is beyond the scope of this specification. [...]

Because it's purely about the client to authorization server flows, the base OAuth2 RFC provides nothing that will let your IMAP server validate the alleged 'OAuth2 access token' your mail client has handed it (or find out from the token who you are). There were customary ways to do this, and then later you had RFC 7662 Token Introspection or perhaps RFC 9068 JWT access tokens, but those are all outside basic OAuth2.

(This has obvious effects on interoperability. You can't write a resource server that works with arbitrary OAuth2 identity providers, or an OAuth2 identity provider of your own that everyone will be able to work with. I suspect that this is one reason why, for example, IMAP mail clients often only support a few big OAuth2 identity providers.)

OIDC takes the OAuth2 specification and augments it in a number of ways. In addition to an OAuth2 access token, an OIDC identity provider can also give clients (you) an ID Token that's a (signed) JSON Web Token (JWT) that has a specific structure and contains at least a minimal set of information about who authenticated. An OIDC IdP also provides an official Userinfo endpoint that will provide information about an access token, although this is different information than the RFC 7662 Token Introspection endpoint.

Both of these changes make resource servers and by extension OIDC identity providers much more generic. If a client hands a resource server either an OIDC ID Token or an OIDC Access Token, the resource server ('consumer') has standard ways to inspect and verify them. If your resource server isn't too picky (or is sufficiently configurable), I think it can work with either an OIDC Userinfo endpoint or an OAuth2 RFC 7662 Token Introspection endpoint (I believe this is true of Dovecot, cf).
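
(As a concrete illustration of the Userinfo path, here's roughly what a resource server has to do, with a placeholder IdP URL and the Access Token in $TOKEN:)

  # find the Userinfo endpoint from the IdP's discovery metadata
  UINFO=$(curl -s https://idp.example.org/.well-known/openid-configuration |
          jq -r .userinfo_endpoint)
  # present the Access Token as a Bearer token and get the claims back
  curl -s -H "Authorization: Bearer $TOKEN" "$UINFO" | jq .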

(OIDC is especially convenient in cases like websites, where the client that gets the OIDC ID Token and Access Token is the same thing that uses them.)

An OAuth2 client can talk to an OIDC IdP as if it was an OAuth2 IdP and get back an access token, because the OIDC IdP protocol flow is compatible with the OAuth2 protocol flow. This access token could be described as an 'OAuth2' access token, but this is sort of meaningless to say since OAuth2 gives you nothing you can do with an access token. An OAuth2 resource server (such as an IMAP server) that expects to get 'OAuth2 access tokens' may or may not be able to interact with any particular OIDC IdP to verify those OIDC IdP provided tokens to its satisfaction; it depends on what the resource server supports and requires. For example, if the resource server specifically requires RFC 7662 Token Introspection you may be out of luck, because OIDC IdPs aren't required to support that and not all do.

In practice, I believe that OIDC has been around for long enough and has been popular enough that consumers of 'OAuth2 access tokens', like your IMAP server, will likely have been updated so that they can work with OIDC Access Tokens. Servers can do this either by verifying the access tokens through an OIDC Userinfo endpoint (with suitable server configuration to tell them what to look for) or by letting you tell them that the access token is a JWT and verifying the JWT. OIDC doesn't require the access token to be a JWT but OIDC IdPs can (and do) use JWTs for this, and perhaps you can actually have your client software send the ID Token (which is guaranteed to be a JWT) instead of the OIDC Access Token.

(It helps that OIDC is obviously better if you want to write 'resource server' side software that works with any IdP without elaborate and perhaps custom configuration or even programming for each separate IdP.)

(I have to thank Henryk PlΓΆtz for helping me understand OAuth2's limited scope.)

(The basic OAuth2 has been extended with multiple additional standards, see eg RFC 8414, and if enough of them are implemented in both your IdP and your resource servers, some of this is fixed. OIDC has also been extended somewhat, see eg OpenID Provider Metadata discovery.)

Looking at OIDC tokens and getting information on them as a 'consumer'

By: cks
26 April 2025 at 01:56

In OIDC, roughly speaking and as I understand it, there are three possible roles: the identity provider ('OP'), a Client or 'Relying Party' (the program, website, or whatever that has you authenticate with the IdP and that may then use the resulting authentication information), and what is sometimes called a 'resource server', which uses the IdP's authentication information that it gets from you (your client, acting as a RP). 'Resource Server' is actually an OAuth2 term, which comes into the picture because OIDC is 'a simple identity layer' on top of OAuth2 (to quote from the core OIDC specification). A website authenticating you with OIDC can be described as acting both as a 'RP' and a 'RS', but in cases like IMAP authentication with OIDC/OAuth2, the two roles are separate; your mail client is a RP, and the IMAP server is a RS. I will broadly call both RPs and RSs 'consumers' of OIDC tokens.

When you talk to an OIDC IdP to authenticate, you can get back either or both of an ID Token and an Access Token. The ID Token is always a JWT with some claims in it, including the 'sub(ject)', the 'issuer', and the 'aud(ience)' (which is what client the token was requested by), although this may not be all of the claims you asked for and are entitled to. In general, to verify an ID Token (as a consumer), you need to extract the issuer, consult the issuer's provider metadata to find how to get their keys, and then fetch the keys so you can check the signature on the ID Token (and then proceed to do a number of additional verifications on the information in the token, as covered in the specification). You may cache the keys to save yourself the network traffic, which allows you to do offline verification of ID Tokens. Quite commonly, you'll only accept ID Tokens from pre-configured issuers, not any random IdP on the Internet (ie, you will verify that the 'iss' claim is what you expect). As far as I know, there's no particular way in OIDC to tell if the IdP still considers the ID Token valid or to go from an ID Token alone to all of the claims you're entitled to.

The Access Token officially doesn't have to be anything more than an opaque string. To validate it and get the full set of OIDC claim information, including the token's subject (ie, who it's for), you can use the provider's Userinfo endpoint. However, this doesn't necessarily give you the 'aud' information that will let you verify that this Access Token was created for use with you and not someone else. If you have to know this information, there are two approaches, although an OIDC identity provider doesn't have to support either.

The first is that the Access Token may actually be a RFC 9068 JWT. If it is, you can validate it in the usual OIDC JWT way (as for an ID Token) and then use the information inside, possibly in combination with what you get from the Userinfo endpoint. The second is that your OAuth2 provider may support an RFC 7662 Token Introspection endpoint. This endpoint is not exposed in the issuer's provider metadata and isn't mandatory in OIDC, so your IdP may or may not support it (ours doesn't, although that may change someday).

(There's also an informal 'standard' way of obtaining information about Access Tokens that predates RFC 7662. For all of the usual reasons, this may still be supported by some large, well-established OIDC/OAuth2 identity providers.)
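
If your IdP does support RFC 7662 token introspection, the check itself is a simple authenticated POST. A minimal sketch (the endpoint URL and client credentials are hypothetical):

import requests

introspection_url = "https://idp.example.org/introspect"   # hypothetical
client_id = "imap-server"                                   # hypothetical
client_secret = "not-the-real-secret"                       # hypothetical

def introspect(access_token):
    r = requests.post(introspection_url,
                      data={"token": access_token,
                            "token_type_hint": "access_token"},
                      auth=(client_id, client_secret), timeout=10)
    r.raise_for_status()
    info = r.json()
    # RFC 7662 only guarantees the 'active' field; 'sub', 'aud', 'scope',
    # 'exp', and so on are optional extras.
    return info if info.get("active") else None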

Under some circumstances, the ID Token and the Access Token may be tied together in that the ID Token contains a claim field that you can use to validate that you have the matching Access Token. Otherwise, if you're purely a Resource Server and someone hands you a theoretically matching ID Token and Access Token, all that you can definitely do is use the Access Token with the Userinfo endpoint and verify that the 'sub' matches. If you have a JWT Access Token or a Token Introspection endpoint, you can get more information and do more checks (and maybe the Userinfo endpoint also gives you an 'aud' claim).

If you're a straightforward Relying Party client, you get both the ID Token and the Access Token at the same time and you're supposed to keep track of them together yourself. If you're acting as a 'resource server' as well and need the additional claims that may not be in the ID Token, you're probably going to use the Access Token to talk to the Userinfo endpoint to get them; this is typically how websites acting as OIDC clients behave.

Because the only OIDC standard way to get additional claims is to obtain an Access Token and use it to access the Userinfo endpoint, I think that many OIDC clients that are acting as both a RP and a RS will always request both an ID Token and an Access Token. Unless you know the Access Token is a JWT, you want both; you'll verify the audience in the ID Token, and then use the Access Token to obtain the additional claims. Programs that are only getting things to pass to another server (for example, a mail client that will send OIDC/OAuth2 authentication to the server) may only get an Access Token, or in some protocols only obtain an ID Token.

(If you don't know all of this and you get a mail client testing program to dump the 'token' it obtains from the OIDC IdP, you can get confused because a JWT format Access Token can look just like an ID Token.)

This means that OIDC doesn't necessarily provide a consumer with a completely self-contained single object that both has all of the information about the person who authenticated and that lets you be sure that this object is intended for you. An ID Token by itself doesn't necessarily contain all of the claims, and while you can use any (opaque) Access Token to obtain a full set of claims, I believe that these claims don't have to include the 'aud' claim (although your OIDC IdP may choose to include it).

This is in a sense okay for OIDC. My understanding is that OIDC is not particularly aimed at the 'bearer token' usage case where the RP and the Resource Server are separate systems; instead, it's aimed at the 'website authenticating you' case where the RP is the party that will directly rely on the OIDC information. In this case the RP has (or can have) both the ID Token and the Access Token and all is fine.

(A lot of my understanding on this is due to the generosity of @Denvercoder9 and others after I was confused about this.)

Sidebar: Authorization flow versus implicit flow in OIDC authentication

In the implicit flow, you send people to the OIDC IdP and the OIDC IdP directly returns the ID Token and Access Token you asked for to your redirect URI, or rather has the person's browser do it. In this flow, the ID Token includes a partial hash of the Access Token and you use this to verify that the two are tied together. You need to do this because you don't actually know what happened in the person's browser to send them to your redirect URI, and it's possible things were corrupted by an attacker.

In the authorization flow, you send people to the OIDC IdP and it redirects them back to you with an 'authorization code'. You then use this code to call the OIDC IdP again at another endpoint, which replies with both the ID Token and the Access Token. Because you got both of these at once during the same HTTP conversation directly with the IdP, you automatically know that they go together. As a result, the ID Token doesn't have to contain any partial hash of the Access Token, although it can.

I think the corollary of this is that if you want to be able to hand the ID Token and the Access Token to a Resource Server and allow it to verify that the two are connected, you want to use the implicit flow, because that definitely means that the ID Token has the partial hash the Resource Server will need.

(There's also a hybrid flow which I'll let people read about in the standard.)
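
For reference, the 'partial hash' is the ID Token's at_hash claim. A minimal sketch of checking it, assuming an RS256-signed ID Token (so the hash is SHA-256):

import base64, hashlib

def expected_at_hash(access_token):
    # Hash the ASCII form of the access token, keep the left half of the
    # digest, and base64url-encode it without padding.
    digest = hashlib.sha256(access_token.encode("ascii")).digest()
    half = digest[:len(digest) // 2]
    return base64.urlsafe_b64encode(half).rstrip(b"=").decode("ascii")

# The check is then: claims["at_hash"] == expected_at_hash(access_token)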

Chrome and the burden of developing a browser

By: cks
25 April 2025 at 02:53

One part of the news of the time interval is that the US courts may require Google to spin off Chrome (cf). Over on the Fediverse, I felt this wasn't a good thing:

I have to reluctantly agree that separating Chrome from Google would probably go very badlyΒΉ. Browsers are very valuable but also very expensive public goods, and our track record of funding and organizing them as such in a way to not wind up captive to something is pretty bad (see: Mozilla, which is at best questionable on this). Google is not ideal but at least Chrome is mostly a sideline, not a main hustle.

ΒΉ <Lauren Weinstein Fediverse post> [...]

One possible reaction to this is that it would be good for everyone if people stopped spending so much money on browsers and so everything involving them slowed down. Unfortunately, I don't think that this would work out the way people want, because popular browsers are costly beasts. To quote what I said on the Fediverse:

I suspect that the cost of simply keeping the lights on in a modern browser is probably on the order of plural millions of dollars a year. This is not implementing new things, this is fixing bugs, keeping up with security issues, monitoring CAs, and keeping the development, CI, testing, and update infrastructure running. This has costs for people, for servers, and for bandwidth.

The reality of the modern Internet is that browsers are load bearing infrastructure; a huge amount of things run through them, including and especially on minority platforms. Among other things, no browser is 'secure' and all of them are constantly under attack. We want browser projects that are used by lots of people to have enough resources (in people, build infrastructure, update servers, and so on) to be able to rapidly push out security updates. All browsers need a security team and any browser with addons (which should be all of them) needs a security team for monitoring and dealing with addons too.

(Browsers are also the people who keep Certificate Authorities honest, and Chrome is very important in this because of how many people use it.)

On the whole, it's a good thing for the web that Chrome is in the hands of an organization that can spend tens of millions of dollars a year on maintaining it without having to directly monetize it in some way. It would be better if we could collectively fund browsers as the public good that they are without having corporations in the way, because Google absolutely corrupts Chrome (also) and Mozilla has stumbled spectacularly (more than once). But we have to deal with the world that we have, not the world that we'd like to have, and in this world no government seems to be interested in seriously funding obvious Internet public goods (not only browsers but also, for example, free TLS Certificate Authorities).

(It's not obvious that a government funded browser would come out better overall, but at least there would be a chance of something different than the narrowing status quo.)

PS: Another reason that spending on browsers might not drop is that Apple (with Safari) and Microsoft (with Edge) are also in the picture. Both of these companies might take the opportunity to slow down, or they might decide that Chrome's potentially weak new position was a good moment to push for greater dominance and maybe lock-in through feature leads.

The many ways of getting access to information ('claims') in OIDC

By: cks
24 April 2025 at 03:41

Any authentication and authorization framework, such as OIDC, needs a way for the identity provider (an 'OIDC OP') to provide information about the person or thing that was just authenticated. In OIDC specifically, what you get are claims that are grouped into scopes. You have to ask for specific scopes, and the IdP may restrict what scopes a particular client has access to. Well, that is not quite the full story, and the full story is complicated (more so than I expected when I started writing this entry).

When you talk to the OIDC identity server (OP) to authenticate, you (the program or website or whatever acting as the client) can get back either or both of an ID Token and an Access Token. I believe that in general your Access Token is an opaque string, although there's a standard for making it a JWT. Your ID Token is ultimately some JSON (okay, it's a JWT) and has certain mandatory claims like 'sub' (the subject) that you don't have to ask for with a scope. It would be nice if all of the claims from all of the scopes that you asked for were automatically included in the ID Token, but the OIDC standard doesn't require this. Apparently many but not all OIDC OPs include all the claims (at least by default); however, our OIDC OP doesn't currently do so, and I believe that Google's OIDC OP also doesn't include some claims.

(Unsurprisingly, I believe that there is a certain amount of OIDC-using software out there that assumes that all OIDC OPs return all claims in the ID Token.)

The standard approved and always available way to obtain the additional claims (which in some cases will be basically all claims) is to present your Access Token (not your ID Token) to the OIDC Userinfo endpoint at your OIDC OP. If your Access Token is (still) valid, what you will get back is either a plain, unsigned JSON listing of those claims (and their values) or perhaps a signed JWT of the same thing (which you can find out from the provider metadata). As far as I can see, you don't necessarily use the ID Token in this additional information flow, although you may want to be cautious and verify that the 'sub' claim is the same in the Userinfo response and the ID Token that is theoretically paired with your Access Token.

(As far as I can tell, the ID Token doesn't include a copy of the Access Token as another pseudo-claim. The two are provided to you at the same time (if you asked the OIDC OP for both), but are independent. The ID Token can't quite be verified offline because you need to get the necessary public key from the OIDC OP to check the signature.)
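
The Userinfo call itself is small. A minimal sketch using the requests library (the endpoint URL is hypothetical; in practice you get it from the provider metadata, and this assumes a plain JSON response rather than a signed JWT one):

import requests

userinfo_url = "https://idp.example.org/userinfo"   # hypothetical

def fetch_claims(access_token):
    r = requests.get(userinfo_url,
                     headers={"Authorization": "Bearer " + access_token},
                     timeout=10)
    r.raise_for_status()
    # A signed (JWT) Userinfo response would need to be verified and
    # decoded instead of simply parsed as JSON.
    return r.json()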

If I'm understanding things correctly (which may not be the case), in an OAuth2 authentication context, such as using OAUTHBEARER with the Dovecot IMAP server, I believe your local program will send the Access Token to the remote end and not do much with the ID Token, if it even requested one. The remote end then uses the Access Token with a pre-configured Userinfo endpoint to get a bunch of claims, and incidentally to validate that the Access Token is still good. In other protocols, such as the current version of OpenPubkey, your local program sends the ID Token (perhaps wrapped up) and so needs it to already include the full claims, and can't use the Userinfo approach. If what you have is a website that is both receiving the OIDC stuff and processing it, I believe that the website will normally ask for both the ID Token and the Access Token and then augment the ID Token information with additional claims from the Userinfo response (this is what the Apache OIDC module does, as far as I can see).

An OIDC OP may optionally allow clients to specifically request that certain claims be included in the ID Token that they get, through the "claims" request parameter on the initial request. One potential complication here is that you have to ask for specific claims, not simply 'all claims in this scope'; it's up to you to know what potentially non-standard claims you should ask for (and I believe that the claims you get have to be covered by the scopes you asked for and that the OIDC OP allows you to get). I don't know how widely implemented this is, but our OIDC OP supports it.

(An OIDC OP can list all of its available claims in its metadata, but doesn't have to. I believe that most OPs will list their scopes, although technically this is just 'recommended'.)
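
As a sketch of what using the "claims" request parameter looks like when building the authorization request (the claim names and URLs here are examples only, not ones your OP necessarily supports, and a real request also needs things like 'state' and 'nonce'):

import json, urllib.parse

claims_req = {
    # Ask for these specific claims to be put directly in the ID Token.
    "id_token": {"email": {"essential": True}, "groups": None},
}
params = {
    "response_type": "code",
    "client_id": "my-client-id",                       # hypothetical
    "redirect_uri": "https://app.example.org/callback",
    "scope": "openid email",
    "claims": json.dumps(claims_req),
}
auth_url = ("https://idp.example.org/authorize?" +
            urllib.parse.urlencode(params))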

If you really want a self-contained signed object that has all of the information, I think you have to hope for an OIDC OP that either puts all claims in the ID Token by default or lets you ask for all of the claims you care about to be added for your request. Even if an OIDC OP gives you a signed userinfo response, it may not include all of the ID Token information and it might not be possible to validate various things later. You can always validate an Access Token by making a Userinfo request with it, but I don't know if there's any way to validate an ID Token.

We've chosen to 'modernize' all of our ZFS filesystems

By: cks
23 April 2025 at 03:17

We are almost all of the way to the end of a multi-month process of upgrading our ZFS fileservers from Ubuntu 22.04 to 24.04 by also moving to more recent hardware. This involved migrating all of our pools and filesystems, amounting to terabytes of data. Our traditional way of doing this sort of migration (which we used, for example, when going from our OmniOS fileservers to our Linux fileservers) was the good old reliable 'zfs send | zfs receive' approach of sending snapshots over. This sort of migration is fast, reliable, and straightforward. However, it has one drawback, which is that it preserves all of the old filesystem's history, including things like the possibility of panics and possibly other things.

We've been running ZFS for long enough that we had some ZFS filesystems that were still at ZFS filesystem version 4. In late 2023, we upgraded them all to ZFS filesystem version 5, and after that we got some infrequent kernel panics. We could never reproduce the kernel panics and they were very infrequent, but 'infrequent' is not the same as 'never' (the previous state of affairs), and it seemed likely that they were in some way related to upgrading our filesystem versions, which in turn was related to us having some number of very old filesystems. So in this migration, we deliberately decided to 'migrate' filesystems the hard way. Which is to say, rather than migrating the filesystems, we migrated the data with user level tools, moving it into pools and filesystems that were created from scratch on our new Ubuntu 24.04 fileservers (which led us to discover that default property values sometimes change in ways that we care about).

(The filesystems reused the same names as their old versions, because that keeps things easier for our people and for us.)

It's possible that this user level rewriting of all data has wound up laying things out in a better way (although all of this is on SSDs), and it's certainly ensured that everything has modern metadata associated with it and so on. The 'fragmentation' value of the new pools on the new fileservers is certainly rather lower than the value for most old pools, although what that means is a bit complicated.

There's a bit of me that misses the deep history of our old filesystems, some of which dated back to our first generation Solaris ZFS fileservers. However, on the whole I'm happy that we're now using filesystems that don't have ancient historical relics and peculiarities that may not be well supported by OpenZFS's code any more (and which were only likely to get less tested and more obscure over time).

(Our pools were all (re)created from scratch as part of our migration from OmniOS to Linux, and anyway would have been remade from scratch again in this migration even if we moved the filesystems with 'zfs send'.)

My Cinnamon desktop customizations (as of 2025)

By: cks
22 April 2025 at 03:15

A long time ago I wrote up some basic customizations of Cinnamon, shortly after I started using Cinnamon (also) on my laptop of the time. Since then, the laptop got replaced with another one and various things changed in both the land of Cinnamon and my customizations (eg, also). Today I feel like writing down a general outline of my current customizations, which fall into a number of areas from the modest but visible to the large but invisible.

The large but invisible category is that just like on my main fvwm-based desktop environment, I use xcape (plus a custom Cinnamon key binding for a weird key combination) to invoke my custom dmenu setup (1, 2) when I tap the CapsLock key. I have dmenu set to come up horizontally on the top of the display, which Cinnamon conveniently leaves alone in the default setup (it has its bar at the bottom). And of course I make CapsLock into an additional Control key when held.

(On the laptop I'm using a very old method of doing this. On more modern Cinnamon setups in virtual machines, I do this with Settings β†’ Keyboard β†’ Layout β†’ Options, and then in the CapsLock section set CapsLock to be an additional Ctrl key.)

To start xcape up and do some other things, like load X resources, I have a personal entry in Settings β†’ Startup Applications that runs a script in my ~/bin/X11. I could probably do this in a more modern way with an assortment of .desktop files in ~/.config/autostart (which is where my 'Startup Applications' setting actually wind up) that run each thing individually or perhaps some systemd user units. But the current approach works and is easy to modify if I want to add or remove things (I can just edit the script).
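
For reference, a minimal autostart .desktop entry for this sort of thing might look like the following (the script path here is hypothetical):

[Desktop Entry]
Type=Application
Name=X session setup
Exec=/u/cks/bin/X11/session-setup
X-GNOME-Autostart-enabled=true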

I have a number of Cinnamon 'applets' installed on my laptop and my other Cinnamon VM setups. The ones I have everywhere are Spices Update and Shutdown Applet, the latter because if I tell the (virtual) machine to log me off, shut down, or restart, I generally don't want to be nagged about it. On my laptop I also have CPU Frequency Applet (set to only display a summary) and CPU Temperature Indicator, for no compelling reason. In all environments I also pin launchers for Firefox and (Gnome) Terminal to the Cinnamon bottom bar, because I start both of them often enough. I position the Shutdown Applet on the left side, next to the launchers, because I think of it as a peculiar 'launcher' instead of an applet (on the right).

(The default Cinnamon keybindings also start a terminal with Ctrl + Alt + T, which you can still find through the same process from several years ago provided that you don't cleverly put something in .local/share/glib-2.0/schemas and then run 'glib-compile-schemas .' in that directory. If I was a smarter bear, I'd understand what I should have done when I was experimenting with something.)

On my virtual machines with Cinnamon, I don't bother with the whole xcape and dmenu framework, but I do set up the applets and the launchers and fix CapsLock.

(This entry was sort of inspired by someone I know who just became a Linux desktop user (after being a long time terminal user).)

Sidebar: My Cinnamon 'window manager' custom keybindings

I have these (on my laptop) and perpetually forget about them, so I'm going to write them down now so perhaps that will change.

move-to-corner-ne=['<Alt><Super>Right']
move-to-corner-nw=['<Alt><Super>Left']
move-to-corner-se=['<Primary><Alt><Super>Right']
move-to-corner-sw=['<Primary><Alt><Super>Left']
move-to-side-e=['<Shift><Alt><Super>Right']
move-to-side-n=['<Shift><Alt><Super>Up']
move-to-side-s=['<Shift><Alt><Super>Down']
move-to-side-w=['<Shift><Alt><Super>Left']

I have some other keybindings on the laptop but they're even less important, especially once I added dmenu.

I feel that DANE is not a good use of DNS

By: cks
21 April 2025 at 03:10

DANE is commonly cited as a "wouldn't it be nice" alternative to the current web TLS ('PKI') system. It's my view that DANE is an example of why global DNS isn't a database and shouldn't be used as one. The usual way to describe DANE is that 'it lets you publish your TLS certificates in DNS'. This is not actually what it does, because DNS does not 'publish' anything in the sense of a database or a global directory. DANE lets some unknown set of entities advertise some unknown set of TLS certificates for your site to an unknown set of people. Or at least you don't know the scope of the entities, the TLS certificates, and the people, apart from you, your TLS certificate, and the people who (maybe) come directly to you without being intercepted.

(This is in a theoretical world where DNSSEC is widely deployed and reaches all the way to programs that are doing DNS resolution. That is not this world, where DNSSEC has failed.)
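
For concreteness, the DNS side of DANE is a TLSA record, which looks something like this (the hash value here is made up):

; usage 3 = DANE-EE, selector 1 = SubjectPublicKeyInfo, matching type 1 = SHA-256
_443._tcp.www.example.org. IN TLSA 3 1 1 0123456789abcdef...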

DNS specifically allows servers (run by people) to make up answers to things they get asked. Obviously this would be bad when the answers are about your TLS certificates, so DANE and other things like it try to paper over the problem by adding a cascading hierarchy of signing. The problem is that this doesn't eliminate the issue, it merely narrows who can insert themselves into the chain of trust, from 'the entire world' to 'anyone already in the DNSSEC path or who can break into it', including the TLD operator for your domain's TLD.

There are a relatively small number of Certificate Authorities in the world and even large ones have had problems, never mind the one that got completely compromised. Our most effective tool against TLS mis-issuance is exactly a replicated, distributed global record of issued certificates. DNS and DANE bypass this, unless you require all DANE-obtained TLS certificates to be in Certificate Transparency logs just like CA-issued TLS certificates (and even then, Certificate Transparency is an after the fact thing; the damage has probably been done once you detect it).

In addition, there's no obvious way to revoke or limit DNSSEC the way there is for a mis-behaving Certificate Authority. If a TLD had its DNSSEC environment completely compromised, does anyone think it would be removed from global DNS, the way DigiNotar was removed from global trust? That's not very likely; the damage would be too severe for most TLDs. One of the reasons that Certificate Authorities can be distrusted is that what they do is limited and you can replace one with another. This isn't true for DNS and TLDs.

DNS is an extremely bad fit for a system where you absolutely want everyone to see the same 'answer' and to have high assurance that you know what that answer is (and that you are the only person who can put it there). It's especially bad if you want to globally limit who is trusted and allow that trust to be removed or limited without severe damage. In general, if security would be significantly compromised should people receive a different answer than the one you set up, DNS is not what you want to use.

(I know, this is how DNS and email mostly work today, but that is historical evolution and backward compatibility. We would not design email to work like that if we were doing it from scratch today.)

(This entry was sparked by ghop's comment mentioning DANE on my first entry.)

Tailscale's surprising interaction of DNS settings and 'exit nodes'

By: cks
20 April 2025 at 02:40

Tailscale is a well regarded commercial mesh networking system, based on WireGuard, that can be pressed into service as a VPN as well. As part of its general features, it allows you to set up various sorts of DNS settings for your tailnet (your own particular Tailscale mesh network), including both DNS servers for specific (sub)domains (eg an 'internal.example.org') and all DNS as a whole. As part of optionally being VPN-like, Tailscale also lets you set up exit nodes, which let you route all traffic for the Internet out the exit node (if you want to route just some subnets to somewhere, that's a subnet router, a different thing). If you're a normal person, especially if you're a system administrator, you probably have a guess as to how these two features interact. Unfortunately, you may well be wrong.

As of today, if you use a Tailscale exit node, all of your DNS traffic is routed to the exit node regardless of Tailscale DNS settings. This applies to both DNS servers for specific subdomains and to any global DNS servers you've set for your tailnet (due to, for example, 'split horizon' DNS). Currently this is documented only in one little sentence in small type in the "Use Tailscale DNS settings" portion of the client preferences documentation.

In many Tailscale environments, all this does is make your DNS queries take an extra hop (from you to the exit node and then to the configured DNS servers). Your Tailscale exit nodes are part of your tailnet, so in ordinary configurations they will have your Tailscale DNS settings and be able to query your configured DNS servers (and they will probably get the same answers, although this isn't certain). However, if one of your exit nodes isn't set up this way, potential pain and suffering is ahead of you. Your tailnet nodes that are using this exit node will get wildly different DNS answers than you expect, potentially not resolving internal domains and maybe getting different answers than you'd expect (if you have split horizon DNS).

One reason that you might set an exit node machine to not use your Tailscale DNS settings (or subnet routes) is that you're only using it as an exit node, not as a regular participant in your tailnet. Your exit node machine might be placed on a completely different network (and in a completely different trust environment) than the rest of your tailnet, and you might have walled off its (less-trusted) traffic from the rest of your network. If the only thing the machine is supposed to be is an Internet gateway, there's no reason to have it use internal DNS settings, and it might not normally be able to reach your internal DNS servers (or the rest of your internal servers).
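
As a sketch of how this sort of split can come about with the Tailscale command line (the exit node IP here is hypothetical):

# On the exit node itself: advertise it as an exit node but don't have
# it accept the tailnet's DNS settings.
tailscale up --advertise-exit-node --accept-dns=false

# On a client machine, routing everything through that exit node:
tailscale up --exit-node=100.64.0.10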

In my view, a consequence of this is that it's probably best to have any internal DNS servers directly on your tailnet, with their tailnet IP addresses. This makes them as reachable as possible to your nodes, independent of things like subnet routes.

PS: Routing general DNS queries through a tailnet exit node makes sense in this era of geographical DNS results, where you may get different answers depending on where in the world you are and you'd like these to match up with where your exit node is.

(I'm writing this entry because this issue was quite mysterious to us when we ran into it while testing Tailscale and I couldn't find much about it in online searches.)

The clever tricks of OpenPubkey and OPKSSH

By: cks
19 April 2025 at 02:24

OPKSSH (also) is a clever way of using OpenID Connect (OIDC) to authenticate your OpenSSH sessions (it's not the only way to do this). How it works is sufficiently ingenious and clever that I want to write it up, especially as one underlying part uses a general trick.

OPKSSH itself is built on top of OpenPubkey, which is a trick to associate your keypair with an OIDC token. When you perform OIDC authentication, what you get back (at an abstract level) is a signed set of 'claims' and, crucially, a nonce. The nonce is supplied by the client that initiated the OIDC authentication so that it can know that the ID token it eventually gets back actually comes from this authentication session and wasn't obtained through some other one. The client initiating OIDC authentication doesn't get to ask the OIDC identity provider (OP) to include other fields.

What OpenPubkey does is turn the nonce into a signature for a combination of your public key and a second nonce of its own, by cryptographically hashing these together through a defined process. Because the OIDC IdP is signing a set of claims that include the calculated nonce, it is effectively signing a signature of your public key. If you give people the signed OIDC ID token, your public key, and your second nonce, they can verify this (and you can augment the ID Token you get back to get a PK Token that embeds this additional information).

(As I understand it, calculating the OIDC ID Token nonce this way is safe because it still includes a random value (the inner nonce) and due to the cryptographic hashing, the entire calculated nonce is still effectively a non-repeating random value.)
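
As a sketch of the general trick in Python (this shows the shape of the idea only; OpenPubkey's actual encoding and field names are different):

import base64, hashlib, json, secrets

def commitment_nonce(public_key_jwk):
    # Hash your public key together with a random value of your own and
    # use the result as the OIDC nonce.
    inner_nonce = secrets.token_urlsafe(32)
    payload = json.dumps({"upk": public_key_jwk, "rz": inner_nonce},
                         sort_keys=True).encode("ascii")
    digest = hashlib.sha256(payload).digest()
    nonce = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # Later you reveal public_key_jwk and inner_nonce so that others can
    # recompute the nonce and check it against the signed ID Token.
    return nonce, inner_nonce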

To smuggle this PK Token to the OpenSSH server, OPKSSH embeds it as an additional option field in an OpenSSH certificate (called 'openpubkey-pkt'). The certificate itself is for your generated PK Token private key and is (self) signed with it, but this is all perfectly fine with OpenSSH; SSH clients will send the certificate off to the server as a candidate authentication key and the server will read it in. Normally the server would reject it since it's signed by an unknown SSH certificate authority, but OPKSSH uses a clever trick with OpenSSH's AuthorizedKeysCommand server option to get its hands on the full certificate, which lets it extract the PK Token, verify everything, and tell the SSH server daemon that your public key is the underlying OpenPubkey key (which you have the private key for in your SSH client).

Smuggling information through OpenSSH certificates and then processing them with AuthorizedKeysCommand is a clever trick, but it's specific to OpenSSH. Turning a nonce into a signature is a general trick that was eye-opening to me, especially because you can probably do it repeatedly.

The appeal of serving your web pages with a single process

By: cks
18 April 2025 at 02:58

As I slowly work on updating the software behind this blog to deal with the unfortunate realities of the modern web (also), I've found myself thinking (more than once) how much simpler my life would be if I was serving everything through a single process, instead of my eccentric, more or less stateless CGI-based approach. The simple great thing about doing everything through a single process (with threads, goroutines, or whatever inside it for concurrency) is that you have all the shared state you could ever want, and that shared state makes it so easy to do so many things.

Do you have people hitting one URL too often from a single IP address? That's easy to detect, track, and return HTTP 429 responses for until they cool down. Do you have an IP making too many requests across your entire site? You can track that sort of volume information. There's all sorts of potential bad stuff that it's at least easier to detect when you have easy shared global state. And the other side of this is that it's also relatively easy to add simple brute force caching in a single process with global state.

(Of course you have some practical concerns about memory and CPU usage, depending on how much stuff you're keeping track of and for how long.)
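
As an illustration of how little code simple in-process tracking takes, here's a sketch in Python (the limits are made up, and a real version needs to eventually discard idle IPs):

import time
from collections import defaultdict, deque

WINDOW = 60.0     # seconds
MAX_HITS = 30     # allowed requests per IP per window

hits = defaultdict(deque)   # shared in-process state: IP -> request times

def too_many_requests(ip):
    now = time.time()
    q = hits[ip]
    q.append(now)
    while q and q[0] < now - WINDOW:
        q.popleft()
    return len(q) > MAX_HITS

# In a request handler: if too_many_requests(client_ip) is true, return
# a HTTP 429 response (ideally with a Retry-After header).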

You can do a certain amount of this detection with a separate 'database' process of some sort (or a database file, like sqlite), and there's various specialized software that will let you keep this sort of data in memory (instead of on disk) and interact with it easily. But this is an extra layer or two of overhead over simply updating things in your own process, especially if you have to set up things like a database schema for what you're tracking or caching.

(It's my view that ease of implementation is especially useful when you're not sure what sort of anti-abuse measures are going to be useful. The easier it is to implement something and at least get logs of what and how much it would have done, the more you're going to try and the more likely you are to hit on something that works for you.)

Unfortunately it seems like we're only going to need more of this kind of thing in our immediate future. I don't expect the level of crawling and abuse to go down any time soon; if anything, I expect it to keep going up, especially as more and more websites move behind effective but heavyweight precautions and the crawlers turn more of their attention to the rest of us.

Looking at what NFSv4 clients have locked on a Linux NFS(v4) server

By: cks
17 April 2025 at 03:10

A while ago I wrote an entry about (not) finding which NFSv4 client owns a lock on a Linux NFS(v4) server, where the best I could do was pick awkwardly through the raw NFS v4 client information in /proc/fs/nfsd/clients. Recently I discovered an alternative to doing this by hand, which is the nfsdclnts program, and as a result of digging into it and what I was seeing when I tried it out, I now believe I have a better understanding of the entire situation (which was previously somewhat confusing).

The basic thing that nfsdclnts will do is list 'locks' and some information about them with 'nfsdclnts -t lock', in addition to listing other state information such as 'open', for open files, and 'deleg', for NFS v4 delegations. The information it lists is somewhat limited, for example it will list the inode number but not the filesystem, but on the good side nfsdclnts is a Python program so you can easily modify it to report any extra information that exists in the clients/#/states files. However, this information about locks is not complete, because of how file level locks appear to normally manifest in NFS v4 client state.

(The information in the states files is limited, although it contains somewhat more than nfsdclnts shows.)
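
If you want to poke through the states files directly, here's a rough sketch of doing so in Python; it just looks for the 'type:' strings discussed below, needs to run as root on the NFS server, and leaves identifying the client (from each directory's 'info' file) to you:

#!/usr/bin/python3
import glob, os

for sfile in glob.glob("/proc/fs/nfsd/clients/*/states"):
    cdir = os.path.dirname(sfile)
    with open(sfile) as f:
        for line in f:
            if "type: lock" in line or "type: deleg" in line:
                print(cdir, line.strip())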

Here is how I understand NFS v4 locking and states. To start with, NFS v4 has a feature called delegations where the NFS v4 server can hand a lot of authority over a file to a NFS v4 client. When a NFS v4 client accesses a file, the NFS v4 server likes to give it a delegation if this is possible; it normally will be if no one else has the file open or active. Once a NFS v4 client holds a delegation, it can lock the file without involving the NFS v4 server. At this point, the client's 'states' file will report an opaque 'type: deleg' entry for the file (and this entry may or may not have a filename or instead be what nfsdclnts will report as 'disconnected dentry').

While a NFS v4 client has the file delegated, if any other NFS v4 client does anything with the file, including simply opening it, the NFS v4 server will recall the delegation from the original client. As a result, the original client now has to tell the NFS v4 server that it has the file locked. At this point a 'type: lock' entry for the file appears in the first NFS v4 client's states file. If the first NFS v4 client releases its lock while the second NFS v4 client is trying to acquire it, the second NFS v4 client will not have a delegation for the file, so its lock will show up as an explicit 'type: lock' entry in its states file.

An additional wrinkle is that a NFS v4 client holding a delegation doesn't immediately release it once all processes have released their locks, closed the file, and so on. Instead the delegation may linger on for some time. If another NFS v4 client opens the file during this time, the first client will lose the delegation but the second NFS v4 client may not get a delegation from the NFS v4 server, so its lock will be visible as a 'type: lock' states file entry.

A third wrinkle is that multiple clients may hold read-only delegations for a file and have fcntl() read locks on it at once, with each of them having a 'type: deleg, access: r' entry for it in their states files. These will only become visible 'type: lock' states entries if the clients have to release their delegations.

So putting this all together:

  • If there is a 'type: lock' entry for the file in any states file (or it's listed in 'nfsdclnts -t lock'), the file is definitely locked by whoever has that entry.

  • If there are no 'type: deleg' or 'type: lock' entries for the file, it's definitely not locked; you can also see this by whether nfsdclnts lists it as having delegations or locks.

  • If there are 'type: deleg' entries for the file, it may or may not be locked by the NFS v4 client (or clients) with the delegation. If the delegation is an 'access: w' delegation, you can see if someone actually has the file locked by accessing the file on another NFS v4 client, which will force the NFS v4 server to recall the delegation and expose the lock if there is one.

If the delegation is 'access: r' and might have multiple read-only locks, you can't force the NFS v4 server to recall the delegation by merely opening the file read-only (for example with 'cat file' or 'less file'). Instead the server will only recall the delegation if you open the file read-write. A convenient way to do this is probably to use 'flock -x <file> -c /bin/true', although this does require you to have more permissions for the file than simply the ability to read it.

Sidebar: Disabling NFS v4 delegations on the server

Based on trawling various places, I believe this is done by writing a '0' to /proc/sys/fs/leases-enabled (or the equivalent 'fs.leases-enabled' sysctl) and then apparently restarting your NFS v4 server processes. This will disable all user level uses of fcntl()'s F_SETLEASE and F_GETLEASE as an additional effect, and I don't know if this will affect any important programs running on the NFS server itself. Based on a study of the kernel source code, I believe that you don't need to restart your NFS v4 server processes if it's sufficient for the NFS server to stop handing out new delegations but current delegations can stay until they're dropped.

(There have apparently been some NFS v4 server and client issues with delegations, cf, along with other NFS v4 issues. However, I don't know if the cure winds up being worse than the disease here, or if there's another way to deal with these stateid problems.)

The DNS system isn't a database and shouldn't be used as one

By: cks
16 April 2025 at 02:53

Over on the Fediverse, I said something:

Thesis: DNS is not meaningfully a database, because it's explicitly designed and used today so that it gives different answers to different people. Is it implemented with databases? Sure. But treating it as a database is a mistake. It's a query oracle, and as a query oracle it's not trustworthy in the way that you would normally trust a database to be, for example, consistent between different people querying it.

It would be nice if we had a global, distributed, relatively easily queryable, consistent database system. It would make a lot of things pretty nice, especially if we could wrap some cryptography around it to make sure we were getting honest answers. However, the general DNS system is not such a database and can't be used as one, and as a result should not be pressed into service as one in protocols.

DNS is designed from the ground up to lie to you in unpredictable ways, and parts of the DNS system lie to you every day. We call these lies things like 'outdated cached data' or 'geolocation based DNS' (or 'split horizon DNS'), but they're lies, or at least inconsistent alternate versions of some truth. The same fundamental properties that allow these inconsistent alternate versions also allow for more deliberate and specific lies, and they also mean that no one can know with assurance what version of DNS anyone else is seeing.

(People who want to reduce the chance for active lies as much as possible must do a variety of relatively extreme things, like query DNS from multiple vantage points around the Internet and perhaps through multiple third party DNS servers. No, checking DNSSEC isn't enough, even when it's present (also), because that just changes who can be lying to you.)

Anything that uses the global DNS system should be designed to expect outdated, inconsistent, and varying answers to the questions it asks (and sometimes incorrect answers, for various reasons). Sometimes those answers will be lies (including the lie of 'that name doesn't exist'). If your design can't deal with all of this, you shouldn't be using DNS.

ZFS's delayed compression of written data (when compression is enabled)

By: cks
15 April 2025 at 02:23

In a comment on my entry about how Unix files have at least two sizes, Leah Neukirchen said that 'ZFS compresses asynchronously' and noted that this could cause the reported block size of a just-written file to change over time. This way of describing ZFS's behavior made me twitch and it took me a bit of thinking to realize why. What ZFS does is delayed compression (which is asynchronous with your user level write() calls), but not true 'asynchronous compression' that happens later at an unpredictable time.

Like basically all filesystems, ZFS doesn't immediately start writing data to disk when you do a write() system call. Instead it buffers this data in memory for a while and only writes it later. As part of this, ZFS doesn't immediately decide where on disk the data will be written (this is often called 'delayed allocation' and is common in many filesystems) and otherwise prepare it to be written out. As part of this delayed allocation and preparation, ZFS doesn't immediately compress your written data, and as a result ZFS doesn't know how many disk blocks your data will take up. Instead your data is only compressed and has disk blocks allocated for it as part of ZFS's pipeline of actually performing IO, when the data is flushed to disk, and only then is its physical block size known.

However, once written to disk, the data's compression or lack of it is never changed (nor is anything else about it; ZFS never modifies data once it's written). For example, data isn't initially written in uncompressed form and then asynchronously compressed later. Nor is there anything that goes around asynchronously compressing or decompressing data if you turn on or off compression on a ZFS filesystem (or change the compression algorithm). This periodically irks people who wish they could turn compression on on an existing filesystem, or change the compression algorithm, and have this take effect 'in place' to shrink the amount of space the filesystem is using.

Delaying compressing data until you're writing it out is a sensible decision for a variety of reasons. One of them is that ZFS compresses your data in potentially large chunks, and you may not write() all of that chunk at once. If you wrote half a chunk now and then half a chunk later before it got flushed to disk, it would be a waste of effort to compress your half a chunk now and then throw that work away when you compressed the whole chunk.

(I also suspect that it was simpler to add compression to ZFS as part of its IO pipeline than to do it separately. ZFS already had a multi-stage IO pipeline, so adding compression and decompression as another step was probably relatively straightforward.)

Unix files have (at least) two sizes

By: cks
14 April 2025 at 03:02

I'll start by presenting things in illustrated form:

; ls -l testfile
-rw-r--r-- 1 cks 262144 Apr 13 22:03 testfile
; ls -s testfile
1 testfile
; ls -slh testfile
512 -rw-r--r-- 1 cks 256K Apr 13 22:03 testfile

The two well known sizes that Unix files have are the logical 'size' in bytes and what stat.h describes as "the number of blocks allocated for this object", often converted to some number of bytes (as ls is doing here in the last command). A file's size in bytes is roughly speaking the last file offset that has been written to in the file, and not all of the bytes covered by it may have actually been written; when this is the case, the result is a sparse file. Sparse files are the traditional cause of a mismatch between the byte size and the number of blocks a file uses. However, that is not what is happening here.

This file is on a ZFS filesystem with ZFS's compression turned on, and it was created with 'dd if=/dev/zero of=testfile bs=1k count=256'. In ZFS, zeroes compress extremely well, and so ZFS has written basically no physical data blocks and faithfully reported that (minimal) number in the stat() st_blocks field. However, at the POSIX level we have indeed written data to all 256 KBytes of the file; it's not a sparse file. This is an extreme example of filesystem compression, and there are plenty of lesser ones.

This leaves us with a third size, which is the number of logical blocks for this file. When a filesystem is doing data compression, this number will be different from the number of physical blocks used. As far as I can tell, the POSIX stat.h description doesn't specify which one you have to report for st_blocks. As we can see, ZFS opts to report the physical block size of the file, which is probably the more useful number for the purposes of things like 'du'. However, it does leave us with no way of finding out the logical block size, which we may care about for various reasons (for example, if our backup system can skip unwritten sparse blocks but always writes out uncompressed blocks).
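
If you want to see both sizes programmatically, a minimal sketch in Python:

import os

st = os.stat("testfile")
print("byte size:", st.st_size)
# On Linux, st_blocks is counted in 512-byte units regardless of the
# filesystem's actual block size.
print("allocated bytes:", st.st_blocks * 512)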

This also implies that a non-sparse file can change its st_blocks number if you move it from one filesystem to another. One filesystem might have compression on and the other one have it off, or they might have different compression algorithms that give different results. In some cases this will cause the file's space usage to expand so that it doesn't actually fit into the new filesystem (or for a tree of files to expand their space usage).

(I don't know if there are any Unix filesystems that report the logical block size in st_blocks and only report the physical block size through a private filesystem API, if they report it at all.)

Mandatory short duration TLS certificates are probably coming soon

By: cks
13 April 2025 at 02:56

The news of the time interval is that the maximum validity period for TLS certificates will be lowered to 47 days by March 2029, unless the CA/Browser Forum changes its mind (or is forced to) before then. The details are discussed in SC-081. In skimming the mailing list thread on the votes, a number of organizations that voted to abstain seem unenthused (and uncertain that it can actually be implemented), so this may not come to pass, especially on the timeline proposed here.

If and when this comes to pass, I feel confident that this will end manual certificate renewals at places that are still doing them. With that, it will effectively end Certificate Authorities that don't have an API that you can automatically get certificates through (not necessarily a free or public API). I'm not sure what it's going to do to the Certificate Authority business models for commercial CAs, but I also don't think the browsers care about that issue and the browsers are driving.

This will certainly cause pain. I know of places around the university that are still manually handling one-year TLS certificates; those places will have to change over the course of a few years. This pain will arrive well before 2029; based on the proposed changes, starting March 15, 2027, the maximum certificate validity period will be 100 days, which is short enough to be decidedly annoying. Even a 200 day validity period (starting March 15 2026) will be somewhat painful to do by hand.

I expect one consequence to be that some number of (internal) devices stop having valid TLS certificates, because they can only have certificates loaded into them manually and no one is going to do that every 40-odd or even every 90-odd days. You might manually get and load a valid TLS certificate every year; you certainly won't do it every three months (well, almost no one will).

I hope that this will encourage the creation and growth of more alternatives to Let's Encrypt, even if not all of them are free, since more and more CAs will be pushed to have an API and one obvious API to adopt is ACME.

(I can also imagine ways to charge for an ACME based API, even with standard ACME clients. One obvious way would be to only accept ACME requests for domains that the CA had some sort of site license with. You'd establish the site license through out of band means, not ACME.)

How I install personal versions of programs (on Unix)

By: cks
12 April 2025 at 03:03

These days, Unixes are quite generous in what they make available through their packaging systems, so you can often get everything you want through packages that someone else worries about building, updating, managing, and so on. However, not everything is available that way; sometimes I want something that isn't packaged, and sometimes (especially on 'long term support' distributions) I want something that's more recent than the system provides (for example, Ubuntu 22.04 only has Emacs 27.1). Over time, I've evolved my own approach for managing my personal versions of such things, which is somewhat derived from the traditional approach for multi-architecture Unixes here.

The starting point is that I have a ~/lib/<architecture> directory tree. When I build something personally, I tell it that its install prefix is a per-program directory within this tree, for example, '/u/cks/lib/<arch>/emacs-30.1'. These days I only have one active architecture inside ~/lib, but old habits die hard, and someday we may start using ARM machines or FreeBSD. If I install a new version of the program, it goes in a different (versioned) subdirectory, so I have 'emacs-29.4' and 'emacs-30.1' directory trees.

I also have both a general ~/bin directory, for general scripts and other architecture independent things, and a ~/bin/bin.<arch> subdirectory, for architecture dependent things. When I install a program into ~/lib/<arch>/<whatever> and want to use it, I will make either a symbolic link or a cover script in ~/bin/bin.<arch> for it, such as '~/bin/bin.<arch>/emacs'. This symbolic link or cover script always points to what I want to use as the current version of the program, and I update it when I want to switch.

(If I'm building and installing something from the latest development tree, I'll often call the subdirectory something like 'fvwm3-git' and then rename it to have multiple versions around. This is not as good as real versioned subdirectories, but I tend to do this for things that I won't ever run two versions of at the same time; at most I'll switch back and forth.)

Some things I use, such as pipx, normally install programs (or symbolic links to them) into places like ~/.local/bin or ~/.cargo/bin. Because it's not worth fighting city hall on this one, I pretty much let them do so, but I don't add either directory to my $PATH. If I want to use a specific tool that they install and manage, I put in a symbolic link or a cover script in my ~/bin/bin.<arch>. The one exception to this is Go, where I do have ~/go/bin in my $PATH because I use enough Go based programs that it's the path of least resistance.

This setup isn't perfect, because right now I don't have a good general approach for things that depend on the Ubuntu version (where an Emacs 30.1 built on 22.04 doesn't run on 24.04). If I ran into this a lot I'd probably make an additional ~/bin/bin.<something> directory for the Ubuntu version and then put version specific things there. And in general, Go and Cargo are not ready for my home directory to be shared between different binary architectures. For Go, I would probably wind up setting $GOPATH to something like ~/lib/<arch>/go. Cargo has a similar system for deciding where it puts stuff but I haven't looked into it in detail.

(From a quick skim of 'cargo help install' and my ~/.cargo, I suspect that I'd point $CARGO_INSTALL_ROOT into my ~/lib/<arch> but leave $CARGO_HOME unset, so that various bits of Cargo's own data remain shared between architectures.)

(This elaborates a bit on a Fediverse conversation.)

PS: In theory I have a system for keeping track of the command lines used to build things (also, which I'd forgotten when I wrote the more recent entry on this system). In practice I've fallen out of the habit of using it when I build things for my ~/lib, although I should probably get back into it. For GNU Emacs, I put the ./configure command line into a file in ~/lib/<arch>, since I expected to build enough versions of Emacs over time.

One way to set up local programs in a multi-architecture Unix environment

By: cks
11 April 2025 at 02:29

Back in the old days, it used to be reasonably routine to have 'multi-architecture' Unix environments with shared files (where here architecture was a combination of the processor architecture and the Unix variant). The multi-architecture days have faded out, and with them fading, so has information about how people made this work with things like local binaries.

In the modern era of large local disks and build farms, the default approach is probably to simply build complete copies of '/local' for each architecture type and then distribute the result around somehow. In the old days people were a lot more interested in reducing disk space by sharing common elements and then doing things like NFS-mounting your entire '/local', which made life more tricky. There likely were many solutions to this, but the one I learned at the university as a young sprout worked like the following.

The canonical paths everyone used and had in their $PATH were things like /local/bin, /local/lib, /local/man, and /local/share. However, you didn't (NFS) mount /local; instead, you NFS mounted /local/mnt (which was sort of an arbitrary name, as we'll see). In /local/mnt there were 'share' and 'man' directories, and also a per-architecture directory for every architecture you supported, with names like 'solaris-sparc' or 'solaris-x86'. These per-architecture directories contained 'bin', 'lib', 'sbin', and so on subdirectories.

(These directories contained all of the locally installed programs, all jumbled together, which did have certain drawbacks that became more and more apparent as you added more programs.)

Each machine had a /local directory on its root filesystem that contained /local/mnt, symlinks from /local/share and /local/man to 'mnt/share' and 'mnt/man', and then symlinks for the rest of the directories that went to 'mnt/<arch>/bin' (or sbin or lib). Then everyone mounted /local/mnt on, well, /local/mnt. Since /local and its contents were local to the machine, you could have different symlinks on each machine that used the appropriate architecture (and you could even have built them on boot if you really wanted to, although in practice they were created when the machine was installed).
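
As a sketch, /local on a hypothetical 'solaris-sparc' machine wound up looking something like this:

/local/mnt           (NFS mounted from the fileserver)
/local/share  -> mnt/share
/local/man    -> mnt/man
/local/bin    -> mnt/solaris-sparc/bin
/local/sbin   -> mnt/solaris-sparc/sbin
/local/lib    -> mnt/solaris-sparc/lib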

When you built software for this environment, you told it that its prefix was /local, and let it install itself (on a suitable build server) using /local/bin, /local/lib, /local/share and so on as the canonical paths. You had to build (and install) software repeatedly, once for each architecture, and it was on the software (and you) to make sure that /local/share/<whatever> was in fact the same from architecture to architecture. System administrators used to get grumpy when people accidentally put architecture dependent things in their 'share' areas, but generally software was pretty good about this in the days when it mattered.

(In some variants of this scheme, the mount points were a bit different because the shared stuff came from one NFS server and the architecture dependent parts from another, or might even be local if your machine was the only instance of its particular architecture.)

There were much more complicated schemes that various places did (often universities), including ones that put each separate program or software system into its own directory tree and then glued things together in various ways. Interested parties can go through LISA proceedings from the 1980s and early 1990s.

The problem of general OIDC identity provider support in clients

By: cks
10 April 2025 at 02:34

I've written entries criticizing things that support using OIDC (OAuth2) authentication for not supporting it with general OIDC identity providers ('OPs' in OIDC jargon), only with specific (large) ones like Microsoft and Google (and often Github in tech-focused things). For example, there are almost no mail clients that support using your own IdP, and it's much easier to find web-based projects that support the usual few big OIDC providers and not your own OIDC OP. However, at the same time I want to acknowledge the practical problems with supporting arbitrary OIDC OPs in things, especially in things that ordinary people are going to be expected to set up themselves.

The core problem is that there is no way to automatically discover all of the information that you need to know in order to start OIDC authentication. If the person gives you their email address, perhaps you can use WebFinger to discover basic information through OIDC Identity Provider discovery, but that isn't sufficient by itself (and it also requires aligning a number of email addresses). In practice, the OIDC OP will require you to have a 'client identifier' and perhaps a 'client secret', both of which are essentially arbitrary strings. If you're a website, the OIDC standards require your 'redirect URI' to have been pre-registered with it. If you're a client program, hopefully you can supply some sort of 'localhost' redirect URI and have it accepted, but you may need to tell the person setting things up on the OIDC OP side that you need specific strings set.

(The client ID and especially the client secret are not normally supposed to be completely public; there are various issues if you publish them widely and then use them for a bunch of different things, cf.)

If you need specific information, even to know who the authenticated person is, this isn't necessarily straightforward. You may have to ask for exactly the right information, neither too much nor too little, and you can't necessarily assume you know where a user or login name is; you may have to ask the person setting up the custom OIDC IdP where to get this. On the good side, there is at least a specific place for where people's email addresses are (but you can't assume that this is the same as someone's login).

(In OIDC terms, you may need to ask for specific scopes and then use a specific claim to get the user or login name. You can't assume that the always-present 'sub' claim is a login name, although it often is; it can be an opaque identifier that's only meaningful to the identity provider.)

Now imagine that you're the author of a mail client that wants to provide a user friendly experience to people. Today, the best you can do is provide a wall of text fields that people have to enter the right information into, with very little validation possible. If people get things even a little bit wrong, all you and they may see is inscrutable error messages. You're probably going to have to describe what people need to do and the information they need to get in technical OIDC terms that assume people can navigate their specific OIDC IdP (or that someone can navigate this for them). You could create a configuration file format for this where the OIDC IdP operator can write down all of the information, give it to the people using your software, and they import it (much like OpenVPN can provide canned configuration files), but you'll be inventing that format (cue xkcd).
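
To make the scale of this concrete, here's a hypothetical sketch of the information a program would have to collect from the person (or from a configuration file that an OIDC IdP operator hands out); all of the names and values here are made up for illustration:

# hypothetical OIDC client settings for a custom OP
issuer        = https://idp.example.org          # used for OIDC discovery
client_id     = my-mail-client
client_secret = <something issued by the OP, if required>
redirect_uri  = http://localhost:8910/callback   # may have to be registered on the OP
scopes        = openid profile email
login_claim   = preferred_username               # which claim holds the login name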

If you have limited time and resources to develop your software and help people using it, it's much simpler to support only a few large, known OIDC identity providers. If things need specific setup on the OIDC IdP side, you can feasibly provide that in your documentation (since there's only a few variations), and you can pre-set everything in your program, complete with knowledge about things like OIDC scopes and claims. It's also going to be fairly easy to test your code and procedures against these identity providers, while if you support custom OIDC IdPs you may need to figure out how to set up one (or several), how to configure it, and so on.

Getting older, now-replaced Fedora package updates

By: cks
9 April 2025 at 02:05

Over the history of a given Fedora version, Fedora will often release multiple updates to the same package (for example, kernels, but there are many others). When it does this, the older packages wind up being removed from the updates repository and are no longer readily available through mechanisms like 'dnf list --showduplicates <package>'. For a long time I used dnf's 'local' plugin to maintain a local archive of all packages I'd updated, so I could easily revert, but it turns out that as of Fedora 41's change to dnf5 (dnf version 5), that plugin is not available (presumably it hasn't been ported to dnf5, and may never be). So I decided to look into my other options for retrieving and installing older versions of packages, in case the most recent version has a bug that affects me (which has happened).

Before I take everyone on a long yak-shaving expedition, the simplest and best answer is to install the 'fedora-repos-archive' package, which installs an additional Fedora repository that has those replaced updates. After installing it, I suggest that you edit /etc/yum.repos.d/fedora-updates-archive.repo to disable it by default, which will save you time, bandwidth, and possibly aggravation. Then when you really want to see all possible versions of, say, Rust, you can do:

dnf list --showduplicates --enablerepo=updates-archive rust

You can then use 'dnf downgrade ...' as appropriate.

(Like the other Fedora repositories, updates-archive automatically knows your release version and picks packages from it. I think you can change this a bit with '--releasever=<NN>', but I'm not sure how deep the archive is.)

The other approach is to use Fedora Bodhi (also) and Fedora Koji (also) to fetch the packages for older builds, in much the same way as you can use Bodhi (and Koji) to fetch new builds that aren't in the updates or updates-testing repository yet. To start with, we're going to need to find out what's available. I think this can be done through either Bodhi or Koji, although Koji is presumably more authoritative. Let's do this for Rust in Fedora 41:

bodhi updates query --packages rust --releases f41
koji list-builds --state COMPLETE --no-draft --package rust --pattern '*.fc41'

Note that both of these listings are going to include package versions that were never released as updates for various reasons, and also versions built for the pre-release Fedora 41. Although Koji has a 'f41-updates' tag, I haven't been able to find a way to restrict 'koji list-builds' output to packages with that tag, so we're getting more than we'd like even after we use a pattern to restrict this to just Fedora 41.

(I think you may need to use the source package name, not a binary package one; if so, you can get it with 'rpm -qi rust' or the like and look at the 'Source RPM' line for the name.)

Once you've found the package version you want, the easiest and fastest way to get it is through the koji command line client, following the directions in Installing Kernel from Koji with appropriate changes:

mkdir /tmp/scr
cd /tmp/scr
koji download-build --arch=x86_64 --arch=noarch rust-1.83.0-1.fc41

This will get you a bunch of RPMs, and then you can do 'dnf downgrade /tmp/scr/*.rpm' to have dnf do the right thing (only downgrading things you actually have installed).

One reason you might want to use Koji is that this gets you a local copy of the old package in case you want to go back and forth between it and the latest version for testing. If you use the dnf updates-archive approach, you'll be re-downloading the old version at every cycle. Of course at that point you can also use Koji to get a local copy of the latest update too, or 'dnf download ...', although Koji has the advantage that it gets all the related packages regardless of their names (so for Rust you get the 'cargo', 'clippy', and 'rustfmt' packages too).

(In theory you can work through the Fedora Bodhi website, but in practice it seems to be extremely overloaded at the moment and very slow. I suspect that the bot scraper plague is one contributing factor.)

PS: If you're using updates-archive and you just want to download the old packages, I think what you want is 'dnf download --enablerepo=updates-archive ...'.

Fedora 41 seems to have dropped an old XFT font 'property'

By: cks
8 April 2025 at 03:09

Today I upgraded my office desktop from Fedora 40 to Fedora 41, and as traditional there was a little issue:

Current status: it has been '0' days since a Fedora upgrade caused X font problems, this time because xft apparently no longer accepts 'encoding=...' as a font specification argument/option.

One of the small issues with XFT fonts is that they don't really have canonical names. As covered in the "Font Name" section of fonts.conf, a given XFT font is a composite of a family, a size, and a number of attributes that may be used to narrow down the selection of the XFT font until there's only one option left (or no option left). One way to write that in textual form is, for example, 'Sans:Condensed Bold:size=13'.

For a long time, one of the 'name=value' properties that XFT font matching accepted was 'encoding=<something>'. For example, you might say 'encoding=iso10646-1' to specify 'Unicode' (and back in the long ago days, this apparently could make a difference for font rendering). Although I can't find 'encoding=' documented in historical fonts.conf stuff, I appear to have used it for more than a decade, dating back to when I first converted my fvwm configuration from XLFD fonts to XFT fonts. It's still accepted today on Fedora 40 (although I suspect it does nothing):

: f40 ; fc-match 'Sans:Condensed Bold:size=13:encoding=iso10646-1'
DejaVuSans.ttf: "DejaVu Sans" "Regular"

However, it's no longer accepted on Fedora 41:

: f41 ; fc-match 'Sans:Condensed Bold:size=13:encoding=iso10646-1'
Unable to parse the pattern

Initially I thought this had to be a change in fontconfig, but that doesn't seem to be the case; both Fedora 40 and Fedora 41 use the same version, '2.15.0', just with different build numbers (partly because of a mass rebuild for Fedora 41). Freetype itself went from version 2.13.2 to 2.13.3, but the release notes don't seem to have anything relevant. So I'm at a loss. At least it was easy to fix once I knew what had happened; I just had to take the ':encoding=iso10646-1' bit out from the places I had it.

(The visual manifestation was that all of my fvwm menus and window title bars switched to a tiny font. For historical reasons all of my XFT font specifications in my fvwm configuration file used 'encoding=...', so in Fedora 41 none of them worked and fvwm reported 'can't load font <whatever>' and fell back to its default of an XLFD font, which was tiny on my HiDPI display.)

PS: I suspect that this change will be coming in other Linux distributions sooner or later. Unsurprisingly, Ubuntu 24.04's fc-match still accepts 'encoding=...'.

PPS: Based on ltrace output, FcNameParse() appears to be what fails on Fedora 41.

Sorting out the ordering of OpenSSH configuration directives

By: cks
7 April 2025 at 02:50

As I discovered recently, OpenSSH makes some unusual choices for the ordering of configuration directives in its configuration files, both sshd_config and ssh_config (and files they include). Today I want to write down what I know about the result (which is partly things I've learned researching this entry).

For sshd_config, the situation is relatively straightforward. There are what we could call 'global options' (things you set normally, outside of 'Match' blocks) and 'matching Match options' (things set in Match blocks that actually matched). Both of them are 'first mention wins', but Match options take priority over global options regardless of where the Match block is in the (aggregate) configuration file. Sshd makes 'first mention wins' work in the presence of files included from /etc/ssh/sshd_config.d/ by doing the inclusion at the start of /etc/ssh/sshd_config.

So here's an example with a Match statement:

PasswordAuthentication no
Match Address 127.0.0.0/8,192.168.0.0/16
  PasswordAuthentication yes

Password authentication is turned off as a global option but then overridden in the address-based Match block to enable it for connections from the local network. If we had a (Unix) group for logins that we wanted to never use passwords even if they were coming from the local network, I believe that we would have to write it like this, which looks somewhat odd:

PasswordAuthentication no
Match Group neverpassword
  PasswordAuthentication no
Match Address 127.0.0.0/8,192.168.0.0/16
  PasswordAuthentication yes

Then a 'neverpassword' person logging in from the local network would match both Match blocks, and the first block (the group block) would have 'PasswordAuthentication no' win over the second block's 'PasswordAuthentication yes'. Equivalently, you could put the global 'PasswordAuthentication no' after both Match blocks, which might be clearer.

The situation with ssh and ssh_config is one that I find more confusing and harder to follow. The ssh_config manual page says:

Unless noted otherwise, for each parameter, the first obtained value will be used.

It's pretty clear how this works for the various sources of configurations; options on the command line take priority over everything else, and ~/.ssh/config options take priority over the global options from /etc/ssh/ssh_config and its included files. But within a file (such as ~/.ssh/config), I get a little confused.

What I believe this means is that for any option you want to give a default value for all hosts but override for specific hosts, you must put the more specific Host or Match directives first and the 'Host *' directive that sets the default at the end of your configuration file. I'm not sure how this works for matches like 'Match canonical' or 'Match final' that happen 'late' in the processing of your configuration; the natural reading is that you have to make sure that nothing earlier conflicts with them. If this is so, a natural use for 'Match final' would be options that you want to be true defaults, ones that only apply if nothing has overridden them.
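
As an illustration of this ordering (the hosts and options here are just examples), the specific overrides go first and the 'Host *' defaults go last:

Host build.example.org
  ForwardAgent yes

Host *.example.org
  User cks

# defaults, only used if nothing above set the option
Host *
  ForwardAgent no
  ServerAliveInterval 120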

Some ssh_config options are special in that you can provide them multiple times and they'll be merged together; one example is IdentityFile. I think this applies even across multiple Host and Match blocks, and also that there's no way to remove an IdentityFile once you've added it (which might be an issue if you have a lot of identity files, because SSH servers only let you offer so many). Some options let you modify the default state to, for example, add a non-default key exchange algorithm; I haven't tested to see if you can do this multiple times in Host blocks or if you can only do it once.

(These days you can make things somewhat simpler with 'Match tagged ...' and 'Tag'; one handy and clear explanation of what you can do with this is OpenSSH Config Tags How To.)

Typically your /etc/ssh/ssh_config has no active options set in it and includes /etc/ssh/ssh_config.d/* at the end. On Debian-derived systems, it does have some options specified (for 'Host *', ie making them defaults), but the inclusion of /etc/ssh/ssh_config.d/* has been moved to the start so you can override them.

My own personal ~/.ssh/config setup starts with a 'Host *' block, but as far as I can tell I don't try to override any of its settings later in more specific Host blocks. I do have a final 'Host *' block with comments about how I want to do some things by default if they haven't been set earlier, along with comments in the file that I was finding all of this confusing. I may at some point try to redo it into a 'Match tagged' / 'Tag' form to see if that makes it clearer.

My pessimism about changes to error handling in Go (but they'll happen)

By: cks
6 April 2025 at 02:50

I've said in the past that Go is not our language, and I still stand by that. At the same time, the Go developers do eventually respond to the clamour from the community, which I maintain that we've seen with both Go's eventual addition of generics and the change to Go modules and Go dependency handling (where Go started with one story until it clearly didn't work and they had to change). This leads me to two related views.

First, I think that changes to Go's error handling are inevitably coming sooner or later. Error handling is something the community keeps being unhappy about (even though some people are fine with the current situation), and we know that some people in the core team have written up ideas (via, also). This issue is on the radar, and because it's such a popular issue, I think that change is inevitable.

At the same time, I'm not optimistic about that change, because I don't think error handling is a solved problem. We have a relatively good understanding of things like generics and dependency management, but we don't have a similar understanding of 'good error handling' that can drive a good new implementation in Go. It's possible that the Go developers will find something great, but I think it's more likely that what we'll get is a change that comes with its own set of drawbacks (although it'll be better overall than the current approach).

Go is slowly but steadily becoming a more and more complicated language, and a new method of error handling will inevitably add to that complexity. Also, as I once wrote about an earlier error handling proposal (and another one), a change in error handling will inevitably change how Go is written. People will be pushed to write code that works well with the new error handling mechanism and some number of people will use it for nominally clever tricks, because that's what happens with any language feature.

All of this leaves me feeling somewhat pessimistic about any error handling changes to Go. The current situation isn't ideal, but at least the language is kept simple. Given that error handling isn't a solved problem, I'm not sure any error handling change will improve things enough to make up for its other effects.

I should learn systemd's features for restricting things

By: cks
5 April 2025 at 03:19

Today, for reasons beyond the scope of this entry, I took something I'd been running by hand from the command line for testing and tried to set it up under systemd. This is normally straightforward, and it should have been extra straightforward because the thing came with a .service file. But that .service file used a lot of systemd's features for restricting what programs can do, and for my sins I'd decided to set up the program with its binary, configuration file, and so on in different places than it expected (and I think without some things it expected, like a supplementary group for permission to read some files). This was, unfortunately, an abject failure, so I wound up yanking all of the restrictions except 'DynamicUser=true'.

I'm confident that with enough time, I can (or could) sort out all of the problems (although I didn't feel like spending that time today). What this experience really points out is that systemd has a lot of options for really restricting what programs you run can do, and I'm not particularly familiar with them. To get the service working with all of its original restrictions, I'd have to read my way through things like systemd.exec and understand what everything the .service file used did. Once I'd done that, I could have worked out what I needed to change to deal with my setup of the program.
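
To give a flavour of what I mean (this is a made-up sketch, not the actual .service file in question), a unit can stack up restriction directives like these, each of which has to agree with where you've actually put the binary, configuration, and data:

[Service]
DynamicUser=true
ExecStart=/opt/myprog/bin/myprog --config /etc/myprog.conf
# the program can only write under /var/lib/myprog
ProtectSystem=strict
StateDirectory=myprog
ProtectHome=true
PrivateTmp=true
NoNewPrivileges=true
# supplementary group needed to read some shared files
SupplementaryGroups=somegroup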

(An expert probably could have fixed things in short order.)

That systemd has a lot of potential restrictions it can impose and that those restrictions are complex is not a flaw of systemd (or its fault). We already know that fine grained permissions are hard to set up and manage in any environment, especially if you don't know what you're doing (as I don't with systemd's restrictions). At the same time, fine grained restrictions are quite useful for being able to apply some restrictions to programs not designed for them.

(The simplicity of OpenBSD's 'pledge' system is great, but it needs the program's active cooperation. For better or worse, Linux doesn't have a native, fully supported equivalent; instead we have to build it out of more fine grained, lower level facilities, and that's what systemd exposes.)

Learning how to use the restrictions is probably worthwhile in general. We run plenty of things through locally written systemd .service units. Some of those things are potentially risky (although generally not too risky), and some of them could be more restricted than they are today if we wanted to do the work and knew what we were doing (and knew some of the gotchas involved).

(And sooner or later we're going to run into more things with restrictions already in their .service units, and we're going to want to change some aspects of how they work.)

OIDC/OAuth2 as the current all purpose 'authentication hammer'

By: cks
4 April 2025 at 02:50

Today, for reasons, I found myself reflecting that OIDC/OAuth2 seems to have become today's all purpose authentication method, rather than just being a web authentication and Single Sign On system. Obviously you can authenticate websites with OIDC, as well as anything that you can reasonably implement using a website as part of things, but it goes beyond this. You can use OIDC/OAuth2 tokens to authenticate IMAP, POP3, and authenticated SMTP (although substantial restrictions apply), you can (probably) authenticate yourself to various VPN software through OIDC, there are several ways of doing SSH authentication with OIDC, and there's likely others. OIDC/OAuth2 is a supported SASL mechanism, so protocols with SASL support can in theory use OIDC tokens for authentication (although your backend has to support this, as I suppose do your clients). And in general you can pass OAuth2 tokens around somehow to validate yourself over some bespoke protocol.

On the one hand, this is potentially quite useful if you have an OIDC identity server (an 'OP'), perhaps one with some special custom authentication behavior. Once you have your special server, OIDC is your all purpose tool to get its special behavior supported everywhere (as opposed to having to build and hook up your special needs with bespoke behavior in everything, assuming that's even possible). It does have the little drawback that you wind up with OIDC on the brain and see OIDC as the solution to all of your problems, much like hammers.

(Another use of OIDC is to outsource all of your authentication and perhaps even identity handling to some big third party provider (such as Google, Microsoft/Office365, Github, etc). This saves you from having to run your own authentication and identity servers, manage your own Multi-Factor Authentication handling, and so on.)

On the other hand, the OIDC authentication flow is unapologetically web based, and in practice often needs a browser with JavaScript and cookies (cookies may be required in the protocol, I haven't checked). This means that any regular program that wants to use OIDC to authenticate you to something must either call up your browser somehow and then collect the result or it must embed a browser within itself in a little captive browser interface (where it's probably easier to collect the result). This has a variety of limitations and implications, especially if you want to authenticate yourself through OIDC on a server style machine where you don't even have a web browser you can readily run (or a GUI).

(There are awkward tricks around this, cf, or you can outsource part of the authentication to a trusted website that the server program checks in with.)

OIDC isn't the first or the only web authentication protocol; there's also at least SAML, which I believe predates it. But I don't think SAML caught on outside of (some) web authentication. Perhaps it's the XML, which has had what you could call 'some problems' over the years (also, which sort of discusses how SAML requires specific XML handling guarantees that general XML libraries don't necessarily provide).

The order of files in /etc/ssh/sshd_config.d/ matters (and may surprise you)

By: cks
3 April 2025 at 03:18

Suppose, not entirely hypothetically, that you have an Ubuntu 24.04 server system where you want to disable SSH passwords for the Internet but allow them for your local LAN. This looks straightforward based on sshd_config, given the PasswordAuthentication and Match directives:

PasswordAuthentication no
Match Address 127.0.0.0/8,192.168.0.0/16
  PasswordAuthentication yes

Since I'm an innocent person, I put this in a file in /etc/ssh/sshd_config.d/ with a nice high ordering number, say '60-no-passwords.conf'. Then I restarted the SSH daemon and was rather confused when it didn't work (and I wound up resorting to manipulating AuthenticationMethods, which also works).

The culprit is two things combined together. The first is this sentence at the start of sshd_config:

[...] Unless noted otherwise, for each keyword, the first obtained value will be used. [...]

Some configuration systems are 'first mention wins', but I think it's more common to be either 'last mention wins' or 'if it's mentioned more than once, it's an error'. Certainly I was vaguely expecting sshd_config and the files in sshd_config.d to be 'last mention wins', because that would be the obvious way to let you easily override things specified in sshd_config itself. But OpenSSH doesn't work this way.

(You can still override things in sshd_config, because the global sshd_config includes all of sshd_config.d/* at the start, before it sets anything, rather than at the end, as you often see elsewhere.)

The second culprit is that at least in our environment, Ubuntu 24.04 writes out a '50-cloud-init.conf' file that contains one deadly (for this) line:

PasswordAuthentication yes

Since '50-cloud-init.conf' was read by sshd before my '60-no-passwords.conf', it forced password authentication to be on. My new configuration file was more or less silently ignored.

Renaming my configuration file to be '10-no-passwords.conf' fixed my problem and made things work like I expected.
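
One thing that would have shortened my confusion is asking sshd itself what it ends up with; 'sshd -T' (run as root) dumps the effective configuration after all the includes are processed, and '-C' lets you ask what a Match block will do for a particular connection (the address and user here are just examples):

sshd -T | grep -i passwordauthentication
sshd -T -C addr=192.168.1.10,user=someone,host=client.example.org | grep -i passwordauthentication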

Getting a (vague) understanding of error handling in Rust

By: cks
2 April 2025 at 02:37

When I wrote about how error handling isn't a solved problem, I said some things about Rust's error handling that were flat out wrong, which I had in my mind through superstition. Today is a brief correction on that, since I looked it up.

Rust's usual way of signalling (recoverable) errors is to use the Result type, which is an enum with one option for errors and one option for success (so it is the Go 'result, err := call(...)' pattern where only one of result and err can be valid at once, and you have to check before using either). The verbose way of handling this is to explicitly match and handle each option. However, often you're only going to propagate an error, and Rust has special syntax for that in the '?' operator, which immediately propagates an error return and otherwise continues:

fn read_username_from_file() -> Result<String, io::Error> {
    let mut username_file = File::open("hello.txt")?;
    [...]

Rust's '?' operator can be used in any function with a compatible return type. This includes main() if you declare it appropriately, so you can use error propagation with '?' as your only way of handling errors all through your program if you want. The result will probably be a little bit mysterious, since on error people get only the error's terse Debug output and a non-zero exit status, but for quick programs I can see the appeal of doing this all the way up through main().

(This makes the '?' operator a far less verbose equivalent of the common Go idiom of 'r, err := ...; if err != nil {return ..., err}'. The '...' will vary depending on the function's return type.)

The other approach is to panic on error with .unwrap(), if you're okay with a basic panic, or .expect(), if you want to provide some sort of diagnostic message to explain a bit about the problem. Although Rust people will probably twitch at this comparison, it feels to me like using .unwrap() is the equivalent of a Python program that does nothing to catch any exceptions (and so winds up with the default stack backtrace), while .expect() is the equivalent of a Python try/except block that prints some sort of a message before exiting.
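
As a minimal sketch of the two (the file name is arbitrary):

use std::fs::File;

fn main() {
    // panics with a generic 'called `Result::unwrap()` on an `Err` value' message
    let _f = File::open("config.txt").unwrap();

    // panics with our message included if the open fails
    let _g = File::open("config.txt").expect("could not open config.txt");
}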

Since both of these approaches are using Result, you can combine them in quick utility programs. You can propagate errors upward through most of your code, then .expect() on high level operations in main() or functions directly below it to provide some information if things go wrong.

(As a sysadmin, I'm used to the idea of writing quick and rough programs that are run by hand, used only rarely, and operate in environments where they almost never expect to fail. These programs often can get away with minimal error handling, but if things do go wrong it's handy to have some idea of roughly what and where.)

Obviously, what I'd vaguely remembered was a common usage of .unwrap(). I think the wires got crossed because more recent Rust code I've seen uses '?' a lot, so I sort of vaguely conflated the two in my mind.

I'm working to switch from wget to curl (due to Fedora)

By: cks
1 April 2025 at 02:51

I've been using wget for a long time now, which means that I've developed a lot of habits, reflexes and even little scripts around it. Then wget2 happened, or more exactly Fedora switched from wget to wget2 (and Ubuntu is probably going to follow along). I'm very much not a fan of wget2 (also); I find it has both worse behavior and worse output than classical wget, in ways that routinely get in my way. Or got in my way before I started retraining myself to use curl instead of wget.

(It's actually possible that Ubuntu won't follow Fedora here. Ubuntu 24.04's 'wget' is classic wget, and Debian unstable currently has the wget package still as classic wget. The wget to wget2 transition involves the kind of changes that I can see Debian developers rejecting, so maybe Debian will keep 'wget' as classic wget. The upstream has a wget 1.25.0 release as recently as November 2024 (cf); on the other hand, the main project page says that 'currently GNU wget2 is being developed', so it certainly sounds like the upstream wants to move.)

One tool for my switch is wcurl (also, via), which is a cover script to provide a wget-like interface to curl. But I don't have wcurl everywhere (it's not packaged in Ubuntu 24.04, although I think it's coming in 26.04), so I've also been working to remember things like curl's -L and -O options (for downloading things, these are basically 'do what I want' options; I almost always want curl to follow HTTP redirects). There's a number of other options I want to remember, so since I've been looking at the curl manual page, here's some notes to myself.

(If I downloaded multiple URLs at once, I'll probably want to use '--remote-name-all' instead of repeating -O a lot. But I'm probably not going to remember that unless I write a script.)

My 'wcat' script is basically 'curl -L -sS <url>' (-s to not show the progress bar, -S to include at least the HTTP payload on an error, -L to follow redirects). My related 'wretr' script, which is intended to show headers too, is 'curl -L -sS -i <url>' (-i includes headers), or 'curl -sS -i <url>' if I want to explicitly see any HTTP redirect rather than automatically follow it.

(What I'd like is an option to show HTTP headers only if there was an HTTP error, but curl is currently all or nothing here.)

Some of the time I'll want to fetch files with the -J option, which has curl name the downloaded file using the server-supplied Content-Disposition name (the rough equivalent of wget's --content-disposition). This is necessary in cases where a project doesn't bother with good URLs for things. Possibly I also want to use '-R' to set the local downloaded file's timestamp based on the server provided timestamp, which is wget's traditional behavior (sometimes it's good, sometimes it's confusing).
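
Putting the download options together, the sort of invocation I'm trying to make a habit of looks like this (the URL is just an example):

curl -sS -L -O -J -R https://example.org/downloads/something.tar.gz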

PS: I care about wcurl being part of a standard Ubuntu package because then we can install it as part of one of our standard package sets. If it's a personal script, it's not pervasive, although that's still better than nothing.

PPS: I'm not going to blame Fedora for the switch from wget to wget2. Fedora has a consistent policy of marching forward in changes like this to stay in sync with what upstream is developing, even when they cause pain to people using Fedora. That's just what you sign up for when you choose Fedora (or drift into it, in my case; I've been using 'Fedora' since before it was Fedora).

Our simple view of 'identity' for our (Unix) accounts

By: cks
31 March 2025 at 03:17

When I wrote about how it's complicated to count how many professors are in our department, I mentioned that the issues involved would definitely complicate the life of any IAM system that tried to understand all of this, but that we had a much simpler view of things. Today I'm going to explain that, with a little bit on its historical evolution (as I understand it).

All Unix accounts on our systems have to be 'sponsored' by someone, their 'sponsor'. Roughly speaking, all professors who supervise graduate students in the department and all professors who are in the department are or can be sponsors, and there are some additional special sponsors (for example, technical and administrative staff also have sponsors). Your sponsor has to approve your account request before it can be created, although some of the time the approval is more or less automatic (for example, for incoming graduate students, who are automatically sponsored by their supervisor).

At one level this requires us to track 'who is a professor'. At another level, we outsource this work; when new professors show up, the administrative staff side of the department will ask us to set up an account for them, at which point we know to either enable them as a sponsor or schedule it in the future at their official start date. And ultimately, 'who can sponsor accounts' is a political decision that's made (if necessary) by the department (generally by the Chair). We're never called on to evaluate the 'who is a professor in the department' question ourselves.

I believe that one reason we use this model is that what is today the department's general research side computing environment originated in part from an earlier organization that included only a subset of the professors here, so that not everyone in the department could get a Unix account on 'CSRI' systems. To get a CSRI account, a professor who was explicitly part of CSRI had to say 'yes, I want this person to have an account', sponsoring it. When this older, more restricted environment expanded to become the department's general research side computing environment, carrying over the same core sponsorship model was natural (or so I believe).

(Back in the day there were other research groups around the department, involving other professors, and they generally had similar policies for who could get an account.)

Using SimpleSAMLphp to set up an identity provider with Duo support

By: cks
30 March 2025 at 02:28

My university has standardized on an institutional MFA system that's based on institutional identifiers and Duo (a SaaS company, as is commonly necessary these days to support push MFA). We have our own logins and passwords, but wanted to add full Duo MFA authentication to (as a first step) various of our web applications. We were eventually able to work out how to do this, which I'm going to summarize here because although this is a very specific need, maybe someone else in the world also has it.

The starting point is SimpleSAMLphp, which we already had an instance of, authenticating only with login and password against our existing LDAP server. SSP is a SAML IdP, but there's a third party module for OIDC OP support, and we wound up using it to make our new IdP support both SAML and OIDC. For Duo support we found a third party module, but to work with SSP 2.x, you need to use a feature branch. We run the entire collective stack of things under Apache, because we're already familiar with that.

A rough version of the install process is:

  • Set up Apache so it can run PHP and etc etc.

  • Obtain SimpleSAMLphp 2.x from the upstream releases. You almost certainly can't use a version packaged by your Linux distribution, because you need to be able to use the 'composer' PHP package manager to add packages to it.
  • Unpack this release somewhere, conventionally /var/simplesamlphp.
  • Install the 'composer' PHP package manager if it's not already available.

  • Install the third party Duo module from the alternate branch. At the top level of your SimpleSAMLphp install, run:

    composer require 0x0fbc/simplesamlphp-module-duouniversal:dev-feature
    

  • Optionally install the OIDC module:

    composer require simplesamlphp/simplesamlphp-module-oidc
    

Now you can configure SimpleSAMLphp, the Duo module, and the OIDC module following their respective instructions (which are not 'simple' despite the name). If you're using the OIDC module, remember that you'll need to set up the Duo module (and the other things we'll need) in two places, not just one, and you'll almost certainly want to add an Apache alias for '/.well-known/openid-configuration' that redirects it to the actual URL that the OIDC module uses.

At this point we need to deal with the mismatch between our local logins and the institutional identifiers that Duo uses for MFA. There are at least three options to deal with this:

  • Add a LDAP attribute (and schema) that will hold the Duo identifier (let's call this the 'duoid') for everyone. This attribute will (probably) be automatically available as a SAML attribute, making it available to the Duo module.

    (If you're not using LDAP for your SimpleSAMLphp authentication module, the module you're using may have its own way to add extra information.)

  • Embed the duoid into your GECOS field in LDAP and write a SimpleSAMLphp 'authproc' with arbitrary PHP code to extract the GECOS field and materialize it as a SAML attribute. This has the advantage that you can share this GECOS field with the Duo PAM module if you use that.

  • Write a SimpleSAMLphp 'authproc' that uses arbitrary PHP code to look up the duoid for a particular login from some data source, which could be an actual database or simply a flat file that you open and search through. This is what we did, mostly because we had such a file sitting around for other reasons.

(Your new SAML attribute will normally be passed through to SAML SPs (clients) that use you as a SAML IdP, but it won't be passed through to OIDC RPs (also clients) unless you configure a new OIDC claim and scope for it and clients ask for that OIDC scope.)

You'll likely also want to augment the SSP Duo module with some additional logging, so you can tell when Duo MFA authentication is attempted for people and when it succeeds. Since the SSP Duo module is more or less moribund, we probably don't have too much to worry about as far as keeping up with upstream updates goes.

I've looked through the SSP Duo module's code and I'm not too worried about development having stopped some time ago. As far as I can see, the module is directly following Duo's guidance for how to use the current Duo Universal SDK and is basically simple glue code to sit between SimpleSAMLphp's API and the Duo SDK API.

Sidebar: Implications of how the Duo module is implemented

To simplify the technical situation, the MFA challenge created by the SSP Duo module is done as an extra step after SimpleSAMLphp has 'authenticated' your login and password against, say, your LDAP server. SSP as a whole has no idea that a person who's passed LDAP is not yet 'fully logged in', and so it will both log things and behave as if you're fully authenticated even before the Duo challenge succeeds. This is the big reason you need additional logging in the Duo module itself.

As far as I can tell, SimpleSAMLphp will also set its 'you are authenticated' IdP session cookie in your browser immediately after you pass LDAP. Conveniently (and critically), authprocs always run when you revisit SimpleSAMLphp even if you're not challenged for a login and password. This does mean that every time you revisit your IdP (for example because you're visiting another website that's protected by it), you'll be sent for a round trip through Duo's site. Generally this is harmless.

In universities, sometimes simple questions aren't simple

By: cks
29 March 2025 at 02:13

Over on the Fediverse I shared a recent learning experience:

Me, an innocent: "So, how many professors are there in our university department?"
Admin person with a thousand yard stare: "Well, it depends on what you mean by 'professor', 'in', and 'department'." <unfolds large and complicated chart>

In many companies and other organizations, the status of people is usually straightforward. In a university, things are quite often not so clear, and in my department all three words in my joke are in fact not a joke (although you could argue that two overlap).

For 'professor', there are a whole collection of potential statuses beyond 'tenured or tenure stream'. Professors may be officially retired but still dropping by to some degree ('emeritus'), appointed only for a limited period (but doing research, not just teaching), hired as sessional instructors for teaching, given a 'status-only' appointment, and other possible situations.

(In my university, there's such a thing as teaching stream faculty, who are entirely distinct from sessional instructors. In other universities, all professors are what we here would call 'research stream' professors and do research work as well as teaching.)

For 'in', even once you have a regular full time tenure stream professor, there's a wide range of possibilities for a professor to be cross appointed (also) between departments (or sometimes 'partially appointed' by two departments). These sort of multi-department appointments are done for many reasons, including to enable a professor in one department to supervise graduate students in another one. How much of the professor's salary each department pays varies, as does where the professor actually does their research and what facilities they use in each department.

(Sometimes a multi-department professor will be quite active in both departments because their core research is cross-disciplinary, for example.)

For 'department', this is a local peculiarity in my university. We have three campuses, and professors are normally associated with a specific campus. Depending on how you define 'the department', you might or might not consider Computer Science professors at the satellite campuses to be part of the (main campus) department. Sometimes it depends on what the professors opt to do, for example whether or not they will use our main research computing facilities, or whether they'll be supervising graduate students located at our main campus.

Which answers you want for all of these depends on what you're going to use the resulting number (or numbers) for. There is no singular and correct answer for 'how many professors are there in the department'. The corollary to this is that any time we're asked how many professors are in our department, we have to quiz the people asking about what parts matter to them (or guess, or give complicated and conditional answers, or all of the above).

(Asking 'how many professor FTEs do we have' isn't any better.)

PS: If you think this complicates the life of any computer IAM system that's trying to be a comprehensive source of answers, you would be correct. Locally, my group doesn't even attempt to track these complexities and instead has a much simpler view of things that works well enough for our purposes (mostly managing Unix accounts).

US sanctions and your VPN (and certain big US-based cloud providers)

By: cks
28 March 2025 at 02:43

As you may have heard (also) and to simplify, the US government requires US-based organizations to not 'do business with' certain countries and regions (what this means in practice depends in part on which lawyer you ask, or more to the point, which lawyer the US-based organization asked). As a Canadian university, we have people from various places around the world, including sanctioned areas, and sometimes they go back home. Also, we have a VPN, and sometimes when people go back home, they use our VPN for various reasons (including that they're continuing to do various academic work while they're back at home). Like many VPNs, ours normally routes all of your traffic out of our VPN public exit IPs (because people want this, for good reasons).

Getting around geographical restrictions by using a VPN is a time honored Internet tradition. As a result of it being a time honored Internet tradition, a certain large cloud provider with a lot of expertise in browsers doesn't just determine what your country is based on your public IP; instead, as far as we can tell, it will try to sniff all sorts of attributes of your browser and your behavior and so on to tell if you're actually located in a sanctioned place despite what your public IP is. If this large cloud provider decides that you (the person operating through the VPN) actually are in a sanctioned region, it then seems to mark your VPN's public exit IP as 'actually this is in a sanctioned area' and apply the result to other people who are also working through the VPN.

(Well, I simplify. In real life the public IP involved may only be one part of a signature that causes the large cloud provider to decide that a particular connection or request is from a sanctioned area.)

Based on what we observed, this large cloud provider appears to deal with connections and HTTP requests from sanctioned regions by refusing to talk to you. Naturally this includes refusing to talk to your VPN's public exit IP when it has decided that your VPN's IP is really in a sanctioned country. When this sequence of events happened to us, this behavior provided us an interesting and exciting opportunity to discover how many companies hosted some part of their (web) infrastructure and assets (static or otherwise) on the large cloud provider, and also how hard to diagnose the resulting failures were. Some pages didn't load at all; some pages loaded only partially, or had stuff that was supposed to work but didn't (because fetching JavaScript had failed); with some places you could load their main landing page (on one website) but then not move to the pages (on another website at a subdomain) that you needed to use to get things done.

The partial good news (for us) was that this large cloud provider would reconsider its view of where your VPN's public exit IP 'was' after a day or two, at which point everything would go back to working for a while. This was also sort of the bad news, because it made figuring out what was going on somewhat more complicated and hit or miss.

If this is relevant to your work and your VPNs, all I can suggest is to get people to use different VPNs with different public exit IPs depending on where they are (or force them to, if you have some mechanism for that).

PS: This can presumably also happen if some of your people are merely traveling to and in the sanctioned region, either for work (including attending academic conferences) or for a vacation (or both).

(This is a sysadmin war story from a couple of years ago, but I have no reason to believe the situation is any different today. We learned some troubleshooting lessons from it.)

Three ways I know of to authenticate SSH connections with OIDC tokens

By: cks
27 March 2025 at 02:56

Suppose, not hypothetically, that you have an MFA equipped OIDC identity provider (an 'OP' in the jargon), and you would like to use it to authenticate SSH connections. Specifically, like with IMAP, you might want to do this through OIDC/OAuth2 tokens that are issued by your OP to client programs, which the client programs can then use to prove your identity to the SSH server(s). One reason you might want to do this is because it's hard to find non-annoying, MFA-enabled ways of authenticating SSH, and your OIDC OP is right there and probably already supports sessions and so on. So far I've found three different projects that will do this directly, each with their own clever approach and various tradeoffs.

(The bad news is that all of them require various amounts of additional software, including on client machines. This leaves SSH apps on phones and tablets somewhat out in the cold.)

The first is ssh-oidc, which is a joint effort of various European academic parties, although I believe it's also used elsewhere (cf). Based on reading the documentation, ssh-oidc works by directly passing the OIDC token to the server, I believe through a SSH 'challenge' as part of challenge/response authentication, and then verifying it on the server through a PAM module and associated tools. This is clever, but I'm not sure if you can continue to do plain password authentication (at least not without PAM tricks to selectively apply their PAM module depending on, eg, the network area the connection is coming from).

Second is Smallstep's DIY Single-Sign-On for SSH (also). This works by setting up a SSH certificate authority and having the CA software issue signed, short-lived SSH client certificates in exchange for OIDC authentication from your OP. With client side software, these client certificates will be automatically set up for use by ssh, and on servers all you need is to trust your SSH CA. I believe you could even set this up for personal use on servers you SSH to, since you set up a personally trusted SSH CA. On the positive side, this requires minimal server changes and no extra server software, and preserves your ability to directly authenticate with passwords (and perhaps some MFA challenge). On the negative side, you now have a SSH CA you have to trust.
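
The server side of the SSH CA approach is pleasantly small. As a minimal sketch (with made-up file and key names), you publish the CA's public key and point sshd at it, and what the CA software does for clients is roughly an ssh-keygen certificate signing:

# on servers, in sshd_config (or a sshd_config.d/ file):
TrustedUserCAKeys /etc/ssh/user_ca.pub

# roughly what the CA side does to issue a short-lived client certificate:
ssh-keygen -s user_ca -I cks@example.org -n cks -V +8h id_ed25519.pub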

(One reason to care about still supporting passwords plus another MFA challenge is that it means that people without the client software can still log in with MFA, although perhaps somewhat painfully.)

The third option, which I've only recently become aware of, is Cloudflare's recently open-sourced 'opkssh' (via, Github). OPKSSH builds on something called OpenPubkey, which uses a clever trick to embed a public key you provide in (signed) OIDC tokens from your OP (for details see here). OPKSSH uses this to put a basically regular SSH public key into such an augmented OIDC token, then smuggles it from the client to the server by embedding the entire token in a SSH (client) certificate; on the server, it uses an AuthorizedKeysCommand to verify the token, extract the public key, and tell the SSH server to use the public key for verification (see How it works for more details). If you want, as far as I can see OPKSSH still supports using regular SSH public keys and also passwords (possibly plus an MFA challenge).

(Right now OPKSSH is not ready for use with third party OIDC OPs. Like so many things it's started out by only supporting the big, established OIDC places.)

It's quite possible that there are other options for direct (ie, non-VPN) OIDC based SSH authentication. If there are, I'd love to hear about them.

(OpenBao may be another 'SSH CA that authenticates you via OIDC' option; see eg Signed SSH certificates and also here and here. In general the OpenBao documentation gives me the feeling that using it merely to bridge between OIDC and SSH servers would be swatting a fly with an awkwardly large hammer.)

How we handle debconf questions during our Ubuntu installs

By: cks
26 March 2025 at 02:37

In a comment on How we automate installing extra packages during Ubuntu installs, David Magda asked how we dealt with the things that need debconf answers. This is a good question and we have two approaches that we use in combination. First, we have a prepared file of debconf selections for each Ubuntu version and we feed this into debconf-set-selections before we start installing packages. However in practice this file doesn't have much in it and we rarely remember to update it (and as a result, a bunch of it is somewhat obsolete). We generally only update this file if we discover debconf selections where the default doesn't work in our environment.
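
For illustration, the selections file uses the standard debconf format of '<package> <question> <type> <value>', one entry per line, and is loaded with something like the following (these entries are just examples of the format, not our actual selections):

debconf-set-selections <<'EOF'
libc6 libraries/restart-without-asking boolean true
postfix postfix/main_mailer_type select No configuration
EOF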

Second, we run apt-get with a bunch of environment variables set to muzzle debconf:

export DEBCONF_TERSE=yes
export DEBCONF_NOWARNINGS=yes
export DEBCONF_ADMIN_EMAIL=<null address>@<our domain>
export DEBIAN_FRONTEND=noninteractive

Traditionally I've considered muzzling debconf this way to be too dangerous to do during package updates or installing packages by hand. However, I consider it not so much safe as safe enough to do this during our standard install process. To put it one way, we're not starting out with a working system and potentially breaking it by letting some new or updated package pick bad defaults. Instead we're starting with a non-working system and hopefully ending up with a working one. If some package picks bad defaults and we wind up with problems, that's not much worse than we started out with and we'll fix it by updating our file of debconf selections and then redoing the install.

Also, in practice all of this gets worked out during our initial test installs of any new Ubuntu version (done on test virtual machines these days). By the time we're ready to start installing real servers with a new Ubuntu version, we've gone through most of the discovery process for debconf questions. Then the only time we're going to have problems during future system installs is if a package update either changes the default answer for a current question (to a bad one) or adds a new question with a bad default. As far as I can remember, we haven't had either happen.

(Some of our servers need additional packages installed, which we do by hand (as mentioned), and sometimes the packages will insist on stopping to ask us questions or give us warnings. This is annoying, but so far not annoying enough to fix it by augmenting our standard debconf selections to deal with it.)

The pragmatics of doing fsync() after a re-open() of journals and logs

By: cks
25 March 2025 at 02:02

Recently I read Rob Norris' fsync() after open() is an elaborate no-op (via). This is a contrarian reaction to the CouchDB article that prompted my entry Always sync your log or journal files when you open them. At one level I can't disagree with Norris and the article; POSIX is indeed very limited about the guarantees it provides for a successful fsync() in a way that frustrates the 'fsync after open' case.

At another level, I disagree with the article. As Norris notes, there are systems that go beyond the minimum POSIX guarantees, and also the fsync() after open() approach is almost the best you can do and is much faster than your other (portable) option, which is to call sync() (on Linux you could call syncfs() instead). Under POSIX, sync() is allowed to return before the IO is complete, but at least sync() is supposed to definitely trigger flushing any unwritten data to disk, which is more than POSIX fsync() provides you (as Norris notes, POSIX permits fsync() to apply only to data written to that file descriptor, not all unwritten data for the underlying file). As far as fsync() goes, in practice I believe that almost all Unixes and Unix filesystems are going to be more generous than POSIX requires and fsync() all dirty data for a file, not just data written through your file descriptor.

Actually being as restrictive as POSIX allows would likely be a problem for Unix kernels. The kernel wants to index the filesystem cache by inode, including unwritten data. This makes it natural for fsync() to flush all unwritten data associated with the file regardless of who wrote it, because then the kernel needs no extra data to be attached to dirty buffers. If you wanted to be able to flush only dirty data associated with a file object or file descriptor, you'd need to either add metadata associated with dirty buffers or index the filesystem cache differently (which is clearly less natural and probably less efficient).

Adding metadata has an assortment of challenges and overheads. If you add it to dirty buffers themselves, you have to worry about clearing this metadata when a file descriptor is closed or a file object is deallocated (including when the process exits). If you instead attach metadata about dirty buffers to file descriptors or file objects, there's a variety of situations where other IO involving the buffer requires updating your metadata, including the kernel writing out dirty buffers on its own without a fsync() or a sync() and then perhaps deallocating the now clean buffer to free up memory.

Being as restrictive as POSIX allows probably also has low benefits in practice. To be a clear benefit, you would need to have multiple things writing significant amounts of data to the same file and fsync()'ing their data separately; this is when the file descriptor (or file object) specific fsync() saves you a bunch of data write traffic over the 'fsync() the entire file' approach. But as far as I know, this is a pretty unusual IO pattern. Much of the time, the thing fsync()'ing the file is the only writer, either because it's the only thing dealing with the file or because updates to the file are being coordinated through it so that processes don't step over each other.

PS: If you wanted to implement this, the simplest option would be to store the file descriptor and PID (as numbers) as additional metadata with each buffer. When the system fsync()'d a file, it could check the current file descriptor number and PID against the saved ones and only flush buffers where they matched, or where these values had been cleared to signal an uncertain owner. This would flush more than strictly necessary if the file descriptor number (or the process ID) had been reused or buffers had been touched in some way that caused the kernel to clear the metadata, but doing more work than POSIX strictly requires is relatively harmless.

Sidebar: fsync() and mmap() in POSIX

Under a strict reading of the POSIX fsync() specification, it's not entirely clear how you're properly supposed to fsync() data written through mmap() mappings. If 'all data for the open file descriptor' includes pages touched through mmap(), then you have to keep the file descriptor you used for mmap() open, despite POSIX mmap() otherwise implicitly allowing you to close it; my view is that this is at least surprising. If 'all data' only includes data directly written through the file descriptor with system calls, then there's no way to trigger a fsync() for mmap()'d data.

The obviousness of indexing the Unix filesystem buffer cache by inodes

By: cks
24 March 2025 at 02:34

Like most operating systems, Unix has an in-memory cache of filesystem data. Originally this was a fixed size buffer cache that was maintained separately from the memory used by processes, but later it became a unified cache that was used for both memory mappings established through mmap() and regular read() and write() IO (for good reasons). Whenever you have a cache, one of the things you need to decide is how the cache is indexed. The more or less required answer for Unix is that the filesystem cache is indexed by inode (and thus filesystem, as inodes are almost always attached to some filesystem).

Unix has three levels of indirection for straightforward IO. Processes open and deal with file descriptors, which refer to underlying file objects, which in turn refer to an inode. There are various situations, such as calling dup(), where you will wind up with two file descriptors that refer to the same underlying file object. Some state is specific to file descriptors, but other state is held at the level of file objects, and some state has to be held at the inode level, such as the last modification time of the inode. For mmap()'d files, we have a 'virtual memory area', which is a separate level of indirection that is on top of the inode.

The biggest reason to index the filesystem cache by inode instead of file descriptor or file object is coherence. If two processes separately open the same file, getting two separate file objects and two separate file descriptors, and then one process writes to the file while the other reads from it, we want the reading process to see the data that the writing process has written. The only thing the two processes naturally share is the inode of the file, so indexing the filesystem cache by inode is the easiest way to provide coherence. If the kernel indexed by file object or file descriptor, it would have to do extra work to propagate updates through all of the indirection. This includes the 'updates' of reading data off disk; if you index by inode, everyone reading from the file automatically sees fetched data with no extra work.

(Generally we also want this coherence for two processes that both mmap() the file, and for one process that mmap()s the file while another process read()s or write()s to it. Again this is easiest to achieve if everything is indexed by the inode.)
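
As a small illustration of this coherence, here's a Go sketch that opens the same (arbitrary) file twice to get two independent file objects and descriptors; a write through one is immediately visible through the other, and the same thing holds when the two opens are in two separate processes:

package main

import (
    "fmt"
    "os"
)

func main() {
    // Two independent opens of the same file give two file objects
    // (and descriptors) that share only the underlying inode.
    w, err := os.Create("/tmp/coherence-demo")
    if err != nil {
        panic(err)
    }
    defer w.Close()

    r, err := os.Open("/tmp/coherence-demo")
    if err != nil {
        panic(err)
    }
    defer r.Close()

    // Write through one file object...
    if _, err := w.WriteString("hello from the writer\n"); err != nil {
        panic(err)
    }

    // ...and immediately read through the other. Because the cache is
    // indexed by inode, the reader sees the new data with no fsync()
    // or re-open needed.
    buf := make([]byte, 64)
    n, _ := r.Read(buf)
    fmt.Printf("reader saw: %q\n", buf[:n])
}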

Another reason to index by inode is how easy it is to handle various situations in the filesystem cache when things are closed or removed, especially when the filesystem cache holds writes that are being buffered in memory before being flushed to disk. Processes frequently close file descriptors and drop file objects, including by exiting, but any buffered writes still need to be findable so they can be flushed to disk before, say, the filesystem itself is unmounted. Similarly, if an inode is deleted we don't want to flush its pending buffered writes to disk (and certainly we can't allocate blocks for them, since there's nothing to own those blocks any more), and we want to discard any clean buffers associated with it to free up memory. If you index the cache by inode, all you need is for filesystems to be able to find all their inodes; everything else more or less falls out naturally.

This doesn't absolutely require a Unix to index its filesystem buffer caches by inode. But I think it's clearly easiest to index the filesystem cache by inode, instead of the other available references. The inode is the common point for all IO involving a file (partly because it's what filesystems deal with), which makes it the easiest index; everyone has an inode reference and in a properly implemented Unix, everyone is using the same inode reference.

(In fact all sorts of fun tend to happen in Unixes if they have a filesystem that gives out different in-kernel inodes that all refer to the same on-disk filesystem object. Usually this happens by accident or filesystem bugs.)

How we automate installing extra packages during Ubuntu installs

By: cks
23 March 2025 at 02:52

We have a local system for installing Ubuntu machines, and one of the important things it does is install various additional Ubuntu packages that we want as part of our standard installs. These days we have two sorts of standard installs, a 'base' set of packages that everything gets and a broader set of packages that login servers and compute servers get (to make them more useful and usable by people). Specialized machines need additional packages, and while we can automate installation of those too, they're generally a small enough set of packages that we document them in our install instructions for each machine and install them by hand.

There are probably clever ways to do bulk installs of Ubuntu packages, but if so, we don't use them. Our approach is instead a brute force one. We have files that contain lists of packages, such as a 'base' file, and these files just contain a list of packages with optional comments:

# Partial example of Basic package set
amanda-client
curl
jq
[...]

# decodes kernel MCE/machine check events
rasdaemon

# Be able to build Debian (Ubuntu) packages on anything
build-essential fakeroot dpkg-dev devscripts automake 

(Like all of the rest of our configuration information, these package set files live in our central administrative filesystem. You could distribute them in some other way, for example fetching them with rsync or even HTTP.)

To install these packages, we use grep to extract the actual packages into a big list and feed the big list to apt-get. This is more or less:

pkgs=$(cat $PKGDIR/$s | grep -v '^#' | grep -v '^[ \t]*$')
apt-get -qq -y install $pkgs

(This will abort if any of the packages we list aren't available. We consider this a feature, because it means we have an error in the list of packages.)

A more organized and minimal approach might be to add the '--no-install-recommends' option, but we started without it and we don't particularly want to go back to find which recommended packages we'd have to explicitly add to our package lists.

At least some of the 'base' package installs could be done during the initial system install process from our customized Ubuntu server ISO image, since you can specify additional packages to install. However, doing package installs that way would create a series of issues in practice. We'd probably need to track more carefully which package came from which Ubuntu collection, since only some of them are enabled during the server install process; it would be harder to update the lists; and the tools for handling the whole process would be a lot more limited, as would our ability to troubleshoot any problems.

Doing this additional package install in our 'postinstall' process means that we're doing it in a full Unix environment where we have all of the standard Unix tools, and we can easily look around the system if and when there's a problem. Generally we've found that the more of our installs we can defer to once the system is running normally, the better.

(Also, the less the Ubuntu installer does, the faster it finishes and the sooner we can get back to our desks.)

(This entry was inspired by parts of a blog post I read recently and reflecting about how we've made setting up new versions of machines pretty easy, assuming our core infrastructure is there.)

The mystery (to me) of tiny font sizes in KDE programs I run

By: cks
22 March 2025 at 03:24

Over on the Fediverse I tried a KDE program and ran into a common issue for me:

It has been '0' days since a KDE app started up with too-small fonts on my bespoke fvwm based desktop, and had no text zoom. I guess I will go use a browser, at least I can zoom fonts there.

Maybe I could find a KDE settings thing and maybe find where and why KDE does this (it doesn't happen in GNOME apps), but honestly it's simpler to give up on KDE based programs and find other choices.

(The specific KDE program I was trying to use this time was NeoChat.)

My fvwm based desktop environment has an XSettings daemon running, which I use in part to set up a proper HiDPI environment (also, which doesn't talk about KDE fonts because I never figured that out). I suspect that my HiDPI display is part of why KDE programs often or always seem to pick tiny fonts, but I don't particularly know why. Based on the xsettingsd documentation and the registry, there don't seem to be any KDE-specific font settings, and I'm setting the Gtk/FontName setting to a font that KDE doesn't seem to be using (which I could only verify once I found a way to see the font I was specifying).

After some searching I found the systemsettings program through the Arch wiki's page on KDE and was able to turn up its font sizes in a way that appears to be durable (ie, it stays after I stop and start systemsettings). However, this hasn't affected the fonts I see in NeoChat when I run it again. There are a bunch of font settings, but maybe NeoChat is using the 'small' font for some reason (apparently which app uses what font setting can be variable).

Qt (the underlying GUI toolkit of much or all of KDE) has its own set of environment variables for scaling things on HiDPI displays, and setting $QT_SCALE_FACTOR does size up NeoChat (although apparently bits of Plasma ignore these; I think I'm unlikely to run into that since I don't want to use KDE's desktop components).

Some KDE applications have their own settings files with their own font sizes; one example I know of is kdiff3. This is quite helpful because if I'm determined enough, I can either adjust the font sizes in the program's settings or at least go edit the configuration file (in this case, .config/kdiff3rc, I think, not .kde/share/config/kdiff3rc). However, not all KDE applications allow you to change font sizes through either their GUI or a settings file, and NeoChat appears to be one of the ones that don't.

In theory now that I've done all of this research I could resize NeoChat and perhaps other KDE applications through $QT_SCALE_FACTOR. In practice I feel I would rather switch to applications that interoperate better with the rest of my environment unless for some reason the KDE application is either my only choice or the significantly superior one (as it has been so far for kdiff3 for my usage).

Go's choice of multiple return values was the simpler option

By: cks
21 March 2025 at 02:56

Yesterday I wrote about Go's use of multiple return values and Go types, in reaction to Mond's Were multiple return values Go's biggest mistake?. One of the things that I forgot to mention in that entry is that I think Go's choice to have multiple values for function returns and a few other things was the simpler and more conservative approach in its overall language design.

In a statically typed language that expects to routinely use multiple return values, as Go was designed to with the 'result, error' pattern, returning multiple values as a typed tuple means that tuple-based types are pervasive. This creates pressures on both the language design and the API of the standard library, especially if you start out (as Go did) being a fairly strongly nominally typed language, where different names for the same concrete type can't be casually interchanged. Or to put it another way, having a frequently used tuple container (meta-)type significantly interacts with and affects the rest of the language.

(For example, if Go had handled multiple values through tuples as explicit typed entities, it might have had to start out with something like type aliases (added only in Go 1.9) and it might have been pushed toward some degree of structural typing, because that probably makes it easier to interact with all of the return value tuples flying around.)

Having multiple values as a special case for function returns, range, and so on doesn't create anywhere near this additional influence and pressure on the rest of the language. There are a whole bunch of questions and issues you don't face because multiple values aren't types and can't be stored or manipulated as single entities. Of course you have to be careful in the language specification and it's not trivial, but it's simpler and more contained than going the tuple type route. I also feel it's the more conservative approach, since it doesn't affect the rest of the language as much as a widely used tuple container type would.
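
As a quick illustration of multiple values being a special case rather than a type, in Go you have to destructure them on the spot; there's no tuple value to store or pass around (the function here is just an example for illustration):

package main

import "fmt"

func divide(a, b int) (int, error) {
    if b == 0 {
        return 0, fmt.Errorf("division by zero")
    }
    return a / b, nil
}

func main() {
    // The multiple return values must be destructured immediately;
    // 'pair := divide(10, 3)' is a compile error because there is no
    // tuple value to store.
    q, err := divide(10, 3)
    fmt.Println(q, err)

    // The same special case shows up in map lookups...
    m := map[string]int{"a": 1}
    v, ok := m["a"]
    fmt.Println(v, ok)

    // ...and in range clauses.
    for i, s := range []string{"x", "y"} {
        fmt.Println(i, s)
    }
}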

(As Mond criticizes, it does create special cases. But Go is a pragmatic language that's willing to live with special cases.)

Go's multiple return values and (Go) types

By: cks
20 March 2025 at 03:31

Recently I read Were multiple return values Go's biggest mistake? (via), which wishes that Go had full blown tuple types (to put my spin on it). One of the things that struck me about Go's situation when I read the article is exactly the inverse of what the article is complaining about, which is that because Go allows multiple values for function return types (and in a few other places), it doesn't have to have tuple types.

One problem with tuple types in a statically typed language is that they must exist as types, whether declared explicitly or implicitly. In a language like Go, where type definitions create new distinct types even if the structure is the same, it isn't particularly difficult to wind up with an ergonomics problem. Suppose that you want to return a tuple that is a net.Conn and an error, a common pair of return values in the net package today. If that tuple is given a named type, everyone must use that type in various places; merely returning or storing an implicitly declared type that's structurally the same is not acceptable under Go's current type rules. Conversely, if that tuple is not given a type name in the net package, everyone is forced to stick to an anonymous tuple type. In addition, this up front choice is now an API; it's not API compatible to give your previously anonymous tuple type a name or vice versa, even if the types are structurally compatible.

(Since returning something and error is so common an idiom in Go, we're also looking at either a lot of anonymous types or a lot more named types. Consider how many different combinations of multiple return values you find in the net package alone.)

One advantage of multiple return values (and the other forms of tuple assignment, and for range clauses) is that they don't require actual formal types. Functions have a 'result type', which doesn't exist as an actual type, but you already need to handle the same sort of 'not an actual type' thing for their 'parameter type'. My guess is that this let Go's designers skip a certain amount of complexity in Go's type system, because they didn't have to define an actual tuple (meta-)type or alternately expand how structs worked to cover the tuple usage case.

(Looked at from the right angle, structs are tuples with named fields, although then you get into questions of how nested structs act in tuple-like contexts.)
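
To make the contrast concrete, here's a sketch of what the 'named tuple' approach looks like with today's Go structs; the ConnResult type and the dial() wrapper are invented for illustration and aren't anything in the net package:

package main

import (
    "fmt"
    "net"
)

// ConnResult is an invented stand-in for the named tuple type that a
// hypothetical tuple-returning net.Dial would have to expose.
type ConnResult struct {
    Conn net.Conn
    Err  error
}

func dial(network, addr string) ConnResult {
    c, err := net.Dial(network, addr)
    return ConnResult{Conn: c, Err: err}
}

func main() {
    // Every caller is now tied to the ConnResult name; a structurally
    // identical struct defined in another package is a different type
    // under Go's rules.
    res := dial("tcp", "localhost:22")
    if res.Err != nil {
        fmt.Println("dial failed:", res.Err)
        return
    }
    res.Conn.Close()
}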

A dynamically typed language like Python doesn't have this problem because there are no static types to declare, so there's no need to have different types for different combinations of (return) values. There's simply a general tuple container type that can be any shape you want or need, and can be created and destructured on demand.

(I assume that some statically typed languages have worked out how to handle tuples as a data type within their type system. Rust has tuples, for example; I haven't looked into how they work in Rust's type system, for reasons.)

How ZFS knows and tracks the space usage of datasets

By: cks
19 March 2025 at 02:44

Anyone who's ever had to spend much time with 'zfs list -t all -o space' knows the basics of ZFS space usage accounting, with space used by the datasets, data unique to a particular snapshot (the 'USED' value for a snapshot), data used by snapshots in total, and so on. But today I discovered that I didn't really know how it all worked under the hood, so I went digging in the source code. The answer is that ZFS tracks all of these types of space usage directly as numbers, and updates them as blocks are logically freed.

(Although all of these are accessed from user space as ZFS properties, they're not conventional dataset properties; instead, ZFS materializes the property version any time you ask, from fields in its internal data structures. Some of these fields are different and accessed differently for snapshots and regular datasets, for example what 'zfs list' presents as 'USED'.)

All changes to a ZFS dataset happen in a ZFS transaction (group), which are assigned ever increasing numbers, the 'transaction group number(s)' (txg). This includes allocating blocks, which remember their 'birth txg', and making snapshots, which carry the txg they were made in and necessarily don't contain any blocks that were born after that txg. When ZFS wants to free a block in the live filesystem (either because you deleted the object or because you're writing new data and ZFS is doing its copy on write thing), it looks at the block's birth txg and the txg of the most recent snapshot; if the block is old enough that it has to be in that snapshot, then the block is not actually freed and the space for the block is transferred from 'USED' (by the filesystem) to 'USEDSNAP' (used only in snapshots). ZFS will then further check the block's txg against the txgs of snapshots to see if the block is unique to a particular snapshot, in which case its space will be added to that snapshot's 'USED'.
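
Here's a deliberately tiny toy model (in Go) of the live-filesystem case above. All of the types and field names are invented and the real OpenZFS code is far more involved; this only tries to capture the txg comparisons:

package main

import "fmt"

// Invented types; not the real OpenZFS data structures.
type block struct {
    birthTxg uint64
    size     uint64
}

type dataset struct {
    used         uint64            // space used by the live filesystem
    usedSnap     uint64            // space used only in snapshots
    snapshotTxgs []uint64          // snapshot txgs, oldest to newest
    snapshotUsed map[uint64]uint64 // per-snapshot unique ('USED') space
}

// freeBlock is called when the live filesystem logically frees a block.
func (d *dataset) freeBlock(b block) {
    n := len(d.snapshotTxgs)
    if n == 0 || b.birthTxg > d.snapshotTxgs[n-1] {
        // Born after the most recent snapshot: no snapshot can hold
        // it, so it is really freed.
        d.used -= b.size
        return
    }
    // Some snapshot still holds the block; its space moves from the
    // filesystem's USED to USEDSNAP.
    d.used -= b.size
    d.usedSnap += b.size
    // If it was born after the second most recent snapshot, it is
    // unique to the most recent snapshot and counts as its USED.
    if n == 1 || b.birthTxg > d.snapshotTxgs[n-2] {
        d.snapshotUsed[d.snapshotTxgs[n-1]] += b.size
    }
}

func main() {
    d := &dataset{
        used:         1000,
        snapshotTxgs: []uint64{100, 200},
        snapshotUsed: map[uint64]uint64{},
    }
    d.freeBlock(block{birthTxg: 150, size: 10}) // unique to snapshot@txg 200
    d.freeBlock(block{birthTxg: 250, size: 10}) // really freed
    fmt.Println(d.used, d.usedSnap, d.snapshotUsed)
}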

ZFS goes through a similar process when you delete a snapshot. As it runs around trying to free up the snapshot's space, it may discover that a block it's trying to free is now used only by one other snapshot, based on the relevant txgs. If so, the block's space is added to that snapshot's 'USED'. If the block is freed entirely, ZFS will decrease the 'USEDSNAP' number for the entire dataset. If the block is still used by several snapshots, no usage numbers need to be adjusted.

(Determining if a block is unique in the previous snapshot is fairly easy, since you can look at the birth txgs of the two previous snapshots. Determining if a block is now unique in the next snapshot (or for that matter is still in use in the dataset) is more complex and I don't understand the code involved; presumably it involves somehow looking at what blocks were freed and when. Interested parties can look into the OpenZFS code themselves, where there are some surprises.)

PS: One consequence of this is that there's no way after the fact to find out when space shifted from being used by the filesystem to used by snapshots (for example, when something large gets deleted in the filesystem and is now present only in snapshots). All you can do is capture the various numbers over time and then look at your historical data to see when they changed. The removal of snapshots is captured by ZFS pool history, but as far as I know this doesn't capture how the deletion affected the various space usage numbers.

I don't think error handling is a solved problem in language design

By: cks
18 March 2025 at 02:53

There are certain things about programming language design that are more or less solved problems, where we generally know what the good and bad approaches are. For example, over time we've wound up agreeing on various common control structures like for and while loops, if statements, and multi-option switch/case/etc statements. The syntax may vary (sometimes very much, as for example in Lisp), but the approach is more or less the same because we've come up with good approaches.

I don't believe this is the case with handling errors. One way to see this is to look at the wide variety of approaches and patterns that languages today take to error handling. There is at least 'errors as exceptions' (for example, Python), 'errors as values' (Go and C), and 'errors instead of results and you have to check' combined with 'if errors happen, panic' (both Rust). Even in Rust there are multiple idioms for dealing with errors; some Rust code will explicitly check its Result types, while other Rust code sprinkles '?' around and accepts that if the program sails off the happy path, it simply dies.

Update: I got Rust's error handling wrong, as pointed out in the comments on this entry. What I was thinking of is Rust's .unwrap() and .expect(), not '?'.

If you were creating a new programming language from scratch, there's no clear agreed answer to what error handling approach you should pick, not the way we have more or less agreed on how for, while, and so on should work. You'd be left to evaluate trade offs in language design and language ergonomics and to make (and justify) your choices, and there probably would always be people who think you should have chosen differently. The same is true of changing or evolving existing languages, where there's no generally agreed on 'good error handling' to move toward.

(The obvious corollary of this is that there's no generally agreed on keywords or other syntax for error handling, the way 'for' and 'while' are widely accepted as keywords as well as concepts. The closest we've come is that some forms of error handling have generally accepted keywords, such as try/catch for exception handling.)

I like to think that this will change at some point in the future. Surely there actually is a good pattern for error handling out there and at some point we will find it (if it hasn't already been found) and then converge on it, as we've converged on programming language things before. But I feel it's clear that we're not there yet today.

OIDC claim scopes and their interactions with OIDC token authentication

By: cks
17 March 2025 at 02:31

When I wrote about how SAML and OIDC differed in sharing information, where SAML shares every SAML 'attribute' by default and OIDC has 'scopes' for its 'claims', I said that the SAML approach was probably easier within an organization, where you already have trust in the clients. It turns out that there's an important exception to this I didn't realize at the time, and that's when programs (like mail clients) are using tokens to authenticate to servers (like IMAP servers).

In OIDC/OAuth2 (and probably in SAML as well), programs that obtain tokens can open them up and see all of the information that they contain, either inspecting them directly or using a public OIDC endpoint that allows them to 'introspect' the token for additional information (this is the same endpoint that will be used by your IMAP server or whatever). Unless you enjoy making a bespoke collection of (for example) IMAP clients, the information that programs need to obtain tokens is going to be more or less public within your organization and will probably (or even necessarily) leak outside of it.

(For example, you can readily discover all of the OIDC client IDs used by Thunderbird for the various large providers it supports. There's nothing stopping you from using those client IDs and client secrets yourself, although large providers may require your target to have specifically approved using Thunderbird with your target's accounts.)

This means that anyone who can persuade your people to authenticate through a program's usual flow can probably extract all of the information available in the token. They can do this either on the person's computer (capturing the token locally) or by persuading people that they need to 'authenticate to this service with IMAP OAuth2' or the like and then extracting the information from the token.

In the SAML world, this will by default be all of the information contained in the token. In the OIDC world, you can restrict the information made available through tokens issued through programs by restricting the scopes that you allow programs to ask for (and possibly different scopes for different programs, although this is a bit fragile; attackers may get to choose which program's client ID and so on they use).

(Realizing this is going to change what scopes we allow in our OIDC IdP for program client registrations. So far I had reflexively been giving them access to everything, just like our internal websites; now I think I'm going to narrow it down to almost nothing.)

Sidebar: How your token-consuming server knows what created them

When your server verifies OAuth2/OIDC tokens presented to it, the minimum thing you want to know is that they come from the expected OIDC identity provider, which is normally achieved automatically because you'll ask that OIDC IdP to verify that the token is good. However, you may also want to know that the token was specifically issued for use with your server, or through a program that's expected to be used for your server. The normal way to do this is through the 'aud' OIDC claim, which has at least the client ID (and in theory your OIDC IdP could add additional entries). If your OIDC IdP can issue tokens through multiple identities (perhaps to multiple parties, such as the major IdPs of, for example, Google and Microsoft), you may also want to verify the 'iss' (issuer) field instead of or in addition to 'aud'.
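
As a hedged illustration of how much a program (or a person) can see, here's a Go sketch that decodes the claims of a JWT-format token without verifying it. Not all access tokens are JWTs, and a server must still properly validate tokens with the IdP rather than trusting this sort of peek:

package main

import (
    "encoding/base64"
    "encoding/json"
    "fmt"
    "strings"
)

// peekClaims decodes the payload of a JWT-format token without
// verifying its signature. This is only useful for inspection; real
// verification goes through the IdP (introspection, userinfo, or
// checking the signature against the IdP's published keys).
func peekClaims(token string) (map[string]interface{}, error) {
    parts := strings.Split(token, ".")
    if len(parts) != 3 {
        return nil, fmt.Errorf("not a JWT-format token")
    }
    payload, err := base64.RawURLEncoding.DecodeString(parts[1])
    if err != nil {
        return nil, err
    }
    var claims map[string]interface{}
    if err := json.Unmarshal(payload, &claims); err != nil {
        return nil, err
    }
    return claims, nil
}

func main() {
    claims, err := peekClaims("<token goes here>") // placeholder token
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    // 'iss' and 'aud' are the claims discussed above.
    fmt.Println("iss:", claims["iss"], "aud:", claims["aud"])
}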

Some notes on the OpenID Connect (OIDC) 'redirect uri'

By: cks
16 March 2025 at 02:57

The normal authentication process for OIDC is web-based and involves a series of HTTP redirects, interspersed with web pages that you interact with. Something that wants to authenticate you will redirect you to the OIDC identity server's website, which will ask you for your login and password and maybe MFA authentication, check them, and then HTTP redirect you back to a 'callback' or 'redirect' URL that will transfer a magic code from the OIDC server to the OIDC client (generally as a URL query parameter). All of this happens in your browser, which means that the OIDC client and server don't need to be able to directly talk to each other, allowing you to use an external cloud/SaaS OIDC IdP to authenticate to a high-security internal website that isn't reachable from the outside world and maybe isn't allowed to make random outgoing HTTP connections.

(The magic code transferred in the final HTTP redirect is apparently often not the authentication token itself but instead something the client can use for a short time to obtain the real authentication token. This does require the client to be able to make an outgoing HTTP connection, which is usually okay.)

When the OIDC client initiates the HTTP redirection to the OIDC IdP server, one of the parameters it passes along is the 'redirect uri' it wants the OIDC server to use to pass the magic code back to it. A malicious client (or something that's gotten a client's ID and secret) could do some mischief by manipulating this redirect URL, so the standard specifically requires that OIDC IdP have a list of allowed redirect uris for each registered client. The standard also says that in theory, the client's provided redirect uri and the configured redirect uris are compared as literal string values. So, for example, 'https://example.org/callback' doesn't match 'https://example.org/callback/'.

This is straightforward when it comes to websites as OIDC clients, since they should have well defined callback urls that you can configure directly into your OIDC IdP when you set up each of them. It gets more hairy when what you're dealing with is programs as OIDC clients, where they are (for example) trying to get an OIDC token so they can authenticate to your IMAP server with OAuth2, since these programs don't normally have a website. Historically, there are several approaches that people have taken for programs (or seem to have, based on my reading so far).

Very early on in OAuth2's history, people apparently defined the special redirect uri value 'urn:ietf:wg:oauth:2.0:oob' (which is now hard to find or identify documentation on). An OAuth2 IdP that saw this redirect uri (and maybe had it allowed for the client) was supposed to not redirect you but instead show you a HTML page with the magic OIDC code displayed on it, so you could copy and paste the code into your local program. This value is now obsolete but it may still be accepted by some IdPs (you can find it listed for Google in mutt_oauth2.py, and I spotted an OIDC IdP server that handles it).

Another option is that the IdP can provide an actual website that does the same thing; if you get HTTP redirected to it with a valid code, it will show you the code on a HTML page and you can copy and paste it. Based on mutt_oauth2.py again, it appears that Microsoft may have at one point done this, using https://login.microsoftonline.com/common/oauth2/nativeclient as the page. You can do this too with your own IdP (or your own website in general), although it's not recommended for all sorts of reasons.

The final broad approach is to use 'localhost' as the target host for the redirect. There are several ways to make this work, and one of them runs into complications with the IdP's redirect uri handling.

The obvious general approach is for your program to run a little HTTP server that listens on some port on localhost, and capture the code when the (local) browser gets the HTTP redirect to localhost and visits the server. The problem here is that you can't necessarily listen on port 80, so your redirect uri needs to include the port you're listening on (eg 'http://localhost:7000'), and if your OIDC IdP is following the standard it must be configured not just with 'http://localhost' as the allowed redirect uri but the specific port you'll use. Also, because of string matching, if the OIDC IdP lists 'http://localhost:7000', you can't send 'http://localhost:7000/' despite them being the same URL.

(And your program has to use 'localhost', not '127.0.0.1' or the IPv6 loopback address; although the two have the same effect, they're obviously not string-identical.)
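
A minimal Go sketch of this localhost listener approach might look like the following; the port, path, and messages are arbitrary choices, and what you use has to exactly match a redirect uri that your IdP has been configured to allow:

package main

import (
    "fmt"
    "net/http"
)

func main() {
    // A one-shot local listener for the OIDC redirect.
    codeCh := make(chan string, 1)
    srv := &http.Server{Addr: "localhost:7000"}

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        code := r.URL.Query().Get("code")
        if code == "" {
            http.NotFound(w, r)
            return
        }
        fmt.Fprintln(w, "You can close this browser window now.")
        codeCh <- code
    })

    go srv.ListenAndServe()
    code := <-codeCh
    srv.Close()

    fmt.Println("got OIDC code:", code)
    // The program would now exchange this code for the real tokens
    // at the IdP's token endpoint.
}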

Based on experimental evidence from OIDC/OAuth2 client configurations, I strongly suspect that some large IdP providers have non-standard, relaxed handling of 'localhost' redirect uris such that their client configuration lists 'http://localhost' and the IdP will accept some random port glued on in the actual redirect uri (or maybe this behavior has been standardized now). I suspect that the IdPs may also accept the trailing slash case. Honestly, it's hard to see how you get out of this if you want to handle real client programs out in the wild.

(Some OIDC IdP software definitely does the standard compliant string comparison. The one I know of for sure is SimpleSAMLphp's OIDC module. Meanwhile, based on reading the source code, Dex uses a relaxed matching for localhost in its matching function, provided that there are no redirect uris registered for the client. Dex also still accepts the urn:ietf:wg:oauth:2.0:oob redirect uri, so I suspect that there are still uses out there in the field.)

If the program has its own embedded web browser that it's in full control of, it can do what Thunderbird appears to do (based on reading its source code). As far as I can tell, Thunderbird doesn't run a local listening server; instead it intercepts the HTTP redirection to 'http://localhost' itself. When the IdP sends the final HTTP redirect to localhost with the code embedded in the URL, Thunderbird effectively just grabs the code from the redirect URL in the HTTP reply and never actually issues a HTTP request to the redirect target.

The final option is to not run a localhost HTTP server and to tell people running your program that when their browser gives them an 'unable to connect' error at the end of the OIDC authentication process, they need to go to the URL bar and copy the 'code' query parameter into the program (or if you're being friendly, let them copy and paste the entire URL and you extract the code parameter). This allows your program to use a fixed redirect uri, including just 'http://localhost', because it doesn't have to be able to listen on it or on any fixed port.

(This is effectively a more secure but less user friendly version of the old 'copy a code that the website displayed' OAuth2 approach, and that approach wasn't all that user friendly to start with.)

PS: An OIDC redirect uri apparently allows things other than http:// and https:// URLs; there is, for example, the 'openid-credential-offer' scheme. I believe that the OIDC IdP doesn't particularly do anything with those redirect uris other than accept them and issue a HTTP redirect to them with the appropriate code attached. It's up to your local program or system to intercept HTTP requests for those schemes and react appropriately, much like Thunderbird does, but perhaps easier because you can probably register the program as handling all 'whatever-special://' URLs so the redirect is automatically handed off to it.

(I suspect that there are more complexities in the whole OIDC and OAuth2 redirect uri area, since I'm new to the whole thing.)

Some notes on configuring Dovecot to authenticate via OIDC/OAuth2

By: cks
15 March 2025 at 03:01

Suppose, not hypothetically, that you have a relatively modern Dovecot server and a shiny new OIDC identity provider server ('OP' in OIDC jargon, 'IdP' in common usage), and you would like to get Dovecot to authenticate people's logins via OIDC. Ignoring certain practical problems, the way this is done is for your mail clients to obtain an OIDC token from your IdP, provide it to Dovecot via SASL OAUTHBEARER, and then for Dovecot to do the critical step of actually validating that token it received is good, still active, and contains all the information you need. Dovecot supports this through OAuth v2.0 authentication as a passdb (password database), but in the usual Dovecot fashion, the documentation on how to configure the parameters for validating tokens with your IdP is a little bit lacking in explanations. So here are some notes.

If you have a modern OIDC IdP, it will support OpenID Connect Discovery, including the provider configuration request on the path /.well-known/openid-configuration. Once you know this, if you're not that familiar with OIDC things you can request this URL from your OIDC IdP, feed the result through 'jq .', and then use it to pick out the specific IdP URLs you want to set up in things like the Dovecot file with all of the OAuth2 settings you need. If you do this, the only URL you want for Dovecot is the userinfo_endpoint URL. You will put this into Dovecot's introspection_url, and you'll leave introspection_mode set to the default of 'auth'.
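
If you'd rather script this step than use curl and jq, here's a rough Go equivalent; the IdP URL is a placeholder that you'd replace with your own:

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
)

func main() {
    // Placeholder; substitute your own OIDC IdP's base URL.
    idp := "https://idp.example.org"
    resp, err := http.Get(idp + "/.well-known/openid-configuration")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    var cfg struct {
        Issuer                string `json:"issuer"`
        UserinfoEndpoint      string `json:"userinfo_endpoint"`
        IntrospectionEndpoint string `json:"introspection_endpoint"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&cfg); err != nil {
        panic(err)
    }
    // The userinfo endpoint is what goes into Dovecot's introspection_url.
    fmt.Println("issuer:", cfg.Issuer)
    fmt.Println("userinfo_endpoint:", cfg.UserinfoEndpoint)
    fmt.Println("introspection_endpoint:", cfg.IntrospectionEndpoint)
}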

You don't want to set tokeninfo_url to anything. This setting is (or was) used for validating tokens with OAuth2 servers before the introduction of RFC 7662. Back then, the de facto standard approach was to make a HTTP GET request to some URL with the token pasted on the end (cf), and it's this URL that is being specified. This approach was replaced with RFC 7662 token introspection, and then replaced again with OpenID Connect UserInfo. If both tokeninfo_url and introspection_url are set, as in Dovecot's example for Google, the former takes priority.

(Since I've just peered deep into the Dovecot source code, it appears that setting 'introspection_mode = post' actually performs an (unauthenticated) token introspection request. The 'get' mode seems to be the same as setting tokeninfo_url. I think that if you set the 'post' mode, you also want to set active_attribute and perhaps active_value, but I don't know what to, because otherwise you aren't necessarily fully validating that the token is still active. Does my head hurt? Yes. The moral here is that you should use an OIDC IdP that supports OpenID Connect UserInfo.)

If your IdP serves different groups and provides different 'issuer' ('iss') values to them, you may want to set the Dovecot 'issuers =' to the specific issuer that applies to you. You'll also want to set 'username_attribute' to whatever OIDC claim is where your IdP puts what you consider the Dovecot username, which might be the email address or something else.

It would be nice if Dovecot could discover all of this for itself when you set openid_configuration_url, but in the current Dovecot, all this does is put that URL in the JSON of the error response that's sent to IMAP clients when they fail OAUTHBEARER authentication. IMAP clients may or may not do anything useful with it.

As far as I can tell from the Dovecot source code, setting 'scope =' primarily requires that the token contains those scopes. I believe that this is almost entirely a guard against the IMAP client requesting a token without OIDC scopes that contain claims you need elsewhere in Dovecot. However, this only verifies OIDC scopes; it doesn't verify the presence of specific OIDC claims.

So what you want to do is check your OIDC IdP's /.well-known/openid-configuration URL to find out its collection of endpoints, then set:

# Modern OIDC IdP/OP settings
introspection_url = <userinfo_endpoint>
username_attribute = <some claim, eg 'email'>

# not sure but seems common in Dovecot configs?
pass_attrs = pass=%{oauth2:access_token}

# optionally:
openid_configuration_url = <stick in the URL>

# you may need:
tls_ca_cert_file = /etc/ssl/certs/ca-certificates.crt

The OIDC scopes that IMAP clients request when getting tokens should include a scope that provides the username_attribute claim (the 'email' scope if the claim is 'email'), and apparently the requested scopes should also include the offline_access scope.

If you want a test client to see if you've set up Dovecot correctly, one option is to appropriately modify a contributed Python program for Mutt (also the README), which has the useful property that it has an option to check all of IMAP, POP3, and authenticated SMTP once you've obtained a token. If you're just using it for testing purposes, you can change the 'gpg' stuff to 'cat' to just store the token with no fuss (and no security). Another option, which can be used for real IMAP clients too if you really want to, is an IMAP/etc OAuth2 proxy.

(If you want to use Mutt with OAuth2 with your IMAP server, see this article on it also, also, also. These days I would try quite hard to use age instead of GPG.)

Doing multi-tag matching through URLs on the modern web

By: cks
14 March 2025 at 02:46

So what happened is that Mike Hoye had a question about a perfectly reasonable idea:

Question: is there wiki software out there that handles tags (date, word) with a reasonably graceful URL approach?

As in, site/wiki/2020/01 would give me all the pages tagged as 2020 and 01, site/wiki/foo/bar would give me a list of articles tagged foo and bar.

I got nerd-sniped by a side question but then, because I'd been nerd-sniped, I started thinking about the whole thing and it got more and more hair-raising as a thing done in practice.

This isn't because the idea of stacking selections like this is bad; 'site/wiki/foo/bar' is a perfectly reasonable and good way to express 'a list of articles tagged foo and bar'. Instead, it's because of how everything on the modern web eventually gets visited combined with how, in the natural state of this feature, 'site/wiki/bar/foo' is just as valid a URL for 'articles tagged both foo and bar'.

The combination, plus the increasing tendency of things on the modern web to rattle every available doorknob just to see what happens, means that even if you don't advertise 'bar/foo', sooner or later things are going to try it. And if you do make the combinations discoverable through HTML links, crawlers will find them very fast. At a minimum this means crawlers will see a lot of essentially duplicated content, and you'll have to go through all of the work to do the searches and generate the page listings and so on.

If I was going to implement something like this, I would define a canonical tag order and then, as early in request processing as possible, generate a HTTP redirect from any non-canonical ordering to the canonical one. I wouldn't bother checking if the tags existed or anything, just determine that they are tags, put them in canonical order, and if the request order wasn't canonical, redirect. That way at least all of your work (and all of the crawler attention) is directed at one canonical version. Smart crawlers will notice that this is a redirect to something they already have (and hopefully not re-request it), and you can more easily use caching.

(And if search engines still matter, the search engines will see only your canonical version.)
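
A rough sketch of this canonicalization (in Go, with an assumed '/wiki/' URL layout and plain sorted order as the canonical order) might look like:

package main

import (
    "net/http"
    "sort"
    "strings"
)

// canonicalizeTags redirects, say, /wiki/foo/bar to /wiki/bar/foo
// (tags in sorted order) before doing any real work.
func canonicalizeTags(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        const prefix = "/wiki/"
        if !strings.HasPrefix(r.URL.Path, prefix) {
            next.ServeHTTP(w, r)
            return
        }
        tags := strings.Split(strings.Trim(strings.TrimPrefix(r.URL.Path, prefix), "/"), "/")
        canon := append([]string(nil), tags...)
        sort.Strings(canon)
        if strings.Join(tags, "/") != strings.Join(canon, "/") {
            http.Redirect(w, r, prefix+strings.Join(canon, "/"), http.StatusMovedPermanently)
            return
        }
        next.ServeHTTP(w, r)
    })
}

func main() {
    listing := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // ... the real work of generating the article listing goes here ...
        w.Write([]byte("articles for " + r.URL.Path + "\n"))
    })
    http.ListenAndServe("localhost:8080", canonicalizeTags(listing))
}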

This probably holds just as true for doing this sort of tag search through query parameters on GET queries; if you expose the result in a URL, you want to canonicalize it. However, GET query parameters are probably somewhat safer if you force people to form them manually and don't expose links to them. So far, web crawlers seem less likely to monkey around with query parameters than with URLs, based on my limited experience with the blog.

The commodification of desktop GUI behavior

By: cks
13 March 2025 at 03:08

Over on the Fediverse, I tried out a thesis:

Thesis: most desktop GUIs are not opinionated about how you interact with things, and this is why there are so many GUI toolkits and they make so little difference to programs, and also why the browser is a perfectly good cross-platform GUI (and why cross-platform GUIs in general).

Some GUIs are quite opinionated (eg Plan 9's Acme) but most are basically the same. Which isn't necessarily a bad thing but it creates a sameness.

(Custom GUIs are good for frequent users, bad for occasional ones.)

Desktop GUIs differ in how they look and to some extent in how you do certain things and how you expect 'native' programs to behave; I'm sure the fans of any particular platform can tell you all about little behaviors that they expect from native applications that imported ones lack. But I think we've pretty much converged on a set of fundamental behaviors for how to interact with GUI programs, or at least how to deal with basic ones, so in a lot of cases the question about GUIs is how things look, not how you do things at all.

(Complex programs have for some time been coming up with their own bespoke alternatives to, for example, huge cascades of menus. If these are successful they tend to get more broadly adopted by programs facing the same problems; consider the 'ribbon', which got what could be called a somewhat mixed reaction on its modern introduction.)

On the desktop, changing the GUI toolkit that a program uses (either on the same platform or on a different one) may require changing the structure of your code (in addition to ordinary code changes), but it probably won't change how your program operates. Things will look a bit different, maybe some standard platform features will appear or disappear, but it's not a completely different experience. This often includes moving your application from the desktop into the browser (a popular and useful 'cross-platform' environment in itself).

This is less true on mobile platforms, where my sense is that the two dominant platforms have evolved somewhat different idioms for how you interact with applications. A proper 'native' application behaves differently on the two platforms even if it's using mostly the same code base.

GUIs such as Plan 9's Acme show that this doesn't have to be the case; for that matter, so does GNU Emacs. GNU Emacs has a vague shell of a standard looking GUI but it's a thin layer over a much different and stranger vastness, and I believe that experienced Emacs people do very little interaction with it.

Some views on the common Apache modules for SAML or OIDC authentication

By: cks
12 March 2025 at 03:01

Suppose that you want to restrict access to parts of your Apache based website but you want something more sophisticated and modern than Apache Basic HTTP authentication. The traditional reason for this was to support 'single sign on' across all your (internal) websites; the modern reason is that a central authentication server is the easiest place to add full multi-factor authentication. The two dominant protocols for this are SAML and OIDC. There are commonly available Apache authentication modules for both protocols, in the form of Mellon (also) for SAML and OpenIDC for OIDC.

I've now used or at least tested the Ubuntu 24.04 version of both modules against the same SAML/OIDC identity provider, primarily because when you're setting up a SAML/OIDC IdP you need to be able to test it with something. Both modules work fine, but after my experiences I'm more likely to use OpenIDC than Mellon in most situations.

Mellon has two drawbacks and two potential advantages. The first drawback is that setting up a Mellon client ('SP') is more involved. Most of the annoying stuff is automated for you with the mellon_create_metadata script (which you can get from the Mellon repository if it's not in your Mellon package), but you still have to give your IdP your XML blob and get their XML blob. The other drawback is that Mellon isn't integrated into the Apache 'Require' framework for authorization decisions; instead you have to make do with Mellon-specific directives.

The first potential advantage is that Mellon has a straightforward story for protecting two different areas of your website with two different IdPs, if you need to do that for some reason; you can just configure them in separate <Location> or <Directory> blocks and everything works out. If anything, it's a bit non-obvious how to protect various disconnected bits of your URL space with the same IdP without having to configure multiple SPs, one for each protected section of URL space. The second potential advantage is that in general SAML has an easier story for your IdP giving you random information, and Mellon will happily export every SAML attribute it gets into the environment your CGI or web application gets.

The first advantage of OpenIDC is that it's straightforward to configure when you have a single IdP, with no XML and generally low complexity. It's also straightforward to protect multiple disconnected URL areas with the same IdP but possibly different access restrictions. A third advantage is that OpenIDC is integrated into Apache's 'Require' system, although you have to use OpenIDC specific syntax like 'Require claim groups:agroup' (see the OpenIDC wiki on authorization).

In exchange for this, it seems to be quite involved to use OpenIDC if you need to use multiple OIDC identity providers to protect different bits of your website. It's apparently possible to do this in the same virtual host but it seems quite complex and requires a lot of parts, so if I was confronted with this problem I would try very hard to confine each web thing that needed a different IdP into a different virtual host. And OpenIDC has the general OIDC problem that it's harder to expose random information.

(All of the important OpenIDC Apache directives about picking an IdP can't be put in <Location> or <Directory> blocks, only in a virtual host as a whole. If you care about this, see the wiki on Multiple Providers and also access to different URL paths on a per-provider basis.)

We're very likely to only ever be working with a single IdP, so for us OpenIDC is likely to be easier, although not hugely so.

Sidebar: The easy approach for group based access control with either

Both Mellon and OpenIDC work fine together with the traditional Apache AuthGroupFile directive, provided (of course) that you have or build an Apache format group file using what you've told Mellon or OpenIDC to use as the 'user' for Apache authentication. If your IdP is using the same user (and group) information as your regular system is, then you may well already have this information around.

(This is especially likely if you're migrating from Apache Basic HTTP authentication, where you already needed to build this sort of stuff.)

Building your own Apache group file has the additional benefit that you can augment and manipulate group information in ways that might not fit well into your IdP. Your IdP has the drawback that it has to be general; your generated Apache group file can be narrowly specific for the needs of a particular web area.
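
As a hedged sketch of what the OpenIDC version of this looks like (the location, group file path, and group name here are made up, and the OIDC client settings themselves are assumed to already be configured at the virtual host level):

# Protect an area with OpenIDC but authorize via a local group file.
<Location /internal/>
    AuthType openid-connect
    AuthGroupFile /etc/apache2/web-groups
    Require group sysadmins
</Location>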

The web browser as an enabler of minority platforms

By: cks
11 March 2025 at 03:35

Recently, I got involved in a discussion on the Fediverse over what I will simplify to the desirability (or lack of it) of cross platform toolkits, including the browser, and how they erase platform personality and opinions. This caused me to have a realization about what web browser based applications are doing for me, which is that being browser based is what lets me use them at all.

My environment is pretty far from being a significant platform; I think Unix desktop share is in the low single percent under the best of circumstances. If people had to develop platform specific versions of things like Grafana (which is a great application), they'd probably exist for Windows, maybe macOS, and at the outside, tablets (some applications would definitely exist on phones, but Grafana is a bit of a stretch). They probably wouldn't exist on Linux, especially not for free.

That the web browser is a cross platform environment means that I get these applications (including the Fediverse itself) essentially 'for free' (which is to say, it's because of the efforts of web browsers to support my platform and then give me their work for free). Developers of web applications don't have to do anything to make them work for me, not even so far as making it possible to build their software on Linux; it just happens for them without them even having to think about it.

Although I don't work in the browser as much as some people do, looking back the existence of implicitly cross platform web applications has been a reasonably important thing in letting me stick with Linux.

This applies to any minority platform, not just Linux. All you need is a sufficiently capable browser and you have access to a huge range of (web) applications.

(Getting that sufficiently capable browser can be a challenge on a sufficiently minority platform, especially if you're not on a major architecture. I'm lucky in that x86 Linux is a majority minority platform; people on FreeBSD or people on architectures other than x86 and 64-bit ARM may be less happy with the situation.)

PS: I don't know if what we have used the web for really counts as 'applications', since they're mostly HTML form based things once you peel a few covers off. But if they do count, the web has been critical in letting us provide them to people. We definitely couldn't have built local application versions of them for all of the platforms that people here use.

(I'm sure this isn't a novel thought, but the realization struck (or re-struck) me recently so I'm writing it down.)

How I got my nose rubbed in my screens having 'bad' areas for me

By: cks
10 March 2025 at 02:50

I wrote a while back about how my desktop screens now had areas that were 'good' and 'bad' for me, and mentioned that I had recently noticed this, calling it a story for another time. That time is now. What made me really notice this issue with my screens and where I had put some things on them was our central mail server (temporarily) stopping handling email because its load was absurdly high.

In theory I should have noticed this issue before a co-worker rebooted the mail server, because for a long time I've had an xload window from the mail server (among other machines, I have four xloads). Partly I did this so I could keep an eye on these machines and partly it's to help keep alive the shared SSH connection I also use for keeping an xrun on the mail server.

(In the past I had problems with my xrun SSH connections seeming to spontaneously close if they just sat there idle because, for example, my screen was locked. Keeping an xload running seemed to work around that; I assumed it was because xload keeps updating things even with the screen locked and so forced a certain amount of X-level traffic over the shared SSH connection.)

When the mail server's load went through the roof, I should have noticed that the xload for it had turned solid green (which is how xload looks under high load). However, I had placed the mail server's xload way off on the right side of my office dual screens, which put it outside my normal field of attention. As a result, I never noticed the solid green xload that would have warned me of the problem.

(This isn't where the xload was back on my 2011 era desktop, but at some point since then I moved it and some other xloads over to the right.)

In the aftermath of the incident, I relocated all of those xloads to a more central location, and also made my new Prometheus alert status monitor appear more or less centrally, where I'll definitely notice it.

(Some day I may do a major rethink about my entire screen layout, but most of the time that feels like yak shaving that I'd rather not touch until I have to, for example because I've been forced to switch to Wayland and an entirely different window manager.)

Sidebar: Why xload turns green under high load

Xload draws a horizontal tick line for every integer load average level it needs in order to display the maximum load average in its moving histogram. If the highest load average is 1.5, there will be one tick; if the highest load average is 10.2, there will be ten. Ticks are normally drawn in green. This means that as the load average climbs, xload draws more and more ticks, and after a certain point the entire xload display is just solid green from all of the tick lines.

This has the drawback that you don't know the shape of the load average (all you know is that at some point it got quite high), but the advantage that it's quite visually distinctive and you know you have a problem.

How SAML and OIDC differ in sharing information, and perhaps why

By: cks
9 March 2025 at 04:39

In practice, SAML and OIDC are two ways of doing third party web-based authentication (and thus a Single Sign On (SSO)) system; the web site you want to use sends you off to a SAML or OIDC server to authenticate, and then the server sends authentication information back to the 'client' web site. Both protocols send additional information about you along with the bare fact of an authentication, but they differ in how they do this.

In SAML, the SAML server sends a collection of 'attributes' back to the SAML client. There are some standard SAML attributes that client websites will expect, but the server is free to throw in any other attributes it feels like, and I believe that servers do things like turn every LDAP attribute they get from a LDAP user lookup into a SAML attribute (certainly SimpleSAMLphp does this). As far as I know, any filtering of what SAML attributes are provided by the server to any particular client is a server side feature, and SAML clients don't necessarily have any way of telling the SAML server what attributes they want or don't want.

In OIDC, the equivalent way of returning information is 'claims', which are grouped into 'scopes', along with basic claims that you get without asking for a scope. The expectation in OIDC is that clients that want more than the basic claims will request specific scopes and then get back (only) the claims for those scopes. There are standard scopes with standard claims (not all of which are necessarily returned by any given OIDC server). If you want to add additional information in the form of more claims, I believe that it's generally expected that you'll create one or more custom scopes for those claims and then have your OIDC clients request them (although not all OIDC clients are willing and able to handle custom scopes).

(I think in theory an OIDC server may be free to shove whatever claims it wants to into information for clients regardless of what scopes the client requested, but an OIDC client may ignore any information it didn't request and doesn't understand rather than pass it through to other software.)
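As a concrete illustration (with entirely made up values), an OIDC client that asks for the standard 'openid' and 'email' scopes might get back only a handful of claims along these lines:

{
  "iss": "https://idp.example.org",
  "sub": "a1b2c3d4e5",
  "email": "someone@example.org",
  "email_verified": true
}

while a SAML server in the same organization might well ship the client every attribute it pulled out of LDAP for that person, whether or not the client has any use for them.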

The SAML approach is more convenient for server and client administrators who are working within the same organization. The server administrator can add whatever information to SAML responses that's useful and convenient, and SAML clients will generally automatically pick it up and often make it available to other software. The OIDC approach is less convenient, since you need to create one or more additional scopes on the server and define what claims go in them, and then get your OIDC clients to request the new scopes; if an OIDC client doesn't update, it doesn't get the new information. However, the OIDC approach makes it easier for both clients and servers to be more selective and thus potentially for people to control how much information they give to who. An OIDC client can ask for only minimal information by only asking for a basic scope (such as 'email') and then the OIDC server can tell the person exactly what information they're approving being passed to the client, without the OIDC server administrators having to get involved to add client-specific attribute filtering.

(In practice, OIDC probably also encourages giving less information to even trusted clients in general since you have to go through these extra steps, so you're less likely to do things like expose all LDAP information as OIDC claims in some new 'our-ldap' scope or the like.)

My guess is that OIDC was deliberately designed this way partly in order to make it better for use with third party clients. Within an organization, SAML's broad sharing of information may make sense, but it makes much less sense in a cross-organization context, where you may be using OIDC-based 'sign in with <large provider>' on some unrelated website. In that sort of case, you certainly don't want that website to get every scrap of information that the large provider has on you, but instead only ask for (and get) what it needs, and for it to not get much by default.

The OpenID Connect (OIDC) 'sub' claim is surprisingly load-bearing

By: cks
8 March 2025 at 04:24

OIDC (OpenID Connect) is today's better or best regarded standard for (web-based) authentication. When a website (or something) authenticates you through an OpenID (identity) Provider (OP), one of the things it gets back is a bunch of 'claims', which is to say information about the authenticated person. One of the core claims is 'sub', which is vaguely described as a string that is 'subject - identifier for the end-user at the issuer'. As I discovered today, this claim is what I could call 'load bearing' in a surprising way or two.

In theory, 'sub' has no meaning beyond identifying the user in some opaque way. The first way it's load bearing is that some OIDC client software (a 'Relying Party (RP)') will assume that the 'sub' claim has a human useful meaning. For example, the Apache OpenIDC module defaults to putting the 'sub' claim into Apache's REMOTE_USER environment variable. This is fine if your OIDC IdP software puts, say, a login name into it; it is less fine if your OIDC IdP software wants to create 'sub' claims that look like 'YXVzZXIxMi5zb21laWRw'. These claims mean something to your server software but not necessarily to you and the software you want to use on (or behind) OIDC RPs.
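(If you're using mod_auth_openidc and your IdP puts something more meaningful into another claim, my understanding is that you can point REMOTE_USER at that claim instead. A one-line sketch, from memory, so check the module's documentation:

OIDCRemoteUserClaim preferred_username

Here 'preferred_username' is just one standard OIDC claim that an IdP might fill in with a login name.)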

The second and more surprising way that the 'sub' claim is load bearing involves how external consumers of your OIDC IdP keep track of your people. In common situations your people will be identified and authorized by their email address (using some additional protocols), which they enter into the outside OIDC RP that's authenticating against your OIDC IdP, and this looks like the identifier that the RP uses to keep track of them. However, at least one such OIDC RP assumes that the 'sub' claim for a given email address will never change, and I suspect that there are more people who either quietly use the 'sub' claim as the master key for accounts or who require 'sub' and the email address to be locked together this way.

This second issue makes the details of how your OIDC IdP software generates its 'sub' claim values quite important. You want it to be able to generate those 'sub' values in a clear and documented way that other OIDC IdP software can readily duplicate to create the same 'sub' values, and that won't change if you change some aspect of the OIDC IdP configuration for your current software. Otherwise you're at least stuck with your current OIDC IdP software, and perhaps with its exact current configuration (for authentication sources, internal names of things, and so on).

(If you have to change 'sub' values, for example because you have to migrate to different OIDC IdP software, this could go as far as the outside OIDC RP basically deleting all of their local account data for your people and requiring all of it to be entered back from scratch. But hopefully those outside parties have a better procedure than this.)

The problem facing MFA-enabled IMAP at the moment (in early 2025)

By: cks
7 March 2025 at 04:32

Suppose that you have an IMAP server and you would like to add MFA (Multi-Factor Authentication) protection to it. I believe that in theory the IMAP protocol supports multi-step 'challenge and response' style authentication, so again in theory you could implement MFA this way, but in practice this is unworkable because people would be constantly facing challenges. Modern IMAP clients (and servers) expect to be able to open and close connections more or less on demand, rather than opening one connection, holding it open, and doing everything over it. To make IMAP MFA practical, you need to do it with some kind of 'Single Sign On' (SSO) system. The current approach for this uses an OIDC identity provider for the SSO part and SASL OAUTHBEARER authentication between the IMAP client and the IMAP server, using information from the OIDC IdP.

So in theory, your IMAP client talks to your OIDC IdP to get a magic bearer token, provides this token to the IMAP server, the IMAP server verifies that it comes from a configured and trusted IdP, and everything is good. You only have to go through authenticating to your OIDC IdP SSO system every so often (based on whatever timeout it's configured with); the rest of the time the aggregate system does any necessary token refreshes behind the scenes. And because OIDC has a discovery process that can more or less start from your email address (as I found out), it looks like IMAP clients like Thunderbird could let you more or less automatically use any OIDC IdP if people had set up the right web server information.

If you actually try this right now, you'll find that Thunderbird, apparently along with basically all significant IMAP client programs, will only let you use a few large identity providers; here is Thunderbird's list (via). If you read through that Thunderbird source file, you'll find one reason for this limitation, which is that each provider has one or two magic values (the 'client ID' and usually the 'client secret', which is obviously not so secret here), in addition to URLs that Thunderbird could theoretically autodiscover if everyone supported the current OIDC autodiscovery protocols (my understanding is that not everyone does). In most current OIDC identity provider software, these magic values are either given to the IdP software or generated by it when you set up a given OIDC client program (a 'Relying Party (RP)' in the OIDC jargon).

This means that in order for Thunderbird (or any other IMAP client) to work with your own local OIDC IdP, there would have to be some process where people could load this information into Thunderbird. Alternately, Thunderbird could publish default values for these and anyone who wanted their OIDC IdP to work with Thunderbird would have to add these values to it. To date, creators of IMAP client software have mostly not supported either option and instead hard code a list of big providers who they've arranged more or less explicit OIDC support with.

(Honestly it's not hard to see why IMAP client authors have chosen this approach. Unless you're targeting a very technically inclined audience, walking people through the process of either setting this up in the IMAP client or verifying if a given OIDC IdP supports the client is daunting. I believe some IMAP clients can be configured for OIDC IdPs through 'enterprise policy' systems, but there the people provisioning the policies are supposed to be fairly technical.)

PS: Potential additional references on this mess include David North's article and this FOSDEM 2024 presentation (which I haven't yet watched, I only just stumbled into this mess).

A Prometheus gotcha with alerts based on counting things

By: cks
6 March 2025 at 04:39

Suppose, not entirely hypothetically, that you have some backup servers that use swappable HDDs as their backup media and expose that 'media' as mounted filesystems. Because you keep swapping media around, you don't automatically mount these filesystems and when you do manually try to mount them, it's possible to have some missing (if, for example, a HDD didn't get fully inserted and engaged with the hot-swap bay). To deal with this, you'd like to write a Prometheus alert for 'not all of our backup disks are mounted'. At first this looks simple:

count(
  node_filesystem_size_bytes{
         host = "backupserv",
         mountpoint =~ "/dumps/tapes/slot.*" }
) != <some number>

This will work fine most of the time and then one day it will fail to alert you to the fact that none of the expected filesystems are mounted. The problem is the usual one of PromQL's core nature as a set-based query language (we've seen this before). As long as there's at least one HDD 'tape' filesystem mounted, you can count them, but once there are none, the result of counting them is not 0 but nothing. As a result this alert rule won't produce any results when there are no 'tape' filesystems on your backup server.

Unfortunately there's no particularly good fix, especially if you have multiple identical backup servers and so the real version uses 'host =~ "bserv1|bserv2|..."'. In the single-host case, you can use either absent() or vector() to provide a default value. There's no good solution in the multi-host case, because there's no version of vector() that lets you set labels. If there was, you could at least write:

count( ... ) by (host)
  or vector(0, "host", "bserv1")
  or vector(0, "host", "bserv2")
  ....

(Technically you can set labels via label_replace(). Let's not go there; it's a giant pain for simply adding labels, especially if you want to add more than one.)
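For the single-host case, the vector() workaround I mentioned looks something like this (a sketch based on my example above, with the same placeholder for the expected count):

(
count(
  node_filesystem_size_bytes{
         host = "backupserv",
         mountpoint =~ "/dumps/tapes/slot.*" }
) or vector(0)
) != <some number>

The outer parentheses matter because 'or' binds less tightly than '!=' in PromQL, and the plain vector(0) can stand in for the count() result because count() without a 'by ()' produces a result with no labels.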

In my particular case, our backup servers always have some additional filesystems (like their root filesystem), so I can write a different version of the count() based alert rule:

count(
  node_filesystem_size_bytes{
         host =~ "bserv1|bserv2|...",
         fstype =~ "ext.*" }
) by (host) != <other number>

In theory this is less elegant because I'm not counting exactly what I care about (the number of 'tape' filesystems that are mounted) but instead something more general and potentially more variable (the number of extN filesystems that are mounted) that contains various assumptions about the systems. In practice the number is just as fixed as the number of 'tape' filesystems, and the broader set of labels will always match something, producing a count of at least one for each host.

(This would change if the standard root filesystem type changed in a future version of Ubuntu, but if that happened, we'd notice.)

PS: This might sound all theoretical and not something a reasonably experienced Prometheus person would actually do. But I'm writing this entry partly because I almost wrote a version of my first example as our alert rule, until I realized what would happen when there were no 'tape' filesystems mounted at all, which is something that happens from time to time for reasons outside the scope of this entry.

What SimpleSAMLphp's core:AttributeAlter does with creating new attributes

By: cks
5 March 2025 at 03:41

SimpleSAMLphp is a SAML identity provider (and other stuff). It's of deep interest to us because it's about the only SAML or OIDC IdP I can find that will authenticate users and passwords against LDAP and has a plugin that will do additional full MFA authentication against the university's chosen MFA provider (although you need to use a feature branch). In the process of doing this MFA authentication, we need to extract the university identifier to use for MFA authentication from our local LDAP data. Conveniently, SimpleSAMLphp has a module called core:AttributeAlter (a part of authentication processing filters) that is intended to do this sort of thing. You can give it a source, a pattern, a replacement that includes regular expression group matches, and a target attribute. In the syntax of its examples, this looks like the following:

 // the 65 is where this is ordered
 65 => [
    'class' => 'core:AttributeAlter',
    'subject' => 'gecos',
    'pattern' => '/^[^,]*,[^,]*,[^,]*,[^,]*,([^,]+)(?:,.*)?$/',
    'target' => 'mfaid',
    'replacement' => '\\1',
 ],

If you're an innocent person, you expect that your new 'mfaid' attribute will be undefined (or untouched) if the pattern does not match because the required GECOS field isn't set. This is not in fact what happens, and interested parties can follow along the rest of this in the source.

(All of this is as of SimpleSAMLphp version 2.3.6, the current release as I write this.)

The short version of what happens is that when the target is a different attribute and the pattern doesn't match, the target will wind up set but empty. Any previous value is lost. How this happens (and what happens) starts with the fact that 'attributes' here are actually arrays of values under the covers (this is '$attributes'). When core:AttributeAlter has a different target attribute than the source attribute, it takes all of the source attribute's values, passes each of them through a regular expression search and replace (using your replacement), and then gathers up anything that changed and sets the target attribute to this gathered collection. If the pattern doesn't match any values of the attribute (in the normal case, a single value), the array of changed things is empty and your target attribute is set to an empty PHP array.

(This is implemented with an array_diff() between the results of preg_replace() and the original attribute value array.)

My personal view is that this is somewhere around a bug; if the pattern doesn't match, I expect nothing to happen. However, the existing documentation is ambiguous (and incomplete, as the use of capture groups isn't particularly documented), so it might not be considered a bug by SimpleSAMLphp. Even if it is considered a bug I suspect it's not going to be particularly urgent to fix, since this particular case is unusual (or people would have found it already).

For my situation, perhaps what I want to do is to write some PHP code to do this extraction operation by hand, through core:PHP. It would be straightforward to extract the necessary GECOS field (or otherwise obtain the ID we need) in PHP, without fooling around with weird pattern matching and module behavior.

(Since I just looked it up, I believe that in the PHP code that core:PHP runs for you, you can use a PHP 'return' to stop without errors but without changing anything. This is relevant in my case since not all GECOS entries have the necessary information.)
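If I go that way, a sketch of the core:PHP version might look like the following (untested, and written from the module documentation as I remember it, so the details may need adjusting); it would replace the core:AttributeAlter filter above:

 65 => [
    'class' => 'core:PHP',
    'code' => '
        if (!isset($attributes["gecos"][0])) {
            return;
        }
        $fields = explode(",", $attributes["gecos"][0]);
        if (isset($fields[4]) && $fields[4] !== "") {
            $attributes["mfaid"] = [$fields[4]];
        }
    ',
 ],

Unlike the core:AttributeAlter behavior described above, this only ever sets 'mfaid' when there is actually a fifth GECOS field to set it to.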

If you get the chance, always run more extra network fiber cabling

By: cks
4 March 2025 at 04:22

Some day, you may be in an organization that's about to add some more fiber cabling between two rooms in the same building, or maybe two close by buildings, and someone may ask you for your opinion about how many fiber pairs should be run. My personal advice is simple: run more fiber than you think you need, ideally a bunch more (this generalizes to network cabling in general, but copper cabling is a lot more bulky and so harder to run (much) more of). There is such a thing as an unreasonable amount of fiber to run, but mostly that comes up when you'd have to put in giant fiber patch panels.

The obvious reason to run more fiber is that you may well expand your need for fiber in the future. Someone will want to run a dedicated, private network connection between two locations; someone will want to trunk things to get more bandwidth; someone will want to run a weird protocol that requires its own network segment (did you know you can run HDMI over Ethernet?); and so on. It's relatively inexpensive to add some more fiber pairs when you're already running fiber but much more expensive to have to run additional fiber later, so you might as well give yourself room for growth.

The less obvious reason to run extra fiber is that every so often fiber pairs stop working, just like network cables go bad, and when this happens you'll need to replace them with spare fiber pairs, which means you need those spare fiber pairs. Some of the time this fiber failure is (probably) because a raccoon got into your machine room, but some of the time it just happens for reasons that no one is likely to ever explain to you. And when this happens, you don't necessarily lose only a single pair. Today, for example, we lost three fiber pairs that ran between two adjacent buildings and evidence suggests that other people at the university lost at least one more pair.

(There are a variety of possible causes for sudden loss of multiple pairs, probably all running through a common path, which I will leave to your imagination. These fiber runs are probably not important enough to cause anyone to do a detailed investigation of where the fault is and what happened.)

Fiber comes in two varieties, single mode and multi-mode. I don't know enough to know if you should make a point of running both (over distances where either can be used) as part of the whole 'run more fiber' thing. Locally we have both SM and MM fiber and have switched back and forth between them at times (and may have to do so as a result of the current failures).

PS: Possibly you work in an organization where broken inside-building fiber runs are regularly fixed or replaced. That is not our local experience; someone has to pay for fixing or replacing, and when you have spare fiber pairs left it's easier to switch over to them rather than try to come up with the money and so on.

(Repairing or replacing broken fiber pairs will reduce your long term need for additional fiber, but obviously not the short term need. If you lose N pairs of fiber, you need N spare pairs to get back into operation.)

Updating local commits with more changes in Git (the harder way)

By: cks
3 March 2025 at 03:34

One of the things I do with Git is maintain personal changes locally on top of the upstream version, with my changes updated via rebasing every time I pull upstream to update it. In the simple case, I have only a single local change and commit, but in more complex cases I split my changes into multiple local commits; my local version of Firefox currently carries 12 separate personal commits. Every so often, upstream changes something that causes one of those personal changes to need an update, without actually breaking the rebase of that change. When this happens I need to update my local commit with more changes, and often it's not the 'top' local commit (which can be updated simply).

In theory, the third party tool git-absorb should be ideal for this, and I believe I've used it successfully for this purpose in the past. In my most recent instance, though, git-absorb frustratingly refused to do anything in a situation where it felt it should work fine. I had an additional change to a file that was changed in exactly one of my local commits, which feels like an easy case.

(Reading the git-absorb readme carefully suggests that I may be running into a situation where my new change doesn't clash with any existing change. This makes git-absorb more limited than I'd like, but so it goes.)

In Git, what I want is called a 'fixup commit', and how to use it is covered in this Stackoverflow answer. The sequence of commands is basically:

# modify some/file with new changes, then
git add some/file

# Use this to find your existing commit ID
git log some/file

# with the existing commit ID
git commit --fixup=<commit ID>
git rebase --interactive --autosquash <commit ID>^

This will open an editor buffer with what 'git rebase' is about to do, which I can immediately exit out of because the defaults are exactly what I want (assuming I don't want to shuffle around the order of my local commits, which I probably don't, especially as part of a fixup).

I can probably also use 'origin/main' instead of '<commit ID>^', but that will rebase more things than is strictly necessary. And I need the commit ID for the 'git commit --fixup' invocation anyway.

(Sufficiently experienced Git people can probably put together a script that would do this automatically. It would get all of the files staged in the index, find the most recent commit that modified each of them, abort if they're not all the same commit, make a fixup commit to that most recent commit, and then potentially run the 'git rebase' for you.)
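Here's a rough sketch of what such a script could look like (untested and minimal; among other things it doesn't handle staged files that have never been committed, or filenames with spaces):

#!/bin/sh
# Make a fixup commit for whatever is currently staged, then autosquash it.
set -e

files=$(git diff --cached --name-only)
if [ -z "$files" ]; then
    echo "nothing staged" 1>&2
    exit 1
fi

target=
for f in $files; do
    c=$(git rev-list -1 HEAD -- "$f")
    if [ -z "$target" ]; then
        target="$c"
    elif [ "$c" != "$target" ]; then
        echo "staged files were last changed in different commits" 1>&2
        exit 1
    fi
done

git commit --fixup="$target"
# This still opens the rebase todo list in your editor, as described above.
git rebase --interactive --autosquash "$target"^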

Using PyPy (or thinking about it) exposed a bug in closing files

By: cks
2 March 2025 at 03:20

Over on the Fediverse, I said:

A fun Python error some code can make and not notice until you run it under PyPy is a function that has 'f.close' at the end instead of 'f.close()' where f is an open()'d file.

(Normal CPython will immediately close the file when the function returns due to refcounted GC. PyPy uses non-refcounted GC so the file remains open until GC happens, and so you can get too many files open at once. Not explicitly closing files is a classic PyPy-only Python bug.)

When a Python file object is garbage collected, Python arranges to close the underlying C level file descriptor if you didn't already call .close(). In CPython, garbage collection is deterministic and generally prompt; for example, when a function returns, all of its otherwise unreferenced local variables will be garbage collected as their reference counts drop to zero. However, PyPy doesn't use reference counting for its garbage collection; instead, like Go, it only collects garbage periodically, and so will only close files as a side effect some time later. This can make it easy to build up a lot of open files that aren't doing anything, and possibly run your program out of available file descriptors, something I've run into in the past.

I recently wanted to run a hacked up version of a NFS monitoring program written in Python under PyPy instead of CPython, so it would run faster and use less CPU on the systems I was interested in. Since I remembered this PyPy issue, I found myself wondering if it properly handled closing the file(s) it had to open, or if it left it to CPython garbage collection. When I looked at the code, what I found can be summarized as 'yes and no':

def parse_stats_file(filename):
  [...]
  f = open(filename)
  [...]
  f.close

  return ms_dict

Because I was specifically looking for uses of .close(), the lack of the '()' immediately jumped out at me (and got fixed in my hacked version).

It's easy to see how this typo could linger undetected in CPython. The line 'f.close' itself does nothing but isn't an error, and then 'f' is implicitly closed in the next line, as part of the 'return', so even if you're looking at this program's file descriptor usage while it's running you won't see any leaks.

(I'm not entirely a fan of nondeterministic garbage collection, at least in the context of Python, where deterministic GC was a long standing feature of the language in practice.)

Always sync your log or journal files when you open them

By: cks
1 March 2025 at 03:10

Today I learned of a new way to accidentally lose data 'written' to disk, courtesy of this Fediverse post summarizing a longer article about CouchDB and this issue. Because this is so nifty and startling when I encountered it, yet so simple, I'm going to re-explain the issue in my own words and explain how it leads to the title of this entry.

Suppose that you have a program that makes data it writes to disk durable through some form of journal, write ahead log (WAL), or the like. As we all know, data that you simply write() to the operating system isn't yet on disk; the operating system is likely buffering the data in memory before writing it out at the OS's own convenience. To make the data durable, you must explicitly flush it to disk (well, ask the OS to), for example with fsync(). Your program is a good program, so of course it does this; when it updates the WAL, it write()s then fsync()s.

Now suppose that your program is terminated after the write but before the fsync. At this point you have a theoretically incomplete and improperly written journal or WAL, since it hasn't been fsync'd. However, when your program restarts and goes through its crash recovery process, it has no way to discover this. Since the data was written (into the OS's disk cache), the OS will happily give the data back to you even though it's not yet on disk. Now assume that your program takes further actions (such as updating its main files) based on the belief that the WAL is fully intact, and then the system crashes, losing that buffered and not yet written WAL data. Oops. You (potentially) have a problem.

(These days, programs can get terminated for all sorts of reasons other than a program bug that causes a crash. If you're operating in a modern containerized environment, your management system can decide that your program or its entire container ought to shut down abruptly right now. Or something else might have run the entire system out of memory and now some OOM handler is killing your program.)

To avoid the possibility of this problem, you need to always force a disk flush when you open your journal, WAL, or whatever; on Unix, you'd immediately fsync() it. If there's no unwritten data, this will generally be more or less instant. If there is unwritten data because you're restarting after the program was terminated by surprise, this might take a bit of time but ensures that the on-disk state matches the state that you're about to observe through the OS.
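As a minimal Python sketch of this 'sync on open' idea (the names and structure here are mine, purely for illustration):

import os

def open_wal(path):
    # Open (or create) the log and immediately flush it, so anything a
    # crashed predecessor wrote but never got to fsync() is on disk before
    # we trust what we read from it.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o600)
    os.fsync(fd)
    return fd

def append_record(fd, data):
    # The usual write-then-fsync to make each record durable.
    os.write(fd, data)
    os.fsync(fd)

A real WAL obviously needs record framing, checksums, and error handling on top of this; the point is only the fsync() right after the open().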

(CouchDB's article points to another article, Justin Jaffray’s NULL BITMAP Builds a Database #2: Enter the Memtable, which has a somewhat different way for this failure to bite you. I'm not going to try to summarize it here but you might find the article interesting reading.)

Using Netplan to set up WireGuard on Ubuntu 22.04 works, but has warts

By: cks
28 February 2025 at 04:07

For reasons outside the scope of this entry, I recently needed to set up WireGuard on an Ubuntu 22.04 machine. When I did this before for an IPv6 gateway, I used systemd-networkd directly. This time around I wasn't going to set up a single peer and stop; I expected to iterate and add peers several times, which made netplan's ability to update and re-do your network configuration look attractive. Also, our machines are already using Netplan for their basic network configuration, so this would spare my co-workers from having to learn about systemd-networkd.

Conveniently, Netplan supports multiple configuration files so you can put your WireGuard configuration into a new .yaml file in your /etc/netplan. The basic version of a WireGuard endpoint with purely internal WireGuard IPs is straightforward:

network:
  version: 2
  tunnels:
    our-wg0:
      mode: wireguard
      addresses: [ 192.168.X.1/24 ]
      port: 51820
      key:
        private: '....'
      peers:
        - keys:
            public: '....'
          allowed-ips: [ 192.168.X.10/32 ]
          keepalive: 90
          endpoint: A.B.C.D:51820

(You may want something larger than a /24 depending on how many other machines you think you'll be talking to. Also, this configuration doesn't enable IP forwarding, which is a feature in our particular situation.)

If you're using netplan's systemd-networkd backend, which you probably are on an Ubuntu server, you can apparently put your keys into files instead of needing to carefully guard the permissions of your WireGuard /etc/netplan file (which normally has your private key in it).

If you write this out and run 'netplan try' or 'netplan apply', it will duly apply all of the configuration and bring your 'our-wg0' WireGuard configuration up as you expect. The problems emerge when you change this configuration, perhaps to add another peer, and then re-do your 'netplan try', because when you look you'll find that your new peer hasn't been added. This is a sign of a general issue; as far as I can tell, netplan (at least in Ubuntu 22.04) can set up WireGuard devices from scratch but it can't update anything about their WireGuard configuration once they're created. This is probably a limitation in the Ubuntu 22.04 version of systemd-networkd that's only changed in the very latest systemd versions. In order to make WireGuard level changes, you need to remove the device, for example with 'ip link del dev our-wg0' and then re-run 'netplan try' (or 'netplan apply') to re-create the WireGuard device from scratch; the recreated version will include all of your changes.
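In command form, the update dance mentioned above is just (as root, after editing the netplan YAML to add or change peers):

ip link del dev our-wg0
netplan try

(Or 'netplan apply' instead of 'netplan try', depending on which you prefer.)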

(The latest online systemd.netdev manual page says that systemd-networkd will try to update netdev configurations if they change, and .netdev files are where WireGuard settings go. The best information I can find is that this change appeared in systemd v257, although the Fedora 41 systemd.netdev manual page has this same wording and it has systemd '256.11'. Maybe there was a backport into Fedora.)

In our specific situation, deleting and recreating the WireGuard device is harmless and we're not going to be doing it very often anyway. In other configurations things may not be so straightforward and so you may need to resort to other means to apply updates to your WireGuard configuration (including working directly through the 'wg' tool).

I'm not impressed by the state of NFS v4 in the Linux kernel

By: cks
27 February 2025 at 04:15

Although NFS v4 is (in theory) the latest great thing in NFS protocol versions, for a long time we only used NFS v3 for our fileservers and our Ubuntu NFS clients. A few years ago we switched to NFS v4 due to running into a series of problems our people were experiencing with NFS (v3) locks (cf); NFS v4 locks are integrated into the protocol, and NFS v4 is the 'modern' NFS version that's probably receiving more attention than anything to do with NFS v3.

(NFS v4 locks are handled relatively differently than NFS v3 locks.)

Moving to NFS v4 did fix our NFS lock issues in that stuck NFS locks went away, when before they'd been a regular issue on our IMAP server. However, all has not turned out to be roses, and the result has left me not really impressed with the state of NFS v4 in the Linux kernel. In Ubuntu 22.04's 5.15.x server kernel, we've now run into scalability issues in both the NFS server (which is what sparked our interest in how many NFS server threads to run and what NFS server threads do in the kernel), and now in the NFS v4 client (where I have notes that let me point to a specific commit with the fix).

(The NFS v4 server issue we encountered may be the one fixed by this commit.)

What our two issues have in common is that both are things that you only find under decent or even significant load. That these issues both seem to have still been present as late as kernels 6.1 (server) and 6.6 (client) suggests that neither the Linux NFS v4 server nor the Linux NFS v4 client had been put under serious load until then, or at least not by people who could diagnose their problems precisely enough to identify the problem and get kernel fixes made. While both issues are probably fixed now, their past presence leaves me wondering what other scalability issues are lurking in the kernel's NFS v4 support, partly because people have mostly been using NFS v3 until recently (like us).

We're not going to go back to NFS v3 in general (partly because of the clear improvement in locking), and the server problem we know about has been wiped away because we're moving our NFS fileservers to Ubuntu 24.04 (and some day the NFS clients will move as well). But I'm braced for further problems, including ones in 24.04 that we may be stuck with for a while.

PS: I suspect that part of the issues may come about because the Linux NFS v4 client and the Linux NFS v4 server don't add NFS v4 operations at the same time. As I found out, the server supports more operations than the client uses but the client's use is of whatever is convenient and useful for it, not necessarily by NFS v4 revision. If the major use of Linux NFS v4 servers is with v4 clients, this could leave the server implementation of operations under-used until the client starts using them (and people upgrade clients to kernel versions with that support).

MFA's "push notification" authentication method can be easier to integrate

By: cks
26 February 2025 at 03:59

For reasons outside the scope of this entry, I'm looking for an OIDC or SAML identity provider that supports primary user and password authentication against our own data and then MFA authentication through the university's SaaS vendor. As you'd expect, the university's MFA SaaS vendor supports all of the common MFA approaches today, covering push notifications through phones, one time codes from hardware tokens, and some other stuff. However, pretty much all of the MFA integrations I've been able to find only support MFA push notifications (eg, also). When I thought about it, this made a lot of sense, because it's often going to be much easier to add push notification MFA than any other form of it.

A while back I wrote about exploiting password fields for multi-factor authentication, where various bits of software hijacked password fields to let people enter things like MFA one time codes into systems (like OpenVPN) that were never set up for MFA in the first place. With most provider APIs, authentication through push notification can usually be inserted in a similar way, because from the perspective of the overall system it can be a synchronous operation. The overall system calls a 'check' function of some sort, the check function calls out to the provider's API and then possibly polls for a result for a while, and then it returns a success or a failure. There's no need to change the user interface of authentication or add additional high level steps.

(The exception is if the MFA provider's push authentication API only returns results to you by making a HTTP query to you. But I think that this would be a relatively weird API; a synchronous reply or at least a polled endpoint is generally much easier to deal with and is more or less required to integrate push authentication with non-web applications.)
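To make the shape of this concrete, a synchronous 'check' hook for push MFA can look roughly like the following sketch (the provider object and its 'start_push' and 'poll' methods are hypothetical stand-ins for whatever API your MFA vendor actually exposes):

import time

def check_push_mfa(provider, username, timeout=60, poll_interval=2):
    # Send the push, then poll until it's approved, denied, or we give up.
    txn = provider.start_push(username)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = provider.poll(txn)
        if status == "approved":
            return True
        if status == "denied":
            return False
        time.sleep(poll_interval)
    return False

The overall system calls this wherever it would call any other password or authentication check, which is why push MFA is comparatively easy to bolt on.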

By contrast, if you need to get a one time code from the person, you have to do things at a higher level and it may not fit well in the overall system's design (or at least the easily exposed points for plugins and similar things). Instead of immediately returning a successful or failed authentication, you now need to display an additional prompt (in many cases, a HTML page), collect the data, and only then can you say yes or no. In a web context (such as a SAML or OIDC IdP), the provider may want you to redirect the user to their website and then somehow call you back with a reply, which you'll have to re-associate with context and validate. All of this assumes that you can even interpose an additional prompt and reply, which isn't the case in some contexts unless you do extreme things.

(Sadly this means that if you have a system that only supports MFA push authentication and you need to also accept codes and so on, you may be in for some work with your chainsaw.)

Go's behavior for zero value channels and maps is partly a choice

By: cks
25 February 2025 at 04:30

How Go behaves if you have a zero value channel or map (a 'nil' channel or map) is somewhat confusing (cf, via). When we talk about it, it's worth remembering that this behavior is a somewhat arbitrary choice on Go's part, not a fundamental set of requirements that stems from, for example, other language semantics. Go has reasons to have channels and maps behave as they do, but some of those reasons have to do with how channel and map values are implemented and some are about what's convenient for programming.

As hinted at by how their zero value is called a 'nil' value, channel and map values are both implemented as pointers to runtime data structures. A nil channel or map has no such runtime data structure allocated for it (and the pointer value is nil); these structures are allocated by make(). However, this doesn't entirely allow us to predict what happens when you use nil values of either type. It's not unreasonable for an attempt to assign an element to a nil map to panic, since the nil map has no runtime data structure allocated to hold anything we try to put in it. But you don't have to say that a nil map is empty and looking up elements in it gives you a zero value; I think you could have this panic instead, just as assigning an element does. However, this would probably result in less safe code that panicked more (and probably had more checks for nil maps, too).

Then there's nil channels, which don't behave like nil maps. It would make sense for receiving from a nil channel to yield the zero value, much like looking up an element in a nil map, and for sending to a nil channel to panic, again like assigning to an element in a nil map (although in the channel case it would be because there's no runtime data structure where your goroutine could metaphorically hang its hat waiting for a receiver). Instead Go chooses to make both operations (permanently) block your goroutine, with panicking on send reserved for sending to a non-nil but closed channel.

The current semantics of sending on a closed channel combined with select statements (and to a lesser extent receiving from a closed channel) means that Go needs a channel zero value that is never ready to send or receive. However, I believe that Go could readily make actual sends or receives on nil channels panic without any language problems. As a practical matter, sending or receiving on a nil channel is a bug that will leak your goroutine even if your program doesn't deadlock.

Similarly, Go could choose to allocate an empty map runtime data structure for zero value maps, and then let you assign to elements in the resulting map rather than panicing. If desired, I think you could preserve a distinction between empty maps and nil maps. There would be some drawbacks to this that cut against Go's general philosophy of being relatively explicit about (heap) allocations and you'd want a clever compiler that didn't bother creating those zero value runtime map data structures when they'd just be overwritten by 'make()' or a return value from a function call or the like.

(I can certainly imagine a quite Go like language where maps don't have to be explicitly set up any more than slices do, although you might still use 'make()' if you wanted to provide size hints to the runtime.)

Sidebar: why you need something like nil channels

We all know that sometimes you want to stop sending or receiving on a channel in a select statement. On first impression it looks like closing a channel (instead of setting the channel to nil) could be made to work for this (it doesn't currently). The problem is that closing a channel is a global thing, while you may only want a local effect; you want to remove the channel from your select, but not close down other uses of it by other goroutines.

This need for a local effect pretty much requires a special, distinct channel value that is never ready for sending or receiving, so you can overwrite the old channel value with this special value, which we might as well call a 'nil channel'. Without a channel value that serves this purpose you'd have to complicate select statements with some other way to disable specific channels.

(I had to work this out in my head as part of writing this entry so I might as well write it down for my future self.)
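Here's a small, self-contained Go illustration of this idiom (my own example, just to show the shape):

package main

import "fmt"

// merge copies values from a and b to out, dropping each input from the
// select by setting it to nil once that input is closed.
func merge(a, b <-chan int, out chan<- int) {
    for a != nil || b != nil {
        select {
        case v, ok := <-a:
            if !ok {
                a = nil // stop selecting on a locally; other users are unaffected
                continue
            }
            out <- v
        case v, ok := <-b:
            if !ok {
                b = nil
                continue
            }
            out <- v
        }
    }
    close(out)
}

func main() {
    a := make(chan int, 1)
    b := make(chan int, 1)
    out := make(chan int)
    a <- 1
    close(a)
    b <- 2
    close(b)
    go merge(a, b, out)
    for v := range out {
        fmt.Println(v)
    }
}

Setting a or b to nil only changes this goroutine's local channel variable, which is exactly the local (not global) effect discussed above.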

JSON has become today's machine-readable output format (on Unix)

By: cks
24 February 2025 at 04:26

Recently, I needed to delete about 1,200 email messages to a particular destination from the mail queue on one of our systems. This turned out to be trivial, because this system was using Postfix and modern versions of Postfix can output mail queue status information in JSON format. So I could dump the mail queue status, select the relevant messages and print the queue IDs with jq, and feed this to Postfix to delete the messages. This experience has left me with the definite view that everything should have the option to output JSON for 'machine-readable' output, rather than some bespoke format. For new programs, I think that you should only bother producing JSON as your machine readable output format.

(If you strongly object to JSON, sure, create another machine readable output format too. But if you don't care one way or another, outputting only JSON is probably the easiest approach for programs that don't already have such a format of their own.)
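To make the Postfix case concrete, the pipeline was essentially this sort of thing (a sketch from memory; check the actual field names in your Postfix's 'postqueue -j' output and adjust the jq selection to taste):

postqueue -j | \
  jq -r 'select(any(.recipients[]; .address | endswith("@example.org"))) | .queue_id' | \
  postsuper -d -

Here 'postqueue -j' emits one JSON object per queued message, jq picks out the queue IDs of the messages to the destination of interest, and 'postsuper -d -' deletes the queue IDs it reads from standard input.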

This isn't because JSON is the world's best format (JSON is at best the least bad format). Instead it's because JSON has a bunch of pragmatic virtues on a modern Unix system. In general, JSON provides a clear and basically unambiguous way to represent text data and much numeric data, even if it has relatively strange characters in it (ie, JSON has escaping rules that everyone knows and all tools can deal with); it's also generally extensible to add additional data without causing heartburn in tools that are dealing with older versions of a program's output. And on Unix there's an increasingly rich collection of tools to deal with and process JSON, starting with jq itself (and hopefully soon GNU Awk in common configurations). Plus, JSON can generally be transformed to various other formats if you need them.

(JSON can also be presented and consumed in either multi-line or single line formats. Multi-line output is often much more awkward to process in other possible formats.)

There's nothing unique about JSON in all of this; it could have been any other format with similar virtues where everything lined up this way for the format. It just happens to be JSON at the moment (and probably well into the future), instead of (say) XML. For individual programs there are simpler 'machine readable' output formats, but they either have restrictions on what data they can represent (for example, no spaces or tabs in text), or require custom processing that goes well beyond basic grep and awk and other widely available Unix tools, or both. But JSON has become a "narrow waist" for Unix programs talking to each other, a common coordination point that means people don't have to invent another format.

(JSON is also partially self-documenting; you can probably look at a program's JSON output and figure out what various parts of it mean and how it's structured.)

PS: Using JSON also means that people writing programs don't have to design their own machine-readable output format. Designing a machine readable output format is somewhat more complicated than it looks, so I feel that the less of it people need to do, the better.

(I say this as a system administrator who's had to deal with a certain amount of output formats that have warts that make them unnecessarily hard to deal with.)

Institutions care about their security threats, not your security threats

By: cks
23 February 2025 at 03:45

Recently I was part of a conversation on the Fediverse that sparked an obvious in retrospect realization about computer security and how we look at and talk about security measures. To put it succinctly, your institution cares about threats to it, not about threats to you. It cares about threats to you only so far as they're threats to it through you. Some of the security threats and sensible responses to them overlap between you and your institution, but some of them don't.

One of the areas where I think this especially shows up is in issues around MFA (Multi-Factor Authentication). For example, it's a not infrequently observed thing that if all of your factors live on a single device, such as your phone, then you actually have single factor authentication (this can happen with many of the different ways to do MFA). But for many organizations, this is relatively fine (for them). Their largest risk is that Internet attackers are constantly trying to (remotely) phish their people, often in moderately sophisticated ways that involve some prior research (which is worth it for the attackers because they can target many people with the same research). Ignoring MFA alert fatigue for a moment, even a single factor physical device will cut off all of this, because Internet attackers don't have people's smartphones.

For individual people, of course, this is potentially a problem. If someone can gain access to your phone, they get everything, and probably across all of the online services you use. If you care about security as an individual person, you want attackers to need more than one thing to get all of your accounts. Conversely, for organizations, compromising all of their systems at once is sort of a given, because that's what it means to have a Single Sign On system and global authentication. Only a few organizational systems will be separated from the general SSO (and organizations have to hope that their people cooperate by using different access passwords).

Organizations also have obvious solutions to things like MFA account recovery. They can establish and confirm the identities of people associated with them, and a process to establish MFA in the first place, so if you lose whatever lets you do MFA (perhaps your work phone's battery has gotten spicy), they can just run you through the enrollment process again. Maybe there will be a delay, but if so, the organization has broadly decided to tolerate it.

(And I just recently wrote about the difference between 'internal' accounts and 'external' accounts, where people generally know who is in an organization and so has an account, so allowing this information to leak in your authentication isn't usually a serious problem.)

Another area where I think this difference in the view of threats is in the tradeoffs involved in disk encryption on laptops and desktops used by people. For an organization, choosing non-disclosure over availability on employee devices makes a lot of sense. The biggest threat as the organization sees it isn't data loss on a laptop or desktop (especially if they write policies about backups and where data is supposed to be stored), it's an attacker making off with one and having the data disclosed, which is at least bad publicity and makes the executives unhappy. You may feel differently about your own data, depending on how your backups are.

HTTP connections are part of the web's long tail

By: cks
22 February 2025 at 03:32

I recently read an article that, among other things, apparently seriously urged browser vendors to deprecate and disable plain text HTTP connections by the end of October of this year (via, and I'm deliberately not linking directly to the article). While I am a strong fan of HTTPS in general, I have some feelings about a rapid deprecation of HTTP. One of my views is that plain text HTTP is part of the web's long tail.

As I'm using the term here, the web's long tail (also) is the huge mass of less popular things that are individually less frequently visited but which in aggregate amount to a substantial part of the web. The web's popular, busy sites are frequently updated and can handle transitions without problems. They can readily switch to using modern HTML, modern CSS, modern JavaScript, and so on (although they don't necessarily do so), and along with that update all of their content to HTTPS. In fact they mostly or entirely have done so over the last ten to fifteen years. The web's long tail doesn't work like that. Parts of it use old JavaScript, old CSS, old HTML, and these days, plain HTTP (in addition to the people who have objections to HTTPS and deliberately stick to HTTP).

The aggregate size and value of the long tail is part of why browsers have maintained painstaking compatibility back to old HTML so far, including things like HTML Image Maps. There's plenty of parts of the long tail that will never be updated to have HTTPS or work properly with it. For browsers to discard HTTP anyway would be to discard that part of the long tail, which would be a striking break with browser tradition. I don't think this is very likely and I certainly hope that it never comes to pass, because that long tail is part of what gives the web its value.

(It would be an especially striking break since a visible percentage of page loads still happen with HTTP instead of HTTPS. For example, Google's stats say that globally 5% of Windows Chrome page loads apparently still use HTTP. That's roughly one in twenty page loads, and the absolute number is going to be very large given how many page loads happen with Chrome on Windows. This large number is one reason I don't think this is at all a serious proposal; as usual with this sort of thing, it ignores that social problems are the ones that matter.)

PS: Of course, not all of the HTTP connections are part of the web's long tail as such. Some of them are to, for example, manage local devices via little built in web servers that simply don't have HTTPS. The people with these devices aren't in any rush to replace them just because some people don't like HTTP, and the vendors who made them aren't going to update their software to support (modern) HTTPS even for the devices which support firmware updates and where the vendor is still in business.

(You can view them as part of the long tail of 'the web' as a broad idea and interface, even though they're not exposed to the world the way that the (public) web is.)

It's good to have offline contact information for your upstream networking

By: cks
21 February 2025 at 03:42

So I said something on the Fediverse:

Current status: it's all fun and games until the building's backbone router disappears.

A modest suggestion: obtain problem reporting/emergency contact numbers for your upstream in advance and post them on the wall somewhere. But you're on your own if you use VOIP desk phones.

(It's back now or I wouldn't be posting this, I'm in the office today. But it was an exciting 20 minutes.)

(I was somewhat modeling the modest suggestion after nuintari's Fediverse series of "rules of networking", eg, also.)

The disappearance of the building's backbone router took out all local networking in the particular building that this happened in (which is the building with our machine room), including the university wireless in the building. The disappearance of the wireless was especially surprising, because the wireless SSID disappeared entirely.

(My assumption is that the university's enterprise wireless access points stopped advertising the SSID when they lost some sort of management connection to their control plane.)

In a lot of organizations you might have been able to relatively easily find the necessary information even with this happening. For example, people might have smartphones with data plans and laptops that they could tether to the smartphones, and then use this to get access to things like the university directory, the university's problem reporting system, and so on. For various reasons, we didn't really have any of this available, which left us somewhat at a loss when the external networking evaporated. Ironically we'd just managed to finally find some phone numbers and get in touch with people when things came back.

(One bit of good news is that our large scale alert system worked great to avoid flooding us with internal alert emails. My personal alert monitoring (also) did get rather noisy, but that also let me see right away how bad it was.)

Of course there's always things you could do to prepare, much like there are often too many obvious problems to keep track of them all. But in the spirit of not stubbing our toes on the same problem a second time, I suspect we'll do something to keep some problem reporting and contact numbers around and available.

Shared (Unix) hosting and the problem of managing resource limits

By: cks
20 February 2025 at 03:14

Yesterday I wrote about how one problem with shared Unix hosting was the lack of good support for resource limits in the Unixes of the time. But even once you have decent resource limits, you still have an interlinked set of what we could call 'business' problems. These are the twin problems of what resource limits you set on people and how you sell different levels of these resources limits to your customers.

(You may have the first problem even for purely internal resource allocation on shared hosts within your organization, and it's never a purely technical decision.)

The first problem is whether you overcommit what you sell and in general how you decide on the resource limits. Back in the big days of the shared hosting business, I believe that overcommitting was extremely common; servers were expensive and most people didn't use much resources on average. If you didn't overcommit your servers, you had to charge more and most people weren't interested in paying that. Some resources, such as CPU time, are 'flow' resources that can be rebalanced on the fly, restricting everyone to a fair share when the system is busy (even if that share is below what they're nominally entitled to), but it's quite difficult to take memory back (or disk space). If you overcommit memory, your systems might blow up under enough load. If you don't overcommit memory, either everyone has to pay more or everyone gets unpopularly low limits.

(You can also do fancy accounting for 'flow' resources, such as allowing bursts of high CPU but not sustained high CPU. This is harder to do gracefully for things like memory, although you can always do it ungracefully by terminating things.)

The other problem entwined with setting resource limits is how (and if) you sell different levels of resource limits to your customers. A single resource limit is simple but probably not what all of your customers want; some will want more and some will only need less. But if you sell different limits, you have to tell customers what they're getting, let them assess their needs (which isn't always clear in a shared hosting situation), deal with them being potentially unhappy if they think they're not getting what they paid for, and so on. Shared hosting is always likely to have complicated resource limits, which raises the complexity of selling them (and of understanding them, for the customers who have to pick one to buy).

Viewed from the right angle, virtual private servers (VPSes) are a great abstraction to sell different sets of resource limits to people in a way that's straightforward for them to understand (and which at least somewhat hides whether or not you're overcommitting resources). You get 'a computer' with these characteristics, and most of the time it's straightforward to figure out whether things fit (the usual exception is IO rates). So are more abstracted, 'cloud-y' ways of selling computation, database access, and so on (at least in areas where you can quantify what you're doing into some useful unit of work, like 'simultaneous HTTP requests').

It's my personal suspicion that even if the resource limitation problems had been fully solved much earlier, shared hosting would have still fallen out of fashion in favour of simpler to understand VPS-like solutions, where what you were getting and what you were using (and probably what you needed) were a lot clearer.
