Using typing in Python leads to different sorts of code
So what happened is that I converted a big pile of (highly untyped) Python 2 to Python 3 recently, and then I wanted to experiment with typing-heavy Python LSP servers in GNU Emacs, so I decided to try them out by experimentally adding some type annotations to DWiki, the aforementioned pile of untyped Python (and the code powering Wandering Thoughts). The experience was educational and taught me some new things about type annotations, but it also firmed up my view that typed Python code is different than untyped Python code (although not quite to the extent that they create a different language, as I sort of felt before). There are idioms that are perfectly natural in untyped Python that are pretty annoying to deal with in typed Python.
One of these idioms is dictionaries with multiple types of values. For instance, DWiki has a dictionary that is basically 'a collection of information about the HTTP request'. The actual type of the values in this dictionary is "str | bool | SimpleCookie | dict[str, str]", which is to say that values can be any of a string, a boolean, an HTTP cookie, or a dictionary of string key/value pairs. Of course, individual keys in the dictionary have a fixed type for their value; for example, the key 'request-fullpath' only ever has a string value, so in untyped Python code it's natural to write something like:
    if reqdata['request-fullpath'] and \
       reqdata['request-fullpath'][-1] != '/':
        [...]
If you do this in typed Python, your type checker will almost certainly complain that this indexing isn't valid for booleans and HTTP Cookies. You need to either check or type-assert that the value is a string.
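As a minimal sketch of those two options (not DWiki's actual code), the following shows an isinstance() check next to a typing.cast() type assertion. The ReqValue alias, the function names, and the 'is-https' key are hypothetical; only 'request-fullpath' comes from the example above.

```python
from http.cookies import SimpleCookie
from typing import Dict, Union, cast

# The union of everything a value in the request dictionary can be.
ReqValue = Union[str, bool, SimpleCookie, Dict[str, str]]


def needs_slash_checked(reqdata: Dict[str, ReqValue]) -> bool:
    """Narrow with isinstance(); this also guards at runtime."""
    path = reqdata['request-fullpath']
    if not isinstance(path, str):
        raise TypeError("request-fullpath must be a string")
    # From here on the type checker treats path as str.
    return bool(path) and path[-1] != '/'


def needs_slash_cast(reqdata: Dict[str, ReqValue]) -> bool:
    """Type-assert with cast(); nothing is checked at runtime."""
    path = cast(str, reqdata['request-fullpath'])
    return bool(path) and path[-1] != '/'


reqdata: Dict[str, ReqValue] = {
    'request-fullpath': '/blog/python',
    'is-https': True,  # hypothetical key, for illustration only
}
print(needs_slash_checked(reqdata), needs_slash_cast(reqdata))
```

The isinstance() version costs a runtime check but fails loudly if the invariant is ever broken; the cast() version only tells the type checker to trust you.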
In untyped Python, this is a perfectly decent data structure (although it might not be good style). In typed Python, this is a bad data structure that will cause you pain. There are ways around the pain that preserve the underlying dictionary, but they exist almost entirely to pacify the type checker. A proper data structure in typed Python is not multi-typed like this, or at least it's not multi-typed with a lot of keys.
(One way is to use typing.TypedDict, but if you have a lot of keys it gets painful.)
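A minimal sketch of the TypedDict approach, assuming a much smaller set of keys than the real dictionary has. Keys such as 'request-fullpath' are not valid Python identifiers, so this uses the functional TypedDict syntax rather than the class syntax. Every key other than 'request-fullpath' is hypothetical.

```python
from http.cookies import SimpleCookie
from typing import Dict, TypedDict

# Functional syntax, because the keys contain hyphens.
RequestData = TypedDict('RequestData', {
    'request-fullpath': str,
    'is-https': bool,                # hypothetical key
    'cookies': SimpleCookie,         # hypothetical key
    'query-params': Dict[str, str],  # hypothetical key
})


def needs_slash(reqdata: RequestData) -> bool:
    # The checker knows this particular key is a str, so no cast is needed.
    path = reqdata['request-fullpath']
    return bool(path) and path[-1] != '/'


reqdata: RequestData = {
    'request-fullpath': '/blog/python',
    'is-https': False,
    'cookies': SimpleCookie(),
    'query-params': {},
}
print(needs_slash(reqdata))
```

At runtime this is still an ordinary dict, so existing code that indexes it keeps working; the cost is that every key and its type has to be listed by hand.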
There's a good reason for this insistence in typed Python, because right now there's nothing preventing me from putting in the wrong type of value for a particular key in this dictionary. I could slip up and set some key that's supposed to have a string value to a boolean, or a key that's supposed to have a dictionary to a plain string. Typing can't detect those errors because any of those are valid for the dictionary in general, just not for that particular key. A proper data structure in typed Python is one where the type checker itself can check your invariants, so string values are separated from boolean values and so on. This would probably also be clearer code.
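As a rough illustration of what such a separated structure might look like, here is a sketch using a dataclass with one typed field per piece of information. The class name and all field names are hypothetical, not something DWiki has.

```python
from dataclasses import dataclass, field
from http.cookies import SimpleCookie
from typing import Dict


@dataclass
class RequestInfo:
    # Each field has exactly one type, so the checker can catch
    # something like RequestInfo(fullpath=True) statically.
    fullpath: str
    is_https: bool = False
    cookies: SimpleCookie = field(default_factory=SimpleCookie)
    query: Dict[str, str] = field(default_factory=dict)


def needs_slash(req: RequestInfo) -> bool:
    # No isinstance() or cast() needed: fullpath is always a str.
    return bool(req.fullpath) and req.fullpath[-1] != '/'


req = RequestInfo(fullpath='/blog/python')
print(needs_slash(req))
```

Dataclasses don't enforce these types at runtime; the benefit comes entirely from the type checker seeing one type per field.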
This is a general issue for any sort of variable-typed container object, return values, or the like. I saw a similar thing when typing my program that uses the email packages; the email packages have old-school polymorphic API return values that typing is not fond of and that required type checks or casts. This is reasonably valid on the part of type checkers (they're unlikely to ever do full control flow analysis to determine actual types), and is clearly part of the style of typed Python.
(Another case of this in DWiki is that I have a general caching layer that uses pickle to store and retrieve arbitrary objects. The callers know what they're storing and retrieving under a particular key, but this isn't visible in any types I could assign.)
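One possible way to make the caller's knowledge visible, sketched under the assumption of a very simple in-memory cache (this is a hypothetical stand-in, not DWiki's caching layer), is to have the caller pass the type it expects and check it on retrieval:

```python
import pickle
from typing import Any, Dict, Type, TypeVar

T = TypeVar('T')


class PickleCache:
    """Hypothetical stand-in cache that stores pickled objects in memory."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def store(self, key: str, obj: Any) -> None:
        self._store[key] = pickle.dumps(obj)

    def fetch(self, key: str, expected: Type[T]) -> T:
        # pickle.loads() returns Any; the isinstance() check both narrows
        # the type for the checker and verifies it at runtime.
        obj = pickle.loads(self._store[key])
        if not isinstance(obj, expected):
            raise TypeError("cached %r is not a %s" % (key, expected.__name__))
        return obj


cache = PickleCache()
cache.store('page-title', 'Hello')
title = cache.fetch('page-title', str)  # the checker sees this as str
print(title.upper())
```

This only works for plain classes; isinstance() can't check parameterized types such as dict[str, str], so for those a cast at the call site is still needed.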
As far as I can see, typing also changes how you want to structure multi-file code with classes and other data structures. In untyped Python such as DWiki, it's natural to have one source file declare a data structure, create an instance of it, and pass it as an argument to a function (or a class) from another file that the first file imports. In typed Python, this doesn't work so well. Because everything that either takes data structures as arguments or returns them wants to name the data structure in type hints, you need the classes for those data structures to eventually be accessible in everything that touches them, which means a tangle of circular imports.
(This is different from forward references in that the code that accepts instances of these data structures will normally never import the code that defines them.)
Circular imports work, technically (as I've sort of written about before), but they make me unhappy. I lack enough experience with typed Python to know the correct approach, but it certainly feels like one should define as many data structures as possible in low level files that are relatively standalone so they can be imported into everything without circular imports. I'm not sure how this works once you want to put methods on your classes that take other classes as arguments and so on.
(Mypy has some suggestions but its answers don't make me feel happy.)
Another practical issue I ran into was that DWiki has a stack of middleware functions to fiddle with HTTP requests. All of the middleware functions take a standard set of four arguments, each with a specific type, and I have enough of these functions that going through and adding the appropriate type annotation to each argument for each function (and the return value) was clearly a pain (in my experiment I only did this for a few). I found myself really wishing for a way to say that the function as a whole had a particular type shape, which would automatically infer the argument and return types. I think the proper way to do this is to pass each function fewer arguments (ideally one), but I'm not sure I like it (and the four arguments aren't tightly coupled to each other).
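A partial workaround, sketched here with made-up argument types, is to name the function shape once as a Callable alias and attach it with an identity decorator. Callers then see the declared shape, but as far as I know the checker does not infer the argument types inside an unannotated function body, so this doesn't fully grant the wish. All four arguments and every name below are hypothetical.

```python
from typing import Callable, Dict, List

ReqData = Dict[str, str]
NextStep = Callable[[ReqData], str]
# The shape shared by every middleware function: four arguments, one result.
Middleware = Callable[[NextStep, Dict[str, str], ReqData, str], str]


def as_middleware(fn: Middleware) -> Middleware:
    """Identity decorator: the decorated name has type Middleware."""
    return fn


# Fully annotated by hand, which is the tedious part.
def add_header(next_step: NextStep, config: Dict[str, str],
               reqdata: ReqData, path: str) -> str:
    return "X-Site: " + config['site'] + "\n" + next_step(reqdata)


# Unannotated, but callers see it as Middleware through the decorator.
@as_middleware
def log_path(next_step, config, reqdata, path):
    reqdata['seen-path'] = path
    return next_step(reqdata)


def final(reqdata: ReqData) -> str:
    return "body for " + reqdata.get('seen-path', '?')


# The checker verifies that every entry matches the Middleware shape.
STACK: List[Middleware] = [add_header, log_path]

print(log_path(final, {'site': 'example'}, {}, '/blog'))
```

The same Middleware alias is the sort of thing that could live in a shared file of basic type definitions.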
(I also wound up feeling that I should create a 'types.py' file that had all of the basic type definitions that didn't depend on classes and so on. This would be things like the shape of callable functions, that 'data about the HTTP request' dictionary, and so on. Many of these are used in multiple files in DWiki and this avoids various sorts of annoyances. I don't know if such a 'types.py' file is considered a code smell.)
I don't regret my scratch experiments with adding some types to DWiki (partly because I learned more useful things about Python typing), but it's clear that doing it properly is somewhere between infeasible and impossible (and Python typing acknowledges that this can be the case). A reasonable typed version of DWiki would be structured significantly differently, and getting from the current code to any new type-friendly structure would be a significant rewrite (which would fix some old mess but likely introduce new mess).
(The semi-typed results of my experimentation are messy enough that I'm going to discard that copy of the source code.)
(I said something about type hints on the Fediverse and some interesting things came up in the replies.)
