A surprise with rspamd's spam scoring and a workaround
Over on the Fediverse, I shared a discovery:
This is my face when rspamd will apparently pattern-match a mention of 'test@test' in the body of an email, extract 'test', try that against the multi.surbl.org DNS blocklist (which includes it), and decide that incoming email is spam as a result.
Although I didn't mention it in the post, I assume that rspamd's goal is to extract the domain from email addresses and see if the domain is 'bad'. This handles a not uncommon pattern of spammer behavior where they send email from a throwaway setup but direct your further email to their long-term address. One sees similar things with URLs, and I believe that rspamd will extract domains from URLs in messages as well.
(Rspamd is what we currently use for scoring email for spam, for various reasons beyond the scope of this entry.)
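To make the assumed mechanism concrete, here is a minimal Python sketch of how a dot-less 'domain' like 'test' could end up as a DNS blocklist query. It is not rspamd's actual code, and the function names `domain_part` and `dnsbl_query_name` are made up for illustration. The only thing it relies on is the usual convention for domain-based DNS blocklists, where the name being checked is prepended to the blocklist zone and looked up in DNS.

```python
def domain_part(address: str) -> str:
    """Return whatever follows the last '@' in an address, lowercased.

    If there is no '@' at all, the whole string is returned unchanged
    (apart from lowercasing); a real extractor would be much pickier.
    """
    _, _, domain = address.rpartition("@")
    return domain.lower()


def dnsbl_query_name(domain: str, zone: str = "multi.surbl.org") -> str:
    """Build the DNS name to look up: the candidate domain, then the zone."""
    return f"{domain.rstrip('.')}.{zone}"


# The address from the post has no dots in its 'domain' part.
name = dnsbl_query_name(domain_part("test@test"))
print(name)  # test.multi.surbl.org
```

If the blocklist has an entry for that name, the lookup succeeds and the message gets scored accordingly, which matches what we saw.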
The sign of this problem was message summary lines in the rspamd log that included annotations like this (with a line split and spacing for clarity):
[...] MW_SURBL_MULTI(7.50){test:email;}, PH_SURBL_MULTI(5.00){test:email;} [...]
As I understand it, the 'test:email' bit means that the thing being looked up in multi.surbl.org was 'test' and that it came from an email address in the message (I don't know whether that means specifically the body of the message or whether it could also have come from the headers). The SURBL reasonably lists 'test' for, presumably, testing purposes, much like many IP based DNSBLs list various 127.0.0.* IPs. Extracting a dot-less 'domain' from a plain text email message is a bit aggressive, but we get the rspamd that we get.
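If you want to hunt through logs for this sort of thing, a rough Python sketch like the following can pull such annotations apart. The pattern is inferred only from the excerpt above (a symbol name, a score in parentheses, then semicolon-separated options in braces), so treat it as an assumption about the log format rather than a description of everything rspamd can emit. The name `parse_annotations` is hypothetical.

```python
import re

# Assumed shape, based on the log excerpt: SYMBOL(score){opt;opt;}
ANNOTATION = re.compile(
    r"(?P<symbol>[A-Z0-9_]+)"
    r"\((?P<score>-?\d+(?:\.\d+)?)\)"
    r"\{(?P<options>[^}]*)\}"
)


def parse_annotations(text):
    """Return a list of (symbol, score, options) tuples found in text."""
    results = []
    for m in ANNOTATION.finditer(text):
        # Options are separated by ';' with a trailing ';', so drop empties.
        options = [o for o in m.group("options").split(";") if o]
        results.append((m.group("symbol"), float(m.group("score")), options))
    return results


line = "MW_SURBL_MULTI(7.50){test:email;}, PH_SURBL_MULTI(5.00){test:email;}"
for symbol, score, options in parse_annotations(line):
    print(symbol, score, options)
```

Filtering the results for options that end in ':email' and have no dot in the looked-up name would be one way to spot other dot-less 'domains' that are being checked.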
(You might wonder where 'test@test' comes from; the answer is that in Toronto it's a special DSL realm that's potentially useful for troubleshooting your DSL.)
Fortunately rspamd allows exceptions. If your rspamd configuration directory is /etc/rspamd as normal, you can put a 'map' file of SURBL exceptions at /etc/rspamd/local.d/map.d/surbl-whitelist.inc.local. You can discover this location by reading modules.d/rbl.conf, which you can find by grep'ing the entire /etc/rspamd tree for 'surbl' (yes, sometimes I use brute force). The best documentation on what you put into maps that I could find is "Maps content" in the multimap module documentation; the simple version is that you appear to put one domain per line and comment lines are allowed, starting with '#'.
(As far as I could tell from our experience, rspamd noticed the existence of our new surbl-whitelist.inc.local file all on its own, with no restart or reload necessary.)
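As an illustration of the simple version of the map format (one entry per line, with '#' comment lines), here is a small Python sketch that reads such content and returns the entries. The sample content is hypothetical, not a copy of our real file, and `parse_map` is a made-up helper that only handles the simple case described here. Rspamd's own map parser may well accept more than this.

```python
def parse_map(text):
    """Return the entries in a simple map: one per line, '#' starts a comment line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        entries.append(line.lower())
    return entries


# Hypothetical content for surbl-whitelist.inc.local
SAMPLE_MAP = """\
# Local SURBL exceptions
# 'test' shows up in email that mentions the test@test DSL realm
test
"""

print(parse_map(SAMPLE_MAP))  # ['test']
```

Something like this can serve as a quick sanity check that a file you are about to install contains the entries you think it does.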