A surprise with rspamd's spam scoring and a workaround

By: cks
12 February 2025 at 03:41

Over on the Fediverse, I shared a discovery:

This is my face when rspamd will apparently pattern-match a mention of 'test@test' in the body of an email, extract 'test', try that against the multi.surbl.org DNS blocklist (which includes it), and decide that incoming email is spam as a result.

Although I didn't mention it in the post, I assume that rspamd's goal is to extract the domain from email addresses and see if the domain is 'bad'. This handles a not uncommon pattern of spammer behavior where they send email from a throwaway setup but direct your further email to their long term address. One sees similar things with URLs, and I believe that rspamd will extract domains from URLs in messages as well.

(Rspamd is what we currently use for scoring email for spam, for various reasons beyond the scope of this entry.)

The sign of this problem happening was message summary lines in the rspamd log that included annotations like (with a line split and spacing for clarity):

[...] MW_SURBL_MULTI(7.50){test:email;},
PH_SURBL_MULTI(5.00){test:email;} [...]

As I understand it, the 'test:email' bit means that the thing being looked up in multi.surbl.org was 'test' and it came from the email message (I don't know if it's specifically the body of the email message or this could also have been in the headers). The SURBL reasonably lists 'test' for, presumably, testing purposes, much like many IP based DNSBLs list various 127.0.0.* IPs. Extracting a dot-less 'domain' from a plain text email message is a bit aggressive, but we get the rspamd that we get.

(You might wonder where 'test@test' comes from; the answer is that in Toronto it's a special DSL realm that's potentially useful for troubleshooting your DSL.)
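To make the failure mode concrete, here is a minimal Python sketch of how a dot-less 'domain' can fall out of an address-like string and turn into a DNS blocklist query name. The regular expression and both function names are assumptions made up for illustration; this is not rspamd's actual extraction logic, only the general shape of what seems to be happening.

```python
import re

# Deliberately loose pattern: a local part, '@', then a "domain" that is
# NOT required to contain a dot. This is an illustrative assumption, not
# the pattern rspamd really uses.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*)")


def candidate_domains(body: str) -> list[str]:
    """Return the lowercased domain parts of address-like strings in body."""
    return [m.group(1).lower() for m in ADDRESS_RE.finditer(body)]


def dnsbl_query_name(domain: str, zone: str = "multi.surbl.org") -> str:
    """Build the DNS name that a domain-based blocklist lookup would query."""
    return f"{domain}.{zone}"


body = "For troubleshooting, log in to the DSL as test@test and try again."
domains = candidate_domains(body)          # ['test']
queries = [dnsbl_query_name(d) for d in domains]
print(queries)                             # ['test.multi.surbl.org']
```

Requiring at least one dot in the domain part would avoid this particular match, at the cost of being a different (stricter) notion of what counts as a domain.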

Fortunately rspamd allows exceptions. If your rspamd configuration directory is /etc/rspamd as normal, you can put a 'map' file of SURBL exceptions at /etc/rspamd/local.d/map.d/surbl-whitelist.inc.local. You can discover this location by reading modules.d/rbl.conf, which you can find by grep'ing the entire /etc/rspamd tree for 'surbl' (yes, sometimes I use brute force). The best documentation on what you put into maps that I could find is "Maps content" in the multimap module documentation; the simple version is that you appear to put one domain per line and comment lines are allowed, starting with '#'.

(As far as I could tell from our experience, rspamd noticed the existence of our new surbl-whitelist.inc.local file all on its own, with no restart or reload necessary.)
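As a rough model of the 'one domain per line, comments start with #' format described above, here is a small Python sketch that parses such a file's contents and uses the result to skip lookups. It only models that simple reading of the format; rspamd's real map handling may well accept more than this, and the function names here are hypothetical.

```python
def parse_map(text: str) -> set[str]:
    """Parse map-style text: one domain per line, '#' starts a comment line."""
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        entries.add(line.lower())
    return entries


def should_look_up(domain: str, exempt: set[str]) -> bool:
    """Return False for domains listed in the exception map."""
    return domain.lower() not in exempt


# Hypothetical contents for surbl-whitelist.inc.local
example_map = """\
# 'test@test' is a DSL troubleshooting realm, not a spam domain
test
"""

exempt = parse_map(example_map)
print(should_look_up("test", exempt))         # False
print(should_look_up("example.org", exempt))  # True
```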

Our well-prepared phish spammer may have been chasing lucrative prey

By: cks
30 January 2025 at 03:19

Yesterday I wrote about how we got hit by an alarmingly well-prepared phish spammer. This spammer sent a moderate amount of spam through us, in two batches; most of it was immediately delivered or bounced (and was effectively lost), but we managed to capture one message due to delivery problems. We can't be definite from a single captured spam message (plus our logs, which suggest that the other messages were similar to it), but it's at least suggestive.

The single captured email message has two PDFs and a text portion; as far as I can tell the PDFs are harmless (apart from their text contents), with no links or other embedded things. The text portion claims to be a series of (top replying) email messages about the nominal sender of the message getting an invoice paid, and the PDFs are an invoice for vague professional services for $49,700 (US dollars, implicitly), with a bank's name, a bank routing number and an account number, and a US IRS W-9 form for the person supposedly asking for their invoice to be paid, complete with an address and a US Social Security number. The PDF requests that you 'send a copy of the remittance to <email address>', where the domain has no website and its mail is hosted by Google. Based on some Internet searches, the PDF's bank routing number is correct for the bank, although of course who knows who the account number goes to.

The very obvious thing to say is that if even a single recipient out of the just under three hundred this spam was sent to follows the directions and sends an invoice payment, this will have been a decently lucrative phish spam (assuming that all of the spam messages were pushing the same scam, and the spammer can extract the money). If several of them did, this could be extremely lucrative, more than lucrative enough to justify dozens or hundreds of hours of research on both the ultimate targets (to determine who at various domains to send email to, what names of bosses to put in the email, and so on) and access methods (ie, how to use our VPNs).

Further, it seems possible that the person whose name was on the invoice, the email, and the W-9 is real and had their identity stolen, complete with their current address and US social security number. If this is the case, the person may receive an unpleasant surprise the next time they have to interact with the US IRS, since the IRS may well have data from companies claiming that this person was paid income that, well, they weren't. I can imagine a more advanced version of the scam where the spammer actually opened an account in this person's name at the bank in the invoice, and is now routing their fraudulently obtained invoice payments through it.

(There are likely all sorts of other possibilities for how the spammer might be extracting invoice payment money, and all of this assumes that the PDFs themselves don't contain undetected malware that is simply inactive in my Linux command line based PDF viewing environment.)

We got hit by an alarmingly well-prepared phish spammer

By: cks
29 January 2025 at 04:24

Yesterday evening, we were hit by a run of phish spam that I would call 'vaguely customized' for us; for example, the display name in the From: header was "U of T | CS Dept" (although the actual email address was that of the compromised account elsewhere that was used to send the phish spam). The destination addresses here weren't particularly well chosen, and some of them didn't even exist. So far, so normal. One person here fell for the phish spam that evening but realized it almost immediately and promptly changed their password. Today that person got in touch with us because they'd started receiving email bounces for (spam) email that they hadn't sent. Investigation showed that the messages were being sent through us, but in an alarmingly clever way.

We have a local VPN service for people, and this VPN service requires a different password from your regular (Unix and IMAP and etc) password. People connecting through our VPN have access to an internal-only SMTP gateway machine that doesn't require SMTP authentication. As far as we can tell, in the quite short interval between when the person fell for the phish and then changed their password, the phish spam attacker used the main password they'd just stolen to register the person for our VPN and obtain a VPN password (which we don't reset on Unix password changes). They then connected to the VPN using their stolen credentials and used the VPN to send spam email through our internal-only SMTP gateway (initially last evening and then again today, at which point they were detected).

Based on some log evidence, I think that the phish spammer first tried to use authenticated SMTP but failed due to the password change, then fell back on the VPN access. Even if VPN access hadn't been their primary plan, they worked very fast to secure themselves an additional access method. It seems extremely likely that the attacker had already researched our mail and VPN environment before they sent their initial phish spam, since they knew exactly where to go and what to do.

If phish spammers are increasingly going to be this well prepared and clever, we're going to have to be prepared for that on our side. Until now, we hadn't really thought about the possibility of phish spammers gaining VPN access; previous phish spammers have exploited some combination of webmail and authenticated SMTP.

(We're also going to need to be more concerned about other methods of obtaining persistent account access, such as adding new SSH authorized keys to the Unix login. This attacker didn't attempt any sort of SSH access.)

Rejecting email at SMTP time based on the From: header address

By: cks
3 January 2025 at 04:14

Once upon a time (a long time ago), filtering and rejecting email based on the SMTP envelope sender (the SMTP MAIL FROM) was a generally sufficient mechanism to deal with many repeat spam sources. It didn't deal with all of them but many used their own domain in the envelope sender, even if they sent from a variety of different IP addresses. Unfortunately, the rise of (certain) mail service providers has increasingly limited the usefulness of envelope sender address filtering, because an increasing number of the big providers use their own domains for the envelope sender addresses of all outgoing email. Unless you feel like blocking the provider entirely (often this isn't feasible, even on an individual basis), rejecting based on the envelope sender doesn't do you any good here.

This has made it increasingly useful to be able to do SMTP time rejection (and general filtering) based on the 'From:' header address. Many mail sending services will put the real spam source's email address in the From: and at least the top level domain of this will be consistent for a particular source, which means that you can use it to reject some of their customers but accept others. These days, MTAs (mail transfer agents) generally give you an opportunity to reject messages at the SMTP DATA phase, after you've received the headers and message body, so you can use this to check the From: header address.

(If you're applying per-destination filtering, you have the SMTP DATA error problem and may only be able to do this filtering if the incoming email has only a single recipient. Conveniently, the mail service providers that commonly obfuscate the envelope sender address usually send messages with only a single recipient for various reasons, including VERP or at least something that looks like it.)

I feel that From: address filtering works best on pseudo-legitimate sources of repeat spam, such as companies that are sending you marketing email without consent. These are the senders that are least likely to vary their top level domain, because they have a business and want to look legitimate, be found at a consistent address, and build up reputation. These are also the sources of unwanted email that are the least likely to be dropped as customers by mail service providers (for a collection of likely reasons that are beyond the scope of this entry).

There are plenty of potential limitations on From: header address filtering. Bad actors can put various sorts of badly formed garbage in the From:, you definitely have to parse it (ideally your MTA will provide this as a built-in), and I believe that it still technically might have multiple addresses. But as a heuristic for rejecting unwanted mail, all of this is not a serious problem. Most From: addresses are well formed and good, especially now that DMARC and DKIM are increasingly required if you want the large providers to accept your email.

(DKIM signing in 'alignment' with the From: header is increasingly mandatory in practice, which requires that the From: header has to be well formed. I don't know how Google and company react to badly formed or peculiar From: headers, but I doubt it helps your email appear in people's inboxes.)
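As an illustration of the parsing step, here is a minimal Python sketch that pulls the address domains out of a From: header using the standard library's `email.utils.getaddresses` and checks them (and their parent domains) against a blocklist. The blocklist contents and the function names are hypothetical, and a real MTA would do this in its own filtering language at DATA time rather than in Python.

```python
from email.utils import getaddresses

# Hypothetical set of From: domains we no longer want to hear from.
BLOCKED_DOMAINS = {"marketing.example"}


def from_domains(from_header: str) -> list[str]:
    """Return the lowercased domain of every address found in a From: header."""
    domains = []
    for _name, addr in getaddresses([from_header]):
        if "@" in addr:  # garbage or empty results have no '@' and are skipped
            domains.append(addr.rsplit("@", 1)[1].lower())
    return domains


def should_reject(from_header: str, blocked: set[str] = BLOCKED_DOMAINS) -> bool:
    """True if any From: address is in a blocked domain or a subdomain of one."""
    for domain in from_domains(from_header):
        labels = domain.split(".")
        # Check the domain itself and each parent: a.b.example, b.example, example
        for i in range(len(labels)):
            if ".".join(labels[i:]) in blocked:
                return True
    return False


print(should_reject('"Big Sale" <deals@news.marketing.example>'))  # True
print(should_reject("A Friend <friend@university.example>"))       # False
```

Since this is only a heuristic, the sketch treats an unparseable From: as 'no match' instead of as a reason to reject.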

PS: While you can filter or discard email based on the From: header in a variety of places, I like rejecting at SMTP time and it's possible that SMTP rejections at DATA time will trigger anti-spam precautions in the mail service providers (it's a possible signal of badness in the message).

(Some) spammers will keep trying old, no longer in DNS IPv6 addresses

By: cks
18 November 2024 at 03:57

As I mentioned the other day, in late September my home ISP changed my IPv6 allocation from a /64 to a different /56, but kept the old /64 still routing to me. I promptly changed all DNS entries that referred to the old IPv6 address to the new IPv6 address. One of the things that my home machine runs is my 'sinkhole' SMTP server, which has a DNS MX entry pointing to it. This server tracks which local IP address was connected to, and it does periodically receive spam and see probes.

Since this server was most recently restarted on November 10th, it's seen about the same volume of connections to each IPv6 address, the old one (which hasn't been present in DNS for more than a month) and the new one (present in DNS). Some of this activity appears to be from Internet scanning efforts, which I will charitably assume are intending to do good and which have arguable reasons to keep scanning any IPv6 address that they've seen respond. Other connections seem less likely to be innocent.

I'm pretty certain I've seen this behavior for IPv4 addresses long ago (I might even have written it up here, although I can't find an entry right now), so in a sense it doesn't surprise me. Some spammers and other systems apparently do DNS lookups only infrequently and save the IP addresses (both IPv4 and apparently IPv6) that they see, then use them for a long time. Still, it's a more modern world, so I'd sort of hoped that any spammer with software that could deal with IPv6 would handle DNS lookups better.

On the one hand, it's not like holding on to the IP addresses of old mail servers is likely to do spammers much good. If the IP address of a mail server changes, it's very likely that the old IP address will stop working before too long. On the other hand, presumably this mostly doesn't hurt because most mail servers don't change IP addresses very often. Usually the IP address you looked up two months ago (or more) is still good.

DKIM signatures from mailing list providers don't mean too much

By: cks
7 October 2024 at 02:43

Suppose, hypothetically, that you're a clever email spammer and you'd like to increase the legitimacy of your (spam) email by giving it a good DKIM signature, such as a DKIM signature from a reasonably reputable provider of mailing list services. The straightforward way to do this is to sign up to the provider, upload your spam list, and send your email to it; the provider will DKIM sign your message on the way through. However, if you do this you'll generally get your service cancelled and have to go through a bunch of hassles to get your next signup set up. Unfortunately for everyone else, it's possible for spammers to do better.

The spammer starts by signing up to the provider and setting up a mailing list. However, they don't upload a bunch of addresses to it. Instead, they set the list to be as firmly anti-spam, 'confirmed opt in through the provider' as the provider supports. Then they use a bunch of email addresses under their own control to sign up to the mailing list, opt in to everything, and so on. They may then even spend a bit of time sending marketing emails to their captive mailing list of their own addresses, which will of course not complain in the least.

Then the spammer sends their real spam mailing to the mailing list, goes to one of their captive addresses, copies the entire raw message, headers and all, and strips out the 'Received:' headers that come from it leaving the mailing list provider. Then they go to their (rented) spam sending infrastructure and queue up a bunch of spam sending of this message to the real targets, setting it to have a '<>' null SMTP MAIL FROM. This message has a valid DKIM signature put on by the mailing list provider and its SMTP envelope sender is not (quite) in conflict with it. The only thing that will give the game away is inspecting the Received: headers, which will say it came from some random IP with no listed headers that say how it got from the mailing list provider to the random IP.

The spammer set up their mailing list to be so strictly anti-spam in order to deflect complaints submitted to the mailing list provider, especially more or less automatic ones created by people clicking on 'report this as spam' in their mail environment (which will often use headers put in by the mailing list provider). The mailing list provider will get the complaint(s) and hopefully not do much to the spammer overall, because all of the list members have fully confirmed subscriptions, a history of successful deliveries of past messages that look much like the latest one, and so on.

I don't know if any spammers are actively doing this, but I have recently seen at least one spammer that's doing something like it. Our mail system has logged a number of incoming (spam) messages with a null SMTP envelope sender that come from random IPs but that have valid DKIM signatures for various places. In some cases we have captured headers that suggest a pattern like this.
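The pattern described here can be sketched as a crude Python heuristic: a null envelope sender, at least one DKIM-Signature header, and no Received: header that mentions the signing domain. This is an illustrative assumption about how one might flag candidates for a closer look, not what our mail system does. It doesn't verify the signature at all, and the substring check will misfire on legitimate mail where the signing domain differs from the sending hosts' names.

```python
from email import message_from_string


def dkim_domains(msg) -> set[str]:
    """Collect the d= (signing domain) tag of each DKIM-Signature header."""
    domains = set()
    for value in msg.get_all("DKIM-Signature", []):
        for tag in value.split(";"):
            name, sep, val = tag.strip().partition("=")
            if sep and name.strip() == "d":
                domains.add(val.strip().lower())
    return domains


def looks_replayed(envelope_sender: str, raw_message: str) -> bool:
    """Flag null-sender messages whose Received: chain never names the signer."""
    if envelope_sender not in ("", "<>"):
        return False
    msg = message_from_string(raw_message)
    received = " ".join(msg.get_all("Received", [])).lower()
    signers = dkim_domains(msg)
    return bool(signers) and not any(d in received for d in signers)


# Made-up message using documentation-only names and addresses.
raw = (
    "Received: from unknown (203.0.113.7) by mx.ours.example; "
    "Mon, 7 Oct 2024 02:00:00 -0400\n"
    "DKIM-Signature: v=1; a=rsa-sha256; d=lists.example; s=sel; bh=xxx; b=yyy\n"
    "From: Someone <someone@sender.example>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(looks_replayed("<>", raw))  # True
```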

(You can also play this trick with major providers of free mail services; sign up, send email from them to some dummy mail address, and take advantage of the DKIM signature that they'll put on their outgoing messages. The abuse handling groups at those places will most likely take a look at the 'full' message headers and say 'it's obviously not from us', and they may not have the tools to even try to verify the DKIM signature to see that actually, it is from them.)

The speed of updates for signatures of bad things matters (a lot)

By: cks
5 August 2024 at 03:35

These days (and for a long time), most spam, phish, malware, and so on (in email and other things) is recognized not through general rules, patterns, and processes, but by seeing if the content matches any known signatures. Sometimes this is literally matching cryptographic hashes, but more often there's some sort of signature matching engine involved with various matching operators, conditions for combining them, and so on. ClamAV is one example that's mostly a matching engine, which means that in practice you need a collection of signatures to make it useful. Since signatures aren't general things, they have to be created by someone and then you have to get that newly created (or perhaps updated) signature.

What people have collectively found is that in practice, the speed of updating signatures matters, often a lot; in fact it matters enough that people are willing to pay for faster updates to collections of signatures. Why it matters is pretty straightforward; you're in a race against attackers. Attackers are perfectly well aware that the effectiveness of what they're doing goes down fast once signatures are available for it (or in general once people have had time to recognize what's going on, get their web landing page killed off, or whatever), so they generally try to get things done as fast as possible.

(I'm sure there are some slow-moving spam, phish, and malware campaigns that keep on going and going, but I don't think they're very common.)

However, attackers have their own speed limits; they can only send so much so fast, to you and to everyone else. Against many attackers, this gives you the chance to cut off at least some of their activities if 'you' can react fast enough, which broadly means if you can get signature updates fast enough. In more sophisticated environments, fast signature updates may also give you the chance to re-scan people's recently received email messages before people open them (or when they open them).

(Similar things apply to scanning files or recognizing signs of active malware, especially since these may already be delayed from the initial attack depending on how the attacker got to people. If you're getting people to download malware from a web page by sending them a bait message, you have to wait for people to read their email.)

So in general, the faster you get signature updates, the less you'll be exposed to (and for a shorter amount of time). The slower the updates, the more you're exposed to and the longer you're exposed. In the extreme case, sufficiently delayed updates are mostly useless, because the attacker campaign they're reacting to is over by the time you get the updates active.

(Of course you can try to delay receiving things (and thus checking them), but this tends to be unpopular with people. Like it or not, modern email is expected to get through rapidly and as a result is used for time sensitive things.)

We've seen this ourselves when we changed from a commercial anti-spam system for our email to one mostly based on free software and free signature data sources for the anti-malware, anti-virus (and anti-phish) part. Even with paying for some signature sources, the free system clearly was less effective at matching and blocking new malware, and we're fairly certain that part of this was that the commercial system's signatures updated quite frequently (and the company involved had a bunch of people working on keeping them up to date).

(I think this is something that's well known to people in the communities that use signatures, like anti-spam and (anti-)malware, but is perhaps not so obvious to people outside those communities.)

Some (big) mail senders do use TLS SNI for SMTP even without DANE

By: cks
10 July 2024 at 02:31

TLS SNI (Server Name Indication) is a modern TLS feature where clients that are establishing a TLS session with a server tell it what name they are connecting to, so the server can give them the right TLS server certificate. TLS SNI is essential for the modern web's widespread HTTPS hosting, and so every usable HTTPS-capable web client uses SNI. However, other protocols also use TLS, and whether or not the software involved uses SNI is much more variable.

DANE is a way to bind TLS certificates to domain names through DNS and DNSSEC. In particular it can be used to authenticate the SMTP connections used to deliver email (RFC 7672). When you use DANE with TLS over SMTP, using SNI is required and is also straightforward, because DNSSEC and DANE have told you (the software trying to deliver email over SMTP) what server name to use.

Recently, SNI came up on the Exim mailing list, where I learned that when it's sending email, Exim doesn't normally use SNI when establishing TLS over SMTP (unless it's using DANE). According to Exim developers on the mailing list, the reasons for this include not being sure of what TLS SNI name to use and uncertainties over whether SMTP servers would malfunction if given SNI information. This caused me to go look at our (Exim) logs for our incoming mail gateway, where I noticed that although we don't use DANE and don't have DNSSEC, a number of organizations sending email to us were using SNI when they established their TLS sessions (helpfully, Exim logs this information). In fact, the SNI information logged is more interesting than I expected.

We have a straightforward inbound mail situation; our domains have a single DNS MX record to a specific host name that has a direct DNS A record (IP address). Despite that, a small number of senders supplied wild SNI names of 'dummy' (which look like mostly spammers), a RFC 1918 IP address (a sendnode.com host), and the IP address of the inbound mail gateway (from barracuda.com). However, most sending mailers that used SNI at all provided our inbound mail gateway's host name as the SNI name.

Using yesterday's logs because it's easy, roughly 40% of the accepted messages were sent using SNI; a better number is that about 46% of the messages that used TLS at all were using SNI (roughly 84% of the accepted incoming messages used TLS). One reason the percentage of SNI is so high is that a lot of the SNI sources are large, well known organizations (often ones with a lot invested in email), including amazonses.com, outlook.com, google.com, quora.com, uber.com, mimecast.com, statuspage.io, sendgrid.net, and mailgun.net.
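The two percentages are consistent with each other, which a quick bit of arithmetic shows. This sketch uses only the rounded figures quoted above, so the result is approximate.

```python
# Rounded figures from the log summary above.
tls_share = 0.84         # fraction of accepted messages that used TLS
sni_share_of_tls = 0.46  # fraction of TLS-using messages that sent SNI

# Fraction of all accepted messages that were sent using SNI.
sni_share_of_all = tls_share * sni_share_of_tls
print(f"{sni_share_of_all:.1%}")  # 38.6%, ie 'roughly 40%'
```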

Given this list of organizations that are willing to use SNI when talking to what is effectively a random server on the Internet with nothing particularly special about its DNS setup, my assumption is that today, sending SNI when you set up TLS over SMTP doesn't hurt delivery very much. At the same time, that some people's software sends bogus values suggests that fumbling the SNI name doesn't do too much harm, which is often unlike the situation with HTTPS.

PS: I suspect that the software setting 'dummy' as the SNI name isn't actually mail software, but is instead some dedicated spam sending software that's using a TLS library that has a default SNI name set and is (of course) not overriding the name, much as some web spider software doesn't specifically set the HTTP User-Agent and so inherits whatever vague User-Agent their HTTP library defaults to.

Spammers do forge various noreply@<you> sender addresses

By: cks
1 June 2024 at 02:47

It is probably not news to anyone reading this that some of the time, spammers sending you email will forge the email as being from various addresses at your domain, for either or both of the SMTP 'MAIL FROM' envelope sender address and the From: header address. Spammers have been doing this to us for years. What I hadn't realized until now, when I looked at the actual addresses being forged, was that spammers were forging many variations on 'noreply@<us>', differing in both wording and capitalization. Over the past ten days we've seen all of 'noreply@', 'Noreply@', 'Nonreply@', 'no_reply@', 'NOREPLY@', 'no-reply@', and 'NO-REPLY@'.
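All of the variations listed above can be covered by one case-insensitive pattern. Here is a minimal Python sketch; the pattern and the function name are my own illustration, and a real deployment would need to exempt any genuine local noreply-like address before rejecting on this.

```python
import re

# 'no', an optional extra 'n' (for 'Nonreply'), an optional separator,
# then 'reply'; matched case-insensitively against the whole local part.
NOREPLY_RE = re.compile(r"non?[-_.]?reply", re.IGNORECASE)


def looks_like_noreply(address: str) -> bool:
    """True if the local part of address is some variation on 'noreply'."""
    local_part = address.rsplit("@", 1)[0]
    return NOREPLY_RE.fullmatch(local_part) is not None


seen = ["noreply", "Noreply", "Nonreply", "no_reply",
        "NOREPLY", "no-reply", "NO-REPLY"]
# 'cs.example' is a placeholder domain for this sketch.
print(all(looks_like_noreply(f"{lp}@cs.example") for lp in seen))  # True
print(looks_like_noreply("support@cs.example"))                    # False
```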

Of course, spammers also forge various plausible administrative addresses as well, such as 'Administrator@', 'Admin@', 'cpanel@', 'support@' (and 'Support@'), and one case of 'hr@', as well as the expected 'postmaster@'. These are almost all addresses that don't exist here and never have, so I'm pretty confident that spammers are just making them up instead of drawing them from a list of (past) legitimate email addresses of people here. I suspect that some or perhaps many of these forged addresses are being used on phish spams, and this is probably the case for the various 'noreply@' addresses.

(Spammers clearly use old email address lists to generate their envelope sender addresses, because we reject a lot of SMTP 'MAIL FROM' addresses that used to be real email addresses here but which have since been removed (we do eventually close some accounts). Interestingly, there is also a relatively frequently forged sender address that is a single-letter typo for a real person's email address.)

One of the lessons I draw from this little exercise in curiosity is that if we've created administrative-like email addresses in our system simply to reserve them, and we aren't using them, we should actively block their use as external sender addresses. If we want to create a dummy 'cpanel@' address, for example, we should definitely make it so that it's not accepted as a SMTP envelope sender.

(Because of some features of our mail environment, people here can create valid email addresses without our involvement (this has various entirely legitimate uses, including expendable personal email addresses). Historically this has meant that we grabbed a number of addresses simply as precautions to reserve them, without ever intending them to be 'legitimate'.)

PS: We do have a local noreply-like address, for internal use. However, spammers don't seem to forge it on their messages, perhaps because it basically never appears on email we send to actual people and thus has never made it onto various spammer lists of email addresses here.

(All of the email that we send to people has real sender and reply addresses that are read by us, even if the mail is sent by automated systems.)
