Feedback: working state, mobile rollovers, and IP filtering
I get questions from people who read my posts and sometimes I answer them in a post. This is one of those times.
...
Someone read my "war room" post and picked up on the part where I spent several weeks trying to get to the bottom of what turned out to be "kill -9 -1" nuking the world on a bunch of FB machines. They asked how I kept track of things during that time, what my working memory was, and whether it was paper, text files, IRC messages, or just remembering things.
The answer is: back when I used to do that kind of thing, I found it very useful to have a "MMDD" (hey, I'm in the US, so just pretend it's the back half of an 8601 number... you'll see why...) directory, and inside of it, I'd have some short names for whatever I was dealing with that day.
That means today would be 0629, and then something inside of it might be "fbar" or "rsw" or "webi" or something. It was just something to keep all of the crap together and yet away from other things I was doing that day, while also distinguishing it from other times I might've done a "fbar" or "webi" or whatever project, if that makes sense.
These would just live off my home directory, and yes, it would fill up with crap, but I'd batch them up when a year ended. I'd take 01xx through 12xx and move them into "2013" when 2014 began, for instance. So, as you can see, they *are* ISO-8601 dates, sorta, but it's ~/YYYY/MMDD/foobar once it's old, and it's merely ~/MMDD/foobar until then. This is a balancing act between speed and having my homedir fill up with tons of ancient crap.
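For anyone who wants to try the same layout, here's a rough sketch of it in Python. This is not the tooling I actually used (it was just mkdir and mv by hand); the function names `scratch_dir` and `archive_year` are made up for illustration, and it assumes nothing else in the home directory has a four-digit name that looks like a valid MMDD.

```python
import re
import shutil
from datetime import date
from pathlib import Path

# Matches names like 0629 or 1231, but not year directories like 2013
# (there is no month "20").
MMDD = re.compile(r"^(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])$")


def scratch_dir(home, name, today):
    """Create and return ~/MMDD/<name> for the given date."""
    path = Path(home) / today.strftime("%m%d") / name
    path.mkdir(parents=True, exist_ok=True)
    return path


def archive_year(home, year):
    """Move every ~/MMDD directory into ~/<year>/ and return the new paths.

    Meant to be run once, right after the year rolls over, so that
    everything still sitting in ~/MMDD belongs to the year just ended.
    """
    home = Path(home)
    dest = home / str(year)
    dest.mkdir(exist_ok=True)
    moved = []
    for entry in sorted(home.iterdir()):
        if entry.is_dir() and MMDD.match(entry.name):
            target = dest / entry.name
            shutil.move(str(entry), str(target))
            moved.append(target)
    return moved
```

Calling `scratch_dir(Path.home(), "fbar", date.today())` gives you today's directory for "fbar", and `archive_year(Path.home(), 2013)` is the thing you'd run when 2014 begins.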
Any time I needed scratch space for output from things, that's what I used. That might be the output from a bunch of sweeps over the fleet to look for anomalies. Let's say I ran a command that sshed into a few hundred thousand hosts to look for common items in the logs. The stdout/stderr from that would probably be there so I could hit it up multiple times without asking the job runner system for another copy. It's a lot faster that way.
But in terms of troubleshooting things and dealing with permutations, it's hard to beat paper, and I usually end up with some kind of "lab notebook" at any job I work. I can think of examples of those going back over 20 years. The only problem is that the stuff in them really properly belongs to the company so I don't tend to have them after the fact. A great many pages of context have gone into the shredders over time, and not because I didn't ask. I've asked if anyone wanted them, and the answer has always been no.
There were also the internal posts about things, and then running commentary on IRC channels, but those tend to be useful after the fact, or for getting help from other people. My own state-keeping tends to stay on something close at hand and (usually) physically tangible.
As for those <date>/<term> directories, some of them turned out to be rather handy after the fact. A lot of times, something would happen, and then time would pass, and it would break *again* in exactly the same way, and I could think "didn't we deal with these people already?", dig around, find something from six months earlier, and go "AHA!". Now armed with the date, I could pull up the right posts, IRC logs, group messages, graphs, or whatever else.
Considering how much random crap I dealt with in any given day during my time "in the barrel" (kids, ask your parents), that was the only way to make any sense of it later. Without stuff like that, by the end of the week, I'd have no idea what I had been doing on Monday or Tuesday. That's how bad the load got at points.
...
Someone else wrote in and asked if I could do something to improve the overflow calculator thing I hacked up last week. It wasn't particularly usable on mobile, and that's definitely true. I wrote it on a laptop and paid exactly zero attention to what it would look like on a weirdly small screen that's usually taller than it is wide.
It was a fair point, so I took a whack at making it suck slightly less. It's probably still terrible, but in the specific case of me holding my phone upright, it looks halfway usable now. You're no longer forced into "flyspeck-3" mode with the fonts, at least.
CSS is such a mess.
...
I occasionally hear that the site is unreachable from one spot or another. This almost certainly comes down to IP filtering on my part. Perhaps you've read about the influx of web weasels who are scraping the living shit out of everything remotely URL-shaped they can get their hands on. My stuff is certainly in that space, and they show up here regularly.
Besides that, there are also a number of networks which send nothing but straight-up abusive traffic. This is where you look at the logs and see stuff like them using every single IP address in a (v4) /24 to scan for random webshit vulnerabilities. It's like, nice job, fucko, but I don't run PHP here. Anyway, that's pretty solid evidence that a given network is not worth hearing from ever again, and so into the filter it goes.
A whole lot of this happens automatically just based on whatever traffic is sent this way first. Send bad traffic and meet the bit bucket. I don't even find out about most of it since it's constant and utterly uninteresting.
Then there are the people who run feed readers which don't play nicely. As previously described, they will get a handful of chances to slow down with 429s, and then if the web server feels like those aren't having an effect, it'll just ignore the traffic for however long it feels like. Again, I'm also not in the loop on this sort of thing. It's all automatic.
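To make the "a few 429s, then silence" idea concrete, here's a minimal sketch in Python. It is not the actual server logic. The class name `FeedPolicy`, the thresholds, and the rule for what counts as polling too fast are all assumptions picked for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MIN_INTERVAL = 3600.0  # hypothetical: minimum seconds between feed polls
MAX_STRIKES = 3        # hypothetical: how many 429s a client gets
IGNORE_FOR = 86400.0   # hypothetical: how long the client is then ignored


@dataclass
class ClientState:
    last_ok: Optional[float] = None  # time of the last request we served
    strikes: int = 0                 # too-fast requests since then
    ignore_until: float = 0.0        # drop everything before this time


class FeedPolicy:
    def __init__(self):
        self.clients = {}

    def decide(self, client, now):
        """Return 'serve', '429', or 'ignore' for a request at time `now`."""
        st = self.clients.setdefault(client, ClientState())
        if now < st.ignore_until:
            return "ignore"
        if st.last_ok is None or now - st.last_ok >= MIN_INTERVAL:
            # Well-behaved (or first) request: serve it and reset the count.
            st.last_ok = now
            st.strikes = 0
            return "serve"
        st.strikes += 1
        if st.strikes > MAX_STRIKES:
            # The 429s aren't working, so stop answering for a while.
            st.ignore_until = now + IGNORE_FOR
            st.strikes = 0
            return "ignore"
        return "429"
```

With these numbers, a client that polls every ten seconds gets served once, sees three 429s, and then gets nothing at all for a day.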
Finally, there is a new rub. I taught myself how to ingest BGP data, and so can now trivially go from an IP address to the autonomous system number of whoever's advertising it, including overlapping advertisements (like a /24 out of a bigger /20). Then I wrote something that will dump out an entire AS, and it's not hard to imagine what I do with that.
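Here's roughly what that lookup looks like, as a sketch in Python using only the standard `ipaddress` module. It skips the hard part (actually ingesting BGP data) and assumes you already have a list of (prefix, origin AS) pairs. The `ROUTES` table is made up, using benchmarking and documentation prefixes and documentation AS numbers, and the function names are hypothetical. A real table would want a radix trie instead of a linear scan.

```python
import ipaddress

# Hypothetical advertisements: a /20 with a more specific /24 carved out
# of it by a different AS, plus one unrelated /24.
ROUTES = [
    ("198.18.0.0/20", 64496),
    ("198.18.5.0/24", 64497),
    ("203.0.113.0/24", 64498),
]


def build_table(routes):
    """Parse (prefix string, ASN) pairs into (ip_network, ASN) pairs."""
    return [(ipaddress.ip_network(prefix), asn) for prefix, asn in routes]


def origins_for(ip, table):
    """Return every (network, ASN) covering `ip`, most specific first."""
    addr = ipaddress.ip_address(ip)
    hits = [
        (net, asn)
        for net, asn in table
        if net.version == addr.version and addr in net
    ]
    return sorted(hits, key=lambda hit: hit[0].prefixlen, reverse=True)


def prefixes_for_as(asn, table):
    """Dump every prefix in the table that is originated by `asn`."""
    return [net for net, origin in table if origin == asn]
```

Looking up 198.18.5.9 in that table returns both the /24 from AS 64497 and the covering /20 from AS 64496, in that order, which is the overlapping case mentioned above.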
Enough bad behavior from a host -> filter the host.
Enough bad hosts in a netblock -> filter the netblock.
Enough bad netblocks in an AS -> filter the AS. Think of it as an "AS death penalty", if you like.
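Those three rules can be sketched in a few lines of Python. The thresholds here are invented, a "netblock" is assumed to be an IPv4 /24 for simplicity, and `asn_of` is a hypothetical stand-in for whatever IP-to-AS lookup you have. The real numbers and block sizes are whatever fits the traffic you see.

```python
import ipaddress
from collections import defaultdict

HOSTS_PER_NETBLOCK = 3  # hypothetical: bad hosts needed to filter a /24
NETBLOCKS_PER_AS = 2    # hypothetical: bad /24s needed to filter an AS


def escalate(bad_hosts, asn_of):
    """Work out which netblocks and ASes should be filtered outright.

    bad_hosts: IPv4 address strings that are already filtered one by one.
    asn_of: callable taking an ip_network and returning an AS number,
            or None if the origin is unknown.
    Returns (set of ip_network, set of AS numbers).
    """
    per_block = defaultdict(set)
    for host in bad_hosts:
        # strict=False masks off the host bits: 198.51.100.7 -> 198.51.100.0/24
        block = ipaddress.ip_network(f"{host}/24", strict=False)
        per_block[block].add(host)

    bad_blocks = {
        block
        for block, hosts in per_block.items()
        if len(hosts) >= HOSTS_PER_NETBLOCK
    }

    per_as = defaultdict(set)
    for block in bad_blocks:
        asn = asn_of(block)
        if asn is not None:
            per_as[asn].add(block)

    bad_ases = {
        asn
        for asn, blocks in per_as.items()
        if len(blocks) >= NETBLOCKS_PER_AS
    }
    return bad_blocks, bad_ases
```

Feed it three bad hosts from each of two /24s that map to the same AS and you get both netblocks and the AS back. A lone bad host in some third /24 stays a host-level problem.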
As long as clueless network operators continue to let abusive customers bounce around thousands of dynamic IP addresses with not so much as a hint of SWIP data, those entire netblocks will find themselves unable to get to large swaths of the net. That's just how it is, and it's nothing new. Send crap, meet /dev/null.
Dealing with what the web has become is exhausting.