toothycat.net downtime/outage record

20 Apr 2018, 07.20-08.20
web server down as Bytemark migrated it to a Meltdown/Spectre-patched host.

28 Jun 2017, 15.00-18.30
mailserver storage device failure. Mail server was reimaged on temporary (slow, small) media, and will be reimaged again onto a faster device at lunchtime on the 30th.

9 June 2015, 08.50-16.50
our DNS provider experienced a [DDOS attack], so DNS resolution is intermittent. (fwiw at the time of writing, toothycat.net is at 89.16.173.239)

18 June 2014, 15.00-15.40
Bytemark rebooted the devices hosting www.toothycat.net.

12 May 2014, 11.00-14.00
Bytemark had to reboot and fix the host www.toothycat.net uses following a kernel panic, resulting in ~3 hours of downtime

29 October 2013
Bytemark will be migrating the webserver to new hardware; this will result in ~45 minutes of downtime.

28 May 2013, 08.00-18.00
the mailserver stopped responding; cause still undetermined, suspecting out of memory.

14 April 2013, 19.50-20.30
the storage medium in the replacement mail server did not prove sufficiently reliable and had to be replaced. There will be a further 20-30 minutes of downtime on Tuesday night when we replace the storage again - the spare device in use right now is quite slow.

11 April 2013
we are having electricity supply issues. Mail service may be intermittent this afternoon as the electricians investigate.
You mean you don't have everything on UPS? Shocking! --Admiral
UPS doesn't help if the power's off for several hours while the engineer tests the wiring / replaces the consumer unit... ^^; --MoonShadow
You mean you don't have everything on vastly over-sized UPS? Shocking! (My UPS doesn't seem to be U any more. It is also stretching the definition of PS and making ominous humming sounds.) --Admiral
We don't have a petrol generator out back, either ;) We do have solar PV cells, but they're the wrong side of the consumer unit to help here ^^; --MoonShadow

9 April 2013, 06.?? - 10 April 2013, 02.00
the mail server was destroyed by water damage. A replacement was brought online. No data was lost.

18 February 2013, 14.45 - 16.45
there will be a power outage at the house, hence no mail server during this time

30 April 2012, 08.33 - 1 May 2012, 19.30
ADSL connection (and hence lucien.toothycat.net) was down. Ruled out everything our side of the socket (replaced router, microfilters and all cabling); TalkTalk were unable to determine the cause. Came back of its own accord.

18 October 2011, 21.25 - 19 October 2011, 19.00
outage at Pipex (hence no mail). Sorted itself out some time after the fault was repaired...

02 October 2011, 09.20 - 11.30
FreeParking, our DNS provider, was down.

12 September 2011 - 14 September 2011
our telephone line (and hence lucien.toothycat.net) was down, BT took three days to send an engineer out to reattach a wire

02 February 2011, 09.20 - 10.20
ADSL connectivity (and hence lucien.toothycat.net) was down, no idea why yet

10 November 2010, 02.25 - 18.30
BT wholesale network outages, hence no lucien.toothycat.net.

18th October 2010, 17.50 - 19.00
ADSL connectivity (and hence lucien.toothycat.net) was down due to network problems at ISP (actually, connectivity existed, but at ~200bps (sic) so no realistic way to use any services due to timeouts). To save me finding it over and over again, the Opal support number is 0800 298 2981. This seems to be quite frequent since Opal bought Pipex; I may have to start looking for another ISP. -- MoonShadow

26th September 2010, 18:00 - 27th September early AM
ADSL connectivity (and hence lucien.toothycat.net) down due to massive outages at ISP. Resolved by ISP.

9 August 2010
Bytemark will be taking the webserver down for 20-30 minutes at some point during the day.

28 June 2010, 14.03 - 18.10
Connection to lucien/kerberos/joggler was dropped. Resolved by power cycling the ADSL modem. --MoonShadow

1 June 2010, 14.40 - 15.20
electricity supply was disconnected, and the mailserver therefore unavailable. --MoonShadow

19 May 2010
lucien appears to be down. Any news? --Admiral
Uh, it's up from this end. We did have a brief internet outage during the night, though. Let me know if it's still down for you now. --MoonShadow
A while back, I couldn't ping or ssh. Now I just can't ping. --Admiral

18 March 2010, 08.30 - 14.30
- our electricity supply will be disconnected during this time so EDF can carry out maintenance; the mailserver will therefore be unavailable.

30 January, 2010 - 31 January, 2010
- upgrade from Debian Etch to Lenny. Current status: running Lenny, with Apache 1.3x. Some perl modules may be missing at this time. libphp5/apache1.3 is not in Lenny. A decision will be taken shortly whether to carry on with attempt to migrate to Apache 2.x at this time or to compile a 1.3x version of libphp5.

23 December, 2009 - 24 December, 2009
our DNS provider, in their infinite wisdom, decided to reset one of our MX records, causing incoming mail to be spooled to our secondary MX. This should now be fixed, and the backlog should start being sent to us as the DNS changes propagate - hopefully this will happen before Black Cat start bouncing the queued mail :(

18 December, 2009
we're having brownouts and power blips at the rate of about one an hour or so. Our router is not currently attached to a UPS, so the mailserver connection will drop briefly whenever one happens.

10 August, 2009, 09.30 - 10.25
freeparking.co.uk and its DNS servers were down, rendering toothycat.net inaccessible to anyone that did not have the DNS record cached.

06 June, 2009, 04.30 - 12.10
ADSL router dropped connection; powercycled. Wondering about getting a timer plug for it, to powercycle it automatically say once a week at 4am or something; or maybe one that does it based on pings.
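The ping-based variant of that idea could look roughly like the following. This is only a sketch in Python under stated assumptions: `probe` and `power_cycle` are hypothetical callables (one that reports whether a ping succeeded, one that toggles a controllable mains switch), and the threshold of three consecutive failures is an arbitrary illustrative choice.

```python
from typing import Callable


class PingWatchdog:
    """Trigger a power cycle after several consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 power_cycle: Callable[[], None],
                 max_failures: int = 3) -> None:
        self.probe = probe              # hypothetical: returns True if the ping succeeded
        self.power_cycle = power_cycle  # hypothetical: toggles the router's power
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True if a power cycle was triggered."""
        if self.probe():
            self.failures = 0           # any success resets the count
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.power_cycle()
            self.failures = 0           # start counting afresh after the restart
            return True
        return False
```

Calling `tick()` once a minute from a loop or cron job would restart the router only after a few failed probes in a row, so a single dropped ping does not trigger a needless power cycle.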

27 May, 2009 16.00 - 29 May, 2009 02.30
Web server outage following [data loss] at Bytemark and subsequent restoration from a backup taken mid-March.

1 January, 2009 4.00 - 16:00
Connection lost, Pipex outage.

7 November, 2008 17.00 - 9 Nov., 18.10
ADSL router dropped connection, powercycled on our return from Prague. Investigating cause. Unfortunately, it was down for over 48 hours, so Mythic Beasts are likely to have started bouncing mail :(

1 September, 2008 14.00 - 17.30
ADSL router dropped connection, powercycled at 17:30 by Wikivic.

23 April, 2008 11.30 - 13.00
Cleaner turned off power to ADSL router, cutting mailserver connectivity.

29 March, 2008
Mail server has now been moved to being hosted from a PC in our house. The change should be seamless. Please let us know if you encounter problems.

28 March, 2008
Web server has now been moved to [Bytemark]. The transition should be seamless; please let me know if you spot that something has stopped working. People whose domains we provide webspace for should now change all mentions of 80.77.247.11 to 89.16.173.239 in their DNS records; please let me know when you have done this, or if you would like some help; I have also emailed all of you. [DSL Warehouse]'s website lied about the ADSL modem I ordered being in stock; I have cancelled the order and ordered from elsewhere, but it is not clear whether it will arrive tomorrow or Monday, so mail is still hosted on Rob's machine for now.
Pipex refuse to let us change the reverse DNS records for our static IP block; apparently, "this is not a service we provide". I will see if I can configure the mail server to send outbound mail through their smarthost. Failing that, I shall have to look for alternative solutions. No-one else provides ADSL at our exchange :(

25 March, 2008, 11.20-29 March 2008
Intermittent outages due to CPU fan failure and consequent overheating. The fan has still not been replaced; the account with the colo will be terminated in a few weeks due to their poor service. Due to the CPU overheating toothycat.net will continue to glitch intermittently until we migrate. MoonShadow will be migrating toothycat.net away from RobHu's servers over the next few days; the current plan is to use a [Bytemark] VM for the website and serve mail from a PC in our house over ADSL. We'll let people who have their own domains pointing at us know when to switch DNS records.

6am December 30, 2007 - 2.30pm
IDE controller failure in machine running toothycat.net VMs appears to have caused said machine to lock up.

4pm Saturday July 7, 2007 - 5pm Saturday July 7, 2007
There will be a brief outage of the web and email servers as we migrate to RobHu's host.

5.30 June 6, 2007 - 15.50 June 7, 2007
Fault at BT telephone exchange leading to loss of telephone and ADSL service at TheFlat, and consequent outage. Note that BT take a day and a half just to get around to sending an engineer.

10.45 May 31, 2007 - 15.20 May 31, 2007
[BT network fault] leading to DSL outage.


4.00am May 20, 2007 - 5.30am May 21, 2007
Easynet DSL outage.

Mar 8, 2007
10.58 - 12.36pm: Webserver became unusable after several hundred simultaneous instances of the wiki script were started, filling memory and swap and using up all available CPU time. Took 45 minutes for the reboot command to complete. Since Apache was not able to log the requests, MoonShadow has not yet been able to ban the bot involved, so problem may recur. Looking into ways of introducing request rate limiting.
Just limit the number of simultaneous Apache processes/threads (depending on your version). The OS will queue a certain number of incoming connections so there shouldn't be much of a problem setting this limit quite low. It's only really useful setting it high if you will have clients with a much slower download rate than your total upload rate. I think the setting is called "MaxClients". --Admiral
This does not appear to be sufficient: the number of instances of wiki.pl I observed running was several times the current MaxClients setting. - MoonShadow
That's odd. AIUI that shouldn't happen, unless those instances were un-collected zombies or something. Out of interest, what is it set to? I would have thought something like five would be sufficient, although that does open opportunities for people to block your web site by just opening five idle connections. I suppose another option would be for the wiki.pl script to block on a mutex before beginning the RAM-intensive business so only one does RAM-intensive stuff at a time. --Admiral
It's 75 right now. I observed 300+ instances of the process. --MoonShadow
I suppose this could happen if the bot disconnects before receiving the whole page. Which would happen if the server got slow because it was dealing with too many simultaneous requests. But then I would have expected to have seen something in the log. --Admiral
Apache doesn't write the log entry until it's finished dealing with the request, one way or another. So as soon as the server starts thrashing, very little gets logged. I'll take MaxClients down a bit and see if that has any visible effect... - MoonShadow
One trick I read about ages ago was to use NetFilter to log every 10000th packet (possibly SYN-packet), which gives you an idea where DoS attacks may be coming from. --B, in delayed-reaction mode
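Admiral's suggestion of blocking on a mutex before the RAM-intensive work could be sketched as below. This is an illustration in Python rather than the Perl of wiki.pl, it assumes a Unix host (it relies on `flock`), and the lock file path is a hypothetical name chosen by the caller; it is not how the wiki script actually behaves.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until no other holder remains
        yield
    finally:
        os.close(fd)                     # closing the descriptor releases the lock


def try_lock(path):
    """Return True if the lock is free right now, without waiting for it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:              # somebody else holds the lock
        return False
    finally:
        os.close(fd)
```

A script would wrap only its memory-hungry section in `with exclusive_lock(...)`, so that extra instances queue up cheaply instead of all allocating at once. A non-blocking check like `try_lock` would let an instance give up early with an error page rather than wait.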

Jan 23, 2007
13.50 - 17.05pm: Nationwide Easynet [DSL outage].

Oct 16, 2006
19.28 - 19.58pm: Power cut. Mail service was restored within a few minutes, web took about 20 minutes to bring back up.

Mar 15, 2006
Freeparking.co.uk appear to have reset our zone file to default values. toothycat.net operation will be intermittent for about six hours while we wait for stuff to revert.
It's great to see tc.net back. Is the ToothyChat still down, though? I get 404s from IIS (!) trying to access https://www.toothycat.net/~sham/chat.html ... --AC
Works fine here. If you can get to any toothycat.net URL (which you can, since you posted!), you should be able to see all of them; otherwise, it's not related to the Freeparking/DNS problem. - MoonShadow
I can get to the site fine but I can't connect to mail through POP3.  Is this connected or do I have some sort of problem on my end? --K
Can you see https://lucien.toothycat.net/? If you can, the problem is at your end. - MoonShadow
I can get to that (and used it to check my mail, thanks) but if I try to go to mail.toothycat.net, through browser or e-mail client, it returns the message "Could not connect to server mail.toothycat.net; the connection was refused."  I'm rather puzzled, since I'm still receiving mail from a different mail server and two separate machines at this end give the same message :/ --K
Where did you get 'mail.toothycat.net' from? The only address I have ever listed for the mailserver is lucien.toothycat.net. mail.toothycat.net currently happens to be pointing at our secondary MX, but I make no guarantees about that entry pointing at anything in particular except possibly some machine somewhere that's capable of forwarding SMTP to us.. - MoonShadow
Sorry, I probably guessed that value when setting up my mail at some point.  It worked fine up until a few days ago when something must have changed (I think on your end, but I could be wrong).  Anyway, now working, thanks. --K
Ah, sorry - that actually *was* my fault, by the looks of it; I was setting mail and mail-2 to point at our and Black Cat's servers respectively before freeparking trashed our DNS records, and it looks like I reentered them the other way round. I hadn't realised anyone else was relying on them because I'd never mentioned them, and it doesn't matter for the purposes of SMTP so long as the priority in the MX record is correct. I probably oughta switch them, though, because "mail" is a reasonable guess and it behaves in a surprising fashion right now which is bad. - MoonShadow
It's probably a stupid default in some silly piece of software. -- Senji
Thanks Senji... --K

Mar 11, 2006
2.30am - 12.15pm: BT cluster servers down, affecting all Easynet UK DSL customers.

Dec 30, 2005
BT's servers in London were operating intermittently for much of the day, affecting DSL connectivity including toothycat.net - MoonShadow

Dec 13, 2005, around 8am
the DNS servers were down for a few minutes; toothycat.net was still accessible, but only to people who know the IP address ;) - MoonShadow

Dec 13, 2005
14:15pm - 15:15pm: toothycat.net was inaccessible for exactly an hour, then came up again on its own - implying a problem with DSL connectivity. - MoonShadow

Nov 15, 2005
10.40pm - 11.15pm: Wiki briefly became uneditable due to MoonShadow's muppetry (recent changes log was not writeable by HTTP server's user, attempts to save pages left stale locks). Fixed - thanks to Kazuhiko for sending a timely email, it would have stayed like that overnight otherwise.

Oct 03, 2005
Not downtime as such, but I am currently testing a [virtual host] configuration. Please let me know if anything is broken on the main site. - MoonShadow

Aug 09, 2005 - 10.35am-1pmish
No access to ToothyCat domain.  --Vitenka
Wow, someone got there before me! Traceroute showed a loop inside Easynet during the time, implying the connection to the ADSL modem was dropped. Being at work, there was little else I could do. I wasn't expecting it to come back on its own actually - last time it did this, I needed to reset the modem. - MoonShadow

Jun. 12, 2005 - 15.16pm
Loss of RecentChanges log due to race condition between maintenance scripts.
I've seen the term RaceCondition a couple of times, but I'm not sure I've ever known what it is. Care to enlighten me? --CH
In general, it means that two processes are running at the same time, and odd things happen depending upon which finishes first.  Generally this happens because the two (or more) things check something common between them and then change it at some point later.  Things would work OK if the two processes took place one after the other, but things go wrong because they've ended up being interleaved and weren't designed for it.  Noddy example - two things both doing 'x=x+1'.  First one reads in x (4, say) - second one reads in x (4 again) - first one adds 1 to it (5) and writes it back.  Second one adds 1 to its local copy (5) and writes it back - giving the absurd result that adding 1 to 4 twice gives 5.  --Vitenka  (Usually a race condition trashes a file completely or crashes.)
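Vitenka's 'x=x+1' example can be replayed in a few lines. The sketch below is Python and purely illustrative: the first function walks through the unlucky ordering by hand so the result is deterministic, and the second shows the usual remedy of holding a lock around each read-modify-write. The function names are made up for this example and have nothing to do with the maintenance scripts that lost the log.

```python
import threading


def interleaved_increments(x):
    """Replay the unlucky ordering by hand: both readers see the old value."""
    first = x        # first process reads x
    second = x       # second process reads x before the first has written
    x = first + 1    # first process writes its result back
    x = second + 1   # second process overwrites it with a stale result
    return x


def locked_increments(x, workers=2):
    """The same increments, but each read-modify-write holds a lock."""
    state = {"x": x}
    lock = threading.Lock()

    def worker():
        with lock:   # nobody else can read until this worker has written
            state["x"] = state["x"] + 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["x"]
```

Starting from 4, the interleaved version ends at 5 and the locked version at 6.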

Jun. 5, 2005 - 21.30-22.00
Webserver decided to reboot itself a little after 21.30. No idea why yet. It didn't come back up because there was no keyboard plugged in. Condition discovered at 22.00, appropriate BIOS option duly clobbered; should come back unattended if this happens again. Don't even have to clear any locks this time since they live in /tmp nowadays which is mounted on a ramdrive.. - MoonShadow
There was a power cut a little further North of you around 8:30 for about half an hour, and a power blip around 9:30 that rebooted my computer. This may be the cause. You may not have had the power cut, but you might have had the blip. --Admiral

February 14, 2005 - 1pm - 7pm
ADSL router dead. Phoned Easynet, who said BT would replace it; spent two hours on hold to BT. They can't replace the router until Thursday morning. Went to PC World and bought a router to last until then, thus demonstrating the sheer strength of my addiction to the net. It can be a hot backup for when BT-managed one next fails, since BT appear to generally suck at response times. Colocated machines will be down until Thursday since new router does not have a built-in hub, unless their owners want them badly enough to donate a hub to the cause. There will be a brief outage Thursday morning when the BT engineer turns up. - MoonShadow, who is currently trying to decide whether to leave the new modem attached on Thursday morning so we have more uptime, or to remove it as a potential source of confusion.
I'd remove it.  The last thing you want to do is confuse them and leave the door open for more delays... --K
BT engineer duly arrived, replaced router, complimented the snake and left. - MoonShadow

February 9, 2005 - 23:50-23:55
Migration of webserver to new hardware, reindexing and reattaching server to DMZ is now complete. - MoonShadow

February 9, 2005 - 6.30am-9am
'nother HD error. After two years and one week of service, webserver retired in favour of backup machine (the proper one I had time to put together, not the little laptop described below; although that is still available if there turns out to be a problem with the new hardware). There should be little visible change, except for performance (it's only got a third of the RAM the old server had (though I intend to get some more, I probably won't have time until the weekend), *and* it's rebuilding the index in the background). Since I don't back up the full text index, indexed search, although usable, will return only partial results until the index has been fully rebuilt (1-2 days). There will be a brief outage tonight when I move the new server to a more suitable location so I can switch it over to the DMZ (currently it's positioned precariously in our front room and serving from the internal LAN); and another when I do get the RAM so I can install it. - MoonShadow

January 30, 2005 - 7am-11pm
More webserver HD errors. I have a standby machine with a mirror of the site ready to take over at very short notice; however, it is not a high enough spec to provide full text indexing (in particular, I need around 2Gb more HD space for the index, and it is a laptop so not easily upgradeable), and I am not certain how well it will stand up to the load of running the wiki (only 64Mb of RAM - that's basically gonna keel over dead at lunchtimes), so I will not swap it in until I have to. With any luck I will be able to get a proper replacement box put together before it comes to that (waiting for time and cash, mainly time). - MoonShadow
I told you to go to bed! Bad MoonShadow! - SunKitten

January 24, 2005
3am - 8am; mailserver and webserver not attached to network because of dead DMZ router PSU. PSU replaced with spare pulled from a LAN router; H Gee do replacements. Don't buy cheap Buffalo routers, people, the PSUs are cr*p. I've had two die on me in the last month, both bought around the same time. - MoonShadow

October 9, 2004
Latest on trying to build a replacement webserver on the cheap: following two days of trawling the web, newsgroups and trying everything I could find / think of, the BIOS on the P5A-B motherboard got hosed by a duff update and the machine became unbootable. I can't bring it up using the boot block, and I can't tell if that's because the boot block is screwed or something else is wrong. Time for plan B: I have another candidate motherboard (AB-BX6) and processor, but need to acquire a proper ATX case for it since none of the cases lying around the house have an opening in the right place to let me plug a keyboard in. Actually, I've just had an idea - I do have a case with mountings in the right places; all it's lacking is a 10mm hole in the rear for the keyboard - which I won't actually need while everything's up and running; I can set the whole thing up with the rear plate removed, being careful not to dislodge the network and video cards (normally attached to the rear plate) while the power is on, then unplug the keyboard, ensure it boots without, reattach the rear plate and administrate it remotely from that point; which will at least work until the next power cut happens and I need console access to talk it through fsck. Hm. I think I'll get onto that tomorrow evening since the computer shops will be shut anyway.. - MoonShadow

October 7, 2004
still waiting for replacement webserver machine.
Replacement machine constructed; has FreeDOS installed, which works fine; if MoonShadow tries to boot any of his half-dozen Linux CDs, the bootloader is fine, the kernel and initrd appear to decompress, but the machine spontaneously reboots or shuts down just after the kernel banners appear and the processor is (correctly) detected. Will investigate tonight. - MoonShadow
I had this happening to me.  I think it is to do with ACPI defaults.  I believe that a kernel can be built with 'safe' power management, and that this works better.  Personally, I just switched distro.  --Vitenka
Could try adding acpi=off to the kernel command line to disable it. --Bobacus
acpi=off, acpi=force (as some newsgroups suggest for the asus p5a) and apm=off have no effect. All the Linux CDs I've tried are based on 2.4.x kernels; considered trying digging around for a 2.2.x-based Debian Woody installer but frankly not convinced I want hardware dodgy enough it has trouble with recent kernels as the new webserver. Going to try swapping RAM chips; failing that, going to assume the motherboard or CPU or both are too flaky to bother with and switch to the other spare motherboard, which will involve procuring an ATX case from somewhere. - MoonShadow

October 6, 2004
webserver is up for now but MoonShadow cannot guarantee reliability. MoonShadow will be shopping for replacement hardware and will hopefully get a replacement server set up and seamlessly switched in during the evening.

October 5, 2004
webserver has disk errors. Due to murphy's law, of course, I find out about this at the *start* of the working day when I know I'll not be able to do anything about it for another eight hours. Have remounted it readwrite, but we're sitting on a timebomb until I can e2fsck it because the filesystem has a couple of duff entries; and it looks like it was a hardware error that triggered it so it's presumably gonna fail all the way soon. I have dwindling cash reserves and no car (SunKitten is using it tonight) so don't know how much I'm gonna be able to do. Bleh. *cries* - MoonShadow
Planned downtime: the webserver will be taken down around 6.30 tonight, probably for around an hour, so MoonShadow can e2fsck the root partition and (sob) run a bad sector scan.
Is this a tiny hard-drive, or can I have whatever you think can do a full sector scan in under an hour?  --Vitenka
It's only 6Gb; surely it can't take that long? ^^; *naive* - MoonShadow
No idea how it scales, but 100gig took me about ten hours, so you might get away with it.  --Vitenka  (Drat.  I'm still on the lookout for something faster.)
In the end, it was down from 7.30 to 10pm; I did three full scans in that time to make sure the errors were caused by specific sectors rather than the hardware in general. - MoonShadow
I would replace it as soon as possible - it is very unlikely that the problem will not spread. Generally, if the computer ever actually sees a bad sector, that means that the state of the disc surface is so bad that the (quite substantial) error correction/avoidance/rearranging system in the hard drive cannot cope. Be afraid. Be very afraid. --Admiral
Oh, I intend to. Well, I intend to replace the entire machine, since the current HD is a 2.5" one and I can't really get a replacement one of those very cheaply or easily. But I can't do that tonight since the shops are shut ^^; - MoonShadow
I don't know quite how dwindled your cash reserves are, but hard drives have dropped in price significantly over the last year or two. £30 for 40Gb or £35 for 80Gb. http://www.pcindex.co.uk/ --Admiral
The webserver is my old laptop, unfortunately - SunKitten
Yah. I'm currently looking to build a spare machine that can take over when something next fails, the way I had a standby before which is now the mailserver. I am missing an ATX PSU and a hard drive from the equation; I will probably buy those new and in the case of the PSU reasonably decent, since those are the bits that have failed most often over toothycat.net's three-year experience; I will probably buy them from a physical shop like PC World (who, if delivery is taken into account, reasonably approximate online prices for small IDE hard drives anyway) since it makes returning them easier when they fail. I believe I have everything else. Since SunKitten hasn't been paid this month yet due to administrative mix-up I was hoping to wait a payday or two before shelling out the 80+ quid total, but with things as they are I might not. I'm grumbling, basically. - MoonShadow
Couldn't get to PC World before it shut. Server HD has four bad sectors; always the same four each time, so it's not the controller, memory etc. and it's not spreading, so hopefully it'll be stable for a while. Worryingly, e2fsck -c barfs on all of them but only marks the two that had data in them as bad - which presumably implies the server will blow up again when it fills up as far as the next one. Hopefully I'll have it replaced by then. - MoonShadow
I have a spare ATX PSU you can have if you want. --qqzm
That'd be really quite useful at this point, actually. Don't suppose I could nab it tonight? - MoonShadow
Certainly. I'll be in from 1730 onwards. Or you could ask Alex nicely to pick it up on his way to WednesdayAnime :) --qqzm

October 4, 2004
toothycat.net was down from ~11am until 9.15 pm following a power cut. The mailserver did not come back up after the powercut, neither did the ADSL modem. Both are now functional once more. ADSL modem would not power on after repeated attempts; have just tried again after having left it turned off for a few hours, and it just worked. *shrug* will have to phone EasyDSL to cancel the engineer callout. Don't suppose anyone has any UPSes surplus to requirements that they'd be willing to donate to the cause? ;) - MoonShadow
umm....let me get back to you...macloud

August 5, 2004
toothycat.net will be down sometime between 8am and noon (Electricity meter being changed by Siemens, apparently it's at the end of its life; they are unable to narrow the time down any further than that).
Now done. We very nearly forgot - I was on my way out... thankfully everything booted up happily; I was a little worried ^^;; - SunKitten (unwilling PFY-for-the-day :)
Sorry - I totally forgot :( ;_; Well done ^^ *hug* - MoonShadow
We got doubly lucky. I couldn't find the spare key, but the padlock wasn't actually locked... - SunKitten

July 22nd 2004
The [ticker] appears to be unable to fetch the RSS feed. Any idea why? --AlexChurchill
MSIE's XML parser didn't like the character references in 'tenka's edit comment. Am now stripping character references. - MoonShadow
Or, in other words, my attack worked?  :)  --Vitenka
Well, if you intended a DOS attack.. ;) - MoonShadow
Well, I intended a 'see what happens' - There is a rootable attack available, by the way.  Unpatched IE can be exploited by a sufficiently evil jpeg, and you've got the image uploader facility and Image: tags.  --Vitenka
D'you have a reference? Last I heard, there were a number of attacks involving things *pretending* to be images and relying on the fact that MSIE ignores the mime type supplied by the server and does its own thing based on the first few dozen bytes of the file - the /ImageServer actually checks the header of whatever you upload for this reason. Is it one of those (in which case that won't work), or is it something actually sufficiently b0rked in the JPEG decoder that a real malformed JPEG triggers it (in which case I don't see how I can defend against it in general without banning JPEG images)? - MoonShadow
I thought it was the jpeg claiming to be a certain size and then actually turning out to be vastly larger (thus overrunning the buffer and doing its own thing)  Then again, if you're checking headers I guess you've already defended against that.  There was a similar png attack, but you don't allow png uploads last I looked, so that's ok.  --Vitenka
Hm. I'm not actually checking the headers for sanity, just decoding enough to make sure MSIE will treat them as images rather than HTML, executable code or anything else. And I do allow PNGs. Will go and Google to see what's involved.. - MoonShadow
Can't find any JPEG ones. Don't think I can defend against the [PNG] one in any fashion other than banning PNGs. :/ - MoonShadow
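The kind of header check described in this thread, looking at the first few bytes of an upload rather than trusting its claimed type, might be sketched as follows. This is a guess at the general technique in Python, not the /ImageServer's actual code; the function and table names are invented for illustration. As the thread notes, it only rejects files pretending to be images and does nothing about a genuine but malformed JPEG or PNG.

```python
from typing import Optional

# Leading byte signatures for the two formats discussed in the thread.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return the image type implied by the leading bytes, or None."""
    for kind, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):   # startswith accepts a tuple of prefixes
            return kind
    return None
```

An uploader would reject anything for which `sniff_image_type` returns `None`, or whose sniffed type disagrees with the file extension.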

Jun. 2, ~2pm onwards
mailserver taken down after filesystem became readonly; hasn't come back up.
7pm: have taken snapshots of data, placed server back on the net so people can check mail received up to the snapshot, have not restarted SMTP yet (since any mail I receive now won't be in the snapshot; our secondary MX will spool incoming mail). Will shortly go buy new hard disk, put it in and copy the partition, then boot and assuming everything works restart SMTP.
11pm: trying to copy the partition failed. Old hard drive is officially dead as a doornail - the BIOS refuses to recognise it. Am about to install debian/stable on new hard drive, then recover from snapshots. We'll have basic functionality back but I might leave the webmail and IRC for tomorrow night.
12am: woody netinst is crunching away. Sometimes I really wish we could afford a 2 megabit connection. I think I'm gonna start scouring ucam.giveaway for parts again - I miss having a machine with debian preinstalled on standby.. ^^;
1.15am: spool and home restored from snapshots; no-one's lost any data unless you drafted/sent mail after they were taken 7pm (in which case you'll have lost the local copies - sorry..). Configuration restored (it's about time I updated that SSL certificate..) Spam's coming in, so I shall assume mail works :) If anything's still broken, please say.
Yeay, thank you! --M-A

Apr. 28, 15.30pm onwards
webserver operation is intermittent. OSGirls is currently the top Google result for a search of the same name, and we are experiencing the SlashDot effect. Webserver was down briefly after WednesdayAnime and now has rather more RAM in. - MoonShadow

Apr. 14 2004, 15.50 - 19.00
toothycat.net was down, reasons unknown; came back online when ADSL modem was power cycled.
I retract any suggestion that Toothycat close down in protest against SoftwarePatents! I can't take the withdrawal symptoms any more!!! - CorkScrew

Feb. 20 2004
Mailserver will be going down for a few minutes around 9pm for a kernel upgrade.
9:06pm: mailserver successfully survived upgrade and reboot.

Jan. 7 2004
Mailserver will be going down for a few minutes around 10am for a kernel upgrade ([mremap vulnerability]). The webserver is not vulnerable.
10:09am: mailserver successfully survived upgrade and reboot :) should be back up.

Dec. 1
Mailserver and webserver will be going down for a few minutes between 11pm and midnight due to a kernel upgrade (user->root escalation vulnerability patch).

Nov. 27
Firewall PSU died due to dead CPU fan. Cannibalized lots of existing machines. Wasted two hours due to total stupidity. Long live the new firewall..
I applaud your dedication!  Now go get some sleep!  --K

Nov. 20
more intermittent downtime problems; about once an hour, connection drops for 2-3 minutes. About to email easynet tech support.
This has been happening more than just today...or at least, the Cambridge webcache has told me it can't find toothycat over many short periods of time the last three days. --SF
I've had this with my SSH to the /MailServer also dropping frequently the past couple of days.  --AC
(Nov. 21) Easynet have escalated the problem to BT, who are currently testing the line. We have discovered that the router has trouble responding to HTTP requests for its admin page at times when the connection drops, and Easynet now suspect a problem with the router.

Sep. 29, 11.43 - 13.42
toothycat.net was unavailable, traceroute showed two of Easynet's gateways busily routing to each other during this time.

Sep. 16
(not actual downtime, but something to bear in mind..) [Verisign] have caused all DNS queries for unassigned .com and .net domains to resolve to [sitefinder.verisign.com]. This presumably means that NXDOMAIN will *never* be returned for DNS lookups to .com and .net domains, and could break scripts looking to, say, ping stuff to determine whether it's up or not. toothycat.net is affected - diagnosing DNS problems with our domain will suddenly become a little harder. To repeat, DNS error messages will no longer be displayed - if you are ever redirected to [this] page when you try to access toothycat.net, and you did not misspell the URL, this means our DNS is down and we want to know - thanks.. ;) Then again, we've only had one day of downtime due to DNS issues in all the time that toothycat.net has been around, so *shrug*
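For scripts that relied on NXDOMAIN, one possible workaround is to look up a random name that should not exist and compare the answers. This is only a rough sketch: `is_wildcard_answer` and `resolve` are names made up for illustration, and it assumes the wildcard hands every unassigned name the same address(es).

```python
import socket
import uuid


def resolve(name):
    """Return the set of IPv4 addresses for name, or an empty set if the lookup fails."""
    try:
        return set(socket.gethostbyname_ex(name)[2])
    except OSError:
        return set()


def is_wildcard_answer(name, resolver=resolve):
    """Guess whether name only resolves because its TLD answers for every name."""
    tld = name.rsplit(".", 1)[-1]
    # A random label that should not be registered; if it resolves anyway,
    # the TLD is (assumed to be) wildcarded.
    probe = "nx-%s.%s" % (uuid.uuid4().hex, tld)
    wildcard = resolver(probe)
    answer = resolver(name)
    # Treat the answer as "really NXDOMAIN" when it matches the wildcard's addresses.
    return bool(wildcard) and bool(answer) and answer <= wildcard
```

The resolver is passed in as a parameter so the check can be tried out without touching the network; a name that genuinely lives on the same address as the wildcard would be misreported.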

Sep. 11
toothycat.net will be down this evening between 9pm and 10pm for a firewall kernel upgrade.

Sep. 8
[demon.net have recently started blacklisting any IP address which is the source of a malformed SMTP HELO]. A sizable minority of small ISPs (toothycat.net, until last night - natch..) have their MTAs misconfigured in a way that will cause Demon to blacklist them and reject legitimate mail. AFAICT, although entries do eventually time out, there is no quicker way of getting oneself off the blacklist once one is on it. In practice, this means mail sent from toothycat.net to addresses hosted by demon.net will bounce probably all day today.
Update: as of 9pm, demon.net are no longer blocking mail from us to their servers. - MoonShadow
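The entry above does not say which HELO form Demon objected to. As a rough sketch only, assuming the usual SMTP expectation that the HELO argument is either a fully qualified host name or a bracketed IPv4 address literal, a sanity check might look like this (`helo_looks_wellformed` is a made-up name, and IPv6 literals are ignored):

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
LABEL_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def helo_looks_wellformed(argument):
    """Return True if argument looks like an FQDN or a bracketed IPv4 literal."""
    if argument.startswith("[") and argument.endswith("]"):
        try:
            ipaddress.IPv4Address(argument[1:-1])
            return True
        except ValueError:
            return False
    try:
        # A bare IP address without brackets is not a valid HELO argument.
        ipaddress.ip_address(argument)
        return False
    except ValueError:
        pass
    labels = argument.split(".")
    # Require at least two labels so unqualified names like "localhost" fail.
    return len(labels) >= 2 and all(LABEL_RE.match(label) for label in labels)
```

This only checks the shape of the string; it says nothing about whether the name actually resolves, which a receiving server may also care about.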

Sep. 2
The /MailServer will be down for about half an hour between 6pm and 7pm tonight (MoonShadow will plug in more RAM and run a few tests to make sure the RAM chips aren't dodgy)

Jul 4, 5.30pm - Jul 5, 3.10am
Power fluctuations at Easynet core site caused widespread network problems, according to their tech support.

Jun 23, ~4.30-5pm
Intermittent problems. toothycat.net seems to become inaccessible for a few minutes at a time; a traceroute during these times gives something like the following:
  1  finch-core-1.router.jellybaby.net (194.159.247.129)  1.075 ms  0.942 ms  0.900 ms
  2  finch-service-1-13.router.demon.net (194.159.253.247)  2.192 ms  1.748 ms  2.018 ms
  3  finch-backbone-11-162.router.demon.net (194.159.36.65)  26.403 ms  3.689 ms  3.708 ms
  4  anchor-backbone-11.router.demon.net (194.159.7.5)  6.510 ms  5.515 ms  8.113 ms
  5  anchor-border-1-1-0-2-551.router.demon.net (194.159.36.226)  5.036 ms  4.881 ms  4.480 ms
  6  ge0-1-0-0.er0.thlon.uk.easynet.net (195.66.224.43)  3.279 ms  2.770 ms  4.055 ms
  7  ge0-1-0-0.br1.thlon.uk.easynet.net (195.40.0.134)  3.505 ms  8.644 ms  15.111 ms
  8  so0-0-0-0.br1.bllon.uk.easynet.net (212.135.16.6)  14.967 ms  3.843 ms  4.501 ms
  9  ge0-3-0-0.br0.bllon.uk.easynet.net (212.135.5.129)  3.606 ms  3.388 ms  6.205 ms
 10  so0-2-0-0.br0.wslon.uk.easynet.net (212.135.16.93)  11.381 ms  26.235 ms  3.447 ms
 11  bt-dsl1.core.44whit.router.easynet.net (212.134.10.3)  4.857 ms  3.066 ms  4.817 ms
 12  ge0-3-0-0.br1.wslon.uk.easynet.net (212.134.10.14)  9.232 ms  12.688 ms  5.771 ms
 13  bt-dsl1.core.44whit.router.easynet.net (212.134.10.3)  6.419 ms  28.444 ms  4.967 ms
 14  ge0-3-0-0.br1.wslon.uk.easynet.net (212.134.10.14)  5.223 ms  3.579 ms  3.452 ms
 15  bt-dsl1.core.44whit.router.easynet.net (212.134.10.3)  5.707 ms  4.790 ms  5.035 ms
.
.
(loops forever)
Previously, this kind of trace has been diagnosed by support staff as a timed-out DSL connection, requiring MoonShadow to power cycle the DSL modem; however, this time it is intermittent - implying, perhaps, that there is some sort of intermittent fault between the DSL modem and the exchange. Anyone have any useful info on toothycat.net visibility? Kazuhiko - this was probably the cause of the spurious alerts you received earlier. MoonShadow will be contacting tech support tonight.
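The looping pattern in the trace above (hops 11 onwards bouncing between the same two Easynet routers) can be spotted mechanically. A minimal sketch, assuming traceroute output in the format shown, with the address in parentheses; `find_routing_loop` is a name invented here:

```python
import re

# Captures the hop number and the IPv4 address in parentheses on a hop line.
HOP_RE = re.compile(r"^\s*(\d+)\s+\S+\s+\((\d{1,3}(?:\.\d{1,3}){3})\)")


def find_routing_loop(trace_text):
    """Return (first_hop, repeat_hop, address) for the first repeated address, or None."""
    seen = {}  # address -> hop number where it first appeared
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # skip headers, timeouts and blank lines
        hop, address = int(match.group(1)), match.group(2)
        if address in seen:
            return seen[address], hop, address
        seen[address] = hop
    return None
```

It treats any repeated address as a loop, which is a simplification, and it cannot tell a timed-out DSL session from an intermittent line fault; it only flags that packets are going round in circles.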

Erg, not necessarily.  Our firewall is apparently playing up at the moment (maybe heat, maybe something else) which means nothing gets in, nothing gets out and Kazu gets stressed...  Hope your connection problems are going better... --Kazuhiko

Apr. 1, 20.15 - Apr. 2 09.05
BT trunk problems. EasyNet (and BT, for that matter) were swamped with calls (MoonShadow tried 3-4 times before he could get through for a time estimate; no, he's got no idea why they didn't put a recorded message on like they did when something similar happened last year). Anyhow, overnight no-one could see toothycat.net, and we couldn't see the outside world. It was very lonely.
We lost our connectivity for a few hours (between about 8 and 10), on Zen ADSL. Most annoying -- Emperor

Feb. 11, 8am - 11.20am
DNS issues.
It's spring; the leaves are budding, the sky is - um, grey, and computers are crashing. Not just ours, it seems. Our [DNS provider] had "issues" earlier, with the result that queries for toothycat.net were returning NXDOMAIN. This means no website unless you know the IP address (217.204.172.82, FWIW) and no mail. Mailservers trying to send us mail during that time would have got temporary errors, which should mean they spool the mail (about wot MoonShadow said earlier about forwarding addresses: don't do that, it doesn't work :)) and try again in a couple of hours. Hey, anyone know why when you change your DNS entries it takes 48 hours for the new info to propagate, but when the DNS server dies everything goes boom straight away?

This from freeparking, 10am:
 Subject: Complete - nslookup returns NXDOMAIN

 The status of support ticket 2026 has been updated

 Updated By: Support Staff
 Update Comment: Hi

 The problem we were having has now been resolved.  It will take about an
 hour for your website to be up and running again.
 We are very sorry for the inconvenience caused.

 Freeparking

So there you have it, folks. toothycat.net will be back around 11am.

Hm. It's about time we did our own DNS anyway. Anyone want to trade secondary DNS service? We'll do yours if you do ours :) Mail spooling would be useful too.

Feb. 09, 15.24 - 23.44
Dead mailserver.
AlexChurchill: As at midnight Sunday 9th/Monday 10th, the mailserver seems to be down.  Is there a problem with it, or just some random disconnectivity?
Its power supply was dead. It's been resurrected, but MoonShadow is taking the chance to insert the extra RAM - SunKitten, 00:05
Why does everything have to fail all at once? Got back from friend's birthday party to discover lucien.toothycat.net was not responding to connections. Opened server cupboard, connected up the monitor and keyboard - no signal to the monitor, no response to CTRL+ALT+DEL. Oh, well. Thumb the power switch. Everything dies and refuses to wake up again. Swap PSU for a new one, take the time to insert another 16Mb of RAM while I'm at it. Everything boots and looks happy. But for how long? It seems toothycat.net is overdue for an overhaul.. While I was at it, I discovered the fan on our firewall is dead. I don't know how long it's been like that. The firewall's happy - it's a 486, with the load rarely rising above 10%, so it copes with just the heatsink. But.
  - A somewhat depressed MoonShadow

Jan. 27 - Jan. 29
The webserver's dead; long live the webserver!
Hardware failure. toothycat.net is currently being hosted from a laptop - we've lost about half a day's worth of wiki, at least until I have time to do forensics on the old server; gomen, minna!
The laptop will be replaced with something more permanent early next week, once I have time to examine the corpse and work out exactly what died.
Best guess so far is I didn't change the fan in time and the processor was damaged by the heat. Don't suppose anyone's about to chuck out any AMD K6-2's? Or any other socket 7 stuff, for that matter - we're not picky..
I have a K6-2 133PR here (that's, what, 80MHz real?) unsure whether it is reliable, it's not been used in a long time.  Want it?
I take it back - I got confused.  It's a cyrix pr200+, which makes it 133 real.  Still available though :)  (I guess I already put the k6 into something)
Ah. I was about to say "yes please" there, but I guess we /are/ slightly picky after all - a Cyrix chip would cause heat problems. (We have a pr300+ lying around if /you/'re interested in a spare - the only way I could make it work at all reliably was to heavily underclock it..) Thanks for the offer, though - it's appreciated!
Ah, I thought I was the only one who underclocked.  It used to run fine with only a heatsink and 486 fan.
I may have a K6-2 400 going in the next few weeks, depending on how quickly my parents upgrade their computer and I take their Duron 700 off them... M-A
We would most certainly appreciate that, if you do ^-^ - MoonShadow
Hmm, there is some risk of a K6 300 becoming surplus to requirements, or even a K6-2 350 if I can get hold of a better Socket 7 processor -- Bobacus

Dec. 13 3.30pm to 6.40pm
Web server stopped accepting POST requests around 3:30. Got back from work, discovered lots of open-waiting connections, rebooted webserver, now attempting to determine whether that was a cause or effect of the downtime.

Nov. 21 6.30am to 11am
Woke up to discover external network dead. Checked over firewall, then powercycled ADSL modem. ADSL modem didn't come back on. Checked kettle cord and socket, then telephoned EasyDSL. BT engineer was on site at 10.30, replaced modem with a new one. Apparently, the on/off switches have a very short lifespan ^^;  Everything back up now. No idea what caused original loss of connectivity. I'm off to work. - MoonShadow



See also [Hijinks Ensue].

Last edited April 20, 2018 8:22 am (revision 291, which is the newest)