Server was down for several hours -- failures on multiple fronts



Tned
10-15-2010, 09:35 AM
I was working late last night and slept in. When I woke up, I found that the server was down. I'm still not sure why the server hung, but I know we had failures on multiple fronts in terms of why it was down for so long.


I pay a support company quite a bit of money to log in immediately (within 10 minutes) and deal with any problems that crop up. Within a few minutes of the server going down, they found they couldn't connect to it, so they attempted to reboot it through the datacenter's control panel/software.

About three weeks ago, the datacenter where the server is located moved to new support software. I had sent the new login details to the support company and had them try logging in and creating a support ticket with the datacenter, and that all worked. However, someone at the support company 'forgot' to overwrite the old login details with the new ones, and this morning the tech working the case didn't know about the new login info. So, they couldn't log in to reboot the server or put in a ticket with the hardware/datacenter people. They also didn't call me to alert me to the problem.

I have another company monitor the server, and if it's down for more than about 5 minutes (to eliminate false alarms from temporary network outages, router reboots, etc.), it sends me a text message. For some reason, my phone never received the "down" text message. This has never happened before.

So, while the server crashed and apparently needed a reboot (not sure exactly what happened yet), the only reason it was down for as long as it was is that there was a series of major failures on the support/alerting side.

Sorry for the problems. Hopefully this is an isolated event.

Shazam!
10-15-2010, 09:40 AM
Thanks Tned.

MileHighCrew
10-15-2010, 09:49 AM
I was scared they blocked the site at work

Tned
10-15-2010, 09:52 AM
In addition to working with the support company to find out why they failed to update the "right" place with the new datacenter login info, I have also changed the monitoring alert system. If the site remains down, it now sends another text message every 15 minutes, so if the first text/page somehow fails to go through, it will keep sending them until the site is back up.
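
For anyone curious, the alerting rule described above can be sketched roughly like this in Python. It is only an illustration, not the monitoring company's actual software: the names `AlertState` and `should_alert` and both constants are made up for the example, and the 5-minute threshold assumes the "about 5" in the first post means minutes.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values taken from the posts: wait about 5 minutes before the
# first text, then repeat every 15 minutes while the site is still down.
GRACE_SECONDS = 5 * 60
REPEAT_SECONDS = 15 * 60


@dataclass
class AlertState:
    """Hypothetical bookkeeping for one monitored site."""
    down_since: Optional[float] = None  # when the current outage was first seen
    last_alert: Optional[float] = None  # when the last text was sent


def should_alert(state: AlertState, site_up: bool, now: float) -> bool:
    """Record one monitoring check and return True if a text should go out."""
    if site_up:
        # Site is back: clear the outage so the next one starts fresh.
        state.down_since = None
        state.last_alert = None
        return False
    if state.down_since is None:
        state.down_since = now
    if state.last_alert is None:
        # First alert only after the grace period, to skip short blips.
        due = now - state.down_since >= GRACE_SECONDS
    else:
        # Follow-up alerts at a fixed interval until the site recovers.
        due = now - state.last_alert >= REPEAT_SECONDS
    if due:
        state.last_alert = now
    return due


# Simulate one check per minute during a 40-minute outage.
state = AlertState()
alert_minutes = [m for m in range(41) if should_alert(state, False, m * 60)]
print(alert_minutes)
```

With one check per minute, a long outage produces texts at minutes 5, 20, and 35, so a single lost message no longer means the outage goes unnoticed.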

Jagsbch
10-15-2010, 10:14 AM
I thought I was banned again.:D:D

Denver Native (Carol)
10-15-2010, 10:49 AM
I thought I was banned again.:D:D

AGAIN ???? Did I miss something? :lol:

BroncoJoe
10-15-2010, 11:34 AM
Get a discount from your support center.

rcsodak
10-15-2010, 11:45 AM
YOU'RE FIRED!

I want my money back! :tsk:

elsid13
10-15-2010, 03:13 PM
Maybe you shouldn't let Clay host the site for you.

Lonestar
10-15-2010, 04:08 PM
Was going to email you but did not have it on my BB.

Initially thought it was my connection, but after rebooting my BB and trying another site, I knew it was not me.

GEM
10-15-2010, 04:14 PM
It happened because T changed his avy. It sent shockwaves to the server that it just couldn't handle. :lol:

honz
10-15-2010, 04:53 PM
I am considering leaving this board due to these unacceptable technical difficulties.

camdisco24
10-15-2010, 05:44 PM
All I know is, if the Broncos lose this week, I blame you Tned. ;)

gnomeflinger
10-15-2010, 05:47 PM
If you call and complain, maybe they'll give you $5 off the next month's payment, and send you a $10 gift card to Wal Mart.

FanInAZ
10-15-2010, 06:03 PM
Well it is all Tned's & KCL's fault. If one of those two would just win the Powerball, then Tned would be able to hire a company that can provide them with the same level of service as the Pentagon.

Tned
10-15-2010, 07:32 PM
It happened because T changed his avy. It sent shockwaves to the server that it just couldn't handle. :lol:

My pup has a purple cast now. I was thinking of changing my avvy again, but based on this morning's problems, maybe I better stay hands off....

gnomeflinger
10-15-2010, 08:37 PM
Now that I've had time to ponder the reason for the outage, I have come to the conclusion that it's probably because you're paying money to a sever when you should be paying a server. :D

BCJ
10-16-2010, 01:23 AM
Sounds like Government at work, with excuses too! Tned, we pay a lot of money for dues here. Some of us have that VIP package that includes behind-the-scenes access, like posters' comments and PMs. Keep this up and you will find one less subscriber to the elite group. I can find Bronco Warrior, my hero, to come back and raise havoc for this place.