c45y

[AC] Server hardware replacement plan


Starting to outline a quick process for this disk replacement. Feel free to edit this post to include details or format it nicely (make sure to cross out items as you complete work).

  1. Create a new BungeeCord instance on creeper, pointing to the existing servers on newnerd
  2. Change the DNS entries for c & p to point to creeper - leave s and the nerd.nu webserver on newnerd
  3. Wait for the DNS TTL to expire on those entries so all of our players will be operating on the new bungee
  4. Schedule downtime for PvE (as it is most affected currently), rsync the server files across, then export all PvE database tables and import them into the MySQL instance on creeper
  5. Change the BungeeCord entry for PvE to point to the local server
  6. Start PvE with the whitelist enabled and verify that permissions, LWC, and LogBlock are all functioning correctly - disable the whitelist after testing
  7. Repeat steps 4-6 for Creative
  8. Turn Survival off; from memory, it is due to be shut down anyway
  9. Ensure any remaining server directories on /ssd/ are no longer required, archive any maps that have not been
  10. Kill the webserver
  11. Export the forum/website database tables from MySQL and back them up on the SATA drive - possibly copy to creeper
  12. Tarball and copy the website directory(s) to the SATA drive - possibly copy to creeper
  13. Power off the server and allow on-site technicians to replace the drive (might need to mkfs once the box comes back up)
  14. Repeat steps 2-7, swapping target servers from creeper to newnerd
  15. Untar the website files into the web directory, import the MySQL dump of the forums, and start the webserver
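
For anyone picking up step 4, here is a rough Python sketch of how the rsync and mysqldump invocations could be assembled and reviewed before running anything. Everything specific in it is an assumption for illustration: the paths `/ssd/pve` and `/srv/pve`, the database name `pve`, and the dump file location are hypothetical, and it assumes the PvE tables live in their own database. It only builds the command lines; it does not execute them.

```python
import shlex


def rsync_command(src_dir: str, dest_host: str, dest_dir: str,
                  dry_run: bool = True) -> list[str]:
    """Build an rsync command that mirrors src_dir to dest_host:dest_dir."""
    # -a preserves permissions, ownership, timestamps and symlinks;
    # --delete removes files on the destination that no longer exist locally.
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # --dry-run lists what would be transferred without copying anything.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than nesting the directory inside the destination.
    cmd.append(src_dir.rstrip("/") + "/")
    cmd.append(f"{dest_host}:{dest_dir.rstrip('/')}/")
    return cmd


def mysqldump_command(database: str, out_file: str) -> list[str]:
    """Build a mysqldump command that writes one database to out_file."""
    # --single-transaction gives a consistent snapshot for InnoDB tables only;
    # with the game server stopped during the downtime this is a belt-and-braces
    # option rather than a requirement.
    return ["mysqldump", "--single-transaction",
            f"--result-file={out_file}", database]


if __name__ == "__main__":
    # Hypothetical paths and names, printed for review before running by hand.
    print(shlex.join(rsync_command("/ssd/pve", "creeper", "/srv/pve")))
    print(shlex.join(mysqldump_command("pve", "/tmp/pve.sql")))
```

Defaulting to a dry run keeps the first pass harmless; passing `dry_run=False` drops the `--dry-run` flag once the file list looks right.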

I'll clean the formatting and start work on it in a few days once my index finger is usable again


For clarification, if it's not obvious, newnerd is the primary server—the naming being a relic from when we migrated servers ~3 years ago—and creeper is our secondary box.


I think it was mentioned in the meeting that E might be taken down during this changeover, but as far as I can see it's not mentioned here. Is there any confirmation of that (just for planning reasons)?


Question for the techadmins: would you guys be OK with me making the post below to update everyone on what's going on?

Hello everyone,

We currently believe one of the main sources of the lag affecting the servers is a failing SSD on our main box. The techadmins are in the process of moving the necessary data from the failing SSD to our secondary box before contacting the host to replace the SSD. We don't know how long this will take, since it depends on the availability of our techadmins and our host. This may result in some downtime for some of our services, including PvE, Creative, CS:GO, the website, etc., so make sure to keep an eye on the subreddit and forums for any further announcements.

Other things are also being done and looked into and we will make further announcements once there is something to update you guys on.


As this seems to be the top priority right now due to how badly it is affecting the new PvE Rev, is there a target timeline to have this all completed by? Obviously the sooner the better, but is there any ETA or goal for it?


Well, Dumbo figured out that there's a four-hour hardware service window, so it seems wasteful to set everything up on the other server and deal with updating DNS, only for a four-hour downtime.

A better strategy is:

  1. Put the word out that we'll be having a scheduled downtime on a certain day
  2. Back up the SSD
  3. Power down for service
  4. Wait until it's done
  5. Copy everything from the backup to the new SSD, make sure everything's working, and unwhitelist
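
Before powering down in step 3, it may be worth confirming the backup from step 2 actually matches the SSD, and the same check works again after the restore in step 5. Here is a rough Python sketch that compares two directory trees by SHA-256 checksum. It assumes the backup is a plain file copy (not a tarball or disk image), and the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large world files don't fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(original: Path, backup: Path) -> dict[str, list[str]]:
    """Report files that are missing, extra, or different in the backup."""
    a, b = tree_digests(original), tree_digests(backup)
    return {
        "missing_in_backup": sorted(set(a) - set(b)),
        "extra_in_backup": sorted(set(b) - set(a)),
        "content_differs": sorted(k for k in set(a) & set(b) if a[k] != b[k]),
    }
```

An empty report (all three lists empty) means every file on the source has a byte-identical copy in the backup. The servers should be stopped while this runs, otherwise files that change mid-check will show up as differences.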

We'd lose the website during the four hour window, but reddit, Mumble (which is hosted elsewhere) and IRC would all be available still.


Not sure if this is possible, but it's an idea I figured I'd pass by you guys. Since you're not going to be moving any servers over, just backing them up, there won't be a need to use our second box. E is currently hosted on the second box, so perhaps we could host a small event like a snowball fight on E for the four hours or so the servers are down? That way we can keep people entertained while waiting.


A ticket has been opened, and the provider agrees that the drive needs to be replaced. Details haven't been hashed out yet, but the process has started.

We've got a timetable we're aiming for ourselves, but we don't want to announce it yet since we have no idea if the provider will be okay with it or not. Basically, 10PM Saturday my time (so early morning Eastern) we want to run down our preparations and have the server powered down for service by the following hour. Then they have a four-hour window where they'll get to it and perform the repairs. But we don't want to announce anything until we have confirmation that it's acceptable with the provider.

So, don't go around telling people this date, because if things fall through and we have to change it, the unnecessary fallout will be annoying. (People already are irritable enough about the lag.)

Once we've got the schedule set in stone, we'll ensure there's an announcement on the subreddit and forums, tweet it, and change the server MOTDs. Hopefully there will be a few days of lead-up time, but it's not the end of the world if notice is a little short, since the plan is for a reasonably off-peak time.


The bungee instance that allows people to connect to the private IP of creeper is hosted on the main box, so E will appear down even if the server isn't actually off.

We could open up some firewall ports to allow direct connections to E if all the techs think that is a good idea. mcbouncer would need to be set up in that case, as Bungee wouldn't be handling our bans.
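
If the firewall route were tried, a quick way to confirm from outside that E accepts direct connections is a plain TCP connect test. This is only a sketch: the function name is made up, and the host and port to pass in are whatever creeper's public address and E's listening port turn out to be (25565 is the Minecraft default, but nothing here says E uses it). It checks reachability only and says nothing about whether bans are being enforced.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A `False` result with the server running would point at the firewall rule (or the server's bind address) rather than at Bungee.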



Honestly sounds like too much work for you guys just for 4 hours



I don't want to put any large strain on the techs. It was merely an idea, in case it was an easy thing to do.

