Tech Admin blog

Talking Tech #1 - August 2019

First Post

Deaygo has been working on the new nerd.nu web site, with some help from other techs, and we now have the technology to give you regular blog updates on the state of the servers. Thanks, Deaygo!


New Hardware

If you've been playing regularly in the last couple of months, it will not have escaped your attention that the new server hardware is much snappier than the old box. However, we'll make a note of it here for the sake of posterity. PvE restarts are now so quick that there is only just enough time to get from the lobby spawn to the PvE lobby portal before the server is open again.

The new server specs are documented on the Server page of the nerd.nu wiki. The main highlights are double the amount of RAM, a faster CPU with 2 more physical cores and an SSD that has doubled in size.


Creative Rev 34 and Minecraft 1.13.2

Creative got stuck on Minecraft 1.12.2 for some time while we waited for FastAsyncWorldEdit to update. But by mid-June, a development server was running on 1.13.2 with the current Rev 34 worlds and updated plugins, courtesy of totemo with some help from Challenger2. The final step was for the CAdmins to do some testing. By the time they were satisfied with the state of it, at the end of July, Dumbo52 had rejoined the tech team, and his first major act was to copy the dev server configuration over to the main host. He's currently making minor tweaks to the configuration to fix issues that slipped through the net during the testing phase.


PvE Rev 24 and Minecraft 1.14.4

Chunk I/O and Lag

Minecraft 1.14 was a very buggy update with serious performance problems. We run PaperSpigot (a.k.a. Paper), an optimised server derived from Spigot, which is itself the continuation of the Bukkit project. The PaperSpigot devs describe 1.14 thus:

MC 1.14.x was a rushed release from Mojang and has significant performance issues in how it handles chunk I/O. It is not recommended for large servers. Extensive backups and patience are recommended. [Source]

The PaperSpigot project has an official statement on 1.14 performance on their web site here.

That's certainly been our experience on PvE so far as we approach the end of the second week of Rev 24. We seem to be able to run at 20 TPS (Ticks Per Second) up to about 50 players, but beyond that we fall off some kind of performance cliff. Chunks taking a long time to load and long (7-8 second) saves are the most obvious symptoms of our current performance woes. Chunk I/O is also apparently blocking the main server thread at times, causing low TPS.

The problem is certainly not due to lack of RAM. We have allocated perhaps a little too much RAM to the server just to rule that out. It is possible that the problem is due to some programming error in Mojang's chunk I/O code which has the effect of serialising multithreaded I/O - effectively making it single-threaded. Or perhaps we have hit the limit of what the CPU can do in each I/O thread. We don't have much visibility into the internal state of the server in these situations, other than the timings report (image above). The situation requires more research and better tools.

It's possible that the chunk I/O issues may be exacerbated by Garbage Collector (GC) churn, whereby many short-term object allocations are overwhelming the GC's ability to free up memory in a timely manner, resulting in a costly full collection pause. So we're keen to try out a Java Virtual Machine (JVM) with the Shenandoah GC algorithm soon.
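One cheap way to check for GC churn before switching collectors would be to sample the JVM's own collector counters around a lag spike. The following is a minimal sketch, not something we currently run. It uses the standard java.lang.management API; the class and method names (GcSnapshot, totalCollections, totalCollectionMillis) are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcSnapshot {
    /** Sum of collection counts over all collectors; undefined (-1) values are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Print one line per collector so the young and old generations can be compared.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Taking two snapshots a minute apart and subtracting them would show how much of that minute went to collection. If the difference stays small while TPS drops, GC is probably not the main culprit.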

However, we're skeptical that Shenandoah will make a major difference to performance. We're mostly reliant on the PaperSpigot developers to improve chunk I/O. Mojang have stated that there will be no further 1.14 releases - 1.14.4 is the last - but have vowed to deliver performance improvements in 1.15. Unfortunately, that is a long way down the track.


Major Bugs

WorldEdit has been failing on startup, taking down WorldGuard, LWC and various other plugins with it and necessitating world restoration from the most recent backup. This has happened about 3 times now. It doesn't make any sense. We've held off on reporting it until now in order to try the latest WorldEdit snapshot. We'll be reporting it to the WorldEdit project ASAP and in the meantime will add a small plugin to detect the failure and automatically stop the server before any damage can be done.
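The fail-safe plugin mentioned above boils down to one check after startup: if any plugin we rely on for protection is not enabled, stop before players can join. Here is a minimal sketch of that decision logic in plain Java, without the Bukkit API. The class name, the method names and the idea of passing plugin states in as a map are assumptions for illustration, not the actual plugin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class StartupGuard {
    /** Returns the required plugins that are missing or not enabled, in the order given. */
    static List<String> failedPlugins(List<String> required, Map<String, Boolean> enabledByName) {
        List<String> failed = new ArrayList<>();
        for (String name : required) {
            // A plugin absent from the map is treated the same as a disabled one.
            if (!Boolean.TRUE.equals(enabledByName.get(name))) {
                failed.add(name);
            }
        }
        return failed;
    }

    /** True if at least one required plugin failed, i.e. the server should not stay up. */
    static boolean shouldStopServer(List<String> required, Map<String, Boolean> enabledByName) {
        return !failedPlugins(required, enabledByName).isEmpty();
    }
}
```

In a real plugin the map would presumably be filled from the server's plugin manager (Bukkit's PluginManager has isPluginEnabled for this), and a true result would trigger a shutdown and a loud log message naming the failed plugins.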

Horses, donkeys and apparently pigs were not dropping their inventories on death. That problem was fixed in Spigot build 2437 and made its way onto PvE in PaperSpigot build 158.

Respawning at your bed when coming back from the End is unreliable. That turns out to be a bug in NerdSpawn. It's randomly selecting either your bed or world spawn as the spawn location. It's a dirty little plugin with a lot of undocumented, legacy code that's no longer needed and complicates maintenance. We'll fix that soon.

Portals coming back from the nether to the overworld don't quite get there, with a message about the player hitting the world border. This issue is reported in WorldBorder issue #131 and at least 3 duplicate reports so far. But it's not exactly a WorldBorder bug. The Spigot server is raising nonsensical teleport events when the player goes through a portal:

onPlayerPortal(): NETHER_PORTAL from (world_nether,296,68,321) to (world,2370,68,2568)
onPlayerTeleport(): NETHER_PORTAL from (world_nether,296,68,321) to (world,2370,68,2568)
onPlayerTeleport(): UNKNOWN from (world_nether,2370,68,2568) to (world_nether,2332,65,2466)
onPlayerTeleport(): UNKNOWN from (world,2332,65,2466) to (world,2332,65,2466)

The two PlayerTeleportEvents with the reason UNKNOWN are the main problem. The first of these is completely nonsensical. It uses the overworld X, Y and Z coordinates but says the teleport happens in the nether. That's a problem for us because our nether world border is at +/-1500. Clearly there is some kind of bug in Spigot. We're currently liaising with those developers to get that fixed and we'll also talk to the WorldBorder plugin developer about whether WorldBorder can implement a work-around.
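To see why that event trips the border check, evaluate the coordinates from the log against the nether border. The sketch below is plain Java, not WorldBorder's code. The class and method names are hypothetical, and it assumes a square border centred on the origin; the radius of 1500 comes from the figure quoted above.

```java
final class BorderCheck {
    // Nether border radius as stated in this post.
    static final int NETHER_BORDER = 1500;

    /** True if (x, z) lies within a square border of the given radius centred on 0,0. */
    static boolean insideSquareBorder(int x, int z, int radius) {
        return Math.abs(x) <= radius && Math.abs(z) <= radius;
    }

    public static void main(String[] args) {
        // The real nether portal location from the first event: inside the border.
        System.out.println(insideSquareBorder(296, 321, NETHER_BORDER));   // prints true
        // Overworld coordinates mislabelled as world_nether in the third event: outside.
        System.out.println(insideSquareBorder(2332, 2466, NETHER_BORDER)); // prints false
    }
}
```

Any border plugin that trusts the world named in the event will conclude that the player has left the nether border, even though the player is really arriving in the overworld.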


Annoyances

PaperSpigot build 160 fixes MC-156852: "phantom" blocks that the client thinks have been broken but the server thinks are still there. So that's good news for people mining ice, sandstone or netherrack, and for anyone using haste beacons.


The Case of the Missing Drops

There's a rumour going around in PvE chat that player death drops are despawning too fast - in under 5 minutes. It's the same rumour that was going around last rev, and probably the rev before that. It will probably be going around next rev.

I've personally looked at four such claims in recent memory. In the first case, on PvE Rev 23, the chunks were proven to be loaded by virtue of a player nearby. In the three cases that I have looked at in Rev 24 so far, every time, much more than 5 minutes had elapsed. The minimum time period between death and a modreq was 8 minutes. However, all three players stated that less than 5 minutes had elapsed, and in some cases that the chunks were definitely not loaded because nobody appeared on the live map.

I'm beginning to think we have an X File on our hands. Time dilation or "missing time" is a well known feature of alien abduction. Perhaps it is a feature of this alien planet we have landed on in Rev 24.

If you look on the live map and don't see anyone in the area, that does not mean that nobody is there. There could be invisible staff members or players who have hidden themselves from the live map with /dynmap hide. It's very unlikely that drops are despawning too fast. There is nothing in the server configuration that removes drops. Whenever we test it, the drops stay on the ground for exactly 5 minutes (subject to tick rate scaling). Every time items do disappear, there is an alternate explanation that constitutes normal gameplay.
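The "subject to tick rate scaling" caveat is simple arithmetic. Item age is counted in server ticks, so 5 minutes corresponds to 6000 ticks at 20 TPS, and the wall-clock time stretches when the server runs slower. Here is a small worked example in Java; the class and method names are made up for illustration.

```java
final class DespawnTime {
    // 5 minutes x 60 seconds x 20 ticks per second.
    static final int DESPAWN_TICKS = 6000;

    /** Wall-clock seconds an item stays on the ground at a given average tick rate. */
    static double secondsToDespawn(double tps) {
        if (tps <= 0) {
            throw new IllegalArgumentException("tps must be positive");
        }
        return DESPAWN_TICKS / tps;
    }

    public static void main(String[] args) {
        System.out.println(secondsToDespawn(20.0)); // prints 300.0 (5 minutes)
        System.out.println(secondsToDespawn(15.0)); // prints 400.0 (6 min 40 s)
        System.out.println(secondsToDespawn(10.0)); // prints 600.0 (10 minutes)
    }
}
```

Since the server never runs faster than 20 TPS, lag can only make drops last longer than 5 minutes of real time, never shorter.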