ActuallyKabouters Rust Server News


2024-07-24: On performance woes, mod setting changes and future plans

Hi all!
To answer some questions, but also to document this for others, we've started a blog to share our experiences and discuss ongoing challenges and changes.

TL;DR:

Hello my name is Nox 👋 and I type a lot. Here is a brief recap of what's below if you don't enjoy walls of text:

  • Server grew a lot unexpectedly
  • We're adding extra RAM tonight to (hopefully) fix performance issues
  • We removed some mods to reduce server load
  • We're thinking about introducing VIP if you want to support the server
  • We might remove vending machine instant-restock next wipe because it's too abusive
  • We might limit wind turbines to 4 starting next wipe (but sell Test Generators in Outpost)
  • Many graphs with fancy colors

Unexpected growth

It is probably not big news that our cozy server recently had a big influx of new players. This happened for a number of reasons.

To get started, back in March, Ryan (Facepunch Discord moderator, runs various servers and supports a lot of others) and I (NoxiousPluK) asked the developers if the PVE-tag could be restored for servers that don't run vanilla PVE mode.
Vanilla PVE is a little-known feature of the Rust server. Rust officially does have built-in PVE, but it is very limited in functionality and was broken for many years. It also doesn't get tested during updates, so in reality almost no one uses it.

What most PVE servers use is TruePVE, or self-made mods specific to their flavor of gameplay.
However, for the past few years Facepunch has required server owners to use the built-in PVE mode in order to be able to use the tag.
After a lengthy discussion on allowing the tag for TruePVE-modded servers, this was granted to us at the end of April.

During May directly after wipe, there was the Rust Kingdoms Twitch event (with skin drops), combined with the Steam Open World Survival Crafting Fest sales, with Rust being 50% off.
The Rust developers forgot to mention the PVE-tag change in the change notes, leaving us with some insider information by accident. As a direct result, the part of the influx of new players that looked for PVE-servers found only a handful of options, including ours.

Statistics provided by BattleMetrics.com

From that moment on, when searching for PVE-servers that are still fairly vanilla (aimed at learning the game), our server shows up fairly high, which has a compounding effect over the next few wipes and events as seen in the chart above.
We're quite happy with this since our goal always has been to provide a home to those learning the game. That is how the server originally started too, but for ourselves. So it has been really nice seeing the community grow in an unexpected way, with both people that enjoy the server enough to call it their home, and others who just come by to try out an idea or learn the basics.

Performance impact

Of course all this came with some growing pains, so at the end of May we decided to run the server on dedicated hardware.

We've always hosted the server 'at home', since it was at the start mostly for ourselves, but also because we have the experience and a (usually) reliable connection (1/1Gbit XGS-PON). For this we used a virtual machine on our 'home server'; a machine running TrueNAS Scale with the following relevant specifications:

Component    Model
OS           TrueNAS Scale
Motherboard  ASRock B450 Pro4
CPU          AMD Ryzen 7 5800X
RAM          Corsair Vengeance LPX CMK64GX4M2E3200C16 64GB
Storage      ZFS RAID-Z1 over:
             3x Seagate Exos 7E10 10TB
             1x Crucial MX500 1TB (for cache)

The Rust server ran on this as a Windows Server 2019 virtual machine, with 4 CPU-cores and 16GB RAM. This ran fine for years with server population peaking at 20 players or so, but since the influx of new players this became a problem.
Luckily after a series of PC upgrades we ended up with a 'leftover' system with the following specifications:

Component    Model
OS           Windows 10 Enterprise LTSC 👀
Motherboard  Gigabyte B450M DS3H
CPU          AMD Ryzen 5 3600
RAM          Corsair Vengeance LPX CMK16GX4M2B3200C16 16GB
Storage      Lexar NM620 256GB

For those wondering why we host on Windows and not on Linux: in our own experience the Rust server is more stable on Windows, and on top of that the Windows version offers extra features, like an interactive console, which makes maintenance and admin duty a lot easier.

This seemed like a nice solution to move the server to; so that is what we ended up doing.
This ran great at the start (up until the July wipe). No longer having the overhead of running as a virtual machine made a noticeable difference in response times, and the server needed less maintenance. However, this changed quickly once we got a really busy wipe with way more activity.
For the first time ever our server consistently started hitting over 50 players online at peak hours, and the number of entities (which had never gone over ~150k) suddenly reached 300k+ (it is sitting at 313571 as I write this).

This means that the server's memory usage became a new issue, and we suspect it is the main cause of the current performance problems.
Basically computer memory operates in tiers. The first tiers are the L1, L2 and (optionally) L3 cache which are small amounts of incredibly fast memory that live inside the CPU itself. After that comes the system memory (RAM), and then the swap/page-file which is stored on disk.

Since disks are (by far, even in the days of SSD storage) the slowest tier, you want to avoid falling back to them, and having enough memory is essential for large processes (like a game server).
Rust in particular is infamous for its memory usage, partially due to the huge amount of objects in the game, combined with the big maps.
Looking at the Rust server process and its memory usage, it is currently using roughly 14GB RAM, but only 9GB is stored in actual memory. This is a problem, because it has to store and retrieve information from disk (swap/page-file), heavily impacting performance.
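To put a rough number on that, here is a minimal sketch using the two figures above. It assumes that everything the process uses but that isn't held in RAM ends up in the swap/page-file, which is a simplification; the function names are made up for this example.

```python
def paged_out_gb(used_gb: float, in_ram_gb: float) -> float:
    """Memory the process uses that is not currently held in RAM."""
    return max(used_gb - in_ram_gb, 0.0)


def paged_out_share(used_gb: float, in_ram_gb: float) -> float:
    """Fraction of the process memory that is not held in RAM."""
    if used_gb <= 0:
        return 0.0
    return paged_out_gb(used_gb, in_ram_gb) / used_gb


# Figures from the text: roughly 14GB in use, roughly 9GB in actual memory.
used, in_ram = 14.0, 9.0
print(f"Not in RAM: {paged_out_gb(used, in_ram):.0f}GB "
      f"({paged_out_share(used, in_ram):.0%} of the process)")
```

So roughly 5GB, a bit over a third of what the server works with, may have to come from disk whenever it is needed.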

Because of this we've ordered an extra 16GB that I plan to put in the server late tonight during its daily restart, to hopefully reduce the impact.

As a long-term solution I want to look into getting a processor with 3D V-Cache, like the AMD Ryzen 7 5800X3D, since they are absolutely superior for such memory-intensive processes (they have a huge amount of L3 cache). But that is a pricey upgrade (currently ~€330 / $360).

Another way we are addressing the performance impact is by improving the monitoring of the server and its plugins.

For this we first started using the Performance Monitor plugin, which told us that some plugins that we used for administration had a way bigger impact than expected. An example of this is TCMap, which allowed us to see all placed TCs on the map (and easily count them), but it had by far the biggest performance impact of any plugin. We stopped using it and I've since had someone look at the code (thanks Duck!) who noticed that the plugin is extremely poorly optimized.

The impact of the TCMap plugin compared to other plugins. Generated with Performance Monitor.

Another big contender is the Info Panel; these are the extra information panels you see ingame showing the current time, airdrop, heli, cargo, etc-status, scrap balance, and so on.
For this I have decided that I want to rewrite it to my own more lightweight version. We don't need all the modularity and functionality that the current plugin provides, and this adds a lot of overhead that we can do without.

Yearning for yet more detail we decided to start using Rust Server Metrics. This is a way more advanced statistics gathering platform used by many big Rust servers, and required setting up some infrastructure before we could use it.
Yesterday evening we set up an InfluxDB server to store the data, and installed a Grafana dashboard to view the generated information.
This does result in some pretty graphs, of which I can share some!

First of all, let's start with the bad news: the current numbers are not pretty.
I hope to solve a sizable part of this with the memory upgrade that I'm planning for tonight.
Check out the following graphs; I will explain their importance below:

The 'Server FPS'. Measured using Rust Server Metrics.
The 'Server frame times'. Measured using Rust Server Metrics.

These are two important metrics for measuring server performance.

Once it is running, the Rust server is basically in a loop. Every cycle it receives the actions of all connected players, checks them for validity and handles them. It then processes all AI and world tasks (scientists walking around, animals hunting, the effect of wind and its changes on wind turbines, sun and time of day on solar panels, rain on water catchers, loot respawn timers, etc.), and finally reports back to all players.
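As a very simplified illustration of such a loop (this is not how the Rust server is actually implemented; all function names below are made up for this sketch), each cycle runs the same phases in order and we time how long the whole cycle took:

```python
import time


def run_ticks(phases, ticks):
    """Run the given phases in order for a number of cycles.

    Returns the duration of each cycle in seconds.
    """
    frame_times = []
    for _ in range(ticks):
        start = time.perf_counter()
        for phase in phases:
            phase()
        frame_times.append(time.perf_counter() - start)
    return frame_times


# Hypothetical placeholder phases; a real server does a lot of work in each.
def handle_player_actions():
    pass


def update_world():
    pass


def send_updates_to_players():
    pass


times = run_ticks(
    [handle_player_actions, update_world, send_updates_to_players], ticks=3
)
for number, seconds in enumerate(times, start=1):
    print(f"cycle {number}: {seconds * 1000:.3f} ms")
```

The more work any of the phases has to do (more players, more entities, more plugins hooking in), the longer a single cycle takes.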

The speed of this cycle is called the 'server FPS' or frame rate, and the length of each cycle is called the 'frame time'. Facepunch itself strives for ~30 fps on its servers, and from personal experience I'd say that servers running under 60 fps feel sluggish; so we're striving for that.

The frame time makes noticeable lag spikes visible. Every time a frame takes longer than average, this becomes a noticeable spike on the client side, resulting in 'rubber banding' or 'lag'.
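The relation between the two metrics is simple arithmetic: the frame time is one second divided by the frame rate. The sketch below shows that conversion, plus a naive way to flag spikes. The sample values and the 'twice the average' threshold are assumptions chosen for illustration, not something Rust Server Metrics prescribes.

```python
def frame_time_ms(fps: float) -> float:
    """Average time one server cycle may take at a given frame rate."""
    return 1000.0 / fps


def find_spikes(frame_times_ms, factor=2.0):
    """Return the indexes of frames that took more than `factor` times the average."""
    if not frame_times_ms:
        return []
    average = sum(frame_times_ms) / len(frame_times_ms)
    return [i for i, t in enumerate(frame_times_ms) if t > factor * average]


print(f"30 fps -> {frame_time_ms(30):.1f} ms per frame")
print(f"60 fps -> {frame_time_ms(60):.1f} ms per frame")

# Made-up frame times in milliseconds, with one slow frame in the middle.
samples = [16.0, 17.0, 16.0, 80.0, 16.0]
print("spikes at indexes:", find_spikes(samples))
```

At 60 fps the server has less than 17 ms to finish a whole cycle, so a single frame of 80 ms is the kind of hiccup players feel as rubber banding.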

However, as you can see, we're currently dipping quite a bit below that. This started with the (suspected) holiday influx of active players, and the megabases that have been built.

I hope to restore that with tonight's memory upgrade, but we've also been diving into these and other stats to get rid of some functionality.
As a result of measuring and seeing impact, we removed the following plugins (at least for now):

  • Box Looters - this plugin allowed us to see who looted boxes and when, and was thus mostly used for admin duty. Since we don't care about stealing (but are sometimes curious), this isn't an essential plugin, and it had a sizable impact on server load. We've removed it for now and will create an alternative solution to satisfy our curiosity another day.
  • Player Challenges - this plugin counts many ingame events and keeps scoreboards of various players. The performance impact was surprisingly high so we decided to remove it for now, and again look at our own solution eventually. Not many players looked at these stats and they never fully worked properly.

Earlier we already removed a self-made plugin that we used to log players researching (to get an idea of how far they were in the game progress-wise). However, the impact of this on our monitoring (up until then, a Telegram bot reporting to us in a chat) was too big, often causing it to fail.

Entity changes per second; the biggest peaks are things like industrial systems moving items around and raidable bases (de-)spawning.
Measured using Rust Server Metrics.

Future plans

So some of the server plugins have a sizable impact on the entire server performance. Even after removing a few heavy ones, we still have some others to deal with. Here's an example of the impact of some of the plugins:

Long-term cumulative hook time per plugin. Measured using Rust Server Metrics.

Here we can see that the three biggest contenders are Raidable Bases, TruePVE and Info Panel.
At least two of those (Raidable Bases and TruePVE) are so essential to the server that we can't and don't want to do without them. We can look into optimizing performance in their settings, but this will affect gameplay in some way and might not be possible or desirable.
As mentioned before, Info Panel is already on my list to scrub and replace with our own self-made solution. I hope to get to that eventually.
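For those curious how a ranking like this comes together: every time a plugin hook runs, the time it took gets recorded, and those samples are summed per plugin. Below is a minimal sketch of that idea. The plugin names and timings are made up for the example, and it is not the actual Rust Server Metrics code.

```python
from collections import defaultdict


def total_hook_time(samples):
    """Sum hook time per plugin and return (plugin, total_ms), biggest first."""
    totals = defaultdict(float)
    for plugin, hook_ms in samples:
        totals[plugin] += hook_ms
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical samples: (plugin name, time spent in one hook call in ms).
samples = [
    ("PluginA", 1.5),
    ("PluginB", 0.25),
    ("PluginA", 2.0),
    ("PluginC", 0.5),
    ("PluginB", 0.5),
]

for plugin, total_ms in total_hook_time(samples):
    print(f"{plugin}: {total_ms} ms")
```

A plugin can end up high in the list either because a single call is slow, or because it is called extremely often (for example every frame).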

But some of the others might also need some attention.
For example the recently (by popular request) added Vehicle Deployed Locks seems to also have quite some impact; as for every frame or 'loop' it has to check if an unauthorized player is trying to access a vehicle, and other checks like trying to place a lock. But we also consider it essential now.
The Loot Defender plugin is a similar case. This locks crates and heli/bradley loot to the player that took it, but the impact is quite sizable while being essential.
The same goes for Night Lantern, which really adds to the atmosphere on the server.
But, Prop Control perhaps has to go. This may or may not be your neighborhood admin occasionally becoming a shark 🦈 - perhaps soon a thing of the past.

There's quite a bunch more plugins like this, and we're going to individually check them, see if we can improve their configuration, judge if we still need/want them or find alternatives, where a self-made rewrite from scratch is also an option.

Other future plans are hardware upgrades, like the already mentioned upgrade to 32GB RAM later tonight or potential processor upgrade.
To cover the cost of this a little bit (since this is entirely a hobby project for us), we're thinking of the (also requested multiple times) option of having a VIP-perk on the server, or allowing donations.
So far this server has run without any external funding, and it's a big ask if we want to add additional perks but also have to manage those.

For the VIP-perk we're thinking about:

  • €5 a month pricepoint
  • Queueskip (being able to join the server if it's full)
  • Access to /sil (Sign Artist), to paste images from URLs on signs
  • Colored name ingame
  • Colored name in the Discord + access to a channel where we can discuss map/plugin requests

But we also have to be careful not to give VIPs preferential treatment, and make it clear that all the rules still apply to them (and that they can still be banned).
On top of that we're thinking about a 'trusted player' role for long-term players that engage with the community and show good will, and giving them access to Sign Artist as well. This is because not everyone is able to spend the extra money on things like this, and we don't think exclusive perks should be locked behind a paywall, except for showing gratitude for supporting us.

We're also planning some gameplay changes (some we've accidentally already experimented with).

One of the issues is the unlimited restock on NPC vending machines.
Players make a horse farm and gather 20k scrap on the first day of wipe, which is an unintended side-effect of making the vending machines restock instantly. An extra side-effect is that no more horses spawn on the map since they're all in a few people's poop farms.

We accidentally disabled this earlier today already, but have re-enabled it for now.
However we'll try to disable it at the start of next wipe and judge the responses.

We're also planning to limit wind turbines per base to 4 for next wipe, since they have a huge impact on client performance.
This is not something that is noticeable server-side, but the Rust client gets quite laggy when there are a lot of wind turbines in render distance.
To offset this for those that really want a lot of power, we started selling Test Generators in Outpost, but at a high price, paid in a somewhat hard-to-get item (Tech Trash), so they won't be fully unlimited.
We can see the effect of wind turbines in the metrics that clients send to the server. On high-end systems they make FPS dip by maybe 50 or so, but we also have many players on less fortunate computer setups, and some of those dip under 20 fps with heavy rubberbanding when getting near these areas.

A player with a high-end system getting closer to a wind turbine-heavy area. Measured using Rust Server Metrics.

We're also writing new server management tooling from scratch for ourselves; because none of the existing solutions were sufficient, affordable or offered the flexibility we wanted. This will be a lengthy process but will allow us to provide some fun unique features in the (hopefully) near future.

I think this is all for now, but let me know if there are any questions and if you have any feedback, please share it with us in the Discord!

Thank you all for sharing your time with us, your patience when dealing with issues and playing on our server; we hope it is an enjoyable experience!

Greetings,

Nox (NoxiousPluK) & appy (Apaltra)