Well, I did some science, and the results are actually surprising. I refactored into a queue framework where Entities are stored in a temporary list until handled, tested it against the old approach, and it comes out slower. The queue is rather direct: exactly what comes in is exactly what gets handled later. Compare that to the older approach: for each Surface that I know about, if it has had any alterations, I request multiple Surface scans (limited by quantity and filtered to the relevant Entities) and handle the results. This leads me to believe that the game's scanning code is extremely efficient.
However, since this approach is considerably less complex, and it helps me solve a separate problem I'm working on, I will probably swap to this refactor anyway. The results aren't significantly worse, which also suggests that choosing the setting values is the most important factor.
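For context, the queue approach boils down to something like the following minimal sketch. This is illustrative, not my actual code: the event fields and the `handle` function are placeholders, and names like `storage` vs. `global` and `event.entity` vs. `event.created_entity` differ between Factorio versions.

```lua
-- Sketch of the queue refactor: remember each relevant Entity as it is
-- built, then drain a bounded batch of the list on a fixed tick interval.

script.on_event(defines.events.on_built_entity, function(event)
  storage.queue = storage.queue or {}
  table.insert(storage.queue, event.entity)
end)

script.on_nth_tick(15, function()
  local queue = storage.queue or {}
  local limit = 1000 -- the "async limit" from the tests below
  for _ = 1, math.min(limit, #queue) do
    local entity = table.remove(queue)
    if entity and entity.valid then
      handle(entity) -- placeholder for whatever per-Entity work the mod does
    end
  end
end)
```

The old approach instead kept a dirty flag per Surface and, on each interval, ran limited `find_entities_filtered`-style scans over the altered Surfaces to rediscover the Entities to handle.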
With a new game, I make a blueprint involving 207k Concrete, 27k Steel Chests, 27k Fast Inserters, 27k Speed 1 Modules, and 13k Assembler 3s. When I paste this directly into the Sandbox, it takes 17s in the current version and 20s in the refactor; the added time there is probably from adding each Entity to the list (which didn't happen before). This blueprint is huge, but it's close to the times that you mentioned, which is why I was testing it.
With the default of every 15 ticks and the async limits changed to 1k, I checked two values: the current update time in milliseconds (visually watching the min/max) and the maximum of the last-average value (which sits near 0, spikes to a huge number while the paste is processed, then drops back to 0). In the current version, the current updates bounce between 14ms and 17ms, with a max of 3,100ms. The refactor is similar enough at 15ms to 17ms, with a max of 3,400ms.
I increased the limit to 5k and tried again. It diverges a little more here: the maxes are almost equal, but the average updates are 60ms/70ms on the current version, compared to 75ms/85ms on the refactor.