Well, I did some science, and the results are actually surprising. I refactored into a queue framework where Entities are stored in a temporary list until handled, tested it against the old approach, and it comes out slower. The queue is rather direct: exactly what comes in is exactly what gets handled later. Compare that to the older approach: for each Surface that I know about, if it has had any alterations, I request multiple Surface scans (limited by quantity and filtered to the relevant Entities) and handle the results. This leads me to believe that the game's scanning code is extremely efficient.
However, since this approach is considerably less complex, and it helps me solve a separate problem I'm working on, I will probably swap to this refactor anyway. The results aren't significantly worse, which also suggests that choosing the setting values is the most important factor.
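For context, the queue approach boils down to something like the following minimal sketch. This is illustrative, not my actual code: the event fields and the `handle` function are placeholders, and names like `storage` vs. `global` and `event.entity` vs. `event.created_entity` differ between Factorio versions.

```lua
-- Sketch of the queue refactor: remember each relevant Entity as it is
-- built, then drain a bounded batch of the list on a fixed tick interval.

script.on_event(defines.events.on_built_entity, function(event)
  storage.queue = storage.queue or {}
  table.insert(storage.queue, event.entity)
end)

script.on_nth_tick(15, function()
  local queue = storage.queue or {}
  local limit = 1000 -- the "async limit" from the tests below
  for _ = 1, math.min(limit, #queue) do
    local entity = table.remove(queue)
    if entity and entity.valid then
      handle(entity) -- placeholder for whatever per-Entity work the mod does
    end
  end
end)
```

The old approach instead kept a dirty flag per Surface and, on each interval, ran limited `find_entities_filtered`-style scans over the altered Surfaces to rediscover the Entities to handle.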
With a new game, I make a blueprint involving 207k Concrete, 27k Steel Chests, 27k Fast Inserters, 27k Speed 1 Modules, and 13k Assembler 3s. When I paste this directly into the Sandbox, it takes 17s in the current version and 20s in the refactor; the added time there is probably from adding each Entity to the list (which didn't happen before). This blueprint is huge, but it's close to the times that you mentioned, which is why I was testing it.
With the default of every 15 ticks and the async limits changed to 1k, I checked two values: the current update time in milliseconds (visually watching the min/max) and the maximum of the last-average value (which sits near 0, spikes to a huge number while the paste is processed, then drops back to 0). In the current version, the current updates bounce between 14ms and 17ms, with a max of 3,100ms. The refactor is similar enough at 15ms to 17ms, with a max of 3,400ms.
I increased the limit to 5k and tried again. It diverges a little more here: the maxes are almost equal, but the average updates are 60ms/70ms on the current version, compared to 75ms/85ms on the refactor.