Page MenuHomeSchine - Phabricator

Starmade | Editing blueprint catalog causes whole-server lag
Confirmed task for development, HighPublic

Description

5-10 seconds of apparent blocked-IO whenever an entry is added/removed from blueprint catalog (our current catalog contains 1063 items).

Nothing alarming in logs:

[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1357 objects took 102
[2016-05-04 01:47:23] [SERVER] SERIALIZED PROXIMITY (2, 2, 2)
[2016-05-04 01:47:23] [UNIVERSE] WARNING: Sector UPDATE took 33; sectors updated: 324
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: UNIVERSE update took 33
[2016-05-04 01:47:23] [SERVER][FloatingRock] Empty rock: deleting!!!!!!!!!!!!!!!!!!! Asteroid(73515)sec[73509](!)
[2016-05-04 01:47:23] [SEGMENTCONTROLLER] ENTITY Asteroid(73515)sec[73509](!) HAS BEEN DESTROYED...
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER] JOINING ON SERVER: hasPasswd: false; Check true; pw provided:
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1357 objects took 82
[2016-05-04 01:47:23] [ENTITIES] removed object from loaded state Asteroid(73515)sec[73509](!); 73515
[2016-05-04 01:47:23] [DELETE][Server(0)] Sendable 73515(Asteroid(73515)sec[73509](!)) Physically DELETING DONE and Notified!
[2016-05-04 01:47:23] [SERVER] WRITING ENTITY TAG: Asteroid(73515)sec[73509](!) to ./server-database/world0/ENTITY_FLOATINGROCK_1462323653801_67_8_107_5.ent
[2016-05-04 01:47:23] [SERVER] Database: Entry Asteroid(73515)sec[73509](!) skipped SQL update since it hasn't changed
[2016-05-04 01:47:23] [SERVER][SEGMENTCONTROLLER] PERMANENTLY DELETING ENTITY: ENTITY_FLOATINGROCK_1462323653801_67_8_107_5.ent
[2016-05-04 01:47:23] [SERVER][SEGMENTCONTROLLER] PERMANENTLY DELETING ENTITY DATA: ENTITY_FLOATINGROCK_1462323653801_67_8_107_5.0.0.0.smd2 (exists: true)
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1356 objects took 62
[2016-05-04 01:47:23] [SERVER] received catalog delete request: CatalogPermission [catUID=2 barrel cannonXbeam, ownerUID=Darwynn, description=no descrption given, price=116124, permission=0]
[2016-05-04 01:47:23] [SERVER][CATALOG] handling delete request: CatalogPermission [catUID=2 barrel cannonXbeam, ownerUID=Darwynn, description=no descrption given, price=116124, permission=0]
[2016-05-04 01:47:23] [CATALOG] CHANGING REQUEST by Darwynn: CatalogPermission [catUID=2 barrel cannonXbeam, ownerUID=Darwynn, description=no descrption given, price=116124, permission=0]; ADMIN: true
[2016-05-04 01:47:23] [SERVER][CATLOG] permission granted to delete: CatalogPermission [catUID=2 barrel cannonXbeam, ownerUID=Darwynn, description=no descrption given, price=116124, permission=0]
[2016-05-04 01:47:23] [ZIP] Zipping folder: ./blueprints//2 barrel cannonXbeam/ to ./blueprints//exported/2 barrel cannonXbeam.sment (Filter: null)
[2016-05-04 01:47:23] [SERVER] PHYSICALLY DELETING BLUEPRINT ENTRY (backup export): 2 barrel cannonXbeam
[2016-05-04 01:47:23] [AI] Not Roaming while in conversation
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1356 objects took 108
[2016-05-04 01:47:23] [ENTITIES] removed object from loaded state obfuscated.SF@7acd7ab1; 73515
[2016-05-04 01:47:23] [DELETE][Server(0)] Sendable 73515(obfuscated.SF@7acd7ab1) Physically DELETING DONE and Notified!
[2016-05-04 01:47:23] [UNIVERSE] WARNING: Sector UPDATE took 31; sectors updated: 324
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: UNIVERSE update took 31
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1356 objects took 72
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1356 objects took 67
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]
[2016-05-04 01:47:23] [SERVER][UPDATE] WARNING: object local update of 1356 objects took 70
[2016-05-04 01:47:23] NOT FACING:::: (-23, 40, 54)[Rail Basic]o[[WARNING] UNKNOWN SIDE 14][active][100hp][jX]

Details

Task Type
Bug
Testing Results
Affected Gamemode(s)
Multiplayer
Reproducible
uncertain
Last tested (version)
0.197.53
Category
none/unspecified
Hardware/Software/System
OS-Specific
No
Hardware-Specific
No
Video Card Vendor
uncertain
Steps to reproduce
  1. Open catalog
  2. Delete or upload blueprint
  3. Watch pilots complain of lag in #general, watch lag spike across all players via [TAB], try to respond in general... ~5-10 seconds have passed and your message finally gets out. Sadly sigh when A Few Good Pilots ragequit for the night.

Tester Results;
Delete blueprint, after hitting OK hold that and see Ping spike over 4000.

Serverconfig (server.cfg)
<replace this line with the file content>
Clientconfig (settings.cfg)
<replace this line with the file content>

Event Timeline

Restricted Application added a project: Issue Navigation. · View Herald TranscriptMay 4 2016, 1:49 AM
erthparadine updated the task description. (Show Details)May 4 2016, 1:53 AM
SmilingDemon shifted this object from the S1 Public space to the S3 Starmade space.May 6 2016, 12:32 PM
SmilingDemon changed the visibility from "Custom Policy" to "Public (No Login Required)".
SmilingDemon changed the edit policy from "Task Author" to "Starmade (Project)".
SmilingDemon changed Affected Gamemode(s) from none/unspecified to Multiplayer.
SmilingDemon triaged this task as High priority.
Restricted Application added a subscriber: AndyP. · View Herald TranscriptMay 6 2016, 12:32 PM
SmilingDemon added a subscriber: SmilingDemon.
SmilingDemon removed SmilingDemon as the assignee of this task.
Chandler claimed this task.

What Server are you playing on?

erthparadine added a comment.EditedMay 6 2016, 1:39 PM

I'm the owner of stars.bitminery.com:4242 We can also replicate this in our staging environment with a vanilla build (+ a copy of production's blueprint DB).

So far I am unable to replicate. I Will continue to try. How many players are on at the time of spike?

Ithirahad added a subscriber: Ithirahad.EditedMay 9 2016, 5:24 PM

NFD Build Server gets this issue periodically. We have to wipe the blueprint database, or the server lags and can even become unstable when people are doing stuff with blueprints.

It was independent of the amount of players; I recall the issue showing up with 3-5 players on.

Player counts has not affected lag @Chandler.

I can confirm @Ithirahad comment, we wiped our production server's blueprint catalog on May 7th, and lag related to blueprint catalog changes is now gone.

At the time of our catalog's reset, there were 1126 entries.

-Confirmed-
Recording of the issue From Player server.
Still unable to replicate on Dev.

erthparadine added a comment.EditedMay 10 2016, 3:51 AM

Logs from a vanilla build, while working with @Chandler. Only changes were copying our before-deletion CATALOG.cat into world0/ and the complete respective blueprints/ structure as well.

Observations during the test session: the larger the blueprint, the more pronounced the lag. It's almost as-if the server is blocking-I/O while zipping the blueprint folder. During one test, there was a ~5 second lag between clicking "OK" for a core name, and the core actually spawning. Core spawn attempts were done immediately following a blueprint deletion request (as quickly as I could exit the blueprint menu, hit the x key, and then enter to accept the default name).

Server uses SSDs, no other disk-I/O intensive tasks were running during the test.

We had this problem on my server as well, StarSide. They even deleted all the blueprints and the problem is still happening.

Another server I tried, StarBits, was having this same issue, and they also deleted all their blueprints.

Point being, this does not seem to be a server-specific issue, but a broader one and it's leading the server admins to think there is a problem with their hard drives or the game reading all their blueprints, so they are deleting their blueprints.. which brings about it's own issues with everyone scrambling to try and save all their blueprints before the wipes.

...
Another server I tried, StarBits, was having this same issue, and they also deleted all their blueprints.

stars.bitminery.com = StarBits

To reiterate: upon a reset of our blueprint database, the blueprint related lag went away.

Benevolent27 added a comment.EditedMay 13 2016, 7:52 PM

...
Another server I tried, StarBits, was having this same issue, and they also deleted all their blueprints.

stars.bitminery.com = StarBits
To reiterate: upon a reset of our blueprint database, the blueprint related lag went away.

I forgot to mention that. I tested the blueprints on StarBits and there was no lag. But on Shattered Skies the lag remains after the reset.

Maybe @Raiben could throw some info here then.

As for StarBits, after the reset we started fresh with just the Vanilla Isanth series of blueprints. We're already back up to 225 blueprints, but no noticeable lag yet.

Random thought: I wonder if the breaking point is 1024 blueprints?

Maybe @Raiben could throw some info here then.
As for StarBits, after the reset we started fresh with just the Vanilla Isanth series of blueprints. We're already back up to 225 blueprints, but no noticeable lag yet.
Random thought: I wonder if the breaking point is 1024 blueprints?

I was thinking the same. Other servers limit players to 0ne BP at a time.

Chandler set Last tested (version) to 0.197.53.
Chandler changed the task status from Open to In Queue (Game).EditedMay 14 2016, 12:50 AM

Confirmed on Dev Server. Although there is not a red error, just the quick spike. I am of the opinion that this has to do with the size of the blueprint catalog. The greater the size, the greater the lag...

Restricted Application edited projects, added Starmade, Game Development; removed Issue Navigation. · View Herald TranscriptMay 14 2016, 12:50 AM

I was thinking the same. Other servers limit players to 0ne BP at a time.

Could you provide a few example servers; I'm interested in seeing the type of playerbase these one-blueprint servers are attracting, server longevity, overall popularity/activity, etc. Limiting to just one blueprint seems quite extreme, especially if players would want to save a few in-progress designs, without a hassle of frequent uploads/downloads.

An offhand check of popular/active servers shows GenX@15, StarBits@20, Eyvalon@20, StarSide@Unlimited

Assuming it's related to blueprint count, should the whole engine really be blocking-I/O because the blueprint index grows larger than 1K?

The server is now off line, But at their hay day they reminded me a lot like yours, 10-25 players at least. Anything that cause lag was prohibited. That was the excuse for having only one BP per player. The server was mike lands btw.

AndyP added a comment.May 19 2016, 6:59 PM

T1401#55082

Well one BP is very extreme, the default is at 25 or so. And the restrictive ones I played on are at 10 or so.
(Or disable blueprinting in total.)

Even at a 1-blueprint limit, a very large server will run into this problem.

lancake raised the priority of this task from High to Release Blocker.
AndyP lowered the priority of this task from Release Blocker to High.Aug 22 2016, 7:04 AM
AndyP changed the task status from In Queue (Game) to In Queue.Mar 10 2017, 5:05 PM