So here’s the deal with kbin: it uses Symfony Messenger processes, which are roughly equivalent to Sidekiq in Mastodon.

After I moved fedia from the Docker-hosted environment to a bare-metal instance, I had all manner of database issues: the dump and reload didn’t work well and created many duplicate records. That caused the messenger services to die, and the queue of ActivityPub records to process grew huge. Restarting the messenger service worked, but it would never finish working through the queue, so I increased the number of messenger workers to 16. That kept the queue nice and clean.

HOWEVER, it appears that running multiple messenger processes creates race conditions: image IDs get created and assigned to entity records (like posts), but no actual image record is created. When kbin goes to draw a page, it runs a complex query to pull magazine info, post info, comment info, user info, and all of their respective images. Those records LOOK like they have an image, but there is no actual image, so kbin says 💩 I ain’t working and gives the wonderful 500 error.

Setting the messenger services back to 1 seems to at least not be making the problem worse, but now I have to go find all the broken database record linkages.
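
For anyone wondering what hunting for those broken linkages could look like, here is a minimal sketch, not the actual cleanup I ran. It assumes a hypothetical schema with an `image` table and a `post` table carrying a nullable `image_id` column; the real kbin table and column names may differ, and kbin runs on PostgreSQL, whereas this uses an in-memory SQLite database with toy data just so it runs anywhere. The idea is a LEFT JOIN that keeps only rows whose image reference points at nothing.

```python
import sqlite3


def find_dangling_image_refs(conn, table, fk_column="image_id"):
    """Return ids of rows in `table` whose image reference has no image row.

    `table` and `fk_column` are interpolated into the SQL, so pass only
    trusted, hard-coded identifiers. All names here are hypothetical.
    """
    query = f"""
        SELECT t.id
        FROM {table} AS t
        LEFT JOIN image AS i ON i.id = t.{fk_column}
        WHERE t.{fk_column} IS NOT NULL   -- the row claims to have an image
          AND i.id IS NULL                -- but no such image record exists
        ORDER BY t.id
    """
    return [row[0] for row in conn.execute(query)]


# Toy data standing in for the real database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image (id INTEGER PRIMARY KEY, file_path TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, image_id INTEGER);
    INSERT INTO image (id, file_path) VALUES (1, 'a.jpg');
    INSERT INTO post (id, title, image_id) VALUES
        (10, 'has a real image', 1),
        (11, 'points at a missing image', 2),
        (12, 'no image at all', NULL);
""")

dangling_posts = find_dangling_image_refs(conn, "post")
print(dangling_posts)  # [11]
```

The same query shape would need to be repeated for every entity type that can carry an image (magazines, comments, users and so on), and what to do with the hits, such as nulling the reference, is a separate decision.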

  • photography@fedia.io
    1 year ago

    What a mess, thanks Jerry for digging into this.

    The magazine I created (photography) seems to be suffering from many 500 errors, which makes sense since it had several image posts.
    When it works, the images appear to be there, but I also got more 500’s than Indianapolis in May.

    Though at the moment it seems to be stable.