Fedia.io had a few issues over the past 24 hours: sometimes it worked fine until you clicked on certain posts, which resulted in an error 500, and other times it just returned an error 500 no matter what.
The first issue I found was with amqproxy, which sits between rabbitmq and the queue runners that process incoming and outgoing posts, and helps to reduce the load on the server. I found this morning that amqproxy was consistently failing, despite there being no apparent problem. I bypassed amqproxy, since the server can handle the load fine without it. That seemed to work and things returned to normal. A few hours later, the site started responding with error 500 to nearly all requests. This happened because the database server ran out of connections. The limit of 300 it was set to should have been plenty, but clearly it was not. I’ve raised it to 3000 and so far, so good.
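For anyone wondering how a limit of 300 can run out, here is a rough back-of-the-envelope sketch in Python. It assumes each web worker and each queue runner holds its own database connection, which may not match how this server is actually set up. The worker counts are made-up numbers for illustration only, not fedia.io's real configuration, and the function names are hypothetical.

```python
def connections_needed(web_workers: int, queue_workers: int, per_worker: int = 1) -> int:
    """Estimate peak simultaneous database connections.

    Assumes every worker process holds `per_worker` connections at once.
    """
    return (web_workers + queue_workers) * per_worker


def has_headroom(limit: int, needed: int, reserve: float = 0.2) -> bool:
    """Return True if `needed` fits under `limit` while keeping a
    `reserve` fraction of the limit free for admin and maintenance use."""
    return needed <= limit * (1 - reserve)


# Hypothetical worker counts, chosen only to illustrate the arithmetic.
needed = connections_needed(web_workers=200, queue_workers=150)

print(needed)                      # estimated peak connections
print(has_headroom(300, needed))   # does the old limit cover it?
print(has_headroom(3000, needed))  # does the new limit cover it?
```

The point is only that the demand scales with the number of processes talking to the database, so a limit that looks generous can still be exhausted when many workers are busy at the same time.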
My apologies for the instability. I continue to learn the nuances here and will keep making the service more reliable as I go.
Just wanted to mention that some of the issues may have been due to problems in Mbin itself rather than anything on fedia’s side, and we put together an emergency hotfix last night / this morning. The issues had taken out another Mbin instance, so hopefully now that this instance has grabbed those fixes, there will be slightly fewer queue/db issues (I’d like to say all solved forever, but I’ve learned my lesson).