• dreadedsemi@lemmy.world
    1 year ago

    I remember reading comments about how the site was still fine after firing so many people: “What do they do?”

    • Luca@lemmy.world
      1 year ago

      People fail to understand that large projects have inertia. He could have shuttered all Twitter offices, fired all employees, and only paid the server bills, and the website would probably have continued to function just fine for a few months.

      But as a DevOps/SRE, this whole saga has been awesome to watch.

      • zalack@kbin.social
        1 year ago

        And often the tipping point is invisible. Some small routine or service degrades, but outwardly everything still works fine… there is just more strain on the services and clients that use that service, causing them to slowly degrade over the next few hours, days, or weeks, which in turn puts more strain on the services that call those services… etc etc.

        Until one day the system is so degraded that major things start breaking. It seems like it came out of nowhere, but the initial failure happened weeks ago and has been cascading since then.

        Once a system hits that point, it’s often not enough to just fix the initial problem, because so much of the ecosystem around it has been thrown out of whack.
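
A toy sketch of that kind of cascade, in case it helps to see it play out. Everything here is made up for illustration: the `simulate_cascade` function, the "health" score, and the `strain` and `failure_threshold` numbers are assumptions, not a model of any real system. It treats the services as a simple chain where each one loses health in proportion to how degraded its dependency was on the previous step.

```python
def simulate_cascade(num_services=4, initial_damage=0.1, strain=0.5,
                     failure_threshold=0.5, max_steps=1000):
    """Return (step, health) when the user-facing service first drops below
    failure_threshold, or (None, health) if that never happens.

    Hypothetical model: health[0] is the deepest dependency and
    health[-1] is the service users actually see.
    """
    health = [1.0] * num_services
    health[0] -= initial_damage  # the small, invisible initial failure

    for step in range(1, max_steps + 1):
        previous = list(health)  # every service reacts to the last step's state
        for i in range(1, num_services):
            # Strain on service i grows with the degradation of its dependency.
            degradation = 1.0 - previous[i - 1]
            health[i] = max(0.0, previous[i] - strain * degradation)
        if health[-1] < failure_threshold:
            return step, health
    return None, health


if __name__ == "__main__":
    step, health = simulate_cascade()
    print("user-facing service crossed the threshold at step", step)
    print("health per service:", [round(h, 3) for h in health])
```

In this sketch the user-facing service looks perfectly healthy for the first few steps while the damage works its way up the chain, and nothing in the model restores health, so repairing only `health[0]` afterward would leave everything downstream degraded.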

          • zalack@kbin.social
            1 year ago

            The Expanse has a whole b-plot about an artificial ecosystem going through cascade failure in one of its arcs.

      • lowdownfool@kbin.social
        1 year ago

        As a way-too-seasoned web developer who appreciates working alongside great SREs, this has been pretty interesting. I’m honestly surprised more hasn’t gone wrong, but maybe that’s yet to come. Since they are (I imagine) losing users instead of growing, the site might actually avoid running into the scaling issues that were looming.