I don’t know a lot about lemmy.world, but it seems to be running on “a server”. The person that wrote this may have used it as a simpler way to mean “the overall infrastructure that runs lemmy”.
However, if it really is “a server”, there will eventually be a breaking point where continuing to scale gets a lot harder, more complex, and more expensive. A lot of people don’t really understand that a site like Reddit has a massive infrastructure as its foundation. That’s how it can handle millions of connections, billions of comments, and stay - more or less - available.
It’s expensive to run.
Lemmy can’t ever hope to replace Reddit without some kind of significant investment in infrastructure and possibly development. If the code isn’t written to support scaling out (as opposed to scaling up and just throwing more RAM, CPU, and storage at a single system), it can’t replace Reddit.
That’s not to say that I’m not loving Lemmy. I am. I have barely opened Reddit since Friday, after Apollo died. At some point, though, money will become a factor here as well.
This doesn’t fully address your concern, but I thought I might be able to shed some light. Lemmy is not running on a server per se. Every instance is a server that communicates with other servers, and this network taken together is called Lemmy. This means that as communities grow, the load spreads out laterally across these machines, as long as people don’t bunch up too heavily on a popular instance like Lemmy.world. I made a secondary account on a smaller instance and my performance improved drastically. If you examine my username you’ll see I’m coming from a different instance than you are.
It’s good for the network and for you to register an account outside of the big instances. The concern that isn’t as clear to me right now is what happens if the major communities are all actually hosted on one huge instance. Does that instance just have to bear the load? I do not know exactly how spread out the network ends up being due to people wanting to participate in the biggest communities.
I’m not sure anybody knows right now because it depends on how humans behave.
EDIT: I see that the account you’re posting from is from the largest instance in Lemmy. You can help spread out the load if you choose to use an account on a different instance. This won’t compromise your ability to see the same content. It should also improve your experience with using it because your smaller instance is usually less overwhelmed and can give you more reliable responses.
Understood on scaling instances, but even those instances will eventually outstrip a single server’s capabilities. In considering the question as to whether or not Lemmy can replace Reddit, the question really comes down to how well and seamlessly Lemmy can scale.
In a lot of applications, scaling beyond a single server introduces some pretty gnarly complexity around things like database consistency between nodes. I’m sure Lemmy can handle that to a point by spawning new instances, but, right now, that depends on users having sufficient awareness to even know how to pick one, whereas Reddit’s sign-up experience is pretty streamlined.
All of this is solvable, but at some point, someone will ask the question, “who is going to pay for this capacity?” and we’ll be back in a place where we have to decide whether to pay a monthly fee, support ads, or see our data sold. Infrastructure and the people to support it are expensive.
There will also ultimately be legal compliance needs (GDPR, CCPA, etc.), plus tax compliance once a monetization model is established, even if it’s just to cover expenses.
I do want to see Lemmy succeed, but there will be a lot of reality to consider eventually.
Does it really spread out the load? Every instance will want to connect to the larger instances and replicate their data, or the larger instances could hug the smaller ones to death if a popular post gets noticed. I am spitballing, but from what I understand more instances means inducing larger and larger loads on everyone.
Yes, but there are shortcomings. It spreads out the load in the sense that for every user served by a small instance, that is bandwidth that a big server didn’t need to spend for those additional users. It only needs to pass a data item to the small instance once instead of to each user as the clients interact with it, saving it resources compared to having to serve each user directly. I believe only actually subscribed communities are fully replicated, but I’m reaching my limits of understanding.
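To make the bandwidth argument concrete, here is a rough back-of-the-envelope sketch in Python. It is a toy model, not how Lemmy actually works internally: the function name, the instance names, and the reader counts are all made up for illustration, and it ignores votes and comments flowing back to the origin. It only counts how many times the big server has to send one post.

```python
def origin_deliveries(readers_per_instance, federated=True):
    """Count how many copies of one post the origin server sends out.

    readers_per_instance maps a (hypothetical) remote instance name to the
    number of its users who read the post.
    """
    if federated:
        # One copy per remote instance that has at least one reader;
        # that instance then serves its own users from its local copy.
        return sum(1 for n in readers_per_instance.values() if n > 0)
    # Without federation, the origin serves every reader directly.
    return sum(readers_per_instance.values())


# Made-up numbers, purely for illustration.
readers = {"small-a.example": 200, "small-b.example": 50, "small-c.example": 0}

federated_sends = origin_deliveries(readers, federated=True)
direct_sends = origin_deliveries(readers, federated=False)
print(federated_sends, direct_sends)
```

With these made-up numbers the origin sends 2 copies instead of 250. The flip side is the one raised above: the count in the federated case grows with the number of instances rather than the number of users, so lots of tiny instances eat into the saving.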
As for the hug of death of small instances that just posted something popular in their own local community, you got me there. Hope that server can handle the load. So far it hasn’t been a problem AFAIK but it could become one.
More instances may also introduce more database consistency issues. It all depends on how the application scales. If it’s designed to scale out, it might be fine.
Someone correct me if I’m wrong, but didn’t lemmy.world just migrate from a single server to a cluster of some kind? I thought I remembered seeing that in one of their troubleshooting posts.
Don’t get me wrong at all. Your point stands, and I think it’s a concern for the platform overall. But specifically for lemmy.world I thought it was no longer just “a server”.
I suspect/hope most of Reddit’s infra costs come from the massive processes they run to consume and correlate user data into sellable data, or the massive moderator tools using full-text search they probably use to hunt down undesirables.
I feel like just serving up text based information shouldn’t be that intensive if done right. But I definitely don’t have the experience to say so for a program handling millions of requests.
Even “just text” at a sufficient scale introduces significant technical challenges. I’m sure some of Reddit’s resources go to dealing with ads and some scraping of user data, but even just the basic user experience at the scale of Reddit takes thousands of servers… and that was back in 2018 when Reddit’s infrastructure team did an AMA. I’m sure it’s grown substantially since then.
Back then, on average, Reddit was sending out 32 gigabytes per second to support all of the users connecting. That text, at Reddit scale, becomes incredibly substantial.
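For a sense of what that rate adds up to, here is the arithmetic as a quick Python snippet. It assumes the 32 GB/s figure is a sustained average over a full day and uses decimal units (1 PB = 1,000,000 GB); both are my assumptions, not something stated in the AMA.

```python
GB_PER_SECOND = 32               # average outbound rate quoted above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

gb_per_day = GB_PER_SECOND * SECONDS_PER_DAY
pb_per_day = gb_per_day / 1_000_000  # decimal petabytes (assumption)

print(f"{gb_per_day:,} GB/day, about {pb_per_day:.2f} PB/day")
```

That works out to 2,764,800 GB, or roughly 2.76 PB, going out every day.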
And as you grow beyond single server capability, you get into clusters, load balancing, availability, consistency, and all kinds of other things that pop up to make a single application like Reddit operate at the scale it does.
Very insightful, thanks