Is an SFU the right architecture to handle about 500 concurrent users?

I am trying to incorporate something like Clubhouse/Twitter Spaces/Reddit Talk into my app. However, I don’t know if I should stick with a mesh network or use an SFU. I understand an MCU is overkill, since we are not streaming video and don’t need any processing on the server’s end. I was wondering if anyone here has implemented such features and would be willing to guide me. Thank you :)

We’re doing exactly this use case using rn-webrtc and mediasoup as an SFU. It works pretty well.

Awesome! Thanks for the validation. I was wondering how many users you are able to handle, say, with just one worker?

The general rule of thumb is about 500 consumers per CPU core. Note that I said “consumers”, not “users”.

A consumer is one user consuming a media stream from another user’s “producer”. So a room with 500 active users means 500 x 499 = 249,500 consumers. At 500 consumers per core, that would require roughly 500 CPU cores.

However, it’s abnormal to have a room with 500 people all talking. In reality, most of the producers will be muted, which means their consumers won’t cost any CPU cycles.
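To make the arithmetic concrete, here is a rough sketch of the estimate. It assumes audio only, one producer per unmuted speaker, and that every other user in the room consumes each unmuted producer. The function names are made up for illustration, and the 500-per-core figure is just the rule of thumb above, not a benchmark.

```typescript
// Rule of thumb from this thread (an estimate, not a measured limit).
const CONSUMERS_PER_CORE = 500;

// Each unmuted speaker's audio is consumed by every other user in the room.
function activeConsumers(totalUsers: number, activeSpeakers: number): number {
  if (totalUsers < 1 || activeSpeakers < 0 || activeSpeakers > totalUsers) {
    throw new RangeError("expected totalUsers >= 1 and 0 <= activeSpeakers <= totalUsers");
  }
  return activeSpeakers * (totalUsers - 1);
}

// Cores needed if one core handles about CONSUMERS_PER_CORE active consumers.
function estimateCores(totalUsers: number, activeSpeakers: number): number {
  return Math.ceil(activeConsumers(totalUsers, activeSpeakers) / CONSUMERS_PER_CORE);
}

console.log(estimateCores(500, 500)); // everyone unmuted: 499
console.log(estimateCores(500, 10));  // ten speakers, everyone else listening: 10
```

So the number that matters for sizing is how many people are unmuted at once, not how many are in the room.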

Mediasoup is not a plug-and-play solution. You’ll need to write your own signaling layer, and if you want to scale across multiple CPU cores, you’ll need to write the logic to pipe the streams between workers.
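As a rough idea of what that scaling logic involves, here is a sketch of the bookkeeping side only: deciding which worker should host a new consumer and whether the producer has to be piped there first. It does not call mediasoup at all. `WorkerSlot`, `Placement`, and `placeConsumer` are hypothetical names, and the 500-consumer cap is the rule of thumb from above.

```typescript
// Hypothetical bookkeeping for one worker (and its router).
interface WorkerSlot {
  id: string;
  consumers: number;        // consumers currently hosted on this worker
  producerIds: Set<string>; // producers created on, or already piped to, this worker
}

interface Placement {
  workerId: string;
  needsPipe: boolean; // true when the producer must be piped to this worker first
}

const MAX_CONSUMERS_PER_WORKER = 500; // assumption: rule of thumb, one worker per core

function placeConsumer(workers: WorkerSlot[], producerId: string): Placement {
  const candidates = workers.filter(w => w.consumers < MAX_CONSUMERS_PER_WORKER);
  if (candidates.length === 0) {
    throw new Error("all workers are at capacity");
  }
  // Prefer workers that already have the producer, to avoid an extra pipe.
  const local = candidates.filter(w => w.producerIds.has(producerId));
  const pool = local.length > 0 ? local : candidates;
  // Among those, pick the least-loaded worker.
  const target = pool.reduce((a, b) => (b.consumers < a.consumers ? b : a));
  const needsPipe = !target.producerIds.has(producerId);
  if (needsPipe) {
    target.producerIds.add(producerId); // the caller is expected to do the actual pipe
  }
  target.consumers += 1;
  return { workerId: target.id, needsPipe };
}
```

When `needsPipe` comes back true, the real work in mediasoup would be something like calling `router.pipeToRouter({ producerId, router })` on the producer’s router before creating the consumer on the target router. Check the mediasoup docs for the exact options.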

Mediasoup has a very active community on Discourse and is very well documented. I’d recommend checking it out.

Mediasoup is written in C++, but you control it through a Node.js module. This was a big selling point for me, as I can keep my entire stack in JavaScript.