Data streaming and distribution use case guide #2979
74a56b6 to 6881afb
* **Producer simplicity:** Publishers send complete state; Ably handles the optimization
* **Lossless updates:** Subscribers still receive every update, just in a more efficient format

[//]: # (Think it could be good to have some kind of visual showing full message vs delta message size comparison so it's not just walls of text)
Not yet, I have a bunch of visuals that I'm going to request so I'll make sure they all get added. Some may be after merging though.
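Until a visual exists, the full-message vs delta-message comparison mentioned above can be approximated with a minimal Python sketch. The match state shape and the `changed_fields` helper are hypothetical, and this naive field-level diff only illustrates the size argument; it is not how Ably computes deltas.

```python
import json


def changed_fields(previous: dict, current: dict) -> dict:
    """Return top-level keys that were added or changed (removed keys are ignored)."""
    return {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }


# Hypothetical match state that has grown during the game.
previous = {
    "match_id": "m-1",
    "score": {"home": 2, "away": 1},
    "players": [{"id": i, "rank": i, "points": 0} for i in range(50)],
    "events": [f"event-{i}" for i in range(100)],
}
# Only the score changes between two consecutive updates.
current = {**previous, "score": {"home": 3, "away": 1}}

full_payload = json.dumps(current).encode("utf-8")
delta_payload = json.dumps(changed_fields(previous, current)).encode("utf-8")

print(f"full state: {len(full_payload)} bytes")
print(f"changed fields only: {len(delta_payload)} bytes")
```

The gap between the two sizes widens as the state grows, which is consistent with the later note that deltas are more useful for larger payloads.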
* **Cost efficiency:** Fewer outbound messages reduce billable message counts
* **Granular control:** Publish rates can differ across conflation groups, but still be conflated on the same channel.

[//]: # (Could be good here to have another visual showing messages published at mixed rates with differing conflation keys being conflated into a single batch message)
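As a rough illustration of the "granular control" point, the sketch below shows messages published at mixed rates under different conflation keys being reduced to the latest message per key within one window. The `conflate` function and the `key`/`data` field names are hypothetical; this is not Ably's implementation or its rule configuration syntax.

```python
def conflate(window: list[dict]) -> list[dict]:
    """Keep only the most recent message for each conflation key in one window."""
    latest: dict[str, dict] = {}
    for message in window:  # messages arrive in publish order
        latest[message["key"]] = message  # later messages overwrite earlier ones
    return list(latest.values())


# "team-a" publishes three times in the window, "team-b" only once.
window = [
    {"key": "team-a", "data": 1},
    {"key": "team-a", "data": 2},
    {"key": "team-b", "data": 10},
    {"key": "team-a", "data": 3},
]

conflated = conflate(window)
print(conflated)
```

Four published messages become two delivered messages, one per key, regardless of how quickly each key was published.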
193888b to cf51f30
AndyTWF left a comment
This is a good start - I think we can do a bit more to call out Ably's USPs, what makes us distinct in the way we handle these tough problems (e.g. computing deltas server-side)?
How you structure channels significantly impacts performance, costs, and scalability:

**Single channel with many subscribers:**
This is the most common pattern for data streaming, and is recommended in most use-cases. Ably uses [consistent hashing](/docs/platform/architecture/platform-scalability) to distribute channel load across instances, enabling you to fan out to millions of subscribers on a single channel:
Consistent hashing is more about balancing inbound message load, rather than the fanout.
Clarified this, and mentioned we have a separate connection layer that can scale independently from the cores. 5f18fcd
Once conflation is configured as a channel rule, no consumer code changes are needed. Subscribers receive conflated updates transparently:

<Code>
Ditto re more client-side languages
m-hulbert left a comment
This is really great! I've mostly left minor comments throughout and then 2 asks on formatting:
- There's some odd linting putting a line break after every sentence or piece of punctuation.
- There's a lot of bold in use - especially in the lists. Can we pare this back a bit to only highlight critical or really important info please.
- **IoT sensor networks:** Aggregating and distributing data from thousands of sensors across industrial facilities, smart cities, or environmental monitoring systems.
- **Live event platforms:** Managing reactions, chat, and activity feeds during concerts, conferences, or sporting events with thousands of simultaneous participants.
- **Fleet and asset tracking:** Real-time position and status updates for vehicles, equipment, or goods in logistics and supply chain applications.
- **Collaborative applications:** Synchronizing state and updates across users in shared documents, design tools, or gaming environments.
I actually disagree with the previous suggestion of mixing a Spaces use case into a Pub/Sub guide.
It does feel like there is some crossover here, state sync for things like match state is very much a pubsub use-case, but I'll have a think and might just remove it.
Edit: I removed it for now.

## How do I reduce bandwidth and latency when data changes frequently but incrementally?

**The challenge:** A sports games platform streams live match state to spectators and players as matches progress. The match state naturally grows throughout the game - starting with basic match info, then accumulating player statistics, scores, game events, and other statistics. By mid-game, the full state can be many kilobytes. Publishing the entire state with each update means transmitting increasingly large payloads repeatedly to every spectator, even though only small portions change between updates (a player's ranking shifts, a score increment or some game metric updates). This wastes massive bandwidth on redundant information, especially as the match state grows larger.
I'd suggest removing the bold emphasis on the 'problem' and 'what's needed' and just introducing the separate paragraphs using words, e.g.
"An example of frequent, incremental state change is found in sports games platforms..."
**Realtime vs REST for publishing:**
Choose the appropriate client type based on your publisher characteristics:
- **Use Realtime SDK** when:
  - Publishing messages at a very high volume.
  - Need the lowest possible latency.
  - Need bidirectional communication (publish and subscribe).
  - Ordering of published messages is critical.

- **Use REST API** when:
  - You have stateless publishers (e.g., serverless functions).
  - Publishing from environments where maintaining persistent connections is impractical.
  - Publishing from some authoritative backend server, or on behalf of multiple users.
  - Batch publishing many messages to different channels in a single API call.
This is quite generic advice on REST vs realtime - is there anything we should recommend specifically for these use cases here?
I could just remove this I suppose? The use-cases are more focused on the stream/outbound clients and I've not really discussed publisher characteristics (which could probably be a guide in and of itself), what do you think?

### Platform capabilities

Ably's architecture provides built-in guarantees:
Is this any different or rather more info in respect to the 4 pillars section?
There wasn't, so I've removed it :)
* Choose appropriate optimization strategy for each channel or namespace.
* Monitor statistics to validate configuration choices, start conservatively.
* Test client applications to ensure they handle any added latency or batching behavior.
* Review [platform limits](/docs/platform/pricing/limits) for your account tier.
I wonder if we should exclude this as theoretically this should be hard to hit for the majority of customers.
682ec4c to 151751f
I've corrected the linting, seems my local was misconfigured. I've cut down the number of lists/use of bold - if it's still too much, I can do another round of culling
7eed2db to a8e78f1
a8e78f1 to f7089a1
f7089a1 to b044538
- The optimization works transparently without code changes for producers or consumers
- Creates predictable billing that scales linearly with user count rather than message volume

[//]: # (Again, might be good to get a diagram or something in here so it feels less like a wall of text..)
Design have a diagram for SSB I'm pretty sure
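In the meantime, the billing claim in the hunk above (cost scaling with user count rather than message volume) can be sketched as simple arithmetic. This is a back-of-the-envelope model, not Ably's billing logic: it assumes each subscriber receives at most one batch per interval and that a batch counts as one delivered message. The function name and all parameters are hypothetical.

```python
from typing import Optional


def outbound_messages_per_second(
    publish_rate: float,
    subscribers: int,
    batch_interval_ms: Optional[float] = None,
) -> float:
    """Estimate deliveries per second on one channel.

    Assumption: with batching enabled, each subscriber receives at most
    one batch per interval, and a batch counts as one delivered message.
    """
    if batch_interval_ms is None:
        per_subscriber = publish_rate
    else:
        per_subscriber = min(publish_rate, 1000 / batch_interval_ms)
    return per_subscriber * subscribers


unbatched = outbound_messages_per_second(publish_rate=1000, subscribers=5000)
batched = outbound_messages_per_second(
    publish_rate=1000, subscribers=5000, batch_interval_ms=100
)
print(unbatched, batched)
```

Under these assumptions, doubling the publish rate leaves the batched figure unchanged, while doubling the subscriber count doubles it.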
AndyTWF left a comment
Happy with the technical solutions proposed here - will leave it to the folks in DevEd to cover the typography
- Use sports example instead of racing telemetry - Mention deltas are more useful for larger payloads
…ntion independent scaling layer for fanout.
cc54037 to 28ce9a7
Description
Added a Pub/Sub guide that focuses on data streaming and distribution, covering conflation, deltas, and server-side batching.
FF-164
TODO: There are a few diagrams/images that might be useful, I will ask design once we are happy with the content.
Checklist