Algorithms#

We can class the RTC Algorithms into 3 main categories:

  1. CRDT category doesn’t need a central server and is used by Riak, TomTom GPS, Teletype for Atom…

  2. OT category needs a central server and is used by Google Docs, Office365…

  3. Diffs category used by Cocalc

Some implementations have a Mixed approach combining techniques of the above categories.

We also have an Others section for algorithm that don’t fit into those categories.

The following brings more perspective on how those categories compare.

CRDT#

About CRDT#

CRTD is an ancronym for Conflict-free Replicated Data Type. CRDT doesn’t need a central server and is used by Riak, TomTom GPS, Teletype for Atom. CRDT is about Shared Data Structures.

The following videos are useful to discover CRTD in relationship with OT.

CRDT is strongly eventual consistent:

CRDT can use Version Vectors to assign versions that indicate how many changes have been made by a user.

You can also jump into the following links to get more details.

CRDT has some edge cases: The Hard Parts and its references (slides - 1h10m video - hacker news discussion).

RGA Split is a High Responsiveness for Group Editing CRDTs.

Some post-mortem stories can be instructive: xi-editor link 1 and xi-editor link 2.

Other information:

CRDT Libraries#

JupyterLab RTC integration is relying on the CRDT Implementation provided by Lumino Datastore. The Lumino Datastore library is also used in e.g. the interactive dashboard editor for JupyterLab.

Y.js is a CRDT implementation. Checkout out How Yjs works from the inside out video and the online demos:

Read more on the Y.js discussion site and the Y.js blog (backed by tag1consulting) and lib0 encoding (lib0 repo).

Automerge is a JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.

Applications examples using Automerge are available: pushpin (live), pixel pusher ([DEPRECATED] trelli).

Teletype CRDT is string-wise sequence CRDT powering peer-to-peer collaborative editing in Teletype for Atom.

Local First (by jaredforsyth) is nested object CRDT, a hybrid logical clock with a rich-text CRDT.

CRDT-sequence (by @phedkvist), with a server.

Logux has features inspired by CRDT to resolve edit conflicts between users. Real-time updates to prevent conflicts. Time travel to keep actions order the same on every client. A distributed timer to detect the latest changes. It provides a Server and a client.

Rust CRDT is a family of CRDT’s supporting both State and Op based replication..

OT#

About OT#

OT is an acronym for Operational Transformation. OT needs a central server and is used by Google Docs, Office365..

Transformation Property TP2 Case.

Read more on CKEditor lessons learned and other Libraries for OT.

OT Libraries#

ShareDB is a realtime database backend based on Operational Transformation (OT) of JSON documents. It is the realtime backend for the DerbyJS web application framework.

OT.js is not maintained anymore.

Diffs#

About Diffs#

Diffs is more a family of protocols that rely on exchange and merge of diffs. It is used by e.g. Cocalc. Read more on this in the following.

Diffs Libraries#

Google Diff-Match-Patch offers robust algorithms to perform the operations required for synchronizing plain text. See also JackuB Diff-Match-Patch.

Diff-Sync library. See the diffsync-todos example deployed on heroku.

Mixed#

Microsoft Fluid#

Microsoft Fluid has been announced in May 2020 (see also this doc). It is now released.

The offered functionalities and how it can be used outside of Microsoft Office 365 are investigated in jupyterlab/rtc#80.

From the Fluid FAQ:

  • Does Fluid use operational transforms? Fluid does not use Operational Transforms (OT), but we learned a tremendous amount from the literature on OT. While OT uses operations that can be applied out of order by transforming operations to account for recent changes, Fluid relies on a Total Order Broadcast to guarantee that all operations are applied in a specific order.

  • Does Fluid use CRDT? Fluid does not use Conflict-Free Replicated Data Types (CRDTs), but our model is more similar to CRDT than OT. The Fluid Framework relies on update-based operations that are ordered using our Total Order Broadcast to prevent conflicts. This allows us to have non-commutative operations because there is an explicit ordering.

On Twitter.

Other information.

Others#

We review here other algorithms and techniques in the broader Distributed Computing domain that could be useful for RTC.

Paxos#

Vector Clocks#