The Stream
Operations
CheckTail
Determine the tail of the stream, i.e. the next sequence number that will be assigned to a record.
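A minimal in-memory model can make the tail semantics concrete. This is an illustrative sketch, not the S2 API: the tail is always the next sequence number that will be assigned, so it only ever moves forward as records are appended.

```python
# Illustrative in-memory model of tail semantics (not the S2 API).
class Stream:
    def __init__(self):
        self.next_seq = 0   # next sequence number to assign
        self.records = {}   # seq -> record body

    def append(self, body):
        seq = self.next_seq
        self.records[seq] = body
        self.next_seq += 1
        return seq

    def check_tail(self):
        # The tail is the next sequence number that will be assigned.
        return self.next_seq

s = Stream()
s.append("a")
s.append("b")
assert s.check_tail() == 2   # the next record would be assigned sequence number 2
```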
Read and ReadSession
Read records from the stream, starting at any sequence number that has not been trimmed.
Optionally, limit by count or total bytes.
With a session, you can read in a streaming fashion. If a limit is not specified and the end of the stream is reached, the session goes into real-time tailing mode and will return records as they are appended to the stream.
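The bounded-read behavior can be sketched with a small model (illustrative only, not the S2 API): records are returned from a start sequence number onward, stopping early if a count or byte limit would be exceeded.

```python
# Illustrative model of a bounded read (not the S2 API): return records from a
# start sequence number, optionally limited by count or total bytes.
def read(records, start, count=None, bytes_limit=None):
    out, total = [], 0
    for seq in sorted(records):
        if seq < start:
            continue
        body = records[seq]
        if count is not None and len(out) >= count:
            break
        if bytes_limit is not None and total + len(body) > bytes_limit:
            break
        out.append((seq, body))
        total += len(body)
    return out

records = {0: b"aa", 1: b"bb", 2: b"cc"}
assert read(records, start=1) == [(1, b"bb"), (2, b"cc")]
assert read(records, start=0, count=2) == [(0, b"aa"), (1, b"bb")]
assert read(records, start=0, bytes_limit=3) == [(0, b"aa")]
```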
Append and AppendSession
Append a batch of records to a stream. Appends execute atomically — either all the records in a batch will become durable, or none. S2 returns the range of sequence numbers assigned.
With a session, you can pipeline batches with an ordering guarantee, and receive acknowledgements back in a corresponding order. If any batch fails, subsequent batches will not become durable.
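These semantics can be sketched with a toy model (illustrative only, not the S2 API): a batch is appended atomically and its assigned sequence number range is returned, and once a pipelined batch fails, subsequent batches do not become durable either.

```python
# Illustrative model of atomic batch appends with pipelining semantics
# (not the S2 API). An empty batch stands in for a failed append here.
class Stream:
    def __init__(self):
        self.records = []
        self.failed = False   # once a batch fails, later batches are rejected

    def append_batch(self, batch):
        if self.failed or not batch:
            self.failed = True
            return None       # nothing in this batch becomes durable
        start = len(self.records)
        self.records.extend(batch)     # all-or-nothing
        return (start, start + len(batch) - 1)

s = Stream()
assert s.append_batch(["a", "b"]) == (0, 1)   # range of assigned sequence numbers
assert s.append_batch([]) is None             # a failed batch...
assert s.append_batch(["c"]) is None          # ...fails subsequent batches too
assert s.records == ["a", "b"]
```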
Concurrency control
Appends support two mechanisms for concurrency control.
Match sequence
Specifying an expected current state is a form of optimistic concurrency control. The Append and AppendSession APIs allow you to assert on the sequence number that you expect S2 to assign to the first record in a batch. If it does not match, the gRPC API will return an ABORTED status code, signalling that there has been a concurrent write.
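The retry pattern this enables can be sketched as follows (an illustrative model, not the S2 API): an append carrying a matched sequence number is rejected if that number differs from the one the stream would actually assign, and the writer retries with a fresh tail.

```python
# Sketch of optimistic concurrency via a matched sequence number
# (illustrative model, not the S2 API).
class AbortedError(Exception):
    """Stands in for the gRPC ABORTED status."""

class Stream:
    def __init__(self):
        self.records = []

    def append(self, batch, match_seq_num=None):
        next_seq = len(self.records)
        if match_seq_num is not None and match_seq_num != next_seq:
            raise AbortedError("concurrent write detected")
        self.records.extend(batch)
        return (next_seq, next_seq + len(batch) - 1)

s = Stream()
s.append(["a"])                       # tail is now 1
try:
    s.append(["b"], match_seq_num=0)  # stale expectation: someone wrote first
except AbortedError:
    pass
assert s.append(["b"], match_seq_num=1) == (1, 1)  # retry with the fresh tail
```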
Fencing token
Fencing is a form of pessimistic concurrency control. It is a cooperative mechanism, so an append that does not specify a fencing token will still be allowed. When an append does include a fencing token, S2 will enforce that it matches the current token set for the stream, with strong consistency. Setting the current fencing token requires appending a command record.
A command record occupies a sequence number on the stream, which lets S2 enforce the correct semantics, and it will also be returned in reads. S2 SDKs make it easy to create supported command records.
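The fencing semantics can be sketched with a toy model (illustrative only, not the S2 API): setting the token is a command record that consumes a sequence number, appends that carry a token must match the current one, and token-less appends are still allowed since the mechanism is cooperative.

```python
# Sketch of fencing-token semantics (illustrative model, not the S2 API).
class FencedError(Exception):
    pass

class Stream:
    def __init__(self):
        self.records = []
        self.token = None

    def set_fencing_token(self, token):
        self.records.append(("fence", token))  # command record occupies a seq number
        self.token = token
        return len(self.records) - 1

    def append(self, body, fencing_token=None):
        if fencing_token is not None and fencing_token != self.token:
            raise FencedError("stale fencing token")
        self.records.append(("data", body))
        return len(self.records) - 1

s = Stream()
assert s.set_fencing_token("leader-1") == 0      # the command record takes seq 0
assert s.append("x", fencing_token="leader-1") == 1
assert s.append("y") == 2                        # cooperative: no token, still allowed
try:
    s.append("z", fencing_token="leader-0")      # a fenced-out former leader
except FencedError:
    pass
```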
You can combine fencing tokens with a leader lease to build robust distributed data systems, leaning on S2 for replicated durability with the exclusive writer semantics of a local write-ahead log.
Marc Brooker has an excellent blog post discussing this technique in the context of AWS MemoryDB.
Retention
Age-based retention can be configured on a stream, and S2 will trim records automatically.
Explicit trimming is currently supported using a command record. A more direct API which does not consume a sequence number will be added soon.
Key-based compaction inspired by Kafka’s semantics is on our roadmap. Instead of configuring an age threshold, you will be able to specify the name of a header whose value represents the record key for compaction.
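As a sketch of the intended semantics (this feature is on the roadmap, so everything here is an assumption, including the header name "compaction-key"): for each key, taken from the configured header, only the latest record survives compaction, while records without that header are untouched.

```python
# Sketch of key-based compaction (illustrative; the feature is on S2's roadmap,
# and the header name "compaction-key" is a hypothetical choice).
def compact(records, key_header="compaction-key"):
    latest = {}  # key -> seq of the latest record carrying that key
    for seq, (headers, _body) in records.items():
        key = headers.get(key_header)
        if key is not None:
            latest[key] = max(seq, latest.get(key, seq))
    keep = set(latest.values())
    return {
        seq: rec
        for seq, rec in records.items()
        if rec[0].get(key_header) is None or seq in keep
    }

records = {
    0: ({"compaction-key": "user:1"}, b"v1"),
    1: ({"compaction-key": "user:2"}, b"v1"),
    2: ({"compaction-key": "user:1"}, b"v2"),  # supersedes seq 0
}
assert sorted(compact(records)) == [1, 2]
```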