Operations

The gRPC API and SDKs offer both unary operations as well as streaming sessions. The upcoming REST API will support most operations aside from streaming appends.

The first usage of an authentication token on a connection may experience a latency overhead for verification, typically tens of milliseconds. Subsequent requests will not incur such an overhead.

Latency guidance noted below assumes a client in the same cloud region as the basin. Currently, S2 defaults basins to aws:us-east-1 — see endpoints.

CheckTail

Determine the tail of the stream, i.e. the next sequence number that will be assigned to a record.

This is a fast operation, expected to complete in single-digit milliseconds.

Read and ReadSession

Read records from the stream, starting at any sequence number that has not been trimmed.

Optionally, limit by count or total bytes.

With a session, you are able to read in a streaming fashion. If a limit is not specified and the end of the stream is reached, the session goes into real-time tailing mode and will return records as they are appended to the stream.

Reading records written within the last 20 seconds is expected to take single-digit milliseconds. Otherwise, the time-to-first-byte can take up to 200 milliseconds.

Append and AppendSession

Append a batch of records to a stream. Appends execute atomically — either all the records in a batch will become durable, or none. S2 returns the range of sequence numbers assigned.

With a session, you can pipeline batches with an ordering guarantee, and receive acknowledgements back in a corresponding order. If any batch fails, subsequent batches will not become durable.

Acknowledgment latency depends on the storage class for the stream. A Standard stream will acknowledge writes within 500 milliseconds, and an Express stream within 50 milliseconds.

Concurrency control

Appends support two mechanisms for concurrency control.

Match sequence

Specifying expected current state allows optimistic concurrency control.

You can provide the sequence number that you expect S2 to assign to the first record in a batch as the match_seq_num.

If it does not match, the gRPC API will return an ABORTED status code, signalling that there has been a concurrent write.

Fencing token

Fencing is a form of pessimistic concurrency control.

It is a cooperative mechanism, so an append that does not specify a fencing token will still be allowed.

When an append does include a fencing_token, the gRPC API will return an ABORTED status code if it does not match the current token set for the stream.

Setting the current fencing token itself requires appending to the stream, with the fence command.

Retention

Age-based retention can be configured on a stream, and S2 will automatically delete records that are older than the configured threshold.

Explicit trimming is supported with the trim command.

Key-based compaction inspired by Kafka’s semantics is on our roadmap. Instead of configuring an age threshold, you will be able to specify the name of a header whose value represents the key for compaction.

Command records

Command records are an advanced feature to append certain operations interpreted by the service. S2 SDKs make it easy to create supported command records.

Concretely –– a command record is a record with a sole header that has an empty name, an operation encoded in this header value, and a payload for the command in the body of the record. Empty header names are not allowed in any other context.

Command records take up a sequence number on the stream, and will be returned to reads. It is easy to test and filter out commands if needed, with the logic len(headers) == 1 && headers[0].name == b"".

Operations that are currently supported:

  • fence with an upto 16 byte payload to set a fencing token for the stream. An empty payload clears the token. Fencing is strongly consistent, and subsequent appends that specify a fencing token will be rejected if it does not match.

  • trim with exactly 8 big-endian bytes as payload representing the desired earliest sequence number for the stream –– we will call it the trim point. The effective trim point from the command is going to be max(existing_trim_point, min(provided_trim_point, my_seq_num + 1)). Trimming is eventually consistent, and trimmed records may be visible for a brief period.

The S2 CLI also supports fence and trim commands.