Operations

There are three core data operations on a stream.

Retention

See the Retention + Trimming guide for details on age-based retention and explicit trimming.

Data in JSON

The following pieces of record data are stored as bytes:
  • Header name
  • Header value
  • Body
A custom header S2-Format is used to indicate the desired encoding of these bytes when records are represented in JSON.

raw

S2-Format: raw, or omit the header. Use when your record data is valid Unicode text. Zero overhead and human-readable, but it cannot safely represent binary data.

base64

S2-Format: base64. Use when you are working with arbitrary bytes. Always safe, at the cost of ~33% overhead over the wire.
You can write using one format and read with another. When reading as raw, S2 interprets the stored bytes as UTF-8; this conversion is potentially lossy unless the record was written as raw, or as base64-encoded valid UTF-8.
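As a sketch of the base64 path, the following Python snippet (the helper name is hypothetical; the record shape matches the JSON append example later in this page) prepares a JSON append payload for arbitrary bytes:

```python
import base64
import json


def make_append_payload(body: bytes) -> tuple[dict, str]:
    """Build request headers and a JSON body for appending arbitrary bytes.

    base64 is always safe for binary data, at ~33% wire overhead.
    """
    headers = {"S2-Format": "base64", "Content-Type": "application/json"}
    payload = json.dumps(
        {"records": [{"body": base64.b64encode(body).decode("ascii")}]}
    )
    return headers, payload


headers, payload = make_append_payload(b"\x00\xffnot valid UTF-8")
```

Reading the same record back with S2-Format: base64 decodes to the exact bytes that were written.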

Protobuf messages

Data plane endpoints for appending and reading records also support protobuf bodies, which avoids the base64 encoding tax on binary data in JSON messages. To send and receive protobuf:
  • Set the Content-Type header to application/protobuf and send a protobuf-encoded payload.
  • Set the Accept header to application/protobuf to receive a protobuf-encoded response. The response will include the Content-Type: application/protobuf header if the server returns a protobuf.
Type definitions are available in git and Buf.
Sending the Accept: application/protobuf request header only guarantees a protobuf response on success (HTTP 200). Other status codes are always accompanied by JSON bodies.

Sessions

S2S (S2-Session) is a minimal binary protocol that encapsulates streaming append and read session semantics over HTTP, at the same endpoints as “unary” calls. It is meant for S2 SDKs, but documented for the adventurous.

Setup

  • Content-Type: s2s/proto signals that a session is being requested.
  • Accept-Encoding signals which compression algorithms are supported (the service supports zstd and gzip). Content-Encoding is not sent, since message-level compression is used.
  • 200 OK response establishes a session.

Message framing

All integers use big-endian byte order. Messages smaller than 1 KiB should not be compressed.
Length prefix (3 bytes): total message length (flag + body)
Flag byte (1 byte): [T][CC][RRRRR]
  • T (bit 7): Terminal flag (1 = stream ends after this message)
  • CC (bits 6-5): Compression (00=none, 01=zstd, 10=gzip)
  • RRRRR (bits 4-0): Reserved
Body (variable):
  • Regular message is a Protobuf
  • Terminal message contains a 2-byte status code, followed by JSON error information (corresponding to unary response behavior)
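A minimal sketch of this framing in Python (function names are illustrative, not part of the protocol):

```python
COMPRESSION = {"none": 0b00, "zstd": 0b01, "gzip": 0b10}


def frame(body: bytes, *, terminal: bool = False, compression: str = "none") -> bytes:
    """Encode one S2S message: 3-byte length prefix, flag byte, then body."""
    flag = (int(terminal) << 7) | (COMPRESSION[compression] << 5)
    length = 1 + len(body)  # total message length = flag byte + body
    return length.to_bytes(3, "big") + bytes([flag]) + body


def unframe(buf: bytes) -> tuple[bool, str, bytes]:
    """Decode one S2S message, returning (terminal, compression, body)."""
    length = int.from_bytes(buf[:3], "big")
    flag = buf[3]
    terminal = bool(flag >> 7)
    compression = {v: k for k, v in COMPRESSION.items()}[(flag >> 5) & 0b11]
    body = buf[4 : 3 + length]
    return terminal, compression, body
```

For example, `frame(b"hello")` yields a 3-byte prefix of 6 (one flag byte plus five body bytes) followed by a zero flag byte and the body.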

Data flow

Append sessions are a bi-directional stream of AppendInput messages from Client → Server, and AppendAck messages from Server → Client.
Read sessions are a uni-directional stream of ReadBatch messages from Server → Client. When waiting for new records, an empty batch is sent as a heartbeat at least every 15 seconds.
Read sessions are also supported over SSE, which has the benefit of being usable in a browser context.

Command records

Command records are an advanced feature that signals certain operations interpreted by the service. S2 SDKs make it easy to create supported command records. Concretely, a command record is a record with:
  1. a sole header with an empty name — empty header names are not allowed in any other context
  2. the operation encoded in this header value
  3. the payload for the command in the body of the record.
Command records take up a sequence number on the stream, and will be returned by reads. Commands are easy to detect and filter out if needed, with the logic len(headers) == 1 && headers[0].name == b"".
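The filtering logic above can be sketched in Python, assuming the JSON representation where each header is a [name, value] pair (as in the curl example at the end of this page):

```python
def is_command_record(record: dict) -> bool:
    """True if the record is a command record: exactly one header whose name is empty."""
    headers = record.get("headers", [])
    return len(headers) == 1 and headers[0][0] == ""


# A fence command record vs. an ordinary record with a named header.
fence = {"headers": [["", "fence"]], "body": "producer-123"}
ordinary = {"headers": [["trace-id", "abc"]], "body": "payload"}
```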
Operations that are currently supported:
  • fence — set a fencing token for cooperative write exclusion. Payload is up to 36 UTF-8 bytes; an empty payload clears the token.
  • trim — set a trim point to remove older records. Payload is exactly 8 big-endian bytes representing the desired earliest sequence number.
curl -X POST "https://{basin}.b.aws.s2.dev/v1/streams/my-stream/records" \
    -H "Authorization: Bearer <token>" \
    -H "s2-format: raw" \
    -H "Content-Type: application/json" \
    -d '{
      "records": [
        {
          "headers": [
            ["", "fence"]
          ],
          "body": "producer-123"
        }
      ]
    }'
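A trim command can be built similarly. Since its body is exactly 8 big-endian bytes, it cannot be sent as raw text; the sketch below (helper name hypothetical) builds the JSON payload in base64 format, where header names and values are base64-encoded as well, so the request would carry S2-Format: base64:

```python
import base64
import json
import struct


def trim_record_payload(earliest_seq_num: int) -> str:
    """Build a JSON append payload for a trim command record.

    The body is the desired earliest sequence number as 8 big-endian bytes.
    """
    body = struct.pack(">Q", earliest_seq_num)
    return json.dumps(
        {
            "records": [
                {
                    # Sole header: empty name, operation "trim" — base64-encoded.
                    "headers": [["", base64.b64encode(b"trim").decode("ascii")]],
                    "body": base64.b64encode(body).decode("ascii"),
                }
            ]
        }
    )
```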