Javascript disabled? Like other modern websites, the IETF Datatracker relies on Javascript. Please enable Javascript for full functionality.
Mercurius Window System (MWS)
draft-ross-mercurius-02

Versions:
This document is an Internet-Draft (I-D). Anyone may submit an I-D to the IETF. This I-D is not endorsed by the IETF and has no formal standing in the IETF standards process.
Document	Type	Active Internet-Draft (individual)
	Author	Christopher Ross
	Last updated	2026-05-07
	RFC stream	(None)
	Intended RFC status	(None)
	Formats	txt htmlized bibtex bibxml
Stream	Stream state	(No stream defined)
	Consensus boilerplate	Unknown
	RFC Editor Note	(None)
IESG	IESG state	I-D Exists
	Telechat date	(None)
	Responsible AD	(None)
	Send notices to	(None)
Email authors IPR References Referenced by Nits Search email archive
draft-ross-mercurius-02
NETWORK WORKING GROUP
Network Working Group                                      C.P. Ross
Internet-Draft                                           Independent
Intended status: Experimental                            07 May 2026
Expires: 07 November 2026

                 Mercurius Window System (MWS)
                 draft-ross-mercurius-02

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at
   any time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This document is an individual submission to the IETF.
   Distribution of this document is unlimited.

   The latest version of this draft can be found at:
   https://mercurius.tebibyte.org/draft-ross-mercurius-02.txt

Author’s Address

   Christopher Ross
   Independent
   Email: chris@tebibyte.org
   Project Website: https://mercurius.tebibyte.org

Abstract

   The Mercurius Window System (MWS) is a network‑native, server‑side
   rendering system that enables graphical sessions to be accessed
   remotely with explicit semantics for windows, input, audio, and
   session state. MWS allows a user to interact with a workstation
   from untrusted or resource‑constrained client devices without
   exposing application data, GPU workloads, or compositor state to
   those devices. The protocol defines a zero‑trust client model, a
   structured session and window architecture, and distinct planes for
   rendering, video, audio, and input over an SCTP multi‑stream
   transport profile. This document specifies the MWS architecture,
   message formats, transport requirements, and security model.

Executive Summary (Non‑Normative)

   The Mercurius Window System (MWS) is a secure,
   high‑performance window system for environments where
   applications run on one machine but users work from another.
   Whether the host is a personal workstation, a shared server, a
   large deployment, or a private cloud, MWS provides responsive
   access from any location without assuming that “local” networks
   or client devices are trustworthy.

   MWS is designed for modern mobility patterns: consultants,
   remote workers, and digital nomads who move between client sites,
   hotels, airports, and home offices.  These users often rely on
   lightweight laptops or tablets that may be lost, stolen, or
   compromised.  MWS ensures that possession of a client device is
   never sufficient to access the workstation.  Authentication
   requires explicit user presence, and no sensitive data,
   credentials, or GPU workloads reside on the client.

   MWS preserves the network transparency that made X11 valuable,
   whilst replacing its implicit trust and CPU‑bound rendering with
   a modern, zero‑trust, GPU‑accelerated architecture.  The result
   is a window system that feels local even when the GPU is in
   another room, another building, or another country, supporting
   workflows ranging from office productivity to high‑refresh‑rate
   interactive applications on modern networks.

Table of Contents

      1. Introduction
         1.1. Scope and Applicability
         1.2. Design Rationale
         1.3. Cloud and Distributed Computing Context
         1.4. High‑Performance Rendering and Gaming
         1.5. Terminal Requirements and Wireless Considerations
         1.6. Session Mobility and Detachable Operation

      2. Conventions Used in This Document

      3. System Architecture
         3.1. Architectural Principles
         3.2. Major Components
         3.3. Workstation‑Centric Model
         3.4. Rendering and Surface Model
         3.5. Session Model
         3.6. Zero‑Trust Client Model
         3.7. Network Considerations
         3.8. Transport Requirements

      4. Detailed Architecture
         4.1. Sessions
         4.2. Seats
         4.3. Windows
         4.4. Compositor Model
         4.5. Rendering Model
         4.6. Audio Model
         4.7. Stream Allocation
            4.7.1. Session Identity and Message Routing
         4.8. Session Lifecycle
         4.9. Session and Seat Model
         4.10. Local Transport Profile (Non‑Normative)
         4.11. Security Model

      5. Protocol Specification
         5.1. Message Framing
         5.2. Control Messages (Stream 0)
            5.2.1. Initial Handshake (001–099)
               5.2.1.1. Session Identifier Semantics
            5.2.2. Session Management (100–199)
               5.2.2.1. Resume Semantics
            5.2.3. Window Lifecycle (200–299)
               5.2.3.1. Window Identifier Scope
         5.3. Rendering Commands (300–399) — Stream 1
         5.4. Input Events (400–499) — Stream 2
            5.4.1. Pointer Motion Events
            5.4.2 Input Scoping and Session Isolation
         5.5. Video Fallback (500–599) — Stream 3
         5.6 Audio Plane (600–699) — Stream 4
         5.7. Protocol State Machine
            5.7.1. Initial Connection
            5.7.2. Session Resume
         5.8. WSI Extension (Surface Creation)
         5.8.1. Surface Binding and Session Validation
         5.8.2. Required Enums
         5.9. Error Handling (700–799)
            5.9.1. Session and Resource Validation Errors

      6. Reference Implementation
         6.1. Server Components
         6.2. Client Components
         6.3. Demonstration Clients
            6.3.1. Vulkan Demonstration Clients
            6.3.2. Simple Game Demonstration
            6.3.3. Desktop Environment Support
         6.4. Dependencies
         6.5. Bootstrap Example

      7. Implementation Requirements and Validation
         7.1. Test Matrix
            7.1.1. Core Validation Tests
            7.1.2. Reference Implementation Commands (Non‑Normative)
         7.2. GPU Isolation Requirements
         7.3. Bandwidth and Transport Isolation Requirements

      8. Performance Considerations

      9. Security Considerations
         9.1. DANE Deployment (Non‑Normative)

      10. IANA Considerations

      11. Acknowledgements

      12. References
         12.1. Normative References
         12.2. Informative References

      13. Copyright

      Appendix A. MWS Opcode Registry
         A.1. Handshake and Authentication (000–099)
         A.2. Session Management (100–199)
         A.3. Window Lifecycle (200–299)
         A.4. Rendering Commands (300–399)
         A.5. Input Events (400–499)
         A.6. Video Plane (500–599)
         A.7. Audio Plane (600–699)
         A.8. Error Reporting (700–799)
         A.9. Reserved for Future Extensions (800–899)
         A.10. Experimental and Vendor‑Specific (900–999)

      Appendix B. Authentication Mechanism Registry
         B.1. Standard Mechanisms
         B.2. Extensible Mechanisms
         B.3. Private and Experimental Mechanisms
         B.4. Registration Policy

      Appendix C. SCTP Stream Usage Summary
         C.1. Stream 0 — Control Plane
         C.2. Stream 1 — Rendering Commands
         C.3. Stream 2 — Input Events
         C.4. Stream 3 — Video Plane
         C.5. Stream 4 — Audio Plane
         C.6. Additional Streams

      Appendix D. Protocol State Machine Diagrams
         D.1. Initial Connection State Machine
         D.2. Session Resume State Machine
         D.3. Error Handling State Machine
         D.4. Stream Interaction Summary

      13. Copyright

1. Introduction

   The Mercurius Window System (MWS), named for Mercurius, the Roman
   messenger god of swift communication, is a secure window system
   for both local and remote use. A user may work directly at the
   console of a workstation as on a conventional Unix‑like desktop,
   with full access to its GPU, input devices, audio devices, and
   local display. The same session may also be accessed from
   lightweight, mobile, or untrusted client devices elsewhere,
   without replicating the workstation’s software environment or
   exposing its data or GPU resources. Compute, storage, rendering,
   and audio processing remain on the workstation; clients act solely
   as authenticated display and input/audio endpoints.

   MWS is intended to let a workstation remain itself while being
   reached from elsewhere. The workstation is treated as a long‑lived
   environment that accumulates tools, history, and identity; remote
   devices are simply places from which the user inhabits that
   environment. A user may begin work at a powerful machine in the
   office and later continue the same session from a laptop, thin
   client, or secondary desktop in another location, without
   maintaining multiple environments or synchronising state. Remote
   access is an extension of the local workstation rather than a
   separate mode of operation.

   MWS is not a local display protocol, nor a pixel‑streaming
   system. It defines a structured, message‑oriented protocol for
   presence, input, audio, and rendering state on a workstation.
   Rendering is server‑resident and GPU‑accelerated; audio capture
   and playback are explicitly negotiated streams; and the protocol
   transmits structured commands and media units rather than raw
   framebuffers. The transport is agnostic but optimised for SCTP’s
   multi‑stream, message‑oriented semantics [RFC9260], with separate
   streams for control, rendering commands, input, video, and audio.

   This architecture continues the lineage of early Unix window
   systems such as X11, which supported network‑transparent
   interaction with applications running on central servers, while
   applying modern zero‑trust security [NIST800‑207], authenticated
   multi‑stream transport [RFC9260][RFC4895], and GPU isolation.
   Earlier systems such as NeWS explored server‑side rendering but
   lacked the transport and security mechanisms required for
   contemporary workloads. Wayland, by contrast, is intentionally
   scoped to trusted local compositing and does not address remote
   GPUs, untrusted clients, or relocatable sessions. MWS occupies a
   distinct design space: secure, zero‑trust remote presence for
   server‑resident graphical environments, without compromising
   first‑class local console use.

1.1. Scope and Applicability

   This document specifies the Mercurius Window System (MWS)
   protocol, the transport‑level protocol used by MWS to establish,
   authenticate, and maintain a user’s graphical presence on a
   workstation. The MWS protocol defines message framing, capability
   negotiation, session attachment, and input/output semantics,
   including structured handling of video and multi‑channel audio,
   over a secure transport profile based on TLS 1.3 [RFC8446] and
   SCTP [RFC9260].

   MWS is intended for environments where:

   • applications execute on a central workstation or server
   • users may work locally at the console or remotely from
     other devices
   • clients may be untrusted, mobile, or ephemeral
   • users may relocate sessions across devices
   • GPU‑accelerated workloads must remain server‑resident
   • audio capture and playback must remain server‑resident or
     explicitly brokered
   • network transparency is a first‑class requirement
   • loss or theft of a device must not compromise workstation
     security.

   MWS does not replace local display protocols such as Wayland,
   nor does it extend them. It provides a complementary mechanism
   for secure local and remote presence in multi‑user and
   distributed environments where local display protocols do not
   apply.

1.2. Design Rationale

   Early Unix window systems, including X11, were explicitly
   designed for network transparency: applications executed on
   powerful central servers while users interacted from remote
   terminals. This model proved valuable in multi‑user and
   distributed environments, but X11’s permissive trust model,
   unrestricted client capabilities, and CPU‑bound rendering are
   incompatible with modern zero‑trust requirements and
   GPU‑accelerated workloads [NIST800‑207].

   Wayland addresses these issues by assuming a single trusted
   local compositor, a local GPU, and a single‑user environment.
   This model provides excellent performance on personal
   workstations but does not support remote GPUs, relocatable
   sessions, multi‑user deployments, or untrusted clients. These
   use cases lie outside Wayland’s design goals.

   MWS intentionally revives and modernises the network‑
   transparent workstation model. It retains the architectural
   advantages of centralised execution and remote interaction while
   adopting a zero‑trust security model based on mutual TLS
   [RFC8446], authenticated SCTP [RFC9260][RFC4895] streams, and
   per‑client GPU isolation. Rendering is server‑resident and
   GPU‑accelerated; clients transmit structured rendering commands
   rather than framebuffers, and audio is carried as explicit
   timestamped streams rather than device‑local side effects. All
   compositor policy, input routing, and window management occur on
   the server, ensuring multi‑user correctness and preventing
   privilege escalation.

   Crucially, possession of a client device is never sufficient
   to access the workstation. Client certificates must be protected
   by the platform, session resume requires fresh authentication,
   and no sensitive data or credentials are stored on the client.
   This ensures that a lost or stolen laptop cannot be used to
   compromise the workstation.

   The result is a window system that provides deterministic
   semantics, strong isolation, and relocatable sessions, enabling
   users to inhabit remote workstations with the performance and
   responsiveness of a local environment, whilst preserving
   first‑class local console operation.

1.3. Cloud and Distributed Computing Context

   Many organisations operate private cloud or workstation‑cluster
   environments where users access centralised compute and GPU
   resources from thin clients or mobile devices. Public cloud
   deployments exhibit similar characteristics: applications execute
   on remote servers while clients roam across untrusted networks.

   MWS aligns with this model by centralising execution and
   distributing only the user interface. This avoids the
   inefficiencies of distributed compute systems whilst preserving
   the benefits of remote access, session mobility, and strong
   isolation between users. Because clients are untrusted, MWS
   ensures that compromise or loss of a client device does not grant
   access to the workstation.

1.4. High‑Performance Rendering and Gaming

   MWS is primarily intended for workstation and private‑cloud
   deployments in which terminals connect over well‑provisioned LANs
   and VPNs, typically with DANE [RFC6698][RFC7671] securing mutually
   authenticated transport over TLS 1.3 [RFC8446] carried over SCTP
   [RFC3436]. In these environments, modern GPUs provide
   hardware‑accelerated AV1 encoding with extremely low latency,
   enabling high‑resolution and high‑refresh‑rate streaming for
   everyday workstation workloads. All workstation rendering in MWS
   is performed using the Vulkan [VK14] API.

   Nevertheless, the same architecture could accommodate high‑
   performance remote rendering workloads, including interactive 3D
   applications and games. Support for such workloads is a stretch
   goal rather than a primary target, but these use cases inform the
   design of the transport, security, and rendering model to ensure
   that MWS remains viable for demanding graphical applications.

1.5. Terminal Requirements and Wireless Considerations

   MWS clients are treated as untrusted endpoints. Practical
   deployments assume a minimum level of capability. A typical
   terminal is expected to provide a modern CPU, a hardware‑
   accelerated GPU capable of AV1 decoding, and at least gigabit‑
   class network connectivity. Higher resolutions or refresh rates
   benefit from greater bandwidth, but MWS remains usable at reduced
   quality on lower‑capacity links.

   Emerging wireless standards such as Wi‑Fi 8 (IEEE 802.11bn) are
   expected to deliver multi‑gigabit throughput and sub‑millisecond
   air‑interface latency, enabling MWS terminals to operate over
   wireless links without compromising interactive performance.

1.6. Session Mobility and Detachable Operation

   Because sessions in MWS are server‑resident and independent of
   client connections, the system naturally supports detachable
   operation. A user may disconnect from one terminal and later
   resume the same session from another device, with all windows,
   GPU state, and compositor context preserved.

   This model is conceptually similar to terminal multiplexers such
   as screen or tmux, but applied to a full GPU‑accelerated
   graphical environment. Session mobility is a core design goal of
   MWS and informs its authentication, transport, and rendering
   architecture.

   Deployments are expected to configure a reconnection grace period
   so that brief network outages or short unscheduled breaks such as
   to move or charge a client device do not cause the user’s session
   to be lost.

2. Conventions Used in This Document

   The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in
   this document are to be interpreted as described in RFC 2119 and
   RFC 8174 when, and only when, they appear in all capitals.

   Terminology relating to Vulkan follows the definitions and naming
   conventions of the Vulkan 1.4 specification [VK14].

   Terminology relating to SCTP follows the Stream Control Transmission
   Protocol specification [RFC9260]. Terminology relating to TLS 1.3
   follows the Transport Layer Security specification [RFC8446].

   Unless otherwise stated:

   • “workstation” refers to the system on which applications execute,
     and where all rendering, compositing, and session management occur.

   • “server” refers to the MWS server daemon (mwsd) running on the
     workstation. In this document, “workstation” and “server” refer
     to the same system at different levels of abstraction.

   • “client device” refers to the physical device used by a user to
     access the workstation. Examples include laptops, tablets, phones,
     thin clients, and embedded devices.

   • “client” refers to the MWS client component (mwsc) running on the
     client device. The client is the protocol endpoint that communicates
     with the server.

   • “terminal” refers to the logical role a client assumes once
     connected: a seat‑providing endpoint that delivers input and
     receives rendered output.

   • “session” refers to a persistent graphical environment maintained
     on the workstation independently of client connections.

   • “seat” refers to a set of input devices and output mappings
     associated with a session and bound to a terminal.

   • “surface” refers to a drawable region managed by the compositor
     and rendered by the workstation’s GPU.

   • “command stream” refers to the structured, Mercurius protocol
     messages exchanged on SCTP stream 0 (control stream).

   • “video surface” refers to a high‑motion region encoded using a
     hardware‑accelerated codec such as AV1.

   • The names Alice, Bob, Eve, and Mallory are used in their standard
     roles from security literature. Alice and Bob denote honest users,
     Eve denotes a passive eavesdropper, and Mallory denotes an active
     attacker. These names are used solely for threat‑model examples and
     do not correspond to real users or implementation artefacts.

   All multi‑byte integers are transmitted in network byte order
   unless explicitly specified otherwise.

3. System Architecture

   This section provides a high‑level overview of the Mercurius
   Window System (MWS). It describes the conceptual model, major
   components, and architectural principles that inform the detailed
   design in Section 4 and the protocol specification in Section 5.

   MWS is designed around a workstation‑centric model in which all
   rendering, compositing, audio processing, session management, and
   window‑management policy reside on a central server. Client devices
   act solely as authenticated display, audio, and input endpoints.
   This model preserves the semantics of a local workstation while
   enabling secure remote presence across modern networks.

   MWS assumes client devices with at least gigabit‑class
   connectivity, including modern Wi‑Fi networks that routinely
   exceed 1 Gb/s. The protocol is optimised for 10 GbE LANs, where
   uncompressed or lightly compressed surfaces, high‑motion content,
   and low‑latency audio can be delivered with minimal delay. Devices
   with substantially lower bandwidth may operate at reduced quality
   but are not a primary design target.

   MWS also provides a real‑time audio transport suitable for
   workstation‑class media workloads. The protocol supports
   full‑duplex, timestamped PCM audio with deterministic latency,
   enabling remote use of digital audio workstations, conferencing
   applications, and high‑fidelity monitoring. Audio transport is
   integrated into the same multi‑stream SCTP association as
   rendering, input, and video, but operates on dedicated streams to
   ensure isolation from congestion and head‑of‑line blocking.

3.1. Architectural Principles

   The design of MWS is guided by the following principles:

      • Applications execute on a central workstation or server.
      • Local and remote interaction share identical session semantics.
      • Client devices may be untrusted, mobile, or ephemeral.
      • Users may relocate sessions across devices without restarting
        applications.
      • GPU‑accelerated workloads remain server‑resident.
      • Real‑time audio is a first‑class subsystem with strict latency
        and ordering requirements.
      • Network transparency is a first‑class requirement.
      • Loss or theft of a client device must not compromise
        workstation security.

   These principles reflect the goal of treating the workstation as a
   long‑lived environment with continuity of storage, configuration,
   and identity.

   Alternative transports such as QUIC were considered. However,
   SCTP’s native multi‑streaming, message‑oriented delivery, and
   support for partial reliability align directly with the
   requirements of MWS. QUIC’s multiplexed byte‑stream model,
   together with the absence of partially reliable streams, would
   require additional framing and scheduling logic to emulate SCTP
   semantics. For these reasons, SCTP is the primary transport for
   MWS.

   The protocol’s guarantees depend on transport properties that
   SCTP provides natively, including independent ordered streams,
   preservation of message boundaries, optional partial reliability,
   avoidance of cross‑stream head‑of‑line blocking, and stable
   associations. These properties are required to ensure
   deterministic compositor behaviour, responsive input under load,
   support for high‑motion video surfaces, low‑latency audio
   transport, relocatable sessions, and multi‑seat concurrency.

   TCP does not provide these properties without substantial
   additional protocol machinery. A TCP‑based transport would
   therefore be unable to meet the latency, isolation, and
   concurrency requirements of MWS as defined in this document, and
   is out of scope for this specification.

3.2. Major Components

   MWS consists of the following major components, corresponding to
   the object model defined in Appendix A:

      • The compositor, which manages windows, surfaces, focus,
        input routing, and presentation.

      • The renderer, which executes GPU‑accelerated drawing and
        performs surface composition on the workstation.

      • The audio subsystem, which manages audio device and audio
        stream objects, and provides full‑duplex, timestamped PCM
        transport for playback and capture.

      • The session manager, which maintains Session objects
        independently of client connections and supports detachable
        operation.

      • The transport layer, which provides a secure, multi‑stream,
        message‑oriented channel between workstation and client.

      • The input subsystem, which manages Seat and InputDevice
        objects and forwards input events to the compositor.

      • The client device, which decodes surfaces, presents them to
        the user, plays audio, and forwards input events to the
        workstation.

   These components interact to provide a deterministic, structured,
   zero‑trust window system suitable for both local console use and
   remote presence.

3.3. Workstation‑Centric Model

   The workstation is the authoritative environment. It owns all GPU
   resources, audio devices, compositor state, and input‑routing
   policy. Applications run exclusively on the workstation, and all
   rendering and audio processing are performed on workstation‑resident
   hardware.

   Client devices own their physical input hardware. Input events are
   generated on the client and forwarded to the workstation, which
   applies focus, routing, and seat semantics. The workstation never
   interacts with the client’s physical devices directly; it operates
   only on the logical input events they produce.

   GPU resources are exposed to MWS exclusively through the Vulkan
   API. The workstation enumerates all available GPUs using
   vkEnumeratePhysicalDevices() and creates one compositor instance
   per physical device. All rendering, composition, and presentation
   operations are defined in terms of Vulkan objects and capabilities.

   Audio resources are exposed to MWS through audio devices, each of
   which represents a physical or virtual playback or capture endpoint.
   Audio streams are created dynamically to transport PCM audio between
   the workstation and client devices.

   A user may interact with the workstation in two ways:

      • Local console mode, using the workstation’s own keyboard,
        pointer, display, and audio hardware.

      • Remote presence mode, using an authenticated client device
        elsewhere on the network.

   Local and remote interaction share the same compositor, window
   tree, audio devices, and session state. Remote presence is an
   extension of the local workstation, not a separate mode of
   operation. Sessions persist across transient disconnections, but
   long‑term persistence requires explicit detachment.

3.4. Rendering and Surface Model

   MWS supports two classes of graphical output:

      • structured rendering commands, representing deterministic
        drawing operations for low‑motion or vector‑oriented surfaces

      • video surfaces, representing high‑motion content encoded
        using hardware‑accelerated codecs such as AV1

   The compositor selects the appropriate representation based on
   surface characteristics and available bandwidth. This allows MWS
   to operate efficiently on both 1 GbE and 10 GbE networks, as well
   as on modern Wi‑Fi links.

   Surface objects represent drawable regions within the compositor.
   Window objects reference one or more surfaces, and the compositor
   determines whether a surface is transmitted as structured commands
   or as a video stream. The renderer produces Vulkan command streams
   for structured surfaces, while video surfaces are encoded using
   hardware acceleration when available.

3.5. Session Model

   Sessions are server‑resident and persist independently of client
   connections, but long‑term persistence requires explicit
   detachment. MWS tolerates transient network interruptions; if a
   client reconnects within the configured grace period, the session
   continues without interruption. If a client disappears without
   detaching, the session is preserved only for the duration of this
   grace period. Once the period expires, the session is closed, and
   applications terminate in the same manner as a workstation
   session without an active seat. Because all application, window,
   and session state resides on the workstation, failure, loss, or
   destruction of a client device does not by itself risk loss of
   in‑progress work; the session remains intact on the server and
   may be resumed from another device, subject only to the
   reconnection grace period.

   A user may explicitly detach a session to preserve it beyond the
   reconnection grace period and may later resume it from any
   authorised device. A single transport association may provide
   multiple seats for a session when permitted by policy. The
   precise rules governing how many transport associations may
   attach to a session, and under what conditions, are defined in
   Section 4.7.

   Sessions contain seats, each of which aggregates input devices,
   audio streams, and presentation state for a particular user
   interaction context. Multiple seats may be active concurrently,
   enabling multi‑user or multi‑terminal operation.

   This model enables mobility across devices while preserving the
   semantics of a traditional workstation and avoiding long‑lived
   orphaned sessions.

3.6. Zero‑Trust Client Model

   All client devices are treated as untrusted endpoints, even on
   local LANs. Trust is established exclusively through
   cryptographic identity and explicit authorisation rather than
   network location. A client device is not an identity and is not
   authorised to access a session by virtue of its presence on the
   network; only the user is authorised.

   Historically, X11 enabled a powerful and flexible model in which
   users could inhabit remote workstations as naturally as local
   ones. Even on dedicated X terminals, users authenticated as
   themselves and could not access another user’s data. The flaw was
   not the login model but the assumption that any client on the
   network was inherently trustworthy. Wayland addressed this by
   eliminating remote clients entirely. Mercurius instead removes
   the assumption of trust: client devices may be anywhere, but
   trust is derived solely from what the user can cryptographically
   prove.

   Device identity is established during the TLS 1.3 handshake
   carried over SCTP. Deployments SHOULD use DNS‑Based Authentication
   of Named Entities (DANE) [RFC6698][RFC7671] to bind the server’s
   certificate to DNSSEC‑protected TLSA records, allowing clients to
   verify that they are communicating with the correct workstation
   without relying on public certificate authorities or assumptions
   about local network topology. Typically only the server requires
   a TLSA record; client devices may obtain certificates through
   local provisioning mechanisms. DANE prevents an attacker on the
   same network from impersonating a Mercurius server, without
   depending on public CA infrastructure.

   User authentication is performed at the application layer
   using the mechanism‑agnostic model defined in Section 5.2.1.
   The server advertises supported mechanisms (for example, “PAM”,
   “FIDO2”), and the client selects one. This allows deployments
   to integrate password‑based, hardware‑token, federated, or
   certificate‑based user authentication without modifying the
   protocol.

   A client device is assumed to be mobile and at risk of loss or
   theft. To limit the impact of device compromise, a client stores
   no confidential data, long‑term session state, or reusable
   credentials. Because all application and session state resides
   exclusively on the workstation, loss or compromise of a client
   device does not expose user data or permit modification of
   workstation‑resident files. Authentication requires both
   possession of the device’s private key (validated via mutual TLS
   and, when deployed, DANE) and successful user authentication via
   one of the advertised mechanisms. Mallory stealing Bob’s laptop
   gives him no more access to the workstation than if he had
   purchased a brand‑new laptop; the device alone is insufficient to
   access or resume a session.

   In contrast to traditional window systems such as X11, client
   devices in MWS are not part of the trusted computing base; they
   are merely display, audio, and input conduits for a
   workstation‑resident session.

3.7. Network Considerations

   MWS is designed to operate over untrusted IP networks, including
   public networks and variable‑quality wireless links. The protocol
   does not assume that terminals are located on the same LAN as the
   workstation, nor that any network segment provides meaningful
   security. A terminal on a local LAN and a terminal on a remote
   network are treated identically by the workstation.

   MWS is designed for modern networks:

      • 10 GbE provides optimal performance and headroom for multiple
        high‑resolution seats on a single workstation.

      • Wi‑Fi 6/6E/7 provides multi‑gigabit throughput with variable
        jitter and is fully supported for single‑seat clients.

      • 1 GbE provides a usable baseline for typical desktop workloads
        and a small number of seats on a workstation.

      • Sub‑gigabit links are outside the primary design envelope and
        are not expected to provide an acceptable experience for
        high‑resolution, high‑refresh workloads or low‑latency audio.

   The transport layer adapts to available bandwidth through dynamic
   surface encoding, selective use of video surfaces, adaptive
   refresh rates, and prioritised input, audio, and control streams.

   The detailed architecture is specified in Section 4, and the wire
   protocol is defined in Section 5.

3.8. Transport Requirements

   MWS requires a transport that provides structured,
   message‑oriented delivery with support for multiple independently
   ordered channels. The transport MUST preserve message boundaries,
   MUST support concurrent streams with independent ordering, and
   SHOULD provide mechanisms for partial reliability to avoid
   retransmission of stale high‑volume data such as video surfaces
   and real-time audio frames.

   The transport MUST avoid cross‑stream head‑of‑line blocking.
   Input events, control messages, rendering commands, audio streams,
   and video surfaces are logically independent flows, and the
   correctness of compositor behaviour depends on their timely and
   ordered delivery within their respective channels. A transport
   that enforces global ordering across all data would introduce
   latency coupling between these flows and would not meet the
   responsiveness requirements of MWS.

   The transport MUST support stable associations that survive
   transient network changes, including client mobility across
   networks. Session attachment, reconnection semantics, and
   multi‑seat operation rely on the ability to maintain a consistent
   transport‑level association identity.

   SCTP satisfies these requirements through its native
   multi‑streaming model, message‑oriented delivery, optional
   partial reliability, and support for multi‑homing. These
   properties align directly with the architectural principles
   defined in Section 3.1 and are required for deterministic
   compositor behaviour, responsive input under load, support for
   high‑motion video surfaces, low‑latency audio transport,
   relocatable sessions, and multi‑seat concurrency.

   TCP does not provide these properties without substantial
   additional protocol machinery. TCP offers only a single in‑order
   byte stream, lacks message boundaries, enforces global
   head‑of‑line blocking, and provides no support for partial
   reliability or multi‑streaming. A TCP‑based transport would
   therefore be unable to meet the latency, isolation, and
   concurrency requirements of MWS as defined in this document, and
   is out of scope for this specification.

4. Detailed Architecture

   The Mercurius Window System (MWS) is structured around a central
   server (mwsd) that owns all GPU resources, audio devices, input
   routing, and compositor state, and a set of untrusted client
   devices that connect over a secure, message‑oriented transport.
   Once connected, a client device acts as a terminal providing a
   seat. This section describes the architectural model of sessions,
   seats, windows, rendering, and compositor behaviour. The wire
   protocol and message formats are defined in Section 5.

4.1. Sessions

   A session represents the complete graphical environment
   associated with a single authenticated user, including windows,
   workspaces, GPU resources, audio devices, and compositor state.
   Sessions are server‑resident and MAY persist independently of
   client connections when explicitly detached.

   User identity is established during the secure transport
   handshake. The authenticated identity (for example, the client
   certificate subject or a federated identity token) is mapped to a
   local user account via the system’s authentication framework
   (such as PAM). All windows, seats, and compositor state created
   over that association belong to the resulting session.

4.2. Seats

   A seat represents a set of input devices, audio streams, and an
   output binding for a session. A session MAY have multiple seats
   simultaneously. Each seat corresponds to a particular terminal,
   whether that terminal is the local console or a remote client
   device acting in the terminal role.

   Input events are tagged with a seat_id, and the compositor routes
   them according to seat‑specific focus and pointer state. Output
   mappings (for example, which windows appear on which displays)
   and audio routing may also be seat‑specific.

4.3. Windows

   Windows are server‑managed objects representing top‑level
   application surfaces. Each window belongs to exactly one session
   and is associated with one or more rendering surfaces (structured
   swapchains or video surfaces) depending on compositor policy.

   Window identifiers are scoped to a session. A terminal MUST NOT
   reference or interact with windows belonging to any other
   session. The server MUST enforce this isolation and MUST reject
   or ignore any protocol message that attempts to target a window
   outside the authenticated session.

4.4. Compositor Model

   The compositor maintains the global window tree, stacking order,
   focus, workspaces, and output mappings for each session. It is
   responsible for:

      • applying window‑management policy
      • routing input events based on seat and focus
      • managing swapchains and presentation timing
      • selecting between structured rendering and video fallback
      • revoking or reconfiguring windows according to policy

   The compositor SHOULD expose a user‑visible mechanism to
   forcibly terminate an unresponsive window. This mechanism is
   implementation defined (for example, a “kill window” gesture
   similar to Ctrl‑Alt‑Esc in KDE).

   The compositor MAY revoke swapchains, reconfigure windows, or
   migrate them between outputs according to local policy, resource
   constraints, or security requirements. When a swapchain is
   revoked, the server notifies the terminal and MAY substitute a
   placeholder or video surface.

   Each compositor instance is bound to a single GPU or output
   pipeline. It owns the device‑level rendering resources for that
   GPU (device context, queues, swapchains, and associated GPU
   buffers), and no other component may submit rendering work
   directly to that device.

4.5. Rendering Model

   Rendering in MWS is server‑side. Applications submit rendering
   commands to the server, which validates and executes them on the
   GPU. Client devices do not access GPU resources directly.

   The compositor selects the appropriate representation for each
   surface:

      • structured rendering for low‑motion or interactive content
      • video surfaces for high‑motion or bandwidth‑sensitive
        content

   Because rendering is server‑resident, a stalled or misbehaving
   terminal cannot block the compositor. The server MAY revoke a
   window’s rendering resources, substitute a placeholder surface,
   or terminate the client if rendering deadlines are repeatedly
   missed.

   The MWS specification assumes a modern explicit GPU API for
   rendering and composition (for example, Vulkan [VK14]) and
   requires that all rendering and presentation operations be
   performed through the compositor’s device‑level abstraction.
   Client devices are not required to implement any graphics API.

4.6. Audio Model

   The audio subsystem manages audio devices and audio streams and
   provides full‑duplex, timestamped PCM transport between the
   workstation and client devices. Audio is treated as a first‑class
   subsystem with strict latency and ordering requirements; audio
   traffic is logically independent of rendering and control traffic
   but shares the same secure, multi‑stream transport.

   An audio device represents a physical or virtual playback or
   capture endpoint on the workstation, such as speakers, headphones,
   microphones, instrument inputs, multichannel mixers, loopback
   devices, and virtual sinks. Devices are enumerated and managed on
   the server; client devices do not own or configure audio hardware
   directly.

   Logical audio streams are created dynamically to carry PCM samples
   between the workstation and the client. Each audio stream is
   bound to a specific audio device and seat, and is direction‑
   specific (playback or capture). Streams are timestamped at the
   server, and the client maintains playout buffers that honour these
   timestamps while minimising latency and jitter.

   Playback streams carry audio from applications on the workstation
   to the client device for presentation. Capture streams carry audio
   from client‑attached input devices to the workstation, where it
   is injected into the appropriate session and applications
   according to policy. The server MAY apply policy to limit or
   redirect capture streams (for example, to prevent inadvertent
   capture in shared environments, or to restrict which multichannel
   devices are exposed to a given session).

   In the base profile, audio messages use a dedicated SCTP stream
   (stream 4), separate from control, rendering, input, and the
   Video Plane, to avoid head‑of‑line blocking and to ensure
   predictable latency. Implementations MAY allocate additional SCTP
   streams for audio as an optimisation, but MUST NOT multiplex audio
   opcodes with control, input, or video opcodes on the same SCTP
   stream. The wire‑level definition of the Audio Plane, including
   opcodes and negotiation of playback and capture streams, is
   specified in Section 5.6.

4.7. Stream Allocation

   MWS uses SCTP multi‑streaming to separate control traffic from
   rendering traffic and to prevent head‑of‑line blocking between
   independent windows. Stream allocation is defined as follows:

   • Stream 0 is reserved for control messages and MUST NOT carry
     rendering data.

   • Each window is associated with exactly one rendering stream.
     All structured rendering commands and video‑surface updates
     for that window are sent on its assigned stream.

   • Multi‑monitor configurations do not affect stream allocation.
     A window that spans multiple outputs continues to use a single
     rendering stream.

   • The compositor MAY allocate additional streams for specialised
     rendering contexts (for example, off‑screen surfaces or
     auxiliary swapchains), but these MUST be explicitly negotiated
     during window creation and are scoped to the window that
     requested them.

   • Streams are not reused across windows unless the compositor has
     explicitly revoked the prior window and returned the stream to
     the allocator.

   This model ensures that rendering for one window cannot block
   or delay rendering for another, while avoiding unnecessary
   proliferation of streams in multi‑monitor environments.

4.7.1. Session Identity and Message Routing

   Each SCTP association corresponds to exactly one client session.
   The server MUST treat the SCTP association identifier (assoc_id)
   as the authoritative session identity. No client‑supplied field
   may select, reference, or influence the session to which a
   message is delivered.

   A session is created only after successful completion of the
   handshake defined in Section 5. Until the handshake completes,
   the server MUST ignore all messages received on streams other
   than 0.

   For any message received on a non‑zero stream, the server MUST:

   • identify the session associated with the SCTP association

   • verify that the session is active

   • dispatch the message to the subsystem corresponding to the
     stream

   • reject or ignore the message if it is malformed or references
     resources outside the session

   Messages referencing windows, seats, or other resources not owned
   by the session MUST be rejected with MWS_ERROR(type=700,
   fatal=0).

   Messages referencing a session other than the one implied by the
   SCTP association MUST be ignored.

   If a message is received for an association that has no active
   handshake or no active session, or whose session has been closed,
   the server MUST silently discard the message.

4.8. Session Lifecycle

   A session is a long‑lived server‑side construct that persists
   independently of any particular network connection. A session
   becomes ACTIVE when a client completes the handshake defined in
   Section 5 and remains ACTIVE until it is explicitly terminated
   or reclaimed by policy.

   A new client device connection MUST create a new session in the
   ACTIVE state unless the user explicitly requests to resume an
   existing session. The server MUST NOT automatically reattach a
   client device to a prior session solely on the basis of matching
   user identity.

   A client MAY explicitly detach from an ACTIVE session. Detach
   transitions the session from ACTIVE to DETACHED. In the DETACHED
   state, the session continues to run without any attached
   transport association; its windows, compositor state, audio
   streams, and GPU resources remain server‑resident. DETACHED
   sessions MAY be resumed by any authenticated client device
   belonging to the same user, subject to server policy.

   Loss of the SCTP association (for example, network failure,
   timeout, client crash) while a session is ACTIVE does not
   immediately terminate the session. Instead, the server MUST
   transition the session to a GRACE state and start a reconnection
   grace timer. In the GRACE state, the session remains active but
   has no attached client. If a client reconnects and successfully
   resumes the session before the grace timer expires, the server
   MUST transition the session back to ACTIVE and the session
   continues without loss of state.

   If the reconnection grace period expires without a successful
   resume, the server MUST treat the session as ABANDONED unless
   the user has explicitly detached it. ABANDONED sessions MUST be
   terminated and all associated resources reclaimed. Implementations
   MUST provide a configurable reconnection grace interval and SHOULD
   allow values sufficient to tolerate brief network outages on
   typical Wi‑Fi and WAN links. Servers SHOULD return a specific
   error status when a resume request targets an expired session.

   Long‑term persistence is an explicit, opt‑in behaviour: a session
   continues to exist beyond the reconnection grace period only if
   the user has explicitly detached it or otherwise marked it for
   later resumption. Implementations MUST NOT retain ABANDONED or
   stale sessions indefinitely. The server SHOULD reclaim resources
   associated with inactive sessions according to local policy
   (for example, idle timeout, logout event, or administrative
   limits).

   Only one transport association (client instance) MUST be
   attached to a given session at a time in the base protocol. That
   association MAY provide one or more seats for the session,
   subject to server policy. An implementation MAY provide a
   mechanism that allows additional associations to attach to the
   same session (for example, for technical support), but such
   behaviour is outside the scope of this specification and MUST NOT
   alter the semantics defined for the single‑association model
   above.

4.9. Session and Seat Model

   Sessions MAY persist independently of client connections. When a
   client device disconnects, the associated session and its windows
   MAY remain active in either the GRACE or DETACHED state. The
   compositor MAY blank or lock the session’s outputs according to
   local policy while no seat is attached.

   When a user resumes a session (from GRACE or DETACHED), the
   server:

      1. Binds a new seat_id to the resumed session.
      2. Sends the current window list, geometry, and focus state.
      3. Associates the new seat’s outputs and audio routing with
         the session.

   MWS supports both independent sessions and multi‑seat attachment
   within a single session. A user may maintain multiple concurrent
   sessions (for example, one on the local console and another
   accessed remotely), or may attach multiple seats to the same
   session via a single transport association, subject to the
   single‑association rule in Section 4.7.

   MWS also supports explicit session detachment. A user may detach
   a running session, leaving its windows, compositor state, audio
   streams, and GPU resources active on the server without any
   attached seats. The user may then initiate a new session on the
   same client device (for example, to perform unrelated work) and
   later resume the detached session exactly where it was left. This
   behaviour is directly analogous to detaching and reattaching a
   GNU Screen or tmux session, but applied to a full graphical
   desktop environment spanning one or more seats.

4.10. Local Transport Profile (Non‑Normative)

   Although MWS treats all client devices as untrusted endpoints and
   applies the same protocol semantics regardless of network
   location, implementations MAY apply transport‑layer optimisations
   when the client device and server reside on the same physical
   host. These optimisations MUST NOT alter protocol semantics,
   message ordering, authentication requirements, or session
   isolation, and MUST remain transparent to the terminal.

   Permitted implementation‑level optimisations include:

      • loopback‑specific SCTP acceleration
      • reduced cryptographic overhead
      • shared‑memory fast paths
      • GPU‑direct resource sharing where supported

   These optimisations MUST NOT:

      • grant additional privileges to local client devices
      • bypass certificate validation or user authentication
      • modify the behaviour of control, input, audio, or rendering
        streams
      • introduce protocol features unavailable to remote client
        devices

   MWS remains a transport‑agnostic, network‑transparent window
   system. Local optimisations exist solely to ensure that client
   devices running on the same host as the server achieve
   performance comparable to traditional local‑only systems without
   compromising the zero‑trust security model.

4.11. Security Model

   All client devices are treated as untrusted. Trust is established
   exclusively through cryptographic identity and explicit
   authorisation rather than network location. The server enforces
   strict isolation between users, sessions, seats, and windows. In
   particular:

      • each session is bound to a single SCTP association, and the
        association identifier serves as the authoritative session
        identity

      • window identifiers are scoped to a session and cannot be
        referenced by other sessions

      • input events are scoped to a seat and session, and cannot
        target windows outside that session

      • clients cannot observe, enumerate, or reference resources
        belonging to other sessions

      • all client‑originated messages are validated before being
        processed

   Transport security is provided by mutually authenticated TLS 1.3
   carried over SCTP. Deployments SHOULD use DNS‑Based Authentication
   of Named Entities (DANE) to bind the server’s certificate to
   DNSSEC‑protected TLSA records, allowing clients to verify that they
   are communicating with the correct workstation even in the presence
   of compromised or mis‑issued CA certificates. Device authentication
   alone does not grant access to a user session.

   MWS does not mandate a specific encryption mechanism beyond
   requiring confidentiality, integrity, and mutual authentication of
   endpoints. Deployments SHOULD use a transport that provides forward
   secrecy, such as TLS 1.3, WireGuard, or IPsec with PFS‑enabled
   cipher suites. The choice of transport‑layer security does not
   affect protocol semantics, and MWS remains agnostic to whether
   encryption is provided directly by TLS or by an external secure
   tunnel.

   User authentication is performed at the application layer using
   the mechanism‑agnostic model defined in Section 5.2.1. The server
   advertises supported mechanisms (for example, “PAM” and “FIDO2”),
   and the client selects one. This separation of device and user
   identity ensures that compromise of a device does not grant access
   to a user’s session without the corresponding user credential.

   The server validates all client‑originated messages, including
   window creation, input events, and rendering commands. A client
   may not reference windows, sessions, or resources outside its
   authenticated session. Attempts to do so are rejected with
   MWS_ERROR(type=700, fatal=0). Malformed or semantically invalid
   messages are ignored, and the session continues unless the error
   is marked fatal.

   The client is not part of the trusted computing base. It stores
   no confidential data, long‑term session state, or reusable
   credentials. If a client disconnects unexpectedly, the session
   persists only for the duration of the reconnection grace period
   unless the user has explicitly detached. After this period, the
   session is closed and applications terminate.

   Loss of the transport association for any reason (network
   failure, timeout, endpoint crash) is treated as a fatal transport
   error. The session MAY persist according to the rules in Section
   4.7 and MAY be resumed from another client device subject to
   policy.

5. Protocol Specification

5.1. Message Framing

   All MWS messages consist of a fixed-size header followed by an
   optional payload. Messages are carried within a single SCTP user
   message and MUST NOT be fragmented across multiple SCTP user
   messages.

   The header format is:

      struct MwsHeader {
         uint32_t magic;      // MWS_MAGIC_VALUE
         uint16_t type;       // MWS_* opcode
         uint16_t reserved;   // MUST be zero
         uint32_t length;     // payload length in bytes
      };

   The payload immediately follows the header. Implementations MUST
   validate the magic value, type, and length before processing the
   payload. Messages with invalid headers MUST be rejected with
   MWS_ERROR_PROTOCOL (type=701).

   All multi-byte integer fields in MWS messages are encoded in
   network byte order (big-endian). Implementations MUST convert
   values to and from host byte order when constructing or parsing
   messages. Structures shown in this document illustrate field
   layout only and do not imply host endianness.

   Unless otherwise specified, text fields in MWS messages are
   encoded as length‑prefixed UTF‑8 strings (“Pascal strings”).
   A length‑prefixed string consists of an unsigned length field
   (for example, uint8 or uint16, encoded in network byte order)
   followed immediately by exactly that many bytes of UTF‑8 text.
   No NUL terminator is transmitted on the wire; the length is
   authoritative, and implementations MUST NOT assume any
   terminating byte beyond the declared length.

5.2. Control Messages (Stream 0)

   Control messages manage authentication, session establishment,
   window lifecycle, and compositor state. All control messages MUST
   be sent on SCTP stream 0.

   The control channel is strictly ordered and defines the protocol
   state machine for session creation, resumption, and teardown.
   Rendering, input, audio, and video streams operate independently
   and are not blocked by control-plane latency.

5.2.1. Initial Handshake (001–099)

   The initial handshake establishes protocol version, user identity,
   and session parameters. Mutually authenticated TLS 1.3 [RFC8446]
   over SCTP [RFC3436], optionally validated using DANE (Section 9.1),
   authenticates the client device at the transport layer. The
   application-layer handshake authenticates the user and establishes
   a session.

   User authentication is mechanism-agnostic. The server advertises
   one or more supported authentication mechanisms, and the client
   selects one. This allows deployments to integrate PAM, WebAuthn,
   FIDO2, Kerberos, OAuth2, or future mechanisms without modifying
   the protocol.

   The handshake proceeds as follows on SCTP stream 0:

      1.  MWS_QUERY (type=001) — Client → Server
          Initiates protocol negotiation and requests session
          parameters.

      2.  MWS_AUTH_CHALLENGE (type=002) — Server → Client
          Advertises the available authentication mechanisms. The
          payload contains a list of mechanism identifiers.

          Payload format:

             uint8_t mechanism_count;

             repeated mechanism_count times:
                 uint8_t name_len;
                 char    name[name_len];

          Mechanism names are UTF‑8 strings and are not NUL‑terminated.
          name_len specifies the length in bytes and MUST be greater
          than zero. mechanism_count MAY be zero, in which case the
          client MUST abort the handshake.

      3.  MWS_AUTH_RESPONSE (type=003) — Client → Server
          Selects an authentication mechanism and provides
          mechanism-specific credentials.

          Payload format:

             uint8_t  mech_name_len;
             char     mechanism[mech_name_len];
             uint16_t credential_len;
             uint8_t  credential[credential_len];

          mechanism MUST exactly match one of the names advertised in
          MWS_AUTH_CHALLENGE. If the mechanism is unknown or the
          payload length is inconsistent, the server MUST respond with
          MWS_ERROR_PROTOCOL (type=701, fatal=1).

      4.  MWS_SURFACE_CAPS (type=004) — Server → Client
          Returns surface and WSI capability information, including
          supported Vulkan [VK14] extensions, swapchain formats,
          presentation modes, and structured-surface features. The
          payload format is defined in Section 5.3.

      5.  MWS_AUDIO_CAPS (type=005) — Server → Client
          Returns audio capability information for playback and
          capture. The payload describes the audio formats supported
          by the workstation, including:

             • supported sample rates (e.g., 44100, 48000, 96000)
             • supported sample formats (e.g., S16, S24, F32)
             • supported channel counts (e.g., mono, stereo)
             • whether audio capture is available
             • minimum and maximum buffer sizes or latency classes

          The payload format is extensible. Unknown fields MUST be
          ignored by the client. A server without audio devices MUST
          send MWS_AUDIO_CAPS with an empty capability set.

          The client MUST select a configuration compatible with the
          advertised capabilities before initiating audio playback or
          capture. Audio stream parameters MUST be re‑established
          during session resume (Section 5.7.2).

      6.  MWS_SESSION_INFO (type=006) — Server → Client
          Returns session parameters, including:

             • session identifier
             • initial compositor state
             • seat and input configuration
             • resume token (if applicable)

   After successful completion of this sequence, the client is fully
   authenticated and may create windows or resume an existing session
   on the appropriate rendering, input, audio, and video streams.

5.2.1.1. Session Identifier Semantics

   The session identifier returned in MWS_SESSION_INFO is assigned
   solely by the server. Clients MUST treat this value as opaque and
   MUST NOT attempt to select, predict, or construct session
   identifiers. All client-originated messages that include a
   session_id field are advisory; the server MUST validate the
   session_id against the session associated with the SCTP
   association on which the message was received.

   A client MUST NOT assume that a session identifier remains valid
   across reconnects unless the server has explicitly offered the
   session for resumption. Session identifiers from terminated or
   reclaimed sessions MUST NOT be reused by the server.

   Resume tokens included in MWS_SESSION_INFO are hints that allow
   clients to correlate local state with resumable sessions. Resume
   tokens are not client-authoritative and MUST be validated by the
   server during MWS_SESSION_RESUME_REQUEST processing.

5.2.2. Session Management (100–199)

   Session resume allows a client to reattach to an existing session
   previously detached by the user or preserved during the
   reconnection grace period.

      MWS_SESSION_RESUME_OFFER    (type=100) — Server → Client
         Indicates that a resumable session exists for the
         authenticated user.

      MWS_SESSION_RESUME_REQUEST  (type=101) — Client → Server
         Requests resumption of the indicated session.

      MWS_SESSION_RESUME_COMPLETE (type=102) — Server → Client
         Confirms that the session has been resumed and provides
         updated compositor state.

   Server-side application launch allows a client to request that the
   compositor start a program under the authenticated session.

      MWS_EXEC_REQUEST            (type=110) — Client → Server
         Requests execution of a command under the current session.
         The payload carries a UTF‑8 command line and OPTIONAL
         execution parameters.

      MWS_EXEC_RESULT             (type=111) — Server → Client
         Reports the outcome of an EXEC_REQUEST. The payload carries
         a status code indicating success or failure and MAY include
         an exit status, process identifier, or diagnostic message.

5.2.2.1. Resume Semantics

   A session becomes resumable when the user has explicitly detached
   it or when the SCTP association has been lost and the session has
   entered the reconnection grace period defined in Section 4.7. The
   server MUST NOT offer resumption for sessions that have been
   terminated or reclaimed by policy.

   After successful user authentication, the server MUST send
   MWS_SESSION_RESUME_OFFER (type=100) if and only if one or more
   resumable sessions exist for the authenticated user. The offer
   includes a list of resumable session identifiers and MAY include
   metadata such as creation time, last activity time, or a summary
   of compositor state. If no resumable sessions exist, the server
   MUST NOT send a resume offer.

   To resume a session, the client sends MWS_SESSION_RESUME_REQUEST
   (type=101) specifying the session identifier. The server MUST
   validate that the requested session:

      • belongs to the authenticated user
      • is currently resumable
      • is not attached to another client device, unless local policy
        permits forced detachment

   If validation succeeds, the server MUST attach the client to the
   session, cancel any active reconnection grace timer, and send
   MWS_SESSION_RESUME_COMPLETE (type=102) containing the updated
   compositor state.

   MWS_SESSION_RESUME_COMPLETE MUST include a complete reconstruction
   of session state sufficient for the client to synchronise its
   local representation, including the window list, geometry,
   stacking order, focus state, and seat/output mappings.

   If validation fails, the server MUST reject the request with
   MWS_ERROR_SESSION (type=702, fatal=0) and MUST NOT reveal the
   existence or attributes of sessions belonging to other users.

   Resume tokens provided in MWS_SESSION_INFO are advisory hints that
   allow clients to identify resumable sessions across reconnects.
   Resume tokens are not client-authoritative; the server MUST
   validate all resume requests against its internal session table.

   Only one client device MAY be attached to a session at a time. If
   a second device attempts to resume an active session, the server
   MUST either reject the request or forcibly detach the existing
   client, according to local policy.

5.2.3. Window Lifecycle (200–299)

   Window creation, destruction, mapping, and configuration are
   managed through the following messages:

      MWS_CREATE_WINDOW      (type=200) — Client → Server
         Requests creation of a new top‑level window. The window
         is created within the session associated with the SCTP
         association on which the request was received.

      MWS_WINDOW_CREATED     (type=201) — Server → Client
         Confirms window creation and returns window_id and initial
         geometry.

      MWS_DESTROY_WINDOW     (type=202) — Client → Server
         Requests destruction of a window owned by the session.

      MWS_WINDOW_DESTROYED   (type=203) — Server → Client
         Confirms destruction of a window.

      MWS_MAP_WINDOW         (type=204) — Client → Server
         Requests that a window become visible.

      MWS_UNMAP_WINDOW       (type=205) — Client → Server
         Requests that a window become hidden.

      MWS_CONFIGURE_WINDOW   (type=206) — Server → Client
         Notifies the client of geometry or state changes.

      MWS_FOCUS_WINDOW       (type=207) — Server → Client
         Notifies the client that a window has gained or lost focus.

      MWS_SWAPCHAIN_REVOKED  (type=208) — Server → Client
         Indicates that a window’s swapchain has been revoked due
         to policy, timeout, or resource constraints.

5.2.3.1. Window Identifier Scope

   Window identifiers are scoped to the session that created them. A
   client MUST NOT reference, manipulate, or query windows belonging
   to any other session. The server MUST validate that all
   window-related messages refer to windows owned by the session
   associated with the SCTP association on which the message was
   received.

   If a client attempts to reference a window outside its session,
   the server MUST reject the message with MWS_ERROR_SESSION
   (type=702, fatal=0). The server MUST NOT reveal the existence,
   geometry, focus state, or any other attributes of windows
   belonging to other sessions.

   Window identifiers are never reused across sessions, and the
   server MUST ensure that identifiers from one session cannot
   collide with or be interpreted as identifiers from another.
   Implementations MAY use per-session identifier namespaces,
   randomised identifiers, or any other mechanism that guarantees
   isolation.

   These rules ensure that windows are private to the session that
   owns them and that clients cannot observe or interfere with the
   graphical state of other users.

5.3. Rendering Commands (300–399) — Stream 1

   Rendering commands are delivered on SCTP Stream 1. Commands are
   validated and executed by the server.

      MWS_VK_SUBMIT   (type=300) — Client → Server
         Submits a VkCommandBuffer for execution.

      MWS_VK_SYNC     (type=301) — Client → Server
         Requests synchronisation of GPU state.

      MWS_VK_DESTROY  (type=302) — Client → Server
         Requests destruction of Vulkan resources associated with a
         window or pipeline.

5.4. Input Events (400–499) — Stream 2

   Input events are delivered on SCTP Stream 2. All input events MUST be
   tagged with a seat_id and window_id.

      MWS_INPUT_EVENT  (type=400) — Client → Server
         Delivers an XI2‑compatible input event.

      MWS_INPUT_ACK    (type=401) — Server → Client
         Acknowledges receipt of an input event.

5.4.1. Pointer Motion Events

   Pointer motion is reported using both absolute and relative
   coordinates. All motion events, including mouse_move and
   mouse_drag, MUST include the following fields:

      • x, y: absolute pointer position in surface‑local pixel units
              (uint32)

      • delta_x, delta_y: relative motion since the previous pointer
              event, in signed pixel units (int16)

   Terminals MUST send both absolute and relative values. The
   compositor uses absolute coordinates for hit‑testing and focus
   routing, and MAY use relative deltas for high‑precision motion,
   gesture recognition, or sub‑pixel accumulation. A delta of zero
   indicates no relative motion.

   mouse_move
      Generated when the pointer moves with no buttons held.

   mouse_drag
      Generated when the pointer moves while one or more buttons are
      held. Encoded identically to mouse_move, with the addition of a
      bitmask of currently‑held buttons.

5.4.2 Input Scoping and Session Isolation

   Input events are scoped to the session and seat from which they
   originate. A client MUST NOT send input events targeting windows
   belonging to any other session. The server MUST validate that the
   seat_id and window_id in every MWS_INPUT_EVENT refer to resources
   owned by the session associated with the SCTP association on
   which the message was received.

   If a client attempts to deliver input to a window outside its
   session, the server MUST reject the message with
   MWS_ERROR_SESSION (type=702, fatal=0). The server MUST NOT reveal
   the existence, geometry, focus state, or any other attributes of
   windows belonging to other sessions.

   Each session owns exactly one logical seat unless additional
   seats have been explicitly negotiated. seat_id values are
   therefore scoped to the session and MUST NOT collide with or
   reference seats belonging to other sessions.

   These rules ensure that input events cannot be redirected,
   spoofed, or injected across session boundaries, and that clients
   cannot observe or influence the input state of other users.

5.5. Video Fallback (500–599) — Stream 3

   Video fallback is used when server‑side rendering produces a
   pixel stream rather than a Vulkan command stream. Stream 3 uses
   PR‑SCTP to allow frame drops under congestion.

      MWS_AV1_FRAME          (type=500) — Server → Client
         Delivers an AV1‑encoded video frame.

      MWS_PLACEHOLDER_FRAME  (type=501) — Server → Client
         Delivers a placeholder frame when rendering is unavailable.

5.6 Audio Plane (600–699) — Stream 4

   The Audio Plane provides full-duplex, timestamped PCM audio
   transport between client and server. Unlike video, audio is
   latency-critical and MUST be delivered on a dedicated SCTP stream
   (Stream 4) to avoid interference from rendering, input, or video
   traffic. Audio streams are independent: each has its own
   parameters, timebase, and flow-control state.

   Two classes of audio streams exist:

      • Playback streams (server → client):
        Audio the client is expected to play.

      • Capture streams (client → server):
        Audio the server is expected to record or process.

   Each logical audio stream is identified by a 32-bit stream_id
   assigned by the endpoint that initiates that stream. A session
   MAY contain zero or more playback streams and zero or more
   capture streams. Streams are negotiated explicitly and MUST NOT
   begin transmitting PCM data until accepted by the peer.

   Audio messages MUST NOT be sent on Stream 0 or on any stream
   reserved for control, rendering commands, input events, or the
   Video Plane. For a given audio stream_id, all audio data for that
   stream SHOULD be carried on Stream 4 and MUST be processed in
   order. Implementations MAY allocate additional SCTP streams for
   audio as an optimisation (for example, one SCTP stream per
   logical audio stream) provided that audio opcodes (600–699) are
   not multiplexed with control, input, or video opcodes on the same
   SCTP stream.

   The following opcodes are defined for playback streams
   (server → client, 600–619):

      MWS_AUDIO_PLAYBACK_OPEN   (type=600)
         Server → Client. Open a playback stream.
         Payload: MwsAudioOpenPlayback.
         Advertises sample rate, channel layout, format, and a
         server-assigned stream identifier. The client replies with
         MWS_AUDIO_PLAYBACK_ACCEPT or MWS_AUDIO_PLAYBACK_REJECT.

      MWS_AUDIO_PLAYBACK_ACCEPT (type=601)
         Client → Server. Accept a playback stream.
         Payload: MwsAudioPlaybackAccept.
         Confirms that the client will play audio for the given
         stream_id. MAY include local constraints (for example,
         latency hints or volume policy).

      MWS_AUDIO_PLAYBACK_REJECT (type=602)
         Client → Server. Reject a playback stream.
         Payload: MwsAudioPlaybackReject.
         Indicates that the client cannot or will not play this
         stream. The server MUST NOT send MWS_AUDIO_PLAYBACK_DATA for
         a rejected stream_id.

      MWS_AUDIO_PLAYBACK_DATA   (type=603)
         Server → Client. PCM audio frames for playback.
         Payload: MwsAudioPlaybackData followed by PCM frames.
         Frames are tagged with a timestamp in the stream’s timebase.
         For a given stream_id, data MUST be delivered in order.

      MWS_AUDIO_PLAYBACK_CLOSE  (type=604)
         Server → Client. Close a playback stream.
         Payload: MwsAudioPlaybackClose.
         Indicates that no further audio will be sent on this
         playback stream. The client MAY reclaim associated
         resources.

   The following opcodes are defined for capture streams
   (client → server, 620–639):

      MWS_AUDIO_CAPTURE_OPEN    (type=620)
         Client → Server. Open a capture stream.
         Payload: MwsAudioOpenCapture.
         Requests capture with a given sample rate, channel layout,
         and format. The server responds with
         MWS_AUDIO_CAPTURE_ACCEPT or MWS_AUDIO_CAPTURE_REJECT.

      MWS_AUDIO_CAPTURE_ACCEPT  (type=621)
         Server → Client. Accept a capture stream.
         Payload: MwsAudioCaptureAccept.
         Confirms that the server will accept audio for the given
         stream_id and MAY adjust parameters.

      MWS_AUDIO_CAPTURE_REJECT  (type=622)
         Server → Client. Reject a capture stream.
         Payload: MwsAudioCaptureReject.
         Indicates that the server cannot accept this stream. The
         client MUST NOT send MWS_AUDIO_CAPTURE_DATA for a rejected
         stream_id.

      MWS_AUDIO_CAPTURE_DATA    (type=623)
         Client → Server. PCM audio frames for capture.
         Payload: MwsAudioCaptureData followed by PCM frames.
         Frames are tagged with a timestamp in the stream’s timebase.

      MWS_AUDIO_CAPTURE_CLOSE   (type=624)
         Client → Server. Close a capture stream.
         Payload: MwsAudioCaptureClose.
         Indicates that no further audio will be sent on this capture
         stream. The server MAY reclaim associated resources.

   Audio sample formats are identified using conventional shorthand
   widely used in digital audio APIs:

      • S16 — signed 16‑bit linear PCM
      • S24 — signed 24‑bit linear PCM (packed or padded)
      • F32 — 32‑bit IEEE 754 floating‑point PCM

   These identifiers are unambiguous and correspond to the formats
   commonly supported by ALSA, PulseAudio, PipeWire, CoreAudio,
   WASAPI, JACK, and other audio subsystems. Implementations that do
   not support a given format MUST omit it from MWS_AUDIO_CAPS.

5.7. Protocol State Machine

5.7.1. Initial Connection

      client                  server                   streams
      ======                  ======                   =======
      (TLS/SCTP handshake)                             [TLS]
      MWS_QUERY -------------------------------------> Stream 0
                              MWS_AUTH_CHALLENGE <-----
      MWS_AUTH_RESPONSE -----------------------------> Stream 0
                              MWS_SURFACE_CAPS <-----
                              MWS_AUDIO_CAPS   <-----
                              MWS_SESSION_INFO <-----
      MWS_CREATE_WINDOW -----------------------------> Stream 0
                              MWS_WINDOW_CREATED <-----
      MWS_MAP_WINDOW --------------------------------> Stream 0
      MWS_VK_SUBMIT ---------------------------------> Stream 1
                              MWS_AV1_FRAME <--------  Stream 3
      MWS_INPUT_EVENT -------------------------------> Stream 2
                              MWS_INPUT_ACK <--------
                              [optional MWS_CONFIGURE_WINDOW]
                              [optional MWS_FOCUS_WINDOW]
                              [optional MWS_SWAPCHAIN_REVOKED]
                              [optional MWS_WINDOW_DESTROYED]

5.7.2. Session Resume

      client                        server
      ======                        ======
      (TLS/SCTP handshake)
      MWS_QUERY ------------------->
                <------------------ MWS_SESSION_RESUME_OFFER
      MWS_SESSION_RESUME_REQUEST -->
                <------------------ MWS_SESSION_RESUME_COMPLETE
                [audio stream parameters MUST be re‑established]

5.8. WSI Extension (Surface Creation)

   MWS defines a Vulkan WSI extension for creating surfaces
   associated with MWS windows. All WSI requests are validated
   by the server.

      typedef struct VkMWSSurfaceCreateInfoMWS {
         VkStructureType  sType;
         const void*      pNext;
         VkDevice         device;
         uint32_t         session_id;
         uint32_t         sctp_stream_id;
         uint32_t         window_id;
         VkExtent2D       initial_extent;
      } VkMWSSurfaceCreateInfoMWS;

      VkResult mwsCreateMWSSurfaceMWS(
         VkInstance                          instance,
         const VkMWSSurfaceCreateInfoMWS*    pCreateInfo,
         const VkAllocationCallbacks*        pAllocator,
         VkSurfaceKHR*                       pSurface
      );

   Surface creation proceeds as follows:

      1. The client calls vkCreateInstance(), receiving MWS_SURFACE_CAPS.
      2. The client calls mwsCreateMWSSurfaceMWS() with session and
         window parameters.
      3. The server validates the request and creates a VkSurfaceKHR
         bound to the specified window_id.
      4. The client calls vkCreateSwapchainKHR() on the returned surface.

5.8.1. Surface Binding and Session Validation

   The fields session_id, sctp_stream_id, and window_id in
   VkMWSSurfaceCreateInfoMWS are not client-authoritative. They are
   treated as requests that the server MUST validate against the
   session associated with the SCTP association on which the WSI
   request was received.

   The server MUST enforce the following rules:

      • session_id MUST match the authenticated session associated
        with the SCTP association. A client MUST NOT request creation
        of a surface for any other session.

      • window_id MUST refer to a window owned by the same session.
        The server MUST reject any request that attempts to bind a
        surface to a window belonging to another session.

      • sctp_stream_id MUST match the rendering stream allocated to
        the specified window. The server MUST reject requests that
        specify an incorrect or unauthorised stream.

   If any of these validations fail, the server MUST reject the
   request with MWS_ERROR_SESSION (type=702, fatal=0). The server MUST NOT
   reveal the existence, geometry, or state of windows belonging to
   other sessions.

   These rules ensure that surface creation cannot be used to infer
   or access the graphical resources of other users, and that Vulkan
   surfaces remain correctly bound to the window and rendering
   stream allocated by the compositor.

5.8.2. Required Enums

   #define VK_STRUCTURE_TYPE_MWS_SURFACE_CREATE_INFO_MWS 1000053000

5.9. Error Handling (700–799)

   Errors are reported using the MWS_ERROR message. Errors are
   asynchronous and MAY be sent by either endpoint on any stream.
   Control-plane errors SHOULD be sent on Stream 0. Errors relating
   to rendering, input, or video fallback MAY be sent on the stream
   on which the offending message was received.

   Receipt of an error does not terminate the SCTP association
   unless the error is marked fatal.

      MWS_ERROR (type=700)
         Reports a protocol, semantic, transport, policy, or
         resource-related error.

         Fields:
            uint32_t error_code;
            uint32_t offending_type;   // MWS_* opcode that caused error
            uint32_t window_id;        // 0 if not applicable
            uint32_t fatal;            // 0 = recoverable, 1 = fatal
            char     description[];    // UTF-8 diagnostic string

   Error classes:

      • Protocol errors:
           – unknown or unsupported opcode
           – malformed header or payload
           – invalid magic value
           – message sent on the wrong SCTP stream
           – invalid length field
           – framing violations

         Protocol errors SHOULD use MWS_ERROR_PROTOCOL (type=701) as
         error_code. Protocol errors are fatal unless explicitly stated
         otherwise or the endpoint can reliably discard the offending
         message without desynchronising protocol state.

      • Semantic errors:
           – referencing a window outside the authenticated session
           – referencing a destroyed or revoked resource
           – message not valid in the current protocol state
           – invalid seat_id or session_id

         Semantic errors SHOULD use MWS_ERROR_SESSION (type=702) as
         error_code. Semantic errors are recoverable unless otherwise
         stated.

      • Policy errors:
           – compositor policy violation
           – swapchain revoked due to timeout or resource pressure
           – access control or authorisation failure

         Policy errors SHOULD use MWS_ERROR_POLICY (type=704) as
         error_code. Policy errors are recoverable unless the compositor
         explicitly marks them fatal.

      • Resource errors:
           – server unable to allocate memory for a new handshake,
             session, or window
           – exhaustion of file descriptors or other kernel resources
           – failure to create required GPU, Vulkan, or kernel objects
             due to resource limits

         Resource errors SHOULD use MWS_ERROR_RESOURCE (type=705) as
         error_code. Resource errors are recoverable unless the endpoint
         explicitly marks them fatal. Transient overload conditions
         (for example, “server too busy to handle your request right
         now”) SHOULD be reported as MWS_ERROR_RESOURCE with fatal=0 so
         that the client can retry at a later time.

      • Timeout errors:
           – expected reply not received within implementation-defined
             limits
           – client or server unresponsive

         Timeout errors MAY be mapped to MWS_ERROR_SESSION (type=702) or
         MWS_ERROR_TRANSPORT (type=703) depending on implementation
         policy.

      • Transport errors:
           – SCTP association loss
           – excessive retransmissions
           – PR-SCTP frame discard (non-fatal)

         Transport errors SHOULD use MWS_ERROR_TRANSPORT (type=703) as
         error_code. Association loss is always fatal.

      • Endpoint failures:
           – server crash or restart
           – client crash or termination

         Endpoint failures are fatal conditions and do not generate
         MWS_ERROR messages.

   Recoverable errors (fatal=0) indicate that the offending message
   has been ignored and the session MAY continue. Fatal errors
   (fatal=1) indicate that the SCTP association MUST be closed
   immediately after transmitting the error, unless the error
   prevents the message from being parsed.

   Loss of the SCTP association for any reason (network failure,
   timeout, endpoint crash) is treated as a fatal error. The session
   MAY persist according to the rules in Section 4.7.

   All error_code and window_id fields follow the network-byte-order
   rules defined in Section 5.1.

5.9.1. Session and Resource Validation Errors

   The server MUST validate that all window_id, seat_id, and
   session_id fields in client-originated messages refer to
   resources owned by the authenticated session associated with the
   SCTP association on which the message was received.

   The following conditions constitute session and resource
   validation errors (a subclass of semantic errors) and MUST be
   reported using MWS_ERROR_SESSION (type=702, fatal=0):

      • referencing a window belonging to another session
      • referencing a seat belonging to another session
      • attempting to bind a Vulkan surface to a window outside
        the authenticated session
      • specifying an sctp_stream_id that does not match the
        rendering stream allocated to the window
      • attempting to resume or manipulate a session not associated
        with the current SCTP association
      • providing a session_id that does not match the authenticated
        session

   When reporting such errors, the server MUST NOT reveal the
   existence, geometry, focus state, or any other attributes of
   resources belonging to other sessions. The window_id field in the
   error message MUST be set to zero if revealing the true
   identifier would disclose cross-session state.

   These rules ensure that clients cannot infer the presence of
   other users, windows, or seats, and that all resource identifiers
   remain strictly scoped to the authenticated session.

6. Reference Implementation

   The Mercurius reference implementation provides a complete,
   interoperable implementation of the protocol, compositor, transport
   stack, and session model described in this document. It is designed
   to be small, comprehensible, and faithful to the specification while
   remaining capable of running real applications, including full
   desktop environments.

   The implementation targets contemporary UNIX‑like systems, with
   Debian and FreeBSD as the primary development platforms. All
   components rely only on portable interfaces available across BSD
   and POSIX systems. The code builds and runs on Linux, but Linux
   is not assumed or required.

   Mercurius follows the same mental model as SSH. A user on a remote
   device may run:

      flash$ ssh chris@xavier uptime
      flash$ mwsc chris@xavier mlogo

   In both cases the user authenticates to a remote machine and
   executes a program there. SSH provides a remote shell; Mercurius
   provides a remote graphical session. The application runs on the
   workstation, while its windows appear on the remote terminal.

   When no application is specified:

      flash$ mwsc xavier

   the terminal connects to the workstation and presents mwsdm, the
   Mercurius Display Manager. This graphical login portal provides
   user authentication, session selection, and resume‑token handling.
   It serves the same role that a shell does in SSH: a default
   environment entered when no specific command is requested.

   Mercurius Portals are a modern reboot of the XTerminal concept for
   the 21st century. Terminals are stateless devices that provide
   display and input while all application execution occurs on the
   workstation. Unlike historical XTerminals, Mercurius Portals operate
   over secure TLS/SCTP transport, support GPU‑accelerated rendering,
   enforce strong authentication (including DANE), and provide session
   detach/reattach semantics.

   The reference implementation may be distributed using illustrative
   package groupings such as:

      mercurius-server   — server daemon, compositor, display manager
      mercurius-client   — terminal client and libraries
      mercurius-utils    — diagnostic and development tools
      mercurius-apps     — demonstration applications

   These names are examples only and are not tied to any specific
   operating system or packaging format.

6.1. Server Components

   The server components are installed on the workstation that owns GPU
   resources, compositor state, and user sessions.

   mwsd
      The primary MWS server daemon. Implements SCTP transport with
      TLS 1.3 mutual authentication, optional DANE validation of
      certificates, the full Mercurius handshake, session and seat
      management, compositor policy, window management, and Vulkan‑
      based rendering. mwsd is the authoritative source of truth for
      all session, seat, and window state.

   mwsdm
      The Mercurius Display Manager. Presented automatically when a
      terminal connects without specifying an application. Provides
      graphical login, user authentication, session selection, and
      resume‑token handling. mwsdm is the entry point for Portal‑style
      deployments.

   mlogo
      A minimal test client that runs on the server and connects to
      the local mwsd instance. It validates transport establishment,
      handshake correctness, session creation, window creation and
      mapping, and basic rendering. It displays a static Mercurius
      logo in a compositor‑managed movable, resizable window.

6.2. Client Components

   The client components are installed on laptops, thin clients, and
   embedded devices that attach to a workstation over the network.

   mwsc
      The primary terminal client. Implements SCTP/TLS transport, the
      full handshake, certificate and DANE validation, window
      management, input routing, audio playback, AV1 decode for
      Stream 3 fallback, and Vulkan loader integration where available.
      Audio is received over the Audio Plane using
      MWS_AUDIO_PLAYBACK_DATA and rendered locally with low latency.
      mwsc is invoked similarly to SSH:

         flash$ mwsc chris@xavier mlogo
         flash$ mwsc xavier

   libmws.a
      A static client library providing SCTP/TLS bindings, DANE‑
      validated mutual authentication, audio playback support, AV1
      decode (via dav1d), protocol message definitions, and Vulkan
      loader integration. Intended for test clients, demos, and early
      adopters.

   mws_protocol.h
      A public protocol header defining all opcodes, message
      structures, and constants corresponding to the formats described
      in Section 5. It is kept in sync with the protocol registry
      (Appendix A) and is intended to remain stable across minor
      revisions.

6.3. Demonstration Clients

   The reference implementation includes several demonstration clients
   that exercise progressively more complex behaviours. These clients
   are not part of the core protocol but are essential for validating
   compositor behaviour, swapchain management, input latency, audio
   transport, and long‑running rendering workloads.

6.3.1. Vulkan Demonstration Clients

   A set of small Vulkan‑based demos validate continuous animation,
   swapchain reuse, and timing stability. Examples include:

      • a rotating textured cube;
      • a particle‑system demo;
      • a multi‑quad compositor simulation.

   These demos validate steady‑state frame pacing, correct damage
   tracking, fallback video behaviour on Stream 3, and long‑running
   rendering without leaks or drift.

6.3.2. Simple Game Demonstration

   Breakout, a simple Vulkan-based game validates keyboard and
   mouse input, low‑latency feedback loops, window focus changes,
   and real‑world interactive rendering workloads. The game also
   exercises the Audio Plane: sound effects are delivered via
   MWS_AUDIO_PLAYBACK_DATA, demonstrating synchronized audio and
   graphics in interactive applications.

6.3.3. Desktop Environment Support

   The reference implementation is capable of running full desktop
   environments such as KDE Plasma 6. This validates multi‑window
   behaviour, compositor correctness, input routing, session
   management, and the ability of Mercurius to support complex,
   latency‑sensitive graphical workloads.

6.4. Dependencies

   The reference implementation targets contemporary UNIX‑like systems,
   with FreeBSD as the primary development platform. Required
   facilities and libraries include:

      • SCTP support (FreeBSD provides a full in‑kernel SCTP stack)
      • Vulkan loader and validation layers (Mesa or LunarG)
      • dav1d — AV1 decoder for Stream 3 fallback
      • PAM — Pluggable Authentication Modules
      • libtevent — portable asynchronous event loop library

6.5. Bootstrap Example

   This example illustrates a minimal three‑host deployment:

      • xavier    — workstation running the MWS server
      • flash     — thin client providing display and input
      • greenway  — thin client based in Switzerland

   1. Start the server:

         xavier$ mwsd --port 49152

      mwsd listens for TLS/SCTP associations on port 49152 and exposes
      the compositor and session manager.

   2. Connect from the terminal:

         flash$ mwsc xavier

      The client establishes a TLS/SCTP association, validates
      certificates via DANE, and presents mwsdm, the graphical login
      manager.

   3. Run an application on the LAN:

         flash$ mwsc chris@xavier mlogo

      mlogo executes on xavier, connects to the local mwsd, joins the
      session associated with flash, creates a window, and renders the
      Mercurius logo.

      This validates the full end‑to‑end path: TLS/SCTP transport,
      certificate authentication, user authentication, session and seat
      creation, window creation and mapping, swapchain initialisation,
      GPU rendering, audio transport (where applicable), and remote
      display.

   4. Run an application over the Internet:

         greenway$ mwsc per@xavier.tebibyte.org mlogo

      mlogo executes on xavier, connects to the local mwsd, creates
      a new session and renders the Mercurius logo in a resizable,
      movable window on Per’s machine in Switzerland. In this example
      both endpoints have excellent connectivity with low latency and
      high bandwidth, allowing the remote session to behave similarly
      to a local‑area connection.

7. Implementation Requirements and Validation

   This section defines normative requirements for any conformant
   MWS implementation. These requirements ensure correct behaviour
   under load, predictable session semantics, and robust isolation
   between clients. The reference implementation demonstrates these
   properties but does not attempt to optimise for all hardware
   configurations.

7.1. Test Matrix

   An implementation of MWS MUST demonstrate correct behaviour
   across four major dimensions:

      • Session semantics — creation, resume, detachment, identifier
        stability, and state continuity.

      • Window lifecycle — creation, mapping, resizing, destruction,
        and identifier scoping.

      • Rendering correctness — surface creation, command ordering,
        GPU isolation, and frame delivery.

      • Transport behaviour — SCTP stream allocation, ordering
        guarantees, error handling, and reconnection.

   The following matrix defines the minimum set of tests required
   to validate interoperability between an MWS client and server.
   These tests are not exhaustive; they represent the baseline
   necessary to confirm that the architectural components described
   in this document behave as specified.

7.1.1. Core Validation Tests

   +------------------------+----------------------------+------------------+
   | Test Case              | Description                | Success Criteria |
   +------------------------+----------------------------+------------------+
   | Bootstrap + Render     | Establish a session and    | Initial frame    |
   |                        | render a minimal surface   | displayed in a   |
   |                        | using the reference        | reasonable time  |
   |                        | client.                    | on reference     |
   |                        |                            | hardware;        |
   |                        |                            | session          |
   |                        |                            | terminates or    |
   |                        |                            | detaches cleanly.|
   +------------------------+----------------------------+------------------+
   | Window Lifecycle       | Create, map, unmap, and    | Correct          |
   |                        | destroy a window while     | CREATE→MAP→      |
   |                        | observing compositor       | UNMAP→DESTROY    |
   |                        | events.                    | sequence; no     |
   |                        |                            | orphaned         |
   |                        |                            | resources.       |
   +------------------------+----------------------------+------------------+
   | Session Persistence    | Start a session, detach or | Session resumes  |
   |                        | allow the client to        | with compositor  |
   |                        | disconnect, then resume    | state            |
   |                        | using the same session     | reconstructed as |
   |                        | identifier.                | defined in       |
   |                        |                            | Sections 4.7 and |
   |                        |                            | 5.2.2; client    |
   |                        |                            | can redraw       |
   |                        |                            | without protocol |
   |                        |                            | violations.      |
   +------------------------+----------------------------+------------------+
   | GPU Isolation          | Run multiple clients       | No cross‑session |
   |                        | concurrently, each         | resource leakage;|
   |                        | creating independent       | surfaces and     |
   |                        | surfaces.                  | windows remain   |
   |                        |                            | isolated.        |
   +------------------------+----------------------------+------------------+
   | Transport Stream       | Exercise streams 0–4 with  | No reordering    |
   | Allocation             | mixed control, rendering,  | within a stream; |
   |                        | input, audio, and video    | correct routing  |
   |                        | fallback traffic.          | based on session |
   |                        |                            | and window IDs.  |
   +------------------------+----------------------------+------------------+
   | Input Event Semantics  | Deliver pointer and        | Events delivered |
   |                        | keyboard events to         | only to the      |
   |                        | multiple windows across    | focused window;  |
   |                        | seats.                     | correct seat and |
   |                        |                            | session scoping. |
   +------------------------+----------------------------+------------------+
   | Error Handling         | Trigger invalid IDs,       | Server returns   |
   |                        | malformed messages, and    | appropriate      |
   |                        | protocol violations.       | error codes      |
   |                        |                            | (700–799);       |
   |                        |                            | session integrity|
   |                        |                            | maintained.      |
   +------------------------+----------------------------+------------------+

7.1.2. Reference Implementation Commands (Non‑Normative)

   The reference implementation provides the canonical server and client
   binaries for exercising the above tests. The following examples
   illustrate the intended usage pattern; conforming implementations MAY
   use any equivalent mechanism.

   For this example there are three hosts:

      • xavier: a workstation acting as the server
      • flash:  a thin client
      • kitty:  a laptop acting as a thin client

      • Bootstrap + Render:

           root@xavier# mwsd
           chris@xavier$ mwsc -lc mlogo
           chris@xavier$ mwsc localhost mlogo

      Expected result: a logo window appears on the display attached
      to xavier; mlogo exits when the user closes the window.

           flash$ mwsc --host xavier --port 49152 mlogo
           Run the mlogo command on the server 'xavier' from flash.

        Expected result: a logo window appears on the display attached
        to flash; mlogo exits when the user closes the window.

      • Session Persistence:

           flash$ mwsc xavier
           [mwsdm presents a login prompt; user authenticates]
           [mwsdm offers to resume an existing detached session or start a new one]
           [user resumes their previous session]

        Expected result: the resumed session appears exactly as it was
        left, with windows and compositor state reconstructed as defined
        in Sections 4.7 and 5.2.2. The client can re‑establish rendering
        without violating protocol or WSI rules.

      • GPU Isolation (multiple clients):

           flash$ mwsc xavier mlogo
           kitty$ mwsc xavier mlogo

        Expected result: independent windows are created in separate
        sessions on flash and kitty without visible interference.
        Destroying one client or terminating one session does not
        affect the others.

      • CLI Variants (for completeness):

           # mwsd variants
           root@xavier# mwsd --help
           root@xavier# mwsd --version
           root@xavier# mwsd -p 49152

           # mwsc variants
           xavier$ mwsc -l mlogo
           xavier$ mwsc -lc mlogo
           xavier$ mwsc --local mlogo
           xavier$ mwsc --local --command /usr/local/bin/mlogo
           xavier$ mwsc localhost mlogo

           flash$ mwsc -h xavier -p 49152
           flash$ mwsc -h xavier -c mlogo

           kitty$ mwsc chris@xavier
           kitty$ mwsc chris@xavier.tebibyte.org "/usr/local/bin/mlogo"

           flash$ mwsc --help
           flash$ mwsc --version

   These examples are illustrative only. They do not form part of the
   normative protocol and do not constrain implementation‑specific
   tooling.

7.2. GPU Isolation Requirements

   Implementations MUST ensure that GPU workloads from one session
   cannot compromise the integrity or confidentiality of another
   session’s resources, regardless of whether the server contains a
   single GPU or multiple GPUs.

      • The server SHOULD avoid allowing GPU workloads from one
        session to starve or block those of another. Implementations
        MAY use separate Vulkan queues, queue subsets, per‑session
        scheduling domains, or multi‑GPU distribution strategies to
        achieve this.

      • The server MUST validate VkCommandBuffer submissions
        sufficiently to prevent malformed or out‑of‑bounds accesses
        that would violate isolation guarantees. Invalid or malformed
        command buffers MUST be rejected without execution.

      • The server MUST enforce per‑session limits on GPU resource
        usage, including device memory, descriptor sets, and command
        buffer size. When limits are exceeded, the server MAY throttle,
        reject further submissions, or terminate the session. On
        systems with multiple GPUs, implementations MAY assign sessions
        to different GPUs to improve isolation or load distribution.

      • The server SHOULD implement watchdog mechanisms to detect and
        recover from GPU hangs attributable to a particular session.
        Recovery MAY include resetting client queues, revoking
        swapchains, or terminating the offending session while
        preserving other sessions. On multi‑GPU systems, recovery MAY
        include migrating unaffected sessions to other GPUs.

7.3. Bandwidth and Transport Isolation Requirements

   Implementations MUST ensure that control and input remain responsive
   under load and that one client cannot monopolise transport resources
   to the detriment of others.

      • Stream 0 (control) and Stream 2 (input) MUST use reliable,
        ordered delivery and MUST be prioritised over bulk data on
        other streams.

      • Stream 1 (rendering commands) MUST use reliable, ordered
        delivery. The server MAY impose rate limits on VK_SUBMIT or
        equivalent rendering messages to prevent excessive queueing.

      • Stream 3 (AV1 fallback video) MAY use partially reliable
        delivery (PR‑SCTP). The server MAY drop frames under congestion
        to maintain interactivity.

      • Stream 4 (audio playback) MUST use reliable, ordered delivery
        for MWS_AUDIO_PLAYBACK_DATA. The server and client SHOULD keep
        audio buffered sufficiently to tolerate moderate jitter while
        maintaining low‑latency playback.

      • The server SHOULD implement per‑session or per‑client bandwidth
        limits to prevent link saturation. Limits MAY be enforced at
        the SCTP layer, via traffic shaping, or using equivalent
        mechanisms.

      • The server MUST be able to unilaterally terminate a misbehaving
        client without impacting other sessions. Termination SHOULD be
        signalled with MWS_ERROR(type=700, fatal=1) followed by closure
        of the SCTP association, as defined in Section 5.8. Termination
        MUST release all GPU, transport, and compositor resources
        associated with that client, and MUST NOT affect other active
        sessions.

8. Performance Considerations

   MWS is designed for environments where the workstation and
   client devices are connected by modern high‑bandwidth,
   low‑latency networks. Contemporary 10 GbE and fibre‑backed LANs
   routinely deliver round‑trip latencies of 200–500 µs. In the
   reference test environment, measured RTT between a thin client
   and the workstation averaged approximately 0.33 ms (ICMP),
   consistent with this range. At such latencies, the network
   contributes only a small fraction of the end‑to‑end
   input‑to‑display path. A typical RTT in this range implies
   one‑way transport latency well below 0.5 ms, meaning that
   interactive responsiveness is dominated by GPU encode and decode
   rather than by the network.

   Modern GPUs provide hardware‑accelerated AV1 encoding with
   sub‑millisecond latency at common workstation resolutions.
   Because MWS performs all rendering and compositing on the server,
   and because clients do not execute application code or GPU
   workloads, the client‑side overhead is minimal. This eliminates
   the multiple GPU round‑trips, buffer copies, and synchronisation
   barriers found in traditional remote‑desktop systems, where the
   client must composite, scale, or colour‑convert frames before
   display. By reducing the client to a lightweight presentation
   endpoint with minimal CPU and GPU involvement, MWS ensures that
   interactive latency is dominated by server‑side encode time
   rather than by client‑side processing. As a result, a remote
   graphical session over a modern LAN is effectively
   indistinguishable from local use for typical workstation
   workloads.

   The following non‑normative figures illustrate achievable
   encode‑and‑transport performance on a 10 GbE LAN using an RTX
   5070‑class GPU. These values represent server‑side encode latency
   plus network transit time under ideal conditions; they do not
   include client‑side display latency.

   +===============+==========+==========================+======+
   | Resolution    | Bitrate  | Encode+Transport Latency | CPU  |
   +===============+==========+==========================+======+
   | 4K60 HDR      | 200 Mbps | <0.5 ms                  | <1%  |
   +---------------+----------+--------------------------+------+
   | 4K120 Gaming  | 500 Mbps | <1.0 ms                  | <2%  |
   +---------------+----------+--------------------------+------+
   | 8K60 HDR      | 1.2 Gbps | <1.5 ms                  | <3%  |
   +===============+==========+==========================+======+

   These figures demonstrate that MWS can support
   high‑resolution, high‑refresh‑rate graphical workloads over a
   modern LAN without compromising interactive performance. For
   typical desktop and workstation applications, the resulting
   experience is comparable to local use.

9. Security Considerations

   MWS is designed according to zero‑trust principles: no client
   device, network segment, or intermediary is implicitly trusted.
   All trust is derived from cryptographic identity and explicit
   authorisation rather than network location. The protocol assumes
   that client devices may be compromised, mobile, or operating on
   hostile networks, and that attackers may observe, inject, or
   replay traffic unless prevented by cryptographic protections.

   Transport security is provided by a secure, mutually
   authenticated TLS session layered over SCTP. When DANE is
   deployed, the client and server certificates are validated
   against DNSSEC‑protected TLSA records, ensuring that only devices
   explicitly provisioned by the domain administrator may initiate a
   connection. Deployments without DNSSEC or without control over
   their DNS zone SHOULD use traditional PKI validation instead.

   User authentication is performed at the application layer
   using the mechanism‑agnostic model defined in Section 5.2.1. The
   server advertises supported mechanisms (for example, “PAM”,
   “FIDO2”) as UTF‑8 identifiers in MWS_AUTH_CHALLENGE, and the
   client selects one. This separation of device and user identity
   ensures that compromise of a device does not grant access to a
   user’s session without the corresponding user credential.

   Each user is given an isolated session and compositor context.
   Clients cannot observe or interfere with other users’ windows,
   input events, or rendering state. Window identifiers are scoped
   to a session, and all client‑originated messages are validated by
   the server. Attempts to reference resources outside the
   authenticated session are rejected with MWS_ERROR(type=700) as
   described in Section 5.8.

   Rendering commands (300–399), input events (400–499), and
   video frames (500–599) are isolated on separate SCTP streams to
   limit the impact of congestion or packet loss. Partially reliable
   delivery MAY be used for video fallback to avoid resource
   exhaustion and to prevent attackers from inducing excessive
   retransmissions without affecting control or input flows.

   Low‑latency audio streams (600–699) follow the same isolation
   model. Audio is transported on dedicated SCTP streams with
   independent reliability and congestion‑control behaviour,
   allowing deployments where the audio interface, mixer, or
   monitoring equipment is located with the user while the digital
   audio workstation (e.g., Ardour) runs remotely on a
   workstation‑class server. This enables studio workflows that
   require silent or thermally isolated compute hardware without
   exposing audio data to other users or sessions. Attempts to
   inject, redirect, or observe audio frames outside the
   authenticated session are rejected by the server, and compromise
   of a client device does not grant access to any other user’s
   audio streams.

   The client is not part of the trusted computing base. It
   stores no confidential data, long‑term session state, or reusable
   credentials. If a client disconnects unexpectedly, the session
   persists only for the duration of the reconnection grace period
   unless the user has explicitly detached. After this period, the
   session is terminated and all associated resources are destroyed,
   as defined in Section 4.7.

   Loss of the transport association for any reason (network
   failure, timeout, endpoint crash) is treated as a fatal transport
   error. The session MAY persist according to the rules in Section
   4.7 and MAY be resumed from another client device subject to
   policy.

9.1. DANE Deployment (Non‑Normative)

   Deployments that operate their own DNS infrastructure MAY use
   DNSSEC and TLSA records (DANE) to authenticate client and server
   certificates during the TLS/SCTP handshake. When DNSSEC
   validation is available, DANE provides a robust mechanism for
   binding workstation and device identity to DNS without relying on
   public certificate authorities.

   When DANE is enabled, the client proves possession of a
   private key whose certificate is published via a DNSSEC‑protected
   TLSA record. No credential payload is required for this
   device‑level authentication; all trust material is derived from
   the mutual‑TLS handshake and DNSSEC validation. This allows
   deployments to authenticate devices without storing reusable
   secrets on the client.

   DANE is OPTIONAL and does not alter the protocol semantics.
   When enabled, it reduces operational complexity in closed trust
   domains by eliminating external trust dependencies, simplifying
   certificate lifecycle management, and mitigating
   man‑in‑the‑middle attacks even in the event of public CA
   compromise.

   Deployments without DNSSEC or without administrative control
   over their DNS zone SHOULD use traditional PKI validation
   instead.

10. IANA Considerations

   This document requests two IANA actions:

      1. Registration of a port number for the Mercurius Window
         System (MWS) in the “Service Name and Transport Protocol Port
         Number Registry.” The suggested service name is mws, and the port
         number SHOULD be allocated from the Registered Port range
         (1024–49151). MWS uses SCTP as its transport.

      2. Registration of the TLS application‑layer protocol
         identifier (ALPN) “mws”.

   No other registries are required. In particular, MWS message
   types, opcodes, and SCTP stream assignments are managed entirely
   within the protocol and do not require IANA allocation.

11. Acknowledgements

   Christopher Ross (chris@tebibyte.org) provided the initial design
   and the reference implementation.

   The reference implementation described in Section 6 is maintained
   in the Mercurius source code repository. Git and SSH access are
   available to contributors on request via mercurius@tebibyte.org.

   Additional background material, including architectural
   rationale, design philosophy, and example use cases, is available
   from the Mercurius project website
   <https://mercurius.tebibyte.org>. This information is provided
   for context only and is non‑normative; the protocol defined in
   this document is complete and does not depend on any specific
   implementation or external documentation.

12. References

12.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119,
             DOI 10.17487/RFC2119, March 1997.

   [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
             2119 Key Words", BCP 14, RFC 8174,
             DOI 10.17487/RFC8174, May 2017.

   [RFC4895] Tuexen, M., Stewart, R., and P. Lei, "Authenticated Chunks
             for Stream Control Transmission Protocol (SCTP)", RFC 4895,
             DOI 10.17487/RFC4895, August 2007.

   [RFC9260] Stewart, R., Tuexen, M., and X. Dutreilh, "Stream Control
             Transmission Protocol", RFC 9260,
             DOI 10.17487/RFC9260, June 2022.

   [RFC3436] Jungmaier, A., Rescorla, E., and M. Tuexen,
             "Transport Layer Security over Stream Control Transmission
             Protocol", RFC 3436, DOI 10.17487/RFC3436, December 2002.

   [RFC6698] Hoffman, P. and J. Schlyter, "The DNS-Based Authentication
             of Named Entities (DANE) Transport Layer Security (TLS)
             Protocol: TLSA", RFC 6698,
             DOI 10.17487/RFC6698, August 2012.

   [RFC7671] Dukhovni, V. and W. Hardaker, "The DNS-Based Authentication
             of Named Entities (DANE) Protocol: Updates and Operational
             Guidance", RFC 7671,
             DOI 10.17487/RFC7671, October 2015.

   [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
             Version 1.3", RFC 8446,
             DOI 10.17487/RFC8446, August 2018.

12.2. Informative References

   [VK14]    Khronos Group, "Vulkan 1.4 Specification", 2024.

   [NIST800-207]
             National Institute of Standards and Technology,
             "Zero Trust Architecture", NIST Special Publication
             800-207, August 2020.

   [RFC9261] Tuexen, M. and R. Stewart, "Datagram Transport Layer
             Security (DTLS) Encapsulation of SCTP Packets", RFC 9261,
             DOI 10.17487/RFC9261, June 2022.

Appendix A. MWS Opcode Registry

   This appendix defines the complete registry of MWS opcodes. All
   opcodes are 16‑bit unsigned integers. Opcodes are grouped into
   100‑entry ranges according to functional category. Implementations
   MUST treat unknown opcodes as protocol errors and respond with
   MWS_ERROR (type=700) as described in Section 5.9.

A.1. Handshake and Authentication (000–099)

   NOTE: Type 000 is reserved and MUST be treated as a NULL/invalid
         value. Implementations encountering type=000 MUST respond
         with MWS_ERROR.

      001  MWS_QUERY
      002  MWS_AUTH_CHALLENGE
      003  MWS_AUTH_RESPONSE
      004  MWS_SURFACE_CAPS
      005  MWS_AUDIO_CAPS
      006  MWS_SESSION_INFO

      050–099  Reserved for future handshake extensions

A.2. Session Management (100–199)

      100  MWS_SESSION_RESUME_OFFER
      101  MWS_SESSION_RESUME_REQUEST
      102  MWS_SESSION_RESUME_COMPLETE
      103  MWS_SESSION_DETACH

      110  MWS_EXEC_REQUEST
      111  MWS_EXEC_RESULT

      150–199  Reserved for future session‑management extensions

A.3. Window Lifecycle (200–299)

      200  MWS_CREATE_WINDOW
      201  MWS_WINDOW_CREATED
      202  MWS_DESTROY_WINDOW
      203  MWS_WINDOW_DESTROYED
      204  MWS_MAP_WINDOW
      205  MWS_UNMAP_WINDOW
      206  MWS_CONFIGURE_WINDOW
      207  MWS_FOCUS_WINDOW
      208  MWS_SWAPCHAIN_REVOKED

      250–299  Reserved for future window‑lifecycle extensions

A.4. Rendering Commands (300–399)

      300  MWS_VK_SUBMIT
      301  MWS_VK_SYNC
      302  MWS_VK_DESTROY

      350–399  Reserved for future rendering extensions
               (e.g., bitmap upload, GPU‑side composition, etc.)

A.5. Input Events (400–499)

      400  MWS_INPUT_EVENT
      401  MWS_INPUT_ACK

      450–499  Reserved for future input extensions
               (e.g., haptics, multi‑seat extensions)

A.6. Video Plane (500–599)

      500  MWS_AV1_FRAME
      501  MWS_PLACEHOLDER_FRAME

      550–599  Reserved for future video‑plane extensions

A.7. Audio Plane (600–699)

   Playback streams (server → client):

      600  MWS_AUDIO_PLAYBACK_OPEN
      601  MWS_AUDIO_PLAYBACK_ACCEPT
      602  MWS_AUDIO_PLAYBACK_REJECT
      603  MWS_AUDIO_PLAYBACK_DATA
      604  MWS_AUDIO_PLAYBACK_CLOSE

   Capture streams (client → server):

      620  MWS_AUDIO_CAPTURE_OPEN
      621  MWS_AUDIO_CAPTURE_ACCEPT
      622  MWS_AUDIO_CAPTURE_REJECT
      623  MWS_AUDIO_CAPTURE_DATA
      624  MWS_AUDIO_CAPTURE_CLOSE

      640–699  Reserved for future audio‑plane extensions

A.8. Error Reporting (700–799)

      700  MWS_ERROR
      701  MWS_ERROR_PROTOCOL
      702  MWS_ERROR_SESSION
      703  MWS_ERROR_TRANSPORT
      704  MWS_ERROR_POLICY
      705  MWS_ERROR_RESOURCE

      750–799  Reserved for future error‑reporting extensions

A.9. Reserved for Future Extensions (800–899)

      800–899  Reserved for future protocol extensions.

A.10. Experimental and Vendor‑Specific (900–999)

      900–999  Experimental, vendor‑specific, or implementation‑defined
               opcodes. These MUST NOT be used in interoperable
               deployments and MUST NOT be relied upon in Internet‑scale
               deployments.

Appendix B. Authentication Mechanism Registry

   MWS supports a mechanism‑agnostic authentication model. During
   the initial handshake, the server advertises one or more
   authentication mechanisms using MWS_AUTH_CHALLENGE (type=002).
   The client selects a mechanism and responds with
   MWS_AUTH_RESPONSE (type=003), providing mechanism‑specific
   credentials or authentication data.

   This appendix defines the registry of authentication mechanism
   identifiers. Mechanism identifiers are UTF‑8 strings and are
   compared using case‑sensitive bytewise comparison. Identifiers
   MUST NOT exceed 64 bytes in length.

   Implementations MUST ignore unknown mechanism identifiers and
   MUST NOT attempt to interpret their payloads. Servers MUST NOT
   advertise mechanisms they do not fully support.

B.1. Standard Mechanisms

   The following mechanism identifiers are defined by this specification:

      "PAM"
         The server authenticates the user using the system’s
         Pluggable Authentication Modules (PAM) stack. The credential
         payload contains a NUL‑terminated username followed by a
         NUL‑terminated password.

      "SSHKEY"
         Mercurius can use the same public/private key files that OpenSSH
         uses. This is simply a convenience: tools like `ssh-keygen` and
         `ssh-copy-id` make it easy to create and install keypairs, and
         Mercurius understands the same PEM‑encoded RSA private key format.

B.2. Extensible Mechanisms

   The following identifiers are reserved for future specifications or
   external standards. Their payload formats are not defined by this
   document.

      "FIDO2"
         Authentication using a FIDO2 authenticator.

      "WEBAUTHN"
         Authentication using a WebAuthn ceremony.

      "KERBEROS"
         Authentication using a Kerberos AP‑REQ exchange.

      "OAUTH2"
         Authentication using an OAuth 2.0 device or authorization‑code
         flow.

   Servers MAY advertise any subset of these mechanisms. Clients MAY
   implement any subset.

B.3. Private and Experimental Mechanisms

   Mechanism identifiers beginning with the prefix "X‑" are
   reserved for private, experimental, or vendor‑specific use. These
   identifiers MUST NOT appear in interoperable deployments or
   Internet‑facing services.

      Examples:
         "X‑FINGERPRINT"
         "X‑HARDWARE‑TOKEN"
         "X‑SSO‑PROTOTYPE"

B.4. Registration Policy

   New mechanism identifiers MAY be defined by future MWS
   extensions or external standards. To avoid collisions, new
   identifiers SHOULD be registered with IANA if this specification
   is published on the IETF Standards Track.

   Until such time, implementers SHOULD use the "X‑" prefix for
   experimental mechanisms and MUST NOT assume global uniqueness.

Appendix C. SCTP Stream Usage Summary

   MWS uses multiple SCTP streams to isolate control, rendering,
   input, video, and audio traffic. This appendix summarises the
   required stream assignments. All streams use DTLS for
   confidentiality and integrity.

   Stream assignments are fixed and MUST NOT be repurposed for
   other message classes. Implementations MAY open additional
   streams for experimental or vendor‑specific extensions, provided
   they do not conflict with the assignments below.

C.1. Stream 0 — Control Plane

   Stream 0 carries all ordered control‑plane traffic, including:

      • handshake messages (001–099)
      • session‑management messages (100–199)
      • window‑lifecycle messages (200–299)
      • error messages (700–799)

   Messages on Stream 0 MUST be delivered reliably and in order.

C.2. Stream 1 — Rendering Commands

   Stream 1 carries rendering commands (300–399), including
   Vulkan command submission and GPU‑resource management.

   Messages on Stream 1 MUST be delivered reliably and in order.

C.3. Stream 2 — Input Events

   Stream 2 carries input events (400–499), including keyboard,
   pointer, and tablet events.

   Messages on Stream 2 SHOULD be delivered reliably but MAY be
   processed out of order by the server if permitted by the input
   subsystem.

C.4. Stream 3 — Video Plane

   Stream 3 carries video‑plane traffic (500–599), including
   AV1 frames and placeholder frames.

   Stream 3 MUST use PR‑SCTP (Partial Reliability SCTP) to allow
   frame discard under congestion. Implementations SHOULD use timed
   reliability or limited retransmission to avoid head‑of‑line
   blocking.

C.5. Stream 4 — Audio Plane

   Stream 4 carries audio‑plane traffic (600–699), including:

      • playback streams (600–619)
      • capture streams  (620–639)

   Messages on Stream 4 MUST be delivered reliably and in order.
   Implementations MAY allocate additional SCTP streams for audio
   (for example, one SCTP stream per logical audio stream) as an
   optimisation, provided that audio opcodes (600–699) are not
   multiplexed with control, input, or video opcodes on the same
   SCTP stream.

C.6. Additional Streams

   Streams beyond Stream 4 are reserved for future extensions.
   Such extensions MUST specify:

      • reliability requirements (reliable, PR‑SCTP, unordered)
      • congestion‑control expectations
      • interaction with the control plane

   Experimental or vendor‑specific extensions SHOULD use streams
   ≥16 to avoid collision with future standardised assignments.

Appendix D. Protocol State Machine Diagrams

   This appendix provides normative state‑machine diagrams for
   the MWS protocol. These diagrams illustrate the ordered
   interactions between client and server during initial connection,
   session resumption, and normal operation. All control‑plane
   transitions occur on SCTP Stream 0. Rendering, input, video, and
   audio traffic occur on their respective streams as defined in
   Appendix C.

D.1. Initial Connection State Machine

      +------------------+
      |  START           |
      +------------------+
                |
                |  (TLS/SCTP handshake)
                v
      +------------------+
      |  TRANSPORT_UP    |
      +------------------+
                |
                |  MWS_QUERY (001)
                v
      +------------------+
      |  WAIT_AUTH       |
      +------------------+
                |
                |  MWS_AUTH_CHALLENGE (002)
                v
      +------------------+
      |  AUTH_NEGOTIATE  |
      +------------------+
                |
                |  MWS_AUTH_RESPONSE (003)
                v
      +------------------+
      |  AUTH_VERIFY     |
      +------------------+
                |
                |  success → MWS_SURFACE_CAPS (004)
                |  failure → MWS_ERROR (700, fatal=1)
                v
      +------------------+
      |  SEND_CAPS       |
      +------------------+
                |
                |  MWS_SESSION_INFO (005)
                v
      +------------------+
      |  SESSION_READY   |
      +------------------+
                |
                |  MWS_CREATE_WINDOW (200)
                v
      +------------------+
      |  WINDOW_CREATE   |
      +------------------+
                |
                |  MWS_WINDOW_CREATED (201)
                v
      +------------------+
      |  ACTIVE_SESSION  |
      +------------------+
                |
                |  normal operation:
                |    • rendering (300–399) on Stream 1
                |    • input (400–499) on Stream 2
                |    • video (500–599) on Stream 3
                |    • audio (600–699) on Stream 4
                v
      +------------------+
      |  RUNNING         |
      +------------------+

D.2. Session Resume State Machine

      +------------------+
      |  START           |
      +------------------+
                |
                |  (TLS/SCTP handshake)
                v
      +------------------+
      |  TRANSPORT_UP    |
      +------------------+
                |
                |  MWS_QUERY (001)
                v
      +------------------+
      |  WAIT_RESUME     |
      +------------------+
                |
                |  MWS_SESSION_RESUME_OFFER (100)
                v
      +------------------+
      |  RESUME_OFFERED  |
      +------------------+
                |
                |  MWS_SESSION_RESUME_REQUEST (101)
                v
      +------------------+
      |  RESUME_VERIFY   |
      +------------------+
                |
                |  success → MWS_SESSION_RESUME_COMPLETE (102)
                |  failure → MWS_ERROR (700, fatal=1)
                v
      +------------------+
      |  ACTIVE_SESSION  |
      +------------------+
                |
                |  normal operation resumes:
                |    • rendering (300–399) on Stream 1
                |    • input (400–499) on Stream 2
                |    • video (500–599) on Stream 3
                |    • audio (600–699) on Stream 4
                |      (audio stream capabilities and timing are
                |       re‑negotiated during session resume)
                v
      +------------------+
      |  RUNNING         |
      +------------------+

D.3. Error Handling State Machine

   Errors may occur at any point in the protocol. The
   following diagram illustrates the error‑handling model:

      +------------------+
      |  ANY_STATE       |
      +------------------+
                |
                |  recoverable error
                |  MWS_ERROR (700, fatal=0)
                v
      +------------------+
      |  CONTINUE        |
      +------------------+

                |
                |  fatal error
                |  MWS_ERROR (700, fatal=1)
                v
      +------------------+
      |  TERMINATE       |
      +------------------+
                |o
                |  SCTP association closed
                v
      +------------------+
      |  END             |
      +------------------+

D.4. Stream Interaction Summary

   The following summary illustrates the concurrency model across
   SCTP streams:

   Stream 0 (control):   ordered, reliable
      • handshake (001–099)
      • session management (100–199)
      • window lifecycle (200–299)
      • error reporting (700–799)

   Stream 1 (rendering): ordered, reliable
      • Vulkan commands (300–399)

   Stream 2 (input):     reliable; MAY be processed out of order
      • input events (400–499)

   Stream 3 (video):     PR‑SCTP, typically unordered
      • AV1 frames and placeholder frames (500–599)

   Stream 4 (audio):     ordered, reliable; MAY use additional SCTP streams
      • playback streams (600–619)
      • capture streams  (620–639)

   These streams operate independently. Loss or delay on one
   stream MUST NOT block progress on any other stream.

13. Copyright

   Copyright (c) 2026 IETF Trust and the persons
   identified as the document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.
Mercurius Window System (MWS) draft-ross-mercurius-02

Mercurius Window System (MWS)
draft-ross-mercurius-02