WebSockets & Real-Time Apps

Understanding the WebSocket Protocol

20 min Lesson 2 of 35

The WebSocket Protocol

WebSocket is a communication protocol defined in RFC 6455 that provides full-duplex communication channels over a single TCP connection. Unlike HTTP, which follows a request-response pattern, WebSocket enables continuous bidirectional data flow between client and server.

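Key Specification: The WebSocket protocol was standardized by the IETF as RFC 6455 in 2011, and the browser API is defined in the WHATWG WebSockets Living Standard (originally part of the HTML specification). By default it uses port 80 for ws:// and port 443 for wss://, the same ports as HTTP and HTTPS, which helps connections pass through firewalls and proxies.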

The WebSocket Handshake

WebSocket connections begin as HTTP requests and are then "upgraded" to the WebSocket protocol through a handshake process:

Client Handshake Request

GET /chat HTTP/1.1
Host: example.com:8080
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: http://example.com

Key Headers Explained:

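  • Upgrade: websocket - Requests a protocol upgrade from HTTP to WebSocket
  • Connection: Upgrade - Indicates the client wants to change protocols on this connection
  • Sec-WebSocket-Key - Base64-encoded random 16-byte nonce, generated fresh for each connection; the server hashes it to prove it processed this handshake (it is not an authentication secret)
  • Sec-WebSocket-Version - WebSocket protocol version (13 is the version defined by RFC 6455)
  • Origin - Browser security header indicating the origin of the page making the request, which the server can use to reject unwanted cross-origin connections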
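Browsers generate these headers automatically, but a non-browser client has to build them itself. The following is a minimal Node.js sketch of that step; the function names `generateWebSocketKey` and `buildUpgradeHeaders` are hypothetical, chosen only for illustration.

```javascript
const { randomBytes } = require('node:crypto');

// Generate a fresh nonce for the Sec-WebSocket-Key header:
// 16 random bytes, Base64-encoded (always 24 characters, including padding).
function generateWebSocketKey() {
  return randomBytes(16).toString('base64');
}

// Build the upgrade request headers for a hypothetical client.
function buildUpgradeHeaders(host, origin) {
  return {
    Host: host,
    Upgrade: 'websocket',
    Connection: 'Upgrade',
    'Sec-WebSocket-Key': generateWebSocketKey(),
    'Sec-WebSocket-Version': '13',
    Origin: origin,
  };
}

console.log(buildUpgradeHeaders('example.com:8080', 'http://example.com'));
```

The key differs on every call, which is the point: a server response computed for one handshake cannot be replayed for another.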

Server Handshake Response

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

Response Details:

  • 101 Switching Protocols - Status code indicating successful upgrade
  • Sec-WebSocket-Accept - Computed hash of the client key proving server understands WebSocket protocol
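
Security Mechanism: The Sec-WebSocket-Accept header is computed by appending the fixed GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to the client's Sec-WebSocket-Key, taking the SHA-1 hash of the result, and Base64-encoding the 20-byte digest. A matching value shows that the response came from a server that understood the WebSocket handshake rather than from a cache or a plain HTTP server, which helps prevent protocol confusion attacks.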
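
The computation can be sketched in a few lines of Node.js. The function name `computeAcceptValue` is hypothetical; the key and expected result are the ones from the handshake example in this lesson.

```javascript
const { createHash } = require('node:crypto');

// GUID fixed by RFC 6455; every server appends the same value.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Concatenate key + GUID, hash with SHA-1, then Base64-encode the digest.
function computeAcceptValue(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client runs the same computation on the key it sent and rejects the connection if the server's header does not match.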

WebSocket URIs: ws:// and wss://

WebSocket uses its own URI schemes similar to HTTP:

// Unencrypted WebSocket connection
ws://example.com:8080/socket

// Encrypted WebSocket connection (over TLS/SSL)
wss://example.com:443/socket

URI Components:

  • Scheme: ws:// (unencrypted) or wss:// (encrypted)
  • Host: Domain name or IP address
  • Port: Defaults to 80 for ws:// and 443 for wss://; any other port can be given explicitly
  • Path: Resource path on the server
  • Query: Optional query parameters (e.g., ?token=abc123)

Security Best Practice: Always use wss:// in production. Unencrypted ws:// connections expose all data to eavesdropping and man-in-the-middle attacks, and most browsers block ws:// connections from pages served over HTTPS as mixed content.
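
To make the URI components concrete, here is a small sketch that splits a WebSocket URL using the standard `URL` class and fills in the default port. The helper name `describeWebSocketUrl` and the shape of its return value are assumptions made for this example.

```javascript
// Default ports per scheme, as listed above.
const DEFAULT_PORTS = { 'ws:': 80, 'wss:': 443 };

function describeWebSocketUrl(input) {
  const url = new URL(input);
  if (!(url.protocol in DEFAULT_PORTS)) {
    throw new TypeError(`Not a WebSocket URL: ${input}`);
  }
  return {
    secure: url.protocol === 'wss:',
    host: url.hostname,
    // URL.port is an empty string when the scheme's default port is used.
    port: url.port === '' ? DEFAULT_PORTS[url.protocol] : Number(url.port),
    path: url.pathname,
    query: Object.fromEntries(url.searchParams),
  };
}

console.log(describeWebSocketUrl('wss://example.com/socket?token=abc123'));
```

A check like `secure === true` could be used to refuse ws:// endpoints outside of local development.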

WebSocket Frame Structure

After the handshake, all communication happens through WebSocket frames. Each frame has a minimal header structure for efficiency:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+

Frame Header Components:

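  • FIN (1 bit): 1 if this is the final fragment of a message, 0 if more fragments follow
  • RSV1-3 (3 bits): Reserved for extensions; must be 0 unless a negotiated extension defines them (permessage-deflate uses RSV1)
  • Opcode (4 bits): Frame type (continuation, text, binary, close, ping, pong)
  • MASK (1 bit): Whether the payload is masked (required for client-to-server frames)
  • Payload length (7 bits + extended): Values 0-125 are the length itself; 126 means the next 16 bits hold the length; 127 means the next 64 bits hold it
  • Masking-key (32 bits): Random key used to mask the payload (present only when MASK is 1, i.e. client-to-server)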
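
The header fields above can be decoded with a few bit operations. This is a minimal sketch, assuming the whole header is already in the buffer (it does not handle truncated input); the function name `parseFrameHeader` and its return shape are hypothetical.

```javascript
// Parse the header of one WebSocket frame from a Uint8Array (or Node Buffer).
// Returns the decoded fields plus the offset where the payload starts.
function parseFrameHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fin = (bytes[0] & 0x80) !== 0;    // top bit of byte 0
  const rsv = (bytes[0] & 0x70) >> 4;     // RSV1-3 packed into 3 bits
  const opcode = bytes[0] & 0x0f;         // low 4 bits of byte 0
  const masked = (bytes[1] & 0x80) !== 0; // top bit of byte 1
  let payloadLength = bytes[1] & 0x7f;    // low 7 bits of byte 1
  let offset = 2;

  if (payloadLength === 126) {
    payloadLength = view.getUint16(offset); // 16-bit big-endian length
    offset += 2;
  } else if (payloadLength === 127) {
    // 64-bit big-endian length; Number() is exact only up to 2^53 - 1.
    payloadLength = Number(view.getBigUint64(offset));
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    maskingKey = bytes.subarray(offset, offset + 4);
    offset += 4;
  }
  return { fin, rsv, opcode, masked, payloadLength, maskingKey, payloadOffset: offset };
}

// Unmasked single-frame text message "Hello".
const header = parseFrameHeader(Uint8Array.from([0x81, 0x05, 0x48, 0x65, 0x6c, 0x6c, 0x6f]));
console.log(header.fin, header.opcode, header.payloadLength, header.payloadOffset);
// true 1 5 2
```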

Frame Opcodes

Opcode  | Meaning           | Description
--------|-------------------|----------------------------------
0x0     | Continuation      | Continuation frame
0x1     | Text              | UTF-8 text data
0x2     | Binary            | Binary data
0x8     | Close             | Connection close
0x9     | Ping              | Heartbeat ping
0xA     | Pong              | Heartbeat pong response

Client-to-Server Masking: All frames sent from client to server must be masked with a random 32-bit key, chosen fresh for each frame. This prevents cache poisoning attacks where malicious WebSocket data could be interpreted as HTTP by proxy servers. Server-to-client frames must not be masked.
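
Masking itself is a byte-wise XOR of the payload with the 4-byte key, repeated cyclically. The sketch below uses a hypothetical helper named `applyMask` and a fixed key so the output is reproducible; a real client would pick a random key per frame.

```javascript
// XOR each payload byte with the masking key, cycling through its 4 bytes.
// The same function masks and unmasks, because XOR is its own inverse.
function applyMask(payload, maskingKey) {
  const out = new Uint8Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}

const key = Uint8Array.from([0x37, 0xfa, 0x21, 0x3d]); // fixed for the demo only
const masked = applyMask(new TextEncoder().encode('Hello'), key);
console.log(Array.from(masked, (b) => b.toString(16)).join(' '));
// 7f 9f 4d 51 58
console.log(new TextDecoder().decode(applyMask(masked, key)));
// Hello
```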

Connection Lifecycle

A WebSocket connection goes through several distinct phases:

1. Connecting (CONNECTING)

The initial state when the WebSocket constructor is called but the handshake hasn't completed yet.

const socket = new WebSocket('wss://example.com/socket');
console.log(socket.readyState); // 0 - CONNECTING

2. Open (OPEN)

The handshake succeeded and the connection is established. Data can now be sent and received.

socket.onopen = (event) => {
  console.log('Connection established');
  console.log(socket.readyState); // 1 - OPEN
  socket.send('Hello!');
};

3. Closing (CLOSING)

Either party has initiated the closing handshake but the connection is not yet closed.

socket.close(1000, 'Normal closure');
console.log(socket.readyState); // 2 - CLOSING

4. Closed (CLOSED)

The connection has been closed or could not be established.

socket.onclose = (event) => {
  console.log('Connection closed');
  console.log(socket.readyState); // 3 - CLOSED
  console.log('Code:', event.code);
  console.log('Reason:', event.reason);
};
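
Because send() throws while the socket is still CONNECTING, application code often checks the state first. The following sketch shows one way to do that; `readyStateName` and `safeSend` are hypothetical helpers, and the `socket` argument is assumed to be any object with a WebSocket-like `readyState` and `send`.

```javascript
// readyState values in numeric order, as defined by the WebSocket API.
const READY_STATES = ['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'];

// Translate a numeric readyState into its name for logging.
function readyStateName(readyState) {
  return READY_STATES[readyState] ?? 'UNKNOWN';
}

// Send only when the socket is OPEN (readyState 1).
// Returns true if the data was handed to the socket, false otherwise.
function safeSend(socket, data) {
  if (socket.readyState !== 1) {
    console.warn(`Not sending, socket is ${readyStateName(socket.readyState)}`);
    return false;
  }
  socket.send(data);
  return true;
}
```

A caller could queue messages when `safeSend` returns false and flush the queue from the onopen handler.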

Close Status Codes

When closing a WebSocket connection, a status code and optional reason can be provided:

Code  | Name                  | Description
------|-----------------------|----------------------------------------
1000  | Normal Closure        | Successful operation / normal closure
1001  | Going Away            | Endpoint going away (server down, page navigation)
1002  | Protocol Error        | Endpoint terminating due to protocol error
1003  | Unsupported Data      | Data type cannot be accepted
1006  | Abnormal Closure      | No close frame received (reserved, cannot be sent)
1007  | Invalid Payload       | Inconsistent data (e.g., non-UTF8 in text frame)
1008  | Policy Violation      | Message violates policy
1009  | Message Too Big       | Message too large to process
1010  | Mandatory Extension   | Client expected server to negotiate extension
1011  | Internal Server Error | Unexpected condition prevented fulfillment
1015  | TLS Handshake         | TLS handshake failure (reserved, cannot be sent)

Clean Closure: Always close connections with status code 1000 and a descriptive reason when shutting down normally. This makes debugging easier and lets the other side distinguish an intentional shutdown from a failure.
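
In the browser API, close() accepts only code 1000 or an application-defined code in the range 3000-4999; other values throw. The sketch below captures that rule and formats a close event for logging. The helper names and the lookup table are assumptions for this example, with the names taken from the table above.

```javascript
// Codes a browser script may pass to WebSocket.close().
function isAllowedCloseCode(code) {
  return code === 1000 || (code >= 3000 && code <= 4999);
}

// Subset of the close codes from the table above.
const CLOSE_CODE_NAMES = {
  1000: 'Normal Closure',
  1001: 'Going Away',
  1002: 'Protocol Error',
  1003: 'Unsupported Data',
  1006: 'Abnormal Closure',
  1007: 'Invalid Payload',
  1008: 'Policy Violation',
  1009: 'Message Too Big',
  1010: 'Mandatory Extension',
  1011: 'Internal Server Error',
  1015: 'TLS Handshake',
};

// Format the code and reason of a close event for a log line.
function describeClose(event) {
  const name = CLOSE_CODE_NAMES[event.code] ?? 'Unknown code';
  const reason = event.reason ? `: ${event.reason}` : '';
  return `${event.code} (${name})${reason}`;
}

console.log(describeClose({ code: 1000, reason: 'Normal closure' }));
// 1000 (Normal Closure): Normal closure
```

In an onclose handler, `describeClose(event)` could be logged directly; a code of 1006 there usually means the connection dropped without a close frame.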

WebSocket vs HTTP: Key Differences

Connection Model

HTTP:
- Request → Response → Close
- Short-lived connections
- Client must initiate all communication

WebSocket:
- Handshake → Persistent Connection
- Long-lived bidirectional channel
- Both parties can send at any time

Overhead Comparison

HTTP Request:
- Headers: 200-2000+ bytes per request
- Each request includes full headers

WebSocket Frame:
- Headers: 2-14 bytes per frame
- Connection established once
- Up to ~99% less overhead for small messages

Latency

HTTP: Each message may require a new TCP handshake (if the previous connection was closed) and a TLS handshake (for HTTPS), and it always carries a full set of HTTP headers that must be sent and parsed.

WebSocket: After initial handshake, messages have minimal framing overhead and no additional handshakes.

Exercise: Calculate the bandwidth savings of WebSocket vs HTTP for a chat application that sends 100 short messages per minute:
  • HTTP: ~150KB headers + message data per minute
  • WebSocket: ~1KB framing + message data per minute
  • Savings: ~99% reduction in overhead
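
The arithmetic behind the exercise can be checked with a short script. The per-message figures are assumptions: 1500 bytes of HTTP headers per request (which gives the ~150KB above) and 6 bytes per WebSocket frame (a 2-byte header plus a 4-byte masking key for a client message of at most 125 bytes), which lands a little under the rounded ~1KB figure. The one-time handshake is ignored.

```javascript
const MESSAGES_PER_MINUTE = 100;
const HTTP_HEADER_BYTES = 1500;  // assumed average header size per request
const WS_FRAME_HEADER_BYTES = 6; // 2-byte header + 4-byte masking key

// Total per-minute overhead for a given per-message overhead.
function overheadPerMinute(bytesPerMessage, messages = MESSAGES_PER_MINUTE) {
  return bytesPerMessage * messages;
}

const httpOverhead = overheadPerMinute(HTTP_HEADER_BYTES);   // 150000 bytes
const wsOverhead = overheadPerMinute(WS_FRAME_HEADER_BYTES); // 600 bytes
const savings = 1 - wsOverhead / httpOverhead;

console.log(`HTTP: ${httpOverhead} B, WebSocket: ${wsOverhead} B`);
console.log(`Savings: ${(savings * 100).toFixed(1)}%`);
// Savings: 99.6%
```

Message payload is the same in both cases, so the overall saving shrinks as messages get larger.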

Browser Support

WebSocket is supported by all modern browsers:

Browser          | Version | Year
-----------------|---------|------
Chrome           | 16+     | 2012
Firefox          | 11+     | 2012
Safari           | 7+      | 2013
Edge             | All     | 2015
Opera            | 12.1+   | 2012
Mobile Safari    | 7.1+    | 2014
Chrome Android   | All     | 2012+

Feature Detection: Check for WebSocket support before using:

if ('WebSocket' in window) {
  // WebSocket is supported
} else {
  // Fallback to polling or SSE
}

Protocol Extensions

The WebSocket protocol supports extensions negotiated during the handshake:

// Client request with extension
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

// Server response accepting extension
Sec-WebSocket-Extensions: permessage-deflate

Common Extensions:

  • permessage-deflate: Compresses message payloads using the DEFLATE algorithm (RFC 7692)
  • permessage-deflate; client_no_context_takeover: Same extension with a parameter under which the client does not reuse its compression context across messages, which lowers memory use at the cost of compression ratio

Compression Caution: While compression reduces bandwidth, it increases CPU and memory usage and can add latency. Test performance with your specific workload before enabling it.
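
To inspect what was negotiated, the Sec-WebSocket-Extensions header value can be split into extension names and parameters. This is a simplified sketch with a hypothetical `parseExtensions` helper; it assumes parameter values contain no commas or semicolons inside quotes.

```javascript
// Parse a Sec-WebSocket-Extensions header value into a list of
// { name, params } objects. Parameters without a value are stored as true.
function parseExtensions(headerValue) {
  return headerValue
    .split(',')
    .map((offer) => offer.trim())
    .filter((offer) => offer !== '')
    .map((offer) => {
      const [name, ...rawParams] = offer.split(';').map((part) => part.trim());
      const params = {};
      for (const raw of rawParams) {
        if (raw === '') continue; // tolerate a trailing semicolon
        const [key, value] = raw.split('=');
        params[key.trim()] =
          value === undefined ? true : value.trim().replace(/^"|"$/g, '');
      }
      return { name, params };
    });
}

console.log(JSON.stringify(parseExtensions('permessage-deflate; client_max_window_bits')));
// [{"name":"permessage-deflate","params":{"client_max_window_bits":true}}]
```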

Summary

The WebSocket protocol provides efficient, bidirectional communication through an HTTP upgrade handshake, minimal frame headers, and a persistent TCP connection. Understanding the protocol details (handshake process, frame structure, connection lifecycle, and status codes) is crucial for building robust real-time applications. In the next lesson, we'll explore the native WebSocket API in the browser.