Networking Best Practices
Knowing how to open a socket or fire an HTTP request is only half the job. Professional networking code must also survive the realities of distributed systems: slow servers, transient failures, exhausted connection pools, and man-in-the-middle attacks. This lesson distills the four pillars that separate production-grade network code from tutorial code: timeouts, retries with back-off, connection reuse, and TLS fundamentals.
1. Timeouts — Always Set Them
Every network call can block forever unless you explicitly bound it. Java's HttpClient exposes two independent timeout types:
- Connect timeout: how long to wait for the TCP handshake to complete. Set on the HttpClient itself.
- Request timeout: how long the entire request/response cycle may take. Set per HttpRequest.
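If the connect timeout fires you get an HttpConnectTimeoutException; if the request timeout fires you get an HttpTimeoutException. Because HttpConnectTimeoutException is a subclass of HttpTimeoutException, catch the more specific type first if you want to handle them separately. Either way, a timeout is a recoverable signal, not a fatal bug.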
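A minimal sketch of the two timeouts, assuming illustrative values of 5 and 10 seconds and a hypothetical TimeoutExample class. With java.net.http, a connect timeout surfaces as HttpConnectTimeoutException, a subclass of HttpTimeoutException, so its catch clause has to come first.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpConnectTimeoutException;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

// Hypothetical helper class; names and durations are illustrative assumptions.
final class TimeoutExample {

    // Connect timeout: configured once, on the client.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
    }

    // Request timeout: configured on every request.
    static HttpRequest newRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    // Returns the response body, or a short label naming the timeout that fired.
    static String fetch(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        try {
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (HttpConnectTimeoutException e) {
            // Must be caught before HttpTimeoutException, its superclass.
            return "connect-timeout";
        } catch (HttpTimeoutException e) {
            return "request-timeout";
        }
    }
}
```

In real code the two catch blocks would typically log the failure and hand control to a retry policy rather than return a label.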
2. Retries with Exponential Back-off
Transient failures — a momentary DNS hiccup, a 503 Service Unavailable, a dropped TCP connection — are normal in distributed systems. The correct response is a bounded retry loop with exponential back-off and jitter.
Why exponential back-off? If a server is overloaded and all clients retry simultaneously at a fixed interval, they create a thundering herd that makes the situation worse. Doubling the wait time with each attempt spreads the load out. Adding random jitter prevents synchronised retry storms even when clients started at the same time.
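The doubling-plus-jitter idea can be sketched as follows. This is one possible variant ("full jitter", where the wait is drawn uniformly between zero and the exponential ceiling), not the only valid one; the Backoff class, its method names, and the choice to retry on IOException are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; class and method names are illustrative.
final class Backoff {

    // Upper bound of the wait before retry number `attempt` (0-based):
    // baseMillis doubled `attempt` times, capped at maxMillis.
    static long ceilingMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Full jitter: a uniformly random wait in [0, ceiling].
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long ceiling = ceilingMillis(attempt, baseMillis, maxMillis);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    // Bounded retry loop: at most maxAttempts calls, sleeping between failures.
    // Only IOException is treated as transient here; that is an assumption.
    static <T> T retry(Callable<T> operation, int maxAttempts,
                       long baseMillis, long maxMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return operation.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts - 1) {
                    throw e; // retry budget exhausted
                }
                Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
            }
        }
    }
}
```

With a base of 100 ms and a cap of 5000 ms, the ceilings for successive attempts are 100, 200, 400, 800 ms and so on until the cap is reached; each actual wait is a random value below its ceiling.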
Retry only idempotent requests. A GET or DELETE is safe to repeat. A POST that creates a resource is not: replaying it may duplicate data. For non-idempotent calls, investigate idempotency keys (send a unique Idempotency-Key header the server can use to deduplicate).
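A small sketch of how a retry policy might decide whether a request is safe to replay. The IdempotencyExample class is hypothetical, and treating a request that carries an Idempotency-Key header as retryable assumes the target server actually honours that header, which has to be confirmed per API.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper; names are illustrative.
final class IdempotencyExample {

    // Methods the HTTP specification defines as idempotent.
    private static final Set<String> IDEMPOTENT_METHODS =
            Set.of("GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE");

    // Safe when the method is idempotent, or when the request carries a key
    // (assumption: the server deduplicates on Idempotency-Key).
    static boolean isSafeToRetry(HttpRequest request) {
        return IDEMPOTENT_METHODS.contains(request.method())
                || request.headers().firstValue("Idempotency-Key").isPresent();
    }

    // Generate the key once per logical operation and reuse it on every retry.
    static String newIdempotencyKey() {
        return UUID.randomUUID().toString();
    }

    static HttpRequest postWithKey(String url, String jsonBody, String idempotencyKey) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .header("Idempotency-Key", idempotencyKey)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The key point is that the key is created outside the retry loop: a fresh key per attempt would defeat deduplication.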
Also honour the Retry-After response header when it is present — servers that return 429 Too Many Requests often tell you exactly how long to wait. Parse that header and use it instead of your back-off value.
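Retry-After can carry either a number of seconds or an HTTP date, so a parser needs to handle both. The sketch below is one way to do that; the RetryAfter class and its method names are hypothetical, and unparseable values are simply treated as absent so the caller can fall back to its own back-off delay.

```java
import java.net.http.HttpResponse;
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper; names are illustrative.
final class RetryAfter {

    // Parses a Retry-After value relative to `now`.
    // Returns empty when the value is neither delay-seconds nor an HTTP date.
    static Optional<Duration> parse(String value, Instant now) {
        String trimmed = value.trim();
        try {
            long seconds = Long.parseLong(trimmed);
            return seconds >= 0 ? Optional.of(Duration.ofSeconds(seconds)) : Optional.empty();
        } catch (NumberFormatException notANumber) {
            // Fall through and try the date form.
        }
        try {
            Instant when = ZonedDateTime
                    .parse(trimmed, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            Duration wait = Duration.between(now, when);
            // A date in the past means "retry now".
            return Optional.of(wait.isNegative() ? Duration.ZERO : wait);
        } catch (DateTimeParseException notADate) {
            return Optional.empty();
        }
    }

    // Convenience overload reading the header from a response.
    static Optional<Duration> from(HttpResponse<?> response) {
        return response.headers().firstValue("Retry-After")
                .flatMap(v -> parse(v, Instant.now()));
    }
}
```

A caller would typically write something like `RetryAfter.from(response).orElse(backoffDelay)`, possibly capping the server-supplied value so a misbehaving server cannot stall the client indefinitely.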
3. Connection Reuse — The HttpClient Connection Pool
Opening a TCP connection is expensive: it requires a three-way handshake, and if TLS is involved, an additional 1–2 round trips for the handshake. Creating a new HttpClient per request throws that investment away every time.
Java's HttpClient maintains an internal connection pool automatically when you reuse the same instance. The pool supports HTTP/1.1 keep-alive and HTTP/2 multiplexing (many requests over a single connection).
Never create an HttpClient inside a loop or per-request method. Each instance starts with an empty pool, so you pay the full connection overhead on every call and you may exhaust file descriptors under load.
In frameworks like Spring Boot you would inject a shared HttpClient bean (or use RestClient / WebClient which manage their own pools). The principle is the same: one pool per downstream service.
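Outside a DI container, the same principle can be sketched with a single static instance. The SharedHttp class name, the 5-second connect timeout, and the explicit HTTP/2 preference are illustrative assumptions.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical holder for one application-wide client.
final class SharedHttp {

    // Built once; every caller shares this instance and therefore its pool.
    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_2) // falls back to HTTP/1.1 if the server lacks HTTP/2
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    private SharedHttp() {
        // No instances: this class only exposes the shared client.
    }

    static HttpClient client() {
        return CLIENT;
    }
}
```

HttpClient is immutable once built and can be used from multiple threads, so handing the same instance to every caller is safe.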
4. TLS Basics — Trust, Certificates, and Hostname Verification
Java's HttpClient enforces TLS by default for https:// URLs. Understanding what happens under the hood helps you handle the cases where the defaults need tuning.
- Trust store: the JVM ships with a cacerts trust store containing well-known CA certificates. When your client connects, the server presents its certificate chain; the JVM verifies it chains up to a trusted CA and that the certificate has not expired.
- Hostname verification: after trust is established, the JVM checks that the hostname in the URL matches the Subject Alternative Name (or CN) in the certificate. This prevents a valid certificate for evil.com being presented for bank.com.
- TLS version: by default the JVM negotiates the highest mutually supported version. Java 17 supports TLS 1.2 and TLS 1.3; TLS 1.0 and 1.1 are disabled by default. Never downgrade this in production.
The most common TLS problem in internal or dev environments is a self-signed certificate. The correct fix is to add that certificate to a custom trust store — not to disable verification.
Never disable certificate validation. Calling a trustAllCerts()-style helper or installing a no-op TrustManager removes every TLS guarantee: an attacker can intercept the connection with any certificate. This is occasionally seen in test code. Keep it strictly in tests and behind a flag, never in a production path.
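The trust-store approach might look like the sketch below. The CustomTrust class, the PKCS12 store type, and the file path and password passed in by the caller are all assumptions. Note that an SSLContext built this way trusts only the certificates in the supplied store, not the JVM's default cacerts, so it suits a client dedicated to internal services.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.http.HttpClient;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.time.Duration;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper; names, store type, and timeout are illustrative.
final class CustomTrust {

    // Loads a PKCS12 trust store that contains the internal or self-signed certificate.
    static KeyStore loadTrustStore(Path path, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore store = KeyStore.getInstance("PKCS12");
        try (InputStream in = Files.newInputStream(path)) {
            store.load(in, password);
        }
        return store;
    }

    // Builds an SSLContext whose trust decisions come from the given store.
    // Chain validation and hostname verification stay enabled.
    static SSLContext sslContextFor(KeyStore trustStore) throws GeneralSecurityException {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        factory.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, factory.getTrustManagers(), null);
        return context;
    }

    static HttpClient clientTrusting(KeyStore trustStore) throws GeneralSecurityException {
        return HttpClient.newBuilder()
                .sslContext(sslContextFor(trustStore))
                .connectTimeout(Duration.ofSeconds(5))
                .build();
    }
}
```

A trust store of this kind is commonly created with the JDK's keytool by importing the certificate file; the exact command depends on the environment.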
Putting It All Together
Production-grade networking combines all four concerns simultaneously. The pattern is: one shared, pooled HttpClient with a connect timeout; requests with individual timeouts; a retry helper that applies exponential back-off only for transient errors; and TLS with proper certificate validation.
- Connect timeout set on the client.
- Request timeout set on every request.
- Retry loop with exponential back-off and jitter — only for idempotent, retryable status codes.
- HttpClient created once and reused (or obtained from a DI container).
- TLS certificate validation left enabled; custom CAs added to a trust store, not bypassed.
- Structured logging of attempt counts, status codes, and elapsed time for observability.
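The checklist can be combined into a single helper, sketched below under several assumptions: the ResilientHttp class name, the set of retryable status codes (429, 502, 503, 504), the 200 ms base and 5 s cap, and the use of System.Logger are illustrative choices, and the caller is expected to set a timeout on each request it passes in.

```java
import java.io.IOException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; names, status codes, and delays are illustrative.
final class ResilientHttp {
    private static final System.Logger LOG = System.getLogger("ResilientHttp");
    private static final Set<String> IDEMPOTENT_METHODS =
            Set.of("GET", "HEAD", "PUT", "DELETE", "OPTIONS");
    private static final Set<Integer> RETRYABLE_STATUS = Set.of(429, 502, 503, 504);

    // One shared client: connect timeout set here, pool reused,
    // default TLS validation left untouched.
    static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    static boolean shouldRetry(String method, int status) {
        return IDEMPOTENT_METHODS.contains(method) && RETRYABLE_STATUS.contains(status);
    }

    // Exponential ceiling with full jitter.
    static long jitteredDelayMillis(int attempt, long baseMillis, long maxMillis) {
        long ceiling = Math.min(maxMillis, baseMillis * (1L << Math.min(attempt, 20)));
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    // Sends the request up to maxAttempts times.
    // The request is expected to carry its own timeout (HttpRequest.Builder.timeout).
    static HttpResponse<String> send(HttpRequest request, int maxAttempts)
            throws IOException, InterruptedException {
        boolean idempotent = IDEMPOTENT_METHODS.contains(request.method());
        for (int attempt = 0; ; attempt++) {
            boolean lastAttempt = attempt >= maxAttempts - 1;
            long start = System.nanoTime();
            try {
                HttpResponse<String> response =
                        CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                LOG.log(System.Logger.Level.INFO, "attempt={0} status={1} elapsedMs={2}",
                        attempt + 1, response.statusCode(), elapsedMs);
                if (lastAttempt || !shouldRetry(request.method(), response.statusCode())) {
                    return response;
                }
            } catch (IOException e) { // includes both timeout exceptions
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                LOG.log(System.Logger.Level.WARNING, "attempt={0} error={1} elapsedMs={2}",
                        attempt + 1, e.getClass().getSimpleName(), elapsedMs);
                if (lastAttempt || !idempotent) {
                    throw e;
                }
            }
            Thread.sleep(jitteredDelayMillis(attempt, 200, 5_000));
        }
    }
}
```

A fuller version would also honour Retry-After when present and might route the log calls to a structured logging backend; both are left out here to keep the sketch short.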
Summary
Always set both a connect timeout and a request timeout — they guard against different failure modes. Use exponential back-off with jitter for retries, and only retry idempotent, transient failures. Reuse a single HttpClient instance to leverage its connection pool and avoid the overhead of repeated TCP and TLS handshakes. Understand the TLS trust chain and hostname verification that Java enforces by default; add custom CAs to a trust store rather than disabling validation. These practices make your networking code resilient, efficient, and secure.