Web Application Firewalls (WAFs) have a reputation problem. They sit in front of your application, inspect every request, and add latency. The security folks love them. The performance folks tolerate them. Everyone accepts this tradeoff because what’s the alternative? Here’s the thing: that tradeoff isn’t inevitable. The latency comes from how most WAFs work, not from what they do.

The buffering problem

Traditional WAFs follow a simple pattern:
  1. Receive the entire request
  2. Validate it
  3. If valid, forward it to the backend
  4. If invalid, reject it
This means your backend sits idle while the WAF buffers and inspects. For a small JSON payload, you might not notice. For a large file upload, you will. Your backend doesn’t see anything until the WAF finishes its job. The latency adds up.
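The four steps above can be sketched in a few lines. This is an illustrative toy, not any real WAF's code: `waf_handle` stands in for the WAF, `forward` for the backend call, and "is it valid JSON?" for the validation rules. The point is the strict sequencing: the backend sees nothing until validation finishes.

```python
import json

def waf_handle(request_body: bytes, forward) -> bool:
    """Traditional buffering WAF: receive the whole body, validate it,
    and only then forward. Total latency = receive + validate + forward,
    strictly in sequence."""
    try:
        json.loads(request_body)      # step 2: validate (here: is it JSON?)
    except ValueError:
        return False                  # step 4: reject invalid requests
    forward(request_body)             # step 3: forward only after validation
    return True

received = []
assert waf_handle(b'{"ok": true}', received.append) is True
assert waf_handle(b"not json", received.append) is False
assert received == [b'{"ok": true}']  # the backend never saw the bad request
```

Note that `forward` isn't called until the entire body has been read and checked; that serialization is exactly where the latency comes from.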

What if we validated while streaming?

Upsun runs a technical WAF as part of its infrastructure. “Technical” because it focuses on protocol-level validation rather than application-layer attacks like SQL injection: it checks that your JSON is actually JSON, that your request headers make sense, and that the HTTP protocol is being followed correctly.

The key insight: you don’t need the entire request to start validating it, and you don’t need validation to finish before you start forwarding. Upsun’s edge layer forwards incoming requests to both the WAF and your backend simultaneously. The backend receives the data as it arrives; the WAF validates it at the same time. If validation passes, the backend already has everything it needs. If validation fails, the edge layer cuts the connection before the backend processes anything.

The trick is an “end-of-stream marker”: the backend receives data but doesn’t start processing until that marker tells it the request is complete, and the edge layer only sends the marker after the WAF confirms validation passed.
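Here is a small sketch of that dual-forwarding pattern. Everything is hypothetical scaffolding (`Backend`, `edge_forward`, `is_json` are made up for illustration), and where the real edge layer streams to the WAF and backend concurrently, this sketch interleaves them in a single loop:

```python
import json

class Backend:
    """Toy backend: buffers incoming data, processes only on end-of-stream."""
    def __init__(self):
        self.buffer = b""
        self.processed = None
    def feed(self, chunk: bytes):
        self.buffer += chunk           # data arrives as it is streamed
    def end_of_stream(self):
        self.processed = self.buffer   # processing starts only now
    def abort(self):
        self.buffer = b""              # connection cut: discard partial data

def is_json(body: bytes) -> bool:
    try:
        json.loads(body)
        return True
    except ValueError:
        return False

def edge_forward(chunks, validate, backend) -> bool:
    """Stream each chunk to the backend while the WAF inspects the same
    bytes; send the end-of-stream signal only if validation passes."""
    seen = b""
    for chunk in chunks:
        backend.feed(chunk)            # backend gets the data immediately
        seen += chunk                  # the WAF sees the same bytes
    if validate(seen):
        backend.end_of_stream()        # request complete: backend may process
        return True
    backend.abort()                    # validation failed: never completes
    return False

good = Backend()
assert edge_forward([b'{"a": ', b"1}"], is_json, good) is True
assert good.processed == b'{"a": 1}'

bad = Backend()
assert edge_forward([b"{broken"], is_json, bad) is False
assert bad.processed is None           # backend never started processing
```

The invariant to notice: `processed` is only ever set via `end_of_stream`, so a request the WAF rejects can never reach the backend's processing stage, even though its bytes were already delivered.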

HTTP/2 makes this elegant

HTTP/2 works in frames, and those frames can carry an explicit end-of-stream flag. This is exactly what we need.

The edge layer receives frames from the client and immediately forwards them to both the WAF and the backend, which process the incoming data in parallel. When the client finishes sending, the WAF completes validation. If everything checks out, the edge layer forwards the end-of-stream frame to the backend, which already has all the data buffered and starts processing immediately. If validation fails? The edge layer closes the connection to the backend. The backend never received the end-of-stream marker, so it discards the partial request.

Most requests are valid. This means most of the time, by the time validation completes, your backend is ready to go. The latency tax drops to nearly zero.
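For the curious, the end-of-stream signal is concrete and easy to spot on the wire. Per the HTTP/2 spec (RFC 9113), every frame starts with a fixed 9-byte header, and DATA frames (type `0x0`) carry an `END_STREAM` flag (`0x1`). The parsing below follows the spec; the helper names are mine:

```python
DATA = 0x0         # DATA frame type (RFC 9113 §6.1)
END_STREAM = 0x1   # END_STREAM flag bit (also valid on HEADERS frames)

def parse_frame_header(header: bytes):
    """Parse the fixed 9-byte HTTP/2 frame header: 24-bit payload length,
    8-bit type, 8-bit flags, then a 31-bit stream identifier."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

def is_end_of_stream(header: bytes) -> bool:
    """Is this the final DATA frame of the request body?"""
    _, frame_type, flags, _ = parse_frame_header(header)
    return frame_type == DATA and bool(flags & END_STREAM)

# A DATA frame on stream 1 with a 5-byte payload and END_STREAM set:
final = (5).to_bytes(3, "big") + bytes([DATA, END_STREAM]) + (1).to_bytes(4, "big")
assert parse_frame_header(final) == (5, DATA, END_STREAM, 1)
assert is_end_of_stream(final)
```

Withholding that one flag byte is all the edge layer needs to do to keep the backend from processing a request the WAF hasn't cleared yet.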

HTTP/1.1 is trickier

HTTP/1.1 doesn’t have built-in stream markers, but chunked transfer encoding provides something similar. With chunked requests, you send data in pieces, and a zero-length chunk signals the end. So the same pattern works: stream chunks to the backend, validate in parallel, send the final chunk only after validation passes. But there’s a catch.
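Chunked encoding is simple enough to show directly: each chunk is its size in hex, a CRLF, the payload, and another CRLF, with `0\r\n\r\n` terminating the body (RFC 9112). The sketch below applies the same withhold-the-marker idea to it; `stream_chunked` and its trivial validator are illustrative, not production code:

```python
def encode_chunk(data: bytes) -> bytes:
    """One HTTP/1.1 chunk: size in hex, CRLF, payload, CRLF."""
    return f"{len(data):x}".encode() + b"\r\n" + data + b"\r\n"

FINAL_CHUNK = b"0\r\n\r\n"   # zero-length chunk: end of the request body

def stream_chunked(chunks, validate, send) -> bool:
    """Forward chunks as they arrive; send the terminating zero-length
    chunk only once the full body has passed validation."""
    body = b""
    for data in chunks:
        send(encode_chunk(data))   # backend receives this piece right away
        body += data               # the WAF accumulates the same bytes
    if validate(body):
        send(FINAL_CHUNK)          # now the backend knows it has everything
        return True
    return False                   # no final chunk: the request never completes

sent = []
ok = stream_chunked([b'{"x": ', b"1}"], lambda b: b.startswith(b"{"), sent.append)
assert ok and sent[-1] == FINAL_CHUNK
assert sent[0] == b'6\r\n{"x": \r\n'   # 6 payload bytes, size given in hex
```

A backend that never receives `0\r\n\r\n` treats the body as incomplete, which is what lets the edge layer abandon a failed request cleanly.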

PHP-FPM doesn’t support chunked requests

If your backend is PHP-FPM, you’re out of luck. It doesn’t understand chunked transfer encoding. It can’t buffer incoming chunks and wait for that final marker before processing. It just doesn’t. The workaround: Nginx buffers the request before passing it to PHP-FPM. This means PHP applications don’t get the streaming benefit. They still get the security validation, but they pay the latency cost of buffering. For most PHP applications, this is fine. Typical web requests are small, and the buffering overhead is minimal.
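The buffering workaround amounts to collapsing the stream back into one sized request. Sketched in Python for symmetry with the examples above (the real work is done by Nginx, and `buffer_for_fpm`, `validate`, and `forward` are invented names):

```python
def buffer_for_fpm(chunks, validate, forward) -> bool:
    """PHP-FPM can't parse chunked bodies, so the proxy buffers the whole
    request and re-sends it with an explicit Content-Length header
    instead of Transfer-Encoding: chunked."""
    body = b"".join(chunks)        # buffer everything before forwarding
    if not validate(body):
        return False               # reject without touching the backend
    headers = {"Content-Length": str(len(body))}
    forward(headers, body)         # FPM sees one complete, sized request
    return True

calls = []
assert buffer_for_fpm([b"a=1&", b"b=2"], lambda b: b"=" in b,
                      lambda h, b: calls.append((h, b))) is True
assert calls == [({"Content-Length": "7"}, b"a=1&b=2")]
```

Compare this with the streaming sketch earlier: `forward` fires only after the full join, which is precisely the latency cost the article describes.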

But what about WebSockets?

Some applications genuinely need streaming. WebSocket connections, for example, are long-lived and bidirectional. Buffering them would break everything. The good news: if your backend isn’t PHP-FPM, the streaming WAF works fine. Node.js, Go, Python, and most other runtimes handle chunked requests without issues. You get both streaming and WAF validation. Upsun provides configuration options to disable request buffering for routes that need it.
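As a rough sketch of what disabling buffering per route can look like in an app configuration, something along these lines. The key names here are illustrative, from memory of the platform's config format, not authoritative; check Upsun's documentation for the exact syntax for your runtime:

```yaml
# Hypothetical app-config fragment: turn off request buffering for a
# WebSocket route so frames stream straight through to the application.
web:
    locations:
        "/ws":
            passthru: true
            request_buffering:
                enabled: false
```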

The takeaway

WAF latency isn’t a law of physics. It’s an implementation choice. By streaming requests to the backend while validating in parallel, you can have security without the performance penalty. The approach works best with HTTP/2 or HTTP/1.1 chunked encoding. PHP-FPM applications need buffering as a workaround, but that’s a limitation of the runtime, not the architecture. Most of this happens invisibly. Your requests get validated, your backend processes them, and you don’t notice the WAF sitting in the middle. Which is exactly how security infrastructure should work.
Last modified on April 14, 2026