Rate limiting is essential for protecting your application from abuse and ensuring fair resource allocation. While Varnish is best known for caching, its vsthrottle VMOD provides powerful rate limiting capabilities that can protect your backend infrastructure from being overwhelmed by excessive requests, whether from malicious actors or legitimate traffic spikes.

Why Rate Limiting in Varnish?

Rate limiting at the Varnish layer offers several key advantages:
  • Early Termination: Excessive requests are blocked at the edge, never reaching your application servers
  • Resource Protection: Prevents any single client from monopolizing backend resources
  • DDoS Mitigation: Provides a first line of defense against distributed denial-of-service attacks
  • Fair Resource Allocation: Ensures all users get reasonable access to your application
  • Reduced Infrastructure Costs: Less load on backend servers means more efficient resource usage
On Upsun, Varnish Cache combined with rate limiting VCL creates a robust protective layer that keeps your application running smoothly even under attack.

Understanding vsthrottle

The vsthrottle VMOD (Varnish Module) provides the core functionality for rate limiting. It works by tracking request counts per identifier (typically IP address) within sliding time windows. The primary function is vsthrottle.is_denied(), which takes four parameters:
vsthrottle.is_denied(key, limit, period, block_duration)
  • key: Identifier for tracking (usually IP address)
  • limit: Maximum number of requests allowed
  • period: Time window for counting requests
  • block_duration: How long to block after exceeding the limit
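Putting the pieces together, a minimal self-contained configuration might look like the sketch below. The backend address and the 30-requests-per-15-seconds limit are illustrative; the vsthrottle VMOD is typically available through the varnish-modules collection:
import vsthrottle;

backend default {
    # Placeholder backend address - replace with your own.
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Deny once this client IP exceeds 30 requests in a sliding
    # 15-second window; keep denying for a further 15 seconds.
    if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)) {
        return (synth(429, "Too Many Requests"));
    }
}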

General Request Rate Limiting

The most common rate limiting pattern restricts how many requests a client can make within a time window:
sub vcl_recv {
...
    # --- Rate limiting ---
    if (req.url !~ "^/(media|static|banner|admin)/") {
        if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)) {
            # Client has exceeded 30 reqs per 15s.
            # When this happens, block altogether for the next 15s.
            return (synth(429, "Too Many Requests - Please wait 15 seconds"));
        }
    }
...
}

How It Works

  1. Path Exclusion: The first condition if (req.url !~ "^/(media|static|banner|admin)/") excludes certain paths from rate limiting:
    • !~: Negative regex match (does NOT match)
    • Static assets (/media, /static) are excluded because they’re cacheable and typically high-volume
    • Admin paths are excluded (these might have their own authentication/rate limiting)
  2. vsthrottle.is_denied() Function: This is the core rate limiting logic with four parameters:
    • Parameter 1 (req.http.X-Client-IP): The identifier key for rate limiting. Uses the client’s IP address to track individual users.
    • Parameter 2 (30): Request threshold - maximum number of requests allowed
    • Parameter 3 (15s): Time window for counting requests (15 seconds)
    • Parameter 4 (15s): Penalty period - how long to block after exceeding the limit
  3. Rate Limiting Logic: “30 requests per 15 seconds” means:
    • Client can make up to 30 requests in any 15-second rolling window
    • Once exceeded, the client is blocked completely for 15 seconds
    • After the penalty period, the counter resets
  4. HTTP 429 Response: Returns a standard “Too Many Requests” status code with a clear message indicating the wait time.

Why Exclude Static Assets?

Static assets like images, CSS, and JavaScript files are:
  • Usually cached by Varnish (subsequent requests don’t hit backend)
  • High-volume but low-cost to serve
  • Essential for page rendering
Including them in rate limits would punish users for simply loading a page with many assets.

Understanding the Rolling Window

The time window is “rolling,” not fixed. This means:
Time: 0s  → Request count: 1
Time: 1s  → Request count: 5
Time: 5s  → Request count: 15
Time: 10s → Request count: 25
Time: 12s → Request count: 30 (limit reached)
Time: 13s → BLOCKED (penalty period begins)
Time: 28s → Penalty expires, counter resets
The client isn’t tracked per “15-second bucket” but rather “requests in the last 15 seconds at any given moment.”

POST/PUT Rate Limiting

Write operations (POST, PUT) are more expensive than reads and require stricter limits:
sub vcl_recv {
...
    # Only allow a few POST/PUTs per client.
    if ((req.method == "POST" || req.method == "PUT") && (req.url !~ "^/admin")) {
        if (vsthrottle.is_denied("rw" + req.http.X-Client-IP, 5, 10s, 15s)) {
            return (synth(429, "Too Many Requests - Please wait 15 seconds"));
        }
    }
...
}

How Write Rate Limiting Works

  1. Method Check: (req.method == "POST" || req.method == "PUT") targets only write operations, allowing unlimited GET requests.
  2. Admin Exclusion: (req.url !~ "^/admin") excludes admin paths, which may need different rate limiting rules or have their own authentication.
  3. Separate Counter: The key "rw" + req.http.X-Client-IP creates a separate rate limit counter:
    • The "rw" prefix ensures write operations have their own bucket
    • A client could make 30 GET requests AND 5 POST requests in the same period
    • Prevents read traffic from consuming write quota
  4. Stricter Limits: 5, 10s, 15s means:
    • Only 5 write requests per 10-second window
    • Block for 15 seconds after exceeding
    • Much more restrictive than read limits (5 vs 30 requests)
  5. Different Penalty Window: The 10-second measurement window is shorter than the 15-second penalty, meaning clients stay blocked even after their offense window expires.
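To make the separate-bucket behavior concrete, here is a sketch with both checks in one vcl_recv, using the limits from the examples above. Note that in this combined form a POST to a non-excluded path passes through both checks, so it consumes general read quota as well as write quota:
sub vcl_recv {
    # Write bucket: the "rw" prefix gives POST/PUT their own counter.
    if ((req.method == "POST" || req.method == "PUT") && req.url !~ "^/admin") {
        if (vsthrottle.is_denied("rw" + req.http.X-Client-IP, 5, 10s, 15s)) {
            return (synth(429, "Too Many Requests - Please wait 15 seconds"));
        }
    }

    # Read bucket: unprefixed key, counted independently of the write bucket.
    if (req.url !~ "^/(media|static|banner|admin)/") {
        if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)) {
            return (synth(429, "Too Many Requests - Please wait 15 seconds"));
        }
    }
}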

Why Separate Write Limits?

Write operations typically:
  • Consume more server resources (database writes, processing)
  • Have security implications (CSRF, injection attacks)
  • Are used in brute force attacks (login forms, API abuse)
  • Should be rate-limited more aggressively than reads

Real-World Example

Consider a user browsing an e-commerce site:
GET  /products       → General rate limit counter (1/30)
GET  /products/item1 → General rate limit counter (2/30)
POST /cart/add       → Write rate limit counter (1/5)
GET  /cart           → General rate limit counter (3/30)
POST /cart/update    → Write rate limit counter (2/5)
GET  /checkout       → General rate limit counter (4/30)
POST /order/submit   → Write rate limit counter (3/5)
The user can browse freely (30 page views) while write operations remain tightly controlled (5 actions).

Advanced Rate Limiting Patterns

Per-Path Rate Limiting

Different endpoints may need different limits:
sub vcl_recv {
...
    # Aggressive rate limiting for APIs
    if (req.url ~ "^/api/") {
        if (vsthrottle.is_denied("api:" + req.http.X-Client-IP, 100, 60s, 60s)) {
            return (synth(429, "API rate limit exceeded"));
        }
    }

    # Strict limits for login endpoints
    if (req.url ~ "^/(login|auth|signin)") {
        if (vsthrottle.is_denied("login:" + req.http.X-Client-IP, 5, 60s, 300s)) {
            return (synth(429, "Too many login attempts. Please wait 5 minutes."));
        }
    }

    # Authenticated users skip this check entirely;
    # unauthenticated traffic gets stricter limits
    if (!req.http.Authorization) {
        if (vsthrottle.is_denied("unauth:" + req.http.X-Client-IP, 10, 10s, 30s)) {
            return (synth(429, "Please log in for higher rate limits"));
        }
    }
...
}

Key Insights

  1. API Endpoints: Higher limits (100 req/60s) but longer penalty periods for legitimate API usage
  2. Login Protection: Very strict limits (5 req/60s) with long penalties (5 minutes) to prevent brute force
  3. Authentication-Based: Different limits for authenticated vs anonymous users

Combining Rate Limiting with Security Headers

Layer rate limiting with abuse scores and geographic filtering for enhanced protection. The last article in this series covers these headers in more detail:
sub vcl_recv {
...
    # More aggressive rate limiting for suspicious IPs
    if (std.integer(req.http.client-abuse-score, 0) > 15) {
        if (vsthrottle.is_denied("suspicious:" + req.http.X-Client-IP, 10, 30s, 120s)) {
            return (synth(429, "Rate limit exceeded for suspicious traffic"));
        }
    }

    # Stricter limits for certain geographic regions
    if (req.http.client-country ~ "(?i)^(CN|RU)$") {
        if (vsthrottle.is_denied("geo:" + req.http.X-Client-IP, 15, 30s, 60s)) {
            return (synth(429, "Geographic rate limit exceeded"));
        }
    }
...
}

Dynamic Rate Limiting by Risk Level

You can create tiered rate limiting based on risk scores. Important: the vsthrottle.is_denied() function requires literal values for its time parameters, so tiered limits must be implemented as separate conditional blocks:
sub vcl_recv {
...
    # Tiered rate limiting by abuse score
    if (std.integer(req.http.client-abuse-score, 0) > 50) {
        # Very high abuse: 5 req per 60 seconds
        if (vsthrottle.is_denied("high:" + req.http.X-Client-IP, 5, 60s, 120s)) {
            return (synth(429, "Severe rate limit for high-risk IP"));
        }
    } elsif (std.integer(req.http.client-abuse-score, 0) > 25) {
        # Medium abuse: 15 req per 30 seconds
        if (vsthrottle.is_denied("med:" + req.http.X-Client-IP, 15, 30s, 60s)) {
            return (synth(429, "Rate limit exceeded for suspicious IP"));
        }
    } else {
        # Normal traffic: 30 req per 15 seconds
        if (vsthrottle.is_denied("normal:" + req.http.X-Client-IP, 30, 15s, 15s)) {
            return (synth(429, "Too Many Requests"));
        }
    }
...
}

Tuning Rate Limits

Adjust parameters based on your application’s needs and traffic patterns:

Finding the Right Limits

  1. Baseline Analysis: Monitor your application logs to understand typical request patterns:
    • Average requests per second per user
    • Peak traffic during normal usage
    • Page load request counts (including assets)
  2. Conservative Start: Begin with generous limits and tighten based on observed abuse:
    # Start here
    vsthrottle.is_denied(req.http.X-Client-IP, 100, 60s, 30s)
    
    # Adjust down if seeing abuse
    vsthrottle.is_denied(req.http.X-Client-IP, 50, 60s, 60s)
    
    # Final tuned value
    vsthrottle.is_denied(req.http.X-Client-IP, 30, 30s, 30s)
    
  3. Test Legitimate Use Cases: Ensure normal user workflows don’t trigger limits:
    • Single page application (SPA) initial load
    • Form submissions with multiple validations
    • Mobile app burst traffic on app open

Monitoring and Alerting

Track 429 responses in your logs to identify:
  • False positives (legitimate users being blocked)
  • Attack patterns (sudden spikes in rate-limited requests)
  • Effectiveness (reduction in backend load during attacks)
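One way to make denied requests easy to spot is to emit a custom log record whenever a request is rate-limited. std.log() from the standard vmod_std writes a VCL_Log record into the Varnish shared memory log, which you can then filter with varnishlog. The key and limits below are illustrative:
import std;
import vsthrottle;

sub vcl_recv {
    if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)) {
        # Emits a "VCL_Log: rate-limited ..." record into the shared
        # memory log for monitoring and alerting.
        std.log("rate-limited " + req.http.X-Client-IP + " " + req.url);
        return (synth(429, "Too Many Requests"));
    }
}
You can then watch for these records with a VSL query such as varnishlog -q 'VCL_Log ~ "rate-limited"'.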

Rate Limiting Best Practices

  1. Use X-Client-IP: The X-Client-IP header typically contains the true client IP, even behind proxies or load balancers on Upsun.
  2. Namespace Your Keys: Use prefixes like "rw", "api:", or "login:" to create separate rate limit buckets for different types of traffic.
  3. Monitor and Adjust: Start with conservative limits and relax them based on legitimate traffic patterns.
  4. Clear Error Messages: Tell users how long to wait and consider including support contact information:
    return (synth(429, "Too many requests. Please wait 60 seconds or contact support@example.com"));
    
  5. Document Your Limits: Provide API documentation that clearly states rate limits so developers can code accordingly.
  6. Consider Business Logic: Don’t rate-limit so aggressively that it impacts revenue:
    • E-commerce checkout flows should be permissive
    • Payment endpoints need special consideration
    • Customer support areas may need higher limits
  7. Graceful Degradation: Consider returning cached content instead of blocking entirely. Note that obj.ttl and obj.grace are only readable in vcl_hit, so this check belongs there rather than in vcl_recv:
    sub vcl_hit {
        if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)) {
            # Serve stale cached content if available, otherwise block
            if (obj.ttl + obj.grace > 0s) {
                return (deliver);
            }
            return (synth(429, "Too Many Requests"));
        }
    }
    
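Related to point 4 above: rather than only embedding the wait time in the message text, you can also advertise it to clients in a standard Retry-After header from vcl_synth. The 15-second value is illustrative and should match your block_duration:
sub vcl_synth {
    if (resp.status == 429) {
        # Well-behaved clients and HTTP libraries honor
        # Retry-After (value in seconds).
        set resp.http.Retry-After = "15";
    }
    return (deliver);
}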

Common Pitfalls to Avoid

1. Rate Limiting Static Assets

Problem: Including cached static assets in rate limits punishes normal page loads. Solution: Exclude paths that serve cached content:
if (req.url !~ "^/(static|media|assets|cdn)/") {
    # Apply rate limiting
}

2. Shared IP Addresses

Problem: Corporate offices, schools, and mobile carriers often share IP addresses across many users. Solution: Consider using session cookies or API keys for more accurate per-user tracking, or use more permissive limits:
# More permissive for common shared IP scenarios
if (req.http.X-Client-IP ~ "^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)") {
    # Use higher limits
    if (vsthrottle.is_denied("corp:" + req.http.X-Client-IP, 100, 30s, 15s)) {
        return (synth(429, "Too Many Requests"));
    }
}

3. Inconsistent Penalty Periods

Problem: Short penalty periods may not deter attackers. Solution: Use exponentially increasing penalties or ensure penalty periods exceed measurement windows:
# Good: Penalty (60s) > Measurement (30s)
vsthrottle.is_denied(req.http.X-Client-IP, 30, 30s, 60s)

# Less effective: Penalty (15s) = Measurement (15s)
vsthrottle.is_denied(req.http.X-Client-IP, 30, 15s, 15s)
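vsthrottle has no built-in exponential backoff, but you can approximate escalating penalties by layering a second, long-window counter in front of the normal one. Because the long-window check runs first, a client that keeps sending traffic while short-blocked continues to accumulate against it and eventually earns a much longer block. All numbers here are illustrative:
# Chronic offenders: more than 100 requests in 5 minutes
# earns a 10-minute block.
if (vsthrottle.is_denied("repeat:" + req.http.X-Client-IP, 100, 300s, 600s)) {
    return (synth(429, "Extended block for repeated abuse"));
}

# Normal short-window limit, with a penalty longer than the window.
if (vsthrottle.is_denied(req.http.X-Client-IP, 30, 30s, 60s)) {
    return (synth(429, "Too Many Requests"));
}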

4. Not Excluding Health Checks

Problem: Load balancer health checks can consume rate limit quotas. Solution: Exclude monitoring endpoints:
if (req.url ~ "^/(health|status|ping)$") {
    return (pass);  # Skip rate limiting entirely
}

Conclusion

Effective rate limiting with vsthrottle provides robust protection for your Upsun applications. By implementing tiered limits for different request types and carefully tuning your thresholds, you can ensure fair resource allocation while protecting against abuse and attacks. Start with conservative limits, monitor your traffic patterns, and adjust based on real-world behavior. In our next article, Varnish 103: Cache Optimization, we’ll explore URL normalization and query string handling techniques to dramatically improve your cache hit ratios and reduce backend load.
Last modified on April 14, 2026