34. Rate Limiting

Protecting APIs from abuse with rate limiting policies, fixed and sliding window algorithms, and proper error handling

About this chapter

Protect your API from abuse and ensure fair resource usage by implementing rate limiting with appropriate algorithms, tier-based policies, and proper error responses.

  • Rate limiting fundamentals: Controlling requests per time period per client
  • Attack prevention: Protecting against malicious clients and resource exhaustion
  • Fair usage policies: Allocating resources fairly across different API tiers
  • Fixed window algorithm: Simple, but allows bursts at window boundaries
  • Sliding window algorithm: More accurate rate limiting with better distribution
  • Implementation and errors: HTTP 429 Too Many Requests responses

Learning outcomes:

  • Understand rate limiting benefits and strategies
  • Choose between fixed and sliding window algorithms
  • Implement rate limiting middleware in ASP.NET Core
  • Create tiered rate limiting policies (free, premium, internal)
  • Return proper HTTP 429 responses with retry information
  • Track and monitor rate limit violations

34.1 Understanding Rate Limiting

Rate limiting = Controlling how many requests a client can make in a time period.

Without rate limiting:

Attacker writes script: for (int i = 0; i < 1000000; i++) { api.GetCommands(); }
    ↓
API processes all 1,000,000 requests
    ↓
Database connection pool exhausted
    ↓
Legitimate users get "Service Unavailable" errors

With rate limiting:

Client #1: 100 requests in 1 minute (allowed)
Client #2: 500 requests in 1 minute (allowed)
Client #3: 100,000 requests in 1 minute (blocked after limit)

Why rate limit?

  • Prevent abuse: Stop malicious users from overwhelming the API
  • Fair usage: Ensure all users get fair access to resources
  • Cost control: API calls cost money (database, bandwidth, compute)
  • SLA protection: Prevent one bad actor from affecting everyone
  • Bug detection: Catch client bugs (such as accidental request loops) early

Common rate limit strategies:

  • Public API: 1,000 requests per hour per IP
  • Free tier: 100 requests per hour per API key
  • Premium tier: 10,000 requests per hour per API key
  • Internal service: No limit (trust your own code)

34.2 Fixed Window vs Sliding Window

Two algorithms to track rate limits:

Fixed Window

How it works:

Time: 0:00 - 1:00
Client requests: 1, 2, 3, ..., 100 (allowed)

Time: 1:00 - 2:00  (window resets)
Requests reset to 0
Client can make 100 more requests

Implementation:

Request at 0:30 → Count = 1
Request at 0:45 → Count = 2
Request at 1:05 → Count resets to 1 (new window started)
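
In code, a fixed window boils down to one counter per client per window. A minimal in-memory sketch of the idea (this is an illustration, not the built-in .NET limiter):

// Minimal fixed window counter: one count per client, reset each window
public class FixedWindowCounter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, (long WindowId, int Count)> _counters = new();
    private readonly object _lock = new();

    public FixedWindowCounter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public bool TryAcquire(string clientId)
    {
        // Identify the current window (e.g. "the 10:00-11:00 hour") by an integer ID
        var windowId = DateTimeOffset.UtcNow.Ticks / _window.Ticks;

        lock (_lock)
        {
            if (!_counters.TryGetValue(clientId, out var entry) || entry.WindowId != windowId)
            {
                entry = (windowId, 0);  // New window: count starts over
            }

            if (entry.Count >= _limit)
            {
                return false;  // Limit already reached in this window
            }

            _counters[clientId] = (windowId, entry.Count + 1);
            return true;
        }
    }
}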

Advantage: Simple, fast (no complex calculations)

Disadvantage: “Window boundary burst”

Time: 0:59:50 - Window about to reset
Client makes 100 requests (allowed)

Time: 1:00:10 - Window has reset
Client makes 100 more requests (allowed)

Total in 20 seconds: 200 requests (twice the intended 100/hour rate, even though neither window exceeded its own limit)

Sliding Window

How it works:

Request at 0:30 → Keep until 1:30
Request at 0:45 → Keep until 1:45
Request at 1:05 → Keep until 2:05
...

When a request comes in at 1:30:
Remove all requests older than one hour (everything before 0:30)
Count the remaining requests
If the count is below the limit, allow the request
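
What the description above keeps per request is the "sliding log" variant: store a timestamp for every request and prune the ones older than the window. A minimal sketch of that idea (the built-in .NET limiter in 34.4 approximates it with fixed segments instead):

// Minimal sliding log: keep request timestamps, prune anything older than the window
public class SlidingLogCounter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, Queue<DateTimeOffset>> _requests = new();
    private readonly object _lock = new();

    public SlidingLogCounter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public bool TryAcquire(string clientId)
    {
        var now = DateTimeOffset.UtcNow;

        lock (_lock)
        {
            if (!_requests.TryGetValue(clientId, out var timestamps))
            {
                timestamps = new Queue<DateTimeOffset>();
                _requests[clientId] = timestamps;
            }

            // Drop timestamps that have fallen out of the window
            while (timestamps.Count > 0 && now - timestamps.Peek() >= _window)
            {
                timestamps.Dequeue();
            }

            if (timestamps.Count >= _limit)
            {
                return false;  // Still too many requests within the last window
            }

            timestamps.Enqueue(now);
            return true;
        }
    }
}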

Advantage: No boundary bursts, more accurate

Disadvantage: More complex, slightly more CPU/memory

34.3 Built-In Rate Limiting (.NET 7+)

.NET 7 introduced built-in rate limiting middleware, so no external package is needed:

// Program.cs
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

// Add rate limiting
builder.Services.AddRateLimiter(options =>
{
    // Default policy: fixed window, 100 requests per minute per IP
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: partition => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                QueueLimit = 0  // Don't queue requests
            }));

    options.OnRejected = (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = 429;  // Too Many Requests
        return new ValueTask();
    };
});

var app = builder.Build();

app.UseRateLimiter();  // MUST be early in pipeline

app.UseHttpsRedirection();
app.UseAuthorization();

app.MapControllers();

app.Run();

Result:

Request 1-100: Allowed
Request 101: Rejected with 429 Too Many Requests
Request 102: Rejected with 429
...
Wait 1 minute...
Request 1-100 (new minute): Allowed

34.4 Sliding Window Limiter

For more accurate rate limiting:

builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetSlidingWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: partition => new SlidingWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                SegmentsPerWindow = 10,  // Divide window into 10 segments
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                QueueLimit = 0
            }));
});

Comparison:

            Fixed Window         Sliding Window
Accuracy    ±100% at boundary    ±10% (with 10 segments)
CPU Cost    Low                  Medium
Memory      Low                  Medium
Best for    High-traffic APIs    APIs requiring accuracy

34.5 Named Rate Limit Policies

Different limits for different endpoints:

// Program.cs
builder.Services.AddRateLimiter(options =>
{
    // Policy 1: Strict limit for expensive operations
    options.AddFixedWindowLimiter("StrictPolicy", config =>
    {
        config.PermitLimit = 10;           // Only 10 requests
        config.Window = TimeSpan.FromMinutes(1);
        config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        config.QueueLimit = 0;
    });

    // Policy 2: Moderate limit for standard endpoints
    options.AddFixedWindowLimiter("StandardPolicy", config =>
    {
        config.PermitLimit = 100;
        config.Window = TimeSpan.FromMinutes(1);
        config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        config.QueueLimit = 0;
    });

    // Policy 3: Generous limit for read operations
    options.AddFixedWindowLimiter("GenerousPolicy", config =>
    {
        config.PermitLimit = 1000;
        config.Window = TimeSpan.FromMinutes(1);
        config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        config.QueueLimit = 0;
    });

    // Default policy
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: partition => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();
app.UseRateLimiter();

Apply policies to endpoints:

[ApiController]
[Route("api/[controller]")]
public class CommandsController : ControllerBase
{
    [HttpGet]
    [EnableRateLimiting("GenerousPolicy")]  // 1000/min
    public async Task<ActionResult> GetCommands()
    {
        var commands = await _repository.GetCommandsAsync();
        return Ok(commands);
    }

    [HttpPost]
    [EnableRateLimiting("StandardPolicy")]  // 100/min
    public async Task<ActionResult> CreateCommand(CommandMutateDto dto)
    {
        var command = _mapper.Map<Command>(dto);
        _repository.CreateCommand(command);
        await _repository.SaveChangesAsync();
        return CreatedAtRoute(nameof(GetCommandById), new { id = command.Id }, command);
    }

    [HttpPost("bulk-create")]
    [EnableRateLimiting("StrictPolicy")]  // Only 10/min
    public async Task<ActionResult> BulkCreate(BulkCreateDto dto)
    {
        // Expensive operation, limit aggressively
        var bulkOp = new BulkOperation { /* ... */ };
        await _bulkRepository.CreateBulkOperationAsync(bulkOp);
        return Accepted();
    }

    [HttpDelete("{id}")]
    [DisableRateLimiting]  // No limit for authorized admin
    [Authorize(Roles = "Admin")]
    public async Task<ActionResult> DeleteCommand(int id)
    {
        // Admins are trusted, no rate limit
        var command = await _repository.GetCommandByIdAsync(id);
        _repository.DeleteCommand(command);
        await _repository.SaveChangesAsync();
        return NoContent();
    }
}

34.6 Rate Limiting by API Key

Instead of IP address, limit by API key (better for APIs with authentication):

// Program.cs
builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    {
        // Extract API key from header
        var apiKey = httpContext.Request.Headers["X-API-Key"].FirstOrDefault() ?? "anonymous";

        return RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: apiKey,
            factory: partition => new FixedWindowRateLimiterOptions
            {
                PermitLimit = GetLimitForApiKey(apiKey),  // Different limit per key
                Window = TimeSpan.FromMinutes(1),
                QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                QueueLimit = 0
            });
    });
});

// Tiers based on API key
static int GetLimitForApiKey(string apiKey)
{
    return apiKey switch
    {
        "api-key-premium-user-123" => 10000,    // Premium: 10K/min
        "api-key-standard-user-456" => 1000,    // Standard: 1K/min
        "api-key-free-user-789" => 100,         // Free: 100/min
        _ => 10                                 // Unknown: 10/min
    };
}

34.7 Rate Limit Headers

Tell clients where they stand by including rate limit information in response headers:

// Custom middleware to add rate limit headers
public class RateLimitHeadersMiddleware
{
    private readonly RequestDelegate _next;

    public RateLimitHeadersMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var endpoint = context.GetEndpoint();
        var rateLimitMetadata = endpoint?.Metadata.GetMetadata<EnableRateLimitingAttribute>();

        if (rateLimitMetadata != null)
        {
            // Add rate limit info to response headers.
            // These values are hard-coded placeholders; a real implementation would
            // compute them from its own tracking of the client's current window.
            context.Response.Headers["X-RateLimit-Limit"] = "100";
            context.Response.Headers["X-RateLimit-Remaining"] = "42";
            context.Response.Headers["X-RateLimit-Reset"] = "1702819200";  // Unix timestamp
        }

        await _next(context);
    }
}

// In Program.cs
app.UseMiddleware<RateLimitHeadersMiddleware>();
app.UseRateLimiter();

Headers clients receive:

X-RateLimit-Limit: 100        (requests allowed per window)
X-RateLimit-Remaining: 42     (requests left in current window)
X-RateLimit-Reset: 1702819200 (when the window resets, as Unix timestamp)

JavaScript client can use this:

const response = await fetch('https://api.example.com/api/commands');

const limit = response.headers.get('X-RateLimit-Limit');
const remaining = response.headers.get('X-RateLimit-Remaining');
const resetTime = parseInt(response.headers.get('X-RateLimit-Reset'));

console.log(`${remaining}/${limit} requests remaining`);
console.log(`Resets at ${new Date(resetTime * 1000).toLocaleTimeString()}`);

34.8 Handling Rate Limit Exceeded

When a user hits the rate limit, respond with 429 Too Many Requests:

builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: partition => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));

    // Custom response when rate limited
    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        context.HttpContext.Response.ContentType = "application/json";

        var retryAfter = 60;  // Retry after 1 minute
        context.HttpContext.Response.Headers.Add("Retry-After", retryAfter.ToString());

        var response = new
        {
            error = "Too many requests",
            message = $"Rate limit exceeded. Try again in {retryAfter} seconds.",
            retryAfter = retryAfter,
            timestamp = DateTime.UtcNow
        };

        await context.HttpContext.Response.WriteAsJsonAsync(response, cancellationToken);
    };
});

Client receives:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1702819200

{
  "error": "Too many requests",
  "message": "Rate limit exceeded. Try again in 60 seconds.",
  "retryAfter": 60,
  "timestamp": "2024-12-17T10:30:00Z"
}

Client should:

  1. Stop making requests
  2. Wait for Retry-After seconds
  3. Resume requests
  4. Never hammer the API in a loop
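
A minimal sketch of that client behavior in C# with HttpClient (the URL is illustrative, and a single retry keeps the example short):

// Honor Retry-After instead of hammering the API in a loop
using var client = new HttpClient();

var response = await client.GetAsync("https://api.example.com/api/commands");

if ((int)response.StatusCode == 429)
{
    // Prefer the server-provided Retry-After delay; fall back to 60 seconds
    var delay = response.Headers.RetryAfter?.Delta ?? TimeSpan.FromSeconds(60);

    await Task.Delay(delay);
    response = await client.GetAsync("https://api.example.com/api/commands");
}

Console.WriteLine($"Final status: {(int)response.StatusCode}");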

34.9 Logging Rate Limit Violations

Track when clients hit limits:

// Logging middleware
public class RateLimitLoggingMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<RateLimitLoggingMiddleware> _logger;

    public RateLimitLoggingMiddleware(RequestDelegate next, ILogger<RateLimitLoggingMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        await _next(context);

        // Check if response was rate limited
        if (context.Response.StatusCode == 429)
        {
            var clientIp = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
            var apiKey = context.Request.Headers["X-API-Key"].FirstOrDefault() ?? "anonymous";
            var path = context.Request.Path;

            _logger.LogWarning(
                "Rate limit exceeded for IP {ClientIp}, API Key {ApiKey}, Path {Path}",
                clientIp, apiKey, path);
        }
    }
}

// In Program.cs
app.UseMiddleware<RateLimitLoggingMiddleware>();
app.UseRateLimiter();

34.10 Distributed Rate Limiting (Multiple Servers)

Single server rate limiting doesn’t work if you have multiple API instances behind a load balancer:

Load Balancer
    ↓
Server 1: Request count = 50
Server 2: Request count = 50
Server 3: Request count = 50
    ↓
Client has made 150 requests but each server thinks only 50

Use Redis to share state:
    ↓
Server 1 checks Redis: "50 requests"
Server 2 checks Redis: "100 requests"
Server 3 checks Redis: "150 requests (exceeded!)"

The built-in limiters keep their counters in process memory, so they cannot share state across instances on their own. You need either a custom RateLimiter backed by a shared store or a check against that store directly. A minimal sketch using StackExchange.Redis in an inline middleware (the 100-per-minute limit and key prefix are illustrative):

// Program.cs
using StackExchange.Redis;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

builder.Services.AddSingleton<IConnectionMultiplexer>(sp =>
    ConnectionMultiplexer.Connect(builder.Configuration["Redis:ConnectionString"]!));

var app = builder.Build();

// Shared fixed window: every instance increments the same Redis counter
app.Use(async (context, next) =>
{
    var redis = context.RequestServices.GetRequiredService<IConnectionMultiplexer>();
    var db = redis.GetDatabase();

    var clientId = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
    var key = $"ratelimit:{clientId}";

    // Atomic increment; all servers see the same count
    var currentCount = await db.StringIncrementAsync(key);

    if (currentCount == 1)
    {
        // First request in this window: start the one-minute expiration
        await db.KeyExpireAsync(key, TimeSpan.FromMinutes(1));
    }

    if (currentCount > 100)
    {
        // The count lives in Redis, so the limit is enforced cluster-wide
        context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        context.Response.Headers["Retry-After"] = "60";
        return;
    }

    await next(context);
});

app.MapControllers();

app.Run();

34.11 Rate Limit Strategies by Use Case

Public API (no authentication):

// Limit by IP address
// 100 requests per minute per IP
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        factory: partition => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromMinutes(1)
        }));

Authenticated API (with API keys):

// Limit by API key
// Free tier: 100/min, Premium: 10K/min
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
{
    var apiKey = httpContext.Request.Headers["X-API-Key"].FirstOrDefault() ?? "anonymous";
    return RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: apiKey,
        factory: partition => new FixedWindowRateLimiterOptions
        {
            PermitLimit = GetLimitForTier(apiKey),
            Window = TimeSpan.FromMinutes(1)
        });
});

SaaS with user tiers:

// Limit by user ID (from JWT token)
// Hobby: 1K/day, Pro: 10K/day, Enterprise: Unlimited
var userId = httpContext.User.FindFirst(ClaimTypes.NameIdentifier)?.Value ?? "anonymous";
var userTier = GetUserTier(userId);
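
Continuing that snippet, a sketch of the full partitioning (GetUserTier is a helper you would implement yourself; enterprise users get a no-op limiter):

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
{
    var userId = httpContext.User.FindFirst(ClaimTypes.NameIdentifier)?.Value ?? "anonymous";
    var userTier = GetUserTier(userId);  // e.g. "hobby", "pro", "enterprise"

    if (userTier == "enterprise")
    {
        // Enterprise: unlimited
        return RateLimitPartition.GetNoLimiter(userId);
    }

    return RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: userId,
        factory: partition => new FixedWindowRateLimiterOptions
        {
            PermitLimit = userTier == "pro" ? 10_000 : 1_000,  // Pro: 10K/day, Hobby: 1K/day
            Window = TimeSpan.FromDays(1)
        });
});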

34.12 Testing Rate Limiting

Integration test (using WebApplicationFactory):

[Fact]
public async Task GetCommands_WhenRateLimited_Returns429()
{
    var client = _factory.CreateClient();

    // Make 101 requests (limit is 100/min)
    for (int i = 0; i < 101; i++)
    {
        var response = await client.GetAsync("/api/commands");

        if (i < 100)
            Assert.Equal(200, (int)response.StatusCode);
        else
            Assert.Equal(429, (int)response.StatusCode);
    }
}

Load test with rate limiting:

# Apache Bench: 1000 requests, 100 concurrent — see how many are rejected
ab -n 1000 -c 100 https://api.example.com/api/commands

# Rate-limited requests show up in ab's "Non-2xx responses" summary line
ab -n 1000 -c 100 https://api.example.com/api/commands 2>&1 | grep "Non-2xx"

34.13 What’s Next

You now have:

  • ✓ Understanding rate limiting fundamentals
  • ✓ Fixed and sliding window algorithms
  • ✓ Named rate limit policies per endpoint
  • ✓ Rate limiting by IP or API key
  • ✓ Proper error handling with 429 responses
  • ✓ Rate limit headers for client information
  • ✓ Distributed rate limiting with Redis
  • ✓ Testing strategies

Next: Security Headers & Best Practices, covering HSTS and protection against XSS, SQL injection, and more.