21. Health Checks
About this chapter
Implement health checks to report application and dependency status, enabling orchestrators and monitoring systems to make intelligent routing and alerting decisions.
- Health check concepts: Healthy, degraded, and unhealthy states
- Orchestrator integration: Kubernetes liveness and readiness probes
- Built-in health checks: PostgreSQL, Redis, and Hangfire checks
- Custom health checks: Building domain-specific health check implementations
- Multiple endpoints: Different endpoints for different probe types
- Response formats: JSON health status and detailed dependency information
Learning outcomes:
- Understand health check states and use cases
- Configure built-in health checks for databases and caches
- Create custom health check implementations
- Map multiple health check endpoints for different purposes
- Integrate with Kubernetes probes
- Parse and respond to health check requests
21.1 Understanding Health Checks
What Are Health Checks?
- Endpoints that report the health/status of your application and dependencies
- Used by orchestrators (Kubernetes, Docker Swarm) to determine if app should receive traffic
- Used by monitoring systems to alert on failures
- Help distinguish between “starting up”, “healthy”, and “unhealthy”
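One way to surface the "starting up" state is a check that reports Unhealthy until initialization has finished. The sketch below follows the pattern shown in Microsoft's health check documentation; the class name `StartupHealthCheck` and its `StartupCompleted` flag are illustrative and are not part of this chapter's application code.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: Unhealthy while the app is still initializing, Healthy afterwards.
public class StartupHealthCheck : IHealthCheck
{
    // volatile so a write from the startup code is visible to probe requests on other threads
    private volatile bool _startupCompleted;

    public bool StartupCompleted
    {
        get => _startupCompleted;
        set => _startupCompleted = value;
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        var result = StartupCompleted
            ? HealthCheckResult.Healthy("Startup has completed")
            : HealthCheckResult.Unhealthy("Startup is still running");

        return Task.FromResult(result);
    }
}
```

For the flag to be shared, the class would need to be registered as a singleton (`builder.Services.AddSingleton<StartupHealthCheck>()`) before being added with `AddCheck<StartupHealthCheck>("startup")`; whatever code performs initialization then sets `StartupCompleted = true` when it is done.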
Health Check States:
- Healthy: All checks passed, app ready to serve traffic
- Degraded: Some non-critical checks failed, app functional but impaired
- Unhealthy: Critical checks failed, app should not receive traffic
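When several checks run together, the overall status is the worst individual status. The sketch below illustrates that rule with a local enum, `ProbeStatus`, which is a stand-in that mirrors the ordering of the framework's `HealthStatus` enum (Unhealthy = 0, Degraded = 1, Healthy = 2); `ProbeStatus` and `StatusAggregator` are hypothetical names used only so the example runs without extra packages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for Microsoft.Extensions.Diagnostics.HealthChecks.HealthStatus (same ordering).
public enum ProbeStatus { Unhealthy = 0, Degraded = 1, Healthy = 2 }

public static class StatusAggregator
{
    // The overall status is the lowest (worst) individual status.
    // With no checks at all, the result is Healthy.
    public static ProbeStatus Aggregate(IEnumerable<ProbeStatus> results) =>
        results.DefaultIfEmpty(ProbeStatus.Healthy).Min();
}
```

For example, `StatusAggregator.Aggregate(new[] { ProbeStatus.Healthy, ProbeStatus.Degraded })` yields `Degraded`, and adding a single `Unhealthy` result makes the whole report `Unhealthy`.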
Use Cases:
- Kubernetes liveness probes (restart if unhealthy)
- Kubernetes readiness probes (route traffic if healthy)
- Load balancer health monitoring
- Operational dashboards
- Automated alerting
21.2 Configuring Health Checks for PostgreSQL
Install Health Check Packages:
dotnet add package Microsoft.Extensions.Diagnostics.HealthChecks
dotnet add package AspNetCore.HealthChecks.NpgSql
dotnet add package AspNetCore.HealthChecks.Redis
dotnet add package AspNetCore.HealthChecks.Hangfire
dotnet add package AspNetCore.HealthChecks.UI
dotnet add package AspNetCore.HealthChecks.UI.Client
dotnet add package AspNetCore.HealthChecks.UI.InMemory.Storage
Configure Health Checks in Program.cs:
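// Program.cs
using HealthChecks.UI.Client;                        // UIResponseWriter
using Microsoft.AspNetCore.Diagnostics.HealthChecks; // HealthCheckOptions
using Microsoft.Extensions.Diagnostics.HealthChecks; // HealthStatus
using Npgsql;                                        // NpgsqlConnectionStringBuilder

// Assumes the usual `var builder = WebApplication.CreateBuilder(args);` earlier in Program.cs
var connectionString = new NpgsqlConnectionStringBuilder
{
    ConnectionString = builder.Configuration.GetConnectionString("PostgreSqlConnection"),
    Username = builder.Configuration["DbUserId"],
    Password = builder.Configuration["DbPassword"]
};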
builder.Services.AddHealthChecks()
    // PostgreSQL health check
    .AddNpgSql(
        connectionString: connectionString.ConnectionString,
        name: "postgresql",
        failureStatus: HealthStatus.Unhealthy,
        tags: new[] { "db", "sql", "postgresql" },
        timeout: TimeSpan.FromSeconds(3))
    // Redis health check
    .AddRedis(
        redisConnectionString: builder.Configuration.GetConnectionString("RedisConnection")!,
        name: "redis",
        failureStatus: HealthStatus.Degraded, // Redis failure is degraded, not unhealthy
        tags: new[] { "cache", "redis" },
        timeout: TimeSpan.FromSeconds(2))
    // Hangfire health check
    .AddHangfire(options =>
        {
            options.MinimumAvailableServers = 1;
            options.MaximumJobsFailed = 5;
        },
        name: "hangfire",
        failureStatus: HealthStatus.Degraded,
        tags: new[] { "jobs", "hangfire" })
    // Custom health checks (defined below)
    .AddCheck<ApiKeyDatabaseHealthCheck>(
        "apikey-database",
        failureStatus: HealthStatus.Degraded,
        tags: new[] { "auth", "database" })
    .AddCheck<DiskSpaceHealthCheck>(
        "disk-space",
        failureStatus: HealthStatus.Degraded,
        tags: new[] { "infrastructure" });
// Map health check endpoints
var app = builder.Build();

// Basic health endpoint: runs every registered check and returns the overall status
app.MapHealthChecks("/health");

// Readiness endpoint with JSON response.
// None of the checks registered above carries a "ready" tag, so filtering on "ready"
// would run no checks and always report Healthy. Filter on the "db" tag instead so
// readiness depends on the critical PostgreSQL check.
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("db"),
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

// Liveness endpoint: no checks, just confirms the process is running
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
});

// Detailed endpoint for all checks
app.MapHealthChecks("/health/detailed", new HealthCheckOptions
{
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});
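`ResponseWriter` accepts any `Func<HttpContext, HealthReport, Task>`, so a project that does not want the UI client package could supply its own writer. The following is a minimal sketch under that assumption; the `HealthResponseWriter` class and the JSON shape it produces are illustrative and do not match the `UIResponseWriter` format exactly.

```csharp
using System.Linq;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical writer: serializes the overall status and one entry per check.
public static class HealthResponseWriter
{
    // Builds the JSON body; kept separate from the HTTP plumbing so it can be tested directly.
    public static string ToJson(HealthReport report) =>
        JsonSerializer.Serialize(new
        {
            status = report.Status.ToString(),
            totalDurationMs = report.TotalDuration.TotalMilliseconds,
            entries = report.Entries.ToDictionary(
                e => e.Key,
                e => new
                {
                    status = e.Value.Status.ToString(),
                    description = e.Value.Description
                })
        });

    // Matches the delegate shape expected by HealthCheckOptions.ResponseWriter.
    public static Task WriteAsync(HttpContext context, HealthReport report)
    {
        context.Response.ContentType = "application/json; charset=utf-8";
        return context.Response.WriteAsync(ToJson(report));
    }
}
```

It would be wired up with `ResponseWriter = HealthResponseWriter.WriteAsync` in place of `UIResponseWriter.WriteHealthCheckUIResponse`. Note that the Health Check UI expects the `UIResponseWriter` format, so a custom writer like this is better suited to endpoints that only scripts or probes consume.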
Health Check Response Examples:
// GET /health/detailed - Healthy
{
  "status": "Healthy",
  "totalDuration": "00:00:00.1234567",
  "entries": {
    "postgresql": {
      "status": "Healthy",
      "duration": "00:00:00.0523456",
      "data": {},
      "tags": ["db", "sql", "postgresql"]
    },
    "redis": {
      "status": "Healthy",
      "duration": "00:00:00.0123456",
      "data": {},
      "tags": ["cache", "redis"]
    },
    "hangfire": {
      "status": "Healthy",
      "duration": "00:00:00.0456789",
      "data": {
        "servers": 1,
        "failedJobs": 0
      },
      "tags": ["jobs", "hangfire"]
    }
  }
}
// GET /health/detailed - Degraded
// The overall status is the worst entry status. Redis is registered with
// failureStatus: HealthStatus.Degraded, so its failure is reported as Degraded.
{
  "status": "Degraded",
  "totalDuration": "00:00:00.2345678",
  "entries": {
    "postgresql": {
      "status": "Healthy",
      "duration": "00:00:00.0523456",
      "data": {},
      "tags": ["db", "sql", "postgresql"]
    },
    "redis": {
      "status": "Degraded",
      "duration": "00:00:02.0000000",
      "exception": "It was not possible to connect to the redis server(s)",
      "description": "Redis connection failed",
      "data": {},
      "tags": ["cache", "redis"]
    }
  }
}
21.3 Redis Health Checks
Redis Connection Health:
// Already configured above with .AddRedis()
// But here's a custom implementation for learning:
using Microsoft.Extensions.Diagnostics.HealthChecks;
using StackExchange.Redis;

public class RedisHealthCheck : IHealthCheck
{
    private readonly IConnectionMultiplexer _redis;
    private readonly ILogger<RedisHealthCheck> _logger;

    public RedisHealthCheck(
        IConnectionMultiplexer redis,
        ILogger<RedisHealthCheck> logger)
    {
        _redis = redis;
        _logger = logger;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            var database = _redis.GetDatabase();
            var testKey = "__health_check__";
            var testValue = DateTime.UtcNow.Ticks.ToString();

            // Test write
            await database.StringSetAsync(testKey, testValue);
            // Test read
            var retrievedValue = await database.StringGetAsync(testKey);
            // Cleanup
            await database.KeyDeleteAsync(testKey);

            // A round trip that returns the written value proves both read and write work
            if (retrievedValue == testValue)
            {
                var endpoints = _redis.GetEndPoints();
                var data = new Dictionary<string, object>
                {
                    { "endpoints", string.Join(", ", endpoints.Select(e => e.ToString())) },
                    { "connected", _redis.IsConnected }
                };
                return HealthCheckResult.Healthy(
                    "Redis is healthy",
                    data);
            }

            return HealthCheckResult.Degraded(
                "Redis read/write test failed");
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Redis health check failed");
            return HealthCheckResult.Unhealthy(
                "Redis is unavailable",
                ex,
                new Dictionary<string, object>
                {
                    { "error", ex.Message }
                });
        }
    }
}
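The custom check above needs an `IConnectionMultiplexer` in the container and its own registration before it will run. A possible wiring in Program.cs is sketched below; the check name "redis-custom" is a hypothetical choice made here to avoid clashing with the built-in "redis" registration, and the connection string name is the one used earlier in this chapter.

```csharp
// Program.cs (fragment): assumes `builder` already exists, with
// using Microsoft.Extensions.Diagnostics.HealthChecks; and using StackExchange.Redis;

// Share a single multiplexer for the whole application, as StackExchange.Redis recommends.
builder.Services.AddSingleton<IConnectionMultiplexer>(_ =>
    ConnectionMultiplexer.Connect(
        builder.Configuration.GetConnectionString("RedisConnection")!));

builder.Services.AddHealthChecks()
    .AddCheck<RedisHealthCheck>(
        "redis-custom", // hypothetical name
        failureStatus: HealthStatus.Degraded,
        tags: new[] { "cache", "redis" });
```

One caveat with this design: `ConnectionMultiplexer.Connect` throws by default if Redis is unreachable when the singleton is first resolved, in which case the health check service catches the exception and reports the registration's failure status for that check.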
21.4 Custom Health Checks
API Key Database Health Check:
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class ApiKeyDatabaseHealthCheck : IHealthCheck
{
    private readonly IRegistrationRepository _registrationRepo;
    private readonly ILogger<ApiKeyDatabaseHealthCheck> _logger;

    public ApiKeyDatabaseHealthCheck(
        IRegistrationRepository registrationRepo,
        ILogger<ApiKeyDatabaseHealthCheck> logger)
    {
        _registrationRepo = registrationRepo;
        _logger = logger;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Try to get key registrations count
            var keyCount = await _registrationRepo.GetKeyCountAsync();
            var data = new Dictionary<string, object>
            {
                { "registeredKeys", keyCount },
                { "checkedAt", DateTime.UtcNow }
            };

            if (keyCount >= 0)
            {
                return HealthCheckResult.Healthy(
                    $"API key database is healthy with {keyCount} registered keys",
                    data);
            }

            return HealthCheckResult.Degraded(
                "Unable to verify API key count",
                data: data);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "API key database health check failed");
            return HealthCheckResult.Unhealthy(
                "API key database is unavailable",
                ex);
        }
    }
}
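A detail worth knowing about the check above: the status a check returns is used as-is, so the `failureStatus: HealthStatus.Degraded` given at registration does not soften a hard-coded `HealthCheckResult.Unhealthy(...)`. A check that wants to honor the registered value can read it from `context.Registration.FailureStatus`. The helper below is a sketch of that idea; `HealthCheckResults.Failure` is a hypothetical name, not a framework API.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper for custom checks.
public static class HealthCheckResults
{
    // Reports a failure using the status configured at registration time
    // (the failureStatus argument of AddCheck) instead of a fixed Unhealthy.
    public static HealthCheckResult Failure(
        HealthCheckContext context,
        string description,
        Exception? exception = null,
        IReadOnlyDictionary<string, object>? data = null) =>
        new HealthCheckResult(
            context.Registration.FailureStatus,
            description,
            exception,
            data);
}
```

Inside a `catch` block this would be used as `return HealthCheckResults.Failure(context, "API key database is unavailable", ex);`, which yields Degraded for a check registered with `failureStatus: HealthStatus.Degraded`.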
Disk Space Health Check:
public class DiskSpaceHealthCheck : IHealthCheck
{
    private readonly ILogger<DiskSpaceHealthCheck> _logger;

    // Thresholds use the same binary unit (1 GB = 1,073,741,824 bytes) as the reporting below
    private const long MinimumFreeBytesWarning = 5L * 1_073_741_824; // 5 GB
    private const long MinimumFreeBytesError = 1L * 1_073_741_824; // 1 GB

    public DiskSpaceHealthCheck(ILogger<DiskSpaceHealthCheck> logger)
    {
        _logger = logger;
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Inspect the drive that hosts the application's working directory
            var driveInfo = new DriveInfo(Path.GetPathRoot(Directory.GetCurrentDirectory())!);
            var freeSpaceGB = driveInfo.AvailableFreeSpace / 1_073_741_824.0; // Convert to GB
            var data = new Dictionary<string, object>
            {
                { "drive", driveInfo.Name },
                { "freeSpaceGB", Math.Round(freeSpaceGB, 2) },
                { "totalSpaceGB", Math.Round(driveInfo.TotalSize / 1_073_741_824.0, 2) },
                { "usedPercentage", Math.Round((1 - (double)driveInfo.AvailableFreeSpace / driveInfo.TotalSize) * 100, 2) }
            };

            if (driveInfo.AvailableFreeSpace < MinimumFreeBytesError)
            {
                _logger.LogError(
                    "Critical: Low disk space. Only {FreeSpaceGB:F2} GB remaining",
                    freeSpaceGB);
                return Task.FromResult(HealthCheckResult.Unhealthy(
                    $"Critical: Only {freeSpaceGB:F2} GB disk space remaining",
                    data: data));
            }

            if (driveInfo.AvailableFreeSpace < MinimumFreeBytesWarning)
            {
                _logger.LogWarning(
                    "Warning: Low disk space. Only {FreeSpaceGB:F2} GB remaining",
                    freeSpaceGB);
                return Task.FromResult(HealthCheckResult.Degraded(
                    $"Warning: Only {freeSpaceGB:F2} GB disk space remaining",
                    data: data));
            }

            return Task.FromResult(HealthCheckResult.Healthy(
                $"Sufficient disk space available: {freeSpaceGB:F2} GB",
                data));
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Disk space health check failed");
            return Task.FromResult(HealthCheckResult.Unhealthy(
                "Unable to check disk space",
                ex));
        }
    }
}
Memory Health Check:
public class MemoryHealthCheck : IHealthCheck
{
    private const long MaxMemoryBytes = 1_000_000_000; // 1 GB threshold

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        var allocated = GC.GetTotalMemory(forceFullCollection: false);
        var allocatedMB = allocated / 1_048_576.0;
        var data = new Dictionary<string, object>
        {
            { "allocatedMB", Math.Round(allocatedMB, 2) },
            { "gen0Collections", GC.CollectionCount(0) },
            { "gen1Collections", GC.CollectionCount(1) },
            { "gen2Collections", GC.CollectionCount(2) }
        };

        if (allocated >= MaxMemoryBytes)
        {
            return Task.FromResult(HealthCheckResult.Degraded(
                $"High memory usage: {allocatedMB:F2} MB allocated",
                data: data));
        }

        return Task.FromResult(HealthCheckResult.Healthy(
            $"Memory usage normal: {allocatedMB:F2} MB allocated",
            data));
    }
}
21.5 Health Check UI
Configure Health Check UI:
// Program.cs
builder.Services.AddHealthChecksUI(options =>
{
    options.SetEvaluationTimeInSeconds(30); // Check every 30 seconds
    options.MaximumHistoryEntriesPerEndpoint(50);
    options.AddHealthCheckEndpoint("CommandAPI", "/health/detailed");
}).AddInMemoryStorage();

// In app configuration
app.MapHealthChecksUI(options =>
{
    options.UIPath = "/health-ui"; // UI at /health-ui
    options.ApiPath = "/health-ui-api";
});
Access UI:
https://localhost:7213/health-ui
appsettings.json Configuration:
{
  "HealthChecksUI": {
    "HealthChecks": [
      {
        "Name": "CommandAPI",
        "Uri": "https://localhost:7213/health/detailed"
      }
    ],
    "EvaluationTimeInSeconds": 30,
    "MinimumSecondsBetweenFailureNotifications": 300
  }
}
21.6 Integration with Monitoring Systems
Kubernetes Probes:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: commandapi
spec:
  # selector and pod template labels are omitted here; a real Deployment requires both
  template:
    spec:
      containers:
        - name: commandapi
          image: commandapi:latest
          ports:
            - containerPort: 80
          # Liveness probe - restart the container if it fails
          livenessProbe:
            httpGet:
              path: /health/live
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          # Readiness probe - remove the pod from the load balancer if it fails
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 80
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
Docker Compose Health Check:
# docker-compose.yaml
services:
  commandapi:
    image: commandapi:latest
    healthcheck:
      # Assumes curl is installed in the image
      test: ["CMD", "curl", "-f", "http://localhost:80/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s
Prometheus Metrics (Optional):
dotnet add package prometheus-net.AspNetCore.HealthChecks
// ForwardToPrometheus() is provided by prometheus-net (namespace Prometheus)
builder.Services.AddHealthChecks()
    .AddNpgSql(/* ... */)
    .ForwardToPrometheus();
Custom Alerting:
// Create a hosted service that monitors health and sends alerts
public class HealthCheckAlertService : BackgroundService
{
    private readonly HealthCheckService _healthCheckService;
    private readonly ILogger<HealthCheckAlertService> _logger;
    // Add email service or notification service

    // HealthCheckService is registered by AddHealthChecks(), so it can be injected here
    public HealthCheckAlertService(
        HealthCheckService healthCheckService,
        ILogger<HealthCheckAlertService> logger)
    {
        _healthCheckService = healthCheckService;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            var report = await _healthCheckService.CheckHealthAsync(stoppingToken);
            if (report.Status == HealthStatus.Unhealthy)
            {
                _logger.LogCritical(
                    "Health check failed: {@HealthReport}",
                    new
                    {
                        report.Status,
                        Entries = report.Entries.Select(e => new
                        {
                            e.Key,
                            Status = e.Value.Status.ToString(),
                            e.Value.Description,
                            e.Value.Exception?.Message
                        })
                    });
                // Send alert email/SMS/Slack notification
                // await _notificationService.SendAlertAsync(report);
            }

            // Wait before the next evaluation; cancelled when the host shuts down
            await Task.Delay(TimeSpan.FromMinutes(5), stoppingToken);
        }
    }
}
// Register service
builder.Services.AddHostedService<HealthCheckAlertService>();
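As an alternative to a hand-written polling loop, the health check framework has a built-in hook, `IHealthCheckPublisher`, which the host calls periodically with the latest report. The sketch below shows what the alerting logic above might look like in that form; `LoggingHealthCheckPublisher` is a hypothetical class name, and the notification call is left as a comment, as in the original service.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Microsoft.Extensions.Logging;

// Hypothetical publisher: logs a critical message whenever the overall status is Unhealthy.
public class LoggingHealthCheckPublisher : IHealthCheckPublisher
{
    private readonly ILogger<LoggingHealthCheckPublisher> _logger;

    public LoggingHealthCheckPublisher(ILogger<LoggingHealthCheckPublisher> logger)
        => _logger = logger;

    public Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        if (report.Status == HealthStatus.Unhealthy)
        {
            _logger.LogCritical(
                "Health check failed with status {Status}", report.Status);
            // A notification call (email, SMS, Slack) would go here.
        }

        return Task.CompletedTask;
    }
}
```

Registration would look like `builder.Services.AddSingleton<IHealthCheckPublisher, LoggingHealthCheckPublisher>();`, with the interval set through `builder.Services.Configure<HealthCheckPublisherOptions>(o => o.Period = TimeSpan.FromMinutes(5));` to mirror the five-minute delay used above.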
Testing Health Checks:
### Check overall health
GET https://localhost:7213/health
### Check detailed health with JSON
GET https://localhost:7213/health/detailed
Accept: application/json
### Check liveness (for K8s)
GET https://localhost:7213/health/live
### Check readiness (for K8s)
GET https://localhost:7213/health/ready
### Expected responses:
# 200 OK - Healthy
# 200 OK - Degraded (some non-critical checks failed)
# 503 Service Unavailable - Unhealthy
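The status codes listed above are the defaults, and `HealthCheckOptions.ResultStatusCodes` lets an endpoint override them. The sketch below shows one possible override that treats Degraded as a failure, which may suit a load balancer that should stop routing to impaired instances; the `StrictHealthCheckOptions` class and the `/health/strict` path mentioned afterwards are hypothetical.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical factory for endpoint options that report Degraded as 503.
public static class StrictHealthCheckOptions
{
    public static HealthCheckOptions Create() => new HealthCheckOptions
    {
        // Overrides the default mapping (Healthy and Degraded = 200, Unhealthy = 503)
        ResultStatusCodes =
        {
            [HealthStatus.Healthy] = StatusCodes.Status200OK,
            [HealthStatus.Degraded] = StatusCodes.Status503ServiceUnavailable,
            [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
        }
    };
}
```

An endpoint using it could be mapped with `app.MapHealthChecks("/health/strict", StrictHealthCheckOptions.Create());`.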