Health Checking

gRPC provides the Health Checking Protocol for implementing health checks. You can see its latest definition here: grpc/health/v1/health.proto.

As you can see from the service definition, a Health service should implement one or two methods: a simple unary-unary Check method for synchronous checks and a more sophisticated unary-stream Watch method for asynchronously waiting for status changes. grpclib implements both of them.
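To make the difference between the two methods concrete, here is a minimal asyncio-only sketch. It does not use grpclib or the real protocol messages. StatusBoard, check and watch are hypothetical names that only model the semantics: one immediate answer versus the current status followed by a stream of changes.

```python
import asyncio


class StatusBoard:
    """Toy model of a health status that can be polled or watched."""

    def __init__(self, status="UNKNOWN"):
        self._status = status
        self._changed = asyncio.Condition()

    async def set(self, status):
        # Store the new status and wake up every watcher.
        async with self._changed:
            self._status = status
            self._changed.notify_all()

    async def check(self):
        # Unary-unary style: one request, one immediate response.
        return self._status

    async def watch(self):
        # Unary-stream style: send the current status, then every change.
        last = self._status
        yield last
        while True:
            async with self._changed:
                await self._changed.wait_for(lambda: self._status != last)
                last = self._status
            yield last


async def demo():
    board = StatusBoard("SERVING")
    seen = []

    async def watcher():
        async for status in board.watch():
            seen.append(status)
            if len(seen) == 3:
                break

    task = asyncio.create_task(watcher())
    await asyncio.sleep(0.01)  # let the watcher receive the initial status
    await board.set("NOT_SERVING")
    await asyncio.sleep(0.01)  # let the watcher observe the change
    await board.set("SERVING")
    await task
    return await board.check(), seen
```

Running asyncio.run(demo()) returns the final polled status together with the list of statuses the watcher received. A polling client only ever sees the value at the moment it asks, while a watching client is told about each transition.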

grpclib also provides additional functionality to help you write health checks without a lot of code of your own. A health check can be implemented in two ways (you can use both simultaneously):

  • use the ServiceCheck class by providing a callable object which can be called asynchronously to determine the check’s status

  • use the ServiceStatus class and change its status with the set method

ServiceCheck is the simplest and most generic way to implement periodic checks.

ServiceStatus is for more advanced usage, when you are able to detect and change a check’s status proactively (e.g. by detecting a lost connection). This way is more efficient and robust.
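The proactive style can be sketched without grpclib as follows. RecordingStatus is a stand-in that mimics only the set method described in this document, and ConnectionMonitor with its on_connected and on_disconnected callbacks is a hypothetical wrapper around some client library. In a real application the status object would be a ServiceStatus instance.

```python
from typing import Optional


class RecordingStatus:
    """Stand-in for ServiceStatus: only mimics set(value)."""

    def __init__(self) -> None:
        self.value: Optional[bool] = None  # None means "unknown"

    def set(self, value: Optional[bool]) -> None:
        self.value = value


class ConnectionMonitor:
    """Hypothetical client wrapper that reports connectivity proactively."""

    def __init__(self, status) -> None:
        self._status = status

    def on_connected(self) -> None:
        # Connection established: mark the dependency as healthy.
        self._status.set(True)

    def on_disconnected(self) -> None:
        # Connection lost: mark it unhealthy right away, without polling.
        self._status.set(False)


redis_status = RecordingStatus()
monitor = ConnectionMonitor(redis_status)
monitor.on_connected()
monitor.on_disconnected()
```

The status changes at the moment the event is detected, so no periodic probing of the dependency is needed.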

User Guide

Note

To test the server’s health, we will use the grpc_health_probe command.

Overall Server Health

The simplest health check:

from grpclib.server import Server
from grpclib.health.service import Health

health = Health()

# handlers is the list of your own service implementations
server = Server(handlers + [health])

Testing:

$ grpc_health_probe -addr=localhost:50051
healthy: SERVING

Overall server status is always SERVING.

If you want to add real checks:

from grpclib.health.service import Health, OVERALL

health = Health({OVERALL: [db_check, cache_check]})

Overall server status is SERVING if all checks are passing.

Detailed Services Health

If you want to provide different checks for different services:

foo = FooService()
bar = BarService()

health = Health({
    foo: [a_check, b_check],
    bar: [b_check, c_check],
})

Testing:

$ grpc_health_probe -addr=localhost:50051 -service acme.FooService
healthy: SERVING
$ grpc_health_probe -addr=localhost:50051 -service acme.BarService
healthy: NOT_SERVING
$ grpc_health_probe -addr=localhost:50051
healthy: NOT_SERVING

  • acme.FooService is healthy if a_check and b_check are passing

  • acme.BarService is healthy if b_check and c_check are passing

  • Overall health status depends on all checks

You can also override the list of checks used for the overall server health status:

foo = FooService()
bar = BarService()

health = Health({
    foo: [a_check, b_check],
    bar: [b_check, c_check],
    OVERALL: [a_check, c_check],
})
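The rules in this section can be modeled with plain dictionaries. This is a simplified sketch, not grpclib's implementation: health_report is a hypothetical helper, services and checks are represented by their names, the empty string stands in for OVERALL, and unknown (None) results are left out.

```python
OVERALL = ""  # stand-in key for this sketch only


def health_report(checks, outcomes):
    """Map each service name to a status string.

    checks: service name -> list of check names
    outcomes: check name -> True (passing) or False (failing)
    """
    checks = dict(checks)
    if OVERALL not in checks:
        # Default: overall health depends on every configured check.
        merged = []
        for names in checks.values():
            merged.extend(names)
        checks[OVERALL] = merged
    return {
        service: "SERVING" if all(outcomes[n] for n in names) else "NOT_SERVING"
        for service, names in checks.items()
    }


services = {
    "acme.FooService": ["a_check", "b_check"],
    "acme.BarService": ["b_check", "c_check"],
}

# Scenario 1: c_check fails, no override for the overall status.
default = health_report(
    services, {"a_check": True, "b_check": True, "c_check": False}
)

# Scenario 2: b_check fails, but overall health only looks at a and c.
overridden = health_report(
    {**services, OVERALL: ["a_check", "c_check"]},
    {"a_check": True, "b_check": False, "c_check": True},
)
```

In the first scenario the overall status follows the failing c_check. In the second, both services are unhealthy because they share b_check, yet the overridden overall status stays healthy because b_check is not in its list.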

Reference

grpclib.health.service.OVERALL = <grpclib.health.service._Overall object>

Represents overall health status of all services

class grpclib.health.service.Health(checks: Optional[Mapping[ICheckable, Collection[CheckBase]]] = None)

Health-checking service

Example:

from grpclib.health.service import Health

auth = AuthService()
billing = BillingService()

health = Health({
    auth: [redis_status],
    billing: [db_check],
})

server = Server([auth, billing, health])

async Check(stream: Stream[HealthCheckRequest, HealthCheckResponse]) → None

Implements synchronous periodic checks

class grpclib.health.check.ServiceCheck(func: Callable[[], Awaitable[Optional[bool]]], *, loop: Optional[AbstractEventLoop] = None, check_ttl: float = 30, check_timeout: float = 10)

Performs periodic checks

Example:

async def db_test():
    # raised exceptions are the same as returning False,
    # except that exceptions will be logged
    await db.execute('SELECT 1;')
    return True

db_check = ServiceCheck(db_test)

Parameters:
  • func – a callable object that returns an awaitable whose result is one of: True (healthy), False (unhealthy), or None (unknown)

  • loop – (deprecated) asyncio-compatible event loop

  • check_ttl – how long, in seconds, the result of the previous check can be cached

  • check_timeout – timeout for this check, in seconds
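How check_ttl and check_timeout interact can be illustrated with a small asyncio-only model. CachedCheck is a hypothetical class, not grpclib's ServiceCheck, and treating a timeout the same way as a raised exception (unhealthy) is an assumption of this sketch.

```python
import asyncio
import time


class CachedCheck:
    """Toy periodic check: caches a result for check_ttl seconds."""

    def __init__(self, func, *, check_ttl=30.0, check_timeout=10.0):
        self._func = func
        self._ttl = check_ttl
        self._timeout = check_timeout
        self._value = None    # None means "unknown"
        self._expires = None  # monotonic deadline of the cached value

    async def status(self):
        if self._expires is not None and time.monotonic() < self._expires:
            # The previous result is still fresh: reuse it.
            return self._value
        try:
            self._value = await asyncio.wait_for(self._func(), self._timeout)
        except Exception:
            # A raised exception or a timeout counts as unhealthy here.
            self._value = False
        self._expires = time.monotonic() + self._ttl
        return self._value


async def demo():
    calls = 0

    async def db_test():
        nonlocal calls
        calls += 1
        return True

    async def slow_test():
        await asyncio.sleep(1)
        return True

    db_check = CachedCheck(db_test, check_ttl=60)
    first = await db_check.status()
    second = await db_check.status()  # served from the cache
    slow = await CachedCheck(slow_test, check_timeout=0.01).status()
    return first, second, calls, slow
```

A larger check_ttl means fewer calls to the underlying dependency at the cost of a staler answer. check_timeout bounds how long a single health request can be held up by a slow dependency.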

class grpclib.health.check.ServiceStatus(*, loop: Optional[AbstractEventLoop] = None)

Contains status of a proactive check

Example:

redis_status = ServiceStatus()

# detected that Redis is available
redis_status.set(True)

# detected that Redis is unavailable
redis_status.set(False)

Parameters:

loop – (deprecated) asyncio-compatible event loop

set(value: Optional[bool]) → None

Sets current status of a check

Parameters:

value – True (healthy), False (unhealthy), or None (unknown)