A humble gRPC rate-limiting tutorial

Summary:

  • 1) Rationale
  • 2) Live Demo
  • 3) Some Code details
  • 4) Disclaimer

1) Rationale:

Today I want to show you a possible solution to a well-known problem when dealing with backend APIs and/or web applications.

The problem: imagine today is a special day, for example one week before Christmas. Your servers will probably receive an abnormally high number of client requests.

If you don’t have a mechanism for limiting the number of requests, your servers and/or applications will run out of resources and either crash or misbehave, which will probably lead to a massive service outage (depending on your SREs or company size :P).

To solve this problem we need to focus on the API/server backend application, where we usually adopt rate-limiting techniques. Of course one could cluster the application, but clustering a broken app/service, or making it HA, doesn’t seem like a good idea. :P

I will show a pragmatic implementation of rate-limiting, and of one of its algorithms, in the golang and gRPC server ecosystem.

Some useful references:

2) Live Demo:

First, download the live demo video here: https://github.com/MalloZup/pegasus/blob/main/ratelimiting.mp4?raw=true

The video showcases a gRPC server/client in a terminal with rate-limiting. As you will see, I used an interceptor and the token bucket approach to implement rate-limiting in a gRPC server. Obviously the problem can be solved in other ways.

For example, by using other algorithms.

3) Some Code details:

The following code illustrates how to create your rate-limiter.

1) Creating the rate-limiter:

import (
	"fmt"

	// Assumption: the Bucket API used below matches github.com/juju/ratelimit.
	"github.com/juju/ratelimit"
)

// rateLimiterInterceptor wraps a token bucket and exposes it as a limiter.
type rateLimiterInterceptor struct {
	// using a token bucket
	TokenBucket *ratelimit.Bucket
}

// Limit is the predicate which limits the requests. True -> rate-limiting.
func (r *rateLimiterInterceptor) Limit() bool {
	// debug
	fmt.Printf("Token Avail %d \n", r.TokenBucket.Available())

	// Try to take one token. If we get zero we reached the rate limit,
	// so return true (the error is reported to gRPC).
	tokenRes := r.TokenBucket.TakeAvailable(1)
	if tokenRes == 0 {
		fmt.Printf("Reached Rate-Limiting %d \n", r.TokenBucket.Available())
		return true
	}

	// tokenRes is not zero, so the gRPC request can continue to flow without rate limiting :)
	return false
}

2) Registering it to the gRPC server:

Here I just register my previously implemented rate-limiter as middleware on the gRPC server.

Note that this pattern is pretty much the standard in gRPC or HTTP, where you register middleware.

	limiter := &rateLimiterInterceptor{}
	// gatherTime is the interval after which one token is refilled;
	// tokenCapacity is the number of requests we tolerate in a burst.
	// Example: with gatherTime set to 30 seconds and tokenCapacity set to 20,
	// we tolerate 20 requests, which could all be executed within 1 second,
	// and after 30 seconds one new token is added, so only 1 more request can be executed.
	limiter.TokenBucket = ratelimit.NewBucket(gatherTime, int64(tokenCapacity))
	s := grpc.NewServer(
		// register the rate-limiting middleware
		grpc_middleware.WithUnaryServerChain(
			grpc_ratelimit.UnaryServerInterceptor(limiter),
		),
		grpc_middleware.WithStreamServerChain(
			grpc_ratelimit.StreamServerInterceptor(limiter),
		),
	)

Note: I took some arbitrary values and implemented one possible way, for demo purposes.

I skip the whole gRPC and protobuf documentation/explanation, which you can find upstream.

4) Disclaimer:

I researched this topic in my free time, as an individual open source contributor and citizen, to learn how to implement rate-limiting in golang for a gRPC server. These lines represent my own thoughts, efforts and research, done to improve myself as an individual.

P.S.: I thank the #thanos open source community for its support in this research, which I started on my own initiative while looking into an upstream issue, and which I have not finished yet :) https://github.com/thanos-io/thanos

If you find something wrong or invalid, feel free to ping me on Twitter or Slack.