A request arrives at the 15th second of the current minute, which is 25% of the way through it
Then, the number of requests in the rolling window is calculated using:
Requests in current window + requests in the previous window * overlap percentage of the rolling window and the previous window
0 + 5 * 0.75 = 3.75
This can be rounded up or down, depending on the use case. Let's round up to 4
Since 4 is less than 5, this request (at the 15th second) will be allowed to pass through
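The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a full rate limiter: the function names and parameters are hypothetical, the limit of 5 is assumed from the comparison above, and rounding up is just the choice made in this example.

```python
import math


def estimated_count(current_count: int, previous_count: int,
                    elapsed_fraction: float) -> float:
    """Weighted estimate of the requests in the rolling window.

    elapsed_fraction is how far we are into the current fixed window
    (0.25 for the 15th second of a one-minute window).
    """
    # The rolling window still overlaps the rest of the previous window.
    overlap = 1.0 - elapsed_fraction
    return current_count + previous_count * overlap


def is_allowed(current_count: int, previous_count: int,
               elapsed_fraction: float, limit: int) -> bool:
    """Allow the request if the rounded-up estimate is below the limit."""
    estimate = math.ceil(
        estimated_count(current_count, previous_count, elapsed_fraction)
    )
    return estimate < limit


print(estimated_count(0, 5, 0.25))      # 3.75
print(is_allowed(0, 5, 0.25, limit=5))  # True
```

Rounding down instead (math.floor) would make the limiter slightly more permissive; which one fits depends on the use case.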
Pros and Cons
Pros
Smooths out spikes in traffic
Memory efficient
Cons
It is an approximation of the actual rate because it assumes requests in the previous window are evenly distributed.
This problem is not as bad as it seems. In experiments done by Cloudflare, only 0.003% of requests were wrongly allowed or rate limited among 400 million requests
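To see where the approximation comes from, the sketch below compares the weighted estimate with an exact count over stored timestamps. The helper names and the sample timestamps are made up for illustration; a one-minute window is assumed, with time measured in seconds from the start of the previous minute.

```python
def exact_count(timestamps, now, window=60.0):
    """Count requests that truly fall inside the rolling window (now - window, now]."""
    return sum(1 for t in timestamps if now - window < t <= now)


def approximate_count(timestamps, now, window=60.0):
    """Weighted estimate using only per-window counters."""
    current_start = (now // window) * window
    previous_start = current_start - window
    current = sum(1 for t in timestamps if current_start <= t <= now)
    previous = sum(1 for t in timestamps if previous_start <= t < current_start)
    # Fraction of the previous window still covered by the rolling window.
    overlap = 1.0 - (now - current_start) / window
    return current + previous * overlap


now = 75.0  # 15 seconds into the current minute

# Five requests bunched at the very start of the previous minute.
bursty = [0.0, 1.0, 2.0, 3.0, 4.0]
# Five requests spread evenly across the previous minute.
even = [6.0, 18.0, 30.0, 42.0, 54.0]

print(exact_count(bursty, now), approximate_count(bursty, now))  # 0 3.75
print(exact_count(even, now), approximate_count(even, now))      # 4 3.75
```

With evenly spread traffic the estimate is close to the true count, while a burst at the start of the previous window makes it overestimate. The exact count requires keeping every timestamp, which is the memory cost the counter-based approach avoids.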