Load test report · Apr 2026

52,000 req/s sustained.
0.001% error rate.

The regulated proxy sustains this throughput under realistic conditions calibrated with production telemetry. As deployed today, it handles roughly 1.9× the business-hour average of the entire Brazilian Open Finance ecosystem, and the path to more capacity is already mapped.

The four numbers

What the infra delivers under realistic load.

All measurements below came out of the same environment, running full FAPI-BR flows with production-calibrated latency. Not paper throughput: capacity that is deployed today.

52k
req/s sustained
Stable throughput under realistic load.
0.001%
error rate under realistic load
8 errors in 775,344 requests under the production-calibrated scenario.
1,200×
the largest OF initiator's peak
Throughput multiple over the ecosystem's largest payment initiator during its peak week (~43 TPS vs ~52,000 req/s).
500 s
to absorb that peak
Time to process the 26.06M requests from the largest initiator's peak week.
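The headline figures above are simple arithmetic over the raw counts in the report; a quick sanity check:

```javascript
// Sanity-check the headline numbers from the raw counts in the report.
const totalRequests = 775_344;      // requests in the production-calibrated scenario
const errors = 8;                   // failed requests observed
const throughput = 52_000;          // sustained req/s
const peakWeekVolume = 26_060_000;  // largest initiator's peak-week requests

const errorRate = errors / totalRequests;           // ≈ 0.00001, i.e. ~0.001%
const absorbSeconds = peakWeekVolume / throughput;  // ≈ 501 s, the "500 s" stat

console.log((errorRate * 100).toFixed(3) + '%');  // "0.001%"
console.log(Math.round(absorbSeconds) + ' s');    // "501 s"
```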

Comparison

How much load is this, really?

Based on internal analysis of public Dashboard do Cidadão data, with a pessimistic ~47% buffer over the observed weekly peak. Cumbuca, as configured, absorbs almost twice the business-hour average of the entire ecosystem.

Cumbuca · tested · Proxy as configured today
~52,000 req/s
OF Brazil · business-hour · Effective average of the entire ecosystem
~27,000 TPS
OF Brazil · flat 24×7 · Total weekly volume ÷ full week
~16,500 TPS
Largest OF initiator · peak week · 26.06 million requests in the week
~43 TPS
What this means
The infrastructure as deployed could, in principle, absorb the regular load of every institution in the Open Finance ecosystem simultaneously. Serving a single large client leaves substantial headroom before any architectural rescale.
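Both comparison multiples fall out of simple ratios over the figures above; a quick check:

```javascript
// Ratios behind the comparison: headroom over the ecosystem average and
// the multiple over the largest initiator. Inputs are the report's own figures.
const proxy = 52_000;        // req/s, Cumbuca as configured
const ecosystem = 27_000;    // TPS, business-hour ecosystem average
const largestInitiator = 43; // TPS, largest initiator's peak week

console.log((proxy / ecosystem).toFixed(2) + '×');        // "1.93×" — "almost twice"
console.log(Math.round(proxy / largestInitiator) + '×');  // "1209×" — the ~1,200× stat
```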
500 seconds

The time to process the peak week of the largest payment initiator in Open Finance.

The ecosystem's largest initiator generated 26.06 million requests during its peak week in the reference period. At the throughput our infrastructure delivers today, that entire volume clears in under nine minutes of operation.

The 52k req/s ceiling belongs to the current configuration, not to the architecture. The layer that saturates under sustained load is the load balancer's mTLS termination, not the proxy application. The path to more capacity is already defined: load balancer sharding, rolled out incrementally, without changing a line of the application.

How we measured

Full FAPI-BR flows. No stripped-down synthetic traffic.

Every virtual user runs the whole protocol: payment consent, PAR, authorise, client credentials, auth code, PIX initiation, polling. Real PKCE, real consent IDs, properly-scoped Bearer tokens.

Flows

The whole protocol, end to end

The load generator runs the payment flow and data flow in full, with PKCE, consent and polling. The simulator doesn't expose shortcut routes. The only way to receive a 200 is to walk through the correct protocol.
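The no-shortcuts rule can be pictured as an ordered prerequisite chain; a toy model (step names here paraphrase the flow above and are illustrative, not the simulator's actual routes):

```javascript
// Toy model of the no-shortcuts rule: a 200 is only possible when every
// earlier protocol step has already been completed, in order.
// Step names are illustrative, not the simulator's real route names.
const FLOW = [
  'payment-consent', 'par', 'authorize', 'client-credentials',
  'auth-code-exchange', 'pix-initiation', 'status-polling',
];

// Status a step would receive, given the set of steps completed so far.
function statusFor(step, completed) {
  const idx = FLOW.indexOf(step);
  const prerequisitesMet = FLOW.slice(0, idx).every(s => completed.includes(s));
  return prerequisitesMet ? 200 : 403;
}

console.log(statusFor('pix-initiation', []));        // 403 — can't skip ahead
console.log(statusFor('par', ['payment-consent']));  // 200 — prerequisites done
```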

Latency

Calibrated from real telemetry

The simulator responds with a piecewise-linear distribution anchored to the quantiles we see in production: p50 at 130 ms, p90 at 350 ms, p99 at 1,350 ms. The reported number is under realistic load, not a zero-latency baseline.
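A distribution like this can be reproduced by inverse-transform sampling over a piecewise-linear quantile function; a minimal sketch, assuming linear interpolation between the anchored quantiles (the 0 ms floor and 2,000 ms tail cap are illustrative assumptions, not figures from the report):

```javascript
// Piecewise-linear quantile function anchored at the production quantiles.
// The (0, 0) floor and (1.0, 2000) tail cap are illustrative assumptions.
const anchors = [
  [0.0, 0], [0.5, 130], [0.9, 350], [0.99, 1350], [1.0, 2000],
];

// Map a uniform draw u ∈ [0, 1] to a latency in milliseconds.
function latencyMs(u) {
  for (let i = 1; i < anchors.length; i++) {
    const [p0, l0] = anchors[i - 1];
    const [p1, l1] = anchors[i];
    if (u <= p1) return l0 + ((u - p0) / (p1 - p0)) * (l1 - l0);
  }
  return anchors[anchors.length - 1][1];
}

// Per simulated response: latencyMs(Math.random())
console.log(latencyMs(0.5));   // 130 — the anchored p50
console.log(latencyMs(0.9));   // 350 — p90
console.log(latencyMs(0.99));  // 1350 — p99
```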

Infra

120-instance proxy pool

120 m8i.large behind a managed ALB with mTLS termination. 16 m8i.large run the Open Finance simulator. Two c5.9xlarge run k6 in parallel as the load generator.

  • 240 aggregate vCPU on the proxy
  • Managed ALB · round-robin
  • k6 v1.7.1 · constant-arrival-rate
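The constant-arrival-rate executor drives load open-loop, injecting requests at a fixed rate regardless of response latency. A k6 options fragment in this shape would reproduce the pattern (the rate, duration, and VU pool sizes below are illustrative, not the exact values from the test):

```javascript
// k6 scenario using the constant-arrival-rate executor: iterations start at a
// fixed rate whether or not earlier responses have returned (open-loop load).
// Numbers are illustrative, not the report's exact configuration.
export const options = {
  scenarios: {
    fapi_br_flows: {
      executor: 'constant-arrival-rate',
      rate: 52_000,            // iterations started per timeUnit
      timeUnit: '1s',
      duration: '15m',
      preAllocatedVUs: 20_000, // VU pool sized for worst-case latency
      maxVUs: 40_000,
    },
  },
};
```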

The ceiling and the path beyond

Every system has a ceiling. Ours is known, and so is the way out.

Where it stops · 52,000 req/s

mTLS saturation at the load balancer

Around 52k req/s, failures show up that horizontal proxy scaling can't resolve. The cause is the load balancer's cryptographic handshake capacity. The bottleneck lives at the edge, not in the application.

  • 502 when the LB can't accept new connections
  • 401 when the handshake times out
  • Proxy itself has headroom, with no saturation signal

Where it goes · linear sharding

Load balancer sharding

The architectural path is pre-planned: multiple independent load balancers behind an L4 distributor. Each additional shard adds mTLS termination capacity linearly.

  1. Linear scale: n shards, n× capacity
  2. Zero change to the proxy application
  3. Incremental rollout, no service interruption
  4. Transparent to the end client
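Under the linear-scaling assumption, capacity planning reduces to a one-liner; a sketch using the 52k req/s per-shard ceiling measured above (the 150k target is an illustrative example):

```javascript
// Shards needed for a target throughput, assuming each independent
// load balancer shard terminates ~52k req/s of mTLS traffic.
const PER_SHARD_CEILING = 52_000; // req/s, measured ceiling of one LB

function shardsFor(targetReqPerSec) {
  return Math.ceil(targetReqPerSec / PER_SHARD_CEILING);
}

console.log(shardsFor(52_000));   // 1 — today's configuration
console.log(shardsFor(150_000));  // 3 — e.g. roughly 3× today's ceiling
```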

Full report

The full report, with the methodology and the ceiling.

English text, 38 pages, Confidential classification. Includes test topology, latency distribution, status codes, the mTLS bottleneck analysis, comparison against the entire ecosystem, and the sharding roadmap.

Drop your email and company. An engineer on the team delivers the PDF within one business day.

In the report
  • Test methodology and topology · §2, §3
  • Results, status codes, latency · §4
  • mTLS bottleneck analysis and ecosystem comparison · §5
  • Sharding roadmap · §6
  • Ecosystem estimate and sensitivity · appendix E
Email me the report · < 1 business day

By submitting, you agree to our Privacy Policy. Your data goes to our CRM (HubSpot) and is used to deliver the report and respond to this request.

Next step

Want to run your own benchmark?

We give access to a sandbox that mirrors production. You run your own k6 suite against real infrastructure. Engineer in the loop from the first email.

Talk to engineering
mTLS · FAPI-BR
Full compliance, including PAR and PKCE
Regulation
Payment Institution regulated by BACEN
Stack
Erlang/OTP · self-healing · hot upgrades
Dedication
Isolated proxy instances per partner in production