FastAPI vs. Fastify vs. Spring Boot vs. Gin Benchmark
In a previous article, I benchmarked FastAPI, Express.js, Flask, and Nest.js to verify FastAPI’s claims of being on par with Node.js. In this article, I am pitting the champion, FastAPI, against a new set of faster competitors. For each framework, I created an API endpoint that returns 100 rows of data from a PostgreSQL database as JSON.
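All the endpoints follow the same shape. As a rough sketch, the FastAPI + asyncpg variant looks something like this (the table name and connection string here are illustrative, not the exact code from the repo):

```python
# main.py -- minimal sketch of the benchmarked endpoint (FastAPI + asyncpg).
import asyncpg
from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def startup() -> None:
    # One shared connection pool for the life of the process.
    app.state.pool = await asyncpg.create_pool(
        "postgresql://postgres:postgres@localhost/benchmark"
    )

@app.get("/")
async def get_posts():
    async with app.state.pool.acquire() as conn:
        rows = await conn.fetch("SELECT * FROM posts LIMIT 100")
    # asyncpg Records are not directly JSON-serializable; convert to dicts.
    return [dict(r) for r in rows]
```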
The code for this benchmark can be found here:
https://github.com/travisluong/python-vs-nodejs-benchmark
Disclaimer: I am not a benchmarking expert. This was simply a random experiment I did out of curiosity. This is for entertainment purposes only.
Here are the results. All tests were run with wrk, using 2 threads and 10 connections for 10 seconds:
FastAPI + asyncpg + orjson + gunicorn
Running 10s test @ http://localhost:8000/orjson
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.29ms 0.93ms 10.28ms 55.43%
Req/Sec 2.19k 568.66 3.25k 60.50%
43575 requests in 10.01s, 333.28MB read
Requests/sec: 4355.30
Transfer/sec: 33.31MB
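For context, the orjson variant swaps FastAPI’s default JSON encoder for orjson through the response class. A minimal sketch, continuing from the app and pool defined above:

```python
# The /orjson endpoint serves the same rows, serialized with orjson.
from fastapi.responses import ORJSONResponse

@app.get("/orjson", response_class=ORJSONResponse)
async def get_posts_orjson():
    async with app.state.pool.acquire() as conn:
        rows = await conn.fetch("SELECT * FROM posts LIMIT 100")
    # ORJSONResponse serializes with orjson, noticeably faster than stdlib json.
    return [dict(r) for r in rows]
```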
Fastify + pg
Running 10s test @ http://localhost:3000
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 4.62ms 1.49ms 20.94ms 87.98%
Req/Sec 1.10k 165.28 1.31k 76.50%
21860 requests in 10.01s, 172.30MB read
Requests/sec: 2184.83
Transfer/sec: 17.22MB
Spring Boot + JDBC
Running 10s test @ http://localhost:8080
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.37ms 1.95ms 73.33ms 98.92%
Req/Sec 3.98k 361.02 5.78k 76.12%
79653 requests in 10.10s, 609.25MB read
Requests/sec: 7886.63
Transfer/sec: 60.32MB
Spring Boot + JPA
Running 10s test @ http://localhost:8080/posts
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 15.52ms 17.42ms 134.96ms 90.44%
Req/Sec 424.58 117.08 737.00 75.50%
8473 requests in 10.03s, 55.25MB read
Requests/sec: 844.82
Transfer/sec: 5.51MB
Gin + database/sql + lib/pq
Running 10s test @ http://localhost:8080/loadtest
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.31ms 5.76ms 33.29ms 80.44%
Req/Sec 1.49k 209.14 2.00k 68.50%
29687 requests in 10.01s, 182.53MB read
Requests/sec: 2966.86
Transfer/sec: 18.24MB
The Rankings
These include benchmarks from part 1 of this article.
- Spring Boot + JDBC (7886 req/sec)
- Gin + pgx (7517 req/sec)
- Gin + pg + SetMaxOpenConns + SetMaxIdleConns (7388 req/sec)
- FastAPI + asyncpg + ujson + gunicorn 8w (4831 req/sec)
- Fastify + pg + cluster mode 8w (without logging) (4622 req/sec)
- FastAPI + asyncpg + ujson + gunicorn 4w (4401 req/sec)
- FastAPI + asyncpg + gunicorn 4w + orjson (4193 req/sec)
- Express.js + pg + cluster mode 8w (4145 req/sec)
- Fastify + pg + cluster mode 8w (3417 req/sec)
- Gin + database/sql + lib/pq (2966 req/sec)
- Fastify + pg (without logging) (2750 req/sec)
- Fastify + pg (2184 req/sec)
- Express.js + pg (1931 req/sec)
- FastAPI + asyncpg + uvicorn + orjson (1885 req/sec)
- FastAPI + asyncpg + uvicorn + ujson (1711 req/sec)
- Flask + psycopg2 + gunicorn 4w (1478 req/sec)
- Nest.js + Prisma (1184 req/sec)
- FastAPI + psycopg2 + gunicorn 4w (989 req/sec)
- FastAPI + asyncpg + gunicorn 4w (952 req/sec)
- Spring Boot + JPA (844 req/sec)
- FastAPI + psycopg2 + uvicorn + orjson (827 req/sec)
- Flask + psycopg2 + flask run (705 req/sec)
- FastAPI + SQLModel + gunicorn 4w (569 req/sec)
- Flask + psycopg2 + gunicorn 1w (536 req/sec)
- FastAPI + asyncpg + uvicorn (314 req/sec)
- FastAPI + psycopg2 + uvicorn (308 req/sec)
- FastAPI + databases + uvicorn (267 req/sec)
- FastAPI + SQLModel + uvicorn (182 req/sec)
Update 1/18/22
I realized there is another huge flaw in my benchmark: I was running Express.js and Fastify with a single process but FastAPI with 4 workers. Obviously, that isn’t a fair comparison, so I reran the tests with 8 workers each to fully utilize the CPU on my quad-core MacBook Pro.
If there’s a flaw in the code or benchmark, please suggest improvements in the comments.
Express.js + pg + cluster mode 8 workers
Running 10s test @ http://localhost:3000
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 16.52ms 30.81ms 187.78ms 85.78%
Req/Sec 2.09k 569.71 4.07k 70.15%
41873 requests in 10.10s, 332.72MB read
Requests/sec: 4145.64
Transfer/sec: 32.94MB
Fastify + pg + cluster mode 8 workers
Running 10s test @ http://localhost:3000
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 17.62ms 33.50ms 253.90ms 86.50%
Req/Sec 1.72k 425.97 2.97k 70.50%
34186 requests in 10.00s, 269.46MB read
Requests/sec: 3417.27
Transfer/sec: 26.94MB
FastAPI + asyncpg + gunicorn 8 workers + ujson
Running 10s test @ http://localhost:8000/ujson
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.07ms 542.26us 9.73ms 73.07%
Req/Sec 2.43k 278.47 2.99k 78.00%
48324 requests in 10.00s, 369.60MB read
Requests/sec: 4831.50
Transfer/sec: 36.95MB
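For reference, the FastAPI runs here use gunicorn with 8 workers and uvicorn’s worker class. Expressed as a gunicorn.conf.py, the setup looks roughly like this:

```python
# gunicorn.conf.py -- sketch of the 8-worker FastAPI setup.
# Equivalent to: gunicorn main:app -w 8 -k uvicorn.workers.UvicornWorker
workers = 8  # one per logical core on a quad-core machine with hyperthreading
worker_class = "uvicorn.workers.UvicornWorker"  # ASGI worker, required for FastAPI
bind = "0.0.0.0:8000"
```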
Update 2/6/22
Fastify + pg (without logging)
Running 10s test @ http://localhost:3000
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 3.63ms 1.08ms 17.63ms 86.29%
Req/Sec 1.39k 181.96 2.58k 82.09%
27788 requests in 10.10s, 219.03MB read
Requests/sec: 2750.63
Transfer/sec: 21.68MB
Fastify + pg + cluster mode 8 workers (without logging)
Running 10s test @ http://localhost:3000
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 15.87ms 30.37ms 170.58ms 86.04%
Req/Sec 2.32k 833.78 5.10k 72.50%
46246 requests in 10.00s, 364.52MB read
Requests/sec: 4622.35
Transfer/sec: 36.43MB
Update 2/8/22
Gin + pgx
Running 10s test @ http://localhost:8080/loadtest
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.47ms 1.70ms 34.71ms 97.38%
Req/Sec 3.78k 817.41 5.00k 74.75%
75941 requests in 10.10s, 466.91MB read
Requests/sec: 7517.37
Transfer/sec: 46.22MB
Update 2/20/22
Thanks to user abenz for pointing out another flaw in the Go benchmark: the connection pool limits had been left at their defaults. I have updated the results below.
Gin + pg
With pg.SetMaxOpenConns(10000) and pg.SetMaxIdleConns(5000):
Running 10s test @ http://localhost:8080/loadtest
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.36ms 586.06us 21.15ms 83.11%
Req/Sec 3.71k 451.06 4.70k 72.28%
74626 requests in 10.10s, 458.83MB read
Requests/sec: 7388.91
Transfer/sec: 45.43MB
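For what it’s worth, the same knob exists on the Python side: asyncpg’s pool takes explicit size bounds. A minimal sketch (the DSN and sizes are illustrative):

```python
# The asyncpg analog of Go's SetMaxOpenConns/SetMaxIdleConns is pool sizing.
import asyncpg

async def make_pool() -> asyncpg.Pool:
    return await asyncpg.create_pool(
        "postgresql://postgres:postgres@localhost/benchmark",
        min_size=10,  # connections kept warm while idle
        max_size=10,  # upper bound on concurrent connections
    )
```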
Conclusion
I initially did these benchmarks to verify FastAPI’s claims of being on par with Node.js. In the process, I also wanted to figure out why I wasn’t getting great performance out of FastAPI in a typical scenario with a database request. Here are a few things I learned from these benchmarks:
- FastAPI is not fast out of the box. You have to use an async database driver such as asyncpg to take full advantage of FastAPI’s speed.
- Even with asyncpg, you still need a faster JSON library, such as ujson or orjson, to push performance up to Node.js levels.
- Serializing raw SQL query results straight to JSON is significantly faster than going through an ORM, which makes sense, since you skip the object-mapping step (see the sketch after this list).
- I’ve always heard that compiled languages are faster than interpreted ones, but I had never verified it myself. Java and Go were indeed faster than comparable setups in interpreted languages.
- Node.js has a cluster module that lets you launch a cluster of Node.js processes to take advantage of multi-core systems.
- Logging affects performance: disabling Fastify’s logger alone raised throughput from roughly 2,184 to 2,750 req/sec.
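To make the ORM point concrete, here is a rough side-by-side in the spirit of the benchmark code (the Post model and column names are hypothetical, not the repo’s exact schema):

```python
# Raw rows to JSON vs. an ORM round trip.
from typing import Optional

from sqlmodel import Field, Session, SQLModel, select

# Raw path (fast): driver records go straight to the serializer.
async def posts_raw(conn):  # conn is an asyncpg connection
    rows = await conn.fetch("SELECT id, title, body FROM posts LIMIT 100")
    return [dict(r) for r in rows]

# ORM path (slower): every row is first materialized and validated as a
# full model instance, then serialized back out again.
class Post(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    title: str
    body: str

def posts_orm(session: Session):
    posts = session.exec(select(Post).limit(100)).all()
    return [p.dict() for p in posts]
```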
A “maxed-out” FastAPI configuration vs. a “maxed-out” Express.js configuration seems to produce similar results. I’ve included a link to the code above. Let me know if there’s anything that can improve this benchmark.
Thanks for reading.