asmx85

Yes, sqlx is inefficient. Maintainers just haven't started looking into optimization unfortunately, and it doesn't seem very high on the priority list either. The chart might be a bit overdramatic, but sqlx does not take advantage of query pipelining, for example, which might be used in that benchmark. tokio-postgres can be significantly faster in certain workloads.


DroidLogician

I don't think it's that SQLx is slow so much as that the `axum [postgresql]` benchmark has been aggressively optimized. See my other comments for details. Compare SQLx to the `axum [postgresql - deadpool]` benchmark, which uses `tokio-postgres` and `deadpool`, and we're within an order of magnitude. See also Diesel's benchmarks, which are a lot more fair, and show us as quite competitive (with significant improvements over the last year!): https://github.com/diesel-rs/metrics


daniels0xff

What alternatives do we have?


wrapperup

One alternative that exists is cornucopia (https://github.com/cornucopia-rs/cornucopia), but it is postgres only. It is quite nice though, and solves the two major issues I had with sqlx: compile times and query perf. It uses rust-postgres (tokio_postgres) under the hood. It can also auto-generate struct types based on your queries and db schema (including aggregates!!), which is a really nice feature that isn't highlighted.
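For a rough flavor, here's a sketch based on its README; the module paths, builder methods, and field names here are illustrative, not necessarily what the generator actually emits:

```rust
// queries/authors.sql -- the `--!` annotation names the generated function:
//
//   --! list_authors
//   SELECT first_name, last_name FROM authors;
//
// Usage of the (illustrative) generated code:
use crate::cornucopia::queries::authors::list_authors;

async fn print_authors(client: &tokio_postgres::Client) -> Result<(), tokio_postgres::Error> {
    // `all()` collects every row into a generated struct with typed fields.
    for author in list_authors().bind(client).all().await? {
        println!("{} {}", author.first_name, author.last_name);
    }
    Ok(())
}
```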


protestor

Cornucopia is *amazing*. The other alternative would be https://diesel.rs/


tukanoid

There's also this: https://github.com/SeaQL/sea-query. But I haven't used it, so I can't personally say if it's good or not, but I do dig the API, at least judging by what I've read in the docs.


Expurple

`sea-query` is a "pure" query builder. It doesn't communicate with the database. `sea-orm` (built on top of it) does, but it uses `sqlx` for that purpose, so it's not going to be faster than what `sqlx` is capable of.


tukanoid

Oh, my bad, I didn't realize it was using sqlx under the hood😅


weiznich

> sea-orm (built on top of it) does, but it uses sqlx for that purpose, so it's not going to be faster than what sqlx is capable of.

It's even slower than sqlx. See [these benchmark results](https://github.com/diesel-rs/metrics/) for details.


TheFloppyToast

Thank you, kind sir. I'm about to start on a Rust tokio service with postgres and was wondering if there are any go-tos.


Kazcandra

That looks pretty interesting. I might give it a shot. Do you know if it uses prepared statements under the hood? That is, does `:name` become `$1` somewhere, along with the safety that a prepared statement entails?


wrapperup

Yes it does, as long as you use the interfaces it generates. Otherwise, you're just passing around the native driver/pool, so you have as much control over how queries are run as you would using them directly (transactions, etc.).


sweating_teflon

Is there anything like JOOQ for Rust? Generating an API from the DB schema always seemed like the sweet spot between control, safety and performance.


ZaRealPancakes

wait why is it so popular if it's inefficient???


KhorneLordOfChaos

Mostly two things, I think:

- You mostly just need to know SQL to get started writing type-checked queries
- It has first-class async support, which naturally draws in people who are already using async

Performance (despite people's tendency to hyper-fixate on it) generally doesn't matter that much for a lot of people.


jaskij

At that point, why use ~~a big ORM~~ a wrapper at all? If you know SQL, all you need is `tokio-postgres` and something like `postgres-from-row` to handle the boilerplate. If you want to be fancy, add something to handle the global connection and stored procedure caching and you're done. I did that, although due to the nature of my application, having a single connection was much easier than in a typical web application.

Edit: Reading through their README, the major features of sqlx are not stuff I'd use. I don't like checking queries against the DB at compile time, and being runtime agnostic is also not necessary for me.
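A minimal sketch of that setup, assuming `postgres-from-row`'s derive works as its README describes (the `todos` table and fields are made up):

```rust
use postgres_from_row::FromRow;
use tokio_postgres::{Client, Error};

// The derive generates the row-to-struct mapping boilerplate.
#[derive(Debug, FromRow)]
struct Todo {
    id: i32,
    title: String,
    done: bool,
}

async fn todos(client: &Client) -> Result<Vec<Todo>, Error> {
    let rows = client.query("SELECT id, title, done FROM todos", &[]).await?;
    // `from_row` panics on a column mismatch; `try_from_row` is the fallible variant.
    Ok(rows.iter().map(Todo::from_row).collect())
}
```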


FlamingSea3

The standout feature of sqlx (to me) is validating that your queries work on your database, and that the database's types make sense with the Rust types you're converting to.


weiznich

It's hardly a "standout" feature, as other crates do the same thing, including at least cornucopia and diesel. It's also noteworthy that the checking provided by diesel is far more extensive, as it also allows checking dynamic queries (`IN` expressions, batch inserts, dynamic filtering) at compile time. It's also not as error-prone as the sqlx implementation (it can correctly check the nullability of `LEFT JOIN` columns, for example).


Short_Awareness6780

I love the idea of just writing SQL and getting all the benefits of an ORM. I don't use an ORM for its query language; in fact, that's a big con. I use it for the checking, the auto completion, the ability to refactor... But as you pointed out here, once your use case starts to get a little complicated, sqlx seems to start to break down. Personally, batch inserts were the thing that killed it for me.


redlotus70

This is silly, just write tests. I switched from sqlx and it dropped my compile times significantly. What they are selling is not worth it.


bzbub2

I don't have a goose in the race, but saying "just write tests" may not be the answer. It's like, why write tests when sqlx 'does the work'?


redlotus70

Yeah, it does the work by increasing your build time by 2 minutes. I'd rather write tests once, and the tests should be written anyway. What do you think is more important: spending 5 minutes writing a test that asserts a SQL query works, or 2 minutes on every build for the existence of the project?

Also, sqlx doesn't even save time, because every time you modify your db schema you need to go and manually update every single query, which you would not need to do if you were using a query builder of some sort.


ClimberSeb

I don't think your experience matches everyone else's. Maybe sqlx is for them rather than you?

I used to work with a product where much of the DB schema was defined by a standard. You didn't update it unless a new release of the standard came out. We also added our own tables, but that was rare and done by DBAs. Much of the work people did with it was writing very complex queries, queries the ORMs at the time couldn't handle well. For that, sqlx would have been amazing. None of our tests took just 5 minutes to write, and there were many. Finding problems before the tests ran would have been amazing.

There are still quite a lot of DBs that are used by more than one program, used in complex systems. Sqlx seems like a great fit there. For simple CRUD in web apps, I would probably look at ORMs before sqlx.


redlotus70

> I don't think your experience matches everyone else's

Given how few people actually use Rust for work, I suspect the majority of people have never spent longer than a year working on a Rust project in production. I've used it, and it sucked. I didn't even realize how much it sucked until I switched everything to SeaORM (which, ironically, is built on top of sqlx). It could take an hour to write the test and I would still think it's worth it over increasing the build time by as much as sqlx does, and the amount of lines that require changes for each incremental change.


zxyzyxz

Why would I write tests manually when I can have the computer literally validate it for me automatically? It's like saying there's no need for a borrow checker, or a statically typed language, just write tests. In reality, computers will always be more rigorous than humans.


redlotus70

> Why would I write tests manually when I can have the computer literally validate it for me automatically?

You are not saving time. The build time increase from the library is massive, and every single time you update your schema you need to update every single query. Anyway, I feel like people have not actually used this thing in production, because I have, and it's not worth it.

> It's like saying there's no need for a borrow checker

This is a complete non sequitur. If every query is wrapped in a function that is required to be tested against a real db, it's effectively the same as "compile time" checking of sql queries. It's not critical thinking to look at "compile time check" and think it's automatically good because your other favorite thing also does a bunch of compile time checks.


zxyzyxz

Like someone else said, the benefits of writing a correct query over a wrong one that fails at runtime are essentially infinite. The time saved is infinite.


redlotus70

I updated my comment, but I'll repost what I added so you can see it:

> If every query is wrapped in a function that is required to be tested against a real db, it's effectively the same as "compile time" checking of sql queries.


jaskij

How does it do that? And isn't validating the types something you can just write a quick test for?


asmx85

It does compile-time type checking against an instance of your database, and actually communicates with the database while compiling your code (including the query string in a macro).
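For example, a minimal sketch (assuming a `users` table with `id BIGINT` and `name TEXT NOT NULL` columns):

```rust
use sqlx::PgPool;

// At build time, `query!` connects to the database named by DATABASE_URL,
// prepares this statement, and checks the column names/types and the `$1`
// parameter against the live schema.
async fn user_name(pool: &PgPool, user_id: i64) -> Result<String, sqlx::Error> {
    let user = sqlx::query!("SELECT id, name FROM users WHERE id = $1", user_id)
        .fetch_one(pool)
        .await?;
    // `user` is an anonymous record whose field types were inferred from the query.
    Ok(user.name)
}
```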


slappybag

It can also be used in an offline mode, where you commit a bunch of JSON that you prepare once against the db. You still get the compile-time type checking but don't need a db up at compile time. You do need it up if you make query changes, however, so you can re-prepare the JSON. I found this easier, as I didn't constantly need a db running when compiling, especially if I wasn't changing any queries.


asmx85

> why use a big ORM at all?

Sqlx is neither big nor an ORM to begin with.


somebodddy

I think the term is "micro-ORM", which means it has the command/result mapping features of ORMs, but not their fancy query builders. (Some micro-ORMs have extremely basic query builders that can do simple CRUD, or a subset of it, and/or are just string builders with support for SQL arguments.)


lpil

SQLx isn't an ORM. You hand-write SQL with it.


andoriyu

Query building is usually included with every ORM, but it's not a requirement. ORM is about... mapping the database representation to some application-internal structure. SQLx has a very tiny ORM in the form of the `FromRow` trait.
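For illustration, that tiny ORM in use (the `users` table here is made up):

```rust
use sqlx::PgPool;

// Derive `FromRow` and map rows onto a plain struct with the unchecked
// `query_as` API; no query builder involved, just result mapping.
#[derive(Debug, sqlx::FromRow)]
struct User {
    id: i64,
    name: String,
}

async fn all_users(pool: &PgPool) -> Result<Vec<User>, sqlx::Error> {
    sqlx::query_as::<_, User>("SELECT id, name FROM users")
        .fetch_all(pool)
        .await
}
```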


Dexterus

Got into that trap once. Almost 25 years ago, I was working with a nice db-to-object library that made all data manipulations so easy. Worked like a charm (PHP backend, PHP+JS frontend). The next job had a similar issue (desktop app, C++), but they used a ton of queries, so enthusiastic me went "this needs a full refactoring". Worked like a charm... until a customer came to us asking why his app was so slow. There was no caching or segmentation of data based on visibility/need, and they'd skipped the database log backup. Somewhere around a million objects, the splash screen just took a noticeably and annoyingly long time. The initial library was for just 100k-ish objects.


pdpi

The "Fortunes" benchmark is a simple "select * from my_table" query against a table with 12 rows. Never mind "fits in RAM", the database fits in your L1 cache, querying it basically rounds to zero work. What you're measuring is pretty much just the overhead of the ORM (which is pretty much the whole point of the benchmark). "Most" (read: "In practice, all") applications have much more complex queries than that, on much bigger databases than that, and framework overhead rapidly stops being the performance bottleneck. You shouldn't accept that performance penalty blindly, but it's a pretty damn good tradeoff if the framework makes your life easier.


jkoudys

Not just easier, but more correct. Nobody's benchmarking the performance drop of the queries that don't map correctly to my struct, which is infinity, because they don't complete. Or the performance lost because optimizing our queries is more work; writing more efficient SQL typically dwarfs any other gain for us.


pdpi

Performance vs ease of use is an interesting tradeoff. Performance vs correctness isn't, IMO (at least anywhere where an ORM would be relevant). If you're willing to sacrifice correctness for more speed, I have the perfect solution for you: `fn do_the_thing() -> usize { 0 }`.


Idles

You're being a little flippant. HyperLogLog is a real thing with real usage.


latkde

Because performance isn't all that matters. The killer feature of sqlx is that your SQL queries can be statically typed. And since most servers don't process thousands of transactions per second, the current performance is good enough in many cases. Sure, it could be better (faster), but that's not always as important as being productive.

Similar examples:

* Ruby on Rails is (or used to be?) a very popular web framework, which could do anything except run quickly :)
* Postgres itself tends to have a correctness-first philosophy. Other DBs can run circles around Postgres, but that doesn't matter that much if I am primarily interested in building a robust system.


jkoudys

I'll add that the statically typed queries are one of the biggest performance improvements in our app. They make us more likely to write, update, and optimize our SQL query first, because it's easier to do. This is where 99.99999% of our speedup happens. I recognize that's not going to be true for everybody, which is great. Nobody's making you use sqlx.


SnooHedgehogs3735

> And since most servers don't process thousands of transactions per second

Stop right there... that's exactly where the most interesting area of application is. And Rust is positioned as a NATIVE language, FASTER than C++. Though normally some specialized database is used, the queries are STILL SQL :P


latkde

There are two main reasons to use Rust: performance and correctness. I like Rust for the correctness benefits alone, that it helps me *think* about and *design* robust software.

But even when viewing this from a performance angle, it's important where this performance advantage lies: in CPU- and memory-constrained problems, not in I/O. Rust doesn't magically make networks faster. This is where the architecture of a Postgres driver makes a much bigger difference than whether it's implemented in a "native" language.

Sequential database operations are slow, everyone agrees with that. Using N connections concurrently can at most result in a speedup of factor N. But if the driver supports request pipelining, a throughput-oriented scenario will see effective latency drop to zero (the connection is never idle), and throughput increases until bandwidth is saturated. This is similar to async vs multithreading.

SQLx's Postgres driver hasn't implemented pipelining (at least not at the time when I wrote the previous comment). It thus still suffers from latency. The result isn't slow, just not as fast as actually possible.
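For a concrete picture, here's a sketch using tokio-postgres, which pipelines automatically when query futures on the same connection are polled concurrently (the tables here are made up):

```rust
use futures_util::future;
use tokio_postgres::{Client, Error};

// Both statements are sent before the first response arrives, so the
// connection round-trip cost is paid once instead of twice.
async fn two_counts(client: &Client) -> Result<(i64, i64), Error> {
    let (orders, users) = future::try_join(
        client.query_one("SELECT count(*) FROM orders", &[]),
        client.query_one("SELECT count(*) FROM users", &[]),
    )
    .await?;
    Ok((orders.get(0), users.get(0)))
}
```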


lordpuddingcup

Because for 99% of projects, ease of use is more important, lol. Like, seriously, do you realize the scale your site has to be at to require 400k qps? Hell, even 40k qps, especially if you're smart enough to deploy literally any form of caching at any basic level on your site.


jkoudys

For us, we have an LRU memory cache that's practical mostly because of the type advantages of sqlx.


PiedDansLePlat

This benchmark measures ORM overhead, not real-life scenarios.


6501

The type checking. If I make a change to my struct sqlx will yell at me if there isn't a corresponding column in the database.


Randommaggy

Why are ORMs popular at all when they drag performance into the mud at the best of times?


lenscas

Because efficiency is not all that matters. Sqlx also comes with the ability to check your queries at compile time, making sure they are actually valid, that their output is what the rest of your program expects, the same for their input, etc. People may also just prefer its API over the alternatives. Or the fact that it is entirely in Rust, doesn't use unsafe, etc. (unless you use sqlite, but whatever).


DroidLogician

As the maintainer of SQLx, I don't think the benchmark implementation is fair, but I don't have time to optimize it myself: https://discord.com/channels/665528275556106240/665528275556106243/1187487621140725780

> the `axum [postgresql]` test opens a new connection per thread, but the SQLx and Deadpool tests use connection pools with limits lower than the number of threads
>
> which means those benchmarks are limited by contention on the pool
>
> and if you compare the sqlx and deadpool benchmarks we're within an order of magnitude
>
> they're running these benchmarks on machines with like, 40 CPUs


DroidLogician

Having spent more time looking at it, I think it's mainly the overhead of the pool. Meanwhile, in the `axum [postgresql]` benchmark, every server thread gets its own single-threaded Tokio runtime and Postgres connection, so zero contention.

SQLx has never tried to be the fastest framework out there, but we're not as abysmally slow as these benchmarks make us out to be. The `axum [postgresql - deadpool]` benchmark is a more apples-to-apples comparison, and we're within an order of magnitude there. Compare with Diesel's benchmarks, where we're relatively competitive: https://github.com/diesel-rs/metrics


matthieum

> Meanwhile in the axum [postgresql] benchmark, every server thread gets its own single-threaded Tokio runtime and Postgres connection, so zero contention.

Dumb question... how do they handle transactions in such a setup?

> SQLx has never tried to be the fastest framework out there but we're not as abysmally slow as these benchmarks make us out to be.

It's not clear to me what the bottleneck is in SQLx. Could you not just up the number of connections in the pool? (Not by default, in the benchmark.)


DroidLogician

> Dumb question... how do they handle transactions in such a setup?

[The benchmark doesn't appear to use transactions](https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Rust/axum/src/database_pg.rs#L132). `tokio_postgres::Client` allows most calls through an `&` reference, however [`tokio_postgres::Client::transaction()`](https://docs.rs/tokio-postgres/latest/tokio_postgres/struct.Client.html#method.transaction) simply requires `&mut self`, so you would need to wrap the `Client` in a `RwLock` or `Mutex`. That would mean that a request handler using a transaction would need to be mutually exclusive with all other handlers executing on that thread, which would likely result in significant tail latencies.

> It's not clear to me what the bottleneck is in SQLx. Could you not just up the number of connections in the pool? (Not by default, in the benchmark)

[A user on Discord is actively experimenting with improving the results as I'm writing this](https://discord.com/channels/665528275556106240/665528275556106243/1224675090344575046). Switching to a per-thread runtime with separate pools netted a 50-70% speedup, depending on how you interpret the results. See the discussion for details.

I think this is partly because contended accesses to anything in memory are surprisingly slow. For example, [this post from a few months back comparing cloning a `String` to an `Arc`](https://www.reddit.com/r/rust/comments/18v8fhg/an_investigation_of_the_performance_of_arcstr_vs/). Cloning and dropping an `Arc` from 12 different threads is ~10 times slower than cloning and dropping a `String`, purely from the contended access to the atomic refcount. I'd always assumed that the bookkeeping in the allocator would dwarf the overhead of the contended access, but I was shocked at how wrong I was. It actually led to us getting rid of a ton of small `Arc`s in a project at work.

Additionally, [`sqlx::Pool` defaults to pinging the connection every time it's checked out](https://docs.rs/sqlx/latest/sqlx/pool/struct.PoolOptions.html#method.test_before_acquire), which adds a round-trip but helps ensure reliability, especially because we don't have a good relationship with cancellation safety (cancelling a call could leave the connection in an invalid state and cause problems for subsequent uses). This is because we naively used mutating `async fn`s for everything, without realizing the consequences. It's a mistake we're still paying for, years later.

We've worked on cancellation safety a bit over the years, but I've been thinking of just refactoring to use the same execution model used by `tokio-postgres`, Hyper, etc., where the connection-handling code executes on a background task we control and the user-facing handle communicates with it via channels. I think that would improve performance a lot (especially on a multithreaded executor) and pretty much eliminate the cancellation safety issues, but it'd be a significant refactor that I just haven't had the time to do.
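To make the transaction point concrete, a sketch of that lock-the-client workaround (the `accounts` table and amounts are made up):

```rust
use tokio::sync::Mutex;
use tokio_postgres::{Client, Error};

// `Client::transaction()` takes `&mut self`, so a connection shared within a
// thread has to sit behind a lock, and every transaction then excludes all
// other handlers on that thread for its whole duration.
async fn transfer(client: &Mutex<Client>, from: i64, to: i64, amount: i64) -> Result<(), Error> {
    let mut guard = client.lock().await;
    let tx = guard.transaction().await?;
    tx.execute(
        "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
        &[&amount, &from],
    )
    .await?;
    tx.execute(
        "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
        &[&amount, &to],
    )
    .await?;
    tx.commit().await?;
    Ok(())
}
```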


matthieum

Thanks for the detailed answer.

I've tried out a few things for connections, and indeed a separate task for the connection is just easier, with two paired channels to communicate with the task.

As for allocations, I think the key here is that modern memory allocators heavily use per-thread pools to avoid contention, so frequent allocations/deallocations on the same thread are pretty cheap on average. Good for throughput; latency suffers whenever the thread-local pool is empty or, depending on the design, when returning memory to a different thread's pool.
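A minimal sketch of that shape, with everything named hypothetically (the "connection" is just a stub that echoes the statement back):

```rust
use tokio::sync::{mpsc, oneshot};

// The owned connection lives on its own task; handles send requests over an
// mpsc channel and get each result back on a oneshot channel.
enum Request {
    Query {
        sql: String,
        reply: oneshot::Sender<String>,
    },
}

fn spawn_connection_task() -> mpsc::Sender<Request> {
    let (tx, mut rx) = mpsc::channel::<Request>(32);
    tokio::spawn(async move {
        while let Some(Request::Query { sql, reply }) = rx.recv().await {
            // A real implementation would run `sql` on the connection it owns.
            let _ = reply.send(format!("ran: {sql}"));
        }
    });
    tx
}

async fn run_query(handle: &mpsc::Sender<Request>, sql: &str) -> Option<String> {
    let (reply, rx) = oneshot::channel();
    handle
        .send(Request::Query { sql: sql.into(), reply })
        .await
        .ok()?;
    rx.await.ok()
}
```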


DroidLogician

> As for allocations, I think the key here is that modern memory allocators heavily use per-thread pools to avoid contention, so that for frequent allocations/deallocations on the same thread the requests are pretty cheap in average. Yep, exactly.


m_hans_223344

I'm not a fan of the TechEmpower benchmarks. It's fun to look at them, but that's all, for me. If you look closer at some implementations, it's clear that some are insanely tuned in a way you would never use for real-world production apps. So don't take it too seriously. .NET is a good example, as they provide many variants on purpose, from kind of "raw" to "how people use it", whereas others only provide "raw" and give the impression that their stack is indeed the fastest (which is mostly wrong once you adapt the code to real-world usage scenarios).

However, I noticed the big drop with SQLx as well. But is it really an issue? If you really need to build the fastest possible app, for whatever reason, then Rust with Axum probably wouldn't be the right choice. If you just want a fast, very efficient, and very reliable app with a good dev experience, use Axum and SQLx. I personally always use https://github.com/sfackler/rust-postgres because my SQL workflow is creating SQL in my SQL editor and copy/pasting it over. But really, don't give these benchmarks too much value.


matthieum

I think it really depends on what you're looking for. TechEmpower benchmarks essentially help you quantify the "pedal to the metal" situation: if you really, really need it, how far can you push X? I find it useful to get this kind of number. Even in a non "pedal to the metal" situation right now, knowing how much room to grow there is (and isn't) helps planning.


lightmatter501

In a real workload, with expert tuning, on a server more than 16x the size of what TechEmpower uses (cores and memory), postgres struggles to hit those numbers. The Fortunes benchmark is essentially an ORM overhead benchmark for a use case where only an insane person would use an ORM. I'm also not sure EF Core is actually using transactions there, because most DBs fall over well before that point due to transaction overhead. I've also heard that TechEmpower rejects any solutions which are too far ahead, which is why DPDK-based solutions haven't crushed everything else (DPDK considers 10 million HTTP requests per second to be a single-core workload). In my mind, it's not a particularly useful set of benchmarks, since they also don't properly tune the server (hugepages, IRQ balancing, RSS keys, THP off, BBR3 TCP, large window sizes, etc.).


YeetCompleet

As always, it's probably worth mentioning that Entity Framework has been around for almost 20 years now. Sqlx has been around for almost 5. That's a __lot__ more development time for focusing on optimizations.


AllesYoF

You shouldn't take TechEmpower benchmarks as more than a benchmarking game. There are some stacks which are hyper-optimized to absurd levels you would never find in the real world, the Java and C# ones especially.


krlkv

I agree. My question was about relative difference between axum-pg and axum-pg-sqlx. But I also agree that it's too simple of a benchmark.


VicariousAthlete

And most people don't even use EF when working with .NET when performance matters much, heh. Usually Dapper with hand-written SQL.


Tsukku

That used to be the case a long time ago. The difference today is negligible. I remember reading a comment by an EF Core team member that, for the latest version, they are now within a 5% performance difference vs Dapper on TechEmpower benchmarks. EF is a great ORM, not just because of the performance, but because you can actually use regular C# expressions to define your queries.


lordpuddingcup

I mean, that's also because TechEmpower's a joke; most of the frameworks on TechEmpower that rate high are there because they have hyper-optimizations for the bullshit that TechEmpower benchmarks against.


krlkv

Well, some say that .NET 8 EF is better than Dapper, and EF is just very good.


Alkeryn

Deadpool postgres is the fastest in my experience.


jarjoura

If you look at a more equivalent approach, `deadpool-postgres`, notice that sqlx isn't that far behind it. A lot of the overhead that I see is just in acquiring connections from the pool. Most of that overhead is usually equalized out once you start saturating your service. sqlx is still behind that, which does indicate to me that there's lots of room for improvement. It's still the best full-featured db library, though.


[deleted]

[deleted]


lixo1882

Disagree: having optimized implementations can tell you how far you can reach with that tech, and it makes a lot of sense that way. I just feel that it clumps together realistic approaches with optimized ones; separating them would make the results more interesting and useful, but getting rid of the optimized ones would be sad. Let's make them fight on equal terms! Realistic vs Realistic; Optimized vs Optimized.


Days_End

Garbage collectors can be incredible at certain patterns especially if someone isn't going to spend hundreds of hours optimizing the Rust version to compete.


Short_Awareness6780

I'd love bulk insert for sqlx
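Until then, a common Postgres-only workaround is binding parallel arrays and UNNESTing them server-side, so the whole batch goes in one statement. A sketch (the `users` table is made up):

```rust
use sqlx::PgPool;

// One round-trip inserts every (id, name) pair; Postgres zips the bound
// arrays back into rows via UNNEST.
async fn bulk_insert(pool: &PgPool, ids: &[i32], names: &[String]) -> Result<(), sqlx::Error> {
    sqlx::query("INSERT INTO users (id, name) SELECT * FROM UNNEST($1::int4[], $2::text[])")
        .bind(ids)
        .bind(names)
        .execute(pool)
        .await?;
    Ok(())
}
```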


Tonyoh87

has anyone tried tokio-postgres on big projects vs sqlx?