Grokking Diesel, Rust's ORM

kbenson · on Aug 2, 2017

The last major component, and ‘secret weapon’ or ‘unofficial guide’ (according to me) is the test suite in the official repo (https://github.com/diesel-rs/diesel/tree/master/diesel_tests).

The tests are indeed informative. After looking at just a few, I have a much better idea of how Diesel is meant to be used in practice.

What I didn't see, and don't know if it exists since I haven't read a lot of the docs, is whether Diesel has a built-in way to deliver joined results as HashMaps with all fields, as that would be the simplest way for me to use it in practice.

For example, if I'm doing the equivalent of:

    SELECT user.*, post.*, comment.* FROM user LEFT JOIN post ON user.id = post.user_id LEFT JOIN comment ON post.id = comment.post_id

Does it have a helper method to identify the three tables that are joined, and instead of returning individual rows, return a HashMap of the user_id to user, a HashMap of the user_id to a list of posts, and a HashMap of the post_id to a list of comments? I think that would be:

    (HashMap<i32, User>, HashMap<i32, Vec<Post>>, HashMap<i32, Vec<Comment>>)

This would approximate the actual "objectiness" of many ORMs, and would make it easier to loop over data in the desired way with a minimum of boilerplate.

Edit: Or perhaps the more performant way would be to allow registering of callbacks for each change of user, post and comment, and call the appropriate one if defined. That way there isn't time and memory spent on condensing the data into hashmaps? I'm new to thinking about how to deal with ORM results in a language like this.

rabidferret · on Aug 2, 2017

Not a HashMap specifically, but there is a way to group them into `Vec<(User, Vec<(Post, Vec<Comment>)>)>` (which you can convert into a HashMap pretty easily). See http://docs.diesel.rs/diesel/associations/index.html for details.
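
To make the "convert into a HashMap" step concrete, here is a minimal sketch using only the standard library. The `User`, `Post` and `Comment` structs and the `index_by_id` function are made up for illustration; they are not Diesel types, and the nested `Vec` is assumed to have been loaded already.

```rust
use std::collections::HashMap;

// Hypothetical row types, standing in for whatever your models look like.
#[derive(Debug, Clone, PartialEq)]
struct User { id: i32, name: String }

#[derive(Debug, Clone, PartialEq)]
struct Post { id: i32, user_id: i32, title: String }

#[derive(Debug, Clone, PartialEq)]
struct Comment { id: i32, post_id: i32, body: String }

// The nested grouping described above.
type Grouped = Vec<(User, Vec<(Post, Vec<Comment>)>)>;

/// Flattens the nested grouping into three lookup tables:
/// user id -> user, user id -> posts, post id -> comments.
fn index_by_id(
    grouped: Grouped,
) -> (
    HashMap<i32, User>,
    HashMap<i32, Vec<Post>>,
    HashMap<i32, Vec<Comment>>,
) {
    let mut users = HashMap::new();
    let mut posts_by_user = HashMap::new();
    let mut comments_by_post = HashMap::new();

    for (user, posts) in grouped {
        let mut user_posts = Vec::with_capacity(posts.len());
        for (post, comments) in posts {
            // Comments are keyed by the post they belong to.
            comments_by_post.insert(post.id, comments);
            user_posts.push(post);
        }
        // Posts are keyed by the user they belong to.
        posts_by_user.insert(user.id, user_posts);
        users.insert(user.id, user);
    }

    (users, posts_by_user, comments_by_post)
}
```

Whether three flat maps or the nested `Vec` is nicer to loop over depends on the access pattern; the maps only pay off when you need lookups by id.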

kbenson · on Aug 2, 2017

That's actually what I was looking for. I was too out of practice with signatures in static languages in general, and too unfamiliar with Rust's syntax specifically, to figure it out and express what I was looking for correctly. ;)

pedrocr · on Aug 2, 2017

Isn't what you actually want from that something like one of the following?

    HashMap<User, (Vec<Post>, Vec<Comment>)>
    HashMap<i32, (User, Vec<Post>, Vec<Comment>)> 
    Vec<(User, Vec<Post>, Vec<Comment>)>

kbenson · on Aug 2, 2017

I think Comments are children of Posts, so ideally there would be a way to easily access all comments by post_id. Ideally, comments would be keyed to their post, and the post would be keyed to the user.

On rethinking this though, I'm wondering if the more performant way might be to provide a helper function that iterates over all returned rows, identifies when one record type in the result ends and another begins (the same steps you have to do to condense), but just executes a registered callback at that time. E.g. user_begin, user_end, post_begin, post_end, etc.

Although I should probably read up more on how Java/C/C# do this before spouting my half-formed ideas about an area I'm not really informed in... ;)
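
For what it's worth, the begin/end callback idea could be sketched roughly like this in plain Rust. Everything here (`Row`, `Event`, `walk_rows`) is hypothetical and not something Diesel provides; the sketch also assumes the rows arrive ordered by user and then by post, which the query would have to guarantee with an `ORDER BY`.

```rust
// Hypothetical flat row from the three-way LEFT JOIN. The post and comment
// ids are optional because the joins may find nothing.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Row {
    user_id: i32,
    post_id: Option<i32>,
    comment_id: Option<i32>,
}

#[derive(Debug, PartialEq)]
enum Event {
    UserBegin(i32),
    UserEnd(i32),
    PostBegin(i32),
    PostEnd(i32),
    Comment(i32),
}

/// Walks rows that are assumed to be ordered by user, then post, and calls
/// `on_event` whenever one record ends and the next one begins.
fn walk_rows(rows: &[Row], mut on_event: impl FnMut(Event)) {
    let mut current_user: Option<i32> = None;
    let mut current_post: Option<i32> = None;

    for row in rows {
        if current_user != Some(row.user_id) {
            // Close whatever was open before starting a new user.
            if let Some(p) = current_post.take() {
                on_event(Event::PostEnd(p));
            }
            if let Some(u) = current_user {
                on_event(Event::UserEnd(u));
            }
            on_event(Event::UserBegin(row.user_id));
            current_user = Some(row.user_id);
        }
        if current_post != row.post_id {
            if let Some(p) = current_post {
                on_event(Event::PostEnd(p));
            }
            if let Some(p) = row.post_id {
                on_event(Event::PostBegin(p));
            }
            current_post = row.post_id;
        }
        if let Some(c) = row.comment_id {
            on_event(Event::Comment(c));
        }
    }

    // Close anything still open once the rows run out.
    if let Some(p) = current_post {
        on_event(Event::PostEnd(p));
    }
    if let Some(u) = current_user {
        on_event(Event::UserEnd(u));
    }
}
```

Nothing is collected into maps here, so memory use stays constant, at the cost of pushing the bookkeeping onto whoever handles the events.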

neverminder · on Aug 2, 2017

I wish it was closer to Scala's Slick, which is an FRM (Functional Relational Mapping), since Rust's syntax is so similar to Scala's. Unlike ORMs, libraries like Slick or JOOQ (Java) have a pretty much 1-to-1 mapping from DSL to SQL, and Slick also provides compile-time type safety, which in my opinion is the main point of using it.

steveklabnik · on Aug 2, 2017

Hm, Diesel is also trying to do the "1-1" thing and is certainly focused on compile-time type safety.

I haven't used Slick, I'd be interested, if you have the time, in an elaboration with examples on this stuff.

neverminder · on Aug 2, 2017

Well, like I said, Slick is not an ORM whereas Diesel is. Slick is not so much about an ORM abstraction as about writing SQL queries in Scala by leveraging its collections library and taking advantage of Scala's advanced features such as operator overloading, etc.

For instance this is an example from Diesel:

    Post::belonging_to(user).load(&connection)

And this is how it would look like in Slick:

    Post.filter(_.userID === 1).result

So both of the above would translate into SQL "SELECT * FROM posts WHERE user_id = 1"; however, Slick is more explicit whereas Diesel definitely feels like an ORM. The triple equals in the Slick example, for instance, is a nice touch and an example of operator overloading, because userID could be of type Integer or Option[Integer] (nullable field) and it would work both ways.

steveklabnik · on Aug 2, 2017

> like I said Slick is not an ORM whereas Diesel is.

I personally find "ORM" to be such a broad term as to be almost meaningless. This may be a personal problem :)

I see now. I am pretty sure that you could write

    posts.filter(user_id.eq(1)).load(&connection)

if you wanted; the belonging_to is just fancier and a bit shorter.

Anyway, thank you! I've mentally filed a "check out Slick sometime" bug for my infinite personal-to-do list :)

neverminder · on Aug 2, 2017

No problem, I guess what I'm trying to say is that just like Rust took inspiration from Scala syntax (at least I get this feeling), maybe so could Diesel from Slick.

rabidferret · on Aug 2, 2017

Slick was definitely an influence on Diesel's design. I tried to have our query builder map more closely to SQL though, whereas Slick feels like it's trying to pretend it's just an Iterable.

rabidferret · on Aug 2, 2017

You could just as easily write `posts.filter(user_id.eq(1)).load(&connection)`.

virtualwhys · on Aug 2, 2017

Just glanced through the article; biggest difference I'd say between the two libraries would be, one composes, the other doesn't.

    val base = for {
      ur <- UserRole
       u <- ur.user
       r <- ur.role
    } yield (ur, u, r)

    val withManager = for {
      (ur, u, r) <- base
              ts <- TeamStaff if u.id === ts.userId
    } yield (ur, u, r, ts)

    val teams = for {
      (_, _, _, ts) <- withManager
                 t  <- Team if t.id === ts.teamId
    } yield (ts, t)

You can slice and dice a schema however you like, building up queries from other queries, none of which are run until you execute them. Basically you get semantically the same query plans as you'd get writing plain SQL, but it's all type checked, SQL-injection safe, and compiled (queries generally are generated at compile time, not run time).

Quill, another Scala FRM library, is worth checking out in this regard, as is Haskell's Esqueleto.

rabidferret · on Aug 2, 2017

Can you elaborate on how Diesel doesn't compose in this way? Composition is one of the main goals of Diesel.

virtualwhys · on Aug 2, 2017

Didn't see anything in the article, assumed query composition wasn't possible. I don't see anything offhand in the Diesel docs, are there examples that illustrate composition along the lines of the above?

steveklabnik · on Aug 3, 2017

Until you do something to cause the query to be evaluated, it's a value like any other, and you can combine them together however you want.
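
As a rough illustration of "a query is a value until you evaluate it", here is a toy sketch in plain Rust. The `Query` type and its methods are hypothetical and are NOT Diesel's API; a real query builder also carries column and type information in the type system, which this sketch skips entirely.

```rust
// A toy, hypothetical query value; this is NOT Diesel's API. Nothing here
// touches a database: a query is plain data until `to_sql` is called.
#[derive(Debug, Clone, PartialEq)]
struct Query {
    table: &'static str,
    // Each condition is a (column, value) equality check.
    conditions: Vec<(&'static str, i32)>,
}

impl Query {
    fn new(table: &'static str) -> Self {
        Query { table, conditions: Vec::new() }
    }

    /// Returns the query with one more `column = value` condition.
    /// `self` is consumed, so clone a base query if you want to reuse it.
    fn filter_eq(mut self, column: &'static str, value: i32) -> Self {
        self.conditions.push((column, value));
        self
    }

    /// Only at this point is any SQL text produced.
    fn to_sql(&self) -> String {
        if self.conditions.is_empty() {
            return format!("SELECT * FROM {}", self.table);
        }
        let clauses: Vec<String> = self
            .conditions
            .iter()
            .map(|(column, value)| format!("{} = {}", column, value))
            .collect();
        format!("SELECT * FROM {} WHERE {}", self.table, clauses.join(" AND "))
    }
}

/// Conditionally narrows a base query, the kind of reuse that gets
/// painful when queries are raw strings.
fn maybe_for_user(base: Query, user_id: Option<i32>) -> Query {
    match user_id {
        Some(id) => base.filter_eq("user_id", id),
        None => base,
    }
}
```

A base query can then be cloned and extended in several directions, for example `Query::new("posts").filter_eq("user_id", 1)` reused for both a "published" and a "draft" variant, with nothing executed until the final step.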

throwaway91111 · on Aug 2, 2017

Also check out Squeryl; it's virtually the same thing as Slick with an alternative (imho much more straightforward) syntax.

I do agree that things like associations are an entirely new layer of abstraction.

squiguy7 · on Aug 2, 2017

The nice thing about Diesel is that it does provide compile-time safety as well. Not sure if you alluded to that, but even though it doesn't have the FRM style, it still offers the benefit of compile-time checking. I personally like having code generation handle the boilerplate of making something queryable or insertable.

dvdplm · on Aug 2, 2017

Is the column limit still a problem for Diesel? The 16 columns max default is not enough and even the optional 26/52 can be too little for some odd legacy schemas. I'm curious: is this a limitation that will be overcome as the library matures or is it intrinsic to the codegen architecture?

killercup · on Aug 2, 2017

It's currently dependent on implementing traits for very large tuples, so you could argue it's a language feature we are waiting for (cf. open RFC #1935 [1]). (There was a PR on Diesel to use HLists, but it's no longer being pursued.)

[1]: https://github.com/rust-lang/rfcs/pull/1935
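
To show where a fixed maximum comes from, here is a small sketch of the general technique of implementing a trait for tuples with a macro. The `ColumnCount` trait and the `impl_for_tuple!` macro are invented for this example and are not taken from Diesel's source; the point is only that, without variadic generics, every tuple size needs its own impl.

```rust
// Hypothetical trait standing in for "something a row can be loaded into".
trait ColumnCount {
    fn column_count() -> usize;
}

// Base cases: a few scalar column types.
impl ColumnCount for i32 {
    fn column_count() -> usize { 1 }
}
impl ColumnCount for bool {
    fn column_count() -> usize { 1 }
}
impl ColumnCount for String {
    fn column_count() -> usize { 1 }
}

// Generates one impl for one tuple size. The tuple's count is the sum
// of the counts of its elements.
macro_rules! impl_for_tuple {
    ($($T:ident),+) => {
        impl<$($T: ColumnCount),+> ColumnCount for ($($T,)+) {
            fn column_count() -> usize {
                0 $(+ $T::column_count())+
            }
        }
    };
}

// Each supported size has to be spelled out. A tuple with four or more
// elements has no impl here, which is the "column limit" in miniature.
impl_for_tuple!(A);
impl_for_tuple!(A, B);
impl_for_tuple!(A, B, C);
```

Raising the limit means generating more impls, which costs compile time; that trade-off is presumably why larger sizes are opt-in.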

neverminder · on Aug 2, 2017

What was the reason you guys dropped the HList approach? I'm curious since I have to deal with tables with north of 200 columns, and because of Scala's forever annoying tuple arity limit (22), Slick falls back to HList, which is really heavy on the compiler and IDE aside from other drawbacks.

rabidferret · on Aug 2, 2017

We didn't want to commit to our own home-grown hlist type until we had some indicator of whether the language was going to go with variadics based on tuples or hlists.

killercup · on Aug 2, 2017

There are some details in https://github.com/diesel-rs/diesel/pull/747

rabidferret · on Aug 2, 2017

It'll eventually go away when some form of variadic generics lands in the language

rabidferret · on Aug 2, 2017

If you open an issue we can up the limit again

steveklabnik · on Aug 2, 2017

Diesel's authors have been hard at work on docs, but there's only so many hours in the day. Love to see posts like this.

kiliankoe · on Aug 2, 2017

Completely unrelated to the topic, but does anybody know why I'm getting a private gist link with the post's content when sharing it from MobileSafari (iOS 11 Beta)?

Scarbutt · on Aug 2, 2017

I would have thought that ORMs were an antipattern in Rust.

fredsir · on Aug 2, 2017

How come?

Scarbutt · on Aug 2, 2017

Maybe because of the language constructs that facilitate functional programming, you would think a data-driven approach would be chosen first over objects.

steveklabnik · on Aug 2, 2017

Rust doesn't have objects. (Trait objects kind of come close but are very rare and I've never used one with Diesel.)

I think if you read the post, you'll see that Diesel is very data-driven.

Personally, I tend to call it a "query builder" if I think someone might be ORM-phobic; I find it closer to that than other ORMs, even though it's authored by the maintainer of one of the most massively popular ORMs ever. They've sometimes joked that that experience has taught them what not to do ;)

jsd1982 · on Aug 2, 2017

Given the previous experience of authoring a popular ORM, why repeat the mistake of pluralization of table names from model names?

I've always found this convention to be problematic at best and I've learned to avoid it. Maintaining consistency of names inside and outside of code makes searches and references easier to follow without having to mentally (or even mechanically) expand the singular name to plural name or vice versa (say, in a scratch pad to compose a test query).

steveklabnik · on Aug 2, 2017

I guess not everyone considers that a mistake, but at least it's very easy to change; tack on a #[table_name="whatever"] and you're done. I'm not sure if there's a more global way, since I don't mind pluralization.

jsd1982 · on Aug 2, 2017

Yeah, it's just my opinion tempered with experience.

Is the pluralization rule literally as simple as stated in the article, namely appending an 's' and calling it done? English is rarely so simple and there are tons of corner cases which is why I tend to frown on the pluralization rules that are unpredictable. Those must be a nightmare for English-as-a-second-language folks too.

steveklabnik · on Aug 2, 2017

Oh yeah, I remember thanks to ActiveSupport::Inflector, heh.

Yes: https://github.com/diesel-rs/diesel/blob/e52d1710d98a05c3aa4...

But it also converts CamelCase to snake_case: https://github.com/diesel-rs/diesel/blob/e52d1710d98a05c3aa4...
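
As a rough sketch of the rule as described in this thread (convert CamelCase to snake_case, then append an `s`), something like the following would do. The function name `infer_table_name` is made up for illustration and this is not Diesel's actual code; see the links above for the real thing.

```rust
/// Converts `CamelCase` to `snake_case` and appends an `s`. This follows the
/// rule as described in the thread; it is a sketch, not Diesel's actual code.
fn infer_table_name(model: &str) -> String {
    let mut name = String::with_capacity(model.len() + 4);
    for (i, ch) in model.chars().enumerate() {
        if ch.is_uppercase() {
            // Every capital after the first starts a new word.
            if i != 0 {
                name.push('_');
            }
            name.extend(ch.to_lowercase());
        } else {
            name.push(ch);
        }
    }
    // Naive pluralization: no special cases for English irregulars.
    name.push('s');
    name
}
```

With a rule this simple, `User` becomes `users` and `UserRole` becomes `user_roles`, but `Category` becomes `categorys`, which is where an explicit `#[table_name="categories"]` comes in.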

gchp · on Aug 2, 2017

> even though it's authored by the maintainer of one of the most massively popular ORMs ever

Just curious, what ORM is that?

steveklabnik · on Aug 2, 2017

ActiveRecord. Referencing it is a double-edged sword; Diesel is very much not ActiveRecord, so by saying it up front, people tend to get the wrong impression.

gchp · on Aug 2, 2017

That's cool, didn't realise it was the same author.

I understand. The other side of that coin though I guess is that knowing this might give some more confidence in using diesel. ActiveRecord was the first ORM I used, and knowing now that diesel is "from the same stock" (for want of a better phrase) gives some extra confidence in what it can do / where it is going.

Not that I was not confident in it before...

steveklabnik · on Aug 2, 2017

Yup, absolutely, that's the other side of the sword :)

dmix · on Aug 2, 2017

Indeed, the flood of Rails people who went over to Rust has definitely given some people pause to consider the direction it was taking.

I don't see this particular case, someone who spent a significant amount of time working on a core Rails OSS library, as a negative though; quite the opposite. If anything, it would be an excellent measure of the language for that particular use case, assuming any competent developer would seek to operate within the constraints of the language rather than shoehorning an existing implementation into something that doesn't fit.

The bigger question is whether the ORM label itself is limiting or can be understood to fit into a broader scope of implementations.

rabidferret · on Aug 2, 2017

I think people read way too much into the ORM label. Diesel is very much a query builder first, not an ORM. As a random data point, I've been fixing bugs in Rails lately by basically copying over code from Diesel and converting it to Ruby. That code has all been in Arel, not Active Record.

I agree that the "I make Rails" thing is a double-edged sword. I would hope people would see it as an indicator that I have a lot of experience and lessons to learn from, but I think many people see the opposite.

yazaddaruvala · on Aug 3, 2017

Thoughts on replacing Arel with Diesel + Helix[0]?

[0] https://github.com/tildeio/helix - When Helix can handle that complexity (if it can't already)

Manishearth · on Aug 2, 2017

https://github.com/rails/rails/tree/master/activerecord

Scarbutt · on Aug 2, 2017

Thanks for clarifying. A subjective note: the executed SQL is way more beautiful than the Rust code ;) and, to me, expresses the intention better. I would have found it nicer if the syntax/grammar were more SQL-ish.

rabidferret · on Aug 2, 2017

Ironically, I was actually originally planning on having Diesel primarily built around parsing actual SQL. (See the commit message of https://github.com/diesel-rs/diesel/commit/2ba8bf481c3d21d8b..., which is like commit number 4)

I agree that if you're just writing a single self-contained query, SQL is a great DSL for that. I've been trying to improve our story for working with raw SQL (one tradeoff will be that it isn't type checked).

However, there are major benefits to an in-language DSL as well. If you are wanting to re-use fragments of queries, or conditionally construct a different query, that's a pain to do with SQL strings.

crzwdjk · on Aug 3, 2017

You don't need to work with SQL strings though; you can work with ASTs or other higher-level abstractions, and with Rust's macros and the infer_schema! machinery you can probably find a way to project the SQL types into Rust's type system to keep things more or less type-safe. But then again, I've never tried to write such a thing, so maybe it's not quite as easy as that. I've definitely used a similar query-builder library in the past though, and while it had its benefits over raw SQL, it also led to some abuses where bits and pieces were used to compose gargantuan queries that nobody would have ever thought to write by hand.

leshow · on Aug 2, 2017

One benefit of the diesel approach is that it's type safe, whereas the pure SQL approach isn't.

u320 · on Aug 2, 2017

There is no technical reason why you couldn't create a fully typesafe pure SQL interface in Rust, the same way Rust's format language is typesafe.

steveklabnik · on Aug 2, 2017

This is true, but to do that, you need procedural macros, which are not yet stable.

u320 · on Aug 2, 2017

> it's authored by the maintainer of one of the most massively popular ORMs ever

Cool, didn't know Sean Griffin maintained Hibernate!

steveklabnik · on Aug 2, 2017

I'm not sure what you're saying, I put a "one of" in there for a reason. I'm not particularly interested in debating which ORM is the most popular, but it's undeniable that ActiveRecord is in the set of ones you'd be considering if you were trying to answer that question.

swah · on Aug 4, 2017

Me too as in Golang...

endymi0n · on Aug 2, 2017

After the unfortunate news from NPM today, I'm hoping Diesel doesn't come with a built-in defeat device... /s