Hacker News
Segmentio/encoding: optimized drop-in replacement for Go's encoding/JSON package (github.com/segmentio)
83 points by fullung on Dec 4, 2019 | 17 comments


One of the typical arguments in favor of small base languages is that frequently the best package will be written by a third party and become the "default", leaving the built-in version as bloat. For example, everyone uses Requests in Python instead of the built-in alternative. On the other hand, I think a big advantage of batteries-included languages is that they define a standard implementation that other libraries can copy. Because there's an official JSON package for Go, everyone already knows how to use this new one, since it's a drop-in replacement. Obviously this won't happen in all cases, but when it does it feels really nice.


> everyone uses Requests in Python

I love Requests, but this statement overreaches. All of my Python these days is simple scripting, and in those simple cases the hassle of adding a dependency outweighs the utility that Requests provides over the standard library. And I'd say Requests is even a special case here in that it's widely known and acknowledged; while I wager you can find superior third-party replacements to the Python standard library if you look for them, the fact that you don't need to look for them at all (and audit them, ideally!) is what makes a batteries-included language so convenient even in the presence of a sub-par stdlib.


yup, I've used stdlib urlopen in an assortment of scripts, it works, even with python2/python3 cross compatibility:

    import sys

    if sys.version_info[0] == 2:
        from urllib2 import urlopen, Request  # python2
    else:
        from urllib.request import urlopen, Request  # python3
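For anyone wondering what the rest of such a script looks like, here is a minimal sketch built on the same import shim. The helper names `build_request` and `fetch`, the User-Agent string, and the timeout are all made up for illustration; only `urlopen` and `Request` come from the standard library.

```python
import sys

if sys.version_info[0] == 2:
    from urllib2 import urlopen, Request  # Python 2
else:
    from urllib.request import urlopen, Request  # Python 3


def build_request(url, user_agent="example-script/0.1"):
    """Return a Request carrying a custom User-Agent header (hypothetical helper)."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Download url and return the raw response body as bytes (hypothetical helper)."""
    response = urlopen(build_request(url), timeout=timeout)
    try:
        return response.read()
    finally:
        # Close explicitly; Python 2 responses are not context managers.
        response.close()


# Usage (performs a real network call, so it is left commented out):
# body = fetch("https://example.com/")
```

The explicit `close()` in a `finally` block is there because the `with urlopen(...)` form only works on Python 3.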


Another good example is the Python stdlib package json: ujson and cjson are drop-in replacements that use C bindings to achieve much better performance. However, json is still very useful if you only need to parse/dump a reasonably small amount of data and don't want your users to need to build the C bindings.
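A common way to get both benefits is an optional import with a stdlib fallback. This is only a sketch: it assumes the third-party `ujson` package exposes `loads` and `dumps` compatible with the calls used here, and `roundtrip` is a made-up helper name.

```python
try:
    # Optional third-party package with C bindings, used if installed.
    import ujson as json
except ImportError:
    # Standard library fallback: nothing to build, always available.
    import json


def roundtrip(obj):
    """Serialize obj to JSON text and parse it back (hypothetical helper)."""
    return json.loads(json.dumps(obj))


record = {"name": "gopher", "tags": ["a", "b"], "count": 3}
restored = roundtrip(record)
```

The exact text produced by `dumps` may differ between implementations (whitespace, for instance), so it is safer to compare parsed values than serialized strings.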


Looks neat, why not try to upstream it to the Go standard library if it's fully compatible?


Replacing a commonly used package like this likely wouldn't be without accidental compatibility breakage (in some edge cases), and as such I appreciate it not silently becoming the new default and potentially breaking software or corrupting data. This is especially true as JSON-the-standard (RFC7159) is horribly underspecified and as such it's difficult to even use some sort of acceptance suite to declare a replacement as suitable.
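To make the underspecification concrete, here is a small sketch of three edge cases, shown with Python's stdlib json purely as an illustration (it says nothing about how encoding/json or the segmentio package behave). RFC 7159 only says object names SHOULD be unique and lets implementations limit number range and precision, so two compliant parsers can disagree on inputs like these.

```python
import json

# 1. Duplicate keys: CPython's json keeps the last value, but the RFC
#    does not require that, so another parser may keep the first or fail.
doc = '{"id": 1, "id": 2}'
last_wins = json.loads(doc)
# object_pairs_hook exposes every pair so duplicates can be detected.
all_pairs = json.loads(doc, object_pairs_hook=list)

# 2. Large integers: Python parses this exactly, while a parser that
#    stores every number as a 64-bit float cannot represent it exactly.
big = json.loads("12345678901234567890")
as_float = float(big)
lossy = int(as_float) != big

# 3. NaN: by default dumps emits the token NaN, which is not valid JSON;
#    allow_nan=False makes it raise ValueError instead.
nan_text = json.dumps(float("nan"))
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_rejects_nan = False
except ValueError:
    strict_rejects_nan = True
```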

Go's standard encoding/json isn't great (for many reasons...), but it's definitely in the Good Enough category, and as such IMO falls under "if it ain't broken don't fix it" umbrella.


> This is especially true as JSON-the-standard (RFC7159) is horribly underspecified

Normally you would hope that the great test system built into Go would benefit stdlib upgrades like this and allow you to do them more easily and with greater confidence. But if what you're implementing suffers from this, then you're pretty much stuck locking yourself into specific implementations rather than general specifications.

> but it's definitely in the Good Enough category

"Two steps forward, one step back" sums up how I feel about Go. It's still my favorite user-mode language, though.


Maybe there are ideas that can be used without changing the API surface at all? I don't think grandparent is proposing the Go developers just drop the thing in as-is without any evaluation of fitness or backwards compatibility. Part of the significance of this work is that (they claim) it is API-compatible with the stdlib version (modulo error messages, which can be addressed).

Edit: Then again, the authors advocate mildly against upstreaming it:

> For these reasons, we also don't believe that this code should be ported upstream to the standard encoding/json package.

https://github.com/segmentio/encoding/tree/master/json#trade...


Seconded. An explanation of why it's faster would also be good to have on the landing page.


Tradeoffs are at https://github.com/segmentio/encoding/tree/master/json#trade.... Maybe we should move that upfront (or at least link to it for discovery).


That's just a file list on mobile. It's better to link to the README itself: https://github.com/segmentio/encoding/blob/master/json/READM...


Ah that answers it, thanks for posting that.


My problem with serialization in Go isn't so much speed but marshalling/unmarshalling pain.

I've been using these packages with some success:

https://github.com/tidwall/sjson

https://github.com/tidwall/gjson

Not knocking this pkg, just thought I'd share xD


There are surprisingly many ways to do JSON in Go. There's reflection with the stdlib, optionally keeping parts of the document as "raw" byte slices; code generation with packages like ffjson and easyjson; practically hand-built encoding with gojay; a state machine over a plain byte string, as in gjson and jsonparser; and templates, e.g. quicktemplate. The best approach really depends on what you want to do.

I think these sorts of challenges show that Go tends to be used in lower-level code than typical dynamic languages. I've combined three or four different approaches in several Go projects, which I couldn't imagine doing in a language like Python.


These dudes create some awesome stuff.

https://github.com/segmentio/nightmare


Now if there's a way to vendor built-in libraries...


Are you saying this so that you can exclude the built-in implementation from your binary? This is already done: the Go compiler doesn't even look at the "built-in" implementation unless it's explicitly imported in the code. You might have it included through a transitive dependency, but your dependencies can offer configuration like Gin does. [0]

[0] https://github.com/gin-gonic/gin#build-with-jsoniter



