alexozer 5 hours ago

So am I identifying the bottlenecks that motivate this design correctly?

1. Go FFI is slow

2. Per-proto generated code specialization is slow, because of icache pressure

I know there's more to the optimization story here, but I guess these are the primary motivations for the VM over just better code generation or implementing the parser in a language other than Go?

skybrian 12 hours ago

This is excellent: an in-depth description showing how the Go internals make writing fast interpreters difficult, by someone who is far more determined than I ever was to make it fast anyway.

I’ve assumed that writing fast interpreters wasn’t a use case the Go team cared much about, but if it makes protobuf parsing faster, maybe it will get some attention, and some of these low-level tricks will no longer be necessary?

jeffrallen an hour ago

> hyperpb is a brand new library, written in the most cursed Go imaginable

This made me LOL.

mdhb 13 hours ago

I’d really love to see more work bringing the best parts of protobuf to a standardised serialization format like CBOR.

I’d make the same argument for moving gRPC-web to something like WHATWG Streams and/or WebTransport.

There are a lot of really cool and important learnings in both, but they're also so tied up in weird tooling and assumptions. Let’s rebase on IETF and W3C standards.

  • youngtaff 8 hours ago

    Would be good to see support for encoding / decoding CBOR exposed as a browser API - they currently use CBOR internally for WebAuthn so I’d hope it’s not too hard

UncleEntity 13 hours ago

> In other words, a UPB parser is actually configuration for an interpreter VM, which executes Protobuf messages as its bytecode.

This is kind of confusing: is the VM crafted at runtime to parse a single protobuf message type, and only that message type? The Second Futamura Projection, I suppose...

Or is the VM designed around generic protobuf messages, so it can parse any random message, but only if it's a protobuf message?

I've been working on the design of a similar system, but for general binary parsing (think bison/yacc for binary data), and hadn't even considered doing data over a specialized VM vs. bytecode+data over a general VM. Honestly, since it's designed around 'maximum laziness' (it just parses/verifies and creates metadata over the input, so you only pay for decoding the bytes you actually use) and I/O overhead is way greater than the VM dispatch, trying this out is probably one of those "premature optimization is the root of all evil" cases, but it's intriguing nonetheless.
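
For what it's worth, here is roughly how I picture the "data over a specialized VM" option, as a minimal Go sketch. This is not UPB's or hyperpb's actual design: `Table`, `Field`, `FieldKind`, and `Parse` are names made up for illustration, and only wire types 0 (varint) and 2 (length-delimited) are handled. The interpreter loop is generic, the per-type table is its configuration, and the message bytes drive execution the way bytecode would.

```go
package main

import (
	"errors"
	"fmt"
)

// FieldKind is a hypothetical tag telling the interpreter how to store a field.
type FieldKind int

const (
	KindUint64 FieldKind = iota
	KindString
)

// Field and Table are hypothetical: a Table is the per-message-type
// "configuration", mapping field numbers to a name and a kind.
type Field struct {
	Name string
	Kind FieldKind
}

type Table map[uint64]Field

// readVarint decodes a base-128 varint and returns the value and bytes consumed.
func readVarint(b []byte) (uint64, int, error) {
	var v uint64
	for i := 0; i < len(b) && i < 10; i++ {
		v |= uint64(b[i]&0x7f) << (7 * uint(i))
		if b[i] < 0x80 {
			return v, i + 1, nil
		}
	}
	return 0, 0, errors.New("bad varint")
}

// Parse is the generic loop: each tag in the input acts like an opcode,
// and the table decides what to do with the operand that follows it.
func Parse(t Table, b []byte) (map[string]any, error) {
	out := map[string]any{}
	for len(b) > 0 {
		tag, n, err := readVarint(b)
		if err != nil {
			return nil, err
		}
		b = b[n:]
		num, wire := tag>>3, tag&7
		f, known := t[num]
		switch wire {
		case 0: // varint
			v, n, err := readVarint(b)
			if err != nil {
				return nil, err
			}
			b = b[n:]
			if known && f.Kind == KindUint64 {
				out[f.Name] = v
			}
		case 2: // length-delimited
			l, n, err := readVarint(b)
			if err != nil {
				return nil, err
			}
			b = b[n:]
			if l > uint64(len(b)) {
				return nil, errors.New("truncated field")
			}
			if known && f.Kind == KindString {
				out[f.Name] = string(b[:l])
			}
			b = b[l:] // unknown fields are skipped
		default:
			return nil, fmt.Errorf("unsupported wire type %d", wire)
		}
	}
	return out, nil
}

func main() {
	tbl := Table{1: {Name: "id", Kind: KindUint64}, 2: {Name: "name", Kind: KindString}}
	msg := []byte{0x08, 0x96, 0x01, 0x12, 0x02, 'h', 'i'} // field 1 = 150, field 2 = "hi"
	fmt.Println(Parse(tbl, msg))
}
```

Swapping in a different `Table` parses a different message type with the same loop, which is the sense in which the parser is "configuration" rather than generated code.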