Hacker News

saisrirampur
WAL-RUS: a Rust Rewrite of WAL-G for PostgreSQL Backups (clickhouse.com)

nasretdinov 11 hours ago

I must say I'm quite pleased to see how well the Go version works. It only uses 1.5x the CPU and (predictably) much more RAM/VRAM, but not a crazy amount either (the expected increase is 2x).

Of course you can write a more optimal version in C / C++ / Zig / Rust, but at the same time Go is much easier to write, and you don't pay for the convenience with an absurd performance loss like in Python or PHP.

__s 5 hours ago

indeed, wal-g actually started as a port of wal-e which was Python: https://www.citusdata.com/blog/2017/08/18/introducing-wal-g-...

wal-g was a much larger improvement over wal-e. we're optimizing the margins here

sgt 5 hours ago

I'd like to use the one getting the most community support. So it's too soon to decide between Rust and Go, although on paper Rust is better.

__s 5 hours ago

tbf it took 4 years after PG15 support was added for me to fix remote BASE_BACKUP support & wal-g base backups being inconsistent on PG15+ (a parameter typo had pg_backup_stop return before wal was archived far enough for consistency)

https://github.com/wal-g/wal-g/pull/2262

but yes, this is a young project, so fair take

saisrirampur (op) 5 hours ago

++ We’re big Go fans, most of PeerDB is written in Go: https://github.com/PeerDB-io/peerdb

The importance of optimizing (resource) margins and having predictable memory usage increases significantly in the DBaaS/Postgres world, where your process coexists and competes with other critical workloads.

Also, WAL-RUS isn’t rocket science. Postgres already exposes a bunch of native constructs for WAL archival, making development fairly straightforward even in Rust.

TL;DR: when to choose Rust or Go really depends on the workload and what you are going after.

TOMDM 9 hours ago

PostgreSQL WAL-RUS, no relation to PostgreSQL WALRUS https://github.com/supabase/walrus

whitepoplar 3 hours ago

As someone who only has a cursory knowledge of Postgres backup systems, how does this compare to something like pgBackRest? When would someone reach for one over the other?

caffeinated_me 14 hours ago

Do you have any benchmarks with a mix of long open transactions and short ones? I've struggled a lot with WAL-E in the past there, and am curious if that changes here.

__s 13 hours ago

no. but wal-g & wal-rus both have parallelism over wal-e. however, are you asking more about handling the build-up of wal / vacuum prevention caused by long-running transactions? those are up to postgres; the archive command only keeps pushing wal so that when postgres is ready to get rid of wal it can. seems like your scenario wouldn't care much what the archiver is, since wal should be shipped long before postgres is ready to get rid of it
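
For anyone following along, the archive_command contract is roughly this: Postgres runs the command once per finished WAL segment, treats exit code 0 as "safely archived", and keeps the segment and retries on anything else. Below is a minimal Python sketch of that contract, not how wal-g or wal-rus implement it. `archive_segment` and the local archive directory are hypothetical stand-ins for an upload to object storage, and a real archiver would also fsync before reporting success.

```python
import filecmp
import os
import shutil
import tempfile


def archive_segment(src_path: str, archive_dir: str) -> int:
    """Hypothetical helper: copy one WAL segment into archive_dir.

    Returns a process exit code. 0 tells Postgres the segment is archived
    (so it may recycle it); non-zero makes Postgres keep it and retry later.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Re-archiving an identical segment is harmless; never overwrite
        # an existing segment that has different contents.
        return 0 if filecmp.cmp(src_path, dest, shallow=False) else 1
    try:
        # Copy under a temporary name first, then rename, so a crash never
        # leaves a partial segment under the final name.
        fd, tmp = tempfile.mkstemp(dir=archive_dir)
        os.close(fd)
        shutil.copyfile(src_path, tmp)
        os.replace(tmp, dest)
        return 0
    except OSError:
        return 1
```

In postgresql.conf this kind of script would be wired up as an archive_command taking %p (the segment path). The point is only that the archiver's job ends at "ship it and report success"; how long Postgres holds on to WAL is decided elsewhere.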

caffeinated_me 13 hours ago

Yeah, I'm probably misremembering some details there. Thanks

dionian 3 hours ago

Great name

westurner 3 hours ago

How would you design an index layer for Postgres WAL-G backups to make a paging VFS, like sqlite-http-vfs, for pglite in WASM?

valentynkit 10 hours ago

Quick one on the benchmark: was the 2.8GB peak virtual or resident? Go reserves a large virtual arena it mostly never faults in, so RSS tends to be a fraction of the virtual peak, and if Postgres headroom was getting squeezed off the virtual number, you were sizing against memory the kernel never actually charges for.
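
If it helps, on Linux the two peaks show up as separate fields in /proc/<pid>/status: VmPeak is the peak virtual size and VmHWM is the peak resident set size. Here is a small Python sketch for pulling them apart. `parse_proc_status` is a hypothetical helper, and the numbers in `SAMPLE` are made up for illustration, not taken from the benchmark.

```python
def parse_proc_status(text: str) -> dict:
    """Return the Vm* fields of a /proc/<pid>/status dump, in kB."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Memory lines look like "VmPeak:   2936012 kB".
        if key.startswith("Vm") and len(parts) == 2 and parts[1] == "kB":
            fields[key] = int(parts[0])
    return fields


# Illustrative values only (not measurements of any real process).
SAMPLE = """Name:\texample
VmPeak:\t 2936012 kB
VmSize:\t 2936012 kB
VmHWM:\t  212340 kB
VmRSS:\t  198764 kB
"""

stats = parse_proc_status(SAMPLE)
# Peak virtual vs. peak resident: the gap is address space that was
# reserved but never actually faulted in.
never_resident_kb = stats["VmPeak"] - stats["VmHWM"]
```

On a live box you would feed it the contents of /proc/<pid>/status for the backup process instead of the sample string.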

__s 5 hours ago

Correct. We tune overcommit so postgres reliably returns out of memory. It becomes complicated to accurately tune overcommit for every AWS instance type. We configure GOMEMLIMIT/cgroups but those are about RSS. Outliers come together: instances running queries out of memory on our service tend to also be pushing other resource limits, causing wal-g & prometheus exporters to start having more erratic memory usage at the worst time

This helps on both ends of the cost spectrum. Large 64 core instances are where our heuristics fall off the most as variance increases, & tiny instances with 8GB of memory can use every 100MB of RSS we can get
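
As a rough illustration of the GOMEMLIMIT/cgroups side, here is a Python sketch that derives a GOMEMLIMIT value from a cgroup v2 memory.max reading. The helper names and the 0.9 headroom fraction are assumptions made for the example, not our actual settings, and as noted above this only bounds memory the Go runtime manages, not virtual size.

```python
from typing import Optional

MIB = 1024 * 1024


def parse_memory_max(text: str) -> Optional[int]:
    """Parse cgroup v2 memory.max contents: a byte count, or 'max' if unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


def gomemlimit_value(cgroup_limit_bytes: int, fraction: float = 0.9) -> str:
    """Format a GOMEMLIMIT string leaving (1 - fraction) of the cgroup as headroom.

    The 0.9 default is an assumed starting point, not a recommendation.
    """
    limit_mib = int(cgroup_limit_bytes * fraction) // MIB
    return f"{limit_mib}MiB"


limit = parse_memory_max("536870912\n")  # a 512 MiB cgroup
env_value = gomemlimit_value(limit) if limit is not None else None
```

The resulting string would be exported as the GOMEMLIMIT environment variable for the Go process. Picking the fraction per instance type is exactly the tuning burden described above.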

nasretdinov 4 hours ago

You probably could limit the bloating of Go programs by setting GOMAXPROCS to something like 1 or 2 on smaller machines, but then again you wouldn't get the best performance. So IMO good call here to rewrite it in a language without GC.
