Skip to content

Structured Postgres Regression Tests

I've been using pg_regress tests for a while. It's generally a great way to ensure the behavior of your Postgres code works and continues working as expected. However, as my tests became larger, I started getting lost in them; and there are limits as to what you can test by having a psql session.

If you don't know, pg_regress basically takes an SQL file you give to it and sends it to psql running against a Postgres instance (either managed by pg_regress or by you) and compares the output against a previous stored execution log. If there are no differences, all is good (aka "tests passed".)

But I found that at least in my practice, tests tend to become rather large and it's hard to find separation between steps, individual tests and so on. Of course, you can put every individual test into a separate file, but that feels like a bandaid.

So I thought, "how can I structure this better? There must be a way."

My hunch was that if I can put both inputs (queries) and outputs (results, errors, etc.) into a machine-processable format that is also very visual, I can do a lot with it. I didn't want to write JSON, though. It doesn't force you into having a visual structure.

Unsurprisingly, YAML fit the bill. As much as I may not be its fan (it has design flaws and is definitely being overused for, uhm, "programming"), but it does provide a good visual structure, can be essentially queried and has some interesting features like tags and anchors. And, importantly, it has pretty good support for multiline strings1!

So, instead of having something like this in your test:

select true as value
 value
-------
    t
(1 row)    

...what if you can have something a bit more structured?

query: select true as value
results:
- value: t

At this point I got excited and decided that there's no going back and pg_yregress was prototyped.

By focusing on the structure one can feed into it, one can provide a lot of information to it, such as:

  • configuration of the instances to be tested against
  • initialization sequences
  • reusable queries (hello, YAML aliases!)
  • sending tests to multiple instances (want to test a replication scenario?)2

Unlike pg_regress, it doesn't use psql and this opens some interesting opportunities. For example, it can be used to test binary encodings as it simply uses libpq:

query: select true as value
binary: true
results:
- value: 0x01

Also, it wraps queries into transactions by default3, which removes a lot of begin/rollback noise in pg_regress-based tests.

It's still pretty new and rough around edges but I am already migrating Omnigres to it and will continue adding necessary features and improving the user experience. Check out the documentation if you are interested.

Feedback, suggestions and contributions will be appreciated, so don't be shy!


  1. not a lot of queries easily fit into a single line, so... 

  2. Not all features are ready, please be patient or consider contributing. I'll get to it if you don't! 

  3. Actually, there's no way to turn this behavior off right now, but it'll come soon as it is a trivial change. 

Comments