Table Config#

“Table config” refers to the options that are valid at the per-table level, such as:

backup:
  tables:
    - name: foo # <--- here

Remember that any option defined below can also be specified at more general levels of config, as a fallback for options whose values are the same across many or all tables.
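As a minimal sketch of this fallback behavior, the fragment below sets location and compression once at the backup level and overrides location for a single table. The table names and the archive/ path are hypothetical, and placing the options directly under backup is an assumption based on the description above.

```yaml
backup:
  # Fallback values (assumed placement): used by any table that does not set its own.
  location: backups/{table}
  compression: gzip
  tables:
    - name: public.foo            # inherits both fallbacks
    - name: public.bar            # hypothetical table
      location: archive/{table}   # table-level value replaces the fallback
```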

All options are intentionally identical across the backup and restore commands and, in most cases, are valid for both, unless specifically noted otherwise.

name#

When tables are given as a mapping, defaults to the key-side of the mapping. When tables are given as a list, this field is required!

The name field defines the matching criteria for tables which should be backed up/restored. In the simplest case, this can just be the name (or {schema}.{name}) of the table. However, the name can also be “globbed” to match multiple tables!

Note

Remember that each item in this list has a corresponding query field, which can be an arbitrary query. This means that you can use the same name more than once; the name is just a way of matching the source table that each entry should target.

# This
tables:
  public.foo:
  '*.*':

# Is the same as this
tables:
  - name: public.foo
  - name: '*.*'

# Is the same as this
tables:
  - public.foo
  - '*.*'
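Because each list entry carries its own query, the same name can appear in more than one entry. The sketch below splits one table into two backups; the is_active column and the location paths are hypothetical. The list form is used because a YAML mapping cannot repeat a key.

```yaml
tables:
  # Two entries target the same source table with different queries.
  - name: public.foo
    query: "select * from {table} where is_active"       # is_active is a hypothetical column
    location: backups/{table}/active
  - name: public.foo
    query: "select * from {table} where not is_active"
    location: backups/{table}/inactive
```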

Note

name can also be omitted entirely, with some caveats. The “name” field populates the {table} templated into queries and location paths (both of which default to including the {table} template value).

Thus, if you omit the “name” field, you must also provide concrete “query” and “location” fields.

tables:
  - query: select * from for_example_a_view
    location: backups/public.for_example_a_view

Globbing#

Using common globbing rules:

Pattern    Meaning
*          matches everything
?          matches any single character
[seq]      matches any character in seq
[!seq]     matches any character not in seq

For some common examples:

  • public.*: All tables in a schema

  • *.foo: Tables with a given name in all schemas

  • *_log: All tables ending with some suffix

  • *_*_log: Multiple globs

See also the exclude key below.
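A short sketch of how these patterns might appear in a config file. YAML treats a leading * as an alias marker and a leading [ as the start of a flow sequence, so quoting glob patterns is the safe choice. The log_1 and log_2 table names are hypothetical, and the last entry assumes the pattern is matched against the schema-qualified name.

```yaml
tables:
  - name: 'public.*'          # all tables in the public schema
  - name: '*.foo'             # tables named foo in any schema
  - name: '*_log'             # all tables ending in _log
  - name: 'public.log_[12]'   # public.log_1 and public.log_2 (hypothetical names)
```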

Note

Globbing was chosen over regex as a much simpler way of quickly matching table names that is easy to grok. It’s conceivable that regex matching could be supported in the future, but globs with exclusions should be able to cover most common cases.

location#

Defaults to backups/{table}

The URI protocol prefix of each location path determines, on a per-path basis, which protocol is used for the backup/restore of that path.

Tip

Output files are separated into table-specific folders by default, through {table}. They can instead be colocated in a single folder by removing that template source, e.g. backups/.

Local files#

Note that an otherwise unadorned path is assumed to be a local file path, for example path/to/folder.

For backups, if the path leading up to the leaf folder does not yet exist, it will be automatically created.

S3#

A path is identified as an “S3 path” when it is prefixed with the S3 protocol: s3://.

For example, s3://bucket/path/to/folder references the path path/to/folder inside the bucket named bucket.

S3 paths make use of the s3 config for authorization against the included bucket. Alternatively, the common environment variables recognized by the aws CLI (e.g. AWS_PROFILE, AWS_REGION, AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, etc.) will be read automatically.
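Since the protocol is decided per path, local and S3 locations can be mixed in one config. A minimal sketch, reusing the example paths from above (public.bar is a hypothetical second table):

```yaml
backup:
  tables:
    - name: public.foo
      location: path/to/folder/{table}               # no protocol prefix: local file path
    - name: public.bar
      location: s3://bucket/path/to/folder/{table}   # s3:// prefix: path inside the bucket "bucket"
```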

filename#

Defaults to {timestamp}.{ext}.

Coupled with the “location” configuration, the fully templated path is (by default) backups/{table}/{timestamp}.{ext}. This yields a new file each time a command is run.

Tip

{timestamp} is a “variable” template source, meaning a new value will be yielded each time. In order to reference a static filename, configure a filename without a variable source, e.g. {table}.{ext}.
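A minimal sketch of a static filename, assuming the default location of backups/{table}. The resolved path in the comment is derived from the templates described above, not from observed output.

```yaml
tables:
  - name: public.foo
    # Quoted because a YAML value that starts with { would otherwise be parsed as a flow mapping.
    filename: "{table}.{ext}"
    # Resolves to the same path on every run: backups/public.foo/public.foo.{ext}
```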

strategy#

Defaults to use_latest_filename.

This option is only read during restore commands and has two valid values: use_latest_filename and use_latest_metadata.

The restore-time “strategy” defines how databudgie should determine, on a per-table basis, which file to read from. Note that each time you run databudgie backup, it never alters preexisting files; instead, it writes new files to disk with a timestamp in the name to disambiguate.

  • use_latest_filename makes use of the default file naming scheme, which includes write-time timestamps in the name of the file, and chooses the most recent timestamp.

  • use_latest_metadata uses the operating system’s file attributes for file creation time (or the equivalent in S3), and chooses the most recent one.
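A sketch of how the two strategies might be combined, with a fallback for all tables and a per-table override. The top-level restore key is an assumption that mirrors the backup key shown earlier, and public.bar is a hypothetical table whose filenames are assumed not to carry a timestamp.

```yaml
restore:                              # assumed top-level key, mirroring backup:
  strategy: use_latest_filename       # the default, written out explicitly
  tables:
    - name: public.foo
    - name: public.bar                # hypothetical table
      strategy: use_latest_metadata   # pick the file by creation time instead of by name
```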

truncate#

Defaults to false.

This option is only read during restore commands. When true, truncates the contents of the table before attempting to restore into it.

Note

This can run afoul of foreign key constraints, depending on your table structure.

The tables are intentionally ordered in such a way as to avoid or reduce the possibility of foreign key related issues; however, tables with self-referential or circular foreign key relationships may still encounter issues with this option.
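A minimal sketch of enabling truncation for one table while leaving another at the default. The top-level restore key is an assumption, and public.category stands in for a hypothetical self-referential table.

```yaml
restore:                       # assumed top-level key, mirroring backup:
  tables:
    - name: public.foo
      truncate: true           # empty the table before restoring into it
    - name: public.category    # hypothetical self-referential table
      truncate: false          # the default; avoids the foreign key caveat above
```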

query#

Defaults to select * from {table}

Specifies the query to be used for a given match. The default, which simply selects the whole table, is the most obviously useful query: it backs up the whole table.

There aren’t any constraints on the query to be executed, however, so this field can apply filters, perform joins, alter/obfuscate the data, or do anything else a query can do.
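As an illustration, the sketch below filters rows and obfuscates one column. The table and column names are hypothetical, and md5(), now() and interval assume a PostgreSQL database. The query spans several lines using a YAML folded scalar.

```yaml
tables:
  - name: public.users   # hypothetical table with id, email and created_at columns
    # Folded scalar (>-): the lines below are joined into a single-line query.
    query: >-
      select id, md5(email) as email, created_at
      from {table}
      where created_at > now() - interval '30 days'
```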

compression#

Defaults to null.

Depending on the size of tables, the backups can get quite large. By default compression is disabled, but it can be enabled for any/all table “data” backups.

Valid values include: gzip.

Note

Enabling compression automatically appends the compression file extension to the backup files (e.g. .gz for gzip), and will only work correctly if both the backup side and the restore side agree on the value of the compression key.
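A minimal sketch of keeping both sides in agreement by setting the same value under each command. The top-level restore key is an assumption that mirrors the backup key.

```yaml
backup:
  compression: gzip     # files are written with a .gz suffix appended
  tables:
    - name: public.foo
restore:                # assumed top-level key, mirroring backup:
  compression: gzip     # must match the backup side so the .gz files are found and read
  tables:
    - name: public.foo
```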

exclude#

Defaults to [].

This is most commonly useful when using globs, particularly when running up against the limitations of glob matching versus regex:

tables:
  # All log tables
  - name: "*_log"
    exclude:
      - "tree_log"

Note that exclude list entries can also be globs themselves, so you can use them to arrive at more complex matching criteria than could be achieved with the single name-matching glob alone.
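For instance, a glob in the exclude list can carve a whole family of tables out of the match. The _audit_log suffix below is hypothetical.

```yaml
tables:
  # All log tables, except tree_log and anything ending in _audit_log.
  - name: "*_log"
    exclude:
      - "tree_log"
      - "*_audit_log"   # hypothetical suffix; exclude entries may be globs too
```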

follow_foreign_keys#

Defaults to false.

When true, any foreign keys on the table will be recursively followed when performing the backup/restore. This allows one to specify only the table one seeks to back up/restore, and any tables related through foreign keys will be included as well.

Note

The backup file that is stored/read from will be relative to the explicit table that originated the inclusion of that table in the config.

That is, if your config file includes some table “public.foo” with location “backups/{table}”, then any tables backed up as a result of follow_foreign_keys on behalf of this table will end up at backups/public.foo/{table}.

In the event that two tables produce “followed” versions of the same table, only one backup will be produced, under whichever table happens to resolve first (a necessary measure on the restore side due to foreign key constraints). For a given hierarchy of foreign keys, this should remain constant, but it doesn’t preclude some future table/foreign key from taking control of that table by virtue of being higher up in the hierarchy.
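The location rule described in this note can be sketched as follows, where public.bar is a hypothetical table that public.foo references through a foreign key:

```yaml
backup:
  tables:
    - name: public.foo
      location: backups/{table}
      follow_foreign_keys: true
      # public.foo itself is written under:      backups/public.foo
      # public.bar (followed) is written under:  backups/public.foo/public.bar
```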

skip_if_exists#

Note

Unlike most options, this option only has an effect on the backup side of the config.

Defaults to false.

When true, skips backing up the table if backup data already exists for the annotated table.
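A minimal sketch, assuming a hypothetical reference table that rarely changes and therefore only needs to be backed up once:

```yaml
backup:
  tables:
    - name: public.reference_data   # hypothetical, rarely changing table
      skip_if_exists: true          # skipped once backup data exists for it
    - name: public.foo              # default (false): backed up on every run
```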