Commit f21929f

Nikolay Devxx authored and agneum committed
feat(engine): add data source: pgBackRest (#265)
1 parent 78c503b commit f21929f

File tree: 4 files changed, +473 -0 lines changed
New file: 375 additions & 0 deletions
@@ -0,0 +1,375 @@
# Copy the following to: ~/.dblab/engine/configs/server.yml

# Database Lab API server. This API is used to work with clones
# (list them, create, delete, see how to connect to a clone).
# Normally, it is supposed to listen on 127.0.0.1:2345 (default)
# and to be running inside a Docker container,
# with port mapping, to allow users to connect from outside
# to port 2345 using a private or public IP address of the machine
# where the container is running. See https://postgres.ai/docs/database-lab/how-to-manage-database-lab
server:
  # The main token that is used to work with the Database Lab API.
  # Note that only one token is supported.
  # However, if the integration with the Postgres.ai Platform is configured
  # (see below, "platform: ..." configuration), then users may use
  # their personal tokens generated on the Platform. In this case,
  # it is recommended to keep "verificationToken" secret, known
  # only to the administrator of the Database Lab instance.
  #
  # Database Lab Engine can run with an empty verification token, which is not recommended.
  # In this case, the DLE API and the UI application will not require any credentials.
  verificationToken: "secret_token"

  # HTTP server port. Default: 2345.
  port: 2345

# Embedded UI. Controls the application that provides a user interface to the DLE API.
embeddedUI:
  enabled: true

  # Docker image of the UI application.
  dockerImage: "postgresai/ce-ui:latest"

  # Host or IP address from which the embedded UI container accepts HTTP connections.
  # By default, use a loopback address to accept only local connections.
  # The empty string means "all available addresses".
  host: "127.0.0.1"

  # HTTP port of the UI application. Default: 2346.
  port: 2346

global:
  # Database engine. Currently, the only supported option: "postgres".
  engine: postgres

  # Debugging, when enabled, allows seeing more in the Database Lab logs
  # (not PostgreSQL logs). Enable in the case of troubleshooting.
  debug: false

  # Contains default configuration options of the restored database.
  database:
    # Default database username that will be used for Postgres management connections.
    # This user must exist.
    username: postgres

    # Default database name.
    dbname: postgres

  # Telemetry: anonymous statistics sent to Postgres.ai.
  # Used to analyze DLE usage; it helps the DLE maintainers make decisions on product development.
  # Please leave it enabled if possible – this will contribute to DLE development.
  # The full list of data points being collected: https://postgres.ai/docs/database-lab/telemetry
  telemetry:
    enabled: true
    # Telemetry API URL. To send anonymous telemetry data, keep it default ("https://postgres.ai/api/general").
    url: "https://postgres.ai/api/general"

# Manages filesystem pools (in the case of ZFS) or volume groups.
poolManager:
  # The full path which contains the pool mount directories. mountDir can contain multiple pool directories.
  mountDir: /var/lib/dblab

  # Subdir where PGDATA is located, relative to the pool mount directory.
  # This directory must already exist before launching the Database Lab instance. It may be empty if
  # data initialization is configured (see below).
  # Note, it is a relative path. Default: "data".
  # For example, for the PostgreSQL data directory "/var/lib/dblab/dblab_pool/data" (`dblab_pool` is a pool mount directory) set:
  #   mountDir: /var/lib/dblab
  #   dataSubDir: data
  # In this case, we assume that the mount point is: /var/lib/dblab/dblab_pool
  dataSubDir: data

  # Directory that will be used to mount clones. Subdirectories in this directory
  # will be used as mount points for clones. Subdirectory names will
  # correspond to ports. E.g., subdirectory "dblab_clone_6000" for the clone running on port 6000.
  clonesMountSubDir: clones

  # Unix domain socket directory used to establish local connections to cloned databases.
  socketSubDir: sockets

  # Directory that will be used to store observability artifacts. The directory will be created inside PGDATA.
  observerSubDir: observer

  # Snapshots with this suffix are considered preliminary. They are not supposed to be accessible to end-users.
  preSnapshotSuffix: "_pre"

  # Force selection of a working pool inside the `mountDir`.
  # It is an empty string by default, which means that the standard selection and rotation mechanism will be applied.
  selectedPool: ""

# Configure PostgreSQL containers
databaseContainer: &db_container
  # Database Lab provisions thin clones using Docker containers and uses auxiliary containers.
  # We need to specify which Postgres Docker image is to be used for that.
  # The default is the extended Postgres image built on top of the official Postgres image
  # (see https://postgres.ai/docs/database-lab/supported_databases).
  # Any custom or official Docker image that runs Postgres can be used. Our Dockerfile
  # (see https://gitlab.com/postgres-ai/custom-images/-/tree/master/extended)
  # is recommended if customization is needed.
  dockerImage: "postgresai/extended-postgres:14"

  # Custom parameters for containers with PostgreSQL, see
  # https://docs.docker.com/engine/reference/run/#runtime-constraints-on-resources
  containerConfig:
    "shm-size": 1gb

# Adjust PostgreSQL configuration
databaseConfigs: &db_configs
  configs:
    # In order to match production plans with Database Lab plans, set the Query Planning parameters to the same values as in production.
    shared_buffers: 1GB
    # shared_preload_libraries – copy the value from the source.
    # When adding shared preload libraries, make sure that "pg_stat_statements, auto_explain, logerrors" are in the list.
    # They are necessary for query and DB migration analysis.
    # Note: if you are using PostgreSQL 9.6 or older, remove the logerrors extension from the list since it is not supported.
    shared_preload_libraries: "pg_stat_statements, auto_explain, logerrors"
    # work_mem and all the Query Planning parameters – copy the values from the source.
    # To do it, use this query:
    #   select format($$%s = '%s'$$, name, setting)
    #   from pg_settings
    #   where
    #     name ~ '(work_mem$|^enable_|_cost$|scan_size$|effective_cache_size|^jit)'
    #     or name ~ '(^geqo|default_statistics_target|constraint_exclusion|cursor_tuple_fraction)'
    #     or name ~ '(collapse_limit$|parallel|plan_cache_mode)';
    work_mem: "100MB"
    # ... put Query Planning parameters here

# Details of provisioning – where data is located,
# thin cloning method, etc.
provision:
  <<: *db_container
  # Pool of ports for Postgres clones. Ports will be allocated sequentially,
  # starting from the lowest value. The "from" value must be less than "to".
  portPool:
    from: 6000
    to: 6100

  # Use sudo for ZFS/LVM and Docker commands if the Database Lab server is running
  # outside a container. Keep it "false" (default) when running in a container.
  useSudo: false

  # Avoid default password resetting in clones and have the ability for
  # existing users to log in with old passwords.
  keepUserPasswords: false

# Data retrieval flow. This section defines both initial retrieval and rules
# to keep the data directory in a synchronized state with the source. Both are optional:
# you may already have the data directory, so neither initial retrieval nor
# synchronization is needed.
#
# Data retrieval can also be considered as "thick" cloning. Once it's done, users
# can use "thin" cloning to get independent full-size clones of the database in
# seconds, for testing and development. Normally, retrieval (thick cloning) is
# a slow operation (1 TiB/h is a good speed). Optionally, the process of keeping
# the Database Lab data directory in sync with the source (being continuously
# updated) can be configured.
#
# There are two basic ways to organize data retrieval:
#   - "logical": use dump/restore processes, obtaining a logical copy of the initial
#     database (a sequence of SQL commands), and then loading it to
#     the target Database Lab data directory. This is the only option
#     for managed cloud PostgreSQL services such as Amazon RDS. Physically,
#     the copy of the database created using this method differs from
#     the original one (data blocks are stored differently). However,
#     row counts are the same, as well as internal database statistics,
#     allowing various kinds of development and testing, including
#     running the EXPLAIN command to optimize SQL queries.
#   - "physical": physically copy the data directory from the source (or from the
#     archive if a physical backup tool such as WAL-G, pgBackRest, or Barman
#     is used). This approach allows having a copy of the original database
#     which is physically identical, including the existing bloat and data
#     block locations. Not supported for managed cloud Postgres services
#     such as Amazon RDS.
retrieval:
  # The jobs section must not contain physical and logical restore jobs simultaneously.
  jobs:
    - physicalRestore
    - physicalSnapshot

  spec:
    # Restores database data from a physical backup.
    physicalRestore:
      options:
        <<: *db_container
        # Defines the tool to restore data.
        tool: pgbackrest

        # Sync instance options.
        sync:
          # Enable running of a sync instance.
          enabled: true

          # Custom health check options for a sync instance container.
          healthCheck:
            # Health check interval for a sync instance container (in seconds).
            interval: 5

            # Maximum number of health check retries.
            maxRetries: 200

          # Add PostgreSQL configuration parameters to the sync container.
          configs:
            shared_buffers: 2GB

          # Add PostgreSQL recovery configuration parameters to the sync container.
          recovery:
            # Uncomment this only if you are on Postgres version 11 or older.
            # standby_mode: on

        # Passes custom environment variables to the Docker container with the restoring tool.
        envs:
          PGBACKREST_LOG_LEVEL_CONSOLE: detail
          PGBACKREST_PROCESS_MAX: 2
          PGBACKREST_REPO: 1
          # SSH example
          PGBACKREST_REPO1_TYPE: posix
          PGBACKREST_REPO1_HOST: repo.hostname
          PGBACKREST_REPO1_HOST_USER: postgres
          # S3 example
          #PGBACKREST_REPO1_TYPE: s3
          #PGBACKREST_REPO1_PATH: "/pgbackrest"
          #PGBACKREST_REPO1_S3_BUCKET: my_bucket
          #PGBACKREST_REPO1_S3_ENDPOINT: s3.amazonaws.com
          #PGBACKREST_REPO1_S3_KEY: "XXXXXXXXXXXXXXXXXX"
          #PGBACKREST_REPO1_S3_KEY_SECRET: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
          #PGBACKREST_REPO1_S3_REGION: us_east_1

        # Defines pgBackRest configuration options.
        pgbackrest:
          stanza: stanzaName
          forceInit: false

    physicalSnapshot:
      options:
        # Skip taking a snapshot while the retrieval starts.
        skipStartSnapshot: false

        # Adjust PostgreSQL configuration of the snapshot.
        <<: *db_configs

        # Promote PGDATA after data fetching.
        promotion:
          <<: *db_container
          # Enable PGDATA promotion.
          enabled: true

          # Custom health check options for a data promotion container.
          healthCheck:
            # Health check interval for a data promotion container (in seconds).
            interval: 5

            # Maximum number of health check retries.
            maxRetries: 200

          # It is possible to define pre-processing SQL queries. For example, "/tmp/scripts/sql".
          # Default: empty string (no pre-processing defined).
          queryPreprocessing:
            # Path to SQL pre-processing queries.
            queryPath: ""

            # Worker limit for parallel queries.
            maxParallelWorkers: 2

          # Add PostgreSQL configuration parameters to the promotion container.
          configs:
            shared_buffers: 2GB

          # Add PostgreSQL recovery configuration parameters to the promotion container.
          recovery:
            # Uncomment this only if you are on Postgres version 11 or older.
            # standby_mode: on

        # It is possible to define a pre-processing script. For example, "/tmp/scripts/custom.sh".
        # Default: empty string (no pre-processing defined).
        # This can be used for scrubbing / eliminating PII data, defining data masking, etc.
        preprocessingScript: ""

        # Scheduler contains tasks that run on a schedule.
        scheduler:
          # Snapshot scheduler creates a new snapshot on a schedule.
          snapshot:
            # Timetable is defined in crontab format: https://en.wikipedia.org/wiki/Cron#Overview
            timetable: "0 */6 * * *"
          # Retention scheduler cleans up old snapshots on a schedule.
          retention:
            # Timetable is defined in crontab format: https://en.wikipedia.org/wiki/Cron#Overview
            timetable: "0 * * * *"
            # Limit defines how many snapshots should be kept.
            limit: 4

        # Passes custom environment variables to the promotion Docker container.
        envs:
          PGBACKREST_LOG_LEVEL_CONSOLE: detail
          PGBACKREST_PROCESS_MAX: 2
          PGBACKREST_REPO: 1
          # SSH example
          PGBACKREST_REPO1_TYPE: posix
          PGBACKREST_REPO1_HOST: repo.hostname
          PGBACKREST_REPO1_HOST_USER: postgres
          # S3 example
          #PGBACKREST_REPO1_TYPE: s3
          #PGBACKREST_REPO1_PATH: "/pgbackrest"
          #PGBACKREST_REPO1_S3_BUCKET: my_bucket
          #PGBACKREST_REPO1_S3_ENDPOINT: s3.amazonaws.com
          #PGBACKREST_REPO1_S3_KEY: "XXXXXXXXXXXXXXXXXX"
          #PGBACKREST_REPO1_S3_KEY_SECRET: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
          #PGBACKREST_REPO1_S3_REGION: us_east_1

cloning:
  # Host that will be specified in the database connection info for all clones.
  # Use a public IP address if database connections are allowed from outside.
  # This value is only used to inform users how to connect to database clones.
  accessHost: "localhost"

  # Automatically delete clones after the specified minutes of inactivity.
  # 0 - disable automatic deletion.
  # Inactivity means:
  #   - no active sessions (queries being processed right now)
  #   - no recently logged queries in the query log
  maxIdleMinutes: 120


### INTEGRATION ###

# Postgres.ai Platform integration (provides GUI) – extends the open source offering.
# Uncomment the following lines if you need GUI, personal tokens, audit logs, and more.
#
#platform:
#  # Platform API URL. To work with Postgres.ai SaaS, keep it default
#  # ("https://postgres.ai/api/general").
#  url: "https://postgres.ai/api/general"
#
#  # Token for authorization in the Platform API. This token can be obtained on
#  # the Postgres.ai Console: https://postgres.ai/console/YOUR_ORG_NAME/tokens
#  # This token needs to be kept secret, known only to the administrator.
#  accessToken: "platform_access_token"
#
#  # Enable authorization with personal tokens of the organization's members.
#  # If false: all users must use the "accessToken" value for any API request.
#  # If true: "accessToken" is known only to the admin, users use their own tokens,
#  # and any token can be revoked without affecting others.
#  enablePersonalTokens: true
#
# CI Observer configuration.
#observer:
#  # Set up regexp rules for Postgres logs.
#  # These rules are applied before sending the logs to the Platform, to ensure that personal data is masked properly.
#  # Check the syntax of regular expressions: https://github.com/google/re2/wiki/Syntax
#  replacementRules:
#    "regexp": "replace"
#    "select \\d+": "***"
#    "[a-z0-9._%+\\-]+(@[a-z0-9.\\-]+\\.[a-z]{2,4})": "***$1"
#
# Tool to calculate the timing difference between Database Lab and production environments.
#estimator:
#  # The ratio evaluating the timing difference for operations involving IO Read between Database Lab and production environments.
#  readRatio: 1
#
#  # The ratio evaluating the timing difference for operations involving IO Write between Database Lab and production environments.
#  writeRatio: 1
#
#  # Time interval of samples taken by the profiler.
#  profilingInterval: 10ms
#
#  # The minimum number of samples sufficient to display the estimation results.
#  sampleThreshold: 20
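For orientation only: the "pgbackrest:" options (stanza, forceInit) and the PGBACKREST_* environment variables above are what parameterize pgBackRest inside the sync container. Below is a minimal, hypothetical Go sketch of how such an invocation could be composed. The --stanza, --pg1-path, --type=standby, and --delta flags and the restore/archive-get commands are standard pgBackRest CLI, but the function names and the mapping of forceInit to --delta are assumptions for illustration, not necessarily what the engine does.

package main

import "fmt"

// buildRestoreCommand sketches how a pgBackRest restore invocation could be
// assembled from the config values shown above (PGDATA path and stanza name).
func buildRestoreCommand(pgDataDir, stanza string, forceInit bool) string {
	cmd := fmt.Sprintf("pgbackrest --type=standby --pg1-path=%s --stanza=%s restore", pgDataDir, stanza)
	if forceInit {
		// Assumption: forceInit maps to a delta restore over an existing, non-empty PGDATA.
		cmd += " --delta"
	}
	return cmd
}

// buildWALRestoreSetting sketches a restore_command for continuous WAL replay on
// the sync instance; archive-get is the standard pgBackRest command for fetching WAL.
func buildWALRestoreSetting(pgDataDir, stanza string) string {
	return fmt.Sprintf("pgbackrest --pg1-path=%s --stanza=%s archive-get %%f %%p", pgDataDir, stanza)
}

func main() {
	fmt.Println(buildRestoreCommand("/var/lib/dblab/dblab_pool/data", "stanzaName", false))
	fmt.Println(buildWALRestoreSetting("/var/lib/dblab/dblab_pool/data", "stanzaName"))
}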

‎engine/internal/retrieval/engine/postgres/physical/physical.go

+4 lines changed: 4 additions & 0 deletions
@@ -72,6 +72,7 @@ type CopyOptions struct {
 	ContainerConfig map[string]interface{} `yaml:"containerConfig"`
 	Envs map[string]string `yaml:"envs"`
 	WALG walgOptions `yaml:"walg"`
+	PgBackRest pgbackrestOptions `yaml:"pgbackrest"`
 	CustomTool customOptions `yaml:"customTool"`
 	Sync Sync `yaml:"sync"`
 }
@@ -130,6 +131,9 @@ func (r *RestoreJob) getRestorer(tool string) (restorer, error) {
 	case walgTool:
 		return newWALG(r.fsPool.DataDir(), r.WALG), nil

+	case pgbackrestTool:
+		return newPgBackRest(r.fsPool.DataDir(), r.PgBackRest), nil
+
 	case customTool:
 		return newCustomTool(r.CustomTool), nil
 	}
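The diff above only shows the wiring: a new PgBackRest field in CopyOptions and a new case in getRestorer. The restorer implementation itself (pgbackrest.go) is among the four changed files but is not shown on this page. As a rough sketch of what it plausibly contains, based solely on the call site and the YAML keys in the config example (everything beyond the names visible above is an assumption):

package physical

// Hypothetical sketch, not the committed pgbackrest.go. The options struct
// mirrors the "pgbackrest:" section of the config example, and the constructor
// signature matches the call newPgBackRest(r.fsPool.DataDir(), r.PgBackRest)
// seen in getRestorer above.

const pgbackrestTool = "pgbackrest"

// pgbackrestOptions maps the YAML keys shown in the config example.
type pgbackrestOptions struct {
	Stanza    string `yaml:"stanza"`
	ForceInit bool   `yaml:"forceInit"`
}

// pgbackrest restores physical backups using the pgBackRest tool.
type pgbackrest struct {
	pgDataDir string
	options   pgbackrestOptions
}

// newPgBackRest creates a pgBackRest restorer for the given data directory.
func newPgBackRest(pgDataDir string, options pgbackrestOptions) *pgbackrest {
	return &pgbackrest{pgDataDir: pgDataDir, options: options}
}

This mirrors how the existing WAL-G case is wired: adding another data source amounts to a new options struct, a constructor, and one more case in getRestorer.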
