Feature/custom restore data dir #638

jefeish · Sep 17, 2020

This provides an option for the backup/restore utilities to use separate base directories.

Based on a customer use case:

When each backup process is complete, the data is sent off to HDFS for storage and the local copies are removed (the whole backup folder, including snapshot subfolders). When a restore from a backup is requested, the backup data is retrieved from HDFS and placed back into the 'GHE_DATA_DIR'. This is required because the backup and restore scripts refer to the same 'GHE_DATA_DIR'.

This scenario causes conflicts if a restore is attempted while a backup is still in progress.

This feature provides an option for the backup/restore utilities to use separate base directories. It introduces GHE_RESTORE_DATA_DIR, which allows users to restore from a different data location than the one used for backups.
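
As a rough illustration of the separate-directory workflow, here is a minimal sketch of staging a retrieved snapshot into a restore-only base directory while GHE_DATA_DIR stays reserved for backups. The helper name `stage_snapshot_for_restore`, the throwaway paths, and the snapshot name are hypothetical and not part of this PR; a plain `cp` stands in for the retrieval from HDFS. The `current` symlink mirrors the one the `ghe-restore-snapshot-path` tests refer to.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not part of backup-utils): copy one retrieved snapshot
# into a restore-only base directory and point the "current" symlink at it.
# $1 = directory holding the retrieved snapshot, $2 = restore base directory.
stage_snapshot_for_restore() {
  local source_dir="$1" restore_base="$2"
  local snapshot_name
  snapshot_name="$(basename "$source_dir")"

  mkdir -p "$restore_base"
  cp -a "$source_dir" "$restore_base/$snapshot_name"
  # Relative symlink, replaced in place if it already exists.
  ln -sfn "$snapshot_name" "$restore_base/current"
}

# Demo with throwaway directories standing in for real storage.
work_dir="$(mktemp -d)"
mkdir -p "$work_dir/retrieved/20200917T120000"   # illustrative snapshot name
echo "demo" > "$work_dir/retrieved/20200917T120000/marker"

GHE_DATA_DIR="$work_dir/backup-data"            # left untouched for backups
GHE_RESTORE_DATA_DIR="$work_dir/restore-data"   # used only for restores

stage_snapshot_for_restore "$work_dir/retrieved/20200917T120000" "$GHE_RESTORE_DATA_DIR"
ls -l "$GHE_RESTORE_DATA_DIR"
```

Because the staged copy never lands in GHE_DATA_DIR, a backup that is still in progress there is not affected by the restore preparation.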

All original backup/restore functionality (command options) remains the same; this simply adds flexibility. If no GHE_RESTORE_DATA_DIR is provided in the backup.config file, the original GHE_DATA_DIR is used.
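
The fallback described above can be sketched with ordinary shell parameter expansion. This is an illustration under assumptions, not the code from this PR: the function name `resolve_restore_data_dir` and the example paths are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: resolve the base directory used for restores.
# Falls back to GHE_DATA_DIR when GHE_RESTORE_DATA_DIR is unset or empty.
resolve_restore_data_dir() {
  printf '%s\n' "${GHE_RESTORE_DATA_DIR:-$GHE_DATA_DIR}"
}

# Default behaviour: only GHE_DATA_DIR is configured.
GHE_DATA_DIR="/data/backups"
unset GHE_RESTORE_DATA_DIR
resolve_restore_data_dir   # prints /data/backups

# Opt-in behaviour: a separate restore location is configured.
GHE_RESTORE_DATA_DIR="/data/restores"
resolve_restore_data_dir   # prints /data/restores
```

Using `:-` rather than `-` means an empty value is treated the same as an unset one, so a commented-out or blank entry in the config keeps the original behaviour.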

This was tested live with the GitHub-Demo-Stack (AWS) and an additional EC2 restore instance.

Note:
The customer's use case could also have been addressed with alternative solutions that don't require a code change.

For example, run a separate backup-utils installation for restore actions, or delete only individual snapshots after each backup instead of the whole backup folder.

The decision to add this flexibility to our tools comes from the fact that it does not require customers to adjust their process (or policy), and it does not interfere with the existing functionality.

Test Runs

$ ./test/test-ghe-restore.sh
test: ghe-restore-snapshot-path reports an error when current symlink doesn't exist ... OK (0s)
test: ghe-restore-snapshot-path reports an error when specified snapshot doesn't exist ... OK (0s)
test: ghe-restore into configured vm ...                                OK (9s)
test: ghe-restore logs the benchmark ...                                OK (23s)
test: ghe-restore aborts without user verification ...                  OK (0s)
test: ghe-restore accepts user verification ...                         OK (5s)
test: ghe-restore -c into unconfigured vm ...                           OK (10s)
test: ghe-restore into unconfigured vm ...                              OK (28s)
test: ghe-restore with host arg and config value ...                    OK (14s)
test: ghe-restore with host arg ...                                     OK (10s)
test: ghe-restore no host arg or configured restore host ...            OK (0s)
test: ghe-restore with no pages backup ...                              OK (8s)
test: ghe-restore cluster backup to non-cluster appliance ...           OK (0s)
test: ghe-restore no leaked ssh host keys detected ...                  OK (0s)
test: ghe-restore with current backup leaked key detection ...          OK (0s)
test: ghe-restore fails when restore to an active HA pair ...           OK (1s)
test: ghe-restore honours --version flag ...                            OK (0s)
test: ghe-restore honours --help and -h flags ...                       OK (0s)
test: ghe-restore exits early on unsupported version ...                OK (0s)
test: ghe-restore cluster ...                                           OK (5s)
test: ghe-restore missing directories or files from source snapshot displays warning ... OK (6s)

$ ./test/test-ghe-backup.sh
test: ghe-backup first snapshot ...                                     OK (4s)
test: ghe-backup subsequent snapshot ...                                OK (5s)
test: ghe-backup logs the benchmark ...                                 OK (5s)
test: ghe-backup with relative data dir path ...                        OK (4s)
test: ghe-backup fails fast when old style run in progress ...          OK (1s)
test: ghe-backup cleans up stale in-progress file ...                   OK (4s)
test: ghe-backup without management console password ...                OK (4s)
test: ghe-backup empty hookshot directory ...                           OK (3s)
test: ghe-backup empty git-hooks directory ...                          OK (4s)
test: ghe-backup fsck ...                                               OK (15s)
test: ghe-backup stores version when not run from a clone ...           OK (4s)
test: ghe-backup with leaked SSH host key detection for current backup ... OK (4s)
test: ghe-backup with no leaked keys ...                                OK (3s)
test: ghe-backup honours --version flag ...                             OK (0s)
test: ghe-backup honours --help and -h flags ...                        OK (0s)
test: ghe-backup exits early on unsupported version ...                 OK (1s)
test: ghe-backup-strategy returns rsync for HA backup ...               OK (0s)
test: ghe-backup cluster ...                                            OK (4s)
test: ghe-backup not missing directories or files on source appliance ... OK (5s)
test: ghe-backup missing directories or files on source appliance ...   OK (4s)
test: ghe-backup fix_paths_for_ghe_version performance tests - gists ... OK (0s)
test: ghe-backup fix_paths_for_ghe_version performance tests - wikis ... OK (0s)
test: ghe-backup fix_paths_for_ghe_version newer/older ...              OK (1s)

/cc @michaelsainz @steffen

maclarel

This LGTM, but I'm likely not the best person to formally approve. Thanks for adding this, @jefeish!

jefeish · Dec 1, 2020

@maclarel thanks for the feedback... I'll address the merge conflict and we should be good to go for another 👀

jefeish added 5 commits Sep 14, 2020

updated all 'restore' scripts from 'DATA_DIR' to 'RESTORE_DATA_DIR'

45d3ace

add 'GHE_RESTORE_DATA_DIR'

da65d36

add 'GHE_RESTORE_SNAPSHOT_PATH' to comment output

86512af

commented 'GHE_RESTORE_DATA_DIR'

Verified

This commit was created on GitHub.com and signed with a verified signature using GitHub’s key.

GPG key ID: 4AEE18F83AFDEB23

fdbd5c4

making sure that the feature is disabled by default

Merge branch 'master' into feature/custom-restore-data-dir

Verified

6f2c36e

jefeish requested review from omgitsads, maclarel and lildude Sep 17, 2020

Merge branch 'master' into feature/custom-restore-data-dir

Verified

d30d1eb

oakeyc reviewed Nov 19, 2020

share/github-backup-utils/ghe-restore-storage (outdated)

oakeyc reviewed Nov 19, 2020

backup.config-example

jefeish added 2 commits Nov 30, 2020

sync'd from master

b81c5e4

sync'd from master

8e99ffe

maclarel reviewed Dec 1, 2020

oakeyc approved these changes Dec 1, 2020

github / backup-utils