How the blog was built part 5 - import a backup
Now I can back up and update the blog, but a few times I have had to manually re-import the blog data. As you saw in part 1, the backup scripts export the blog content as a JSON file and the content (including all the images used) as a compressed archive. The blog is manually imported with these steps:
- The blog content JSON file is imported with the Import button under Settings > Labs.
- The image content is uncompressed from the archive into the blog content folder.
- The profile is reset whenever the blog is reset, so the profile text and pictures have to be entered manually again.
I've done this enough times, and now that I need to update the Ghost blog version and fix the Snyk-reported security issues, I will automate all of this and show the scripts that will import a backup automatically. These scripts are a work in progress because I haven't needed to execute them yet. This code is my educated guess so far and I will update it once I have confirmed it works.
Importing the blog content
The import is split into two scripts: one to download the backup and the other to do the import.
The download_website.sh script will:
- Get the list of backups from AWS S3 created after a set datetime, the point from which backups are packaged correctly.
- Convert the JSON array to a bash shell array.
- Display the list of backup datetimes and wait for user input.
- Once selected, create the download folder.
- Get each file of the requested backup from AWS S3 and save it to the download folder.
- From Docker, extract the blog content images from the archive into the blog folder. NOTE: This doesn't currently work because I haven't restarted the blog with the new import folder reference in the Dockerfile.
This saves me logging into the AWS console, listing the bucket contents and downloading the archive, or using the AWS CLI myself.
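For reference, doing that by hand with the AWS CLI would look roughly like this; the bucket name and profile match the variables used in the script below, and the datetime is just a placeholder:
# List the backup objects in the bucket
aws s3 ls s3://binarydreams.biz-backup/ --region eu-west-1 --profile ghost-blogger
# Download one of the backup files by hand (placeholder datetime)
aws s3 cp s3://binarydreams.biz-backup/2023-03-03-00-00-00/ghost-content-2023-03-03-00-00-00.tar.gz data/import/2023-03-03-00-00-00/ --region eu-west-1 --profile ghost-blogger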
#!/bin/bash
GHOST_DOMAIN=binarydreams.biz
AWS_REGION=eu-west-1
# Backups are packaged the expected way from this datetime onwards
DATETIME="2023-03-03-00-00-00"
echo "\nGet list of backups from S3 ..."
# Filter after datetime AND by tar.gz file because there is only ever one in a backup rather than by .json
# to get a definitive list of backups
declare -a DATETIMES_BASHED
declare -a DATETIMES
DATETIMES=$(aws s3api list-objects-v2 --bucket $GHOST_DOMAIN-backup --region $AWS_REGION --output json --profile ghost-blogger --query 'Contents[?LastModified > `'"$DATETIME"'` && ends_with(Key, `.tar.gz`)].Key | sort(@)')
echo "Backups found.\n"
# I didn't want to install more extensions etc but I just wanted
# a working solution.
# installed xcode-select --install
# brew install jq
# https://jqplay.org/
# jq -r outputs a raw string
# @sh converts the input to space separated strings
# and also removes [] and ,
# tr removes the single quotes from the string output
DATETIMES_BASHED=($(jq -r '@sh' <<< $DATETIMES | tr -d \'\"))
# Show datetimes after $DATETIME and add the extracted string
# to a new array
declare -a EXTRACTED_DATETIMES
for (( i=0; i<${#DATETIMES_BASHED[@]}; i++ ));
do
# Extract first 19 characters to get datetime
backup=${DATETIMES_BASHED[$i]}
#echo $backup
backup=${backup:0:19}
EXTRACTED_DATETIMES+=($backup)
menuitem=$(($i + 1))
echo "[${menuitem}] ${backup}"
done
echo "[x] Any other key to exit\n"
read -p "Choose backup number> " RESULT
# Check if not a number
if [ -z "${RESULT##*[!0-9]*}" ]
then
exit 1
fi
# Reduce array index to get correct menu item
RESULT=$(($RESULT - 1))
SELECTED_DATE=${EXTRACTED_DATETIMES[$RESULT]}
echo "\nDownloading backup $SELECTED_DATE\n"
IMPORT_LOCATION="data/import"
DOWNLOAD_FOLDER="$IMPORT_LOCATION/$SELECTED_DATE"
# Create backup download folder if required
if [ ! -d "$DOWNLOAD_FOLDER" ]
then
mkdir -p "$DOWNLOAD_FOLDER"
if [ $? -ne 0 ]; then
exit 1
fi
echo "Created required $DOWNLOAD_FOLDER folder"
else
# Empty the existing download folder (the glob must be outside the quotes to expand)
rm -rf "$DOWNLOAD_FOLDER"/*
if [ $? -ne 0 ]; then
exit 1
fi
fi
function get_file {
FILENAME=$1
FILE_KEY="$SELECTED_DATE/$FILENAME"
OUTPUT_FILE="$DOWNLOAD_FOLDER/$FILENAME"
OUTPUT=$(aws s3api get-object --bucket $GHOST_DOMAIN-backup --region $AWS_REGION --profile ghost-blogger --key $FILE_KEY $OUTPUT_FILE)
echo "$FILENAME downloaded."
}
get_file "ghost-content-$SELECTED_DATE.tar.gz"
get_file "content.ghost.$SELECTED_DATE.json"
get_file "profile.ghost.$SELECTED_DATE.json"
echo "Download complete.\n"
echo "Extract content folder from archive"
docker compose exec -T app /usr/local/bin/extract_content.sh $SELECTED_DATE
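That last line assumes the data/import folder is visible inside the app container which, as noted earlier, isn't wired up yet. A quick way to check that assumption before relying on the extraction (service name and paths taken from the commands above, datetime is a placeholder):
# If this fails, the import folder is not yet mounted or copied into the container
docker compose exec app ls /import/2023-03-03-00-00-00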
Extract the blog data
The first part of the extraction is to uncompress the blog content into the import location; the second part, moving it to the Ghost install location, still needs to be added.
#!/bin/bash
NOW=$1
GHOST_INSTALL=/var/www/ghost/
GHOST_ARCHIVE=ghost-content-$NOW.tar.gz
IMPORT_LOCATION=import/$NOW
echo "Unarchiving Ghost content"
cd /$IMPORT_LOCATION
# x - extract, v - show verbose progress,
# z - decompress the gzip archive, f - archive file name
tar -xzvf $GHOST_ARCHIVE -C /$IMPORT_LOCATION
if [ $? -ne 0 ]; then
exit 1
fi
#echo "Moving archive to $IMPORT_LOCATION"
#cp -Rv $GHOST_INSTALL$GHOST_ARCHIVE /$IMPORT_LOCATION
#rm -f $GHOST_INSTALL$GHOST_ARCHIVE
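The move to the Ghost install location isn't scripted yet; a minimal sketch of what it could look like, assuming the archive unpacks to a content folder (paths taken from the variables above):
# Sketch only: copy the extracted content into the Ghost install location
cp -Rv /$IMPORT_LOCATION/content/. ${GHOST_INSTALL}content/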
Import the blog data
The second script, import_website.sh, will:
- Get the list of downloaded imports found in the import folder.
- Wait for the user to select the datetime of the import.
- Execute the Cypress tests with the selected datetime.
#!/bin/bash
echo "\nGet list of imports from import folder ..."
declare -a FOLDERS
FOLDERS=($(ls -d data/import/*))
# Exit if no imports have been downloaded yet
if [ ${#FOLDERS[@]} -eq 0 ]
then
echo "No imports found."
exit 1
fi
for (( i=0; i<${#FOLDERS[@]}; i++ ));
do
folder=${FOLDERS[$i]}
menuitem=$(($i + 1))
echo "[${menuitem}] $folder"
done
echo "[x] Any other key to exit\n"
read -p "Choose import number> " RESULT
# Check if not a number
if [ -z "${RESULT##*[!0-9]*}" ]
then
exit 1
fi
# Reduce the menu number to get the correct array index
RESULT=$(($RESULT - 1))
# Keep just the datetime folder name, not the full data/import path
SELECTED_DATE=$(basename "${FOLDERS[$RESULT]}")
echo $SELECTED_DATE
# TODO:
# User choose whether to reset blog content by deleting existing blog content
# Check if first page in test is logging in OR blog setup.
# FOR NOW:
# Will need to manually setup blog, delete default blog posts and content files
# The UI tests should do the rest
#echo "Run the UI test to import the blog from JSON files and return to this process"
#npx as-a binarydreams-blog cypress run --spec "cypress/e2e/ghost_import.cy.js" --env timestamp=$SELECTED_DATE
The Cypress test will:
- Log into Ghost
- Check the blog content JSON file exists
- Run the import with the import datetime passed in as a Cypress environment variable
Then the profile is imported with these steps:
- Log into Ghost
- Read the profile JSON file from the expected location
- Browse to the profile page
- Upload the cover picture
- Upload the profile picture
- Enter the profile details
/// <reference types="cypress" />
// Command to use to pass secret to cypress
// as-a local cypress open/run
describe('Import', () => {
beforeEach(() => {
// Log into ghost
const username = Cypress.env('username')
const password = Cypress.env('password')
cy.visit('/#/signin')
// it is ok for the username to be visible in the Command Log
expect(username, 'username was set').to.be.a('string').and.not.be.empty
// but the password value should not be shown
if (typeof password !== 'string' || !password) {
throw new Error('Missing password value, set using CYPRESS_password=...')
}
cy.get('#ember7').type(username).should('have.value', username)
cy.get('#ember9').type(password, { log: false }).should(el$ => {
if (el$.val() !== password) {
throw new Error('Different value of typed password')
}
})
// Click Log in button
cy.get('#ember11 > span').click()
})
it('Content from JSON', () => {
let timestamp = Cypress.env("timestamp")
let inputFile = `/import/${timestamp}/content.ghost.${timestamp}.json`
cy.readFile(inputFile)
// Click Settings icon
cy.get('.gh-nav-bottom-tabicon', { timeout: 10000 }).should('be.visible').click()
// The Labs link is generated so go via the link
cy.visit('/#/settings/labs')
// Click browse and select the file
cy.get('.gh-input > span').selectFile(inputFile)
// Click Import button
cy.get(':nth-child(1) > .gh-expandable-header > #startupload > span').click()
})
it('Profile from JSON', () => {
let timestamp = Cypress.env("timestamp")
let inputFile = `/import/${timestamp}/profile.ghost.${timestamp}.json`
// cy.readFile yields the parsed JSON, so use the profile inside then()
cy.readFile(inputFile).then((profile) => {
// Click Settings icon
cy.get('.gh-nav-bottom-tabicon', { timeout: 10000 }).should('be.visible').click()
// The profile page is easier to reach via the link
cy.visit('/#/staff/jp')
// Cover picture
cy.get('.gh-btn.gh-btn-default.user-cover-edit', { timeout: 10000 })
.should('be.visible').click()
cy.get('.gh-btn.gh-btn-white').click().selectFile(profile.coverpicture)
// Save the picture
cy.get('.gh-btn.gh-btn-black.right.gh-btn-icon.ember-view').click()
// Profile picture
cy.get('.edit-user-image', { timeout: 10000 })
.should('be.visible').click()
cy.get('.gh-btn.gh-btn-white').click().selectFile(profile.profilepicture)
// Save the picture
cy.get('.gh-btn.gh-btn-black.right.gh-btn-icon.ember-view').click()
// Import text from profile file
cy.get('#user-name')
.type(profile.username)
cy.get('#user-slug')
.type(profile.userslug)
cy.get('#user-email')
.type(profile.email)
cy.get('#user-location')
.type(profile.location)
cy.get('#user-website')
// assuming the profile file has a website field
.type(profile.website)
cy.get('#user-facebook')
.type(profile.facebookprofile)
cy.get('#user-twitter')
.type(profile.twitterprofile)
cy.get('#user-bio')
.type(profile.bio)
})
})
})
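Once the scripts are confirmed working, the intended end-to-end flow would be roughly the following, assuming both scripts sit in the blog folder and are executable:
# Pick and download a backup from S3, then extract the content in the container
./download_website.sh
# Pick the downloaded backup and drive the Ghost UI import with Cypress
./import_website.sh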
Once this has been fully tested, I will update this article.