How the blog was built part 2 - backup


At this point I have customised the original GitHub scripts and made the setup more configurable than before, but having so many different commands to run meant I really wanted an easier way to back up the existing Ghost content and files.

I had run into some issues, and when I re-ran docker-compose up it reset the Ghost blog and lost all the existing files. Luckily I had already uploaded my data and the blog was live, so I just copied the posts I had created back in and manually re-created my profile and the other settings.

Latest Ghost version

One of those issues was the Ghost version changing underneath me, so I hardcoded it in the docker-compose.yml file as an argument. I want to update when I want to update.
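
I won't reproduce my whole docker-compose.yml here, but the idea is roughly this. Treat it as a sketch only - the app service name and the GHOST_VERSION argument name depend on your own compose file and Dockerfile:

# Sketch only - service and argument names depend on your own setup.
services:
  app:
    build:
      context: .
      args:
        # Pin the Ghost version; bump this only when you decide to upgrade
        GHOST_VERSION: "4.12.1"
docker-compose.yml (sketch)

The matching Dockerfile would then declare ARG GHOST_VERSION and use ghost:${GHOST_VERSION} as its base image instead of ghost:latest.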

OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: no such file or directory: unknown

Running these commands on Windows, I sometimes faced the above issue. All I had to do was change the line endings from CRLF to LF in VSCode and save the file.
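
A way to avoid fixing it by hand each time (a suggestion, not something my original setup used) is a .gitattributes rule that forces LF line endings on shell scripts:

# Force LF on shell scripts so a Windows checkout never converts them to CRLF
*.sh text eol=lf
.gitattributes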

Creating the backup script

My backup script would need to do the following things:

  • Back up the Ghost blog content, profile data and images.
  • Collect the backup files together.
  • Upload the backup folder to S3.

Backup the content folder

Firstly, Ghost does not provide a CLI command to back up your files, so we need to copy the data/content folder itself. This folder contains the database, settings and images used in your blog but not the actual blog post content - I will get that in a separate step.

This script will copy the content folder to a backup location - the additional volume referenced in the docker-compose.yml file.

It takes the current datetime, creates a backup folder named with that value, e.g. backup/2021-09-09-23-02-58, archives the Ghost content folder and moves the archive, e.g. ghost-content-2021-09-09-23-02-58.tar.gz, to the backup location.

#!/bin/bash

echo "Backing up Ghost"

NOW=$1
GHOST_INSTALL=/var/www/ghost/
GHOST_ARCHIVE=ghost-content-$NOW.tar.gz
BACKUP_LOCATION=backup

if [ ! -d "/$BACKUP_LOCATION/$NOW" ]
then
    # -p also creates the parent backup folder if it does not exist yet
    mkdir -p "/$BACKUP_LOCATION/$NOW"
    echo "Created required /$BACKUP_LOCATION/$NOW folder"
fi

cd $GHOST_INSTALL

# c - create a new archive, v - show verbose progress,
# f - use the given archive file name, z - compress with gzip
echo "Archiving Ghost content"
tar cvzf $GHOST_ARCHIVE content

echo "Moving archive to $BACKUP_LOCATION/$NOW"
cp -Rv $GHOST_INSTALL$GHOST_ARCHIVE /$BACKUP_LOCATION/$NOW
rm -f $GHOST_INSTALL$GHOST_ARCHIVE

# To recover your blog, just replace the content folder and import your JSON file in the Labs screen.
backup.sh

To run the script you pass in the datetime value: docker-compose exec -T app /usr/local/bin/backup.sh %datetime%.

Export the JSON content

The other issue with Ghost is that the blog content I want to export can only be exported through the admin UI - there is no CLI command for it. This is where I use Cypress ;)

In this test I have a reusable login sequence that runs before each test, although there is only one test at the moment anyway. It gets the Cypress config values from the cypress.json file, where I have set the base URL and username and left the password blank.

A very important property, trashAssetsBeforeRuns, is set to true so the test starts fresh each time with an empty cypress/downloads folder. When I copy the JSON file later I only want the one just created, because I won't know what datetime value is in its filename.

{
    "baseUrl": "https://binarydreams.biz/ghost",
    "env": {
        "username": "[email protected]",
        "password": ""
    },
    "trashAssetsBeforeRuns": true
}
cypress.json

This is the test file.

  • Get the username and password environment variables.
  • Fail early if the password is missing, and never show its value in the Cypress Command Log - that would be bad!
  • If the credentials are present, log in and begin the test.
  • Click each UI link/button until the JSON is exported. This saves the file to the cypress/downloads folder.

Note the timeout value is set to 10 seconds because the very first run can take a while for the admin UI to load. If it fails, I just have to run it again - I would like to make this more resilient so I don't have to do that.

A very important note: the references to generated element IDs like #ember63 are only valid for the current Ghost install. If you ever reinstall Ghost OR delete the container and re-run the image, these ember IDs will likely have changed and the test has to be updated. Be warned! Again, I'd like this to be more resilient to change - see the sketch after the spec file below.


// Command to use to pass secret to cypress
// as-a local cypress open/run

describe('Backup', () => {

  beforeEach(() => {
    // Log into ghost
    const username = Cypress.env('username')
    const password = Cypress.env('password')
    
    cy.visit('/#/signin')

    // it is ok for the username to be visible in the Command Log
    expect(username, 'username was set').to.be.a('string').and.not.be.empty
    // but the password value should not be shown
    if (typeof password !== 'string' || !password) {
      throw new Error('Missing password value, set using CYPRESS_password=...')
    }

    cy.get('#ember7').type(username).should('have.value', username)
    cy.get('#ember9').type(password, { log: false }).should(el$ => {
      if (el$.val() !== password) {
        throw new Error('Different value of typed password')
      }
    })

    // Click Log in button
    cy.get('#ember11 > span').click()

  })

  it('Exports content as JSON', () => {

    // Click Settings icon
    cy.get('.gh-nav-bottom-tabicon', { timeout: 10000 }).should('be.visible').click()

    // Click Labs
    cy.get('#ember63 > .pink > svg').click()

    // Click Export button
    cy.get(':nth-child(2) > .gh-expandable-header > .gh-btn > span').click()

  })

  // I have no members so no need to export that data
  // Plus no need to download the theme as I'm using the default - Casper.
})
content_export_spec.js
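
One way to make this more resilient would be to select by visible text instead of the generated ember IDs, which Cypress supports with cy.contains. This is only a sketch - the "Labs" and "Export" labels are assumptions about the Ghost admin UI and may differ between versions:

// Sketch only - selecting by visible text instead of generated #ember IDs.
// The "Labs" and "Export" labels are assumed and may change between Ghost versions.
it('Exports content as JSON (text-based selectors)', () => {

  // Open the settings area via the bottom nav icon as before
  cy.get('.gh-nav-bottom-tabicon', { timeout: 10000 }).should('be.visible').click()

  // Find the Labs link by its text rather than its ember ID
  cy.contains('a', 'Labs').click()

  // Click the Export button by its text
  cy.contains('button', 'Export').click()

})
content_export_spec.js (sketch)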

To run the Cypress test we store the Ghost Admin password locally in an .as-a.ini config file. Install the as-a NPM package and keep this file in your project root, alongside the command or script file that runs the test. Never commit it to your repository.

[local]
CYPRESS_password=yourpasswordgoeshere

Then we run Cypress headless with CALL npx as-a local cypress run, which expects the .as-a.ini file to be there with a local profile and returns to the main process, and then we copy the exported JSON file to the backup location with xcopy /Y .\cypress\downloads\*.json .\data\backup\%datetime%\.

Upload to S3

Now we can finally upload the backup to S3. This needs a new AWS profile, e.g. ghost-blogger, with just enough S3 access to upload files to the backup bucket.
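
The exact permissions are up to you, but a minimal sketch of the kind of policy the ghost-blogger user could have looks like this (the bucket name follows the %GHOST_DOMAIN%-backup pattern used below):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListBackupBucket",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::binarydreams.biz-backup"
        },
        {
            "Sid": "UploadBackupFiles",
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject"],
            "Resource": "arn:aws:s3:::binarydreams.biz-backup/*"
        }
    ]
}
ghost-blogger IAM policy (sketch)

The profile itself is then created locally with aws configure --profile ghost-blogger using that user's access keys.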

This is where I use the AWS CLI to take the files from the backup folder and sync them to the domain backup bucket in my region - aws s3 sync ./data/backup/ s3://%GHOST_DOMAIN%-backup --region %AWS_REGION% --profile ghost-blogger.

The final script

This is the resulting batch script to back up everything and then upload to S3. I intend to rewrite this in Python - a rough sketch of that follows after the batch file. You will note I have hardcoded the domain and region here, but I want to configure these elsewhere eventually so the script is reusable without editing.

@echo off
REM This file will run on Windows only of course

set GHOST_DOMAIN=binarydreams.biz
set AWS_REGION=eu-west-1

REM Get the current date time
for /f %%a in ('powershell -Command "Get-Date -format yyyy-MM-dd-HH-mm-ss"') do set datetime=%%a

ECHO Back up content folder first
docker-compose exec -T app /usr/local/bin/backup.sh %datetime%

ECHO Run the UI test to export the content as a JSON file and return to this process
CALL npx as-a local cypress run

ECHO Copy the JSON file to the backup folder
xcopy /Y .\cypress\downloads\*.json .\data\backup\%datetime%\

ECHO Sync back up files to S3
aws s3 sync ./data/backup/ s3://%GHOST_DOMAIN%-backup --region %AWS_REGION% --profile ghost-blogger
backup_website.bat
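
For reference, the eventual Python version could be shaped roughly like this. It is only a sketch of the same four steps, calling the same tools through subprocess, and is not what currently runs:

#!/usr/bin/env python3
# Sketch of the planned Python rewrite - untested, same steps as backup_website.bat.
import subprocess
from datetime import datetime

GHOST_DOMAIN = "binarydreams.biz"   # still hardcoded for now
AWS_REGION = "eu-west-1"

def run(command):
    """Run a shell command and stop the backup if it fails."""
    subprocess.run(command, shell=True, check=True)

now = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")

# 1. Back up the content folder inside the container
run(f"docker-compose exec -T app /usr/local/bin/backup.sh {now}")

# 2. Export the JSON content via the Cypress test
run("npx as-a local cypress run")

# 3. Copy the exported JSON into the backup folder (Windows only)
run(f"xcopy /Y .\\cypress\\downloads\\*.json .\\data\\backup\\{now}\\")

# 4. Sync the backup folder to S3
run(f"aws s3 sync ./data/backup/ s3://{GHOST_DOMAIN}-backup "
    f"--region {AWS_REGION} --profile ghost-blogger")
backup_website.py (sketch)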

Not quite the end

When I do the import, I should run a script that gathers the files, unzips them and re-imports the data. Ideally that script would be generated and stored alongside each backup, so very few commands are needed. A rough sketch of the restore step follows below.
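
This is not that generated script yet, just an untested sketch of what restoring the content folder could look like, assuming the same paths as backup.sh:

#!/bin/bash
# Sketch of a restore script - untested, assumes the same layout as backup.sh

NOW=$1                      # the backup folder name, e.g. 2021-09-09-23-02-58
GHOST_INSTALL=/var/www/ghost/
BACKUP_LOCATION=backup

echo "Restoring Ghost content from /$BACKUP_LOCATION/$NOW"

# Unpack the archived content folder back into the Ghost install folder
tar xvzf "/$BACKUP_LOCATION/$NOW/ghost-content-$NOW.tar.gz" -C "$GHOST_INSTALL"

# The JSON export still has to be imported through Ghost Admin > Labs
echo "Content restored - now import the JSON file in the Labs screen"
restore.sh (sketch)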

Do bear in mind I have only tested the JSON content import and not everything else yet, but that time will come when I upgrade my laptop.

Next on the agenda is to upload the generated static files in part 3.