Import website files from old server with Rsync and SSH

by Danila Vershinin, February 28, 2016, revisited on March 21, 2024

When you are switching hosting, you need a quick yet reliable way to transfer files between two servers.
Rsync is the best tool for the job. Here is how you do it.

First step to import website files. SSH keys

Passwordless SSH between the old and new servers is not required, but setting it up first saves you from retyping passwords. It also lets you run the import commands in the background easily.

Generate SSH key on the new server:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Next, make the old server trust this key:

ssh-copy-id -i ~/.ssh/id_rsa.pub root@1.2.3.4

where 1.2.3.4 is the old server’s IP address.

If SSH runs on a non-standard port, refer to our previous tutorial on password-less SSH login.
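Before starting a long transfer, it can help to confirm that key-based login really works without a prompt. The sketch below is one way to do that; the function name check_key_login is hypothetical, and the 10-second timeout is an arbitrary choice. BatchMode=yes makes ssh fail instead of asking for a password, so a zero exit status means the key was accepted.

```shell
#!/bin/bash
# check_key_login USERHOST [PORT]
# Returns 0 only if SSH can log in without any interactive prompt.
# (Hypothetical helper name; not part of the original tutorial.)
check_key_login() {
  local userhost=$1 port=${2:-22}
  # BatchMode=yes disables password prompts; "true" is a no-op remote command.
  ssh -o BatchMode=yes -o ConnectTimeout=10 -p "$port" "$userhost" true
}

# Usage (run against your own old server):
# check_key_login root@1.2.3.4 22 && echo "key login works"
```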

Run import with Rsync

Make sure it is installed on the new server first:

yum -y install rsync

First create the directory that will hold the copy of the remote files, e.g. mkdir -p /var/www/html. Copying the files then takes just a few lines:

REMOTE_PORT=22
REMOTE_USERHOST='root@1.2.3.4'
rsync -e "ssh -p $REMOTE_PORT" -avz $REMOTE_USERHOST:/var/www/html/ /var/www/html/

Run these commands one by one. You can change the default port from 22 to whatever SSH port is configured on the remote machine. The second line specifies the system user and IP address of the remote server. The last command will copy all the files from a remote location over to the local directory, including hidden files.
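If you would rather preview a transfer before it writes anything, rsync's -n (--dry-run) flag lists what would be copied without touching the destination. The sketch below rehearses this on two throwaway local directories, which are an assumption made only so the example runs anywhere; the commented line shows the same flag added to the command above.

```shell
#!/bin/bash
# Throwaway directories standing in for the remote source and local target.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
echo '<h1>hello</h1>' > "$demo/src/index.html"

# -n (--dry-run): report what would be transferred, but write nothing.
rsync -avzn "$demo/src/" "$demo/dst/"

# The destination is still empty after the dry run.
ls -A "$demo/dst"

# The same idea for the real import:
# rsync -e "ssh -p $REMOTE_PORT" -avzn $REMOTE_USERHOST:/var/www/html/ /var/www/html/
```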

If you want the import to keep running in the background, even after you close the SSH session, run:

nohup rsync -e "ssh -p $REMOTE_PORT" -avz $REMOTE_USERHOST:/var/www/html/ /var/www/html/ > import.log 2>&1&

Here’s a breakdown of the command and the significance of the slash (/) at the end of the directory paths:

nohup: Makes the command immune to the hangup signal, so it keeps running even if the user logs out. Putting the command in the background is the job of the trailing &.
-e "ssh -p $REMOTE_PORT": Specifies the remote shell to use, in this case, SSH on a specific port defined by the variable $REMOTE_PORT.
-avz: This is a combination of options where a stands for “archive” (which preserves permissions, symlinks, etc.), v for “verbose” (provides detailed output of the operation), and z for “compress” (reduces the size of data during the transfer).
$REMOTE_USERHOST:/var/www/html/: Specifies the source directory on the remote server. $REMOTE_USERHOST contains the user and host information in the format user@hostname.
/var/www/html/: Specifies the destination directory on the local machine. Unlike on the source, the trailing slash here does not change where the files end up.
> import.log: Redirects standard output (stdout) to a file named import.log, effectively logging the output of the command.
2>&1: Redirects standard error (stderr) to standard output (stdout), which means both standard output and standard error get logged to import.log.
&: Puts the command in the background, allowing the user to continue using the terminal session.
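The combination of nohup, the two redirections, and & can be tried out safely with a harmless stand-in command before using it on a real import. This is a minimal sketch; the temporary log file and the sh -c one-liner are placeholders for import.log and rsync.

```shell
#!/bin/bash
# Temporary file standing in for import.log.
demo_log=$(mktemp)

# The stand-in command writes one line to stdout and one to stderr.
nohup sh -c 'echo "to stdout"; echo "to stderr" >&2' > "$demo_log" 2>&1 &
job_pid=$!          # $! holds the PID of the last background job

wait "$job_pid"     # only needed in this demo, so the log is complete
cat "$demo_log"     # both lines ended up in the same file
```

Saving $! right after launching the job is also handy for the real import, since it lets you check later whether the process is still alive.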

Importance of the Slash (/):

In the context of rsync, the trailing slash on directory paths has a specific meaning. When the source directory in an rsync command ends with a slash /, rsync will copy the contents of the directory, rather than the directory itself. This means:

With the slash ($REMOTE_USERHOST:/var/www/html/): rsync copies the contents of /var/www/html/ from the remote server into the local /var/www/html/ directory. It does not create a nested /var/www/html/html directory.
Without the slash ($REMOTE_USERHOST:/var/www/html): rsync would copy the html directory itself into the destination, resulting in /var/www/html/html, which is undesired.

Therefore, the trailing slash is important for ensuring that the contents of the source directory are synchronized directly into the root of the destination directory, maintaining the intended directory structure without creating an additional nested directory.
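The difference is easy to see on a pair of local throwaway directories. The sketch below copies the same source twice, once with and once without the trailing slash; the directory names are made up for the demonstration.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir -p "$demo/html" "$demo/with_slash" "$demo/without_slash"
echo 'hello' > "$demo/html/index.html"

# Source ends with a slash: the contents of html/ are copied.
rsync -a "$demo/html/" "$demo/with_slash/"

# Source has no slash: the html directory itself is copied.
rsync -a "$demo/html" "$demo/without_slash/"

# Lists with_slash/index.html and without_slash/html/index.html
find "$demo/with_slash" "$demo/without_slash" -type f
```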

It is safe to close Terminal / PuTTY now. You can check the progress of the running command by following the import.log file it created:

tail -f import.log

The command above will display real-time updates of newly transferred files of our background import process.

Making import reliable

I often find that the reason for changing hosts is reliability. Quite often the original server has an unreliable network, and the import simply fails because the connection to the old server went down.

How do we deal with that? Let’s expand our previous commands into a script that repeatedly checks the rsync exit status and restarts the transfer until it succeeds.

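#!/bin/bash
REMOTE_PORT=22
REMOTE_USERHOST='root@1.2.3.4'
# ssh-keyscan needs the bare host, so strip everything up to the "@"
REMOTE_HOST="${REMOTE_USERHOST#*@}"

# Accept the remote host key without verifying it (insecure, see below)
ssh-keyscan -t rsa -T 10 -p "$REMOTE_PORT" "$REMOTE_HOST" >> ~/.ssh/known_hosts
# Install our public key on the old server so that retries need no password
ssh-copy-id -i ~/.ssh/id_rsa.pub -p "$REMOTE_PORT" "$REMOTE_USERHOST"

# Repeat until rsync exits with status 0
while ! rsync -e "ssh -p $REMOTE_PORT" -avz "$REMOTE_USERHOST:/var/www/html/" /var/www/html/
do
  echo "Transfer failed, restarting..."
  sleep 1
done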

We also added two helpful lines:

ssh-keyscan will accept the remote host key for us. This is insecure, but for automation purposes we sacrifice some security for convenience.
ssh-copy-id will make sure there is trust between the two hosts, so that subsequent tries skip the password prompt.

Save the script as, e.g., import.sh and make it executable by giving it the +x permission: chmod +x import.sh. Then run it using the background approach:

nohup ./import.sh > import.log 2>&1 &

After each failure, rsync picks up only the remaining files, and import.log records both the transferred files and the failures. It does not have to download all the files every time, which is what makes this approach great. The script keeps retrying until it has transferred all the original files.
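The script above retries forever, which may not be what you want if the old server is gone for good. A capped variant is sketched below. The retry helper, its arguments, and the flaky stand-in command are all hypothetical and exist only to make the example runnable without a remote server.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS have been used up.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed, retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Stand-in for rsync: fails on the first two calls, succeeds on the third.
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
  local n
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  [ "$n" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $(cat "$counter_file") attempts"
# prints "succeeded after 3 attempts"

# For the real import, something like:
# retry 100 5 rsync -e "ssh -p $REMOTE_PORT" -avz "$REMOTE_USERHOST:/var/www/html/" /var/www/html/
```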

Back up the whole drive to a remote location

It can be useful to snapshot the entire drive and put that image in a remote location.
Suppose that you have a Hetzner Storage Box set up, and an OS X machine with an external 2 TB USB HDD.
In our case, the disk can be accessed as /dev/rdisk2. Check your exact disk number using Disk Utility in OS X.

First make sure that you can freely SSH into the storage box: ssh -p23 uXXXXX@uXXXXX.your-storagebox.de.

To snapshot the HDD and upload the image remotely, you can run:

tmux new
sudo -i
dd if=/dev/rdisk2 | gzip -1 - | ssh -p23 uXXXXX@uXXXXX.your-storagebox.de dd of=usbhdd2tb.img.gz

Watch the output, though. If it includes this:

dd: /dev/rdisk2: Input/output error

then the disk is faulty and you must use ddrescue instead.

However, ddrescue doesn’t support piping its output over to SSH, so we must mount our storage box locally using SSHFS on the Mac. This requires installing macFUSE and SSHFS.
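Before committing to a multi-hour upload of a healthy disk, the dd and gzip pipeline shown earlier can be rehearsed locally on a small file. The sketch below leaves out sudo and ssh and uses 1 MiB of random data as a stand-in for the disk; the file names are made up for the example.

```shell
#!/bin/bash
demo=$(mktemp -d)

# 1 MiB of random data stands in for /dev/rdisk2.
head -c 1048576 /dev/urandom > "$demo/disk.bin"

# Same shape as the real pipeline, with a local file instead of the ssh hop.
dd if="$demo/disk.bin" 2>/dev/null | gzip -1 - | dd of="$demo/disk.img.gz" 2>/dev/null

# gzip -t checks the archive; cmp confirms it decompresses to identical bytes.
gzip -t "$demo/disk.img.gz" \
  && gzip -dc "$demo/disk.img.gz" | cmp - "$demo/disk.bin" \
  && echo "image verified"
```

The same gzip -t check can be run against the uploaded image afterwards, wherever you have a shell with access to the file.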
