Storages
Currently supported storages:
- Amazon Simple Storage Service (S3)
- Rackspace Cloud Files (Mosso)
- Ninefold Cloud Storage
- Dropbox Web Service
- Remote Server (Protocols: FTP, SFTP, SCP, RSync)
- Local Storage
store_with S3 do |s3|
  s3.access_key_id     = 'my_access_key_id'
  s3.secret_access_key = 'my_secret_access_key'
  s3.region            = 'us-east-1'
  s3.bucket            = 'bucket-name'
  s3.path              = '/path/to/my/backups'
  s3.keep              = 10
end
You will need an Amazon AWS (S3) account. You can get one here.
Available regions:
- us-east-1 - US Standard (Default)
- us-west-2 - US West (Oregon)
- us-west-1 - US West (Northern California)
- eu-west-1 - EU (Ireland)
- ap-southeast-1 - Asia Pacific (Singapore)
- ap-southeast-2 - Asia Pacific (Sydney)
- ap-northeast-1 - Asia Pacific (Tokyo)
- sa-east-1 - South America (Sao Paulo)
Multipart Uploading
Amazon's Multipart Uploading will be used to upload each of your final package files which are larger than the default chunk_size of 5 MiB. Each package file less than or equal to the chunk_size will be uploaded using a single request. This may be changed using:
store_with S3 do |s3|
  # Minimum allowed setting is 5.
  s3.chunk_size = 10 # MiB
end
Error Handling
Each request involved in transmitting your package files will be retried if an error occurs. By default, each failed request will be retried 10 times, pausing 30 seconds before each retry. These defaults may be changed using:
store_with S3 do |s3|
  s3.max_retries = 10
  s3.retry_waitsec = 30
end
If the failed request was the upload of a chunk_size portion of the file, only that chunk_size portion will be re-transmitted. For files less than chunk_size in size, the whole file upload will be attempted again. For this reason, it's best not to set chunk_size too high.
When an error occurs that causes Backup to retry the request, the error will be logged. Note that these messages will be logged as informational messages, so they will not generate warnings.
Data Integrity
All data is uploaded along with an MD5 checksum which AWS uses to verify the data received. If the data uploaded fails this integrity check, the error will be handled as stated above and the data will be retransmitted.
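As a rough illustration (this is a sketch, not Backup's internal code), the checksum S3 verifies is the Base64-encoded MD5 digest of the bytes being sent, supplied as the Content-MD5 header; the file path below is hypothetical:
require 'digest/md5'
require 'base64'

# Minimal sketch of how a Content-MD5 value is derived for an upload.
# S3 recomputes the digest on receipt and rejects the request if it
# does not match, which is then handled as a retryable error.
data = File.binread('/path/to/package/chunk') # hypothetical path
content_md5 = Base64.strict_encode64(Digest::MD5.digest(data))
puts content_md5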
Server-Side Encryption
You may configure your AWS S3 stored files to use Server-Side Encryption by adding the following:
store_with S3 do |s3|
  s3.encryption = :aes256
end
Reduced Redundancy Storage
You may configure your AWS S3 stored files to use Reduced Redundancy Storage by adding the following:
store_with S3 do |s3|
  s3.storage_class = :reduced_redundancy
end
store_with CloudFiles do |cf|
  cf.api_key   = 'my_api_key'
  cf.username  = 'my_username'
  cf.container = 'my_container'
  cf.path      = '/path/to/my/backups'
  cf.keep      = 5
  cf.auth_url  = 'lon.auth.api.rackspacecloud.com'
end
The cf.auth_url option allows you to provide a non-standard auth URL for the Rackspace API. By default the US API will be used; to use a different region's API, provide the relevant URL for that region. The example above demonstrates usage for the London region.
You will need a Rackspace Cloud Files account. You can get one here.
store_with Ninefold do |nf|
  nf.storage_token  = 'my_storage_token'
  nf.storage_secret = 'my_storage_secret'
  nf.path           = '/path/to/my/backups'
  nf.keep           = 10
end
You will need a Ninefold account. You can get one here.
store_with Dropbox do |db|
  db.api_key    = 'my_api_key'
  db.api_secret = 'my_api_secret'
  # Dropbox Access Type
  # The default value is :app_folder
  # Change this to :dropbox if needed
  # db.access_type = :dropbox
  db.path = '/path/to/my/backups'
  db.keep = 25
end
To use the Dropbox service as a backup storage, you need two things:
- A Dropbox Account (Get one for free here: dropbox.com)
- A Dropbox App (Create one for free here: developer.dropbox.com)
The default db.access_type is :app_folder. If you have contacted Dropbox and upgraded your account to Full Dropbox Access, then you will need to set db.access_type to :dropbox.
NOTE The first link I provided is a referral link. If you create your account through that link, then you should receive an additional 500MB storage (2.5GB total, instead of 2GB) for your newly created account.
FOR YOUR INFORMATION you must run your backup to Dropbox manually the first time to authorize your machine with your Dropbox account. When you run it manually, Backup will provide a URL which you must visit in your browser. Once you've authorized your machine, Backup writes the session to a cache file and will use that cache file from then on, without prompting you to authorize again. This means you can run it unattended as normal, for example from a cron task.
Chunked Uploader
The Dropbox Storage uses Dropbox's /chunked_upload API. By default, this will upload the final backup package file(s) in chunks of 4 MiB. If an error occurs while uploading a chunk, Backup will retry the failed chunk 10 times, pausing 30 seconds between retries. If you wish to customize these values, you can do so as follows:
store_with Dropbox do |db|
  db.chunk_size    = 4 # MiB
  db.chunk_retries = 10
  db.retry_waitsec = 30
end
Note: This has nothing to do with Backup's Splitter. If you have a Splitter defined on your model using split_into_chunks_of, your final backup package will still be split into multiple files, and each of those files will be uploaded to Dropbox.
Also note that in Backup versions prior to 3.3.0, the Splitter was required to upload files larger than 150MB to Dropbox. This is no longer the case. You may still use the Splitter, and you may now split your final backup package into chunks larger than 150MB.
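For example, a model using both the Splitter and the Dropbox storage might look like the following sketch (the 250 MB chunk size is only an illustrative value):
Backup::Model.new(:my_backup, 'My Backup') do
  # Split the final package into 250 MB chunks (illustrative value).
  split_into_chunks_of 250

  store_with Dropbox do |db|
    db.api_key    = 'my_api_key'
    db.api_secret = 'my_api_secret'
    db.path       = '/path/to/my/backups'
  end
end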
store_with FTP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.678.90'
  server.port     = 21
  server.path     = '~/backups/'
  server.keep     = 5
end
TIP: use SFTP if possible; it's a more secure protocol.
store_with SFTP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.678.90'
  server.port     = 22
  server.path     = '~/backups/'
  server.keep     = 5
end
store_with SCP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.678.90'
  server.port     = 22
  server.path     = '~/backups/'
  server.keep     = 5
end
Say you just transferred a backup of about 2000MB in size. 12 hours later the Backup gem packages a new backup file for you and it appears to be 2050MB in size. Rather than transferring the whole 2050MB to the remote server, it'll look up the difference between the source and destination backups and only transfer the bytes that changed. In this case it'll transfer only around 50MB rather than the full 2050MB.
Note: If you only want to sync particular folders on your filesystem to a backup server then be sure to take a look at Syncers. They are, in most cases, more suitable for this purpose.
There are 3 different modes of remote operation available:
- :ssh (default) -- Connects to the remote host via SSH and does not require the use of an rsync daemon.
- :ssh_daemon -- Connects via SSH, then spawns a single-use rsync daemon to allow certain daemon features to be used.
- :rsync_daemon -- Connects directly to an rsync daemon on the remote host via TCP.
Note that the :ssh and :ssh_daemon modes transfer data over an encrypted connection; :rsync_daemon does not.
The following are all of the configuration options available, along with information about their use depending on which mode you are using:
store_with RSync do |storage|
  ##
  # :ssh is the default mode if not specified.
  storage.mode = :ssh # or :ssh_daemon or :rsync_daemon
  ##
  # May be a hostname or IP address.
  storage.host = "123.45.678.90"
  ##
  # When using :ssh or :ssh_daemon mode, this will be the SSH port (default: 22).
  # When using :rsync_daemon mode, this is the rsync:// port (default: 873).
  storage.port = 22
  ##
  # When using :ssh or :ssh_daemon mode, this is the remote user name used to connect via SSH.
  # This only needs to be specified if different than the user running Backup.
  #
  # The SSH user must have a passphrase-less SSH key set up to authenticate to the remote host.
  # If this is not desirable, you can provide the path to a specific SSH key for this purpose
  # using SSH's -i option in #additional_ssh_options
  storage.ssh_user = "ssh_username"
  ##
  # If you need to pass additional options to the SSH command, specify them here.
  # Options may be given as a String (as shown) or an Array (see additional_rsync_options).
  # These will be added to the rsync command like so:
  #   rsync -a -e "ssh -p 22 <additional_ssh_options>" ...
  storage.additional_ssh_options = "-i '/path/to/id_rsa'"
  ##
  # When using :ssh_daemon or :rsync_daemon mode, this is the user used to authenticate to the rsync daemon.
  # This only needs to be specified if different than the user running Backup.
  storage.rsync_user = "rsync_username"
  ##
  # When using :ssh_daemon or :rsync_daemon mode, if a password is needed to authenticate to the rsync daemon,
  # it may be supplied here. Backup will write this password to a temporary file, then use it with rsync's
  # --password-file option.
  storage.rsync_password = "my_password"
  # If you prefer to supply the path to your own password file for this option, use:
  storage.rsync_password_file = "/path/to/password_file"
  ##
  # If you need to pass additional options to the rsync command, specify them here.
  # Options may be given as an Array (as shown) or as a String (see additional_ssh_options).
  storage.additional_rsync_options = ['--sparse', "--exclude='some_pattern'"]
  ##
  # When set to `true`, rsync will compress the data being transferred.
  # Note that this only reduces the amount of data sent.
  # It does not result in compressed files on the destination.
  storage.compress = true
  ##
  # The path to store the backup package file(s) to.
  #
  # If no `host` is specified, this will be a local path.
  # Otherwise, this will be a path on the remote server.
  #
  # In :ssh mode, relative paths (or paths that start with '~/') will be relative to the directory
  # the `ssh_user` is placed in upon logging in via SSH.
  #
  # For both local and :ssh mode operation, if the given path does not exist, it will be created.
  # (see additional notes about `path` below)
  #
  # For :ssh_daemon and :rsync_daemon modes, `path` will be a named rsync module, optionally followed
  # by a path. In these modes, the path referenced must already exist on the remote server.
  #
  storage.path = "~/backups"
end
If no host is configured, the operation will be local and the only options used would be path and additional_rsync_options.
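For example, a minimal local-only configuration might look like the following sketch (the destination path shown is just an example):
store_with RSync do |storage|
  # No `host` is set, so the package is stored via a local rsync operation.
  storage.path = '/mnt/backups' # example local destination
  storage.additional_rsync_options = ['--sparse']
end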
Using Compression:
Only the Gzip Compressor should be used with your backup model if you use this storage option, and only if your version of gzip supports the --rsyncable option, which allows gzip to compress data using an algorithm that lets rsync efficiently detect changes. Otherwise, even a small change in the original data will result in nearly the entire archive being transferred.
See the Compressor page for more information.
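In practice, that means compressing like this in your model (the same setting also appears in the rotation example further below):
compress_with Gzip do |gzip|
  # Requires a version of gzip that supports --rsyncable
  gzip.rsyncable = true
end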
Using Encryption:
An Encryptor should not be added to your backup model when using this storage option. Encrypting the final archive will make it impossible for rsync to distinguish changes between the source and destination files. This would result in the entire backup archive being transferred, even if only a small change was made to the original files.
Additional Notes Regarding path:
Currently, for :ssh mode or when operating locally, the given path will have an additional directory added to it, named after the backup model's trigger. For example, if you set path to ~/backups and your trigger is :my_backup, then the final path where your backup package file(s) will be stored will be ~/backups/my_backup/. As mentioned above, this path will be created if needed.
This will change with Backup v4.0. At that time, this additional directory will no longer be created and your backup package file(s) will simply be stored in the path as given.
Note that this is already the current behavior for :ssh_daemon and :rsync_daemon modes. No additional directory will be added to the path given in these modes. However, the path you specify must already exist for these modes.
I encourage you to look into using :ssh_daemon mode. Setting this up can be as simple as adding an rsyncd.conf file (with 0644 permissions) in the $HOME dir of the ssh_user on the remote system (most likely the same username running the backup) with the following contents:
[backup-module]
path = backups
read only = false
use chroot = false
Then simply use storage.path = 'backup-module', making sure ~/backups exists on the remote.
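With that rsyncd.conf in place, the corresponding storage configuration could be as simple as the following sketch (the hostname is just a placeholder):
store_with RSync do |storage|
  storage.mode = :ssh_daemon
  storage.host = "123.45.678.90"
  # `path` names the rsync module defined in ~/rsyncd.conf;
  # that module's `path = backups` directory must already exist.
  storage.path = "backup-module"
end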
Splitter:
If you use the Splitter with your backup model, understand that the RSync Storage will never remove any files from path. For example, say your backup usually results in 2 chunk files being stored: my_backup.tar-aa and my_backup.tar-ab. Then one day it results in 3 chunks for some reason - an additional my_backup.tar-ac file. You then discover a ton of files you'd meant to delete, and the next day your backup returns to its normal 2 chunks. That 3rd my_backup.tar-ac file will remain until you delete it.
Cycling:
The RSync Storage option does not support cycling, so you cannot specify server.keep = num_of_backups here. With this storage, only one copy of your backup archive will exist on the remote, which rsync updates with the changes it detects.
If you're looking for a way to keep rotated backups, you can simply change the path each time the backup runs.
For example, to keep:
- Monthly backups
- Weekly backups, rotated each month
- Daily backups, rotated each week
- Hourly backups, rotated every 4 hours
Create the following backup model:
Backup::Model.new(:my_backup, 'My Backup') do
  # Archives, Databases...

  # Make sure you compress your Archives and Databases
  # using an rsync-friendly algorithm
  compress_with Gzip do |gzip|
    gzip.rsyncable = true
  end

  store_with RSync do |storage|
    time = Time.now
    if time.hour == 0    # first hour of the day
      if time.day == 1   # first day of the month
        # store a monthly
        path = time.strftime '%B' # January, February, etc...
      elsif time.sunday?
        # store a weekly
        path = "Weekly_#{ time.day / 7 + 1 }" # Weekly_1 thru Weekly_5
      else
        # store a daily
        path = time.strftime '%A' # Monday thru Saturday
      end
    else
      # store an hourly
      path = "Hourly_#{ time.hour % 4 + 1 }" # Hourly_1 thru Hourly_4
    end
    storage.path = "~/backups/#{ path }"
  end
end
Then simply set up cron to run the job every hour.
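One way to schedule this is with the whenever gem (a sketch only; Backup does not require it), assuming the backup command is available on cron's PATH:
# config/schedule.rb (whenever gem)
every 1.hour do
  command "backup perform --trigger my_backup"
end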
Note that this will require space for 27 full backups.
You could use a different storage.host for the monthly, weekly, etc... Remember that for :ssh_daemon and :rsync_daemon modes, each of these paths must already exist. Or of course, think of your own use cases (and let me know if you figure out any good ones!).
store_with Local do |local|
  local.path = '~/backups/'
  local.keep = 5
end
If multiple Storage options are configured for your backup, then the Local Storage option should be listed last. This allows the Local Storage option to transfer the final backup package file(s) using a move operation. If you configure a Local Storage and it is not the last Storage option listed in your backup model, then a warning will be issued and the final backup package file(s) will be transferred locally using a copy operation. This is because each Storage is performed in the order in which it is configured in your model.
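For example, if a backup stores with both S3 and Local, listing Local last allows the move operation to be used (a sketch, with settings trimmed for brevity):
Backup::Model.new(:my_backup, 'My Backup') do
  store_with S3 do |s3|
    s3.bucket = 'bucket-name'
  end

  # Local is listed last, so the package file(s) can be moved
  # into place instead of copied.
  store_with Local do |local|
    local.path = '~/backups/'
  end
end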
Most storage services place restrictions on the size of files being stored. To work around these limits, see the Splitter page.
Each Storage (except for RSync) supports the keep setting, which specifies how many backups to keep at this location.
store_with SFTP do |sftp|
  sftp.keep = 5
end
Once the keep limit has been reached, the oldest backup will be removed.
Note that if keep is set to 5, then the 6th backup will be transferred and stored before the oldest is removed.
For more information, see the Cycling page.
If you are backing up to multiple storage locations, you may want to specify default configuration so that you don't have to rewrite the same lines of code for each of the same storage types. For example, say that the Amazon S3 storage always has the same access_key_id and secret_access_key. You could add the following to your ~/Backup/config.rb:
Backup::Storage::S3.defaults do |s3|
  s3.access_key_id     = "my_access_key_id"
  s3.secret_access_key = "my_secret_access_key"
end
Now, for every S3 storage you configure that should use the access_key_id and secret_access_key defaults we just specified, you may omit them in the actual store_with block, like so:
store_with S3 do |s3|
  s3.bucket = "some-bucket"
  # no need to specify access_key_id
  # no need to specify secret_access_key
end
You would set defaults for CloudFiles by using:
Backup::Storage::CloudFiles.defaults do |storage|
  # ...and so forth for every supported storage location.
end