There's a growing trend toward smart use of resources and automation of routine tasks. That's why, even in large companies, engineers would rather use an open-source community tool than develop their own business app.
Backup today is a necessity rather than a preventive action. When properly configured, it reduces risks to data privacy, integrity, and availability.
Tools like rsync and sftp save the IT professionals in your company a lot of time and nerves. However, an incorrect setting can cause big problems. The primary purpose of rsync is to back up important files or even to automate the mirroring of an entire data store.
*The per-record pricing is based on Ponemon Research sponsored by IBM Security: per-record price is $380 (Medical record) and $245 (Financial record)
Recent data leaks
Check these links to learn what rsync data leaks might lead to:
Explosive Data Leakage 01 / 12 / 2016
Lawful Conduct 17 / 10 / 2016
AMP Trading Platform Breach 25 / 04 / 2017
Thousands and Thousands of Patients Info Potentially Exposed? 10 / 05 / 2017
If you are already familiar with rsync basics, use this link to check your configurations with our short roadmap.
Depending on your preferences, you can use various applications such as cross-platform rsync, grsync, rdiff, or rdiff-backup. You can also use cwRsync for Windows with Graphical User Interface (GUI) as your client app. We prefer the command-line rsync because it makes it easier to specify each option you need. It's also very useful to run "rsync --help" or "man rsync" in your terminal in case you forget what a given option does.
A modified delta algorithm inside rsync also minimizes network traffic. After the chosen folders are synced for the first time, rsync copies only the changes you make on your host machine. Please note: the success rate of such data operations depends on the security of the transferred files.
In the table below, you can determine the variation of rsync client you should use depending on your environment.
Rsync copies files from one place to another and performs both local and remote copying, unlike Unix cp. For example, in the command shown below, the user copies the /usr/photos directory with all its contents to the home directory.
The -a parameter here denotes archive mode. It is short for -rlptgoD: copy recursively (-r), copy symbolic links as symbolic links (-l), and preserve permissions (-p), modification times (-t), group (-g), owner (-o), and device and special files (-D). The -v parameter makes the program output verbose. Usually, the -a switch creates a mirror copy of the files, except when the system you are copying to does not support some attributes of the copied files.
What is the difference between this and the previous command? Yes, it is the slash at the end of the copy source argument. If it ends with a slash, then the content of the specified directory will be copied, but not the directory itself. Having a slash at the end of the destination address does not affect the outcome.
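The trailing-slash rule can be modeled in a few lines of Python. This is only a sketch of the path arithmetic, not of rsync itself; the function name `landing_path` and the file name `cat.jpg` are hypothetical.

```python
import posixpath


def landing_path(source: str, destination: str, filename: str) -> str:
    """Return where `filename` (a file directly inside `source`) ends up."""
    if source.endswith("/"):
        # Trailing slash: only the directory's contents are copied.
        return posixpath.join(destination, filename)
    # No trailing slash: the directory itself is recreated in the destination.
    return posixpath.join(destination, posixpath.basename(source), filename)


print(landing_path("/usr/photos", "/home/user", "cat.jpg"))   # /home/user/photos/cat.jpg
print(landing_path("/usr/photos/", "/home/user", "cat.jpg"))  # /home/user/cat.jpg
```

The destination is treated the same with or without its own trailing slash, which matches the behavior described above.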
Use the -n parameter if you suspect that the command parameters, source, or copy destination may be incorrect. It performs a test run: rsync shows what would be done with each file without making any changes. Once you are sure all the parameters are correct, remove -n to apply the changes.
By default, rsync uses Secure Shell (SSH) as the transport mechanism. When working with it, you can use the existing aliases of machines and public keys.
If the remote machine has an account with the same name, rsync prompts you to enter a password. Next, after successful authorization, it creates the album directory and copies all the photos into it.
On the server side, you need to run the rsync daemon service, rsyncd. Rsyncd opens a specified port and listens for clients. Every sync path has its own block of options. Rsyncd is the usual vector for data exposure involving rsync because, without proper protection, an anonymous third party can connect to it. For our purposes, we will focus on rsyncd, which is the most common way rsync is used at scale.
Server mode is perfect for creating a central backup server or project repository.
There's a difference between the remote shell mode and the server mode: the latter uses two colons (::) in the remote source or destination address. The following command copies files from the remote server (here it's example.com) to the local machine:
What is src? This is the rsync module, defined and configured on the machine where the daemon is running. The module has a name, a path to the files, and several other parameters like read-only, which protects the content from modification.
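To make the double-colon syntax concrete, here is a small Python sketch that splits a daemon-mode address into host, module, and path. It is an illustration only: the `DaemonAddress` type and `parse_daemon_address` function are hypothetical names, the file name `photo.jpg` is made up, and IPv6 literals (which also contain "::") are not handled.

```python
from typing import NamedTuple, Optional


class DaemonAddress(NamedTuple):
    user: Optional[str]
    host: str
    module: Optional[str]
    path: str


def parse_daemon_address(address: str) -> DaemonAddress:
    """Split '[user@]host::[module[/path]]' into its parts."""
    if "::" not in address:
        raise ValueError("not a daemon-mode address: " + address)
    host_part, _, rest = address.partition("::")
    user, at, host = host_part.rpartition("@")
    module, _, path = rest.partition("/")
    return DaemonAddress(user if at else None, host, module or None, path)


print(parse_daemon_address("example.com::src/photo.jpg"))
print(parse_daemon_address("example.com::"))  # no module: the daemon would list its modules
```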
To start the rsync daemon, type this:
To manage the rsync daemon, use an rsyncd.conf file. It contains options for each sync path, access and authentication, logging, and additional modules. A minimal configuration for sharing files from a user's home directory, without sudo, is shown below:
The first four lines configure the rsync daemon. The first line specifies the file path with a greeting message to identify the server. The second line specifies another file where the server process ID is written. This can be handy if you need to send a kill signal to the rsync daemon manually:
These two files are in the home directory because in this example the server runs without superuser privileges. The port on which the daemon listens is also set in the first part of the configuration file. Ports numbered 1024 and above can be used by any application without superuser privileges.
The rest of the directives are divided into small sections, one section per module. Each section has a header and a list of key-value pairs specifying the parameters of that module. By default, all modules are read-only. To allow uploads, set read only = no. Also, all modules appear in the module listing by default. To hide a module, set list = no.
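As a rough illustration of this layout, the Python sketch below parses a sample configuration and applies the two defaults just described (read-only and listed). The sample file, its paths, port 8873, and the module names `files` and `upload` are all assumptions made for the example, and the parser is a simplification: it ignores the fact that global parameters can change module defaults.

```python
from configparser import ConfigParser

# Hypothetical sample configuration; every value here is illustrative.
SAMPLE = """\
motd file = /home/user/rsyncd.motd
pid file = /home/user/rsyncd.pid
port = 8873

[files]
path = /home/user/public
comment = Public files

[upload]
path = /home/user/incoming
read only = no
list = no
"""


def load_modules(text: str) -> dict:
    """Return per-module settings with rsyncd's defaults filled in."""
    parser = ConfigParser(delimiters=("=",), interpolation=None)
    # configparser needs a section header, so wrap the global lines in one.
    parser.read_string("[__global__]\n" + text)
    modules = {}
    for name in parser.sections():
        if name == "__global__":
            continue
        section = parser[name]
        modules[name] = {
            "path": section.get("path"),
            "read_only": section.getboolean("read only", fallback=True),
            "listed": section.getboolean("list", fallback=True),
        }
    return modules


modules = load_modules(SAMPLE)
for name, settings in modules.items():
    print(name, settings)
```

A script along these lines can be handy for spotting modules that are writable or hidden before the daemon is started.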
To start the daemon, run this:
Now connect to the daemon from another machine without specifying the module name. You will see the following:
If you do not specify the module name after the double colon (::), the daemon lists all the available modules. If you specify a module name without naming a particular file or directory inside it, the daemon returns a listing of the module's contents:
By swapping source and destination addresses, you can write the file from the local machine to the module, as shown:
This was a brief but fairly complete overview of rsync client options. Now let's proceed to some security issues often heard in news headlines.
Because rsync is a lean utility, its security features are not engaged by default. Administrators therefore need to understand and validate their rsync module configurations to properly limit access to the information they handle. Let's look at each of them in detail.
One of the main pitfalls in secure configuration is allowing anonymous access in rsyncd.conf. In other words, if you leave a module public, a potential attacker has permission to damage your data in the worst possible way.
Two things you should remember:
Service mode is only for debugging
Each path has its own configurations
When running in service mode, rsync gives details for all synchronization paths. New configurations are often duplicated and accumulate as the infrastructure develops, which complicates subsequent changes to the file.
To help you understand how this works, we'll describe a new configuration file for secure data transmission.
Below, you can find out more about rsync daemon security options and the issues each of them solves.
If "use chroot" is set to "yes", the rsync daemon will chroot to the "path" before starting the file transfer with the client. This has the advantage of extra protection against possible implementation security holes. The disadvantages are that it requires superuser privileges, that you won't be able to follow symbolic links that are either absolute or outside the new root path, and that it complicates the preservation of users and groups by name (see below).
As an additional safety feature, you can specify a dot-dir in the module's "path" to indicate the point where the chroot should occur. This allows rsync to run in a chroot with a non-"/" path for the top of the transfer hierarchy. This safeguards against unintended library loading (since those absolute paths will not be inside the transfer hierarchy unless you have used an inappropriate pathname). This also allows you to set up libraries for the chroot that are outside of the transfer. For example, specifying "/var/rsync/./module1" will chroot to the "/var/rsync" directory and set the inside-chroot path to "/module1". If you had omitted the dot-dir, the chroot would use the whole path, and the inside-chroot path would have been "/".
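The dot-dir convention amounts to splitting the module path at the "/./" marker. A minimal Python sketch of that split follows; `split_dot_dir` is a hypothetical helper written for this article, not part of rsync.

```python
def split_dot_dir(path: str):
    """Return (chroot_dir, inside_chroot_path) for a module 'path' value."""
    chroot_dir, marker, inside = path.partition("/./")
    if not marker:
        # No dot-dir: the chroot uses the whole path.
        return path, "/"
    return chroot_dir, "/" + inside


print(split_dot_dir("/var/rsync/./module1"))  # ('/var/rsync', '/module1')
print(split_dot_dir("/var/rsync/module1"))    # ('/var/rsync/module1', '/')
```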
When the "use chroot" parameter is false or the inside-chroot path is not "/", rsync will do the following:
(1) munge symlinks for security reasons by default (see "munge symlinks" to turn this off, but only if you trust your users)
(2) substitute leading slashes in absolute paths with the module's path (so that options such as --backup-dir, --compare-dest, etc. interpret an absolute path as rooted in the module's "path" dir)
(3) trim ".." path elements from args if rsync believes they would escape the module hierarchy.
The default for "use chroot" is true and is the safer choice (especially if the module is not read-only).
When this parameter is enabled, the "numeric-ids" option will also be enabled by default (disabling name lookups). See below for what a chroot needs for name lookups.
If you copy library resources into the module's chroot area, protect them through your OS's normal user/group or ACL settings. This helps prevent the rsync module's user from being able to change the library resources. Next, hide them from the users' view via "exclude". At that point, it will be safe to enable the mapping of users and groups by name using the "numeric ids" daemon parameter.
Also note you are free to set up custom user/group information in the chroot area that is different from your normal system. For example, you could abbreviate the list of users and groups.
The daemon must run with root privileges if you wish to use chroot, to bind to a port numbered under 1024 (such as the default 873), or to set file ownership. Otherwise, it only needs permission to read and write the appropriate data, log, and lock files.
The "max connections" parameter allows you to specify the maximum number of simultaneous connections you will allow. Any clients trying to connect after the maximum is reached will receive a message telling them to retry later. The default is 0, which means no limit. A negative value disables the module.
The "lock file" parameter specifies the file used to support the "max connections" parameter. The rsync daemon uses record locking on this file to ensure that the max connections limit is not exceeded for the modules sharing the lock file. The default is /var/run/rsyncd.lock.
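The three cases for this value (zero, positive, negative) can be summarized in a short Python sketch. The function name and the returned strings are hypothetical and only mirror the description above; the real daemon tracks the count through its lock file.

```python
def connection_decision(active_connections: int, max_connections: int = 0) -> str:
    """Model how a new client is treated for a given 'max connections' value."""
    if max_connections < 0:
        return "module disabled"
    if max_connections == 0 or active_connections < max_connections:
        return "accept"
    return "retry later"


print(connection_decision(active_connections=50))                     # accept (0 means no limit)
print(connection_decision(active_connections=4, max_connections=4))   # retry later
print(connection_decision(active_connections=0, max_connections=-1))  # module disabled
```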
If the rsync port is open, anyone who scans the server will find it. Changing the default port 873 brings only a small benefit; the same applies to changing the content of the service banner.
Access to the rsync port should be limited, as it should be for any corporate service. An Access Control List (ACL) helps here by blocking unauthorized IP addresses, much as whitelists and firewalls do. You can even manage this from rsync itself with the "hosts allow" and "hosts deny" options.
As we noted, the basic way to protect the rsync service from a security breach is to restrict which external machines can communicate with it. By default, all hosts are allowed. With the "hosts allow" and "hosts deny" directives, rsync can enforce a least-privilege policy by permitting only the necessary clients. When "hosts allow" is used on its own, all unspecified source IPs are rejected automatically, while "hosts deny" on its own blocks the listed IP addresses and lets every other client connect.
You can also combine "hosts allow" with a separate "hosts deny" parameter. If both parameters are specified, the "hosts allow" parameter is checked first, and a match means the client is able to connect. The "hosts deny" parameter is checked next, and a match means the host is rejected. If the host matches neither the "hosts allow" nor the "hosts deny" patterns, it is allowed to connect.
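Because the evaluation order is easy to get wrong, here is a small Python model of it. It is a sketch under stated assumptions: patterns are limited to plain IP addresses and CIDR ranges (the daemon also accepts hostnames and wildcards), and the function names are hypothetical.

```python
from ipaddress import ip_address, ip_network


def _matches(ip, patterns) -> bool:
    """True if `ip` falls inside any of the address or CIDR patterns."""
    return any(ip in ip_network(p, strict=False) for p in patterns)


def is_allowed(client: str, hosts_allow=(), hosts_deny=()) -> bool:
    ip = ip_address(client)
    if hosts_allow and _matches(ip, hosts_allow):
        return True   # "hosts allow" is checked first
    if hosts_deny and _matches(ip, hosts_deny):
        return False  # then "hosts deny"
    if hosts_allow and not hosts_deny:
        return False  # allow list used alone: everything else is rejected
    return True       # matched neither list


print(is_allowed("192.168.1.5", hosts_allow=["192.168.1.0/24"]))  # True
print(is_allowed("10.0.0.1", hosts_allow=["192.168.1.0/24"]))     # False
print(is_allowed("10.0.0.1", hosts_allow=["192.168.1.0/24"],
                 hosts_deny=["172.16.0.0/12"]))                   # True
```

The last call shows the trap: once both lists are set, a client that matches neither is let in, so a restrictive setup needs a deny pattern that covers everything not explicitly allowed.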
The "uid" parameter specifies the user name or user ID under which file transfers to and from the module take place when the daemon is run as root. In combination with the "gid" parameter, the uid determines the available file permissions. The default, when run by a superuser, is to switch to the system's "nobody" user. The default for a non-superuser is to leave the user unchanged. See also the "gid" parameter.
The RSYNC_USER_NAME environment variable may be used to request that rsync runs as the authorizing user. For example, if you want rsync to run as the same user that was received for the rsync authentication, this setup is useful:
The "gid" parameter specifies one or more group names/IDs that will be used when accessing the module. The first one will be the default group, and any extra ones will be set as supplemental groups. You may also specify a "*" as the first gid in the list, which will be replaced by all the normal groups for the transfer's user (see "uid"). The default, when run by a superuser, is to switch to your operating system's "nobody" (or perhaps "nogroup") group with no other supplementary groups. The default for a non-superuser is to not change any group attributes (and indeed, your OS may not allow a non-superuser to try to change their group settings).
Setting "fake super = yes" for a module causes the daemon side to behave as if the --fake-super command-line option had been specified. This allows all the file attributes to be stored without having to have the daemon actually running as root.
A basic one. The “read only” parameter determines whether clients will be able to upload files or not. If "read only" is true, then any attempted uploads will fail. If "read only" is false, then uploads will be possible if file permissions on the daemon side allow them. The default is for all modules to be read only.
Note that "auth users" can override this setting on a per-user basis.
The "write only" parameter determines whether clients will be able to download files or not. If "write only" is true, then any attempted downloads will fail. If "write only" is false, then downloads will be possible if file permissions on the daemon side allow them. The default is for this parameter to be disabled.
The "ignore nonreadable" parameter tells the rsync daemon to completely ignore files that are not readable by the user. This is useful for public archives that may have some non-readable files among the directories, when the sysadmin doesn't want those files to be seen at all.
The chmod parameters ("incoming chmod" and "outgoing chmod") allow rsync to adjust the permissions of files during the transfer process. This can be crucial when the source and destination require different permission sets.
The "list" parameter determines whether this module is listed when the client asks for a listing of available modules. Also, if this is false, the daemon will pretend the module does not exist when a client denied by "hosts allow" or "hosts deny" attempts to access it. Note that if "reverse lookup" is disabled globally but enabled for the module, the resulting reverse lookup to a potentially client-controlled DNS server might still reveal to the client that it hit an existing module. The default is for modules to be listable.
The "auth users" parameter specifies a comma and/or space-separated list of authorization rules. In its simplest form, you list the usernames that will be allowed to connect to this module. The usernames do not need to exist on the local system. The rules may contain shell wildcard characters that will be matched against the username provided by the client for authentication. If "auth users" is set, then the client will be challenged to supply a username and password to connect to the module. A challenge-response authentication protocol is used for this exchange. The plain text usernames and passwords are stored in the file specified by the "secrets file" parameter. The default is for all users to be able to connect without a password (this is called "anonymous rsync").
In addition to username matching, you can specify group name matching via the '@' prefix. When using group name matching, the authenticating username must be a real user on the system, or it will be assumed the user is a member of no groups. For example, specifying "@rsync" will match the authenticating user if the named user is a member of the rsync group.
Finally, options may be specified after a colon (:). The options allow you to "deny" a user or a group, set the access to "ro" (read-only), or set the access to "rw" (read/write). Setting an auth-rule-specific ro/rw setting overrides the module's "read only" setting.
Make sure to put the rules in the order you want them to be matched, because the checking stops at the first matching user or group, and that is the only auth that is checked. For example:
auth users = joe:deny @guest:deny admin:rw @rsync:ro susan joe sam
In the rule above, user joe will be denied access no matter what. Any user that is in the group "guest" is also denied access. The user "admin" gets access in read/write mode, but only if the admin user is not in group "guest" (because the admin user-matching rule would never be reached if the user is in group "guest"). Any other user who is in group "rsync" will get read-only access. Finally, users susan, joe, and sam get the ro/rw setting of the module, but only if the users didn't match an earlier group-matching rule.
If you need to specify a user or group name with a space in it, start your list with a comma to indicate that the list should only be split on commas (though leading and trailing whitespace will also be removed, and empty entries ignored). For example:
auth users = , joe:deny, @Some Group:deny, admin:rw, @RO Group:ro
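The first-match behavior of these rules can be sketched in Python as follows. This is a simplified model for illustration, not the daemon's code: `split_rules` and `resolve_access` are hypothetical names, the caller supplies the user's groups directly, and a user who matches no rule is simply reported as "deny".

```python
import re
from fnmatch import fnmatchcase


def split_rules(spec: str):
    """Split an 'auth users' value; a leading comma means split on commas only."""
    if spec.lstrip().startswith(","):
        parts = spec.split(",")
    else:
        parts = re.split(r"[,\s]+", spec)
    return [p.strip() for p in parts if p.strip()]


def resolve_access(user, groups, spec, module_read_only=True) -> str:
    """Return 'deny', 'ro', or 'rw' from the first rule that matches."""
    default = "ro" if module_read_only else "rw"
    for rule in split_rules(spec):
        name, _, option = rule.partition(":")
        if name.startswith("@"):
            matched = any(fnmatchcase(g, name[1:]) for g in groups)
        else:
            matched = fnmatchcase(user, name)
        if matched:
            return option or default  # checking stops at the first match
    return "deny"


RULES = "joe:deny @guest:deny admin:rw @rsync:ro susan joe sam"
print(resolve_access("admin", [], RULES))         # rw
print(resolve_access("admin", ["guest"], RULES))  # deny
print(resolve_access("susan", [], RULES))         # ro
```

With the comma-separated form, `resolve_access("kim", ["RO Group"], ", joe:deny, @Some Group:deny, admin:rw, @RO Group:ro")` returns "ro", because the group name is kept whole instead of being split at the space.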
See the description of the secrets file to learn how you can have per-user passwords as well as per-group passwords. It also explains how a user can authenticate using a user password or (when applicable) a group password, depending on what rule is being authenticated.
See also the section entitled "USING RSYNC-DAEMON FEATURES VIA A REMOTE SHELL CONNECTION" in the rsync man page for information on how to handle an rsyncd.conf-level username that differs from the remote-shell-level username when using a remote shell to connect to an rsync daemon.
The "secrets file" parameter specifies the name of a file that contains the username:password and/or @groupname:password pairs used for authenticating this module. This file is only consulted if the "auth users" parameter is specified. The file is line-based and contains one name:password pair per line. Any line that has a hash (#) as the very first character on the line is considered a comment and is skipped. The passwords can contain any characters, but be warned that many operating systems limit the length of passwords that can be typed at the client end, so you may find that passwords longer than 8 characters don't work.
The use of group-specific lines is only relevant when the module is being authorized using a matching "@groupname" rule. When that happens, the user can be authorized via either their "username:password" line or the "@groupname:password" line for the group that triggered the authentication.
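The file format described above is simple enough to model in a few lines. The Python sketch below separates user entries from group entries; `parse_secrets` is a hypothetical helper and the sample names and passwords are made up.

```python
def parse_secrets(text: str):
    """Return (users, groups) dictionaries from secrets-file text."""
    users, groups = {}, {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line, or a comment marked by '#' in column one
        name, sep, password = line.partition(":")
        if not sep:
            continue  # not a name:password pair
        if name.startswith("@"):
            groups[name[1:]] = password
        else:
            users[name] = password
    return users, groups


SAMPLE_SECRETS = """\
# example values only
susan:s3cret
admin:pass:with:colons
@rsync:shared
"""

users, groups = parse_secrets(SAMPLE_SECRETS)
print(users)   # {'susan': 's3cret', 'admin': 'pass:with:colons'}
print(groups)  # {'rsync': 'shared'}
```

Splitting at the first colon only keeps passwords that themselves contain colons intact.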
It is up to you what kind of password entries you want to include, either users, groups, or both. The use of group rules in "auth users" does not require that you specify a group password if you do not want to use shared passwords.
There is no default for the "secrets file" parameter. You must choose a name (such as /etc/rsyncd.secrets). Normally, the file must not be readable by "other"; see "strict modes". If the file is not found or is rejected, no logins for an "auth users" module will be possible.
The "strict modes" parameter determines whether or not the permissions on the secrets file will be checked. If "strict modes" is true, then the secrets file must not be readable by any user ID other than the one the rsync daemon is running under. If "strict modes" is false, the check is not performed. The default is true. This parameter was added to accommodate rsync running on the Windows operating system.
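Before starting the daemon, you can check the permissions of a secrets file yourself. The Python sketch below is an assumption-laden example: `is_private` is a hypothetical helper, it targets POSIX systems, and it is deliberately stricter than the rule described above because it rejects any group access as well as access by "other".

```python
import os
import stat
import tempfile


def is_private(path: str) -> bool:
    """True if neither group nor other has any permission on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstration on a throwaway file with a made-up entry.
with tempfile.TemporaryDirectory() as tmp:
    secrets = os.path.join(tmp, "rsyncd.secrets")
    with open(secrets, "w") as fh:
        fh.write("susan:example-password\n")
    os.chmod(secrets, 0o644)
    print(is_private(secrets))  # False: readable by group and other
    os.chmod(secrets, 0o600)
    print(is_private(secrets))  # True: owner-only access
```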
Authentication and Encryption
The authentication protocol used in rsync is a 128-bit MD4-based challenge-response system. This is fairly weak protection (at least one brute-force hash-finding algorithm is publicly available), so if you want top-quality security, we recommend running rsync over ssh. A future version of rsync will probably switch to a stronger hashing method, but for now all you have for authentication is a password that may effectively be limited to 8 characters.
Also, note that the rsync daemon protocol does not currently provide any encryption of the data that is transferred over the connection. Only authentication is provided. Use ssh as the transport if you want encryption.
Future versions of rsync may support SSL for better authentication and encryption, but that is still being investigated. For now, if you are passionate about additional encryption layers, you can set up stunnel to tunnel rsync over an SSL connection using this manual.
The most important takeaway to remember when building a secure rsync setup is that by default, anyone can access your synchronization path. On average, it takes 150 days to find a misconfigured service and an additional 60 days to correct it. So keep calm and check your rsync configurations.