Why do you need backups at all? The hardware is very, very reliable, and on top of that there are “clouds” that beat physical servers in terms of reliability: correctly configured, a “cloud” server will easily survive the failure of the underlying physical infrastructure, and from the point of view of the service’s users there will be only a small, barely noticeable hiccup in service. On top of that, duplicating information often means paying for “extra” processor time, disk load and network traffic.
An ideal program runs fast, doesn’t leak memory, has no security holes – and doesn’t exist.
Since programs are still written by flesh-and-blood developers, the testing process is often absent, and software is rarely delivered using “best practices” (which are themselves programs, and therefore imperfect), system administrators regularly have to solve problems stated briefly but expressively: “put it back the way it was”, “get the database working again”, “it’s slow – roll it back”, and my personal favourite, “I don’t know what’s wrong, but fix it”.
Besides logical errors that result from careless work by developers or an unlucky combination of circumstances, and from incomplete knowledge or misunderstanding of the finer points of how programs are built – including bundled and system components such as operating systems, drivers and firmware – there are errors of another kind. For example, most developers rely on the runtime and completely forget about the physical laws that no program can get around: the supposedly infinite reliability of the disk subsystem and of any data storage in general (including RAM and the processor cache!), zero processing time on the CPU, the absence of errors during network transmission and during processing, and network latency equal to zero. And don’t neglect the notorious deadline, because if you miss it, you will get problems worse than any nuances of networks and disks.
But what about the problems that loom large and threaten valuable data? Nothing can replace live developers yet, and it is not at all certain that anything will in the near future. On the other hand, only a handful of projects have managed to fully prove that a program works as intended, and it is by no means possible to simply take that proof and apply it to other, similar projects. Besides, such proofs take a lot of time and require special skills and knowledge, which practically rules out using them once deadlines are taken into account. On top of that, we still don’t know how to build an ultra-fast, cheap and infinitely reliable technology for storing, processing and transmitting information. If such technologies exist at all, it is only as concepts or – most often – in science-fiction books and films.
Good artists copy, great artists steal. – Pablo Picasso
The most successful solutions and surprisingly simple things usually appear at the intersection of concepts, technologies, knowledge and fields of science that at first glance seem completely incompatible.
For example, birds and airplanes both have wings, and despite the functional similarity – the principle of operation is the same in some flight modes, and the engineering problems are solved in similar ways: hollow bones, the use of strong and lightweight materials, and so on – the results, while very alike, are completely different. The best examples we see in our technology are also mostly borrowed from nature: watertight compartments in ships and submarines are a direct analogy with annelid worms; RAID arrays and data-integrity checks mirror the duplication of the DNA chain; and paired organs, the independence of various organs from the central nervous system (the heart beating automatically) and reflexes are analogous to autonomous systems on the Internet. Of course, taking ready-made solutions and applying them head-on is fraught with problems, but who knows – maybe there are no other solutions.
So, backups are vital for those who want:
- To restore their systems with minimal downtime, or even with none at all
- To act boldly, because in case of a mistake there is always the possibility of a rollback
- To minimize the effects of intentional data corruption
A bit of theory first.
So, for a small server we need a backup scheme that meets the following requirements:
- Easy to use – no special extra steps are required in day-to-day work, and a minimum of steps to create and restore copies.
- Universal – works on both large and small servers; this is important as the number of servers grows or when scaling.
- Easy to install – available from the package manager, or in one or two commands of the “download and unpack” kind.
- Stable – uses a standard or long-established storage format.
- Fast in operation.
The candidates that more or less meet these requirements:
- rdiff-backup
- rsnapshot
- burp
- duplicati
- duplicity
- deja dup
- dar
- zbackup
- restic
- borgbackup
A virtual machine (based on XenServer) with the following characteristics will be used as a test bench:
- 4 cores at 2.5 GHz,
- 16 GB of RAM,
- 50 GB of hybrid storage (with SSD caching at 20% of the virtual disk size) attached as a separate virtual disk without partitioning,
- a 200 Mbps Internet channel.
Almost the same machine will be used as the backup destination server, only with a 500 GB hard drive.
Operating system – CentOS 7 x64: the partition layout is standard, and an additional partition will be used as the data source.
As source data, let’s take a WordPress site with 40 GB of media files and a MySQL database.
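To give a sense of how such a comparison could be driven, here is a minimal Python sketch of a benchmark harness, under the assumption that the database is first dumped to a file so that file-level backup tools can pick it up along with the media files. All paths, credentials and the rsync baseline below are hypothetical placeholders, not the exact setup used in the tests:

```python
#!/usr/bin/env python3
"""Minimal benchmark-harness sketch: dump the MySQL database to a file,
run one backup command, and report wall-clock time and repository size.
Paths, credentials and the rsync baseline are hypothetical placeholders."""

import subprocess
import time
from pathlib import Path
from typing import List

SOURCE_DIR = Path("/srv/wordpress")   # hypothetical data partition with the site and media
DUMP_FILE = SOURCE_DIR / "db.sql"     # database dump stored next to the files being backed up
REPO_DIR = Path("/mnt/backup/repo")   # hypothetical backup destination

def dump_database() -> None:
    # --single-transaction gives a consistent dump of InnoDB tables without locking them
    with DUMP_FILE.open("wb") as out:
        subprocess.run(
            ["mysqldump", "--single-transaction", "-u", "wp_user", "-pSECRET", "wordpress"],
            stdout=out,
            check=True,
        )

def dir_size(path: Path) -> int:
    # Total size of all regular files under path, in bytes
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())

def run_backup(cmd: List[str]) -> None:
    # Run one backup command and report elapsed time and resulting repository size
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    print(f"{cmd[0]}: {elapsed:.1f} s, repository size {dir_size(REPO_DIR) / 2**30:.2f} GiB")

if __name__ == "__main__":
    dump_database()
    # Plain rsync copy as a baseline; a real run would substitute the candidate tool's own CLI
    run_backup(["rsync", "-a", "--delete", f"{SOURCE_DIR}/", f"{REPO_DIR}/"])
```

Each candidate from the list above would simply substitute its own command line in place of the rsync baseline.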
This article opens a large series on backups, so stay tuned for updates and subscribe to our newsletter!