Fault tolerance is a system’s ability to continue operating when some of its components fail. Ensuring site fault tolerance is a service that we provide for customers who place a very high value on the availability of their web resource. The service includes setting up the server OS and software and replicating data to several servers, possibly in different data centers. If a failure occurs, it is then possible to predict accurately when the resource will be available again and to minimize downtime, ideally to the point where users do not notice any problem.

Fault tolerance can be organized in various ways, depending on the requirements, the tasks, and the specific needs of the customer. Our goal is to design a project and implement it in such a way that the client is confident in the operability of the resource; we can also offer round-the-clock monitoring and support. In a high-availability system built this way, the site is restored automatically as soon as possible after a failure, which keeps downtime and data loss to a minimum.

Another way to ensure site availability is mirroring, that is, copying site data to a backup site in real time. This makes it possible to prevent data loss and to resume work on the backup equipment in the shortest possible time when the main equipment fails. As a rule, the “mirror” is stored on a remote server in a different data center from the main one, to minimize the risk of the information being destroyed.
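The idea of mirroring can be illustrated with a small sketch. It is a toy in-memory model in Python, not a description of any particular replication product: the class `MirroredStore` and its methods are hypothetical names chosen for this example, and it assumes every write is copied to the mirror synchronously before it is acknowledged.

```python
class MirroredStore:
    """Toy in-memory model of a primary site with a real-time mirror.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self):
        self.primary = {}       # data held on the main equipment
        self.mirror = {}        # copy held on the remote backup site
        self.primary_up = True  # whether the main equipment is working

    def write(self, key, value):
        # Assumption: writes are accepted only while the primary is up.
        if not self.primary_up:
            raise RuntimeError("primary is down; writes are refused in this sketch")
        self.primary[key] = value
        # Synchronous copy to the mirror, so the backup never lags behind.
        self.mirror[key] = value

    def read(self, key):
        # Serve from the primary while it works, otherwise from the mirror.
        source = self.primary if self.primary_up else self.mirror
        return source[key]

    def fail_primary(self):
        # Simulate the main equipment failing and losing its data.
        self.primary_up = False
        self.primary.clear()


store = MirroredStore()
store.write("index.html", "<h1>Hello</h1>")
store.fail_primary()
print(store.read("index.html"))  # the page is still served, now from the mirror
```

A real deployment would replicate over the network and would need a policy for writes that arrive while the main equipment is down; the sketch deliberately leaves that out.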

Advantages of a highly available fault-tolerant site

Fault-tolerance technologies provide online backup of critical sites. If a problem occurs with the main resource, the site automatically starts working on the reserve site, and users will not even notice. This ensures continued availability and avoids the downtime and financial loss that a software or hardware failure would otherwise cause.

Cluster systems are self-sufficient and in most cases do not require special intervention. It is enough to configure everything correctly once, and the system will then operate autonomously without constant supervision, so there is no need to keep staff on duty around the clock.

If equipment fails physically, data is protected from loss because it is replicated to secondary servers.

Fault tolerance is often provided by using multiple resources, so the same setup can also balance the load across different servers.
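How several servers can give both load balancing and fault tolerance can be shown with a short Python sketch. It assumes a simple round-robin policy, which is only one of many possible strategies, and the class `RoundRobinBalancer` and the server names `web-1`, `web-2`, `web-3` are hypothetical. Requests are handed to servers in turn, and a server marked as down is skipped.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked as down.

    Hypothetical illustration; real balancers also run health checks.
    """

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)  # servers currently able to serve
        self._ring = cycle(self.servers)  # endless round-robin iterator

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
balancer.mark_down("web-2")  # simulate a failure of one server
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # ['web-1', 'web-3', 'web-1', 'web-3']
```

With `web-2` out of service, traffic keeps flowing and is shared between the two remaining servers, which is the combination of fault tolerance and load balancing described above.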