Today we will talk about the main and most commonly used methods to optimize costs on AWS services.

The main options for optimizing costs

These options primarily include:

  • Reserved Instances – for steady, always-on loads;
  • Scheduled Reserved Instances – for loads that recur at well-defined points in time;
  • Spot Instances – for temporary use of idle servers on the spot market (at a substantial discount);
  • Optimizing resource usage time;
  • Optimizing resource usage;
  • Tracking unused resources.

Before looking at each of these options in more detail, note that the methods listed here are not the only ways to optimize costs.

Reserved Instances

If you know for certain that a given number of EC2 machines will run continuously for a year or longer, Reserved Instances (RI) are a great way to save money. The cost of an RI has two main components: the initial payment and the hourly payment for the server. In addition to the savings, you get a guarantee that the capacity will be available to you throughout the reservation period.

EC2 machines can be reserved under three payment options:

  • No Upfront – no initial payment, and the lowest discount of the three RI options;
  • Partial Upfront – a substantial initial payment and, accordingly, a substantial discount on EC2 use;
  • All Upfront – the full cost of the EC2 machine is paid immediately, but the total annual cost is much lower. Besides the substantial cost reduction, this option can serve as a means of managing currency risk.
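The trade-off between the three payment options comes down to simple arithmetic: spread the initial payment over the term and add the hourly rate. The sketch below illustrates this under stated assumptions. The prices are hypothetical placeholders, not real AWS rates, and the function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def effective_hourly_cost(upfront: float, hourly: float, term_years: int = 1) -> float:
    """Spread the upfront payment over the whole term and add the hourly rate."""
    total_hours = HOURS_PER_YEAR * term_years
    return upfront / total_hours + hourly


def savings_vs_on_demand(upfront: float, hourly: float, on_demand_hourly: float,
                         term_years: int = 1) -> float:
    """Fraction saved versus on-demand, assuming the machine runs the entire term."""
    effective = effective_hourly_cost(upfront, hourly, term_years)
    return 1 - effective / on_demand_hourly


# Hypothetical prices for one instance type (NOT real AWS rates).
ON_DEMAND = 0.10
OPTIONS = {
    "No Upfront": (0.0, 0.070),        # (upfront payment, hourly rate)
    "Partial Upfront": (290.0, 0.033),
    "All Upfront": (570.0, 0.0),
}

for name, (upfront, hourly) in OPTIONS.items():
    saved = savings_vs_on_demand(upfront, hourly, ON_DEMAND)
    print(f"{name}: {effective_hourly_cost(upfront, hourly):.4f} per hour, saves {saved:.0%}")
```

Because a reservation is paid for whether or not the machine runs, this comparison only holds for loads that really are continuous.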

Besides the options described above, it has recently become possible to reserve a portion of an EC2 machine’s time for periodic tasks (for example, monthly report processing). This option is called Scheduled Reserved Instances. Such machines cost slightly less than virtual machines ordered on demand.

The cost of RI can be calculated using the Simple Monthly Calculator (don’t forget to specify the correct region).
Reservation in AWS is not limited to EC2 machines: it also applies to many other services, including DynamoDB, RDS, Redshift and ElastiCache.

Spot instances

Spot Instances are, in essence, spare EC2 capacity that is currently idle. They can be ordered in two main variants:

  • Regular Spot Instances;
  • Spot Instances for longer use (defined duration).

The mechanics are very simple: first you place a bid. Your spot machine can launch as soon as the price of that machine on the spot market is lower than your bid. The good news is that, in theory, Spot Instances can save you up to 90% of the on-demand cost of the resource.

The flip side is that the machine can be reclaimed at any time (for example, if someone outbids you), although you are given two minutes’ notice before it is shut down so that you can perform any necessary steps. Spot Instances can be used in auto-scaling events or, more commonly, as resources for running large computational loads.
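The two-minute warning can be detected from inside the machine: the EC2 instance metadata service publishes an interruption notice at the `spot/instance-action` path, which returns HTTP 404 until an interruption is scheduled. The sketch below only parses such a notice. The function names are hypothetical, and fetching the URL is left to the caller so that the logic can be tried without running on EC2.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# EC2 instance metadata path for Spot interruption notices.
# It returns HTTP 404 until an interruption has been scheduled.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def seconds_until_interruption(body: Optional[str], now: datetime) -> Optional[float]:
    """Return the seconds left before the interruption, or None if none is scheduled.

    `body` is the raw response text from INSTANCE_ACTION_URL (None or "" for a 404).
    """
    if not body:
        return None
    notice = json.loads(body)
    # The notice carries the action and a UTC timestamp such as 2017-09-18T08:22:00Z.
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


def should_drain(body: Optional[str], now: datetime) -> bool:
    """True once a notice is present, so the worker can checkpoint and stop."""
    return seconds_until_interruption(body, now) is not None
```

A worker would typically poll this path every few seconds and, once `should_drain` returns true, save its state and stop taking new work. On machines that enforce IMDSv2, the request must also carry a session token.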

Below is a diagram of the spot life cycle.
There is also a special tool to assess the likelihood of an instance loss.

Unlike conventional spot machines, Spot Instances with a defined duration run for exactly the time you specify (from one to six hours, in hourly increments). If you use such machines, you are guaranteed to get a 50% discount.

More details on the cost of Spot Instances can be found here (do not forget to specify the region and the type: regular spot or defined duration).
Below is a comparative diagram of the main options of payment for EC2.

Options for optimization

Optimizing working time

Optimizing the running time of dev/test environment resources is an interesting and often underestimated way to save a substantial amount of money. It is also quite simple to do: you just need to configure the machines so that they can be switched on and off without losing data, test results and (if anyone makes them there) changes to the code.

This is quite feasible with services and features such as EC2, AMI, EBS, CloudFormation and others, combined with automation. It is also worth noting that being able to switch an environment on and off without losing state or data is the first step towards implementing high availability (HA) and disaster recovery (DR) for your applications.

Optimizing working time can save you a lot of money unless your whole team works 24/7.
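As a rough illustration of how much there is to gain, the sketch below decides whether a dev/test environment should be running at a given moment and estimates the share of machine-hours saved. The working hours (08:00 to 20:00, Monday to Friday) and the function names are assumptions made for this example.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168


def should_be_running(now: datetime, start_hour: int = 8, end_hour: int = 20) -> bool:
    """True on weekdays between start_hour (inclusive) and end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Saturday is 5
    return is_weekday and start_hour <= now.hour < end_hour


def weekly_savings(start_hour: int = 8, end_hour: int = 20, workdays: int = 5) -> float:
    """Fraction of machine-hours saved compared with running around the clock."""
    running_hours = (end_hour - start_hour) * workdays
    return 1 - running_hours / HOURS_PER_WEEK


print(f"Share of machine-hours saved: {weekly_savings():.0%}")
```

A scheduled job could call `should_be_running` and start or stop the environment accordingly. The estimate covers compute hours only: the EBS volumes of stopped machines are still billed.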

Optimizing resources

This recommendation is obvious and, like all obvious things, is often forgotten. Always remember that AWS lets you get the resources you need when you need them. You won’t have to go through complex administrative procedures or wait a few days for the operations team to provide the capacity: you can get everything at once. So, to say it one more time: use exactly the amount of resources that you need right now. In particular:

  • Don’t order virtual disks with a 100% margin – when you need more space, you can allocate a new disk and attach it to your virtual machine;
  • Don’t allocate powerful machines for routine functional testing;
  • Don’t use huge amounts of data in routine work;
  • Don’t reinstall everything every time – use AMIs;
  • Don’t neglect long-term archives and the data lifecycle;
  • Don’t abuse the opportunity to get everything right now – plan your activities (analysis of large logs is better done on Spot at a minimal price, load testing on on-demand EC2 on Black Friday is better moved to a quieter time, and so on);
  • Don’t overuse manual operations – try to use automation tools;
  • Feel free to reach out – contact AWS and get help if you have exceeded your limits and everything seems lost. Look into technical support if you cannot afford to wait, and especially if you use AWS for production environments.

Am I wrong? Have you ever run into a situation where there was no capacity for resource allocation? Wait for half an hour. Don’t want to wait? Start a new machine of a similar type. Is disk initialization taking too long? See the documentation – initialization speed can be optimized. Has a limit been reached? Contact the support service – it will be increased. All problems can be solved. If something is really missing – we will include it in our roadmap.

Tracking unused resources

Unused resources are often a substantial source of cost in AWS. They most often arise from incorrect settings of individual services (for example, an EBS volume was not deleted together with its EC2 machine because the volume was flagged to be kept when the instance is deleted), from a lack of understanding of basic billing concepts (for example, an Elastic IP that stays unassigned to any resource for a long time), or from performing many actions manually (for example, manual provisioning and bootstrapping of a distributed system through the Management Console).
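The two examples above, orphaned EBS volumes and idle Elastic IPs, are easy to spot in the output of the EC2 `DescribeVolumes` and `DescribeAddresses` calls. The following is a minimal sketch that filters response dictionaries of the shape boto3 returns. The helper names are hypothetical, and it assumes VPC addresses, where an attached Elastic IP carries an `AssociationId`.

```python
def unattached_volumes(describe_volumes_response: dict) -> list[str]:
    """Volume IDs in the 'available' state, i.e. not attached to any machine."""
    return [
        volume["VolumeId"]
        for volume in describe_volumes_response.get("Volumes", [])
        if volume.get("State") == "available"
    ]


def unassociated_addresses(describe_addresses_response: dict) -> list[str]:
    """Elastic IPs without an AssociationId, i.e. not bound to any resource."""
    return [
        address["PublicIp"]
        for address in describe_addresses_response.get("Addresses", [])
        if "AssociationId" not in address
    ]


# With boto3 installed and credentials configured, the inputs would come from:
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(unattached_volumes(ec2.describe_volumes()))
#   print(unassociated_addresses(ec2.describe_addresses()))
```

In an account with many volumes, `describe_volumes` results are paginated, so a real script would iterate over all pages before filtering.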

If you project this situation onto a distributed development team that runs many experiments while working on a large number of projects, the problem of unused resources can become serious. There are several basic ways to detect unused resources in AWS. The most accessible include:

  • Billing forecasts in Cost Explorer;
  • Trusted Advisor, a tool that provides recommendations in four main areas (security, performance, reliability and cost optimization).
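Cost Explorer data is also available programmatically, which makes it easier to notice a service whose cost keeps growing. The sketch below sums a `get_cost_and_usage` response per service. It looks at historical cost rather than forecasts, the function name is hypothetical, and the dates in the commented call are placeholders.

```python
from collections import defaultdict


def cost_by_service(response: dict, metric: str = "UnblendedCost") -> dict[str, float]:
    """Sum a Cost Explorer get_cost_and_usage response per service name.

    Assumes the request grouped results by the SERVICE dimension, so the
    first key of every group is the service name.
    """
    totals: dict[str, float] = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            service = group["Keys"][0]
            totals[service] += float(group["Metrics"][metric]["Amount"])
    return dict(totals)


# With boto3 installed and credentials configured, the input would come from:
#   import boto3
#   ce = boto3.client("ce")
#   response = ce.get_cost_and_usage(
#       TimePeriod={"Start": "2024-01-01", "End": "2024-03-01"},  # placeholder dates
#       Granularity="MONTHLY",
#       Metrics=["UnblendedCost"],
#       GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
#   )
#   print(cost_by_service(response))
```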

In addition, here are a few basic recommendations for working with AWS that reduce the likelihood of unused resources:

  • limit developers’ privileges – grant your developers only the permissions they need, create a separate subnet for each of them to experiment in, and use VPC and tags to control the scope of each project. This helps you understand who has used how many resources and reduces the risk of running unnecessary ones;
  • use tags – they help you better understand and control your environment. Properly organized tags let you always determine what was launched, when and by whom, and (importantly) whether it can be switched off;
  • use automation tools to allocate and remove resources (AWS CloudFormation) – they let you not only create stack resources but also remove them correctly. You can create and delete entire environments with a single tool at the click of a button;
  • if you use your own scripts – do not forget to delete resources, which are usually connected with each other. Errors during removal will help you find unused resources;
  • don’t underestimate the impact of a well-built architecture on the final cost of a solution – consider AWS services at the application level.
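The tagging recommendation is easy to enforce with a periodic check. The sketch below lists machines that lack required tags, given a response of the shape returned by the EC2 `DescribeInstances` call. The required tag keys (`Project`, `Owner`) and the function name are hypothetical examples of a tagging policy, not anything AWS prescribes.

```python
REQUIRED_TAGS = frozenset({"Project", "Owner"})  # hypothetical tagging policy


def untagged_instances(describe_instances_response: dict,
                       required: frozenset = REQUIRED_TAGS) -> dict[str, set[str]]:
    """Map each instance ID to the set of required tag keys it is missing."""
    missing: dict[str, set[str]] = {}
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            present = {tag["Key"] for tag in instance.get("Tags", [])}
            absent = set(required) - present
            if absent:
                missing[instance["InstanceId"]] = absent
    return missing


# With boto3 installed and credentials configured, the input would come from:
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(untagged_instances(ec2.describe_instances()))
```

Machines reported here are good candidates for a follow-up question to the team: who launched this, and can it be switched off?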

We provide AWS support services. Contact us and we will help you save on AWS.