
If you have worked with Microsoft Azure cloud technologies, then you have probably come across, or at least read about, the Azure Storage Account and its components: Tables, Queues, and Blobs.

In this article I would like to look at the last of these, blobs, from the standpoint of access speed, and give a short overview of the options for modifying them without downloading the entire blob content to the client.

Azure currently provides three types of blobs:

  1. Block blobs store binary data as separate blocks of variable size and allow storing up to 190TB of data in a single blob.
  2. Append blobs (Append BLOBs) are essentially the same block blobs, but the Azure Storage Account infrastructure takes on the job of appending data to the end of an existing blob, and also lets multiple independent producers write to the same blob without locks (though without consistency guarantees: the only guarantee is that each individual append will be added to the blob atomically and will not overwrite another).
  3. Page blobs provide random access to content and are primarily used to store virtual machine disk images.

What could be simpler than a blob?

To begin, a short story about something we ran into on one of our projects a few years ago. We needed to receive, store, and serve large volumes of client telemetry through an API, and using tables was inconvenient for several reasons.

First, they did not give the desired read speed when querying long time ranges. Everything simply came down to how fast data could be fetched through the Table API, and even switching to daily partitioning with a parallel query per day did not give the desired result.

Second, the hardware for our API services started to get very expensive because of the JSON serialization in which Table API responses arrive.

We began to explore options and priced out Cosmos DB, but it turned out to be quite expensive at our volumes. Then we stumbled upon Append BLOBs, which had just reached General Availability. Microsoft suggested using them for append-only scenarios (logs, journals), and out of the box they supported non-blocking writes to a single blob from multiple writers. It would seem: what could go wrong?

And so our prototype was deployed to the load-testing environment. At first everything looked quite good: data poured into the blobs quickly, and queries against them were also faster than with Azure Table Storage, since there was no need to scan a table partition to find the data; it was enough to build the blob name from the event type and date. Binary protobuf serialization also saved a lot of CPU.

Everything was fine until the number of entries in the blobs began to approach the expected daily count. The longer the data upload ran, the slower our application returned data for API requests. The speed of reading blobs inside the Azure infrastructure, within a single data center, from the Storage Account to our Web API services, dropped to unreasonable values: a ten-megabyte blob could take several minutes to read! As analysis showed, we had run into the problem of reading fragmented blobs.

If you use Append BLOBs with concurrent appends, Azure creates a separate block for each append operation and synchronizes only the block commit operations, without grouping data blocks in any way. And when the block size ends up very small and the number of blocks large, the read speed of such a blob drops catastrophically.

From this, by the way, an immediate conclusion is worth drawing: if you use such blobs for logging, and the speed of loading those logs into your own infrastructure (for example, ELK) matters to you, then it is a good idea to make the send buffer of your application's logger as large as your tolerance for losing buffered data allows.
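The batching idea can be sketched as follows (Python for brevity, since the rest of this article's code is C#; the class and parameter names are mine, and the flush callable stands in for the actual append operation of whatever logging or SDK client you use):

```python
class BufferedAppender:
    """Accumulates small writes and flushes them as one large append,
    trading potential data loss (up to max_buffer bytes) for far fewer,
    larger blocks in the target Append BLOB."""

    def __init__(self, flush, max_buffer=4 * 1024 * 1024):
        self._flush = flush          # callable receiving one bytes chunk
        self._max_buffer = max_buffer
        self._chunks = []
        self._size = 0

    def write(self, data: bytes) -> None:
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._max_buffer:
            self.flush()

    def flush(self) -> None:
        if self._chunks:
            # one flush call -> one append operation -> one block in the blob
            self._flush(b"".join(self._chunks))
            self._chunks, self._size = [], 0
```

With a 4MB buffer, a service producing hundreds of small log lines per second creates a few blocks per minute instead of hundreds per second.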

Block BLOBs, and how to replace an Append BLOB with them

Now back to block blobs, and at the same time I will tell you how we solved the problem with telemetry. In the Blob Storage documentation, Microsoft gives an example of loading an entire blob in one call.

This method has a limit on the size of the created blob, but I suspect that in most cases nobody ever hits it, since at the moment it is about 5GB (before 2019 – 256MB, before 2016 – 64MB).

But besides this simple API there is also an extended one. What is in it? Three operations: Put Block, Put Block List, and Get Block List. In short, you can upload individual blocks up to 4GB in size (before 2019 – 100MB, before 2016 – 4MB), each block must have a unique identifier up to 64 bytes long, and then you call Put Block List with the list of block identifiers, at which point the blob becomes available and visible to other clients.
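In the Python SDK (azure-storage-blob), Put Block and Put Block List surface as stage_block and commit_block_list on the blob client. A minimal sketch (client construction and container names are omitted; the helper names are mine):

```python
import base64
import uuid


def split_into_blocks(data: bytes, block_size: int):
    """Split data into (block_id, chunk) pairs. Each id must be a unique
    base64 string no longer than 64 bytes before encoding, as Put Block
    requires; a random GUID comfortably fits."""
    blocks = []
    for offset in range(0, len(data), block_size):
        block_id = base64.b64encode(uuid.uuid4().bytes).decode()
        blocks.append((block_id, data[offset:offset + block_size]))
    return blocks


def upload_in_blocks(blob_client, data: bytes, block_size: int):
    """Stage each block (Put Block), then commit the ordered id list
    (Put Block List) so the blob appears to other clients atomically."""
    from azure.storage.blob import BlobBlock  # SDK v12 wraps ids in BlobBlock

    blocks = split_into_blocks(data, block_size)
    for block_id, chunk in blocks:
        blob_client.stage_block(block_id=block_id, data=chunk)
    blob_client.commit_block_list([BlobBlock(block_id=bid) for bid, _ in blocks])
```

Until commit_block_list succeeds, the staged blocks are invisible; a failed upload simply leaves uncommitted blocks that the service garbage-collects.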

Digging deeper, what else can be done with these methods?

For example, when uploading a blob, you can split the data yourself into as many blocks as you need and upload them in parallel. This can speed up the upload of large blobs by almost an order of magnitude while leaving the overall data-loading logic practically unchanged. But more on that below.

Or you can implement your own Append BLOB by adding new blocks to the end of an existing blob. One way to do it: in the service that updates the blob, keep the contents of the last (tail) block and the current block list. When you need to append data to the blob, you add it to that in-memory content, build a new tail block from it, and replace the old tail block with the new one. Two API calls (Put Block and Put Block List), not a single read, and you effectively have an Append BLOB, only better, since its fragmentation is much lower. And when the tail block grows too large, you start assembling a new one. The downside: clients must be pinned to the service instance through which the blob is modified. This is in fact what we ended up with, and it now digests quite large volumes of telemetry.
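The tail-block scheme above can be sketched like this (again Python for brevity; the two injected callables stand in for Put Block and Put Block List, and all names are illustrative rather than our production code):

```python
import base64
import uuid


def _new_block_id() -> str:
    return base64.b64encode(uuid.uuid4().bytes).decode()


class TailBlockAppender:
    """Append-BLOB replacement: keep the tail block's bytes and the
    committed block list in memory; each append rewrites only the tail
    block and re-commits the id list, i.e. exactly two API calls."""

    def __init__(self, stage_block, commit_block_list, seal_size=1024 * 1024):
        self._stage = stage_block          # (block_id, data) -> None, Put Block
        self._commit = commit_block_list   # [block_id, ...] -> None, Put Block List
        self._seal_size = seal_size        # tail at least this big starts a new block
        self._sealed_ids = []              # ids of finished, immutable blocks
        self._tail = b""

    def append(self, data: bytes) -> None:
        self._tail += data
        tail_id = _new_block_id()          # fresh id: the old tail block is replaced
        self._stage(tail_id, self._tail)
        self._commit(self._sealed_ids + [tail_id])
        if len(self._tail) >= self._seal_size:
            self._sealed_ids.append(tail_id)  # seal the tail, start a new one
            self._tail = b""
```

Blocks that were staged under the previous tail id but never re-committed are simply discarded by the service, so the replacement is safe.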

But you don't have to stop there. After all, you can replace any block of an already created blob, not only the last one. Moreover, block sizes do not have to be equal, so blocks can be both split and merged, within the maximum allowed 4GB.

And we also have the block ID, which can carry up to 64 bytes of data. A couple of bytes are enough for the actual block identifier within a blob (remember, no more than 50,000 blocks per blob), so arbitrary data can be put into the remaining 62 bytes. This lets you keep your own small per-block metadata, albeit effectively in read-only mode.
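A sketch of such metadata packing (note one documented constraint: within a single blob, all block IDs must have the same length, which is why the id is padded to a fixed size; the encoding scheme here is my own illustration):

```python
import base64
import struct

ID_RAW_SIZE = 64  # Put Block allows up to 64 bytes of id before base64 encoding


def encode_block_id(index: int, metadata: bytes = b"") -> str:
    """Pack a 2-byte block index (enough for the 50,000-block limit)
    plus up to 62 bytes of custom metadata into a fixed-size block id."""
    if not 0 <= index < 50_000:
        raise ValueError("index out of range")
    if len(metadata) > ID_RAW_SIZE - 2:
        raise ValueError("metadata too long")
    raw = struct.pack(">H", index) + metadata.ljust(ID_RAW_SIZE - 2, b"\x00")
    return base64.b64encode(raw).decode()


def decode_block_id(block_id: str):
    """Recover the index and metadata from an id returned by Get Block List."""
    raw = base64.b64decode(block_id)
    (index,) = struct.unpack(">H", raw[:2])
    return index, raw[2:].rstrip(b"\x00")
```

Get Block List returns the ids, so the metadata can be read back without touching the block contents at all.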

Fragmentation, upload and download speed

But what about the speed of working with blobs: is fragmentation always harmful? The answer here: it depends on the task.

One example is using blobs to deliver data into an Azure DC from external infrastructure, when it is cheaper to import data into Azure Tables by packing it into a blob, uploading that blob to a Storage Account in the same DC as your table, and inserting the data into the table from there. On the Azure side, the limiting factor will most likely no longer be blob read speed (unless you take its fragmentation to the point of absurdity) but the table inserts, so speeding up the blob-filling step can be useful.

public static class BlockBlobClientExtensions
{
    public static async Task UploadUsingMultipleBlocksAsync(this BlockBlobClient client, byte[] content, int blockCount)
    {
        if (client == null) throw new ArgumentNullException(nameof(client));
        if (content == null) throw new ArgumentNullException(nameof(content));
        if (blockCount < 1 || blockCount > content.Length) throw new ArgumentOutOfRangeException(nameof(blockCount));

        var position = 0;
        var blockSize = content.Length / blockCount;
        var blockIds = new List<string>();
        var tasks = new List<Task>();

        while (position < content.Length)
        {
            var blockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());
            blockIds.Add(blockId); // the commit below needs the ids in content order
            tasks.Add(UploadBlockAsync(client, blockId, content, position, blockSize));
            position += blockSize;
        }

        await Task.WhenAll(tasks);
        await client.CommitBlockListAsync(blockIds);
    }

    private static async Task UploadBlockAsync(BlockBlobClient client, string blockId, byte[] content, int position, int blockSize)
    {
        await using var blockContent = new MemoryStream(content, position, Math.Min(blockSize, content.Length - position));
        await client.StageBlockAsync(blockId, blockContent);
    }
}
Of course, you could implement the same logic by parallelizing the upload across different blobs and reassembling the original data from them on the Azure side. However, if you later need to guarantee the ordering and integrity of the data, it is simpler to use block upload and Put Block List to atomically create one blob from the individual blocks.

This approach is especially worth considering when you need to quickly upload large amounts of data to a geographically remote Azure DC and the high latency of the TCP connection prevents you from fully utilizing the available channel.

But in the race for speed, do not forget about the Storage Account limits, or you may run into request throttling (as in the test below). And you should only do this when, after splitting, the blocks remain large enough.

Some tests

You can evaluate the impact of fragmentation on blob download speed, as well as how parallelism affects upload speed into a Storage Account, from the results of the small test below. We take a random array of 100,000,000 bytes and upload it to the Storage Account as a blob consisting of 1, 100, 1000, 10000, or 50000 blocks (more blocks cannot be put into one blob in the current version of the API), obtained by splitting the original array into equal parts. After that, the resulting blob is downloaded and deleted. We measure the upload and download time and calculate the speed from the time in seconds, rounded to two decimal places.
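The speed columns in the tables below follow directly from size and time (1 KB = 1024 bytes, fractional part truncated); for example, the first rows of test 1:

```python
def speed_kb_s(size_bytes: int, seconds: float) -> int:
    """Transfer speed in KB/s (1 KB = 1024 bytes), fraction truncated."""
    return int(size_bytes / seconds / 1024)


SIZE = 100_000_000  # test payload size from the article

# single-block upload took 19.32 s
print(speed_kb_s(SIZE, 19.32))  # 5054 KB/s, as in the table
# 100-block upload took 3.81 s
print(speed_kb_s(SIZE, 3.81))   # 25631 KB/s
```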

Test number 1. Storage Account in DC Azure North Europe

Blocks count   Block size, bytes   Upload time, s   Upload speed, KB/s   Download time, s   Download speed, KB/s
           1         100 000 000            19.32                5 054              28.90                  3 379
         100           1 000 000             3.81               25 631              38.49                  2 537
       1 000             100 000             6.00               16 276              42.16                  2 316
      10 000              10 000             7.27               13 432             127.73                    764
      50 000               2 000            31.97                3 054             394.86                    247

Test number 2. Storage Account in DC Azure West US

Blocks count   Block size, bytes   Upload time, s   Upload speed, KB/s   Download time, s   Download speed, KB/s
           1         100 000 000            51.48                1 896                 80                  1 220
         100           1 000 000             8.96               10 899                 96                  1 017
       1 000             100 000             3.48               28 062                105                    930
      10 000              10 000             2.67               36 575                230                    424
      50 000               2 000             9.82                9 944                770                    127

The increase in blob upload time as the block count grows from 1000 upward appears to be caused by throttling after exceeding the Storage Account API limits, a situation that should never be allowed to reach production.


When working with large mutable blobs, it is important to control their degree of fragmentation; otherwise it is easy to end up in a situation where the data seems to be there, but access to it has become so slow that the data might as well not exist.

If you do run into this problem, you will most likely have to either change the application logic or think about regularly "treating" such blobs by rebuilding them with blocks merged into larger ones. The API allows you to do this without creating intermediate blobs, directly in place, and even, via optimistic concurrency, without losing consistency, at the cost of occasional reprocessing.
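As a sketch of such "treatment", here is a hypothetical planner that decides which consecutive runs of small blocks to merge into larger ones; the actual re-staging could then be done with Put Block From URL, reading the ranges directly from the blob itself (the function and parameter names are mine):

```python
def plan_merges(block_sizes, target=4 * 1024 * 1024):
    """Group consecutive blocks into runs whose combined size reaches
    `target` bytes (never exceeding the 4GB per-block limit), returning
    (start_index, end_index_exclusive) ranges to re-stage as single
    larger blocks."""
    max_block = 4 * 1024 * 1024 * 1024
    ranges, start, acc = [], 0, 0
    for i, size in enumerate(block_sizes):
        if acc and acc + size > max_block:
            ranges.append((start, i))   # close the run before overflowing 4GB
            start, acc = i, 0
        acc += size
        if acc >= target:
            ranges.append((start, i + 1))
            start, acc = i + 1, 0
    if acc:
        ranges.append((start, len(block_sizes)))  # leftover tail run
    return ranges
```

Each planned range maps to one staged block plus its byte offsets in the source blob, and a final Put Block List with the new ids commits the defragmented layout atomically.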

But for the Azure data-import scenario, modest fragmentation (around 100–1000 blocks, with control over the degree of parallelism) lets you load data into the Azure DC an order of magnitude faster and utilize your channel more fully, which, even if downstream processing speed drops by about 25%, looks like an acceptable compromise and does not require deep changes to the application code.