Variable Size Memory Allocations Manager

VariableSizeAllocationsManager class handles free memory block management to accommodate variable-size allocation requests. It keeps track of free blocks only and does not record allocation sizes. The class uses two ordered maps to facilitate operations. The first map keeps blocks sorted by their offsets. The second multimap keeps blocks sorted by their sizes. The elements of the two maps reference each other, which enables efficient block insertion, removal and merging. Note that since we need ordered access, all operations with the maps (search/insert/delete) take logarithmic time on the number of blocks. The two maps are defined as shown below:

Note that the second map is a multimap because there may be several blocks of the same size. To the contrary, there could be only single block at the specific offset. An example for the case of four free blocks is given in the figure below:

FreeBlockListManager1

Note that the Size  member of the FreeBlockInfo structure is redundant as it is equal to OrderBySizeIt->first. It however makes implementation more readable and clear.

Allocation

To allocate a new block, we use the second map to find the smallest block that is large enough to encompass the requested size. multimap::lower_bound standard function does exactly what we need:

This operation takes logarithmic time on the size of the container. Offset to the beginning of the requested allocation will then be given by the beginning of this block:

The offset of the new free block will be given by adding requested size to the original offset, and the new size will be given by subtracting the requested memory size from the original block size:

The final step is to update the maps. Since both block size and offset have been changed, we have to first remove original elements from the containers, and if the new block is not empty, insert new elements into the maps:

AddNewBlock()  is a helper function that inserts a new block of given size at the specified offset:

Both insertion and deletion of an element into an ordered map take logarithmic time, so the total method complexity is O(log n).

As an example, if we request the block of size 20 bytes, the system will return offset 32 and the structure of the blocks after the allocation will look like shown in the figure below:

BlockDeletion2

The full function source code is given in the following listing:

Deallocation

Deallocation is little more involved as there are more cases we need to handle. The first step is to understand where the new block should be inserted. For that purpose, we can use the first map that keeps memory blocks sorted by the offsets. Since we know the offset of the block to be deallocated, we can use map::upper_bound to find a block that goes right after it. Using the obtained interator, we can then find preceding free block:

Next we need to handle four possible cases:

  1. The block being released is not adjacent to any other free block. In this case we simply add the block as a new free block.
  2. The block is adjacent to the previous block. The two blocks need to be merged.
  3. The block is adjacent to the next block. The two blocks need to be merged.
  4. The block is adjacent to both previous and next blocks. All three blocks need to be merged.

The full function listing is given below:

The run-time complexity of deallocation routine is logarithmic with respect to the number of free blocks. The figure below shows an example of deallocating a block of size 8 at offset 56, that results in a merge of the block with the previous and next free blocks:

BlockInsertion1

Variable Size GPU Allocations Manager

When managing GPU memory, deallocations can only be performed when the memory is no longer accessed by the GPU. A simple extension of the basic allocator allows deferring block deallocation until the time when it is safe to do so. The class provides new Free() method that takes frame number along with the block offset and size. The method adds the block attributes in the deletion queue, and releases all the stale allocations at once at the end of each frame: