Down the malloc hole in search of free_sized()

Considering the security implications of the sized memory deallocation proposal in the C23 language standard

Nov 22, 2022

While exploring the proposed changes to the C language standard as part of the C23 update, I came across an intriguing little thread. And having seemingly nothing better to do at that moment, I decided to pull on it and see what I learned.

The newly proposed function is one to free memory, by not only specifying the pointer of the memory to be freed, but also its size.

#include <stdlib.h>
void free_sized(void *ptr, size_t size);
void free_aligned_sized(void *ptr, size_t alignment, size_t size);

Any security practitioner is well aware of the risks associated with memory management, and freeing memory is even subject to several of its own memory corruption vulnerability classes, such as use-after-free or double-free vulnerabilities.

At first glance, I did not understand the benefits this free_sized() proposal could bring, so down the malloc hole I went.

Understanding how free() works

Let’s start off by having a look at how free() is implemented in the commonly used glibc implementation. For this, we will look at malloc/malloc.c1

3324 void
3325 __libc_free (void *mem)
3326 {
3327   mstate ar_ptr;
3328   mchunkptr p;                          /* chunk corresponding to mem */
3329 
3330   if (mem == 0)                         /* free(0) has no effect */
3331     return;
...
3340   p = mem2chunk (mem);
3341 
3342   if (chunk_is_mmapped (p))             /* release mmapped memory. */
3343     {
...
3355       munmap_chunk (p);
3356     }
...
3369 }

While many things happen in __libc_free(), let’s focus on just two of them. At line 3340, mem2chunk() is a macro that does exactly what's on the tin, it converts a memory pointer to a chunk address and tag.

1284 /* Convert a user mem pointer to a chunk address and extract the right tag.  */
1285 #define mem2chunk(mem) ((mchunkptr)tag_at (((char*)(mem) - CHUNK_HDR_SZ)))

The actual freeing of the memory takes place by calling munmap_chunk() at line 3355.

3016 static void
3017 munmap_chunk (mchunkptr p)
3018 {
3019   size_t pagesize = GLRO (dl_pagesize);
3020   INTERNAL_SIZE_T size = chunksize (p);
3021 
3022   assert (chunk_is_mmapped (p));
3023 
3024   uintptr_t mem = (uintptr_t) chunk2mem (p);
3025   uintptr_t block = (uintptr_t) p - prev_size (p);
3026   size_t total_size = prev_size (p) + size;

3027   /* Unfortunately we have to do the compilers job by hand here.  Normally
3028      we would test BLOCK and TOTAL-SIZE separately for compliance with the
3029      page size.  But gcc does not recognize the optimization possibility
3030      (in the moment at least) so we combine the two values into one before
3031      the bit test.  */

3032   if (__glibc_unlikely ((block | total_size) & (pagesize - 1)) != 0
3033       || __glibc_unlikely (!powerof2 (mem & (pagesize - 1))))
3034     malloc_printerr ("munmap_chunk(): invalid pointer");
...
3042   __munmap ((char *) block, total_size);
3043 }

Surprisingly, at line 3024, the inverse operation takes place and the memory pointer for the referenced chunk is looked up by calling the chunk2mem() macro. At first glance this does not seem optimal.

So what we see here is that, apart from performing a number of checks, prior to freeing the memory we look up the actual chunk of memory to be freed, and calculate its size.

Now that we understand what happens under the hood of a current implementation, let’s look what the proposal means where the size is not looked up, but passed along.

Understanding the new sized memory deallocation proposal

The proposal in question is N2699 - Sized Memory Deallocation2.

The sought benefits that are listed for introducing free_sized() are:

Performance optimisation
Improved security hardening and testing

Apparently, looking up the size of a chunk of memory given just its pointer is an intensive part of the process that could be avoided, with performance benefits expected to reach 30% in some popular memory allocators, because “Looking up the size given just a pointer requires walking a hashtable or radix tree, which requires several memory loads and (often) cache misses.”

On the topic of security hardening, the proposal creators have this to say:

“If storage is deallocated with an incorrect size, either heap corruption has occurred, or the application has incorrectly tracked the sizes of its allocations, risking a heap overrun vulnerability. In either case, the process can be aborted (or the error otherwise managed). When the equivalent C++ functionality was deployed in both Google and Facebook’s codebases, it exposed a significant number of correctness bugs caused by incorrect type-tracking in applications. With a standardized name, tools like Address Sanitizer can flag these sorts of errors, as they do in C++ code.”

Discussion

The proposal makes an interesting argument that making sized deallocation functions available can contribute to improved correctness and increased security. While this might be seen as a positive contribution, I feel it is important to consider whether this size can be trusted, and what happens in case it isn’t correct ?

Will the size be verified before attempting to free the memory ?
How should the memory allocator handle incorrect sizes ?
If the size is to be known before calling free_sized(), where is it kept ?

A quick glance at the implementation in tcmalloc3 looks like the do_free_with_size() function in tcmalloc/tcmalloc.cc4 makes use of an assert macro, which would result in an abort() of the program in case the assertion fails.

ASSERT(CorrectSize(ptr, size, align));
ASSERT(CorrectAlignment(ptr, static_cast<std::align_val_t>(align.align())));

As the proposed function has no return value, it would not be possible to verify whether the free_sized() function has successfully completed or not. While this is true in case of free() as well, there the size is looked up as part of the function and this error condition is not expected to occur for mapped memory.

While the proposal correctly identifies that in the case where the size does not match, a form of heap corruption or unalignment has already happened, further exploration of what to do in such instance would be interesting to consider.

Is this a solid proposal, or is it a risk for creating additional attack surface ? What do you think ?

Regardless, the entire proposal is well worth a read for anyone professionally (or otherwise) involved in source code auditing or security.

A big thanks to Dmitry Janushkevich for his thoughtful comments.

Till next time,

Thierry

https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2699.htm

https://github.com/google/tcmalloc

https://github.com/google/tcmalloc/blob/master/tcmalloc/tcmalloc.cc

INT3 / Low-level Cybersecurity

Discussion about this post