DIMCT

During our evaluations, we often need to quickly analyze large Windows products and pinpoint how their different components, especially their modules, work together. In most cases, a module imports another module's functions, and this is easily retrieved statically.

However, in a few other cases, a module may export class constructors, which return objects containing references to their virtual methods. In some other cases, callbacks may be registered and later called by other modules. In these cases, it is not trivial to pinpoint which method will be called by another module (and especially by which function).

We developed a small tool, "DIMCT" (for Dirty Inter-Module Calls Tracer), which traces inter-module calls without too much overhead.

The tool may be found at https://github.com/AMOSSYS/DIMCT.

Usage

The usage is relatively straightforward:

  1. Run the provided IDAPython script in order to generate a configuration file;
  2. Start the monitored process;
  3. Run the provided executable with the process PID, the configuration file, and the delay before killing the process;
  4. Load the output with the IDAPython script in order to pinpoint which functions have been called;
  5. Manually parse the output file if you want more information, e.g. 'who called whom'.

Internals

The inner concepts are also quite simple: an inline hook is placed at the top of every identified function. The hook points to a logging function, which only logs inter-module calls. Logs are written to a dedicated memory area, which is periodically read and dumped by the monitoring process.
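As a rough illustration of the "only logs inter-module calls" filter, here is a minimal Python sketch. It assumes that the logging function decides by checking whether the return address found on the stack falls outside the hooked module; the post does not spell out the actual test, and the Module type, the should_log helper and the KernelBase.dll bounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    base: int  # first address of the module image
    end: int   # first address past the module image

    def contains(self, address: int) -> bool:
        return self.base <= address < self.end


def should_log(hooked: Module, return_address: int) -> bool:
    """Log only when the caller lives outside the hooked module."""
    return not hooked.contains(return_address)


# Hypothetical bounds, for illustration only.
kernelbase = Module("KernelBase.dll", 0x10000000, 0x101E0000)

print(should_log(kernelbase, 0x77C597C7))  # caller outside KernelBase.dll
print(should_log(kernelbase, 0x100F2805))  # caller inside KernelBase.dll
```

The real check runs in injected assembly, where it comes down to two comparisons against the module bounds.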

It follows this scheme:

Figure 1: DIMCT flow

We call this tool "dirty" for the following reasons:

  1. we do NOT use a shared memory section: the monitoring process keeps reading the remote memory area and wipes it when it is full (two WriteProcessMemory calls are made, one to wipe the area, the second one to "release" the mutex). We just gave the monitoring process a higher priority than the target process in order to minimize the impact;
  2. we do NOT use any Windows API in the logging function, so mutexes are implemented with a lock cmpxchg instruction (i.e. no OS benefits such as thread priority boosts).
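The two points above can be modelled with a toy, single-threaded Python sketch. It is not the tool's code: the LogArea class, its record layout and the "stay locked when full" behaviour are assumptions made for illustration, and the cmpxchg method only emulates what the real lock cmpxchg instruction does atomically.

```python
class LogArea:
    """Toy model of the dedicated log area; the layout is hypothetical."""

    def __init__(self, capacity):
        self.capacity = capacity  # number of records the area can hold
        self.records = []         # (caller, callee) pairs
        self.lock_word = 0        # 0 = free, 1 = held

    def cmpxchg(self, expected, new):
        """Emulate LOCK CMPXCHG: store `new` only if the word equals `expected`."""
        if self.lock_word == expected:
            self.lock_word = new
            return True
        return False

    def try_log(self, caller, callee):
        """Hook side. A real hook would spin on cmpxchg instead of giving up."""
        if not self.cmpxchg(0, 1):
            return False
        self.records.append((caller, callee))
        if len(self.records) < self.capacity:
            self.lock_word = 0    # room left: release the lock
        return True               # area full: stay locked until the monitor wipes it

    def drain(self):
        """Monitor side: read the area, then perform the two writes."""
        dump = list(self.records)  # reading the remote area
        self.records.clear()       # write 1: wipe the area
        self.lock_word = 0         # write 2: "release" the mutex
        return dump


area = LogArea(capacity=2)
area.try_log(0x77C597C7, 0x100F77B0)
area.try_log(0x77C5D087, 0x100FD090)
blocked = not area.try_log(0x77C597C7, 0x100F77B0)  # full, lock still held
dump = area.drain()
print(blocked, len(dump), area.try_log(0x77C597C7, 0x100F77B0))
```

In this model, hooked threads are stalled from the moment the area fills up until the monitor has completed both writes, which is one plausible reason for giving the monitoring process a higher priority.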

Yeah, that's really dirty, but it actually worked without too many bugs, too much overhead or too many drops, so... we kept it as is. We also did not need to handle x64 binaries, so currently only x86 processes are supported (the concept remains the same; we will implement it soon, I guess).

The main problem we faced was handling relative instructions when moving our saved instructions. Moving a SHORT JMP or a CALL, whose operands are relative to the current instruction position, is not that straightforward, and that's the main reason why we used an IDAPython script.

To work around this problem and use absolute addresses, we replaced CALLs and JMPs with PUSH/RET sequences, and conditional jumps with the opposite condition followed by a PUSH/RET sequence. For instance, a JNZ SHORT <addr> will be replaced by JZ SHORT $+6 / PUSH <addr> / RET. Absolute addresses that belong to the module itself are stored relative to the module base address, and then "relocated" at hook installation. Absolute addresses are also logged so that the program can relocate them.
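To make the JNZ example concrete, here is a minimal Python sketch that emits the corresponding x86 bytes. The helper names (push_ret, rewrite_jcc_short, relocate) and the sample addresses are hypothetical and are not taken from the DIMCT sources. The sketch relies on two facts about the x86 encoding: short conditional jumps use opcodes 0x70 to 0x7F, where flipping the low bit inverts the condition, and PUSH imm32 followed by RET takes 5 + 1 = 6 bytes, which is where the displacement of 6 comes from.

```python
import struct


def push_ret(target: int) -> bytes:
    """PUSH imm32 (68 id) followed by RET (C3): an absolute jump in 6 bytes."""
    return b"\x68" + struct.pack("<I", target) + b"\xC3"


def rewrite_jcc_short(opcode: int, target: int) -> bytes:
    """Replace a short Jcc by the inverse Jcc that skips a PUSH/RET stub."""
    if not 0x70 <= opcode <= 0x7F:
        raise ValueError("not a short conditional jump")
    inverse = opcode ^ 1           # e.g. JNZ (0x75) becomes JZ (0x74)
    stub = push_ret(target)
    return bytes([inverse, len(stub)]) + stub


def relocate(code: bytearray, imm_offset: int, module_base: int) -> None:
    """Turn a stored module-relative address into an absolute one, in place."""
    (rva,) = struct.unpack_from("<I", code, imm_offset)
    struct.pack_into("<I", code, imm_offset, (rva + module_base) & 0xFFFFFFFF)


# JNZ SHORT to a target stored as a module-relative address (hypothetical value).
code = bytearray(rewrite_jcc_short(0x75, 0x000F2800))
relocate(code, imm_offset=3, module_base=0x10000000)
print(code.hex(" "))  # 74 06 68 00 28 0f 10 c3
```

A relocated CALL needs one more step than a JMP, since the return address must be pushed before the target; the sketch leaves that case out.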

As an example, here are the original function, the configuration file and the final result:

Figure 2: DIMCT trampolines

Example

As an example, let's test it on KernelBase.dll and the 32-bit version of notepad.exe. First, load KernelBase.dll (the SysWOW64 version) in IDA Pro, load the script and run create_config("config.bin", True).

Python>create_config("C:\\Users\\user\\Desktop\\config.bin", True)
4374 subs will be monitored

On Windows 10 1709, we actually cover 4374 out of 4458 subs.

Now let's start a notepad.exe instance and then the DIMCT tool, with notepad's PID and a delay of 120 seconds. The interface remains quite responsive but may be slowed down, especially when opening the File/Open dialog. In the end, we get a log.bin file of approximately 6 MB.

Figure 3: DIMCT running

In order to show the results in IDA, we use the parse_output function, and here are the called functions:

Python>parse_output("C:\\Users\\User\\Desktop\\log.bin")
Modules list:
notepad.exe : 00380000 - 003be000
ntdll.dll : 77c20000 - 77dae000
[...]
Unique callers:
COMDLG32.dll
urlmon.dll
gdi32full.dll
TextInputFramework.dll
msvcrt.dll
CoreUIComponents.dll
dwmapi.dll
ntdll.dll
sechost.dll
PROPSYS.dll
cfgmgr32.dll
KERNEL32.DLL
IMM32.DLL
SHLWAPI.dll
USER32.dll
MPR.dll
combase.dll
notepad.exe
uxtheme.dll
OLEAUT32.dll
profapi.dll
SHELL32.dll
RPCRT4.dll
shcore.dll
clbcatq.dll
COMCTL32.dll
windows.storage.dll
MSCTF.dll
ucrtbase.dll
CoreMessaging.dll
twinapi.appcore.dll
ADVAPI32.dll
oleacc.dll
Unique called subs:
0x100f2800 GetProcessHeap
0x100fb300 DeactivateActCtx
0x100f77b0 sub_100F77B0
[...]

Sorting the called functions:

AccessCheck
ActivateActCtx
AddAccessAllowedAce
AddRefActCtx
[...]
Wow64DisableWow64FsRedirection
Wow64RevertWow64FsRedirection
lstrcmpW
lstrcmpiW
lstrlenW
sub_100D991D
sub_100EE882
sub_100F77B0
sub_100FD090
sub_10103281
sub_1010331F

Interestingly, 6 non-exported subs have been called. For instance, sub_100F77B0 and sub_100FD090 are only referenced by CreateThreadpoolIo. Let's see who called them:

Python>whocalled("sub_100D991D", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
0x32c1f4d
Python>whocalled("sub_100EE882", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
0x32c2e2a
Python>whocalled("sub_100F77B0", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
ntdll.dll : 0x77c597c7
Python>whocalled("sub_100FD090", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
ntdll.dll : 0x77c5d087
Python>whocalled("sub_10103281", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
0x32c1c11
Python>whocalled("sub_1010331F", "C:\\Users\\user\\Desktop\\log.bin")
Unique callers:
0x32c42e0
Python>

ntdll.dll called the two thread pool callbacks. The other ones seem to have been called by JIT-compiled code, which is in fact... our own "trampolined" code (which moved several CALL instructions), and which we really should add to the whitelist.
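The caller resolution shown by whocalled can be approximated as a range lookup over the logged module list. The following Python sketch is not the script's actual implementation: the resolve helper is hypothetical, only the two module ranges printed above are included, and the end address is assumed to be exclusive.

```python
from bisect import bisect_right

# Two ranges taken from the "Modules list" output above; the real list is longer.
MODULES = [
    (0x00380000, 0x003BE000, "notepad.exe"),
    (0x77C20000, 0x77DAE000, "ntdll.dll"),
]


def resolve(address, modules=MODULES):
    """Return (module name, offset in module), or None if no module matches."""
    ordered = sorted(modules)
    starts = [start for start, _, _ in ordered]
    index = bisect_right(starts, address) - 1  # last module starting at or below address
    if index >= 0:
        start, end, name = ordered[index]
        if address < end:
            return name, address - start
    return None


print(resolve(0x77C597C7))  # caller of sub_100F77B0, inside ntdll.dll
print(resolve(0x032C1F4D))  # caller of sub_100D991D, not inside a listed module
```

Registering the trampoline area as an extra range in such a table would be one way to label those callers instead of leaving them as raw addresses.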

Conclusion

We hope this basic tool and its source code will be useful to others. We want it to remain simple, so the biggest improvements will probably be removing the "dirty" part (i.e. using shared memory and Windows mutexes, and tuning the assembly code) and adding x64 support. We may also test it against intra-module calls in the future, but we're not really confident about the performance. We'll see. Feel free to contribute!