Returning from exe entry point does not terminate the process on Windows 10

My attempt

I created a minimal, CRT-free, dependency-free executable with Microsoft Visual Studio by specifying the /GS- compiler flag and the /NODEFAULTLIB linker flag, and by naming the entry-point function mainCRTStartup. The application does not create additional threads and returns from mainCRTStartup in under 5 seconds, but the process takes 30 seconds in total to terminate.

Problem description

In my experience, if an application running on Windows 10 depends only on the dynamic libraries that are loaded by default into every Windows process, namely ntdll.dll, KernelBase.dll and kernel32.dll, the process exits normally when the main thread returns from the mainCRTStartup function.

If other libraries are loaded, statically or dynamically (e.g. by calling LoadLibraryW), returning from the main function leaves the process alive: for 30 seconds when run normally, and indefinitely when run under a debugger.

Context

On process creation, the Windows 10 process loader creates additional threads to load dynamic libraries faster.

Cylance mentions in Windows 10 Parallel Loading Breakdown:

The worker thread idle timeout is set to 30 seconds. Programs which execute in less than 30 seconds will appear to hang due to ntdll!TppWorkerThread waiting for the idle timeout before the process terminates.

Microsoft mentions in Terminating a Process: How Processes are Terminated:

Note that some implementation of the C run-time library (CRT) call ExitProcess if the primary thread of the process returns.

On the other hand, Microsoft mentions in ExitProcess:

Note that returning from the main function of an application results in a call to ExitProcess.

Test code

This is the minimal test code I worked with. I used kernel32!CloseHandle and user32!CloseWindow as examples; the calls to them do not actually do anything:

#include <cstdint>

namespace windows {
    typedef const intptr_t Handle;
    typedef const void *   Module;

    constexpr Handle InvalidHandleValue = -1;

    namespace kernel32 {
        extern "C" uint32_t __stdcall CloseHandle(Handle);
        extern "C" uint32_t __stdcall FreeLibrary(Module);
        extern "C" Module   __stdcall LoadLibraryW(const wchar_t *);
    }

    namespace user32 {
        extern "C" uint32_t __stdcall CloseWindow(Handle);
    }
}

int mainCRTStartup() {
    // 0 seconds
    // windows::kernel32::CloseHandle(windows::InvalidHandleValue);

    // 30 seconds
    // windows::user32::CloseWindow(windows::InvalidHandleValue);

    // 0 seconds
    // windows::kernel32::FreeLibrary(windows::kernel32::LoadLibraryW(L"kernel32.dll"));

    // 30 seconds
    // windows::kernel32::FreeLibrary(windows::kernel32::LoadLibraryW(L"user32.dll"));

    // 0 seconds
    // windows::kernel32::FreeLibrary(windows::kernel32::LoadLibraryW(L""));

    return 0;
}

Debugging

Uncommenting one of the WinAPI calls in the mainCRTStartup function results in the execution time noted above that call.
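For anyone who wants to reproduce these timings, here is a minimal sketch of a separate measuring program. It is not part of the test executable and it does use the standard library. The names seconds_elapsed and seconds_until_exit are hypothetical, and the sketch assumes the test program is a console-subsystem executable, so that std::system does not return before the child process has exited.

```cpp
#include <chrono>
#include <cstdlib>
#include <utility>

// Hypothetical helper: runs `action` once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename Action>
double seconds_elapsed(Action&& action) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Action>(action)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical wrapper: starts `command` through the system command
// processor and measures how long it takes until std::system returns.
// Assumption: the command processor waits for the child process to exit.
double seconds_until_exit(const char* command) {
    return seconds_elapsed([command] {
        const int status = std::system(command);
        (void)status;  // only the duration is of interest here
    });
}
```

Calling seconds_until_exit("test.exe"), where test.exe stands for whatever the test binary is named, should then report roughly the 0 or 30 seconds described above, depending on which call is uncommented.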

This is the execution flow of the program traced in a debugger in pseudo C++:

ntdll.RtlUserThreadStart() {
    kernel32.BaseThreadInitThunk() {
        const auto return_code = test.mainCRTStartup();

        ntdll.RtlExitUserThread(return_code) {
            if (ntdll.NtQueryInformationThread(CURRENT_THREAD, ThreadAmILastThread) != STATUS_SUCCESS || !AmILastThread) {
                // Bad path - for `30 seconds`.

                ntdll.LdrShutdownThread();
                ntdll.TpCheckTerminateWorker(0);
                ntdll.NtTerminateThread(0, return_code);

                // The thread execution does not return from `NtTerminateThread`, but the process still runs.
            } else {
                // Good path - for `0 seconds`.

                ntdll.RtlExitUserProcess(return_code) {
                    ntdll.EtwpShutdownPrivateLoggers();
                    ntdll.LdrpDrainWorkQueue(0);
                    ntdll.LdrpAcquireLoaderLock();
                    ntdll.RtlEnterCriticalSection(ntdll.FastPebLock);
                    ntdll.RtlLockHeap(peb.ProcessHeap);
                    ntdll.NtTerminateProcess(0, return_code);
                    ntdll.RtlUnlockProcessHeapOnProcessTerminate();
                    ntdll.RtlLeaveCriticalSection(ntdll.FastPebLock);
                    ntdll.RtlReportSilentProcessExit(CURRENT_PROCESS, return_code);
                    ntdll.LdrShutdownProcess();
                    ntdll.NtTerminateProcess(CURRENT_PROCESS, return_code);

                    // The thread execution does not return from `NtTerminateProcess` and the process is terminated.
                }
            }
        }
    }
}
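The branch at the top of RtlExitUserThread in the trace above can be restated as a small pure function. This is only a model of the traced condition, not ntdll code; ExitAction, StatusSuccess and choose_exit_action are hypothetical names introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical names, used only to model the branch seen in the trace.
enum class ExitAction {
    TerminateThreadOnly,  // "bad path": LdrShutdownThread + NtTerminateThread
    ExitProcess           // "good path": RtlExitUserProcess
};

constexpr std::int32_t StatusSuccess = 0;  // numeric value of STATUS_SUCCESS

// Mirrors the traced condition: the process is only torn down when the
// ThreadAmILastThread query succeeds AND reports that no other thread exists.
constexpr ExitAction choose_exit_action(std::int32_t query_status,
                                        bool am_i_last_thread) {
    return (query_status != StatusSuccess || !am_i_last_thread)
               ? ExitAction::TerminateThreadOnly
               : ExitAction::ExitProcess;
}

// A loader worker thread that is still alive makes AmILastThread false,
// so the returning main thread terminates only itself.
static_assert(choose_exit_action(StatusSuccess, false) ==
                  ExitAction::TerminateThreadOnly,
              "other threads alive: only the calling thread ends");
static_assert(choose_exit_action(StatusSuccess, true) ==
                  ExitAction::ExitProcess,
              "last thread: the whole process exits");
```

Read this way, the 30-second cases are the ones where a worker thread is still alive when the main thread returns, so the process stays around until that worker goes away.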

Expected results

I expected the process to terminate if it does not create additional threads and returns from the main function.

Calling ExitProcess at the end of the main function terminates the process immediately, even when a WinAPI call that previously caused the 30-second delay is made. Using this API is not always possible, though, because the problematic application might not be mine but a third-party application (for example from Microsoft), as in: Why would a process hang within RtlExitUserProcess/LdrpDrainWorkQueue?

It seems to me that the Windows 10 process loader is broken if even Microsoft's own processes behave incorrectly.

  1. Is there a clean solution to this problem?
  2. What are those loader threads needed for once the last user-created thread has exited? AFAIK it is impossible at that point to load any other libraries.

I expected the process to terminate if it does not create additional threads and returns from the main function.

A process can create additional threads implicitly; the loader does, for example. It is also necessary to understand what

returns from the main function

means here: it refers to the function that the standard CRT's mainCRTStartup calls. After that function returns, mainCRTStartup calls ExitProcess. So this is not the real entry point of the exe, but a sub-function called from the entry point, and the entry point then calls ExitProcess.

If we do not use the CRT, we need to call ExitProcess ourselves. If we simply return from the entry point, RtlExitUserThread is called, which does not call ExitProcess unless this is the last thread in the process (AmILastThread). There can also be a race here if two or more threads call ExitThread in parallel.
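A minimal sketch of that advice, written in the style of the question's test code: the entry point calls ExitProcess itself instead of relying on the return. The function run_application is a hypothetical placeholder for the real work, and the _WIN32 guard is an addition that only keeps the sketch compilable on other platforms, where the plain return is all that remains.

```cpp
#include <cstdint>

namespace windows {
    namespace kernel32 {
#ifdef _WIN32
        // Declared by hand, as in the question's CRT-free test code.
        extern "C" void __stdcall ExitProcess(std::uint32_t);
#endif
    }
}

// Hypothetical placeholder for the application's actual work.
int run_application() {
    return 0;
}

// CRT-free entry point: terminate the process explicitly so that the exit
// does not depend on this being the last thread in the process.
int mainCRTStartup() {
    const int exit_code = run_application();
#ifdef _WIN32
    windows::kernel32::ExitProcess(static_cast<std::uint32_t>(exit_code));
#endif
    return exit_code;  // not reached on Windows
}
```

Because the build uses /NODEFAULTLIB, kernel32.lib presumably has to be passed to the linker explicitly for ExitProcess to resolve, just as for the other kernel32 functions in the test code.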
