
IRQL_UNEXPECTED_VALUE BSOD after NdisFIndicateReceiveNetBufferLists?

We have an NDIS LWF driver, and on a very small number of systems we get an IRQL_UNEXPECTED_VALUE BSOD in NdisFIndicateReceiveNetBufferLists. We do not raise or lower the IRQL anywhere in our code, and NdisFIndicateReceiveNetBufferLists is called from the IRP_MJ_DEVICE_CONTROL callback. We also check the IRQL: if it is DISPATCH_LEVEL, we set the last argument to NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL, and to 0 otherwise. Could this be the issue?

I also found this article:

https://knowledge.broadcom.com/external/article/164146/crash-with-bug-check-0xc8-after-installi.html

They had a similar issue, and the cause seems to have been another NDIS driver raising the IRQL to DISPATCH_LEVEL and forgetting to lower it. I'm still not sure whether this applies to our case. Could this also be our issue?

IRQL_UNEXPECTED_VALUE (c8)
The processor's IRQL is not what it should be at this time.  This is
usually caused by a lower level routine changing IRQL for some period
and not restoring IRQL at the end of that period (eg acquires spinlock
but doesn't release it).
Arguments:
Arg1: 0000000000020002, (Current IRQL << 16) | (Expected IRQL << 8) | UniqueValue
Arg2: fffff82621a444f0, Depends on UniqueValue:
    If UniqueValue is 0 or 1: APC->KernelRoutine.
    If UniqueValue is 2: the callout routine
    If UniqueValue is 3: the interrupt's ServiceRoutine
    If UniqueValue is 0xfe: 1 iff APCs are disabled
Arg3: ffff950cf4dccff0, Depends on UniqueValue:
    If UniqueValue is 0 or 1: APC
    If UniqueValue is 2: the callout's parameter
    If UniqueValue is 3: KINTERRUPT
Arg4: 0000000000000000, Depends on UniqueValue:
    If UniqueValue is 0 or 1: APC->NormalRoutine

Call stack:

nt!KeBugCheckEx
nt!KeExpandKernelStackAndCalloutInternal
nt!KeExpandKernelStackAndCalloutEx
ndis!ndisInvokeNextReceiveHandler
ndis!ndisFilterIndicateReceiveNetBufferLists
ndis!NdisFIndicateReceiveNetBufferLists
OurNdis

And the second parameter, which is the callout routine (based on the unique value), is ndis!ndisDataPathExpandStackCallback.

Edit1:

I did a little more digging, and it seems that ndisDataPathExpandStackCallback just calls ndisCallReceiveHandler (which doesn't appear on the stack). I assume this indicates the received NBL to the other NDIS drivers? Anyway, ndisDataPathExpandStackCallback is called via KeExpandKernelStackAndCalloutInternal, and the latter stores the IRQL before the call and checks it again afterwards. If the two values don't match, it raises this bugcheck. Bingo!

But now my question is: how can I find the faulty driver? Can I somehow use the ndiskd extension to work out which NDIS driver was running inside the KeExpandKernelStackAndCalloutInternal call, so I can identify the faulty driver and prove it?

By investigating the stack, I did find pacer!PcFilterReceiveNetBufferLists, but I doubt this is the faulty driver considering it's a Windows driver, right?

They had a similar issue, and the cause seems to have been another NDIS driver raising the IRQL to DISPATCH_LEVEL and forgetting to lower it. I'm still not sure whether this applies to our case. Could this also be our issue?

That particular bugcheck means that someone leaked the IRQL during the code that has already unwound off the stack. KeExpandKernelStackAndCalloutInternal is doing something like this:

oldIrql = KeGetCurrentIrql();
(*callback)(...);
newIrql = KeGetCurrentIrql();

if (oldIrql != newIrql) {
    KeBugCheckEx(IRQL_UNEXPECTED_VALUE, (newIrql << 16) | (oldIrql << 8) | 2, ...);
}

Decoding the first argument, that means the IRQL was PASSIVE_LEVEL on entry, and DISPATCH_LEVEL on exit.
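To make that decoding concrete, here is a small user-mode sketch that splits Arg1 into its three fields using the layout printed in the bugcheck text. The struct and function names are hypothetical, and treating each field as 8 bits wide is my assumption; only the shift amounts come from the bugcheck description.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stdint.h>

/* Layout from the bugcheck text:
 * Arg1 = (Current IRQL << 16) | (Expected IRQL << 8) | UniqueValue
 * Assumption: each field fits in 8 bits. */
typedef struct {
    unsigned current_irql;   /* IRQL observed after the callout returned */
    unsigned expected_irql;  /* IRQL recorded before the callout ran */
    unsigned unique_value;   /* selects how Arg2..Arg4 are interpreted */
} C8Arg1;                    /* hypothetical name */

static C8Arg1 decode_c8_arg1(uint64_t arg1)
{
    C8Arg1 d;
    d.current_irql  = (unsigned)((arg1 >> 16) & 0xFF);
    d.expected_irql = (unsigned)((arg1 >> 8) & 0xFF);
    d.unique_value  = (unsigned)(arg1 & 0xFF);
    return d;
}
```

For the dump in the question, decode_c8_arg1(0x20002) gives current_irql 2 (DISPATCH_LEVEL), expected_irql 0 (PASSIVE_LEVEL) and unique_value 2 (the callout case).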

Unfortunately, the code that did this has already finished running. This bugcheck only tells you that someone didn't clean up the place before they left the room. You can make an educated guess as to what code was likely running by looking at the filter driver stack in !ndiskd.miniport. But this only gives you a starting place: depending on what packets were coming in from the network, the network stack could have called out into a variety of drivers. E.g., if the network indicated up an SMB3 packet, then execution actually winds its way all the way up through the filesystem stack. So it's not particularly easy to list out all the possible drivers that could have run.

One thing to check, though, is that you are using the NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL flag correctly. You are only allowed to set the flag if you are certain that the IRQL is currently DISPATCH_LEVEL. If that flag is used incorrectly, you might be able to trick some other driver into mismatching the IRQL. For example, a hypothetical driver might have:

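// Hypothetical buggy filter: it lowers the IRQL only when the flag is clear.
void FilterReceiveNbls(..., ULONG ReceiveFlags) {
    KIRQL oldIrql;
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);

    // . . . do stuff at dispatch level . . .

    // Bug: if the flag is set while the caller was really below DISPATCH_LEVEL,
    // the IRQL is never lowered and leaks out of this function.
    if (0 == (ReceiveFlags & NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL)) {
        KeLowerIrql(oldIrql);
    }
}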

I'm not saying with certainty that's exactly what happened. I'm just looking for things you can audit in your driver, and correct use of NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL is one of them. Note that it's always correct to not add this flag to ReceiveFlags. (In fact, it's even correct to just clear the flag if you see someone else set it; the flag's only benefit is a very tiny perf optimization.) So if you're ever in doubt, just don't add the flag.
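If it helps to see how a wrongly set flag could turn into exactly this bugcheck, here is a user-mode model of the interaction. It is only a sketch: the IRQL is a plain global variable, every MODEL_ and model_ name is a hypothetical stand-in (nothing here is the real NDIS or kernel implementation), and the filter is the hypothetical buggy one from above, not any specific real driver.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stdint.h>

/* User-mode stand-ins so the sketch compiles without the WDK. */
typedef unsigned char MODEL_KIRQL;
#define MODEL_PASSIVE_LEVEL  0
#define MODEL_DISPATCH_LEVEL 2
#define MODEL_RECEIVE_FLAGS_DISPATCH_LEVEL 0x1u  /* stand-in for the NDIS flag */

static MODEL_KIRQL g_irql = MODEL_PASSIVE_LEVEL; /* simulated processor IRQL */

/* The hypothetical buggy filter: always raises, lowers only if the flag is clear. */
static void model_filter_receive(unsigned receive_flags)
{
    MODEL_KIRQL old_irql = g_irql;
    g_irql = MODEL_DISPATCH_LEVEL;               /* models raising the IRQL */
    if ((receive_flags & MODEL_RECEIVE_FLAGS_DISPATCH_LEVEL) == 0) {
        g_irql = old_irql;                       /* models lowering the IRQL */
    }
}

/* Models the before/after comparison around the callout.
 * Returns 0 if the IRQL was restored, otherwise an Arg1-style value. */
static uint32_t model_callout_check(MODEL_KIRQL entry_irql, unsigned receive_flags)
{
    g_irql = entry_irql;
    MODEL_KIRQL old_irql = g_irql;
    model_filter_receive(receive_flags);
    MODEL_KIRQL new_irql = g_irql;
    if (old_irql != new_irql) {
        return ((uint32_t)new_irql << 16) | ((uint32_t)old_irql << 8) | 2u;
    }
    return 0;
}
```

In this model, entering at PASSIVE_LEVEL with the flag wrongly set yields 0x20002, the same Arg1 as in the question, while passing 0 for the flags is safe at either IRQL. That is the reason leaving the flag out is the conservative choice.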

Windows 11 can strictly verify this flag if you enable Driver Verifier (DV) with the NDIS/WIFI option enabled. The easiest thing to do is to enable DV on all drivers, but if that runs too slow, you can just select each individual network driver. On Windows 11, when DV is enabled with the NDIS/WIFI option, if any driver misuses any NDIS_XXX_DISPATCH_LEVEL flag, you'll get an instant bugcheck at the site of the error.

(DV does not currently verify that the driver returns the IRQL to its original level -- that's a good idea for the future, though.)
