
How to distinguish armhf (ARMv7) and armel (ARMv4) in C code?

In the executable I'm writing I have 2 implementations of the same function, one for armhf (fast) and one for armel (slow). At runtime I'd like to detect the CPU type, and call the armhf implementation if armhf was detected. How do I detect the CPU? I need something like this in C code:

int is_cpu_armhf(void) {
  ...
}

The code may contain inline assembly, but preferably it shouldn't contain a call to a library function or a system call, because it should work with multiple libraries and multiple operating systems.

I've found https://github.com/pytorch/cpuinfo/tree/master/src/arm , but it doesn't seem to use any inline assembly; instead it relies on the operating system to get the CPU information.

... I have two implementations of the same function, one for armhf (fast) and one for armel (slow). At runtime I'd like to detect the CPU type, and call the armhf implementation if armhf was detected. How do I detect the CPU? I need something like this in C code ...

As @Ruslan noted, the CPU feature registers are mostly privileged on ARM. Code running in a privileged mode can read the feature mask with an MRS instruction, but ordinary user-space code cannot. Recent Linux kernels emulate a cpuid-style read for user space on ARM, but that is only available on the most recent kernels.

At runtime you may be able to parse /proc/cpuinfo on Linux for cpu arch and features. You may also be able to call getauxval and read the bits from the auxiliary vector.
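As a rough illustration of the auxiliary-vector route: on 32-bit ARM Linux the AT_PLATFORM entry is typically a short string such as "v6l" or "v7l". The sketch below assumes a C library that ships <sys/auxv.h> with getauxval; the helper names arm_arch_from_platform and query_arm_arch are hypothetical and not part of any library.

```c
#include <ctype.h>
#include <stddef.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_PLATFORM */
#endif

/* Hypothetical helper: extract N from a platform string shaped like "vNl"
 * or "vNb". Returns 0 when the string is NULL or has another shape. */
static int arm_arch_from_platform(const char *platform)
{
    int version = 0;
    const char *p;

    if (platform == NULL || platform[0] != 'v' ||
        !isdigit((unsigned char)platform[1]))
        return 0;

    for (p = platform + 1; isdigit((unsigned char)*p); ++p)
        version = version * 10 + (*p - '0');

    return version;
}

/* Hypothetical wrapper: ask the auxiliary vector on 32-bit ARM Linux.
 * Returns 0 (unknown) on every other target. */
static int query_arm_arch(void)
{
#if defined(__linux__) && defined(__arm__)
    return arm_arch_from_platform((const char *)getauxval(AT_PLATFORM));
#else
    return 0;
#endif
}
```

A result of 0 means "unknown", not "old CPU", so a caller would still need a fallback for that case.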

What I have found that works best is:

  1. Try to read getauxval for arch and feature
  2. Use a SIGILL probe if getauxval fails

The SIGILL probe is expensive. You set up a SIGILL handler and try an ARMv5 or ARMv7 instruction. If you catch a SIGILL, you know the instruction is not available.
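The two-step order above (cheap query first, expensive probe only when the query cannot answer) might be wired together roughly as follows. This is a minimal sketch: the feature_state type, detect_feature, and the stub_* functions are hypothetical names used only to show the control flow. A real query would wrap getauxval and a real probe would execute the instruction under a SIGILL handler.

```c
#include <stddef.h>

/* Hypothetical tri-state result for the cheap query. */
typedef enum {
    FEATURE_UNKNOWN = -1,   /* query unavailable or inconclusive */
    FEATURE_ABSENT  = 0,
    FEATURE_PRESENT = 1
} feature_state;

typedef feature_state (*query_fn)(void);  /* e.g. a getauxval wrapper */
typedef int (*probe_fn)(void);            /* e.g. a SIGILL probe, nonzero = works */

/* Try the cheap query; fall back to the probe only on FEATURE_UNKNOWN. */
static int detect_feature(query_fn cheap_query, probe_fn sigill_probe)
{
    feature_state state = cheap_query ? cheap_query() : FEATURE_UNKNOWN;

    if (state != FEATURE_UNKNOWN)
        return state == FEATURE_PRESENT;

    return sigill_probe ? (sigill_probe() != 0) : 0;
}

/* Stand-in stubs so the flow can be exercised on any host. */
static int g_probe_calls = 0;

static feature_state stub_query_present(void) { return FEATURE_PRESENT; }
static feature_state stub_query_absent(void)  { return FEATURE_ABSENT; }
static feature_state stub_query_unknown(void) { return FEATURE_UNKNOWN; }
static int stub_probe_yes(void) { ++g_probe_calls; return 1; }
static int stub_probe_no(void)  { ++g_probe_calls; return 0; }
```

Because the probe is expensive, it seems reasonable to call detect_feature once at startup and cache the result in a static variable or a function pointer.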

SIGILL probes are used by Crypto++ and OpenSSL. For example, movw and movt were added at ARMv7. Here is the code Crypto++ uses to probe for ARMv7 with the movw and movt instructions. OpenSSL performs a similar probe in crypto/armcap.c.

bool CPU_ProbeARMv7()
{
    volatile bool result = true;

    volatile SigHandler oldHandler = signal(SIGILL, SigIllHandler);
    if (oldHandler == SIG_ERR)
        return false;

    volatile sigset_t oldMask;
    if (sigprocmask(0, NULLPTR, (sigset_t*)&oldMask))
        return false;

    if (setjmp(s_jmpSIGILL))
        result = false;
    else
    {
        unsigned int a;
        asm volatile (
    #if defined(__thumb__)
            ".inst.n 0xf241, 0x2034  \n\t"   // movw r0, 0x1234
            ".inst.n 0xf2c1, 0x2034  \n\t"   // movt r0, 0x1234
            "mov %0, r0              \n\t"   // mov [a], r0
    #else
            ".inst 0xe3010234  \n\t"   // movw r0, 0x1234
            ".inst 0xe3410234  \n\t"   // movt r0, 0x1234
            "mov %0, r0        \n\t"   // mov [a], r0
    #endif
            : "=r" (a) : : "r0");

        result = (a == 0x12341234);
    }

    sigprocmask(SIG_SETMASK, (sigset_t*)&oldMask, NULLPTR);
    signal(SIGILL, oldHandler);

    return result;
}

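The volatile qualifiers are required in the probes: a local variable modified between setjmp and the longjmp from the signal handler otherwise has an indeterminate value afterwards. Also see "What sense do these clobbered variable warnings make?"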
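The Crypto++ listing does not show the handler or the jump buffer it relies on. A self-contained sketch of the same mechanism, assuming a POSIX system, could look like this. It uses sigaction and sigsetjmp/siglongjmp instead of signal and setjmp, so the signal mask is restored by the jump itself. All names (run_sigill_probe, probe_sigill_handler, the stub_* candidates) are hypothetical, and raise(SIGILL) stands in for an unsupported instruction so the sketch runs on any host.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf s_probe_env;

/* Jump back into run_sigill_probe instead of returning to the faulting code. */
static void probe_sigill_handler(int signo)
{
    (void)signo;
    siglongjmp(s_probe_env, 1);
}

/* Returns 1 if candidate() ran to completion, 0 if it raised SIGILL,
 * and -1 if the handler could not be installed. */
static int run_sigill_probe(void (*candidate)(void))
{
    struct sigaction new_action, old_action;
    volatile int supported = 1;   /* volatile: written after sigsetjmp */

    new_action.sa_handler = probe_sigill_handler;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = 0;
    if (sigaction(SIGILL, &new_action, &old_action) != 0)
        return -1;

    /* savemask = 1, so siglongjmp also restores the signal mask. */
    if (sigsetjmp(s_probe_env, 1) == 0)
        candidate();
    else
        supported = 0;

    sigaction(SIGILL, &old_action, NULL);   /* put the old handler back */
    return supported;
}

/* Stand-ins for "instruction works" and "instruction traps". */
static void stub_supported(void)   { }
static void stub_unsupported(void) { raise(SIGILL); }
```

In a real probe the candidate would contain the .inst sequence for the instruction under test. The static jump buffer makes this sketch unsuitable for concurrent use from several threads.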

On Android you should use android_getCpuFamily() and android_getCpuFeatures() instead of getauxval.

The ARM folks say you should NOT parse /proc/cpuinfo. Also see ARM Blog and "Runtime Detection of CPU Features on an armv8-a CPU". (A non-paywall version is also available.)

DO NOT perform SIGILL-based feature probes on iOS devices. Apple devices trash memory. For Apple devices use something like the approach in "How to get device make and model on iOS?".

You also need to enable code paths based on compiler options. That is a whole 'nother can of worms. For that problem see Detect ARM NEON availability in the preprocessor?

For some additional source code to examine, see cpu.cpp in Crypto++. It is the place where Crypto++ does things like call getauxval, android_getCpuFamily() and android_getCpuFeatures().

The Crypto++ SIGILL probes occur in specific source files, since a source file usually needs a compiler option to enable an arch, like -march=armv7-a and -mfpu=neon for ARM. That's why ARMv7 and NEON are detected in neon_simd.cpp. (There are other similar files for i686 and x86_64, Altivec, PowerPC, and Aarch64.)


Here is what the getauxval and android_getCpuFamily() queries look like in Crypto++. CPU_QueryARMv7 is tried first. If CPU_QueryARMv7 fails, then a SIGILL feature probe is used.

inline bool CPU_QueryARMv7()
{
#if defined(__ANDROID__) && defined(__arm__)
    if (((android_getCpuFamily() & ANDROID_CPU_FAMILY_ARM) != 0) &&
        ((android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_ARMv7) != 0))
        return true;
#elif defined(__linux__) && defined(__arm__)
    if ((getauxval(AT_HWCAP) & HWCAP_ARMv7) != 0 ||
        (getauxval(AT_HWCAP) & HWCAP_NEON) != 0)
        return true;
#elif defined(__APPLE__) && defined(__arm__)
    // Apple hardware is ARMv7 or above.
    return true;
#endif
    return false;
}
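The question asks for is_cpu_armhf() rather than a general ARMv7 test. Assuming "armhf" means the common Linux distribution baseline of ARMv7 with VFPv3 hardware floating point (an assumption, since the question does not define it), a hedged sketch could test the VFPv3 bits of AT_HWCAP. The SKETCH_HWCAP_* names are hypothetical local constants. Their values are meant to mirror the 32-bit ARM Linux hwcap bits and should be checked against <asm/hwcap.h> on the target.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#endif

/* Hypothetical local copies of the 32-bit ARM Linux AT_HWCAP bits.
 * Verify against <asm/hwcap.h> before relying on them. */
#define SKETCH_HWCAP_VFP      (1UL << 6)
#define SKETCH_HWCAP_NEON     (1UL << 12)
#define SKETCH_HWCAP_VFPV3    (1UL << 13)
#define SKETCH_HWCAP_VFPV3D16 (1UL << 14)

/* Assumption: "armhf" is treated as "has VFPv3 in some form". */
static int hwcap_meets_armhf_baseline(unsigned long hwcap)
{
    return (hwcap & (SKETCH_HWCAP_VFPV3 | SKETCH_HWCAP_VFPV3D16)) != 0;
}

/* Returns 1 when the baseline is reported, 0 when it is absent or when
 * the platform offers no auxiliary vector to ask. */
static int is_cpu_armhf(void)
{
#if defined(__linux__) && defined(__arm__)
    return hwcap_meets_armhf_baseline(getauxval(AT_HWCAP));
#else
    return 0;
#endif
}
```

This depends on getauxval, so it does not meet the question's wish to avoid library calls. On systems without an auxiliary vector, the SIGILL probe remains the fallback.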

The ARM instructions for movw and movt were disassembled from the following source code:

int a;
asm volatile("movw %0,%1 \n"
             "movt %0,%1 \n"
             : "=r"(a) : "i"(0x1234));

00000010 <_Z5test2v>:  // ARM
  10:   e3010234        movw    r0, #4660       ; 0x1234
  14:   e3410234        movt    r0, #4660       ; 0x1234
  18:   e12fff1e        bx      lr

0000001c <_Z5test3v>:  // Thumb
  1c:   f241 2034       movw    r0, #4660       ; 0x1234
  20:   f2c1 2034       movt    r0, #4660       ; 0x1234
  24:   4770            bx      lr

Here is what reading a feature register with MRS looks like. It is very similar to getting the cpuid bitmask on x86. The code below can be used to get Crypto features for Aarch64, but it requires privileged execution.

The code requires Exception Level 1 (EL1) or above, but user space runs at EL0, even for root. Attempting to run the code from userland results in a SIGILL and termination.

#include <stdint.h>

#if defined(__arm64__) || defined(__aarch64__)
  uint64_t caps = 0;  // Read ID_AA64ISAR0_EL1 (needs EL1 or above)
  __asm __volatile("mrs %0, id_aa64isar0_el1" : "=r" (caps));
#elif defined(__arm__) || defined(__aarch32__)
  uint32_t caps = 0;  // Read ID_ISAR5 through CP15; AArch32 has no MRS form for it
  __asm __volatile("mrc p15, 0, %0, c0, c2, 5" : "=r" (caps));
#endif
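Once a value of ID_AA64ISAR0_EL1 is in hand, the crypto features are encoded as 4-bit fields. The following sketch decodes a few of them on any host, using the field positions as I understand them from the architecture manual: AES at bits 7:4, SHA1 at 11:8, SHA2 at 15:12, CRC32 at 19:16. The function and macro names are hypothetical.

```c
#include <stdint.h>

/* Field offsets within ID_AA64ISAR0_EL1 (hypothetical macro names). */
#define ISAR0_AES_SHIFT    4
#define ISAR0_SHA1_SHIFT   8
#define ISAR0_SHA2_SHIFT   12
#define ISAR0_CRC32_SHIFT  16

/* Extract one 4-bit ID field starting at the given bit offset. */
static unsigned id_reg_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xFu);
}

/* AES field: 1 means AESE/AESD/AESMC/AESIMC, 2 additionally means PMULL. */
static int isar0_has_aes(uint64_t isar0)
{
    return id_reg_field(isar0, ISAR0_AES_SHIFT) >= 1;
}

static int isar0_has_pmull(uint64_t isar0)
{
    return id_reg_field(isar0, ISAR0_AES_SHIFT) >= 2;
}

static int isar0_has_sha1(uint64_t isar0)
{
    return id_reg_field(isar0, ISAR0_SHA1_SHIFT) >= 1;
}

static int isar0_has_sha2(uint64_t isar0)
{
    return id_reg_field(isar0, ISAR0_SHA2_SHIFT) >= 1;
}

static int isar0_has_crc32(uint64_t isar0)
{
    return id_reg_field(isar0, ISAR0_CRC32_SHIFT) >= 1;
}
```

For example, a register value of 0x11120 decodes as AES with PMULL, SHA1, SHA2 and CRC32 all present.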

The benefit of emitting the instruction encodings yourself is that the source file does not need arch options when it is compiled:

    unsigned int a;
    asm volatile (
#if defined(__thumb__)
        ".inst.n 0xf241, 0x2034  \n\t"   // movw r0, 0x1234
        ".inst.n 0xf2c1, 0x2034  \n\t"   // movt r0, 0x1234
        "mov %0, r0              \n\t"   // mov [a], r0
#else
        ".inst 0xe3010234  \n\t"   // movw r0, 0x1234
        ".inst 0xe3410234  \n\t"   // movt r0, 0x1234
        "mov %0, r0        \n\t"   // mov [a], r0
#endif
        : "=r" (a) : : "r0");

You can compile the above code without arch options:

gcc -c cpu-test.c -o cpu-test.o

If you were to use the movw and movt mnemonics directly:

int a;
asm volatile("movw %0,%1 \n"
             "movt %0,%1 \n"
             : "=r"(a) : "i"(0x1234));

then your compiler would need to support ARMv7, and you would need to use the arch option:

gcc -c -march=armv7 cpu-test.c -o cpu-test.o

And GCC could use ARMv7 throughout the source file, which could cause a SIGILL outside your protected code.

I've experienced Clang using the wrong instruction set on x86. See Crypto++ Issue 751. GCC will surely follow. In the Clang case, I needed to compile a source file with -mavx so I could use AVX intrinsics. Clang generated AVX code outside my protected block, and it crashed on an old Core2 Duo machine. (The unsafe code Clang generated was the initialization of a std::string.)

In the case of ARM the problem is that you need -march=armv7 to enable the ISA with movw and movt, and the compiler then thinks it can use that ISA, too. It is a design bug in the compiler: the user's arch and the compiler's arch are conflated. In reality, because of the compiler design, you need a user arch and a separate compiler arch.
