Override hwcap_2 in mapfile on Solaris x86 platforms

We have a library that guards its code paths at runtime: if a CPU feature is available, a faster code path is taken. We are trying to add an AVX2 code path on Solaris 11.3.
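For readers unfamiliar with the pattern, here is a minimal sketch of what such a runtime guard might look like. The function names (`xor_generic`, `xor_avx2`, `select_xor`) are hypothetical and not taken from the library; the only value borrowed from Solaris is the `AV2_386_AVX2` bit from `<sys/auxv_386.h>`. The "AVX2" kernel is a plain C stand-in so the sketch compiles anywhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit value copied from Solaris <sys/auxv_386.h> (second capability word). */
#define AV2_386_AVX2 0x00040u

typedef void (*xor_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Hypothetical baseline kernel: plain C, runs on any x86 CPU. */
void xor_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/*
 * Hypothetical stand-in for an AVX2 kernel. A real one would live in its
 * own source file compiled with -xarch=avx2; here it just forwards to the
 * baseline so the example stays portable.
 */
void xor_avx2(uint8_t *dst, const uint8_t *src, size_t n)
{
    xor_generic(dst, src, n);
}

/* Choose an implementation from the second hardware capability word. */
xor_fn select_xor(uint32_t hw2)
{
    if (hw2 & AV2_386_AVX2)
        return xor_avx2;
    return xor_generic;
}
```

On Solaris the capability words would typically come from `getisax(2)`, for example `uint32_t v[2] = {0, 0}; getisax(v, 2);` followed by `select_xor(v[1])`, assuming the second word carries the `AT_SUN_CAP_HW2` flags. That call is left out so the sketch is not tied to one platform.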

On an old, downlevel machine without AVX2 the program fails to start:

$ ./cryptest.exe v
ld.so.1: cryptest.exe: fatal: cryptest.exe: hardware capability (CA_SUNW_HW_2) unsupported: 0x40  [ AVX2 ]
Killed
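The mask in the loader message can be decoded by hand against the `AV2_386_*` values in `<sys/auxv_386.h>`: `0x40` is `AV2_386_AVX2`. A small sketch of that decoding follows; the helper name `decode_hw2` and the table layout are illustrative assumptions, and only the bit values and names come from the header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct cap_name {
    uint32_t bit;
    const char *name;
};

/* Bit values and names copied from the AV2_386_* block of <sys/auxv_386.h>. */
static const struct cap_name hw2_names[] = {
    { 0x00001, "RDRAND" },   { 0x00002, "FMA" },  { 0x00004, "F16C" },
    { 0x00008, "AMD_TBM" },  { 0x00010, "BMI1" }, { 0x00020, "FSGSBASE" },
    { 0x00040, "AVX2" },     { 0x00080, "BMI2" }, { 0x00100, "HLE" },
    { 0x00200, "RTM" },      { 0x00400, "EFS" },  { 0x00800, "RDSEED" },
    { 0x01000, "ADX" },      { 0x02000, "PRFCHW" },
};

/*
 * Hypothetical helper: write the space-separated names of the bits set in
 * `mask` into `buf` and return `buf`. Bits not in the table are ignored;
 * a name that does not fit is dropped and decoding stops.
 */
char *decode_hw2(uint32_t mask, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof hw2_names / sizeof hw2_names[0]; i++) {
        if (!(mask & hw2_names[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", hw2_names[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* discard the partially written name */
            break;
        }
        used += (size_t)n;
    }
    return buf;
}
```

Calling `decode_hw2(0x40, buf, sizeof buf)` fills `buf` with `AVX2`, matching the `[ AVX2 ]` annotation printed by `ld.so.1`.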

Because features are detected at runtime, we use a mapfile to clear the capabilities recorded in the executable. It has worked well for CA_SUNW_HW_1 and AESNI, CLMUL, SSE4.2, SSE4.1 and SSE3:

$ cat cryptopp.mapfile
hwcap_1 = SSE SSE2 OVERRIDE;

We need to clear the capabilities for hwcap_2 as well. According to Sun's Mapfile Directives, we should be able to use an empty assignment to clear them:

If the “=” operator is used, the value specified replaces the previous value, and exclude is reset to 0. In addition, the use of “=” overrides any capabilities that are collected from input file processing.

And then later in the document:

To completely eliminate a given capability from the output object, it suffices to use the “=” operator and an empty value list...

So we added an empty hwcap_2 to eliminate the capability:

$ cat cryptopp.mapfile
hwcap_1 = SSE SSE2 OVERRIDE;
hwcap_2 = ;

But it results in the same runtime error.

We found one bug report, Disable hwcaps on libgfortran, but it offers an Autotools workaround rather than a mapfile fix.

How do we clear AVX and AVX2 capabilities in a mapfile on Solaris x86?


Setting hwcap_2 = 0; results in the following at link time:

ld: fatal: cryptopp.mapfile: 4: unknown segment attribute: 0
make: *** [GNUmakefile:1084: cryptest.exe] Error 2

We can't use hwcap_2 = SSE SSE2 because SSE and SSE2 from hwcap_1 collide with AV2_386_RDSEED and AV2_386_ADX from hwcap_2.
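The collision is visible in the numeric values in `<sys/auxv_386.h>`: `AV_386_SSE` and `AV2_386_RDSEED` are both `0x00800`, and `AV_386_SSE2` and `AV2_386_ADX` are both `0x01000`. The sketch below only illustrates that overlap. It does not model how `ld` parses mapfile tokens, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Values copied from <sys/auxv_386.h>. */
#define AV_386_SSE      0x00800u   /* first capability word  */
#define AV_386_SSE2     0x01000u   /* first capability word  */
#define AV2_386_RDSEED  0x00800u   /* second capability word */
#define AV2_386_ADX     0x01000u   /* second capability word */

/*
 * Hypothetical helper: returns nonzero if a first-word mask, reused
 * verbatim as a second-word mask, would assert RDSEED or ADX.
 */
int hw1_mask_aliases_hw2(uint32_t hw1_mask)
{
    return (hw1_mask & (AV2_386_RDSEED | AV2_386_ADX)) != 0;
}
```

Passing `AV_386_SSE | AV_386_SSE2` returns nonzero, which is why reusing the hwcap_1 values for hwcap_2 would request RDSEED and ADX instead of clearing anything.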


Here is the full link command using the mapfile:

$ CXX=/opt/solarisstudio12.4/bin/CC make
/opt/solarisstudio12.4/bin/CC -o cryptest.exe -DNDEBUG -g -xO3 -template=no%extdef adhoc.o test.o bench1.o bench2.o bench3.o datatest.o dlltest.o fipsalgt.o validat0.o validat1.o validat2.o validat3.o validat4.o validat5.o validat6.o validat7.o validat8.o validat9.o validat10.o regtest1.o regtest2.o regtest3.o regtest4.o ./libcryptopp.a -xarch=sse2 -xarch=ssse3 -xarch=sse4_1 -xarch=sse4_2 -xarch=aes -xarch=avx -xarch=avx2 -M cryptopp.mapfile -lnsl -lsocket
$

The link command includes the -xarch options (whose capabilities the mapfile then removes) because the manual says the link command must include all -xarch options. So omitting them is not an option.


And here is <sys/auxv_386.h>:

$ cat /usr/include/sys/auxv_386.h
/*
 * Copyright (c) 2004, 2015, Oracle and/or its affiliates. All rights reserved.
 */

#ifndef _SYS_AUXV_386_H
#define _SYS_AUXV_386_H

#ifdef __cplusplus
extern "C" {
#endif

/*
 * Flags used in AT_SUN_CAP_HW* elements to describe various userland
 * instruction set extensions available on different processors.
 * The basic assumption is that of the i386 ABI; that is, i386 plus i387
 * floating point.
 *
 * Note that if a given bit is set; the implication is that the kernel
 * provides all the underlying architectural support for the correct
 * functioning of the extended instruction(s).
 */
#define AV_386_FPU              0x00001 /* x87-style floating point */
#define AV_386_TSC              0x00002 /* rdtsc insn */
#define AV_386_CX8              0x00004 /* cmpxchg8b insn */
#define AV_386_SEP              0x00008 /* sysenter and sysexit */
#define AV_386_AMD_SYSC         0x00010 /* AMD's syscall and sysret */
#define AV_386_CMOV             0x00020 /* conditional move insns */
#define AV_386_MMX              0x00040 /* MMX insns */
#define AV_386_AMD_MMX          0x00080 /* AMD's MMX insns */
#define AV_386_AMD_3DNow        0x00100 /* AMD's 3Dnow! insns */
#define AV_386_AMD_3DNowx       0x00200 /* AMD's 3Dnow! extended insns */
#define AV_386_FXSR             0x00400 /* fxsave and fxrstor */
#define AV_386_SSE              0x00800 /* SSE insns and regs */
#define AV_386_SSE2             0x01000 /* SSE2 insns and regs */
                                        /* 0x02000 withdrawn - do not assign */
#define AV_386_SSE3             0x04000 /* SSE3 insns and regs */
                                        /* 0x08000 withdrawn - do not assign */
#define AV_386_CX16             0x10000 /* cmpxchg16b insn */
#define AV_386_AHF              0x20000 /* lahf/sahf insns */
#define AV_386_TSCP             0x40000 /* rdtscp instruction */
#define AV_386_AMD_SSE4A        0x80000 /* AMD's SSE4A insns */
#define AV_386_POPCNT           0x100000 /* POPCNT insn */
#define AV_386_AMD_LZCNT        0x200000 /* AMD's LZCNT insn */
#define AV_386_SSSE3            0x400000 /* Intel SSSE3 insns */
#define AV_386_SSE4_1           0x800000 /* Intel SSE4.1 insns */
#define AV_386_SSE4_2           0x1000000 /* Intel SSE4.2 insns */
#define AV_386_MOVBE            0x2000000 /* Intel MOVBE insns */
#define AV_386_AES              0x4000000 /* Intel AES insns */
#define AV_386_PCLMULQDQ        0x8000000 /* Intel PCLMULQDQ insn */
#define AV_386_XSAVE            0x10000000 /* Intel XSAVE/XRSTOR insns */
#define AV_386_AVX              0x20000000 /* Intel AVX insns */
#define AV_386_AMD_XOP          0x40000000 /* AMD XOP insns */
#define AV_386_AMD_FMA4         0x80000000 /* AMD FMA4 insns */

#define FMT_AV_386_HW1                                                  \
        "\20"                                                           \
        "\40amd_fma4\37amd_xop"                                         \
        "\36avx\35xsave"                                                \
        "\34pclmulqdq\33aes"                                            \
        "\32movbe\31sse4.2"                                             \
        "\30sse4.1\27ssse3\26amd_lzcnt\25popcnt"                        \
        "\24amd_sse4a\23tscp\22ahf\21cx16"                              \
        "\17sse3\15sse2\14sse\13fxsr\12amd3dx\11amd3d"  \
        "\10amdmmx\7mmx\6cmov\5amdsysc\4sep\3cx8\2tsc\1fpu"

#define FMT_AV_386_HW2                                                  \
        "\20"                                                           \
        "\16prfchw\15adx\14rdseed\13efs\12rtm\11hle\10bmi2\7avx2"       \
        "\6fsgsbase\5bmi1\4amd_tbm\3f16c\2fma\1rdrand"

/*
 * Flags used in AT_SUN_CAP_HW2 elements.
 */
#define AV2_386_RDRAND          0x00001 /* Intel RDRAND insns */
#define AV2_386_FMA             0x00002 /* Intel FMA insn */
#define AV2_386_F16C            0x00004 /* IEEE half precn(float) insn */
#define AV2_386_AMD_TBM         0x00008 /* AMD TBM insn */
#define AV2_386_BMI1            0x00010 /* Intel BMI1 insn */
#define AV2_386_FSGSBASE        0x00020 /* Intel RD/WR FS/GSBASE insn */
#define AV2_386_AVX2            0x00040 /* Intel AVX2 insns */
#define AV2_386_BMI2            0x00080 /* Intel BMI2 insns */
#define AV2_386_HLE             0x00100 /* Intel HLE insns */
#define AV2_386_RTM             0x00200 /* Intel RTM insns */
#define AV2_386_EFS             0x00400 /* Intel Enhanced Fast String */
#define AV2_386_RDSEED          0x00800 /* Intel RDSEED insn */
#define AV2_386_ADX             0x01000 /* Intel ADX insns */
#define AV2_386_PRFCHW          0x02000 /* Intel PREFETCHW hint */

#ifdef __cplusplus
}
#endif

#endif  /* !_SYS_AUXV_386_H */

It looks to me like your mapfile isn't complete. The example from your link to the Oracle Solaris 11.1 Linkers and Libraries Guide looks like this:

To completely eliminate a given capability from the output object, it suffices to use the “=” operator and an empty value list. For example, the following suppresses any hardware capabilities contributed by the input objects:

$mapfile_version 2
CAPABILITY {
    HW = ;
};

But your mapfile is:

hwcap_1 = SSE SSE2 OVERRIDE;
hwcap_2 = ;

EDIT:

Also, per @jww's examination of the ld source code that parses linker mapfiles, the undocumented value V0x0 removes the hardware capabilities with version 1 mapfiles:

hwcap_1 = SSE SSE2 OVERRIDE;
hwcap_2 = V0x0;
