
How to statically link to TBB?

How can I statically link Intel's TBB libraries to my application? I know all the caveats, such as the scheduler's unfair load distribution, but I don't need the scheduler, just the containers, so that's fine.

Anyway, I know this can be done, even though it's undocumented, but I just can't seem to find the way to do it right now (although I've seen it somewhere before).

So does anyone know or have any clues?

Thanks

This is strongly not recommended:

Is there a version of TBB that provides statically linked libraries?

TBB is not provided as a statically linked library, for the following reasons*:

Most libraries operate locally. For example, an Intel(R) MKL FFT transforms an array. It is irrelevant how many copies of the FFT there are. Multiple copies and versions can coexist without difficulty. But some libraries control program-wide resources, such as memory and processors. For example, garbage collectors control memory allocation across a program. Analogously, TBB controls scheduling of tasks across a program. To do their job effectively, each of these must be a singleton; that is, have a sole instance that can coordinate activities across the entire program. Allowing k instances of the TBB scheduler in a single program would cause there to be k times as many software threads as hardware threads. The program would operate inefficiently, because the machine would be oversubscribed by a factor of k, causing more context switching, cache contention, and memory consumption. Furthermore, TBB's efficient support for nested parallelism would be negated when nested parallelism arose from nested invocations of distinct schedulers.

The most practical solution for creating a program-wide singleton is a dynamic shared library that contains the singleton. Of course, if the schedulers could cooperate, we would not need a singleton. But that cooperation requires a centralized agent to communicate through; that is, a singleton!

Our decision to omit a statically linkable version of TBB was strongly influenced by our OpenMP experience. Like TBB, OpenMP also tries to schedule across a program. A static version of the OpenMP run-time was once provided, and it has been a constant source of problems arising from duplicate schedulers. We think it best not to repeat that history. As an indirect proof of the validity of these considerations, we could point to the fact that Microsoft Visual C++ only provides OpenMP support via dynamic libraries.

Source: http://www.threadingbuildingblocks.org/faq/11#sthash.t3BrizFQ.dpuf

EDIT - Changed to use extra_inc. Thanks Jeff!

Build with the following parameter:

make extra_inc=big_iron.inc

The static libraries will be built. See the caveats in build/big_iron.inc.

Build static libraries from source

After acquiring the source code from https://www.threadingbuildingblocks.org/, build TBB like this:

make extra_inc=big_iron.inc

If you need extra options, then instead build like this:

make extra_inc=big_iron.inc <extra options>
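For example, the stock TBB make system also accepts options such as compiler= and arch= (check build/index.html in your TBB source tree for the options your version actually supports):

make extra_inc=big_iron.inc compiler=clang arch=intel64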

Running multiple TBB programs per node

If you run a multiprocessing application, e.g. using MPI, you may need to explicitly initialize the TBB scheduler with the appropriate number of threads to avoid oversubscription.

An example of this in a large application can be found in https://github.com/madness/madness/blob/master/src/madness/world/thread.cc.
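As a minimal sketch of such explicit initialization (this uses the classic tbb::task_scheduler_init API of the TBB versions discussed here; the per-rank thread count of 4 is a hypothetical value you would derive from how many ranks share a node):

#include <tbb/task_scheduler_init.h>

int main() {
    // Hypothetical: several MPI ranks share one node, so give this rank
    // only a fixed share of the hardware threads instead of all of them.
    const int threads_per_rank = 4;
    tbb::task_scheduler_init init(threads_per_rank);
    // ... TBB-based work now runs on at most threads_per_rank threads ...
    return 0;
}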

Comment on documentation

This feature has been available for many years (since at least 2013), although it is not documented, for the reasons described in other answers.

Historical note

This feature was originally developed because IBM Blue Gene and Cray supercomputers either did not support shared libraries or did not perform well when using them, due to the lack of a locally mounted filesystem.

Using the open-source version:

After running "make tbb", go to the build/linux_xxxxxxxx_release folder.

Then run:

ar -r libtbb.a concurrent_hash_map.o concurrent_queue.o concurrent_vector.o \
    dynamic_link.o itt_notify.o cache_aligned_allocator.o pipeline.o queuing_mutex.o \
    queuing_rw_mutex.o reader_writer_lock.o spin_rw_mutex.o spin_mutex.o critical_section.o \
    task.o tbb_misc.o tbb_misc_ex.o mutex.o recursive_mutex.o condition_variable.o \
    tbb_thread.o concurrent_monitor.o semaphore.o private_server.o rml_tbb.o \
    task_group_context.o governor.o market.o arena.o scheduler.o observer_proxy.o \
    tbb_statistics.o tbb_main.o concurrent_vector_v2.o concurrent_queue_v2.o \
    spin_rw_mutex_v2.o task_v2.o

And you should get libtbb.a as output.

Note that your program should be built with both "-ldl" and libtbb.a.
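As a hedged example of that link step (myapp and main.cpp are placeholder names), the command could look like:

g++ -o myapp main.cpp libtbb.a -ldl -lpthread

-lpthread is included because TBB uses pthreads on Linux; adjust for your platform.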

Although not officially endorsed by the TBB team, it is possible to build your own statically linked version of TBB with make extra_inc=big_iron.inc.

I have not tested it on Windows or macOS, but on Linux it worked (source):

wget https://github.com/01org/tbb/archive/2017_U6.tar.gz
tar xzfv 2017_U6.tar.gz
cd tbb-2017_U6
make extra_inc=big_iron.inc

The generated files are in tbb-2017_U6/build/linux*release.

When you link your application to the static TBB version:

  • Call g++ with the -static switch
  • Link against tbb (-ltbb) and pthread (-lpthread)

In my test, I also needed to explicitly reference all .o files from the manually built TBB version. Depending on your project, you might also need to pass -pthread to gcc.
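Putting those pieces together, a sketch of the full static link might look like this (names and paths are placeholders; the second variant reflects the note above about referencing the .o files explicitly):

g++ -static -pthread -o myapp main.cpp -Ltbb-2017_U6/build/linux*release -ltbb -lpthread
# or, if -ltbb alone leaves TBB symbols unresolved:
g++ -static -pthread -o myapp main.cpp tbb-2017_U6/build/linux*release/*.o -lpthread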

I have created a toy example to document all the steps in this GitHub repository:

It also contains test code to make sure that the generated binary is portable to other Linux distributions.
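One quick check that a binary is fully static (and hence a candidate for running on other distributions) is ldd, which reports no dynamic dependencies for such an executable:

ldd myapp
#   not a dynamic executable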

Unfortunately, it does not appear to be possible: From TBB site.
One suggestion on the Intel forum was to compile it manually if you really need the static linkage: From Intel Forum.

Just link the files; I just did it and it works. Here's the SConscript file. There are two minor things: a symbol that has the same name in tbb and tbbmalloc, which I had to keep from being multiply defined, and I disabled ITT_NOTIFY since it creates another symbol with the same name in both libs.

Import('g_CONFIGURATION')
import os
import SCons.Errors  # used for StopError below
import SCutils
import utils

tbb_basedir = os.path.join(
    g_CONFIGURATION['basedir'],
    '3rd-party/tbb40_233oss/')

#print 'TBB base:', tbb_basedir
#print 'CWD: ', os.getcwd()

ccflags = []
cxxflags = [
    '-m64',
    '-march=native',
    '-I{0}'.format(tbb_basedir),
    '-I{0}'.format(os.path.join(tbb_basedir, 'src')),
    #'-I{0}'.format(os.path.join(tbb_basedir, 'src/tbb')),
    '-I{0}'.format(os.path.join(tbb_basedir, 'src/rml/include')),
    '-I{0}'.format(os.path.join(tbb_basedir, 'include')),
]
cppdefines = [
#    'DO_ITT_NOTIFY',
    'USE_PTHREAD',
    '__TBB_BUILD=1',
]
linkflags = []

if g_CONFIGURATION['build'] == 'debug':
    ccflags.extend([
        '-O0',
        '-g',
        '-ggdb2',
    ])
    cppdefines.extend([
        'TBB_USE_DEBUG',
    ])

else:
    ccflags.extend([
        '-O2',
    ])


tbbenv = Environment(
    platform = 'posix',
    CCFLAGS=ccflags,
    CXXFLAGS=cxxflags,
    CPPDEFINES=cppdefines,
    LINKFLAGS=linkflags
)

############################################################################
# Build verbosity
if not SCutils.has_option('verbose'):
    SCutils.setup_quiet_build(tbbenv, True if SCutils.has_option('colorblind') else False)
############################################################################



tbbmallocenv = tbbenv.Clone()

tbbmallocenv.Append(CCFLAGS=[
    '-fno-rtti',
    '-fno-exceptions',
    '-fno-schedule-insns2',
])

#tbbenv.Command('version_string.tmp', None, '')

# Write version_string.tmp
with open(os.path.join(os.getcwd(), 'version_string.tmp'), 'wb') as fd:
    (out, err, ret) = utils.xcall([
        '/bin/bash',
        os.path.join(g_CONFIGURATION['basedir'], '3rd-party/tbb40_233oss/build/version_info_linux.sh')
    ])

    if ret:
        raise SCons.Errors.StopError('version_info_linux.sh execution failed')

    fd.write(out)
    #print 'put version_string in', os.path.join(os.getcwd(), 'version_string.tmp')
    #print out

result = []

def setup_tbb():
    print 'CWD: ', os.getcwd()
    tbb_sources = SCutils.find_files(os.path.join(tbb_basedir,'src/tbb'), r'^.*\.cpp$')
    tbb_sources.extend([
        'src/tbbmalloc/frontend.cpp',
        'src/tbbmalloc/backref.cpp',
        'src/tbbmalloc/tbbmalloc.cpp',
        'src/tbbmalloc/large_objects.cpp',
        'src/tbbmalloc/backend.cpp',
        'src/rml/client/rml_tbb.cpp',
    ])


    print tbb_sources
    result.append(tbbenv.StaticLibrary(target='libtbb', source=tbb_sources))


setup_tbb()

Return('result')
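For reference, a parent SConstruct could consume this SConscript roughly as follows (the path is hypothetical; SConscript() returns whatever the script passes to Return(), and g_CONFIGURATION must be exported for the Import() at the top to succeed):

Export('g_CONFIGURATION')
tbb_libs = SConscript('3rd-party/tbb40_233oss/SConscript')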
