
64bit Applications and Inline Assembly

Original 2011-05-29 07:23:52 0 4 c++/ windows/ visual-c++/ 64-bit/ inline-assembly

I am using Visual C++ 2010 to develop 32bit Windows applications. There is something I really want to use inline assembly for. But I just realized that Visual C++ does not support inline assembly in 64bit applications. So porting to 64bit in the future is a big issue.

I have no idea how 64bit applications are different from 32bit applications. Is there a chance that 32bit applications will ALL have to be upgraded to 64bit in the future? I heard that 64bit CPUs have more registers. Since performance is not a concern for my applications, using these extra registers is not a concern to me. Are there any other reasons that a 32bit application needs to be upgraded to 64bit? Would a 64bit application process things differently when compared with a 32bit application, apart from the fact that 64bit applications may use registers or instructions that are unique to 64bit CPUs?

My application needs to interact with other OS components, e.g. drivers, which I know must be 64bit in 64bit Windows. Would my 32bit application be compatible with them?

4 Answers

Visual C++ does not support inline assembly for x64 (or ARM) processors, because generally using inline assembly is a bad idea.

Usually compilers produce better assembly than humans.
Even if you can produce better assembly than the compiler, using inline assembly generally defeats code optimizers of any type. Sure, your bit of hand-optimized code might be faster, but the fact that the code around it can't be optimized will generally lead to a slower overall program.
Compiler intrinsics are available from pretty much every major compiler. They let you access advanced CPU features (e.g. SSE) in a manner that's consistent with the C and C++ languages and does not defeat the optimizer.
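To make the intrinsics point concrete, here is a minimal sketch of adding two float arrays with SSE intrinsics instead of inline assembly. The helper name `add_floats` and the `HAVE_SSE` macro are made up for this illustration, and the preprocessor check is only an assumption about how a build might detect SSE; the intrinsics themselves (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`) come from `<xmmintrin.h>`. When SSE is not detected, the code falls back to a plain loop.

```cpp
#include <cassert>
#include <cstddef>

// SSE is part of the x64 baseline; on 32-bit x86 it depends on compiler flags.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
void add_floats(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
#if HAVE_SSE
    // Four floats per iteration, using unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail, and the whole job when SSE is not available.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The same source builds as 32bit or 64bit, and the compiler is still free to inline and schedule the code around the intrinsics.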

I am wondering would there be a chance that 32bit applications will ALL have to be upgraded to 64bit in the future.

That depends on your target audience. If you're targeting servers, then yes, it's reasonable to allow users to not install the WOW64 subsystem because it's a server -- you know it'll probably not be running too much 32 bit code. I believe Windows Server 2008 R2 already allows this as an option if you install it as a "server core" instance.

Since performance is not a concern for my appli so using the extra 64bit registers is not a concern to me. Is there any other reasons that a 32bit appli has to be upgraded to 64bit in the future?

64 bit has nothing to do with registers. It has to do with the size of addressable virtual memory.

Would a 64 bit app process different from a 32bit appl process apart from that the 64bit appli is using some registers/instructions that is unique to 64bit CPUs?

Most likely. 32 bit applications are constrained in that they can't map more than ~2GB into memory at once. 64 bit applications don't have that problem. Even if they're not using more than 4GB of physical memory, being able to address more than 4GB of virtual memory is helpful for mapping files on disk into memory and similar.
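A rough way to see the address-space difference from inside a program is to look at the pointer width of the current build. The sketch below uses two hypothetical helpers, `pointer_bits` and `fits_in_address_space`, and assumes `size_t` is as wide as a pointer, which holds on mainstream Windows targets. Fitting in `size_t` is only a necessary condition: a 32 bit Windows process typically gets far less than 4GB of usable user-mode address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Width of a pointer for the current build target, in bits.
constexpr unsigned pointer_bits() {
    return static_cast<unsigned>(sizeof(void*) * 8);
}

// Hypothetical helper: can a single mapping of `bytes` even be expressed
// as a size_t in this build? Necessary, not sufficient, for mapping it.
bool fits_in_address_space(std::uint64_t bytes) {
    assert(bytes > 0 && "a zero-byte mapping is not meaningful here");
    return bytes <= static_cast<std::uint64_t>(SIZE_MAX);
}
```

For example, `fits_in_address_space(8ull << 30)` (an 8GB file) is expected to be false in a 32 bit build and true in a 64 bit build.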

My application needs to interact with other OS components eg drivers, which i know must be 64bit in 64bit windows. Would my 32bit application compatible with them?

That depends entirely on how you're communicating with those drivers. If it's through something like a "named file interface" then your app could stay as 32 bit. If you try to do something like shared memory (Yikes! Shared memory accessible from user mode with a driver?!?) then you're going to have to build your app as 64 bit.

Apart from @Billy's great write-up, if you really feel the need to use 64bit assembly, then you can use an external assembler like MASM to get that done, see this. (It's also possible to speed this up with prebuild scripts.)

The Intel C Compiler 15 has inline assembly capability in 64bit too. And you could integrate the IC in Visual Studio as a toolset: then you'd have VC++ 64bit with inline assembly. One catch though: it's expensive. Cheers.

While we're at it, MinGW also has 64-bit inline assembly language; and it's pretty fast, and free. It used to be slow on some math; so I'd start out comparing performances of MSVC vs. MinGW to see if it's a decent starting place for your application.
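For reference, 64-bit inline assembly under a GCC-style compiler such as MinGW-w64 uses the extended `asm` syntax. This is a minimal sketch, not a performance recommendation: `add_u64` is a hypothetical helper, and the preprocessor guard is an assumption that restricts the assembly path to x86-64 builds with GCC or Clang. Everywhere else (including MSVC x64) it falls back to plain C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: 64-bit unsigned add, written with inline assembly
// where the compiler supports it.
std::uint64_t add_u64(std::uint64_t a, std::uint64_t b) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    const std::uint64_t expected = a + b;  // what the compiler would compute
    // AT&T syntax: addq src, dst. "+r" keeps `a` in a register as both
    // input and output; "cc" tells the compiler the flags are clobbered.
    __asm__("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    assert(a == expected);  // sanity check of the hand-written instruction
    (void)expected;         // avoids an unused warning when NDEBUG is set
    return a;
#else
    return a + b;
#endif
}
```

Calling `add_u64(2, 3)` gives 5 on either path; the constraint strings are what let the optimizer keep working around the `asm` statement.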

Also, as to hand-coded assembly being slower:

Actually, humans very often do code assembly that runs more efficiently than compilers - or at least that was always the common wisdom when I was learning programming in the 70's and 80's, and it continued to be the case through ~2000.
You can always code it in "C" or C++, compile that to assembly, and tweak it to see if you can improve on it. That way, you can learn from the compiler's optimizations and see if you can improve on them.

Assembly very much can have a place in code that needs high optimization, no matter what M$ says. You won't really know if assembly will or won't speed up code until you try it. Everything else is just pontificating.

As above, I favor the approach of compiling C++ code into assembly, and then hand-optimizing that. It saves you the trouble of writing much of it; and with a little experimentation, you may get something that tests out faster. FWIW, I've never needed to with a modern program. Often, other things can speed it up just as much or more, such as multi-threading, using look-up tables, moving time-expensive operations out of loops, using static analyzers, using real-time analyzers such as valgrind (if you're on Linux), etc. However, for performance-critical applications, I see no reason not to try; and just use it if it works. M$ is just being lazy by dropping inline assembly.
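As an illustration of the look-up table idea mentioned above, here is a small sketch that counts set bits in a buffer using a 256-entry table built once, instead of looping over every bit of every byte. The names `make_bit_count_table` and `count_set_bits` are made up for this example; whether it actually beats the alternatives on a given machine is something to measure, not assume.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Builds a table where table[v] is the number of set bits in byte value v.
std::array<std::uint8_t, 256> make_bit_count_table() {
    std::array<std::uint8_t, 256> table{};  // table[0] starts at 0
    for (std::size_t i = 1; i < table.size(); ++i) {
        // Bits in i = lowest bit of i + bits in i with that bit shifted out.
        table[i] = static_cast<std::uint8_t>((i & 1) + table[i / 2]);
    }
    return table;
}

// Hypothetical helper: total number of set bits in data[0..n).
std::size_t count_set_bits(const std::uint8_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    // The expensive part (building the table) happens once, outside the loop.
    static const std::array<std::uint8_t, 256> table = make_bit_count_table();
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += table[data[i]];
    }
    return total;
}
```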

As to whether 64-bit or 32-bit is faster, this is similar to the situation with 16-bit vs. 32-bit. The wider bandwidth can sling huge amounts of data faster. If both run on a 64-bit OS, they run at exactly the same clock speed; so the 32-bit program shouldn't be faster. Yet, I've observed the CPU clock on 32-bit Win7 to run slightly faster than on 64-bit Win7. Thus for the same number of threads, and for more CPU-intensive operations, a 32-bit app on 32-bit Win7 would be faster. However, the difference isn't much; and 64-bit instructions can really make a difference. However, a given user will only have one OS installed; so the 64-bit app will either be faster on that OS, or the 32-bit app will at best match its speed when run on a 64-bit OS. It will be a larger download, however. You might as well go for the possibly faster speed with 64 bits, unless you are dealing with a dedicated system running code you know won't be moving large amounts of data.

Also, note that I benchmarked a 64-bit and a 32-bit app on OSs of the respective sizes, using the respective versions of MinGW. It did a lot of 64-bit floating point number crunching, and I was sure the 64-bit version would have the edge. It didn't. My guess is that the floating point registers in the built-in math coprocessor run in equal numbers of clock cycles on both OSs, and perhaps slightly slower on 64-bit Win7. My benchmarks were so close in both versions that neither one was clearly faster. Perhaps long number-crunching operations were slower on 64-bit, but the 64-bit program code ran a little faster, causing nearly equal results.
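If you want to repeat this kind of comparison yourself, a tiny timing harness is enough to start with. This is not the benchmark described above, just a sketch under my own assumptions: `time_ms` and `crunch` are hypothetical names, the workload is an arbitrary double-precision sum, and taking the best of several runs is one simple way to reduce noise. Build the same source as 32-bit and as 64-bit and compare the numbers on your own machine.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `repeats` times and returns the best wall-clock time in ms.
template <typename F>
double time_ms(F&& work, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        work();
        const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
        if (best < 0.0 || elapsed.count() < best) {
            best = elapsed.count();
        }
    }
    return best;
}

// Arbitrary floating point workload: partial sum of 1/i^2 for i = 1..n.
double crunch(int n) {
    double acc = 0.0;
    for (int i = 1; i <= n; ++i) {
        acc += 1.0 / (static_cast<double>(i) * static_cast<double>(i));
    }
    return acc;
}
```

Store the result of `crunch` in a `volatile` variable inside the timed lambda so the compiler cannot optimize the work away.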

Basically, the only time 32 bits makes sense, IMHO, is when you think you might have an in-house app that would run faster on a 32-bit OS, when you want a really small executable, or when you are delivering to users on 32-bit OS machines (many developers still offer both versions) or a 32-bit embedded system.

Edited to reflect that some of my remarks pertain to my specific experience with Win7 x86 vs. x64.