Offset in nm symbol value?

Just to give you some context, here's what I'm trying to achieve: I am embedding a const char* in a shared object file so that the .so file itself carries a version string. I am doing data analysis, and this string lets me record in the data which version of the software produced it. This all works fine.

The issue I am having is when I try to read the string directly out of the .so library. I tried to use

nm libSMPselection.so | grep _version_info

and get

000000000003d968 D __SMPselection_version_info

This is all fine and as expected (the char* is called _SMPselection_version_info). However, I would have expected to now be able to open the file, seek to 0x3d968 and start reading my string, but all I get is garbage.

When I open the .so file and simply search for the contents of the string (I know how it starts), I can find it at offset 0x2e0b4. At this offset it's there, zero-terminated and as expected. (I am using this method for now.)

I am not a computer scientist. Could someone please explain to me why the symbol value shown by nm isn't correct or, put differently, what the symbol value is if it isn't the address of the symbol?

(By the way, I am working on a Mac with OS X 10.7.)

Assuming it's an ELF or similarly structured binary, you have to take into account the address at which things are loaded, which is influenced by the contents of the ELF headers.

Using objdump -Fd on your binary, you can have the disassembler also show the exact file offset of a symbol.

Using objdump -x you can find this load address, usually 0x400000 for standard (non-PIE, x86-64) Linux executables.

The next thing to check is whether it's an indirect string; you can do this most easily with objdump -g. When the string turns out to be indirect, the position output by objdump -Fd does not hold the string but its address. From this address you need to subtract the load address again. Let me show you an example for one of my binaries:

objdump -Fd BIN | grep VersionString
  45152f:       48 8b 1d 9a df 87 00    mov    0x87df9a(%rip),%rbx        # ccf4d0 <acVersionString> (File Offset: 0x8cf4d0)

objdump -x BIN
...
LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
...

So we look at 0x8cf4d0 in the file and find in the hex editor:

008C:F4D0 D8 C1 89 00  00 00 00 00  01 00 00 00  FF FF FF FF

So we take the 0x89C1D8 there (the bytes are stored little-endian), subtract 0x400000 to get 0x49c1d8, and when we look there in the hex editor we find:

0049:C1D0 FF FF 7F 7F  FF FF 7F FF  74 72 75 6E  6B 5F 38 30
0049:C1E0 34 33 00 00  00 00 00 00  00 00 00 00  00 00 00 00

Which spells "trunk_8043", starting at 0x49c1d8.
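The arithmetic in this example can be written down as a small C sketch. The function names read_le64 and indirect_string_offset are made up for illustration, and the code assumes a little-endian 64-bit binary whose segment starts at file offset 0, as in the LOAD line shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an 8-byte little-endian value, the way a 64-bit pointer is
 * stored in an x86-64 binary. (Hypothetical helper.) */
uint64_t read_le64(const unsigned char *bytes)
{
    assert(bytes != NULL);
    uint64_t value = 0;
    for (int i = 7; i >= 0; --i)
        value = (value << 8) | bytes[i];
    return value;
}

/* Given the 8 bytes found at the symbol's file offset and the load
 * address from objdump -x, return the file offset of the string itself.
 * Assumes the segment's own file offset is 0. (Hypothetical helper.) */
uint64_t indirect_string_offset(const unsigned char *pointer_bytes,
                                uint64_t load_address)
{
    uint64_t string_vaddr = read_le64(pointer_bytes);
    assert(string_vaddr >= load_address);
    return string_vaddr - load_address;
}
```

Feeding it the bytes D8 C1 89 00 00 00 00 00 and a load address of 0x400000 gives 0x49c1d8, the position inspected in the hex dump above.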

YMMV, especially when it's some other file format, but that is the general way these things are structured, with lots of warts and details that deviate in special cases.

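Nobody has suggested the simplest way: write a small binary that dynamically loads your lib (with its name given on the command line), does a dlsym() for your symbol (whose name could also be given on the command line), casts the result to a string pointer and prints it to stdout.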
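A minimal sketch of that idea in C follows. The function name read_version_string and the is_pointer flag are my own inventions; the flag is there because dlsym() returns the address of the symbol, which is the address of a pointer if the variable was declared as a const char* and the address of the characters if it was declared as an array.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Load the library at lib_path and return the string stored under
 * symbol, or NULL if the library or the symbol cannot be found.
 * Pass is_pointer = 1 if the symbol is a `const char *` variable,
 * 0 if it is a `const char[]` array. On success the library is left
 * loaded on purpose so that the returned pointer stays valid.
 * (Hypothetical helper, not part of any library.) */
const char *read_version_string(const char *lib_path, const char *symbol,
                                int is_pointer)
{
    assert(lib_path != NULL && symbol != NULL);

    void *handle = dlopen(lib_path, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        dlclose(handle);
        return NULL;
    }
    return is_pointer ? *(const char *const *)addr : (const char *)addr;
}
```

A caller would print the result with puts() after checking for NULL. On older Linux systems this needs to be linked with -ldl. As far as I know, on Mac OS X the name passed to dlsym() is the C name, without the extra leading underscore that nm displays.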

On Linux you have the 'strings' command, which helps you extract strings from binaries.

http://linux.about.com/library/cmd/blcmdl1_strings.htm

On HP-UX (and I think on other Unix flavors too) there's a similar command called 'what'. It extracts only strings that start with "@(#)", but if you control the content of the string this is not a problem.

Why would you expect the offset displayed by nm to be the offset in the .so file? .so files are not simply memory images; they contain a lot of other information as well, and have a more or less complicated format. Under Unix (at least under most Unices), shared objects use the ELF format. To find the information, you will have to interpret the various fields in the file to find where the symbol you want is located, in which segment, and where that segment starts in the file. (You can probably find a library which will simplify reading them.)
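To make the last step concrete, here is a rough sketch of the lookup once the segment table has been read from the file. The struct and function names are hypothetical and the layout is deliberately simplified; a real ELF or Mach-O reader would fill these fields from the program headers or load commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical description of one loadable segment. */
struct segment {
    uint64_t vaddr;   /* address the segment is mapped at */
    uint64_t offset;  /* where the segment's bytes start in the file */
    uint64_t filesz;  /* number of bytes the segment occupies in the file */
};

/* Find the segment whose file-backed range contains vaddr and store the
 * matching file offset in *out. Returns 1 on success, 0 if no segment
 * covers the address. */
int symbol_file_offset(const struct segment *segs, size_t count,
                       uint64_t vaddr, uint64_t *out)
{
    assert(segs != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (vaddr >= segs[i].vaddr &&
            vaddr - segs[i].vaddr < segs[i].filesz) {
            *out = vaddr - segs[i].vaddr + segs[i].offset;
            return 1;
        }
    }
    return 0;
}
```

The value printed by nm would be passed as vaddr. The result differs from it whenever the containing segment's address and file offset differ, which is the usual case for data segments.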

Also, if you are correct in saying that you've embedded a char const*, i.e. that your code contained something like:

char const* version = "...";

then the address or offset of version is the address or offset of the pointer, not of the string data it points to. Defining it as:

char const version[] = "...";

will solve this.

Finally, the simplest solution might be to just make sure that the string has some highly identifiable pattern, and scan the entire file linearly looking for this pattern.
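Such a linear scan could look roughly like the following, assuming the file has already been read into memory. The function name find_marker is made up, and the "@(#)" prefix in the usage note is just one possible pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of marker in the len bytes
 * at data, or -1 if it does not occur. The buffer may contain embedded
 * zero bytes, so memcmp is used instead of strstr.
 * (Hypothetical helper.) */
long find_marker(const unsigned char *data, size_t len, const char *marker)
{
    assert(data != NULL && marker != NULL);
    size_t marker_len = strlen(marker);
    if (marker_len == 0 || marker_len > len)
        return -1;
    for (size_t i = 0; i + marker_len <= len; ++i) {
        if (memcmp(data + i, marker, marker_len) == 0)
            return (long)i;
    }
    return -1;
}
```

With a version string defined as, say, "@(#)trunk_8043", calling find_marker(buffer, size, "@(#)") on the contents of the .so file returns the file offset of the marker, and the zero-terminated version text follows right after it.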
