I am compiling a fairly sophisticated application in two modes: Debug and Release. The main difference, as far as I can see, is -O0 vs -O3 (I can provide the relevant part of the makefile if needed). I am trying to avoid generating syscalls as much as possible, because I am simulating this application in syscall emulation mode (no OS running underneath). The problem I currently have is that in Release mode the compiler generates an extra socket syscall, which I would prefer to avoid (and which does not happen in Debug mode).
The reason that I think the socket might be created is that I am using pthreads and two of my threads are communicating through a volatile char*. So I'm guessing maybe the compiler is trying to implement it in a fancy way when I set the -O3 flag? But I'm not sure if that is a reasonable assumption.
EDIT: BTW the code is in C and C++
EDIT: The code is statically linked against the following libraries:
libstdc++.a
libm.a
libglib-2.0.a
-static-libgcc
*special pthreads library*
Also, I found where in the binary the call to socket is happening:
8c716: db28 blt.n 8c76a <openlog_internal+0xf2>
8c718: f8d9 1008 ldr.w r1, [r9, #8]
8c71c: 4620 mov r0, r4
8c71e: 2200 movs r2, #0
8c720: f441 2100 orr.w r1, r1, #524288 ; 0x80000
8c724: f001 e97c blx 8da20 <__socket>
8c728: 4b20 ldr r3, [pc, #128] ; (8c7ac <openlog_internal+0x134>)
8c72a: 681b ldr r3, [r3, #0]
8c72c: f8c9 0004 str.w r0, [r9, #4]
8c730: b943 cbnz r3, 8c744 <openlog_internal+0xcc>
8c732: 1c43 adds r3, r0, #1
EDIT: I found out why this is happening (see my answer below). If anyone has an explanation as to why the compiler behaves like that please share!!!
Although one can imagine such an optimization, I haven't heard of one and I really doubt it exists, because a system call is usually very expensive.
If you are on a *nix system, you can verify it by looking for undefined symbols with nm:
nm -u file1.o file2.o | grep socket
If there is a call to socket somewhere, this should show the missing socket symbol as
U socket
As I mentioned, I doubt that there is an optimization that inserts a system call, so I expect no output from the command above.
Update:
On my system (Ubuntu 12.04, gcc 4.6), I found the following note in man gcc
-O2 Optimize even more. ...
NOTE: In Ubuntu 8.10 and later versions, -D_FORTIFY_SOURCE=2 is set by default, and is activated when -O is set to 2 or higher. This enables additional compile-time and run-time checks for several libc functions. To disable, specify either -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
So, maybe through this or a similar mechanism, some extra code is included when the optimization level is set to -O2 or -O3.
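One way to check whether this mechanism applies to a given build is to ask the preprocessor directly. The sketch below is only an illustration: the function names fortify_level, optimizing, and fortify_active are made up for this example, and it assumes a GCC-compatible compiler (which defines __OPTIMIZE__ when optimizing) and a glibc-style libc that turns on its checked wrappers only when _FORTIFY_SOURCE is positive and optimization is enabled.

```c
#include <assert.h>

/* Returns the value of _FORTIFY_SOURCE seen by this file, or 0 if unset. */
int fortify_level(void)
{
#if defined(_FORTIFY_SOURCE)
    return _FORTIFY_SOURCE + 0;   /* "+ 0" tolerates an empty definition */
#else
    return 0;
#endif
}

/* Returns 1 when the compiler is optimizing, else 0.
 * Assumption: GCC-compatible compiler, which defines __OPTIMIZE__ at -O1 and up. */
int optimizing(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Assumption: the checked libc wrappers are used only if both conditions hold. */
int fortify_active(void)
{
    int level = fortify_level();
    assert(level >= 0);           /* a negative level would make no sense */
    return level > 0 && optimizing();
}
```

Calling fortify_active() from a small test program built once with the Debug flags and once with the Release flags should show whether the two builds differ in this respect.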
After a whole day of debugging, it turns out that arm-cross-gcc compiles strcpy() differently under -O0 than under -O1, -O2, and -O3 when the string you are copying from is a volatile char*. With -O0 the result is standard user-mode assembly, whereas with -O1, -O2, or -O3 the binary ends up with extra syscalls such as socket, connect, and send.
So, after all, my initial hunch is justified:
"The reason that I think the socket might be created is that I am using pthreads and two of my threads are communicating through a volatile char*. So I'm guessing maybe the compiler is trying to implement it in a fancy way when I set the -O3 flag? But I'm not sure if that is a reasonable assumption."
EDIT: Here are some observations to support this claim:
I compiled my code in 4 versions
1. without strcpy() -O0 => obj.o0
2. with strcpy() -O0 => obj_strcpy.o0
3. without strcpy() -O3 => obj.o3
4. with strcpy() -O3 => obj_strcpy.o3
I ran nm -u on all of the above.
Here are the diffs:
$> diff obj.o0 obj_strcpy.o0
$> diff obj.o3 obj_strcpy.o3
> U __strcpy_chk
$>
This means that when you add strcpy() to your code and compile with -O0, no external symbols are added, whereas when you add strcpy() and compile with -O3, the undefined symbol __strcpy_chk is added to the object file. I will look into the implementation of __strcpy_chk on ARM to figure out where the syscalls are coming from. As of right now it seems that __strcpy_chk performs buffer overflow checks.
EDIT: So there are two solutions so far. One, proposed by Olaf Dietsche, is to add another compiler flag alongside -O3 (-U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0, as quoted from man gcc above). The other is to avoid strcpy() altogether and use something like the following:
for (int i = 0; i < 64; i++) {
    cmd[i] = msg[i];
}
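The loop above always copies 64 bytes. A slightly more defensive variant, sketched below, stops at the terminator and always leaves the destination NUL-terminated. The helper name copy_from_volatile and its signature are assumptions made for this example; cmd and msg stand for the buffers used in the loop above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy a NUL-terminated string out of a volatile
 * buffer without calling strcpy(). Copies at most dst_size - 1 characters,
 * always terminates dst, and returns the number of characters copied.
 */
size_t copy_from_volatile(char *dst, const volatile char *src, size_t dst_size)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t i = 0;
    while (i + 1 < dst_size) {
        char c = src[i];          /* exactly one volatile read per byte */
        if (c == '\0')
            break;
        dst[i++] = c;
    }
    dst[i] = '\0';
    return i;
}
```

With buffers declared as in the original code, the call would look like copy_from_volatile(cmd, msg, 64). Since it never goes through the libc string functions, no checked variant can be substituted for it.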