
Why is the ELF produced by Python so big compared to the original source code?

I can't help wondering why the ELF produced from a Python script is so big compared to the original source code. Let's take a look at the simplest program, hello world.

user@linux:~/Python$ cat hello.py    
print('Hello, World!')
user@linux:~/Python$ 

Converting to ELF using PyInstaller

user@linux:~/Python$ pyinstaller -F hello.py 
48 INFO: PyInstaller: 3.4
49 INFO: Python: 3.6.7
50 INFO: Platform: Linux-4.15.0-38-generic-x86_64-with-Ubuntu-18.04-bionic
50 INFO: wrote /home/user/Python/hello.spec
53 INFO: UPX is not available.
54 INFO: Extending PYTHONPATH with paths
['/home/user/Python', '/home/user/Python']
55 INFO: checking Analysis
60 INFO: Building because _python_version changed
60 INFO: Initializing module dependency graph...
62 INFO: Initializing module graph hooks...
64 INFO: Analyzing base_library.zip ...
3061 INFO: running Analysis Analysis-00.toc
3096 INFO: Caching module hooks...
3100 INFO: Analyzing /home/user/Python/hello.py
3103 INFO: Loading module hooks...
3104 INFO: Loading module hook "hook-encodings.py"...
3169 INFO: Loading module hook "hook-pydoc.py"...
3170 INFO: Loading module hook "hook-xml.py"...
3388 INFO: Looking for ctypes DLLs
3388 INFO: Analyzing run-time hooks ...
3394 INFO: Looking for dynamic libraries
3632 INFO: Looking for eggs
3633 INFO: Python library not in binary dependencies. Doing additional searching...
3684 INFO: Using Python library /usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
3695 INFO: Warnings written to /home/user/Python/build/hello/warn-hello.txt
3717 INFO: Graph cross-reference written to /home/user/Python/build/hello/xref-hello.html
3722 INFO: checking PYZ
3725 INFO: Building because toc changed
3725 INFO: Building PYZ (ZlibArchive) /home/user/Python/build/hello/PYZ-00.pyz
4053 INFO: Building PYZ (ZlibArchive) /home/user/Python/build/hello/PYZ-00.pyz completed successfully.
4059 INFO: checking PKG
4064 INFO: Building because toc changed
4064 INFO: Building PKG (CArchive) PKG-00.pkg
6474 INFO: Building PKG (CArchive) PKG-00.pkg completed successfully.
6476 INFO: Bootloader /home/user/.local/lib/python3.6/site-packages/PyInstaller/bootloader/Linux-64bit/run
6477 INFO: checking EXE
6479 INFO: Rebuilding EXE-00.toc because hello missing
6480 INFO: Building EXE from EXE-00.toc
6481 INFO: Appending archive to ELF section in EXE /home/user/Python/dist/hello
6516 INFO: Building EXE from EXE-00.toc completed successfully.
user@linux:~/Python$ 

Running the new ELF

user@linux:~/Python/dist$ ./hello 
Hello, World!
user@linux:~/Python/dist$ 

user@linux:~/Python$ ls -lh hello.py   
-rw-rw-r-- 1 user user 23 Dis  27 21:43 hello.py
user@linux:~/Python$ 

user@linux:~/Python/dist$ ls -lh hello 
-rwxr-xr-x 1 user user 5.3M Dis  27 21:48 hello
user@linux:~/Python/dist$ 

As you can see, the original code is only 23 bytes, while the ELF is way bigger: 5.3M!

Let's look at another example with C.

user@linux:~/C$ cat hello.c  
#include<stdio.h>

int main()
{
    printf("Hello C World\n");
}
user@linux:~/C$ 

user@linux:~/C$ gcc hello.c -o helloC
user@linux:~/C$ 

user@linux:~/C$ ls -l helloC
-rwxrwxr-x 1 user user 8304 Dis  27 21:53 helloC
user@linux:~/C$ 

user@linux:~/C$ ./helloC
Hello C World
user@linux:~/C$ 

user@linux:~/C$ ls -l hello.c
-rw-rw-r-- 1 user user 65 Dis  27 21:52 hello.c
user@linux:~/C$ 

user@linux:~/C$ ls -lh helloC
-rwxrwxr-x 1 user user 8.2K Dis  27 21:53 helloC
user@linux:~/C$ 

Comparison

Python code size = 23 bytes
Python ELF size = 5.3M

C code size = 65 bytes
C ELF size = 8.2K
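
To put the comparison in numbers, here is a small sketch that computes the binary-to-source ratios from the ls listings earlier in the question. The 5.3M figure is rounded by ls -h, so the Python ratio is only approximate; the constant names are made up for this illustration.

```python
# Sizes taken from the ls output in the question.
PY_SOURCE_BYTES = 23                      # hello.py
PY_ELF_BYTES = int(5.3 * 1024 * 1024)     # "5.3M" from ls -lh, rounded by ls
C_SOURCE_BYTES = 65                       # hello.c
C_ELF_BYTES = 8304                        # exact value from ls -l


def ratio(binary_bytes: int, source_bytes: int) -> float:
    """How many times larger the binary is than its source file."""
    return binary_bytes / source_bytes


print(f"Python: ELF is about {ratio(PY_ELF_BYTES, PY_SOURCE_BYTES):,.0f}x the source")
print(f"C:      ELF is about {ratio(C_ELF_BYTES, C_SOURCE_BYTES):,.0f}x the source")
print(f"Python ELF is about {ratio(PY_ELF_BYTES, C_ELF_BYTES):,.0f}x the C ELF")
```

So the C binary is on the order of a hundred times its source, while the PyInstaller binary is on the order of hundreds of thousands of times its source.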

Is there a way to make the size smaller?

Because Python does NOT compile to machine code.

The ELF created by PyInstaller is simply your code packed up with all the necessary Python runtime files. It is not in any way comparable to a binary compiled from C, which contains machine code and links dynamically against shared libraries (libc.so, for example).

PyInstaller, py2exe and pretty much any other project "converting" Python files to executables isn't really converting anything. It just packs the full Python interpreter (4.4 MB alone on my machine), your project and all the dependencies it requires (all compiled to bytecode, which the interpreter runs) into a single self-extracting executable, so it's normal for the result to be at least as big as a (compressed) Python installation.
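
As a rough illustration of how little of the bundle is your own code, the sketch below compiles the hello-world source to bytecode and measures it. It uses only the standard `compile` built-in and the `marshal` module (the serialization used for the body of .pyc files); it is not what PyInstaller runs internally, just a way to see the order of magnitude.

```python
import marshal

# The same one-line program as hello.py in the question (23 bytes with newline).
SOURCE = "print('Hello, World!')\n"

# Compile to a code object: this is the bytecode form the interpreter executes.
code = compile(SOURCE, "hello.py", "exec")

# Serialize the code object the way the body of a .pyc file is stored.
bytecode_blob = marshal.dumps(code)

print(f"source:   {len(SOURCE.encode())} bytes")
print(f"bytecode: {len(bytecode_blob)} bytes")
```

The bytecode stays in the range of a few hundred bytes at most, so practically all of the 5.3M comes from the interpreter and its runtime files.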

Pretty much anything besides the Python interpreter itself and big native dependencies (think numpy, scipy, PyQt) counts for next to nothing in the final executable size. You may have a 10 KLOC Python project and, as long as you don't pull in any other external dependency, you'll find that the final executable size isn't significantly affected.
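
One way to get a feel for this fixed cost is to count how many modules a fresh interpreter has already loaded before it runs a single line of your code. The sketch below does that with a subprocess; the helper name `count_startup_modules` is invented for this example, it assumes `sys.executable` points at a working interpreter, and it is not how PyInstaller decides what to bundle (its analysis step collects more than this).

```python
import subprocess
import sys


def count_startup_modules() -> int:
    """Return how many modules a fresh interpreter has loaded at startup."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(len(sys.modules))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(f"modules loaded before any user code runs: {count_startup_modules()}")
```

The exact number varies by Python version and installation, but it is the same whether your project is one line or ten thousand.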

gcc compiling a C file, by contrast, creates an actual executable containing the imports and the machine code necessary just to invoke printf; it's 15 bytes of literal string, a handful of bytes to set up a stack frame and actually invoke printf, and all the rest is ELF headers, import tables and various linker junk (even just running strip -s on it shaves off 2 KB).
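
To see how much of a small binary is bookkeeping rather than code, here is a minimal sketch that decodes the ELF64 file header with `struct` and adds up the size of the header plus the program and section header tables. It assumes a 64-bit little-endian ELF (as on x86-64 Linux) and ignores the extended-numbering special cases; the function names are made up for this example.

```python
import struct

# Layout of the 64-byte ELF64 file header, little-endian.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Decode the fixed-size ELF64 header from the first bytes of a file."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to hold an ELF64 header")
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return {
        "e_type": e_type,
        "e_machine": e_machine,
        "e_ehsize": e_ehsize,
        "e_phentsize": e_phentsize,
        "e_phnum": e_phnum,
        "e_shentsize": e_shentsize,
        "e_shnum": e_shnum,
    }


def table_overhead(header: dict) -> int:
    """Bytes used by the ELF header plus program and section header tables."""
    return (header["e_ehsize"]
            + header["e_phnum"] * header["e_phentsize"]
            + header["e_shnum"] * header["e_shentsize"])
```

Reading the first 64 bytes of a binary such as helloC in "rb" mode and passing them to `parse_elf64_header`, then to `table_overhead`, gives the number of bytes spent on these tables alone, before any symbol tables or string tables are counted.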

