Referencing symbols of code/data loaded separately to another part of memory
I have two NASM-syntax assembly files, let's say a.asm and b.asm. They will need to be assembled into two separate binary files, a.bin and b.bin.
On startup, a.bin will be loaded by another program to a fixed location in memory (0x1000). b.bin will be loaded later to an arbitrary location in memory. b.bin will use some of the functions defined in a.bin.
PROBLEM: b.bin does not know where the functions are located in a.bin.
Why do they need to be separate? They're unrelated; keeping b.bin (and many more files) and a.bin in one file would defeat the purpose of a file system.
Why not %include it? Memory usage. a.bin is quite a large set of functions taking up lots of memory, and because of the 640 KB memory limit in x86 real mode I can't really afford to have a copy of it in memory for every file that needs it.
Possible solution 1: just hardcode the locations.
Problem: what if I change something minor at the very start of a.bin? I'll need to update all pointers to everything after it, and that's not handy.
Possible solution 2: keep track of the function locations in one file, and %include that. This is probably what I'll do if I have no other options. I might even be able to generate this file automatically if NASM can produce easy-to-parse symbol listings; otherwise it's still too much work.
Possible solution 3: keep a table in memory of where the functions are located, instead of the functions themselves. This also has the added benefit of backwards compatibility: if I do decide to change a.bin, everything using it doesn't have to change along with it.
Problem: indirect calls are really slow and take up lots of disk space, though really this is a minor issue. The table will also take up some space on disk and in memory.
My idea was to add this later, as a library or something like that. Everything that's compiled along with a.bin can then call it faster by using direct calls, while things that are compiled separately, e.g. applications, can use the table for slower but safer access to a.bin.
TL;DR: How do I include labels from another asm file so that they can be called without including the actual code in the final assembled file?
You could proceed like this:
1. Assemble and link a.bin to be loaded from address 0x1000.
2. Use the nm utility (or similar) to dump the symbol table of a.bin.
3. Write a script to turn the symbol table into an assembly file asyms.asm that contains, for each symbol in a.bin, a line of the form
sym EQU addr
where addr is the actual address of sym as given by nm.
4. Include asyms.asm when compiling b.bin. This makes the addresses of the symbols in a.bin visible to your assembler code without pulling in the corresponding code.
What you are trying to do is known as building an overlay. I believe some assemblers and linkers do have support for this sort of thing, but I am not sure about the details.
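The script that turns the symbol table into asyms.asm could be sketched roughly as follows. This assumes GNU nm's default three-column output (address, type letter, name) and that the library was first linked to an ELF file, since a flat binary carries no symbol table. The names a.elf and nm_to_equ are assumptions made for illustration, not part of the answer above.

```shell
#!/bin/sh
# Convert `nm` output on stdin into NASM "sym EQU addr" lines on stdout.
# Only global symbols (upper-case type letters T, D, B, R, A) are kept;
# undefined symbols have no address column (two fields) and are skipped.
nm_to_equ() {
    awk 'NF == 3 && $2 ~ /^[TDBRA]$/ { printf "%s EQU 0x%s\n", $3, $1 }'
}

# Hypothetical usage, assuming a.elf is the linked (not yet flattened) library:
#   nm a.elf | nm_to_equ > asyms.asm
```

Keeping only global symbols means a routine has to be declared global in a.asm before it shows up in asyms.asm, which doubles as a simple way to control what b.asm is allowed to call.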
You have a number of possibilities. This answer focuses on a hybrid of 1 and 2. Although you can create a table of function pointers, we can use direct calls to the routines in a common library by symbol name without copying the common library routines into each program. The method I use is to utilize the power of LD and linker scripts to create a shared library that has a static location in memory and is accessed via FAR CALLs (segment and offset form the function address) from independent program(s) loaded elsewhere in RAM.
Most people, when they start out, create a linker script that produces a copy of all the input sections in the output. It is also possible to create output sections that never appear in the output file (they are not LOADed), while the linker can still use the symbols of those non-loaded sections to resolve symbol addresses.
I've created a simple common library with print_banner and print_string functions that use BIOS functions to print to the console. Both are assumed to be called via FAR CALLs from other segments. You may have your common library loaded at 0x0100:0x0000 (physical address 0x01000) but called from code in other segments like 0x2000:0x0000 (physical address 0x20000). A sample commlib.asm file could look like:
bits 16
extern __COMMONSEG
global print_string
global print_banner
global _startcomm
section .text
; Function: print_string
; Display a string to the console on specified display page
; Type: FAR
;
; Inputs: ES:SI = Address of string to print
; BH = Display page
; Clobbers: AX, SI
; Return: Nothing
print_string: ; Routine: output string in SI to screen
mov ah, 0x0e ; BIOS tty Print
jmp .getch
.repeat:
int 0x10 ; print character
.getch:
mov al, [es:si] ; Get character from string
inc si ; Advance pointer to next character
test al,al ; Have we reached end of string?
jnz .repeat ; if not process next character
.end:
retf ; Important: Far return
; Function: print_banner
; Display a banner to the console to specified display page
; Type: FAR
; Inputs: BH = Display page
; Clobbers: AX, SI
; Return: Nothing
print_banner:
push es ; Save ES
push cs
pop es ; ES = CS
mov si, bannermsg ; SI = String to print
; Far call to print_string
call __COMMONSEG:print_string
pop es ; Restore ES
retf ; Important: Far return
_startcomm: ; Keep linker quiet by defining this
section .data
bannermsg: db "Welcome to this Library!", 13, 10, 0
We need a linker script that allows us to create a file that we can eventually load into memory. This code assumes the library will be loaded at segment 0x0100 and offset 0x0000 (physical address 0x01000):
commlib.ld :
OUTPUT_FORMAT("elf32-i386");
ENTRY(_startcomm);
/* Common Library at 0x0100:0x0000 = physical address 0x1000 */
__COMMONSEG = 0x0100;
__COMMONOFFSET = 0x0000;
SECTIONS
{
. = __COMMONOFFSET;
/* Code and data for common library at VMA = __COMMONOFFSET */
.commlib : SUBALIGN(4) {
*(.text)
*(.rodata*)
*(.data)
*(.bss)
}
/* Remove unnecessary sections */
/DISCARD/ : {
*(.eh_frame);
*(.comment);
}
}
It is pretty simple: it effectively links a file commlib.o so that it can eventually be loaded at 0x0100:0x0000. A sample program that uses this library could look like:
prog.asm :
extern __COMMONSEG
extern print_banner
extern print_string
global _start
bits 16
section .text
_start:
mov ax, cs ; DS=ES=CS
mov ds, ax
mov es, ax
mov ss, ax ; SS:SP=CS:0x0000
xor sp, sp
xor bx, bx ; BH = page 0 to display on
call __COMMONSEG:print_banner; FAR Call
mov si, mymsg ; String to display ES:SI
call __COMMONSEG:print_string; FAR Call
cli
.endloop:
hlt
jmp .endloop
section .data
mymsg: db "Printing my own text!", 13, 10, 0
The trick now is to make a linker script that can take a program like this and reference the symbols in our common library without actually adding the common library code again. This can be achieved by using the NOLOAD type on an output section in a linker script.
prog.ld :
OUTPUT_FORMAT("elf32-i386");
ENTRY(_start);
__PROGOFFSET = 0x0000;
/* Load the commlib.elf file to access all its symbols */
INPUT(commlib.elf)
SECTIONS
{
/* NOLOAD type prevents the actual code from being loaded into memory
which means if you create a BINARY file from this, this section will
not appear */
. = __COMMONOFFSET;
.commlib (NOLOAD) : {
commlib.elf(.commlib);
}
/* Code and data for program at VMA = __PROGOFFSET */
. = __PROGOFFSET;
.prog : SUBALIGN(4) {
*(.text)
*(.rodata*)
*(.data)
*(.bss)
}
/* Remove unnecessary sections */
/DISCARD/ : {
*(.eh_frame);
*(.comment);
}
}
The common library's ELF file is loaded by the linker and the .commlib section is marked with a (NOLOAD) type. This prevents the final program from including the common library functions and data, but still allows us to reference the symbol addresses.
A simple test harness can be created as a bootloader. The bootloader will load the common library to 0x0100:0x0000 (physical address 0x01000), and the program that uses it to 0x2000:0x0000 (physical address 0x20000). The program address is arbitrary; I just picked it because it is in free memory below 1MB.
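The physical addresses quoted here follow from the real-mode rule physical = segment * 16 + offset. A small shell helper can be used to double-check a load address before hard-coding it; the name phys_addr is made up for this sketch, and no wrap-around at 1MB is modelled.

```shell
#!/bin/sh
# Print the physical address for a real-mode segment:offset pair.
# Arguments may be decimal or 0x-prefixed hex.
phys_addr() {
    printf '0x%05X\n' $(( ($1 << 4) + $2 ))
}

phys_addr 0x0100 0x0000   # common library
phys_addr 0x2000 0x0000   # program
```

The same helper shows why many segment:offset pairs name the same byte: 0x0000:0x1000 and 0x0100:0x0000 both resolve to 0x01000.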
boot.asm :
org 0x7c00
bits 16
start:
; DL = boot drive number from BIOS
; Set up stack and segment registers
xor ax, ax ; DS = 0x0000
mov ds, ax
mov ss, ax ; SS:SP=0x0000:0x7c00 below bootloader
mov sp, 0x7c00
cld ; Clear direction flag (forward) for string instructions
; Reset drive
xor ax, ax
int 0x13
; Read 2nd sector (commlib.bin) to 0x0100:0x0000 = phys addr 0x01000
mov ah, 0x02 ; Drive READ subfunction
mov al, 0x01 ; Read one sector
mov bx, 0x0100
mov es, bx ; ES=0x0100
xor bx, bx ; ES:BX = 0x0100:0x0000 = phys address 0x01000
mov cx, 0x0002 ; CH = Cylinder = 0, CL = Sector # = 2
xor dh, dh ; DH = Head = 0
int 0x13
; Read 3rd sector (prog.bin) to 0x2000:0x0000 = phys addr 0x20000
mov ah, 0x02 ; Drive READ subfunction
mov al, 0x01 ; Read one sector
mov bx, 0x2000
mov es, bx ; ES=0x2000
xor bx, bx ; ES:BX = 0x2000:0x0000 = phys address 0x20000
mov cx, 0x0003 ; CH = Cylinder = 0, CL = Sector # = 3
xor dh, dh ; DH = Head = 0
int 0x13
; Jump to the entry point of our program
jmp 0x2000:0x0000
times 510-($-$$) db 0
dw 0xaa55
After the bootloader loads the common library (the 2nd sector on disk) and the program (the 3rd sector) into memory, it jumps to the entry point of the program at 0x2000:0x0000.
We can create the file commlib.bin with:
nasm -f elf32 commlib.asm -o commlib.o
ld -melf_i386 -nostdlib -nostartfiles -T commlib.ld -o commlib.elf commlib.o
objcopy -O binary commlib.elf commlib.bin
commlib.elf is also created as an intermediate file. You can create prog.bin with:
nasm -f elf32 prog.asm -o prog.o
ld -melf_i386 -nostdlib -nostartfiles -T prog.ld -o prog.elf prog.o
objcopy -O binary prog.elf prog.bin
Create the bootloader (boot.bin) with:
nasm -f bin boot.asm -o boot.bin
We can build a disk image (disk.img) that looks like a 1.44MB floppy with:
dd if=/dev/zero of=disk.img bs=1024 count=1440
dd if=boot.bin of=disk.img bs=512 seek=0 conv=notrunc
dd if=commlib.bin of=disk.img bs=512 seek=1 conv=notrunc
dd if=prog.bin of=disk.img bs=512 seek=2 conv=notrunc
This simple example can fit the common library and the program in a single sector each. I have also hard-coded their locations on the disk. This is just a proof of concept, and not meant to represent your final code.
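Because boot.asm reads exactly one 512-byte sector for each file, it may be worth checking the file sizes before writing the image. A minimal sketch, in which the helper name fits_in_sector is an assumption for illustration:

```shell
#!/bin/sh
# Succeed if the given file exists and is at most one 512-byte sector long.
fits_in_sector() {
    [ -f "$1" ] || return 1
    size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips padding from wc
    [ "$size" -le 512 ]
}

# Warn about any binary the one-sector reads in boot.asm would truncate.
for f in commlib.bin prog.bin; do
    if [ -f "$f" ] && ! fits_in_sector "$f"; then
        echo "$f is larger than one sector; boot.asm must read more sectors" >&2
    fi
done
```

If either file outgrows a sector, the sector count in AL and the dd seek offsets would need to change together.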
When I run this in QEMU (BOCHS will also work) using qemu-system-i386 -fda disk.img, I get this output: the library banner "Welcome to this Library!" followed by the program's own message, "Printing my own text!".
In the example above we created a prog.bin file that wasn't supposed to have the common library code in it, but had its references to the library's symbols resolved. Is that what happened? If you use NDISASM you can disassemble the binary file as 16-bit code with an origin point of 0x0000 to see what was generated. Using ndisasm -o 0x0000 -b16 prog.bin you should see something like:
; Text Section
00000000  8CC8              mov ax,cs
00000002  8ED8              mov ds,ax
00000004  8EC0              mov es,ax
00000006  8ED0              mov ss,ax
00000008  31E4              xor sp,sp
0000000A  31DB              xor bx,bx
; Both the calls are to the function in the common library that are loaded
; in a different segment at 0x0100. The linker was able to resolve these
; locations for us.
0000000C  9A14000001        call word 0x100:0x11 ; FAR Call print_banner
00000011  BE2000            mov si,0x20
00000014  9A00000001        call word 0x100:0x0 ; FAR Call print_string
00000019  FA                cli
0000001A  F4                hlt
0000001B  EBFD              jmp short 0x1a ; Infinite loop
0000001D  6690              xchg eax,eax
0000001F  90                nop
; Data section
; String 'Printing my own text!', 13, 10, 0
00000020  50                push ax
00000021  7269              jc 0x8c
00000023  6E                outsb
00000024  7469              jz 0x8f
00000026  6E                outsb
00000027  67206D79          and [ebp+0x79],ch
0000002B  206F77            and [bx+0x77],ch
0000002E  6E                outsb
0000002F  207465            and [si+0x65],dh
00000032  7874              js 0xa8
00000034  210D              and [di],cx
00000036  0A00              or al,[bx+si]
I have annotated it with a few comments.
Functions in the common library are reached through a FAR CALL, so they must return with a far return (retf). Far functions that use pointers passed from other segments generally need to handle both the segment and the offset of those pointers (FAR pointers), not just the offset.
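As a complement to reading the disassembly, one could also search prog.bin for the library's banner text. The sketch below assumes a grep that supports -a (treat binary as text), as GNU and BSD grep do; the helper name contains_string is hypothetical.

```shell
#!/bin/sh
# Succeed if the literal string $2 occurs anywhere in the binary file $1.
contains_string() {
    grep -a -q -F -- "$2" "$1"
}

if [ -f prog.bin ]; then
    if contains_string prog.bin "Welcome to this Library!"; then
        echo "library data was copied into prog.bin"
    else
        echo "prog.bin carries no copy of the library banner"
    fi
fi
```

With the NOLOAD section in place, only the program's own string should be found in prog.bin.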