How can I concatenate two strings in x86 Assembly?

For example I have two strings:

section .data
    stringA    db    "abcde"
    stringB    db    "fghij"

At some later point, how can I concatenate them into a new stringC? (I.e., stringC should contain "abcdefghij".)

Assembly language has no data types; it only gives you one mnemonic for each instruction the CPU has.

Different programming languages have different methods for storing a string in memory:

Some languages (like C) use terminated strings: a string is an array in memory that holds the characters, and the end of the string is marked by a special character (for example NUL). The marker is needed because the array can be larger than the string currently stored in it:

char a[100] = "Hello";

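In memory this looks like the following. Whatever comes after the terminating 0 is not part of the string; the 'f', 'o', 'o', ... bytes only stand for leftover contents (a real C initializer would zero-fill the rest):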

char a[100] = { 'H', 'e', 'l', 'l', 'o', 0, 'f', 'o', 'o', 'b', 'a', 'r', ...};
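To make the role of the terminator concrete, here is a minimal sketch in C. The helper name terminated_length is hypothetical (it does the same job as the standard strlen); it shows that the length of a terminated string is found by scanning for the NUL, independent of the size of the array:

```c
#include <stddef.h>

/* Hypothetical helper, equivalent in effect to strlen: counts the bytes
   before the first NUL. The size of the underlying array plays no role. */
size_t terminated_length(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)   /* stop at the terminator */
        n++;
    return n;
}
```

For char a[100] = "Hello"; the array occupies 100 bytes, but terminated_length(a) yields 5.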

Other languages (like Java, Pascal or C#) internally store the length of the string in some variable and the characters in an array:

string a = "Hello";

Actually means:

int a_len = 5;
char a_text[100] = { 'H', 'e', 'l', 'l', 'o', 'f', 'o', 'o', 'b', 'a', 'r', ...};

Or (in the case of old Pascal variants):

char a[100] = { 5, 'H', 'e', 'l', 'l', 'o', 'f', 'o', 'o', 'b', 'a', 'r', ...};
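When the length is stored next to the characters, concatenation needs no scanning: you copy the second string's bytes behind the first one and add the lengths. The following is only a sketch under assumptions; the struct counted_string and the function counted_append are hypothetical names chosen for illustration, and the fixed capacity of 100 is taken from the examples above:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type: the length is kept in a separate
   field and no terminator is stored. */
struct counted_string {
    size_t len;
    char   text[100];
};

/* Appends src to dst. Returns 0 on success, or -1 if the result would
   not fit into dst->text (dst is left unchanged in that case). */
int counted_append(struct counted_string *dst, const struct counted_string *src)
{
    if (src->len > sizeof dst->text - dst->len)
        return -1;
    memcpy(dst->text + dst->len, src->text, src->len); /* copy the characters */
    dst->len += src->len;                              /* update the length */
    return 0;
}
```

In assembly the same idea maps to a block copy of a known byte count (on x86, for instance, rep movsb with the count in cx, ecx or rcx) followed by an addition.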

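Because assembly language is "just" another representation of the CPU instructions, every variant used by any programming language can also be used in assembly language.

So the answer depends on how your strings are stored in memory. Note that the strings in the question have neither a terminator nor a stored length, so you would have to add one or the other.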

If you want to concatenate two NUL-terminated strings, you could do it in the following way:

  1. You set ds:si , esi or rsi (depending on whether you write 16-, 32- or 64-bit code) to the first character of the first string.
  2. You set es:di , edi or rdi to the destination memory.
  3. You clear the direction flag (cld), so that the string instructions increment the pointers.
  4. You read one byte using the lodsb instruction.
  5. You write the same byte using the stosb instruction.
  6. If the al register is not zero you continue with step 4. (loop)
  7. You decrement di , edi or rdi, so that it points at the NUL that was just written and the second string overwrites it.
  8. You set ds:si , esi or rsi to the first character of the second string.
  9. You perform the loop (steps 4.-6.) again. This also copies the NUL that terminates the result.
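The steps above can be modelled in C to check the logic before writing the actual assembly. This is a sketch, not the assembly itself: the function names copy_with_nul and concat are hypothetical, the variable al stands in for the al register, and the pointers src and di play the roles of si and di. The caller is assumed to provide a destination large enough for both strings plus one NUL:

```c
/* Copies bytes from src to dst up to and including the terminating NUL,
   mirroring the lodsb/stosb loop (steps 4 to 6). Returns the position
   just past the NUL that was written, like di after the loop. */
static char *copy_with_nul(char *dst, const char *src)
{
    char al;
    do {
        al = *src++;    /* lodsb: load one byte, advance the source pointer */
        *dst++ = al;    /* stosb: store the byte, advance the destination pointer */
    } while (al != 0);  /* step 6: loop until the NUL has been copied */
    return dst;
}

/* Concatenates the NUL-terminated strings a and b into dst.
   Assumption: dst has room for both strings plus one NUL. */
void concat(char *dst, const char *a, const char *b)
{
    char *di = copy_with_nul(dst, a);  /* steps 1 to 6 */
    di--;                              /* step 7: back onto the NUL */
    copy_with_nul(di, b);              /* steps 8 and 9 */
}
```

With the strings from the question (terminated by a 0 byte) and an 11-byte destination, concat leaves "abcdefghij" followed by a NUL in the destination.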

If you want to use other CPUs (e.g. ARM, MIPS, PowerPC, ...) instead of x86, you'll have to use other registers, of course. Most CPUs don't have an equivalent of lodsb or stosb, so you'll have to use two instructions instead: one that loads (or stores) a byte and one that increments the pointer register.
