
upper to lower case in 68K

I want to convert from upper to lower case. This is the code I wrote, but I still can't find where the error is. I know that to convert from upper to lower case I have to subtract 32, so that's what I'm trying to do. But after I run the program, it does not show the result in lower case. Thanks for any help.

START       ORG     $1000
            MOVE.L  USERID,D0
            SUBI.L  #32323232,D0
            MOVE.L  D0,result
            SIMHALT
USERID      DC.L    'ABCD1234'
result      DS.L    1
            END     START

#32323232 is ... well, exactly that: the value 32323232. But you tried to apply it to 4 BYTE characters loaded as one 32-bit (LONG) value, which means each character occupies its own 8 bits. After MOVE.L USERID,D0 the D0 is $41424344, where $41 == 65 == 'A', $42 == 66 == 'B', $43 == 67 == 'C', $44 == 68 == 'D'.

In hexadecimal formatting each letter (8 bits) takes exactly two hexadecimal digits, because one hexadecimal digit holds a value from 0 to 15, which is exactly what 4 bits can encode, so it's a perfect conversion: 4 bits <=> 1 hexadecimal digit.

But if you take that value $41424344 and convert it from hexadecimal to decimal, you get 1094861636. It's still the same value, but suddenly the letters are no longer easily "visible" in it. They are hidden there as 65*256*256*256 + 66*256*256 + 67*256 + 68 = 1094861636; since 8 bits can encode 256 different values, multiplying by a power of 256 places a decimal value into a particular BYTE of the LONG. So for example 66*256*256 means "the third least significant byte of the long", or technically it means shifting the decimal value 66 left by 16 bits. And indeed, if you load 66 into D1 and shift it left by 16 bits, you compute 66*256*256 == 66*65536 == (66<<16) without using a multiply instruction (note that the 68000's immediate shift count only goes up to 8, so a 16-bit shift takes two LSL.L #8 instructions, or the count in a register).
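A minimal sketch of that shift in EASy68K syntax (two LSL.L #8 steps, because the immediate shift count is limited to 8):

        MOVE.L  #66,D1          ; D1 = $00000042
        LSL.L   #8,D1           ; D1 = $00004200 (66*256)
        LSL.L   #8,D1           ; D1 = $00420000 (66*65536 = 4325376)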

It's all about how "base N formatting" of numbers works: you have M digits, each with a value from 0 to N-1, and each digit represents a multiple of the i-th power of N depending on its position in the number. I.e. the value 123 in decimal (base 10) formatting is written as "123", where the "1" represents the amount of 10^2, the "2" stands for 10^1, and the "3" is for 10^0.

Mind you, the written "123" form is not the number 123. The number 123 is a purely abstract entity; the decimal formatting used to write it down here as "123" is actually an imperfect mirror of the value itself, with a few limitations imposed by the decimal formatting upon this form, not upon the real value. Probably the simplest example of these imperfections of "base N" formats: the value 123 has a second valid decimal form, 122.99999... with an infinite number of "9" fraction digits. It's still exactly the same value 123, just written differently (and not as practically as the finitely short "123" variant).

So back to your 32323232... you wanted to place a 32 in each particular BYTE of the LONG, but that would require the decimal value 32*256*256*256 + 32*256*256 + 32*256 + 32 = 538976288. Which is a PITA to calculate in your head.

If you ever wondered "why are Assembly sources full of those annoying hexadecimal numbers?", here comes the point of this lengthy answer and all of those numerical exercises.

32 is $20 (you should be able to convert small powers of two in your head on the fly). And if you want to place $20 at each byte position, you just write #$20202020 (it's still the same 538976288, of course). That's certainly manageable without a calculator while writing the source, right? So that's the answer to why hexadecimal formatting is so popular among Assembly programmers: it lets you see immediately what the values of the particular bytes in a word/long are.
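To see the difference side by side (a sketch, not from the original answer; both immediates are the same value, so both lines assemble to the same instruction):

        ADDI.L  #538976288,D0       ; decimal: the per-byte $20s are invisible
        ADDI.L  #$20202020,D0       ; hexadecimal: $20 in every byte, plainly visible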

Just out of curiosity, your 32323232 value splits into bytes as $01ED36A0, i.e. the bytes 1, 237, 54 and 160, nothing like four 32s (can you see them now? Each byte is 8 bits = two groups of 4 bits, and 4 bits = a single hexadecimal digit).


And as Mark noted, you need to add, not subtract. So the fix for your source:

        MOVE.L   USERID,D0            ; D0 = $41424344 ('ABCD')
        ADDI.L   #$20202020,D0        ; add $20 to every byte
        MOVE.L   D0,result            ; result now holds $61626364

This will indeed show "abcd" in memory at address result ($101E).
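For completeness, the whole program with the fix applied, in the same EASy68K form as the question (a sketch; only the first LONG of USERID is converted):

START       ORG     $1000
            MOVE.L  USERID,D0           ; D0 = $41424344 ('ABCD')
            ADDI.L  #$20202020,D0       ; add $20 to every byte -> $61626364
            MOVE.L  D0,result           ; memory at result now holds 'abcd'
            SIMHALT
USERID      DC.L    'ABCD1234'          ; two LONGs; only 'ABCD' is loaded above
result      DS.L    1
            END     START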


About "bit manipulation" : if you will take a look at ASCII table with hexadecimal formatting, you will see that the upper/lowercase letter have the same value except the 6th bit, which is clear for uppercase letter, and set for lowercase.

So by doing ORI.L #$20202020,D0 you set that bit in each byte packed in the D0 long, effectively doing "to lower case" on letter ASCII values.

ANDI.L #$DFDFDFDF,D0 ($DFDFDFDF is ~$20202020, the inverted $20 bit pattern) will do "to upper case" for ASCII letters.

EORI.L #$20202020,D0 (EORI is the 68K mnemonic for exclusive-OR immediate) will flip upper to lower and lower to upper case, for ASCII letters.
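All three tricks on one mixed-case example (a sketch; #'ABcd' is EASy68K's character-literal syntax for the immediate $41426364):

        MOVE.L  #'ABcd',D0          ; D0 = $41426364
        ORI.L   #$20202020,D0       ; D0 = $61626364 = 'abcd' (force lower)
        MOVE.L  #'ABcd',D0
        ANDI.L  #$DFDFDFDF,D0       ; D0 = $41424344 = 'ABCD' (force upper)
        MOVE.L  #'ABcd',D0
        EORI.L  #$20202020,D0       ; D0 = $61624344 = 'abCD' (flip case)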

All of these will meaninglessly mangle other ASCII characters, like digits and symbols, so these bit tricks are usable only when you know your value contains letters exclusively ("garbage in, garbage out").
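If the data may contain non-letters, you have to test each byte before touching it. A minimal sketch of a byte-by-byte "to lower case" loop (assumptions not in the original answer: the string starts at label STR and its length is in D2):

        LEA     STR,A0              ; A0 -> first byte of the string
LOOP    MOVE.B  (A0),D0
        CMPI.B  #'A',D0
        BCS     SKIP                ; below 'A': leave it alone
        CMPI.B  #'Z',D0
        BHI     SKIP                ; above 'Z': leave it alone
        ORI.B   #$20,(A0)           ; an uppercase letter: set the case bit
SKIP    ADDQ.L  #1,A0               ; advance to the next byte
        SUBQ.L  #1,D2
        BNE     LOOP                ; repeat for all D2 bytes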
