How to find out word length of a virtual machine?

I set up the ubuntu/trusty64 Vagrant box ( https://app.vagrantup.com/ubuntu/boxes/trusty64 ) and ran the following code:

#include <stdio.h>
int main () {    
  short count = -1;
  int a = 5 ;
  int* a_ptr = &a;  
  char b = 'A';
  char* b_ptr = &b;
  
  printf ("count's value     = %d \n", count);  
  printf ("count's address     = %p \n", &count);  
  printf ("a's value     = %d \n", a);   
  printf ("a's address   = %p \n", &a);    
  printf ("a_ptr value   = %p \n", a_ptr);  
  printf ("a_ptr address = %p \n", &a_ptr);  
  printf ("a_ptr deref'ed= %d \n", *a_ptr);   
  printf ("\n");
  printf ("b's value     = %d \n", b);     
  printf ("b's address   = %p \n", &b);   
  printf ("b_ptr value   = %p \n", b_ptr);  
  printf ("b_ptr address = %p \n", &b_ptr);  
  printf ("b_ptr deref'ed= %d \n", *b_ptr);   
  printf ("\n");      
  
  for (count = -30; count < 500; count++) {
    printf ("test: %3d: %p: (%d, %x)\n", count, b_ptr+count, *(b_ptr+count),*(b_ptr+count)) ;  
  }
}

which produces the following output:

count's value     = -1
count's address     = 0x7ffd9440876a
a's value     = 5
a's address   = 0x7ffd94408764
a_ptr value   = 0x7ffd94408764
a_ptr address = 0x7ffd94408758
a_ptr deref'ed= 5

b's value     = 65
b's address   = 0x7ffd94408757
b_ptr value   = 0x7ffd94408757
b_ptr address = 0x7ffd94408748
b_ptr deref'ed= 65

test: -15: 0x7ffd94408748: (87, 57)
test: -14: 0x7ffd94408749: (-121, ffffff87)
test: -13: 0x7ffd9440874a: (64, 40)
test: -12: 0x7ffd9440874b: (-108, ffffff94)
test: -11: 0x7ffd9440874c: (-3, fffffffd)
test: -10: 0x7ffd9440874d: (127, 7f)

I am only showing the bytes starting at 0x7ffd94408748, the address of b_ptr. Read from the highest address down, they form 0x7ffd94408757, which is the address of b, as expected.

Notice a memory address is spread over 6 memory locations on a supposed 64-bit processor. I tested other data types, and they also appear to be represented with at most 8 bits per memory location, so I conclude that it is an 8-bit machine.

However, when I run $ getconf WORD_BIT, I get 32, and when I run $ uname -a, I see "Linux vagrant-ubuntu-trusty-64 3.13.0-170-generic #220-Ubuntu SMP Thu May 9 12:40:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux".

In my VirtualBox manager, I see this machine is a 64-bit Ubuntu.

Which is the correct word length of this machine? Is there something about virtualization that makes its true word length appear different than what is stated (i.e. 64 bits)?

I see similar posts at: "Determine word size of my processor", "How to determine whether a given Linux is 32 bit or 64 bit?", and "how to find if the machine is 32bit or 64bit",

but trying out the above links shows various results between 32 and 64 bits, so I am confused even more.

Which is the correct word length of this machine?

To find the size in bits of a type such as int or a pointer, multiply sizeof by CHAR_BIT (defined in <limits.h>):

 printf("%zu\n", sizeof(char) * CHAR_BIT);
 printf("%zu\n", sizeof(int) * CHAR_BIT);
 printf("%zu\n", sizeof(long) * CHAR_BIT);
 printf("%zu\n", sizeof(long long) * CHAR_BIT);
 printf("%zu\n", sizeof(void *) * CHAR_BIT);

Notice a memory address is spread over 6 memory locations on a supposed 64-bit processor.

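No, it isn't. Each printed value is a hexadecimal address that identifies a single byte, so the dump shows the pointer one byte at a time. The pointer itself occupies sizeof(char *) bytes, which is 8 on x86_64; the listing simply stops after the six low-order bytes.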

I conclude that it is an 8-bit machine.

Each address does indeed refer to 8 bits on most architectures (but not all). That is the size of a byte, though, not the word length. Check CHAR_BIT in <limits.h> to be sure.

Which is the correct word length of this machine?

That depends on your definition of "word length". Some take that as the maximum integer register size available for a given architecture. Others talk about the pointer size. Others about the size of an int in the C vendor implementation.

There is no easy way to check for that in C, because each environment can define the standard types with the limits they like. Instead, what you should do is use the integer types in <stdint.h> if you need exact sizes.

All current Linux architectures are either ILP32 or LP64.

ILP32 has 32-bit int, long int, and pointer types. (Typically, the general-purpose registers in such hardware are 32-bit.)

LP64 has 32-bit int, but 64-bit long int and pointer types. (Typically, the general-purpose registers in such hardware are 64-bit.)

(Compare to Windows, which is ILP32 on 32-bit x86 but LLP64 on 64-bit x86-64: int and long int are 32-bit on both, long long int is 64-bit on both, and only the size of pointers varies, 32-bit versus 64-bit.)

So, "word length" is really a vague concept, and does not capture the complexity of the situation at all.

Even common hardware architectures like x86 and x86-64 (Intel and AMD) have registers of different sizes. In particular, SSE xmm registers (for integer and floating-point arithmetic via SIMD, single-instruction multiple-data, operations) are 128-bit, AVX extends those to 256-bit, and AVX-512 to 512-bit. (You cannot even rely on malloc() providing sufficiently aligned memory on all systems when allocating dynamic memory for types corresponding to such registers; you have to use posix_memalign(), or _mm_malloc() from <xmmintrin.h>, etc. instead. Static and local variables will be sufficiently aligned, of course. It is just that malloc() will not ensure sufficient alignment, because these are so-called vector types, and the C standard says they may need larger alignment than normal dynamic memory management provides.)

On Linux, the most useful "word length" is the size of long int (equivalently, long), because it best matches the size of the general-purpose integer registers. It is the optimal type (across all hardware architectures, when running Linux) for multi-limb bit arrays, for example, because the basic binary operations (and, or, xor, bit shifts) are most efficient on unsigned long / unsigned long int. (Remember that this applies only to Linux and is not true in general, for example on Windows.)

In real-world applications, include <stdint.h> or <inttypes.h> and use exact-width types (intN_t, uintN_t) for structure members and possibly for the API (externally exposed functions). For the API you should nevertheless use size_t for in-memory sizes and counts, and intptr_t / uintptr_t for integers that must preserve a pointer value across a cast to an integer type and back to the original pointer type. Use fast minimum-width types (int_fastN_t, uint_fastN_t) for local variables and internal function parameters. Here N is 8, 16, 32, or 64.
