简体   繁体   中英

Assembly: Occurrence of Integers in Array

I'm writing a program in masm assembly to count and return the number of times integers appear in an array. I currently have the following code that allows me to populate an array with random integers. What I am struggling with is how to implement a counter that will store each occurrence of an integer at an index in the array. for instance, if the random array was [3,4,3,3,4,5,7,8], I would want to my count array to hold [3, 2, 1, 1, 1], as there are (three 3's, two 4's, etc).

I have the bounds of the random numbers fixed at 3/8 so I know they will be within this range. My current thinking is to compare each number to 3-8 as it is added, and increment my count array respectively. My main lack of understanding is how I can increment specific indices of the array. This code is how I am producing an array of random integers, with an idea of how I can begin to count integer occurrence, but I don't know if I am going in the right direction. Any advice?

push ebp
mov  ebp, esp
mov  esi, [ebp + 16]    ; @ holds array to store count of integer occurances
mov  edi, [ebp + 12]  ; @ holds array to be populated with random ints
mov  ecx, [ebp + 8]   ; value of request in ecx


MakeArray:              
    mov     eax, UPPER          ; upper boundary for random num in array
    sub     eax, LOWER          ; lower boundary for random num in array
    inc     eax             
    call    RandomRange
    add     eax, LOWER
    cmp     eax, 3          ; Where I start to compare the random numbers added
    je      inc_3           ; current thought is it cmp to each num 3-8
    mov     [edi], eax  ; put random number in array
    add     edi, 4      ; holds address of current element, moves to next element
    loop    fillArrLoop

inc_3:          ; if random num == 3
    inc     esi         ; holds address of count_array, increments count_array[0] to 1?
    mov     [edi], eax  ; put random number in array to be displayed
    add     edi, 4      ; holds address of current element, moves to next element
    loop    MakeArray

My current thinking is to compare each number to 3-8 as it is added

No, you're vastly overcomplicating this. You don't want to linear search for a j (index into the counts) such that arr[i] == j , just use j = arr[i] .

The standard way to do a histogram is ++counts[ arr[i] ] . In your case, you know the possible values are 3..8, so you can map an array value to a count bucket with arr[i] - 3 , so you'll operate on counts[0..5] . A memory-destination add instruction with a scaled-index addressing mode can do this in one x86 instruction , given the element value in a register.

If the possible values are not contiguous, you'd normally use a hash table to map values to count buckets. You can think about this simple case as allowing a trivial hash function.


Since you're generating the random numbers to fill arr[i] at the same time as histograming, you can combine those two tasks, and instead of subtracting 3 just don't add it yet.

; inputs: unsigned len, int *values, int *counts
; outputs: values[0..len-1] filled with random numbers, counts[] incremented
; clobbers: EAX, ECX, EDX  (not the other registers)
fill_array_and_counts:
    push ebp
    mov  ebp, esp

    push esi        ; Save/restore the caller's ESI.
 ;; Irvine32 functions like RandomRange are special and don't clobber EAX, ECX, or EDX except as return values,
 ;; so we can use EDX and ECX even in a loop that makes a function call.

    mov  edi, [ebp + 16]    ; int *counts  ; assumed already zeroed?
    mov  edx, [ebp + 12]    ; int *values  ; output pointers
    mov  ecx, [ebp + 8]     ; size_t length

    MakeArray:                       ; do{
        mov     eax, UPPER - LOWER + 1   ; size of random range, calculated at assemble time
        call    RandomRange                  ; eax = 0 .. eax-1
        add     dword ptr [edi + eax*4], 1   ; ++counts[ randval ]

        add     eax, LOWER               ; map 0..n to LOWER..UPPER
        mov     [edx], eax               ; *values = randval+3
        add     edx, 4                   ; values++

        dec     ecx
        jnz     MakeArray            ; }while(--ecx);

    pop   edi                  ; restore call-preserved regs
    pop   ebp                  ; including tearing down the stack frame
    ret

If the caller doesn't zero the counts array for you, you should do that yourself, perhaps with rep stosd with EAX=0 as a memset of ECX dword elements, and then reload EDI and ECX from the stack args.

I'm assuming UPPER and LOWER are assemble time constants like UPPER = 8 or LOWER equ 3 , since you used all-upper-case names for them, and they're not function args. If that's the case, then there's no need to do the math at runtime, just let the assembler calculate UPPER - LOWER + 1 for you.


I avoided the loop instruction because it's slow , and doesn't do anything you can't do with other simple instructions.

One standard performance trick for histograms with only a few buckets is to have multiple arrays of counts and unroll over them: Methods to vectorise histogram in SIMD? . This hides the latency of store/reload when the same counter needs to be incremented several times in a row. Your random values will generally avoid long runs of the same value, though, so worst-case performance is avoided.

There might be something to gain from AVX2 for large arrays since there are only 6 possible buckets: Micro Optimization of a 4-bucket histogram of a large array or list . (And you could generate random numbers in SIMD vectors with an AVX2 xorshift128+ PRNG if you wanted.)

If your range is fixed (3-8), you have a fixed-length array that can hold your counts:

(index0:Count of 3),(index1:Count of 4)..(index5:Count of 8s)

Once you have an element from the random array, you just take that element and put it through a switch:

cmp 3, [element]
jne compare4
mov ebx, [countsArrayAddress]     ; Can replace [countsArrayAddress] with  [ebp + 16]
add ebx, 0                        ; First index, can comment out this line
mov ecx, [ebx]
add ecx, 1                        ; Increment count
mov [ebx], ecx                    ; Count at the zeroth offset is now incremented
compare4:
cmp 4, [element]
jne compare5
mov ebx, [countsArrayAddress]
add ebx, 4                         ; Second index (1*4)
mov ecx, [ebx]
add ecx, 1
mov [ebx], ecx
...

Is this what you mean? I come from using fasm syntax but it looks pretty similar. The above block is a bit unoptimized, but think this shows how to build the counts array. The array has a fix length, which must be allocated, either on the stack (sub rsp the correct amount) or on the heap, ie with heapalloc/malloc calls. ( Edited, see you're using 32-bit registers )

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM