
Fastest way to convert a binary list (or array) into an integer in Python

Suppose there is a list (or an array) that contains only 1s and 0s.

gona = [1, 0, 0, 0, 1, 1]

I want to convert this into the integer represented by the binary value 100011 (the number formed by the elements of the list, most significant bit first).

I know that this can be done as follows.

int("".join(map(str, gona)),2)

or (added in an edit)

int("".join([str(i) for i in gona]),2)

Is there any other faster way to do this?

This is the fastest I came up with. A slight variation of your initial solution:

digits = ['0', '1']
int("".join([ digits[y] for y in x ]), 2)

%timeit int("".join([digits[y] for y in x]),2)
100000 loops, best of 3: 6.15 us per loop
%timeit int("".join(map(str, x)),2)
100000 loops, best of 3: 7.49 us per loop

(Btw, it seems that in this case, using a list comprehension is faster than using a generator expression.)
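To check the list-comprehension claim on your own interpreter, a minimal sketch along these lines may help. It is written for Python 3, the names with_listcomp, with_genexp and the sample list x are illustrative, and absolute timings will differ from the ones quoted above.

```python
import timeit

digits = ['0', '1']
x = [1, 0, 1, 0] * 8  # illustrative 32-element sample list

def with_listcomp(bits):
    # join() is handed a list that has already been built
    return int("".join([digits[b] for b in bits]), 2)

def with_genexp(bits):
    # join() is handed a generator expression instead
    return int("".join(digits[b] for b in bits), 2)

if __name__ == "__main__":
    for fn in (with_listcomp, with_genexp):
        elapsed = timeit.timeit(lambda: fn(x), number=20000)
        print(fn.__name__, elapsed)
```

Both functions return the same integer; only the time spent building the string differs.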

EDIT:

Also, I hate being a smartass, but you can always trade memory for speed:

import itertools

# one-time precalculation (digits is the ['0', '1'] list defined above)
cache_N = 16  # or much bigger?!
cache = {
   tuple(x): int("".join([digits[y] for y in x]),2)
   for x in itertools.product((0,1), repeat=cache_N)
}

Then:

res = cache[tuple(x)]

Way faster. Of course, this is only feasible up to a point...

EDIT2:

I now see you say your lists have 32 elements. In this case a single cache is probably infeasible (it would need 2**32 entries), BUT there are more ways to trade memory for speed. E.g., with cache_N=16, which is surely feasible, you can look up each 16-element half separately:

c = 2 ** cache_N # compute once
xx = tuple(x)
cache[xx[:16]] * c + cache[xx[16:]]

%timeit cache[xx[:16]] * c + cache[xx[16:]]
1000000 loops, best of 3: 1.23 us per loop  # YES!
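Put together, the chunked lookup could be sketched as below (Python 3). The helper name bits_to_int and the loop over arbitrary multiples of CACHE_N are assumptions added for illustration; the answer itself only shows the two-halves case for 32 elements.

```python
import itertools

CACHE_N = 16  # bits per lookup, as in the answer above
digits = ['0', '1']

# One-time precalculation: every 16-bit tuple mapped to its integer value.
cache = {
    bits: int("".join([digits[b] for b in bits]), 2)
    for bits in itertools.product((0, 1), repeat=CACHE_N)
}

def bits_to_int(bits):
    """Convert a list of 0/1 values (most significant bit first) to an int.

    Assumes len(bits) is a multiple of CACHE_N.
    """
    t = tuple(bits)
    result = 0
    for start in range(0, len(t), CACHE_N):
        # Shift the previous chunks left and append the next cached chunk.
        result = (result << CACHE_N) | cache[t[start:start + CACHE_N]]
    return result

if __name__ == "__main__":
    sample = [1] * 30 + [0, 0]
    print(bits_to_int(sample) == int("1" * 30 + "00", 2))
```

The dictionary holds 2**16 = 65536 entries, so the memory cost stays modest while each 32-element list needs only two lookups.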

You could do it like this:

sum(x << i for i, x in enumerate(reversed(gona)))

Although it's not much faster.
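A related variation, not taken from this thread, folds the list from the most significant bit with functools.reduce instead of reversing and enumerating it. This is only a sketch (Python 3), and the name bits_to_int_fold is hypothetical; whether it is faster would need measuring.

```python
from functools import reduce

def bits_to_int_fold(bits):
    # Shift the accumulator left one place and OR in the next bit (MSB first).
    return reduce(lambda acc, bit: (acc << 1) | bit, bits, 0)

gona = [1, 0, 0, 0, 1, 1]
print(bits_to_int_fold(gona))                             # 35
print(sum(x << i for i, x in enumerate(reversed(gona))))  # 35
```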

I decided to create a script to trial 4 different methods of doing this task.

import time

trials = range(1000000)
list1 = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0]
list0 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
listmix = [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0]

def test1(l):
    start = time.time()
    for trial in trials:
        tot = 0
        n = 0  # exponent of the lowest-order bit is 0, not 1
        for i in reversed(l):
            if i:
                tot += 2**n
            n += 1
    print 'Time taken:', str(time.time() - start)

def test2(l):
    start = time.time()
    for trial in trials:
        int("".join(map(str, l)),2)
    print 'Time taken:', str(time.time() - start)

def test3(l):
    start = time.time()
    for trial in trials:
        sum(x << i for i, x in enumerate(reversed(l)))
    print 'Time taken:', str(time.time() - start)

def test4(l):
    start = time.time()
    for trial in trials:
        int("".join([str(i) for i in l]),2)
    print 'Time taken:', str(time.time() - start)


test1(list1)
test2(list1)
test3(list1)
test4(list1)
print '.'
test1(list0)
test2(list0)
test3(list0)
test4(list0)
print '.'
test1(listmix)
test2(listmix)
test3(listmix)
test4(listmix)

My results:

Time taken: 7.14670491219
Time taken: 5.4076821804
Time taken: 4.7349550724
Time taken: 7.24234819412
.
Time taken: 2.29213285446
Time taken: 5.38784003258
Time taken: 4.70707392693
Time taken: 7.27936697006
.
Time taken: 4.78960323334
Time taken: 5.36612486839
Time taken: 4.70103287697
Time taken: 7.22436404228

Conclusion: @goncalopp's solution is probably the best one. It is consistently fast. On the other hand, if your lists are likely to contain more zeros than ones, stepping through the list and adding a power of two for each 1 will be fastest.

EDIT: I re-wrote my script to use timeit, the source code is at http://pastebin.com/m6sSmmR6

My output result:

7.78366303444
2.79321694374
5.29976511002
.
5.72017598152
5.70349907875
5.66881299019
.
5.25683712959
5.17318511009
5.20052909851
.
8.23388290405
8.24193501472
8.15649604797
.
3.94102287292
3.95323395729
3.9201271534

My method of stepping through the list backwards adding powers of two is still faster if you have all zeros, but otherwise, @sxh2's method is definitely the fastest, and my implementation didn't even include his caching optimization.
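The pastebin script is not reproduced here. For readers who want to rerun this kind of comparison, a timeit-based harness might look like the sketch below (Python 3). All function names are illustrative, and the order of the candidates is not claimed to match the groups of numbers printed above.

```python
import timeit

def powers_of_two(bits):
    # Walk from the lowest-order bit, adding 2**n for every 1.
    tot = 0
    for n, b in enumerate(reversed(bits)):
        if b:
            tot += 2 ** n
    return tot

def join_map(bits):
    return int("".join(map(str, bits)), 2)

def shift_sum(bits):
    return sum(b << i for i, b in enumerate(reversed(bits)))

def join_listcomp(bits):
    return int("".join([str(b) for b in bits]), 2)

def join_digits(bits, digits=('0', '1')):
    # Table lookup instead of calling str() on every element.
    return int("".join([digits[b] for b in bits]), 2)

CANDIDATES = [powers_of_two, join_map, shift_sum, join_listcomp, join_digits]
SAMPLES = {
    "mostly ones": [1] * 30 + [0, 0],
    "all zeros": [0] * 32,
    "alternating": [1, 0] * 16,
}

def run(number=10000):
    for fn in CANDIDATES:
        for label, bits in SAMPLES.items():
            elapsed = timeit.timeit(lambda: fn(bits), number=number)
            print(fn.__name__, label, elapsed)
        print('.')

if __name__ == "__main__":
    run()
```

Raise number for steadier measurements; the default here is kept small so the sketch finishes quickly.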

I tried this:

int(str(gona).replace(', ','')[1:-1], 2)

and compared it to this (which was @Sohcahtoa82's fastest case):

sum(x << i for i, x in enumerate(reversed(gona)))

On my machine, the first does 1000000 passes in ~5.97s. The second case takes ~8.03s.
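Broken into steps, the string-based approach works as sketched below (Python 3; the intermediate variable names are illustrative). Note the base argument to int(): without it the digit string would be read as decimal and give 100011 rather than 35.

```python
gona = [1, 0, 0, 0, 1, 1]

as_text = str(gona)                   # '[1, 0, 0, 0, 1, 1]'
no_seps = as_text.replace(', ', '')   # '[100011]'
digits_only = no_seps[1:-1]           # '100011'
value = int(digits_only, 2)           # 35

print(value)
```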

I tried another approach, and inserted into @Sohcahtoa82's code:

T61 = """
    l = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
    digits = ['0', '1']
    s = ''
    for y in l:
        s += digits[y]
    int(s, 2)
"""

T60 = """
    l = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    digits = ['0', '1']
    s = ''
    for y in l:
        s += digits[y]
    int(s, 2)
"""

T6mix = """
    l = [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0]
    digits = ['0', '1']
    s = ''
    for y in l:
        s += digits[y]
    int(s, 2)
"""

And got these results. Mine is the last set of times.

5.45334255339
1.89000112578
4.14859673729
.
4.39018410496
4.21122597336
4.57919181895
.
3.59095765307
3.25353409619
3.78588067833
.
6.53343932548
6.33234985363
6.65685678006
.
2.74509861151
2.6111819044
2.83928911064
.
2.79519545737
2.66091503704
2.9183024407
