Is Python bitwise shift really slow?

Question

I must be overlooking something, but I really don't see why the Python code is so slow...

I am counting the unique elements in an array whose elements are in the range [−1,000,000..1,000,000], using a bit vector to do this. The Java code, which uses BitSet, is about 50 times faster than the Python code, which takes 9 seconds.

Is this maybe because when I initialise bitvector = 0, Python doesn't reserve enough memory, so the bit vector has to be copied as it grows?

Python:

def solution(array):
    bitvector = 0
    count = 0
    for element in array:
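        # shift the range so that -1,000,000 maps to bit 0, 0 to bit 1,000,000, etc.
        element_transformed = element + 1000000
        # test whether the bit for this element is still unset
        if (bitvector >> element_transformed) & 1 == 0:
            count += 1
            # set the bit for this element
            bitvector = bitvector | (1 << element_transformed)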

    return count

Test:

import unittest
import random

from .file1 import solution

class MySolutionTests(unittest.TestCase):
    def test_solution_random_all_unique(self):
        a = random.sample(range(-1000000, 1000001), 100000)
        self.assertEqual(100000, solution(a))

In Java:

package mypackage;

import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;


public class MyClass {

    public static int solution(List<Integer> array) {
        BitSet bitvector = new BitSet();
        int count = 0;

        for(int i = 0; i < array.size(); i++) {
            int elementTransformed = array.get(i) + 1000000;
            if (!bitvector.get(elementTransformed)) {
                count++;
                bitvector.set(elementTransformed);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // TODO code application logic here
    }
}

Test:

package mypackage;

import java.util.ArrayList;
import java.util.Collections;
import org.junit.Test;
import static org.junit.Assert.*;

public class MyClassTest {

    public MyClassTest() {
    }

    @Test
    public void testSolutionLong_RandomAllUnique() {
        ArrayList<Integer> array = new ArrayList<>();
        // cover the same inclusive range as the Python test
        for(int i = -1000000; i <= 1000000; i++) {
            array.add(i);
        }
        Collections.shuffle(array);
        assertEquals(100000, MyClass.solution(array.subList(0, 100000)));

    }  
}

Answer 1

Just trying to reply directly to the question you posed: why Python takes 9 seconds while Java is 50 times faster is not a simple question to answer. The earlier discussion "Is Python slower than Java/C#?" gives some good insight.

The way I like to look at it is that Java is an object-oriented language, while Python is object-based.

For a bitwise operation, Java uses primitive data types, which are arguably faster because they avoid boxing and unboxing and have no wrapper as a layer of abstraction. So, looking at your code, at every iteration Python re-wraps the integer as an integer object, while Java does not.

But again, I wouldn't take for granted that Java is always faster than Python. It depends on which library you are using and which problem you are trying to solve!
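To make the per-iteration object cost concrete: a Python int is immutable, so `bitvector | 1 << element_transformed` builds a brand-new integer on every update instead of changing the old one. Below is a minimal sketch, not the asker's code, that assumes a preallocated `bytearray` as a mutable stand-in for Java's BitSet. The name `count_unique_bytearray` and its `offset` and `span` parameters are hypothetical and only for illustration.

```python
def count_unique_bytearray(array, offset=1_000_000, span=2_000_001):
    """Count distinct values using a mutable, preallocated bit vector.

    Hypothetical helper: `offset` shifts the smallest value to bit 0 and
    `span` is the number of possible values (assumed range from the question).
    """
    bits = bytearray((span + 7) // 8)  # one bit per possible value, all zero
    count = 0
    for element in array:
        index = element + offset
        byte_index = index >> 3      # which byte holds this bit
        mask = 1 << (index & 7)      # which bit inside that byte
        if not bits[byte_index] & mask:
            count += 1
            bits[byte_index] |= mask  # updates one byte in place
    return count
```

Here every shift and mask works on a small integer, and only one byte of the buffer changes per element, so the cost of one update no longer depends on the size of the whole bit vector.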

Answer 2

The Pythonic way to do this is:

def solution(array):
    return len(set(array))

It is much faster, though it will probably use more memory.

The set solution ran in about 100 ms for 10**6 samples from a 2*10**6 range. I didn't even time the bit array because it took seconds.

When talking about lists on the order of 10**6, it is worth the trade-off. Using sys.getsizeof, I measured the intermediate set as using 4.2 times the memory of the list. The equivalent int bit array uses about 1/30 of the memory of the list. This is on a 64-bit Linux system.
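A rough sketch of how such a comparison could be made with sys.getsizeof follows. The data is an assumption (10**6 distinct values taken from the middle of the range), and the exact ratios depend on the Python version and platform. Note that sys.getsizeof is shallow: it reports the container itself, not the int objects that the list or set refer to.

```python
import sys

# Illustrative data (an assumption): 10**6 distinct values from the 2 * 10**6 range.
values = list(range(-500_000, 500_000))
as_set = set(values)

# An int with only its highest bit set occupies as much memory as a fully
# populated bit vector over the same range, because the size of a Python int
# is determined by its highest set bit.
as_int_bits = 1 << 2_000_000

list_size = sys.getsizeof(values)
print("set / list:", sys.getsizeof(as_set) / list_size)
print("int bit vector / list:", sys.getsizeof(as_int_bits) / list_size)
```

Building the int with a single shift avoids the slow element-by-element loop while still giving the same footprint as the finished bit vector.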
