简体   繁体   中英

Processing big list using python

So I'm trying to solve a challenge and have come across a dead end. My solution works when the list is small or medium but when it is over 50000. It just "time out"

a = int(input().strip())
b = list(map(int,input().split()))


result = []
flag = []

for i in range(len(b)):
    temp = a - b[i]
    if(temp >=0 and temp in flag):
        if(temp<b[i]):
            result.append((temp,b[i]))
        else:
            result.append((b[i],temp))
        flag.remove(temp)
    else:
        flag.append(b[i])
    result.sort()


for i in result:
    print(i[0],i[1])

Where

a = 10

and b = [ 2, 4 ,6 ,8, 5 ]

Solution sum any two element in b which matches a

**Edit: ** Updated full code

flag is a list, of potentially the same order of magnitude as b . So, when you do temp in flag that's a linear search: it has to check every value in flag to see if that value is == temp . So, that's 50000 comparisons. And you're doing that once per loop in a linear walk over b . So, your total time is quadratic: 50,000 * 50,000 = 2,500,000,000 . (And flag.remove is also linear time.)

If you replace flag with a set, you can test it for membership (and remove from it) in constant time. So your total time drops from quadratic to linear, or 50,000 steps, which is a lot faster than 2 billion:

flagset = set(flag)
for i in range(len(b)):
    temp = a - b[i]
    if(temp >=0 and temp in flagset):
        if(temp<b[i]):
            result.append((temp,b[i]))
        else:
            result.append((b[i],temp))
        flagset.remove(temp)
    else:
        flagset.add(b[i])
flag = list(flagset)

If flag needs to retain duplicate values, then it's a multiset, not a set, which means you can implement with Counter :

flagset = collections.Counter(flag)
for i in range(len(b)):
    temp = a - b[i]
    if(temp >=0 and flagset[temp]):
        if(temp<b[i]):
            result.append((temp,b[i]))
        else:
            result.append((b[i],temp))
        flagset[temp] -= 1
    else:
        flagset[temp] += 1
flag = list(flagset.elements())

In your edited code, you've got another list that's potentially of the same size, result , and you're sorting that list every time through the loop.

Sorting takes log-linear time. Since you do it up to 50,000 times, that's around log(50;000) * 50,000 * 50,000 , or around 30 billion steps.

If you needed to keep result in order throughout the operation, you'd want to use a logarithmic data structure, like a binary search tree or a skiplist, so you could insert a new element in the right place in logarithmic time, which would mean just 800.000 steps.

But you don't need it in order until the end. So, much more simply, just move the result.sort out of the loop and do it at the end.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM