
Is there a way to do it faster?

ladder has around 15,000 elements, and this code snippet takes 5-8 seconds to run. Is there any way to make it faster? I tried it without checking for duplicates and without creating the accs list, and the time dropped to 2-3 seconds, but I don't want duplicates in the CSV file.
I am using Python 2.7.9.

import codecs

accs = []
with codecs.open('test.csv', 'w', encoding="UTF-8") as out:
    row = ''
    for element in ladder:
        # Skip accounts that were already written (linear scan of a list)
        if element['account']['name'] not in accs:
            accs.append(element['account']['name'])
            row += element['account']['name']
            if 'twitch' in element['account']:
                row += "," + element['account']['twitch']['name'] + ","
            else:
                row += ",,"
            row += str(element['account']['challenges']['total']) + "\n"
    out.write(row)

You can't do much about the loop, since you need to go through every element in ladder after all. However, you can improve this membership test:

if element['account']['name'] not in accs:

Since accs is a list, this test essentially loops through all items of accs to check whether the name is in there. And you do that for every element in ladder, so this can easily become very inefficient.

Instead, use a set rather than a list for accs, as this gives you membership lookup in constant time on average. That reduces your algorithm from quadratic to linear complexity. For that, just use accs = set() and change your code to call accs.add() instead of append.
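
A minimal sketch of that change on a small made-up ladder. The sample data and the helper name unique_names are assumptions for illustration, not part of the original code:

```python
# Hypothetical sample data shaped like the entries in `ladder`.
ladder = [
    {'account': {'name': 'alice', 'challenges': {'total': 3}}},
    {'account': {'name': 'bob', 'challenges': {'total': 5}}},
    {'account': {'name': 'alice', 'challenges': {'total': 3}}},  # duplicate
]


def unique_names(entries):
    """Return account names in first-seen order, skipping duplicates."""
    accs = set()    # set: membership test does not scan every stored name
    ordered = []    # separate list keeps the original order for output
    for element in entries:
        name = element['account']['name']
        if name not in accs:
            accs.add(name)        # add() takes the place of list.append()
            ordered.append(name)
    return ordered


names = unique_names(ladder)
print(names)
```

The set is only used for the "have I seen this name?" question; the order of the output still comes from iterating ladder.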

Another issue is that you are doing string concatenation. Every time you do someString + "something" you throw away that string object and create a new one. This can become inefficient for a large number of operations too. Instead, use a list to collect all the parts you want to write, and then join them:

row = []
row.append(element['account']['name'])
if 'twitch' in element['account']:
    row.append(element['account']['twitch']['name'])
else:
    row.append('')
row.append(str(element['account']['challenges']['total']))

out.write(','.join(row))
out.write('\n')

Alternatively, since you are writing to a file anyway, you could just call out.write multiple times, once for each string part.
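
As a rough sketch of that alternative, the snippet below writes the parts of one row directly. The sample element is made up, and io.StringIO is used here only as a stand-in for the file object that codecs.open() returns:

```python
import io

# Hypothetical entry shaped like one element of `ladder`.
element = {'account': {'name': u'alice',
                       'twitch': {'name': u'alice_tv'},
                       'challenges': {'total': 3}}}

# In-memory text buffer standing in for the opened output file.
out = io.StringIO()

account = element['account']
out.write(account['name'])
out.write(u',')
if 'twitch' in account:
    out.write(account['twitch']['name'])  # left empty when there is no twitch entry
out.write(u',')
out.write(u'%d' % account['challenges']['total'])
out.write(u'\n')

line = out.getvalue()
print(repr(line))
```

File objects buffer their writes, so many small write calls do not necessarily mean many small disk operations.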

Finally, you could also look into the csv module if you are interested in writing out CSV data.
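
A hedged sketch of what that might look like. It is written for Python 3; the csv module in Python 2.7 does not support Unicode input, so there the values would have to be encoded to byte strings first. The sample data and the helper name write_ladder are assumptions for illustration:

```python
import csv
import io

# Hypothetical sample data shaped like the entries in `ladder`.
ladder = [
    {'account': {'name': 'alice', 'twitch': {'name': 'alice_tv'},
                 'challenges': {'total': 3}}},
    {'account': {'name': 'bob', 'challenges': {'total': 5}}},
    {'account': {'name': 'alice', 'twitch': {'name': 'alice_tv'},
                 'challenges': {'total': 3}}},  # duplicate, skipped
]


def write_ladder(entries, out):
    """Write one CSV row per unique account name to the file-like `out`."""
    writer = csv.writer(out, lineterminator='\n')
    seen = set()
    for element in entries:
        account = element['account']
        name = account['name']
        if name in seen:
            continue
        seen.add(name)
        twitch = account['twitch']['name'] if 'twitch' in account else ''
        # csv.writer converts non-string values such as the total with str()
        writer.writerow([name, twitch, account['challenges']['total']])


# io.StringIO stands in for a file opened with
# open('test.csv', 'w', encoding='utf-8', newline='').
buffer = io.StringIO()
write_ladder(ladder, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Compared with joining strings by hand, csv.writer also quotes values that contain commas or quotes, which matters if an account name can contain such characters.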

import codecs

seen    = set()   # account names already collected
results = []      # one CSV line per unique account

for user in ladder:
    acc  = user['account']
    name = acc['name']
    if name not in seen:
        seen.add(name)
        twitch_name = acc['twitch']['name'] if "twitch" in acc else ''
        challenges  = acc['challenges']['total']
        results.append("%s,%s,%d" % (name, twitch_name, challenges))

# Build the whole file content once and write it in a single call
with codecs.open('test.csv', 'w', encoding="UTF-8") as out:
    out.write("\n".join(results))
