Is there any way to do this faster?

Question

ladder has around 15,000 elements, and this code snippet takes 5-8 seconds to run. Is there any way to make it faster? I tried it without checking for duplicates and without building the accs list, and the time dropped to 2-3 seconds, but I don't want duplicates in the CSV file.
I am working in Python 2.7.9.

import codecs

accs = []
with codecs.open('test.csv', 'w', encoding="UTF-8") as out:
    row = ''
    for element in ladder:
        if element['account']['name'] not in accs:
            accs.append(element['account']['name'])
            row += element['account']['name']
            if 'twitch' in element['account']:
                row += "," + element['account']['twitch']['name'] + ","
            else:
                row += ",,"
            row += str(element['account']['challenges']['total']) + "\n"
    out.write(row)

Answer 1

You can't do much about the loop itself, since you need to go through every element in ladder anyway. However, you can improve this membership test:

if element['account']['name'] not in accs:

Since accs is a list, this test essentially loops through all items of accs to check whether the name is in there. And you do that for every element in ladder, so this can easily become very inefficient.

Instead, use a set rather than a list for accs, as this gives you constant-time membership lookups on average. That reduces your algorithm from quadratic to linear complexity. To do so, just use accs = set() and change your code to call accs.add() instead of append.
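To make the set-based check concrete, here is a minimal sketch of just the deduplication step. The helper name unique_names and the sample data are made up for illustration; only the accs = set() and accs.add() pattern comes from the advice above.

```python
def unique_names(ladder):
    """Return account names in first-seen order, skipping duplicates."""
    accs = set()   # set membership tests take constant time on average
    names = []
    for element in ladder:
        name = element['account']['name']
        if name not in accs:
            accs.add(name)
            names.append(name)
    return names


# Hypothetical sample data, shaped like the entries in ladder.
sample = [
    {'account': {'name': 'alice'}},
    {'account': {'name': 'bob'}},
    {'account': {'name': 'alice'}},
]
print(unique_names(sample))
```

The set is only used for the "have I seen this name?" question; the separate list keeps the original order, which a set on its own would not guarantee.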

Another issue is that you are doing string concatenation. Every time you do someString + "something" you are throwing away that string object and creating a new one. This can become inefficient for a large number of operations too. Instead, use a list to collect all the parts you want to write, and then join them:

row = []
row.append(element['account']['name'])
if 'twitch' in element['account']:
    row.append(element['account']['twitch']['name'])
else:
    row.append('')
row.append(str(element['account']['challenges']['total']))

out.write(','.join(row))
out.write('\n')
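As a rough sketch of how that fragment might be packaged, the row building can live in a small function that returns one finished line. The name format_row and the sample accounts are assumptions for illustration, not part of the original code.

```python
def format_row(account):
    """Build one CSV line from an account dict using a list and join."""
    parts = [account['name']]
    if 'twitch' in account:
        parts.append(account['twitch']['name'])
    else:
        parts.append('')          # keep the column, leave it empty
    parts.append(str(account['challenges']['total']))
    return ','.join(parts) + '\n'


# Hypothetical accounts, shaped like element['account'] in the question.
with_twitch = {'name': 'alice', 'twitch': {'name': 'alice_tv'},
               'challenges': {'total': 7}}
without_twitch = {'name': 'bob', 'challenges': {'total': 3}}

# Collect the lines in a list and join once at the end.
lines = [format_row(with_twitch), format_row(without_twitch)]
output = ''.join(lines)
```

Note that a plain join does not quote values, so a name containing a comma would produce a malformed row.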

Alternatively, since you are writing to a file anyway, you could just call out.write multiple times, once for each string part.

Finally, you could also look into the csv module if you are interested in writing out CSV data.
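A minimal sketch of the csv module approach follows. It is written for Python 3, which is an assumption on my part: under Python 2.7 the csv module works on byte strings, so the values would need to be encoded to UTF-8 first. The function name write_accounts and the sample data are hypothetical.

```python
import csv
import io


def write_accounts(ladder, out):
    """Write one CSV row per unique account name to the text stream out."""
    # Assumes Python 3, where csv.writer accepts a text stream.
    writer = csv.writer(out, lineterminator='\n')
    seen = set()
    for element in ladder:
        account = element['account']
        name = account['name']
        if name in seen:
            continue              # skip accounts already written
        seen.add(name)
        twitch_name = account['twitch']['name'] if 'twitch' in account else ''
        writer.writerow([name, twitch_name, account['challenges']['total']])


# Hypothetical sample data; the second name contains a comma on purpose.
sample = [
    {'account': {'name': 'alice', 'twitch': {'name': 'alice_tv'},
                 'challenges': {'total': 7}}},
    {'account': {'name': 'bob, jr.', 'challenges': {'total': 3}}},
    {'account': {'name': 'alice', 'challenges': {'total': 7}}},
]
buffer = io.StringIO()
write_accounts(sample, buffer)
result = buffer.getvalue()
```

The main benefit over manual joining is that the writer quotes fields containing commas or quote characters, so the file stays parseable. For a real file in Python 3, open it with newline='' as the csv documentation recommends.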

Answer 2

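import codecs

seen    = set()   # names already handled; set lookups are constant time on average
results = []      # one formatted CSV line per unique account

for user in ladder:
    acc  = user['account']
    name = acc['name']
    if name not in seen:
        seen.add(name)
        twitch_name = acc['twitch']['name'] if "twitch" in acc else ''
        challenges  = acc['challenges']['total']
        results.append("%s,%s,%d" % (name, twitch_name, challenges))

# Join once and write once, instead of growing a string inside the loop.
with codecs.open('test.csv','w', encoding="UTF-8") as out:
    out.write("\n".join(results))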

Answer 1
2 2015-02-14 01:12:29

Answer 2
2 accepted 2015-02-14 01:15:22
