Faster way to do .format()

I'm writing a program that does a lot of string formatting, and I've noticed that .format() takes a small but significant share of CPU time. Here's how I'm using it:

str = 'vn {0:.15f} {1:.15f} {2:.15f}\n'.format(normal_array[i].x, normal_array[i].y, normal_array[i].z)

Does anyone know of an even slightly faster way to do this? A small saving multiplied by 100,000 calls adds up.

Try replacing .format with a % expression and looking up normal_array[i] only once:

item = normal_array[i]
'vn %.15f %.15f %.15f\n' % (item.x, item.y, item.z)

Replacing index lookups with direct iteration over the values can also improve speed slightly:

for item in normal_array:
    'vn %.15f %.15f %.15f\n' % (item.x, item.y, item.z)
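For reference, here is a minimal self-contained sketch of the iteration approach in Python 3 syntax. The Normal type and the write_normals helper are hypothetical stand-ins (the question does not show how normal_array is built), and feeding the lines to writelines through a generator is my own choice rather than part of the answer above:

```python
import io
from collections import namedtuple

# Hypothetical stand-in for the elements of normal_array.
Normal = namedtuple("Normal", ["x", "y", "z"])

def write_normals(normals, out):
    """Write one 'vn x y z' line per normal using %-formatting."""
    fmt = "vn %.15f %.15f %.15f\n"  # template bound once, outside the loop
    # Iterate over the values directly instead of indexing normal_array[i].
    out.writelines(fmt % (n.x, n.y, n.z) for n in normals)

normal_array = [Normal(0.0, 1.0, 0.0), Normal(0.5, 0.25, 0.125)]
buf = io.StringIO()  # any object with writelines() works, e.g. an open file
write_normals(normal_array, buf)
print(buf.getvalue(), end="")
```

Any file-like object can be passed as out, so the same helper works for a real output file.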

Benchmark:

import collections
import random
import timeit

# Record type with x, y, z fields, created once instead of on every append.
Normal = collections.namedtuple('normal', ('x', 'y', 'z'))

def gen_data(n):
    # Build n records with random float components.
    return [Normal(random.random(), random.random(), random.random())
            for k in xrange(n)]

if __name__ == '__main__':
    times = 1000
    setup = 'from __main__ import gen_data; normal_array = gen_data(1000)'
    print 'format:'
    print timeit.Timer('for i in xrange(len(normal_array)):\n    str = "vn {0:.15f} {1:.15f} {2:.15f}\\n".format(normal_array[i].x, normal_array[i].y, normal_array[i].z)\n',
            setup).timeit(times)
    # The '%s' cases must apply the % operator: calling .format() on a
    # %-style template returns it unchanged, so no floats would be formatted.
    print '%s:'
    print timeit.Timer('for i in xrange(len(normal_array)):\n    str = "vn %.15f %.15f %.15f\\n" % (normal_array[i].x, normal_array[i].y, normal_array[i].z)\n',
            setup).timeit(times)
    print '%s+iteration:'
    print timeit.Timer('for o in normal_array:\n    str = "vn %.15f %.15f %.15f\\n" % (o.x, o.y, o.z)\n',
            setup).timeit(times)

Results, in seconds for 1000 timed runs (lower is better):

format:
5.34718108177
%s:
1.30601406097
%s+iteration:
1.23484301567
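If you want to repeat the comparison on a current interpreter, something along these lines should work. It is only a sketch in Python 3 syntax: the Normal type, make_data, and the benchmark helper are names I made up for illustration, and the numbers will depend on your machine and Python version, so none are claimed here.

```python
import timeit
from collections import namedtuple

Normal = namedtuple("Normal", ["x", "y", "z"])  # hypothetical stand-in type

def make_data(n):
    """Return n deterministic records so every run formats the same values."""
    return [Normal(i / n, (i + 1) / n, (i + 2) / n) for i in range(n)]

def with_format(data):
    return ["vn {0:.15f} {1:.15f} {2:.15f}\n".format(o.x, o.y, o.z) for o in data]

def with_percent(data):
    return ["vn %.15f %.15f %.15f\n" % (o.x, o.y, o.z) for o in data]

def benchmark(n=1000, repeats=5, number=20):
    """Time both variants on the same data; returns {name: best seconds}."""
    data = make_data(n)
    results = {}
    for func in (with_format, with_percent):
        # Taking the minimum of several repeats reduces timing noise.
        timings = timeit.repeat(lambda: func(data), repeat=repeats, number=number)
        results[func.__name__] = min(timings)
    return results

if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Both variants produce identical strings, so the comparison measures only the cost of the formatting call itself.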

You could also try migrating to PyPy; there is an article comparing string formatting in CPython and PyPy.

Try this (old-school) approach: replace .format() with % format directives:

str = 'vn %.15f %.15f %.15f\n' % (normal_array[i].x, normal_array[i].y, normal_array[i].z)

Using % seems to be faster:

timeit str='%.15f %.15f %.15f\n' % (a, b, c)
100000 loops, best of 3: 4.99 us per loop

timeit str2='{:.15f} {:.15f} {:.15f}\n'.format(a, b, c)
100000 loops, best of 3: 5.97 us per loop

Measured with Python 2.7.2 on Windows XP SP2; the variables a, b, and c are floats.

If the float conversion is still a bottleneck, you might try farming the formatting out to a multiprocessing.Pool and using the pool's map_async or imap method to collect and print the resulting strings. This will use all the cores on your machine for the formatting. Bear in mind, though, that the overhead of passing the data to and from the worker processes may cancel out the improvement from parallelizing the formatting.
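A rough sketch of what that could look like, in Python 3 syntax. The function names, the plain (x, y, z) tuples, and the chunksize value are assumptions of mine for illustration; whether this beats a single process depends on how much the inter-process transfer costs on your data, so measure before committing to it.

```python
from multiprocessing import Pool

def format_normal(normal):
    """Format one (x, y, z) tuple as a 'vn' line; runs in a worker process."""
    return "vn %.15f %.15f %.15f\n" % normal

def format_all(normals, out, processes=None, chunksize=1000):
    """Format normals in a process pool and write the lines to out, in order."""
    # processes=None lets Pool use one worker per available core.
    with Pool(processes=processes) as pool:
        # imap yields results in input order; a large chunksize reduces
        # the per-item cost of sending data between processes.
        out.writelines(pool.imap(format_normal, normals, chunksize=chunksize))
```

Call format_all from under an if __name__ == '__main__': guard, for example format_all(data, sys.stdout), since platforms that spawn worker processes re-import the main module.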
