could sum be faster on lists(列表上的总和可能更快)
问题描述
这是对 问题
首先,您会注意到您不能对字符串列表执行 sum
来连接它们,python 告诉您改用 str.join
,并且这是个好建议,因为无论您如何在字符串上使用 +
,性能都很差.
So first, you'll notice that you cannot perform a sum
on a list of strings to concatenate them, python tells you to use str.join
instead, and that's good advice because no matter how you use +
on strings, the performance is bad.
不能使用 sum
"的限制不适用于 list
,但 itertools.chain.from_iterable
是首选方式执行此类列表展平.
The "cannot use sum
" restriction doesn't apply to list
, and though, itertools.chain.from_iterable
is the preferred way to perform such list flattening.
但是 sum(x,[])
当 x
是一个列表的列表时肯定是错误的.
But sum(x,[])
when x
is a list of lists is definitively bad.
但它应该保持这种状态吗?
But should it stay that way?
我比较了 3 种方法
import time
import itertools
a = [list(range(1,1000)) for _ in range(1000)]
start=time.time()
sum(a,[])
print(time.time()-start)
start=time.time()
list(itertools.chain.from_iterable(a))
print(time.time()-start)
start=time.time()
z=[]
for s in a:
z += s
print(time.time()-start)
结果:
sum()
在列表列表中:10.46647310256958.好的,我们知道了.itertools.chain
:0.07705187797546387- 使用就地加法的自定义累积总和:0.057044029235839844(如您所见,可能比
itertools.chain
更快)
sum()
on the list of lists: 10.46647310256958. Okay, we knew.itertools.chain
: 0.07705187797546387- custom accumulated sum using in-place addition: 0.057044029235839844 (can be faster than
itertools.chain
as you see)
所以 sum
远远落后,因为它执行 result = result + b
而不是 result += b
So sum
is way behind because it performs result = result + b
instead of result += b
所以现在我的问题是:
为什么 sum
不能在可用的情况下使用这种累积方法?
Why can't sum
use this accumulative approach when available?
(这对于已经存在的应用程序来说是透明的,并且可以使用内置的 sum
来有效地展平列表)
(That would be transparent for already existing applications and would make possible the use of the sum
built-in to flatten lists efficiently)
推荐答案
我们可以尝试让 sum() 更智能,但 Alex Martelli 和 Guido van Rossum 想让它专注于算术求和.
We could try to make sum() smarter, but Alex Martelli and Guido van Rossum wanted to keep it focused on arithmetic summations.
FWIW,您应该使用这个简单的代码获得合理的性能:
FWIW, you should get reasonable performance with this simple code:
result = []
for seq in mylists:
result += seq
对于您的其他问题,为什么在可用时不能 sum 使用这种累积方法?",请参阅 Python/bltinmodule.c 中有关 builtin_sum() 的评论:
For your other question, "why can't sum use this accumulative approach when available?", see this comment for builtin_sum() in Python/bltinmodule.c:
/* It's tempting to use PyNumber_InPlaceAdd instead of
PyNumber_Add here, to avoid quadratic running time
when doing 'sum(list_of_lists, [])'. However, this
would produce a change in behaviour: a snippet like
empty = []
sum([[x] for x in range(10)], empty)
would change the value of empty. */
这篇关于列表上的总和可能更快的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:列表上的总和可能更快


基础教程推荐
- Dask.array.套用_沿_轴:由于额外的元素([1]),使用dask.array的每一行作为另一个函数的输入失败 2022-01-01
- 何时使用 os.name、sys.platform 或 platform.system? 2022-01-01
- 线程时出现 msgbox 错误,GUI 块 2022-01-01
- Python kivy 入口点 inflateRest2 无法定位 libpng16-16.dll 2022-01-01
- 用于分类数据的跳跃记号标签 2022-01-01
- 如何在海运重新绘制中自定义标题和y标签 2022-01-01
- 使用PyInstaller后在Windows中打开可执行文件时出错 2022-01-01
- 筛选NumPy数组 2022-01-01
- 如何让 python 脚本监听来自另一个脚本的输入 2022-01-01
- 在 Python 中,如果我在一个“with"中返回.块,文件还会关闭吗? 2022-01-01