Problem description
I have written a little script to distribute a workload across 4 worker processes and to test whether the results stay ordered (with respect to the order of the input):
from multiprocessing import Pool
import numpy as np
import time
import random

rows = 16
columns = 1000000
vals = np.arange(rows * columns, dtype=np.int32).reshape(rows, columns)

def worker(arr):
    # Sleep for a random amount of time to ensure that the
    # processes finish at different times.
    time.sleep(random.random())
    for idx in np.ndindex(arr.shape):
        arr[idx] += 1
    return arr

# The guard is required on platforms that start workers with "spawn".
if __name__ == "__main__":
    # create the process pool
    with Pool(4) as p:
        # schedule one worker call for each row in the original data
        q = p.map(worker, [row for row in vals])

    for idx, row in enumerate(q):
        print("[{:0>2}]: {: >8} - {: >8}".format(idx, row[0], row[-1]))
For me, this always results in:
[00]: 1 - 1000000
[01]: 1000001 - 2000000
[02]: 2000001 - 3000000
[03]: 3000001 - 4000000
[04]: 4000001 - 5000000
[05]: 5000001 - 6000000
[06]: 6000001 - 7000000
[07]: 7000001 - 8000000
[08]: 8000001 - 9000000
[09]: 9000001 - 10000000
[10]: 10000001 - 11000000
[11]: 11000001 - 12000000
[12]: 12000001 - 13000000
[13]: 13000001 - 14000000
[14]: 14000001 - 15000000
[15]: 15000001 - 16000000
Question: So, does Pool really preserve the order of the original input when it stores the results of each map call in q?
Sidenote: I am asking this because I need an easy way to parallelize work over several workers. In some cases the ordering is irrelevant. However, there are some cases where the results (like those in q) have to be returned in the original order, because I'm using an additional reduce function that relies on ordered data.
Performance: On my machine, this operation is about 4 times faster than normal execution in a single process (as expected, since I have 4 cores). Additionally, all 4 cores are at 100% usage while it runs.
Recommended answer
Pool.map results are ordered: they come back in the same order as the inputs. If you need order, great; if you don't, Pool.imap_unordered may be a useful optimization.
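As a rough illustration of the difference, the sketch below runs the same inputs through Pool.map and Pool.imap_unordered. The worker slow_square and the helper run_both are hypothetical names made up for this example, and the sleep durations are arbitrary; they only serve to make later inputs tend to finish first.

```python
import time
from multiprocessing import Pool


def slow_square(n):
    # Hypothetical worker: earlier inputs sleep longer, so later inputs
    # tend to finish first.
    time.sleep(0.02 * (8 - n))
    return n * n


def run_both(inputs):
    with Pool(4) as pool:
        # map returns a list in the same order as the inputs.
        ordered = pool.map(slow_square, inputs)
        # imap_unordered yields each result as soon as it is ready.
        unordered = list(pool.imap_unordered(slow_square, inputs))
    return ordered, unordered


if __name__ == "__main__":
    ordered, unordered = run_both(range(8))
    print("map:           ", ordered)
    print("imap_unordered:", unordered)
```

The list from map should always match the input order, while the list from imap_unordered contains the same values in whatever order the workers happened to finish.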
Note that while the order in which you receive the results from Pool.map is fixed, the order in which they are computed is arbitrary.
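One way to observe this is to have each task report when it finished. The following is a minimal sketch under assumptions: timed_worker and collect are hypothetical names, the sleep durations are arbitrary, and the finish times are wall-clock timestamps taken in separate processes, so they are only a rough indicator of completion order.

```python
import os
import time
from multiprocessing import Pool


def timed_worker(index):
    # Hypothetical worker: lower indexes sleep longer, so they tend to
    # finish later than higher indexes started at the same time.
    time.sleep(0.02 * (8 - index))
    return index, os.getpid(), time.time()


def collect(n_tasks=8, n_procs=4):
    with Pool(n_procs) as pool:
        # chunksize=1 hands tasks to the workers one at a time.
        return pool.map(timed_worker, range(n_tasks), chunksize=1)


if __name__ == "__main__":
    results = collect()
    print("received order:  ", [idx for idx, _, _ in results])
    by_finish = sorted(results, key=lambda r: r[2])
    print("completion order:", [idx for idx, _, _ in by_finish])
```

The received order follows the input, whereas the completion order depends on scheduling and on how long each task takes, so code inside the worker should not rely on tasks running in any particular sequence.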