How do I avoid this pickling error, and what is the best way to parallelize this code in Python?
Problem description
I have the following code.
def main():
    (minI, maxI, iStep, minJ, maxJ, jStep, a, b, numProcessors) = sys.argv
    for i in range(minI, maxI, iStep):
        for j in range(minJ, maxJ, jStep):
            p = multiprocessing.Process(target=functionA, args=(minI, minJ))
            p.start()
            def functionB((a, b)):
                subprocess.call('program1 %s %s %s %s %s %s' %(c, a, b, 'file1',
                                'file2', 'file3'), shell=True)
                for d in ['a', 'b', 'c']:
                    subprocess.call('program2 %s %s %s %s %s' %(d, 'file4', 'file5',
                                    'file6', 'file7'), shell=True)
            abProduct = list(itertools.product(range(0, 10), range(0, 10)))
            pool = multiprocessing.Pool(processes=numProcessors)
            pool.map(functionB, abProduct)
It produces the following error.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 255, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
The contents of functionA are unimportant and do not produce an error. The error seems to occur when I try to map functionB. How do I remove this error, and what is the best way to parallelize this code in Python 2.6?
Recommended answer
The most likely cause of this behavior is the order and scope in which you define your pool, objects, and functions. multiprocessing is not quite the same as using threads: each worker process is spawned with its own copy of the environment, and tasks are sent to it by pickling. If you define a function in a scope that is not available to the worker processes (for example, nested inside another function or a loop body), or create objects before the pool, the pool can fail.
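A quick check with the standard pickle module illustrates the scope problem. pickle stores a function as a reference to its module and name rather than by value, so a function that only exists inside another function's scope cannot be looked up again. The helper names below (top_level, make_nested, can_pickle) are made up for illustration, and the sketch is written for Python 3, where the exception type differs from the Python 2.6 traceback above.

```python
import pickle


def top_level(x):
    """Module-level function: pickle can reference it by module and name."""
    return x


def make_nested():
    """Return a function that only exists inside this function's scope."""
    def nested(x):
        return x
    return nested


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Python 2 raises PicklingError here; Python 3 versions raise
        # AttributeError or PicklingError for local functions.
        return False


if __name__ == "__main__":
    print(can_pickle(top_level))
    print(can_pickle(make_nested()))
```

Run as a script, this should report that the module-level function can be pickled and the nested one cannot, which is the same distinction the pool runs into.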
First, try creating one pool before your big loop:
(minI, maxI, iStep, minJ, maxJ, jStep, a, b, numProcessors) = sys.argv
pool = multiprocessing.Pool(processes=numProcessors)
for i in range(minI, maxI, iStep):
    ...
Then, move your target callable outside the dynamic loop:
def functionB(args):
    # pool.map passes each (a, b) tuple as a single argument
    a, b = args
    ...

def main():
    ...
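Putting the two steps together, a minimal sketch might look like the following. It is written for Python 3 rather than 2.6, and the child Python process is only a stand-in for program1 so that the example runs anywhere; the 3x3 grid and the default of two workers are arbitrary choices as well. Note also that values read from sys.argv are strings, so something like numProcessors would need int(...) before being passed to Pool.

```python
import itertools
import multiprocessing
import subprocess
import sys


def functionB(args):
    """Run one (a, b) job; defined at module level so the pool can pickle it."""
    a, b = args
    # Stand-in for the question's `program1 ...` command (an assumption):
    # a child Python process that exits with status (a + b) % 2.
    cmd = [sys.executable, "-c", "import sys; sys.exit(%d)" % ((a + b) % 2)]
    return subprocess.call(cmd)


def main(numProcessors=2):
    """Create the pool once, map every (a, b) pair over it, return exit codes."""
    abProduct = list(itertools.product(range(0, 3), range(0, 3)))
    pool = multiprocessing.Pool(processes=numProcessors)
    try:
        return pool.map(functionB, abProduct)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # The guard keeps worker processes from re-running main() on platforms
    # that start workers by importing this module.
    print(main())
```

The pool is created once in main() instead of inside the nested loops, and functionB takes a single tuple because pool.map hands each item of the iterable to the callable as one argument.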
Consider this example...
Broken
import multiprocessing

def broken():
    vals = [1,2,3]

    # test is local to broken(), so worker processes cannot look it up
    def test(x):
        return x

    pool = multiprocessing.Pool()
    output = pool.map(test, vals)
    print(output)

broken()
# PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
Working
import multiprocessing

# test is defined at module level, so it can be pickled by reference
def test(x):
    return x

def working():
    vals = [1,2,3]
    pool = multiprocessing.Pool()
    output = pool.map(test, vals)
    print(output)

working()
# [1, 2, 3]