Problem description
I have a simulation that is currently running, but the ETA is about 40 hours, so I'm trying to speed it up with multiprocessing.
It iterates over 3 values of one variable (L) and over 99 values of a second variable (a). For each pair of values, it runs a complex simulation and returns 9 different standard deviations. So, even though I haven't coded it that way yet, it is essentially a function that takes two values as inputs (L, a) and returns 9 values.
Here is the essence of the code I have:
STD_1 = []
STD_2 = []
# etc.
for L in range(0, 6, 2):
    for a in range(1, 100):
        ### simulation code ###
        STD_1.append(value_1)
        STD_2.append(value_2)
        # etc.
Here is what I can modify it to:
master_list = []
def simulate(a, L):
    ### simulation code ###
    return (a, L, STD_1, STD_2)  # etc.
for L in range(0, 6, 2):
    for a in range(1, 100):
        master_list.append(simulate(a, L))
Since each simulation is independent, it seems like an ideal place to implement some sort of multi-threading/processing.
How exactly would I go about coding this?
Also, will everything be returned to the master list in order, or could it be out of order if multiple processes are working?
EDIT 2: This is my code, but it doesn't run correctly. It asks if I want to kill the program right after I run it.
import multiprocessing
data = []
for L in range(0, 6, 2):
    for a in range(1, 100):
        data.append((L, a))
print(data)
def simulation(arg):
    # unpack the tuple
    a = arg[1]
    L = arg[0]
    STD_1 = a**2
    STD_2 = a**3
    STD_3 = a**4
    # simulation code #
    return (STD_1, STD_2, STD_3)
print("1")
p = multiprocessing.Pool()
print("2")
results = p.map(simulation, data)
EDIT 3: Also, what are the limitations of multiprocessing? I've heard that it doesn't work on OS X. Is this correct?
Recommended answer
- Wrap the data for each iteration up into a tuple.
- Make a list data of those tuples.
- Write a function f to process one tuple and return one result.
- Create a p = multiprocessing.Pool() object.
- Call results = p.map(f, data)
This runs f in separate worker processes; by default the pool has as many workers as your machine has cores.
Edit1: Example:
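from multiprocessing import Pool

data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]

def f(t):
    # Unpack one tuple and return one result.
    name, a, b, c = t
    return (name, a + b + c)

if __name__ == '__main__':
    # The guard keeps worker processes from re-running this block on import.
    p = Pool()
    results = p.map(f, data)
    print(results)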
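Applied to the (L, a) grid from the question, the same steps might look like the sketch below. The body of simulate is placeholder arithmetic, not the real simulation, and build_grid is a hypothetical helper introduced only for this illustration. On the ordering question: Pool.map returns results in the same order as the input list, regardless of which worker finishes first, and returning the inputs alongside the outputs makes each result self-describing in any case.

```python
from itertools import product
from multiprocessing import Pool


def simulate(params):
    """Hypothetical stand-in for the real simulation."""
    L, a = params
    # Placeholder arithmetic; the real code would compute the
    # standard deviations here.
    std_1 = a ** 2 + L
    std_2 = a ** 3 + L
    # Return the inputs too, so each result can be identified later.
    return (L, a, std_1, std_2)


def build_grid():
    """All (L, a) pairs: 3 values of L times 99 values of a."""
    return list(product(range(0, 6, 2), range(1, 100)))


if __name__ == "__main__":
    data = build_grid()
    with Pool() as pool:
        master_list = pool.map(simulate, data)
    # map preserves input order, so results line up with data.
    assert [(r[0], r[1]) for r in master_list] == data
    print(master_list[:3])
```

If order does not matter and results should be handled as soon as they are ready, pool.imap_unordered is an alternative; it yields results in completion order instead.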
Multiprocessing should work fine on UNIX-like platforms such as OS X. Only platforms that lack os.fork (mainly MS Windows) need special attention, but even there it still works. See the multiprocessing documentation.
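The "special attention" mostly means protecting the entry point. Where workers are started with the spawn method (Windows, and by default macOS since Python 3.8), each worker re-imports the main module, so any module-level code that creates a Pool runs again in every worker. That is one possible reason, though not a confirmed one, for the script in EDIT 2 misbehaving. A minimal sketch of that script with the guard added, assuming the same stand-in formulas and a hypothetical main function:

```python
import multiprocessing


def simulation(arg):
    # Unpack the (L, a) tuple; L is unused by these stand-in formulas.
    L, a = arg
    return (a ** 2, a ** 3, a ** 4)


def main():
    data = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
    # Shows which start method this platform uses (fork, spawn, ...).
    print("start method:", multiprocessing.get_start_method())
    with multiprocessing.Pool() as pool:
        results = pool.map(simulation, data)
    print(len(results), results[:2])


if __name__ == "__main__":
    # Only the parent process runs main(); workers that re-import
    # this module skip it.
    main()
```

Running it as a script from a terminal, rather than inside an interactive shell or IDE console, also tends to avoid problems with workers being unable to import the target function.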