Python Pool with worker Processes

2023-03-14, Python development questions

Question

I am trying to use a worker Pool in python using Process objects. Each worker (a Process) does some initialization (takes a non-trivial amount of time), gets passed a series of jobs (ideally using map()), and returns something. No communication is necessary beyond that. However, I can't seem to figure out how to use map() to use my worker's compute() function.

from multiprocessing import Pool, Process

class Worker(Process):
    def __init__(self):
        print('Worker started')
        # do some initialization here
        super(Worker, self).__init__()

    def compute(self, data):
        print('Computing things!')
        return data * data

if __name__ == '__main__':
    # This works fine
    worker = Worker()
    print(worker.compute(3))

    # workers get initialized fine
    pool = Pool(processes = 4,
                initializer = Worker)
    data = range(10)
    # How to use my worker pool?
    result = pool.map(compute, data)

Is a job queue the way to go instead, or can I use map()?

Recommended answer

I would suggest that you use a Queue for this.

from multiprocessing import Process, Queue

class Worker(Process):
    def __init__(self, queue):
        super(Worker, self).__init__()
        self.queue = queue

    def run(self):
        print('Worker started')
        # do some initialization here

        print('Computing things!')
        # iter() with a sentinel keeps calling queue.get() until it returns None
        for data in iter(self.queue.get, None):
            # Use data
            pass

Now you can start a pile of these, all getting work from a single queue:

request_queue = Queue()
for i in range(4):
    Worker(request_queue).start()
for data in the_real_source:
    request_queue.put(data)
# Sentinel objects to allow clean shutdown: 1 per worker.
for i in range(4):
    request_queue.put(None)
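
The snippets above only consume data. To get results back, one possible extension is a second queue for the answers. The following is a minimal sketch, not part of the original answer: `result_queue`, `compute_all`, and the index tagging are hypothetical names and choices introduced for illustration, and squaring stands in for the real computation.

```python
from multiprocessing import Process, Queue


class Worker(Process):
    """Pulls (index, data) jobs from one queue and pushes (index, result) to another."""

    def __init__(self, request_queue, result_queue):
        super().__init__()
        self.request_queue = request_queue
        self.result_queue = result_queue

    def run(self):
        # Expensive one-time initialization would go here; run() executes
        # once per worker, inside the child process.
        for index, data in iter(self.request_queue.get, None):  # None = shutdown sentinel
            self.result_queue.put((index, data * data))


def compute_all(items, n_workers=4):
    """Hypothetical helper: square every item using n_workers processes."""
    items = list(items)
    request_queue, result_queue = Queue(), Queue()
    workers = [Worker(request_queue, result_queue) for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    for index, item in enumerate(items):
        request_queue.put((index, item))
    for _ in workers:
        request_queue.put(None)  # one sentinel per worker
    # Collect every result before joining, so no worker is left blocked
    # while flushing its output.
    results = [None] * len(items)
    for _ in items:
        index, value = result_queue.get()
        results[index] = value
    for worker in workers:
        worker.join()
    return results


if __name__ == '__main__':
    print(compute_all(range(10)))
```

Tagging each job with its index lets the caller restore input order, since workers finish in no particular order.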

That kind of thing should allow you to amortize the expensive startup cost across multiple workers.
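
As for whether map() can be used at all: it probably can, as long as the per-worker setup is moved into a plain function. `Pool` runs its `initializer` once in each pool process, so that function can store the expensive state in a module-level variable, and `pool.map` can then call an ordinary module-level function that reads it. The sketch below assumes this layout; `init_worker`, `_state`, and `scale` are hypothetical names, and the dictionary stands in for whatever the real initialization produces.

```python
from multiprocessing import Pool

_state = None  # per-process state, filled in by init_worker


def init_worker(scale):
    """Runs once in every pool process; stands in for the expensive setup."""
    global _state
    _state = {'scale': scale}


def compute(data):
    """Module-level function, so pool.map can send a reference to it."""
    return _state['scale'] * data * data


if __name__ == '__main__':
    with Pool(processes=4, initializer=init_worker, initargs=(1,)) as pool:
        print(pool.map(compute, range(10)))
```

This keeps map() and its ordered results, at the cost of holding the worker state in a global rather than in a `Process` subclass.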
