Multiprocessing of image processing (EDITED)
I have the following program:
import os
from PIL import Image

daytime_images = os.listdir("D:/TR/Daytime/")
number_of_day_images = len(daytime_images)
day_value = 27
coordinates = []

def find_RGB_day(clouds, red, green, blue):
    img = Image.open(clouds)
    img = img.convert('RGB')
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            h, s, v, = img.getpixel((x, y))
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x,y))
    return pixels_single_photo

number = 0
for _ in range(number_of_day_images):
    world_image = ("D:/TR/Daytime/" + daytime_images[number])
    pixels_found = find_RGB_day(world_image, day_value, day_value, day_value)
    coordinates.append(pixels_found)
    number = number+1
EDIT: I want to run this function using multiprocessing, so I tried:
for number in range(number_of_day_images):
    p = multiprocessing.Process(
        target=find_RGB_day,
        args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number],27, 27, 27))
    p.start()
    p.join()
    number = number+1
    coordinates.append(p)
When I run it I get an AttributeError, and I don't know how to solve it:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'find_RGB_day' on <module '__main__' (built-in)>
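I think the error may be related to how I bring the images into the program: I get all the file names from a folder and then select each element by its index, with number = number + 1.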
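I was working on this in parallel with wagnifico and tried Pool (good guess). I don't have the image files, so I built a similar function using a namedtuple to get as close as possible.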
from multiprocessing import Pool, TimeoutError
import time
import random
from collections import namedtuple

# Stand-in for a real image; defined at module level so worker processes can see it
Image = namedtuple('Image', ['size'])

def find_RGB_day(red, green, blue):
    img = Image((256, 256))  # loading your image goes here eg. 256 x 256
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            h, s, v, = random.randrange(255), random.randrange(255), random.randrange(255)
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x,y))
    time.sleep(5)  # del this line, it imitates loading a big file
    return pixels_single_photo

if __name__ == '__main__':
    number_of_day_images = 5
    day_value = 27
    with Pool(processes=4) as pool:  # You can play with processes number
        multiple_results = [pool.apply_async(find_RGB_day, args=(27, 27, 27)) for i in range(number_of_day_images)]
        try:
            for res in multiple_results:
                print(res.get())
        except TimeoutError:
            print("We lacked patience and got a multiprocessing.TimeoutError")
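I used the Python documentation section on "Using a pool of workers", which you can find here.
So, instead of creating processes in a loop like this, the pool takes care of all the computation: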
for number in range(number_of_day_images):
    p = multiprocessing.Process(target=find_RGB_day, args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number], 27, 27, 27))
    p.start()
    p.join()
You can have something like this:
# inside the "with Pool(processes=4) as pool:" block
multiple_results = [pool.apply_async(find_RGB_day, args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number], 27, 27, 27))
                    for number in range(number_of_day_images)]
for res in multiple_results:
    print(res.get())
In my code example, 4 files are loaded in about 5 seconds and fill the multiple_results array (because we have 4 workers), and the last one completes about 5 seconds later.
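The timing follows from tasks running in waves of at most as many tasks as there are workers. Here is a rough estimate, assuming every task takes the same time and ignoring process start-up overhead (the helper name estimated_wall_time is made up for illustration):

```python
import math

def estimated_wall_time(n_tasks, n_workers, seconds_per_task):
    # Tasks run in waves of at most n_workers; each wave takes seconds_per_task.
    waves = math.ceil(n_tasks / n_workers)
    return waves * seconds_per_task

# 5 images, 4 workers, 5 seconds each: two waves
print(estimated_wall_time(5, 4, 5))  # 10
```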
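[EDIT] I have downloaded the images and used this code to get all the coordinates of the desired pixels. (27, 27, 27) was too low for me, so I used different thresholds (31, 90, 170).
Enjoy.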
from multiprocessing import Pool, TimeoutError
import os
from PIL import Image

def find_RGB_day(clouds, red, green, blue):
    img = Image.open(clouds)
    img = img.convert('RGB')  # make sure every pixel is an (R, G, B) triple
    pixels_single_photo = []
    for x in range(img.size[0]):
        for y in range(img.size[1]):
            # print(img.getpixel((x, y)))
            h, s, v, = img.getpixel((x, y))
            if h <= red and s <= green and v <= blue:
                pixels_single_photo.append((x,y))
    return pixels_single_photo

def create_pool():
    coordinates = []
    with Pool(processes=4) as pool:
        files_to_process = [pool.apply_async(find_RGB_day,
                                             args=("D/TR/Daytime/" + daytime_images[number], 31, 90, 170))
                            for number in range(number_of_day_images)]
        try:
            coordinates = [res.get() for res in files_to_process]  # processes get your data in here
        except TimeoutError:
            print("We lacked patience and got a multiprocessing.TimeoutError")
    return coordinates

if __name__ == '__main__':
    daytime_images = os.listdir("D/TR/Daytime/")
    number_of_day_images = len(daytime_images)
    print(number_of_day_images)
    day_value = 27
    coordinates = create_pool()
    for res in coordinates:
        print(res)
You should do what the error message indicates and add a main-module guard:
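import multiprocessing

def fun(inputs):
    # your function
    outputs = inputs  # placeholder: compute your result here
    return outputs

if __name__ == '__main__':
    # your main code
    inputs = None  # placeholder: your input data
    p = multiprocessing.Process(target=fun, args=(inputs,))
    p.start()
    p.join()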
More details can be found in the answers to this question.
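Also, there is an error in your code. When creating the process, you must pass the function and its arguments separately: target=fun, args=(inputs,), as I did in the example above. When you pass the function together with its arguments, as in target=fun(inputs), you are not actually running it in any process, because you are passing the output of fun as the target rather than the function itself. It raises an error because the output of your function is not callable (it is not itself a function).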
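The difference can be checked without starting any process. This is a minimal sketch in which fun is a hypothetical stand-in for your function: the function object is callable, but its return value is not, which is why the return value cannot serve as target.

```python
def fun(inputs):
    # Hypothetical stand-in for find_RGB_day: returns a list, like the real function.
    return [inputs]

# Correct: the function object itself is what Process should receive as target.
print(callable(fun))       # True

# Wrong: fun(1) runs immediately in the parent and yields a list, which is not callable.
result = fun(1)
print(callable(result))    # False
```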
To adapt this to your call, which uses multiple arguments, you can use:
for number in range(number_of_day_images):
    p = multiprocessing.Process(
        target=find_RGB_day,
        args=("D:/TR/IR_Photos/Daytime/" + daytime_images[number],
              27, 27, 27)
    )
    # rest of your code ...
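Also, I suggest you use pool.Pool.map (from the multiprocessing module), which splits the argument list among the desired number of workers and blocks until the results are ready. A good description of how to use it with functions that take multiple arguments is available here.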
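For a function with several arguments, Pool.starmap is one way to do this: it unpacks each tuple of an argument list into the function's positional arguments and blocks until all results are ready. This is a minimal sketch, assuming a simplified stand-in count_dark_pixels that works on an in-memory list of RGB tuples instead of opening image files (that function, build_args, and the sample data are all made up for illustration):

```python
from multiprocessing import Pool

def count_dark_pixels(pixels, red, green, blue):
    # Hypothetical stand-in for find_RGB_day: counts pixels at or below the thresholds.
    return sum(1 for r, g, b in pixels if r <= red and g <= green and b <= blue)

def build_args(images, red, green, blue):
    # One argument tuple per image; starmap unpacks each tuple into the call.
    return [(pixels, red, green, blue) for pixels in images]

if __name__ == '__main__':
    # Made-up sample data: two tiny "images" as lists of (R, G, B) tuples.
    images = [[(10, 10, 10), (200, 200, 200)], [(5, 20, 27), (27, 27, 27)]]
    with Pool(processes=2) as pool:
        # Blocks until every worker has finished; results keep the input order.
        counts = pool.starmap(count_dark_pixels, build_args(images, 27, 27, 27))
    print(counts)  # [1, 2]
```

With real files, the argument tuples would hold the image paths instead of pixel lists, and the worker function would open each file itself.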