class attributes and memory shared between Processes in process pool?
class attributes and memory shared between Processes in process pool?
I have a class A
that when initiated changes a mutable class attribute nums
.
A
nums
when initiating the class via a Process pool with maxtasksperchild
= 1
, I notice that nums
has the values of several different processes. which is an undesirable behavior for me.
maxtasksperchild
= 1
nums
my questions are:
maxtasksperchild
EDIT: I am guessing that that the pool pickles the previous processes it started (and not the original one) and thus saving the values of nums
, is that correct? and if so, how can i force it to use the original process?
nums
here is an example code:
from multiprocessing import Pool
class A:
nums =
def __init__(self, num=None):
self.__class__.nums.append(num) # I use 'self.__class__' for the sake of explicitly
print(self.__class__.nums)
assert len(self.__class__.nums) < 2 # checking that they don't share memory
if __name__ == '__main__':
with Pool(maxtasksperchild=1) as pool:
pool.map(A, range(99)) # the assert is being raised
EDIT because of answer by k.wahome: using instance attributes doesn't answer my question I need to use class attributes because in my original code (not shown here) i have several instances per process. my question is specifically about the workings of a multiprocessing pool.
btw, doing the following does work
from multiprocessing import Process
if __name__ == '__main__':
prs =
for i in range(99):
pr = Process(target=A, args=[i])
pr.start()
prs.append(pr)
[pr.join() for pr in prs]
# the assert was not raised
2 Answers
2
Your observation has another reason. The values in nums
are not from other processes but from the same process when it starts hosting multiple instances of A. This happens because you didn't set chunksize
to 1 in your pool.map
-call.
Setting maxtasksperchild=1
is not enough in your case because one task still consumes a whole chunk of the iterable.
nums
chunksize
pool.map
maxtasksperchild=1
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer. docs about map
The sharing is most likely coming in via the mapped class A
with a class attribute nums
.
A
nums
Class attributes are class bound thus belong to the class itself, are created when the class is loaded and they will be shared by all the instances. All objects will have the same memory reference to a class attribute.
Unlike class attributes, instance attributes are instance bound and not shared by various instances. Every instance has its own copy of the instance attribute.
See the class vs instance attribute effect:
1. Using nums
as a class attribute class_num.py
nums
from multiprocessing import Pool
class A:
nums =
def __init__(self, num=None):
# I use 'self.__class__' for the sake of explicitly
self.__class__.nums.append(num)
print("nums:", self.__class__.nums)
# checking that they don't share memory
assert len(self.__class__.nums) < 2
if __name__ == '__main__':
with Pool(maxtasksperchild=1) as pool:
print(pool)
pool.map(A, range(99)) # the assert is being raised
Running this script
>>> python class_num.py
nums: [0]
nums: [0, 1]
nums: [4]
nums: [4, 5]
nums: [8]
nums: [8, 9]
nums: [12]
nums: [12, 13]
nums: [16]
nums: [16, 17]
nums: [20]
nums: [20, 21]
nums: [24]
nums: [24, 25]
nums: [28]
nums: [28, 29]
nums: [32]
nums: [32, 33]
nums: [36]
nums: [36, 37]
nums: [40]
nums: [40, 41]
nums: [44]
nums: [44, 45]
nums: [48]
nums: [48, 49]
nums: [52]
nums: [52, 53]
nums: [56]
nums: [56, 57]
nums: [60]
nums: [60, 61]
nums: [64]
nums: [64, 65]
nums: [68]
nums: [68, 69]
nums: [72]
nums: [72, 73]
nums: [76]
nums: [76, 77]
nums: [80]
nums: [80, 81]
nums: [84]
nums: [84, 85]
nums: [88]
nums: [88, 89]
nums: [92]
nums: [92, 93]
nums: [96]
nums: [96, 97]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "class_num.py", line 12, in __init__
assert len(self.__class__.nums) < 2
AssertionError
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "class_num.py", line 18, in <module>
pool.map(A, range(99)) # the assert is being raised
File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 608, in get
raise self._value
AssertionError
2. Using nums
as an instance attribute instance_num.py
nums
from multiprocessing import Pool
class A:
def __init__(self, num=None):
self.nums =
if num is not None:
self.nums.append(num)
print("nums:", self.nums)
# checking that they don't share memory
assert len(self.nums) < 2
if __name__ == '__main__':
with Pool(maxtasksperchild=1) as pool:
pool.map(A, range(99)) # the assert is being raised
Running this script
>>> python instance_num.py
nums: [0]
nums: [1]
nums: [2]
nums: [3]
nums: [4]
nums: [5]
nums: [6]
nums: [7]
nums: [8]
nums: [9]
nums: [10]
nums: [11]
nums: [12]
nums: [13]
nums: [14]
nums: [15]
nums: [16]
nums: [17]
nums: [18]
nums: [19]
nums: [20]
nums: [21]
nums: [22]
nums: [23]
nums: [24]
nums: [25]
nums: [26]
nums: [27]
nums: [28]
nums: [29]
nums: [30]
nums: [31]
nums: [32]
nums: [33]
nums: [34]
nums: [35]
nums: [36]
nums: [37]
nums: [38]
nums: [39]
nums: [40]
nums: [41]
nums: [42]
nums: [43]
nums: [44]
nums: [45]
nums: [46]
nums: [47]
nums: [48]
nums: [49]
nums: [50]
nums: [51]
nums: [52]
nums: [53]
nums: [54]
nums: [55]
nums: [56]
nums: [57]
nums: [58]
nums: [59]
nums: [60]
nums: [61]
nums: [62]
nums: [63]
nums: [64]
nums: [65]
nums: [66]
nums: [67]
nums: [68]
nums: [69]
nums: [70]
nums: [71]
nums: [72]
nums: [73]
nums: [74]
nums: [75]
nums: [76]
nums: [77]
nums: [78]
nums: [79]
nums: [80]
nums: [81]
nums: [82]
nums: [83]
nums: [84]
nums: [85]
nums: [86]
nums: [87]
nums: [88]
nums: [89]
nums: [90]
nums: [91]
nums: [92]
nums: [93]
nums: [94]
nums: [95]
nums: [96]
nums: [97]
nums: [98]
my question is specifically about the workings of a multiprocessing pool.
– moshevi
Aug 26 at 11:10
Oh I see. You get a pool of worker processes from
Pool()
. map
allows you to achieve execution and data parallelism by essentially applying a function to each element in an iterable and returning the results. So you supply to it the function and the arguments and each child process spawned in the pool will be able to successfully import.– k.wahome
Aug 26 at 11:29
Pool()
map
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
i understand the difference between class attributes and instance attributes, however this doesn't answer my question. I need to use class attributes because in my original code (not shown here) i have several instances per process.
– moshevi
Aug 26 at 11:09