My code:

import threading

class myclass:
    def __init__(self):
        # set the lock first, bypassing the decorated __setattr__ below;
        # otherwise the wrapper would try to acquire a lock that does not exist yet
        object.__setattr__(self, 'semaphore', threading.Semaphore())
        self.x = {}
        self.y = []

    def __semaphore(func):
        def wrapper(*args, **kw):
            args[0].semaphore.acquire()
            ret = func(*args, **kw)
            args[0].semaphore.release()
            return ret
        return wrapper

    @__semaphore
    def __setattr__(self, name, value):
        super().__setattr__(name, value)

    @__semaphore
    def save_to_disk(self):
        """ access to my_class.x and my_class.y """

my_class = myclass()
my_class.x['a'] = 123
With the code above, I'm trying to use a semaphore to protect my x and y whenever save_to_disk is called. But when I call my_class.x['a'] = 123, my_class.__setattr__ is not called, so my x is not protected.

I have 2 questions:

1. When my_class.x['a'] = 123 runs, which Python function is called?
2. How can I protect x and y in my_class only, not the global list and dict types? My x and y might also have a list or a dict nested inside them.

Update: some context for the code above. I want to create a kernel-like AI. The AI must do 2 jobs at the same time: one is collecting all the information I give it; the other is saving that information to disk when a threshold is reached (I do not want it to eat all my RAM).
What I tried to do:

1. Write classes that inherit from dict and list to replace {} and [], but that requires replacing every {} and [] by hand, which is not efficient.
2. Override dict.__setitem__, list.append, etc., but I do not know what else will come up.

TLDR: It is not useful to do this on myclass methods alone, since myclass is not the only class involved. my_class.x['a'] = 123 is equivalent to this:
def set_x_a(obj: myclass, value):
    x = obj.__getattribute__('x')  # fetch `x` via a `myclass` attribute-access method
    x.__setitem__('a', value)      # set `'a'` via a `type(x)` (here: dict) method

set_x_a(my_class, 123)

Note how the call to my_class.__getattribute__ has already completed by the time x.__setitem__ is called. Any synchronisation internal to myclass methods is thus of the wrong scope.
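This can be observed directly: only attribute assignment goes through __setattr__, while item assignment goes through the dict's own __setitem__. A minimal sketch (the class names are illustrative, not from the question):

```python
calls = []  # records which special method actually ran

class TracingDict(dict):
    """dict subclass that records item assignment."""
    def __setitem__(self, key, value):
        calls.append(f"TracingDict.__setitem__({key!r})")
        super().__setitem__(key, value)

class Holder:
    """Records attribute assignment, like the decorated __setattr__ in the question."""
    def __setattr__(self, name, value):
        calls.append(f"Holder.__setattr__({name!r})")
        super().__setattr__(name, value)

h = Holder()
h.x = TracingDict()  # attribute assignment -> Holder.__setattr__ runs
h.x['a'] = 123       # item assignment -> TracingDict.__setitem__ runs instead
```

After these two statements, calls contains one entry from each class: the second statement never touched Holder.__setattr__.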
You can protect class attributes from concurrent access by only giving access to them inside synchronised blocks.

Python's basic means of synchronising a block is the with statement, which can for example be used with threading locks. To simplify creating a custom block, contextlib.contextmanager lets you write one as a single generator function (instead of a class with two methods). Finally, a property allows adding behaviour, such as synchronisation, to attribute access.
import sys
import threading
from contextlib import contextmanager

class Synchronized:
    def __init__(self):
        self._x = {}  # actual data, stored internally
        self._mutex = threading.RLock()

    @property
    @contextmanager
    def x(self):  # public behaviour of the data
        with self._mutex:  # only give access when synchronised
            yield self._x

    def save(self, file=sys.stdout):
        with self._mutex:  # only access internally when synchronised
            file.write(str(self._x))
The important change is that the dict attribute is no longer directly exposed. It is only available while holding the lock.
synced = Synchronized()

with synced.x as x:
    x['a'] = 123
    x['b'] = 42

synced.save()
You can extend this pattern to additional attributes, and improve the protection of the attributes. For example, you can yield a copy or a collections.ChainMap of self._x, and explicitly update the internal state from it at the end of the block -- thus invalidating the effect of any external references kept afterwards.
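A minimal sketch of that copy-and-commit variant (the class name and the plain-copy commit strategy are illustrative choices, not prescribed by the pattern):

```python
import threading
from contextlib import contextmanager

class CopyOnWrite:
    def __init__(self):
        self._x = {}
        self._mutex = threading.RLock()

    @property
    @contextmanager
    def x(self):
        with self._mutex:
            draft = dict(self._x)  # hand out a copy, not the internal dict
            yield draft
            self._x = dict(draft)  # commit the draft back at the end of the block

store = CopyOnWrite()
with store.x as x:
    x['a'] = 123
```

Because the commit takes its own copy, mutating a reference to the draft after the with block no longer affects the stored state.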
"when I call my_class.x['a'] = 123, which python function is called?"

__getattribute__(self, item) is called first.
"I want to create a kernel-like AI. The AI must do 2 jobs at the same time: one is collecting all the information I give it; the other is saving that information to disk when a threshold is reached (I do not want it to eat all my RAM)."
The problem is that two threads want to share the same variables, right? If so, maybe you can let only one thread do its work at a time; then you don't need to worry about the resources changing underneath you. For example:
import threading
import numpy as np
from time import time, sleep

def get_data(share_list, share_dict):
    num_of_data = 0
    while num_of_data < 6:
        t_s = time()
        if is_writing_flag.is_set():
            sleep(REFRESH_TIME)
            continue
        while 1:
            data = np.random.normal(1, 1, (10,))
            threshold = all(data > 1.6)
            if threshold:
                share_list.append(data)
                share_dict['time'] = time() - t_s
                num_of_data += 1
                is_writing_flag.set()
                break
    close_keeper_flag.clear()

def data_keeper(share_list, share_dict):
    while close_keeper_flag.is_set():
        while is_writing_flag.is_set():
            # save as csv, json, yaml...
            print(share_list.pop())
            print(share_dict['time'])
            is_writing_flag.clear()
        sleep(REFRESH_TIME)

def main():
    share_list = []
    share_dict = {}
    td_collect_data = threading.Thread(target=get_data, name='collect some data', args=[share_list, share_dict])
    td_data_keeper = threading.Thread(target=data_keeper, name='save data.', args=[share_list, share_dict])
    for th in (td_collect_data, td_data_keeper):
        th.start()

if __name__ == '__main__':
    REFRESH_TIME = 0.2
    is_writing_flag = threading.Event()
    is_writing_flag.clear()
    close_keeper_flag = threading.Event()
    close_keeper_flag.set()
    main()
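For the record, the standard-library idiom for handing data between a producer thread and a consumer thread is queue.Queue, which is already thread-safe, so no hand-rolled Event flags are needed. A minimal sketch (function names are mine):

```python
import queue
import threading

def producer(q):
    for i in range(5):
        q.put(i)   # put() is thread-safe; no extra flag needed
    q.put(None)    # sentinel: tell the consumer there is no more data

def consumer(q, out):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        out.append(item)  # here you could write to disk instead

q = queue.Queue()
out = []
threads = [threading.Thread(target=producer, args=(q,)),
           threading.Thread(target=consumer, args=(q, out))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The queue serialises access internally, so the producer can keep collecting while the consumer drains items in FIFO order.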
But I would prefer using asyncio to handle this, for example:
import asyncio
import numpy as np
from time import time

async def take_data(num_of_data):
    count = 0
    t_s = time()
    while 1:
        if count == num_of_data:
            break
        data = await collect_data()
        cost_time = time() - t_s
        yield list(data), dict(time=cost_time)
        t_s = time()
        count += 1

async def collect_data():
    while 1:
        data = np.random.normal(1, 1, (10,))
        threshold = all(data > 1.6)
        if threshold:
            break
        await asyncio.sleep(0)  # yield control so other tasks are not starved
    return data

async def ai_process():
    async for res_list, res_dict in take_data(5):
        print(res_dict['time'])
        # save_to_desktop()
        ...

def main():
    asyncio.run(ai_process())

if __name__ == '__main__':
    main()
If this is still not useful to you at all, I will delete the answer. If you have any questions, please let me know, thank you.