Restarting or replacing a stopped thread in Python3?
In the following, a thread is started which soon stops because of a raised exception. How can the main section detect when the thread has failed and create a new replacement, while keeping pause() in effect?
Code:
#!/usr/bin/env python3
from threading import Thread
from time import sleep
from signal import pause

def pool():
    while True:
        print("Threading")
        sleep(4)
        raise OSError("Uff da")

sample = Thread(target=pool, daemon=True, name="working")
sample.start()
pause()
Last edited by Turbocapitalist; 05-07-2023 at 11:12 PM.
Reason: moved print()
Thanks. I'm not sure it would be feasible to replace pause() there. The above is a very simplified version of a more complicated script which is having the trouble. The pause() is a stand-in for mqtt.loop_forever() from paho.mqtt.client though maybe there is another approach using mqtt.loop_start() instead.
def loop_forever(self, timeout=1.0, max_packets=1, retry_first_connection=False):
    """This function calls the network loop functions for you in an
    infinite blocking loop. It is useful for the case where you only want
    to run the MQTT client loop in your program.
    You cannot use loop_forever in a multithreaded environment.
Thanks. I'll try something similar to the following. The old script tended to fail once every week or two, so it'll be a while before I can tell whether there is an improvement.
Code:
#!/usr/bin/env python3
from threading import Thread
from time import sleep
import signal

def handler(signum, frame):
    signame = signal.Signals(signum).name
    print(" OK", signame)
    exit(0)

def pool():
    while True:
        print("Threading")
        sleep(4)
        raise OSError("Uff da")

signal.signal(signal.SIGINT, handler)

# add mqtt loop_start() here

while True:
    process_thread = Thread(target=pool,
                            daemon=True,
                            name="working")
    process_thread.start()
    process_thread.join()
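If the blocking call (pause() here, standing in for loop_forever()) has to stay in the main thread, the same join-and-restart idea can be moved into a supervisor thread. The following is only a minimal sketch under that assumption: supervise(), max_restarts, delay and main() are names made up for illustration, and the sleep in pool() is shortened so the demonstration finishes quickly.

```python
#!/usr/bin/env python3
from threading import Thread
from time import sleep

def pool():
    # Same shape as the worker in the thread, with a shorter sleep.
    while True:
        print("Threading")
        sleep(0.2)
        raise OSError("Uff da")

def supervise(target, max_restarts=None, delay=1.0):
    """Run target in a thread; whenever that thread ends, start a new one.

    max_restarts is a hypothetical cap, useful for testing;
    None means keep replacing the worker forever.
    """
    started = 0
    while max_restarts is None or started < max_restarts:
        worker = Thread(target=target, daemon=True, name=f"working-{started}")
        worker.start()
        worker.join()      # returns once the worker has ended, for any reason
        started += 1
        sleep(delay)       # avoid a tight loop if the worker fails at once
    return started

def main():
    from signal import pause    # Unix only
    Thread(target=supervise, args=(pool,), daemon=True,
           name="supervisor").start()
    pause()                      # the main thread stays blocked here
```

Calling main() at the bottom of the script keeps pause() in effect while the supervisor replaces the worker each time it dies. The traceback of each failed worker is still printed to stderr by the default thread exception hook.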
You have to dump all relevant parts into a logfile, including the full exception trace/message.
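One way to get the full trace of a dying thread into a logfile is threading.excepthook, available from Python 3.8. This is a rough sketch rather than a recommendation for the script in question: install_thread_logging() and the logger name "worker" are invented for the example, and the logfile path is whatever the caller passes in.

```python
import logging
import threading

def install_thread_logging(logfile):
    """Send uncaught exceptions from any thread, with traceback, to logfile."""
    logger = logging.getLogger("worker")    # hypothetical logger name
    handler = logging.FileHandler(logfile)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    def log_uncaught(args):
        # args carries exc_type, exc_value, exc_traceback and thread
        name = args.thread.name if args.thread else "unknown"
        logger.error("uncaught exception in thread %s", name,
                     exc_info=(args.exc_type, args.exc_value,
                               args.exc_traceback))

    # Replaces the default hook, which only prints to stderr.
    threading.excepthook = log_uncaught
    return logger
```

The hook would be installed once, before any worker thread is started. It only records the failure; restarting the thread remains a separate job.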
I tracked that down already: There are some static system files which are, apparently, temporarily unreadable once every few weeks. So the real problem may be with the OS but since that is Raspberry Pi OS, fat chance of getting it fixed. Better to wait for the next edition instead and try some work-arounds until then.
The two work-arounds would be the above modification and to move the reading up in the script so that it happens once per session and not once per nested loop.
Hm. I don't really understand that (how can a static system file disappear?), but such problems can probably be solved with a simple retry.
Yes, a third work-around would be to wrap the relevant lines in a try/except block, but I'll use the above two for now.
I don't know that the system file actually disappears, just that once in a very rare while trying to read it results in an OSError exception. I tried running strace on the whole script, but since the script kept running even after the thread raised an exception, I could not capture precisely what had gone on. If it happens again, I'll add some removable storage and capture the strace output to it.
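For reference, a retry of the kind discussed here could look roughly like the following. It is a sketch only: read_with_retry() and its attempts and delay parameters are made up for illustration, and it assumes the file is plain text and that a short wait is enough for it to become readable again.

```python
import time

def read_with_retry(path, attempts=3, delay=0.5):
    """Read a text file, trying again when an OSError is raised."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path) as handle:
                return handle.read()
        except OSError:
            if attempt == attempts:
                raise          # out of attempts: let the caller see the error
            time.sleep(delay)  # wait a little before the next attempt
```

Because the last failure is re-raised, a file that stays unreadable still ends the thread with the original exception instead of failing silently.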
How can the main section detect when the thread has failed and create a new replacement, while keeping pause() in effect?
If I understand what you are wanting.
Code:
#!/usr/bin/env python3
# Lock thread example with Queue.
# Lock the thread and wait until all items are processed. No race condition:
# no other thread can write at the same time.
from threading import Thread, Lock, current_thread
from queue import Queue
from time import sleep

def worker(q, lock):
    # Infinite loop; q.get() blocks until an item is available.
    while True:
        value = q.get()
        with lock:
            print(f'In {current_thread().name} got Value {value}')
        q.task_done()
        sleep(1)

if __name__ == "__main__":
    q = Queue()
    lock = Lock()
    num_threads = 10
    for i in range(num_threads):
        thread = Thread(target=worker, args=(q, lock))
        thread.daemon = True
        thread.start()
    for i in range(1, 21):
        q.put(i)
    q.join()
Code:
#!/usr/bin/env python3
# Processes, unlike threads, don't live in the same memory space. They need
# shared memory objects.
# These processes may try to access the same shared variable at the same time,
# so the final count can come out lower than 200.
from multiprocessing import Process, Value, Array
import os
from time import sleep

def add_100(number):
    for i in range(100):
        sleep(.01)
        number.value += 1

if __name__ == "__main__":
    shared_number = Value('i', 0)
    print('Number at beginning is', shared_number.value)
    p1 = Process(target=add_100, args=(shared_number,))
    p2 = Process(target=add_100, args=(shared_number,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print('Number at end is', shared_number.value)
Code:
# Lock them.
# These won't walk on each other.
from multiprocessing import Process, Value, Array, Lock
import os
from time import sleep

def add_100(number, lock):
    for i in range(100):
        sleep(.01)
        lock.acquire()
        number.value += 1
        lock.release()

if __name__ == "__main__":
    lock = Lock()
    shared_number = Value('i', 0)
    print('Number at beginning is', shared_number.value)
    p1 = Process(target=add_100, args=(shared_number, lock))
    p2 = Process(target=add_100, args=(shared_number, lock))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print('Number at end is', shared_number.value)