How to improve Python script performance when launched from systemd?
Linux - Embedded & Single-board computer
This forum is for the discussion of Linux on both embedded devices and single-board computers (such as the Raspberry Pi, BeagleBoard and PandaBoard). Discussions involving Arduino, plug computers and other micro-controller-like devices are also welcome.
I have a Raspberry Pi (running a Debian-based distro) that needs to keep a service based on a Python script running.
What I have done so far is create a .service file in the /lib/systemd/system/ folder. The service now runs automatically at system boot and is restarted if any crash occurs; furthermore, a small logging setup based on syslog has been added.
The content of the .service file looks like this so far:
Code:
[Unit]
Description=My_Service
After=network.target network-online.target
After=local-fs.target
[Service]
Type=simple
Restart=always
ExecStartPre=/bin/mkdir -p /home/user/log
ExecStart=/usr/local/bin/python3 -u /home/user/my_service.py
SyslogIdentifier=My_Service
StandardOutput=syslog
StandardError=syslog
[Install]
WantedBy=multi-user.target
Now I've noticed that the script is slightly less performant than when it is run from a terminal.
Because it is the only script that the system needs to keep running, I was trying to give it the highest priority, but I am not sure how to do that. So far I've added the following lines to the [Service] section, but I'm not sure whether that is correct or best practice.
The question is: how can I give this service the maximum priority and the maximum share of system resources in order to maximise its performance?
I'm also trying to disable other system services that are not useful for my embedded system, such as bluetooth.service; is this kind of work good practice?
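For what it's worth, systemd exposes scheduling controls directly as [Service] directives, so no external renice wrapper is needed. A sketch (the concrete values here are assumptions to tune, and real-time policies like fifo can starve the rest of the system if the loop misbehaves):

```ini
[Service]
# Lower niceness = higher CPU priority; range is 19 (lowest) to -20 (highest)
Nice=-10
# Optionally switch to a real-time scheduling policy for latency-sensitive work
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=50
# Prioritise disk I/O as well
IOSchedulingClass=realtime
IOSchedulingPriority=0
```

After editing the unit, run `systemctl daemon-reload` and restart the service for the directives to take effect.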
That highly depends on the functionality of that Python script. What is its purpose?
Right, sorry, I didn't specify the purpose of the script.
I have a while-True loop that records audio data from the microphone (passing through the FFmpeg libraries). When the script is launched manually, each loop iteration takes between 18 and 21 ms, while when it is launched from the systemd service, iterations sometimes take up to 34 ms, which is really bad for my real-time purposes.
In general, I have to create a sort of WebRTC server that broadcasts the microphone audio packets, so there are two main sub-threads:
- the audio listening service, which captures the mic audio and dispatches copies of the captured audio frames to several Python queues;
- the TCP server, which handles all the RTC peer connections (it also generates sub-threads, one per RTC peer connection, each consuming its own audio queue).
Quote:
.service file added to the /lib/systemd/system/ folder
Not related to the question, but you should put that into /etc/systemd/system. Everything under /lib or /usr/lib is fair game next time you do an upgrade, i.e. it could be overwritten.
Quote:
Now I've noticed that the script is slightly less performant than when it is run from a terminal.
How did you determine that? Is this program continuously running in the background? How slight is the performance loss? What makes you think it is related to the scheduling priority and not something else?
Since you have the program, you should add measurements to it that tell you where the performance is lost. Of course, a renice as suggested by pan64 is a very quick test, but the nice value only has an effect if there are other processes that want to run on the CPU.
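To make that concrete, here is a minimal sketch of the kind of per-iteration measurement meant here; `measure_loop` and the sleep-based workload are made up for illustration, and in the real script `work()` would be the audio-capture step:

```python
import time

def measure_loop(iterations, work):
    """Record the elapsed wall-clock time of each loop iteration in ms."""
    samples = []
    prev = time.monotonic()
    for _ in range(iterations):
        work()                                 # stand-in for the capture step
        now = time.monotonic()
        samples.append((now - prev) * 1000.0)  # milliseconds per iteration
        prev = now
    return samples

if __name__ == "__main__":
    samples = measure_loop(50, lambda: time.sleep(0.002))
    print(f"min={min(samples):.1f}ms max={max(samples):.1f}ms "
          f"avg={sum(samples) / len(samples):.1f}ms")
```

Logging min/max/average both from a terminal and under systemd gives hard numbers instead of an impression, and a large max with a normal average points at scheduling jitter rather than a uniformly slower process.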
I assume my program is slightly less performant because it is real-time software that captures audio frames from the mic and broadcasts those frames through the WebRTC library. When the script is launched manually there is no problem, the audio is pretty much real-time, while when I launch it at start-up as a systemd service, the audio starts to accumulate delay. Furthermore, I print the elapsed time the listening thread spends capturing audio frames from the mic: when I launch the script manually, the time between two "captures" is always between 17 and 21 ms, while when it is started by systemd, that time sometimes reaches 34-40 ms.
What I am thinking is that it could also depend on the syslog printing; could this introduce further delay?
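If the syslog printing does turn out to be the culprit, the standard library has a pattern for taking the handler off the hot path: a QueueHandler in the producer thread and a QueueListener draining it on its own thread, so the capture loop never blocks on output. A minimal sketch (the StreamHandler stands in for whatever real handler is used):

```python
import logging
import logging.handlers
import queue

# Route log records through an in-memory queue so the producing thread
# only pays the cost of an enqueue, never of the actual write.
log_queue = queue.Queue(-1)  # unbounded
queue_handler = logging.handlers.QueueHandler(log_queue)

root = logging.getLogger()
root.addHandler(queue_handler)
root.setLevel(logging.INFO)

# The listener drains the queue on a separate thread and forwards each
# record to the real handler (could be SysLogHandler instead).
listener = logging.handlers.QueueListener(log_queue, logging.StreamHandler())
listener.start()

root.info("non-blocking log message")
listener.stop()  # flushes remaining records and joins the listener thread
```

This keeps the timing of the capture loop independent of how slow the log sink is, at the cost of a little memory for buffered records.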
Again, you can play with renice (from +19 to -20) to see if that has any effect on this delay. That costs almost nothing.
This syslog facility should not influence the speed at all (by design); at least I would not expect it to.
I do not really understand how this Python script works, but Python itself is not really multithreaded, so I'm a bit confused...
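As a side note, the niceness can also be read and changed from inside the script itself with `os.nice`, which makes the renice experiment easy to automate. A sketch (lowering niceness below 0 requires root or CAP_SYS_NICE, hence the defensive handling):

```python
import os

# os.nice(increment) adds `increment` to the process niceness and returns
# the new value; os.nice(0) simply reads the current niceness.
current = os.nice(0)
print("current niceness:", current)

# Raising priority (negative increment) needs privileges, so try it safely.
try:
    print("new niceness:", os.nice(-5))
except PermissionError:
    print("not privileged; run as root or grant CAP_SYS_NICE")
```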
Last edited by pan64; 07-22-2020 at 03:20 AM.
Reason: typo
I'll try to renice ASAP.
For now, I'll post portions of the code to help you understand what is going on with the script (maybe I am just designing the software badly). For now, just assume this:
- The Python script launches two threads with this command:
Code:
import asyncio
from _thread import start_new_thread

# Thread 1 (listening from the mic)
start_new_thread(listening_service, ("listening_service",))

# Thread 2 (TCP server for multi-client connections); `loop` is the
# asyncio event loop obtained earlier
tcp_handler = asyncio.start_server(client_connection_handler, '0.0.0.0', 5555,
                                   loop=loop, reuse_address=True, reuse_port=True)
tcp_server = loop.run_until_complete(tcp_handler)
Furthermore, several threads are also created at runtime, one for each TCP client connection (each of which holds an RTC peer connection).
In the thread 1 (listening) I have this situation:
Code:
import fractions
import queue

import av
from aiortc.contrib.media import MediaPlayer

player = MediaPlayer("hw:0", format="alsa")
while True:
    f_audio = await player.audio.recv()
    frame = av.AudioFrame.from_ndarray(f_audio.to_ndarray(), format="s16", layout="stereo")
    frame.sample_rate = sample_rate
    frame.time_base = fractions.Fraction(1, sample_rate)
    for q in audio_queues:
        # put the received audio on the queue of each RTC peer connection
        try:
            q.put(frame, False)
        except queue.Full:
            q.queue.clear()  # drop the backlog if the queue is full
I have an array of Python queues (https://docs.python.org/3/library/queue.html). Whenever a new TCP connection is accepted by thread 2, I add a new queue to that array. If you know the WebRTC standard, there is a signalling server (my TCP thread) that creates RTC peer connections; each RTC peer connection has its own MediaStreamTrack (https://aiortc.readthedocs.io/en/latest/api.html) that consumes audio frames from its related queue.
I create a new thread for each RTC peer connection from the "client_connection_handler" method called from the "asyncio.start_server".
Hopefully I have been clear enough to make you understand how my software is working.
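The fan-out described above can be sketched in isolation like this (`register_peer` and `broadcast` are hypothetical names, not from the actual script); the point is that a full queue drops its stale backlog instead of blocking the capture loop:

```python
import queue

# One bounded queue per connected peer; the capture loop fans frames out.
audio_queues = []

def register_peer(maxsize=100):
    """Called when the TCP thread accepts a new RTC peer connection."""
    q = queue.Queue(maxsize=maxsize)
    audio_queues.append(q)
    return q

def broadcast(frame):
    """Deliver one frame to every peer without ever blocking."""
    for q in audio_queues:
        try:
            q.put(frame, block=False)
        except queue.Full:
            # Drop the stale backlog rather than delay the capture loop:
            # for real-time audio, losing old frames beats adding latency.
            with q.mutex:
                q.queue.clear()
            q.put(frame, block=False)
```

A consumer that has fallen behind then resumes from the most recent frame instead of replaying an ever-growing delay.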
Quote:
This syslog facility should not influence the speed at all (by design); at least I would not expect it to.
I can imagine that this logging might lead to delays. Perhaps writes are blocking. I didn't know that syslog was an option, by the way. The man page doesn't mention it.
Play around with it, then. Replace syslog with null, for example, or write to a tty.
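Following that suggestion, a quick A/B test is to discard the output entirely in the unit file and compare the iteration timings; a sketch:

```ini
[Service]
# Temporary experiment: drop all output to see whether logging is the bottleneck
StandardOutput=null
StandardError=null
```

If the 34-40 ms spikes disappear with output discarded, the logging path is implicated; if they persist, look at scheduling instead.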
Perfect, I'll try all your suggestions ASAP. Thank you!