[SOLVED] Linux GPL Serialized Multi-user Batch Queue?
Does anyone know of a software solution (preferably GPL) for a simple batch queueing system for Linux?
We have a multi-user server where users submit command-line script jobs that take various amounts of time to run (seconds to days). The license for a commercial piece of software on the server allows only a fixed number of simultaneous jobs across all users (currently 3). Is there a simple queueing system, similar to the old mainframe "batch" queue, that will centrally queue user jobs, serialize the submitted jobs, and then maintain a fixed number of simultaneous jobs in FIFO order?
The requirements (pretty easy):
GPL License, accepts jobs from multiple users, simple queue viewing and control by users (like a print queue), user permission and working directory retention (like 'at') and execution of up to a fixed number of parallel scripts/jobs.
Has anyone seen or crafted a piece of software like this? Kind of a cross between a printer queue and 'at'.
Yes, 'batch' is installed, but it doesn't do what I need, unless there is an undocumented option. Batch looks at load levels before launching a job.
I have plenty of processor cores and memory, but only a limited number of simultaneous licenses available for a piece of software. If no license is free when a job launches, the job fails. This is the reason for controlling the maximum number of simultaneously running queue jobs. The rest of the software on the server has no such license restriction and wouldn't be run from this queue.
Since there are multiple users with varying job lengths, 'at' or 'cron' are not useful. I looked into 'torque' and 'nqs', but these are hugely overkill. This is running on a single, large, multi-core server, not a distributed cluster. I need something analogous to a printer queue feeding a small printer pool, but instead of printers, they would be command line shells.
Any help with prebuilt software or links to people altering queues for a similar purpose would be great.
Last edited by Bryan88; 11-03-2011 at 10:58 PM.
Reason: word choices
[hang on, sorry, the answer that was here passed a smoketest but not a real test]
Okay, I thought I could make bash do it but that'll take a better man than me. So what I have instead is a set of brutally simple C programs: one to turn in a ticket for someone to use, one to get a ticket as soon as one's available, and one to collect and distribute tickets. Use them with a simple script that gets a ticket, runs a command and returns the ticket (also included below).
I have tested this, corner cases and all; it's usable as-is if primitive suits your style.
Code:
// llr.c, connect to the ticket-return socket so someone else can use our ticket
#include <stdlib.h>
#include <string.h>      // strlen
#include <unistd.h>      // write, close
#include <sys/socket.h>
#include <sys/un.h>
struct sockaddr_un retr = { AF_UNIX, "ticket_return_socket" };
int main(int c, char **v)
{
    int returnfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if ( returnfd < 0 ) exit(1);
    if ( connect(returnfd, (void*)&retr, sizeof retr) < 0 ) exit(4);
    // message defaults to "here!"; the server only treats "kill" specially
    const char *msg = c > 1 ? v[1] : "here!";
    if ( write(returnfd,msg,strlen(msg)) < 0 ) exit(5);
    close(returnfd);
    return 0;
}
Code:
// llg.c, get a free ticket as soon as the server has one to hand out
#include <stdlib.h>
#include <unistd.h>      // read, close
#include <sys/socket.h>
#include <sys/un.h>
struct sockaddr_un grant = { AF_UNIX, "ticket_grant_socket" };
int main(int c, char **v)
{
    int grantfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if ( grantfd < 0 ) exit(1);
    if ( connect(grantfd, (void*)&grant, sizeof grant) < 0 ) exit(4);
    char buf[64];
    // blocks here until the server writes "ok" on this connection
    int rc = read(grantfd,buf,sizeof buf);
    close(grantfd);
    return rc<0;
}
Code:
// lla.c, collect tickets from clients connecting to the return socket
// and hand them out to clients connecting to the grant socket.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>      // memcmp
#include <unistd.h>      // read, write, close, unlink
#include <sys/socket.h>
#include <sys/stat.h>    // chmod
#include <sys/un.h>
#include <sys/epoll.h>
struct sockaddr_un grant = { AF_UNIX, "ticket_grant_socket" };
struct sockaddr_un retr = { AF_UNIX, "ticket_return_socket" };
int awaiting[1024] = { 0 }; // open grant request socket list
int awaita = 0, awaitz = 0; // first used, first following unused, if = then no used.
int main(int n, char **a)
{
    int tickets = 0; // starts empty: each llr connection adds one ticket
    int grantfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    int returnfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if ( grantfd < 0 || returnfd < 0 )
        exit(1);
    if ( bind(grantfd, (void*)&grant, sizeof grant) < 0 ) exit(2);
    if ( bind(returnfd, (void*)&retr, sizeof retr) < 0 ) exit(3);
    chmod(grant.sun_path,0777);
    chmod(retr.sun_path,0777);
    if ( listen(grantfd, 50) < 0 ) exit(4);
    if ( listen(returnfd, 50) < 0 ) exit(5);
    int epollfd = epoll_create1(0); if ( epollfd < 0 ) exit(6);
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = grantfd;
    if ( epoll_ctl(epollfd, EPOLL_CTL_ADD, grantfd, &ev ) == -1 )
        exit(7);
    ev.events = EPOLLIN;
    ev.data.fd = returnfd;
    if ( epoll_ctl(epollfd, EPOLL_CTL_ADD, returnfd, &ev ) == -1 )
        exit(8);
    while (1) {
        if ( epoll_wait(epollfd, &ev, 1, -1) == -1 )
            exit(9);
        if ( ev.data.fd == grantfd ) {
            // new ticket request: park the connection until a ticket is free
            awaiting[awaitz++] = accept(grantfd,0,0);
        } else
        if ( ev.data.fd == returnfd ) {
            // new return connection: watch it for its message
            ev.data.fd = accept(returnfd,0,0);
            epoll_ctl(epollfd,EPOLL_CTL_ADD,ev.data.fd,&ev);
        } else { // must be a line from someone on a return socket
            char retcmd[32];
            retcmd[read(ev.data.fd,retcmd,sizeof retcmd)]=0;
            close(ev.data.fd);
            if (!memcmp(retcmd,"kill",4)) break;
            ++tickets;
        }
        // hand out free tickets to waiters, oldest first
        while ( awaitz > awaita && tickets > 0 ) {
            write(awaiting[awaita],"ok",2);
            close(awaiting[awaita]);
            ++awaita, --tickets;
        }
        if ( awaita == awaitz )
            awaita = awaitz = 0;
    }
    // reached only via the "kill" command
    close(grantfd);
    close(returnfd);
    unlink(grant.sun_path);
    unlink(retr.sun_path);
    return 0;
}
Code:
#!/bin/sh
# with_ticket script: get a ticket, run a command, return the ticket.
if llg; then
    trap 'llr' 0 1 2 3 15
    "$@"
fi
This is probably taking simplicity to a fault, but it sure is simple, and it works. The server starts with no tickets, so seed it by running llr once per available license (three times in your case); "llr kill" shuts the server down. The "ticket_grant_socket" and "ticket_return_socket" strings are actual pathnames, so you could, e.g., make a /var/tmp/foo_licenses directory accessible only to your chosen few, put "/var/tmp/foo_licenses/" in front of each of those strings, and you'd be done with that part of it.
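For the directory-prefix idea, here is a minimal sketch of a helper that builds the socket address at run time instead of hard-coding it. The function name ticket_addr and its dir/name parameters are made up for illustration and are not part of the programs above; the only real constraint it encodes is that the joined path has to fit in sun_path.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill *addr with "<dir>/<name>" as an AF_UNIX address.
 * Returns 0 on success, -1 if the joined path would not fit in sun_path.
 * ticket_addr, dir and name are hypothetical, for illustration only. */
int ticket_addr(struct sockaddr_un *addr, const char *dir, const char *name)
{
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    int n = snprintf(addr->sun_path, sizeof addr->sun_path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof addr->sun_path)
        return -1;   /* would be truncated: refuse rather than use a wrong path */
    return 0;
}
```

Each program would call something like ticket_addr(&grant, "/var/tmp/foo_licenses", "ticket_grant_socket") before its bind() or connect(), and bail out if it returns -1.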
Last edited by jthill; 11-04-2011 at 08:32 AM.
Reason: supplied an actual solution
I left some "won't happen"s in there and they just will not stop bugging me. So:
Code:
diff --git a/lla.c b/lla.c
index 428fe46..96c4345 100644
--- a/lla.c
+++ b/lla.c
@@ -10,5 +10,6 @@ struct sockaddr_un grant = { AF_UNIX, "ticket_grant_socket" };
struct sockaddr_un retr = { AF_UNIX, "ticket_return_socket" };
-int awaiting[1024] = { 0 }; // open grant request socket list
+#define CAP 1024
+int awaiting[2*CAP] = { 0 }; // open grant request socket list
int awaita = 0, awaitz = 0; // first used, first following unused, if = then no used.
@@ -54,5 +55,5 @@ int main(int n, char **a)
} else { // must be a line from someone on a return socket
char retcmd[32];
- retcmd[read(ev.data.fd,retcmd,sizeof retcmd)]=0;
+ retcmd[read(ev.data.fd,retcmd,sizeof retcmd-1)]=0;
close(ev.data.fd);
if (!memcmp(retcmd,"kill",4)) break;
@@ -67,4 +68,12 @@ int main(int n, char **a)
if ( awaita == awaitz )
awaita = awaitz = 0;
+
+ if ( awaitz == sizeof awaiting / sizeof *awaiting ) {
+ memmove( awaiting, awaiting+awaita, (awaitz-awaita) * sizeof *awaiting );
+ awaitz -= awaita;
+ awaita = 0;
+ if ( awaitz > CAP)
+ exit(99);
+ }
}
diff --git a/llg.c b/llg.c
index ccb40c0..56fe025 100644
--- a/llg.c
+++ b/llg.c
@@ -13,4 +13,4 @@ int main(int c, char **v)
int rc = read(grantfd,buf,sizeof buf);
close(grantfd);
- return rc<0;
+ return rc<=0;
}
diff --git a/with_license b/with_license
index 7804e69..9916ef5 100755
--- a/with_license
+++ b/with_license
@@ -4,3 +4,5 @@ if llg; then
trap 'llr' 0 1 2 3 15
"$@"
+else
+ echo "Ticket server shutdown"
fi
While I was at it I had it handle premature shutdowns more gracefully, but the main thing is that it now handles the case where you've eternally got jobs waiting for a ticket. It still thinks having more than a thousand jobs waiting at a time is ridiculous; not handling an infinite waitlist is intentional.
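The waitlist handling in that patch can be looked at on its own. What follows is a rough sketch of the same idea pulled out into two functions, not the patch itself: the struct and function names are invented here, and unlike the patch it compacts before appending and returns -1 instead of calling exit(99).

```c
#include <string.h>

#define CAP 1024                 /* most waiters tolerated at once, as in the patch */

struct waitlist {                /* hypothetical wrapper around awaiting/awaita/awaitz */
    int fds[2 * CAP];            /* pending grant-request descriptors, oldest first */
    int head, tail;              /* first used slot, first unused slot */
};

/* Append fd. Returns 0 on success, -1 if CAP clients are already waiting. */
int waitlist_push(struct waitlist *w, int fd)
{
    if (w->tail == 2 * CAP) {    /* ran off the end: slide live entries to the front */
        memmove(w->fds, w->fds + w->head, (w->tail - w->head) * sizeof *w->fds);
        w->tail -= w->head;
        w->head = 0;
    }
    if (w->tail - w->head >= CAP)
        return -1;
    w->fds[w->tail++] = fd;
    return 0;
}

/* Remove and return the oldest fd, or -1 if nobody is waiting. */
int waitlist_pop(struct waitlist *w)
{
    if (w->head == w->tail)
        return -1;
    int fd = w->fds[w->head++];
    if (w->head == w->tail)      /* empty again: rewind to the start of the array */
        w->head = w->tail = 0;
    return fd;
}
```

Because the array is twice CAP and at most CAP entries are ever live, the slide always frees room, so a queue that never fully drains can still keep going indefinitely.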
The program you are looking for is Lluis Batlle i Rossell's Task Spooler for Linux (ts). It's GPL-licensed. Grab it here: http://vicerveza.homeunix.net/~viric/soft/ts/. It has one or more job runners that process entries from a batch queue; per the help text below, the number of simultaneous jobs is set with -S or the TS_SLOTS environment variable. Easy to compile.
Man page:
Code:
usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...]
Env vars:
TS_SOCKET the path to the unix socket used by the ts command.
TS_MAILTO where to mail the result (on -m). Local user by default.
TS_MAXFINISHED maximum finished jobs in the queue.
TS_ONFINISH binary called on job end (passes jobid, error, outfile, command).
TS_ENV command called on enqueue. Its output determines the job information.
TS_SAVELIST filename which will store the list, if the server dies.
TS_SLOTS amount of jobs which can run at once, read on server start.
Actions:
-K kill the task spooler server
-C clear the list of finished jobs
-l show the job list (default action)
-S [num] set the number of max simultanious jobs of the server.
-t [id] tail -f the output of the job. Last run if not specified.
-c [id] cat the output of the job. Last run if not specified.
-p [id] show the pid of the job. Last run if not specified.
-o [id] show the output file. Of last job run, if not specified.
-i [id] show job information. Of last job run, if not specified.
-s [id] show the job state. Of the last added, if not specified.
-r [id] remove a job. The last added, if not specified.
-w [id] wait for a job. The last added, if not specified.
-u [id] put that job first. The last added, if not specified.
-U <id-id> swap two jobs in the queue.
-h show this help
-V show the program version
Options adding jobs:
-n don't store the output of the command.
-g gzip the stored output (if not -n).
-f don't fork into background.
-m send the output by e-mail (uses sendmail).
-d the job will be run only if the job before ends well
-L <lab> name this task with a label, to be distinguished on listing.