Headless file conversion using LibreOffice as a service
The content outlined here also works for OpenOffice. This post was inspired by a question I was answering on the forums.
System Setup
I installed LibreOffice packages using a PPA in Kubuntu. Here is information about my environment.
You must have libreoffice and unoconv installed to proceed. soffice is in my $PATH. Depending on which office version you're using, you may need soffice.bin instead.
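Since the launcher may be called soffice or soffice.bin depending on the installation, a script can probe for whichever one is on the $PATH. This is only a minimal sketch; the function name first_available is hypothetical and not part of LibreOffice or unoconv.

```shell
#!/bin/sh
# Print the path of the first candidate command found on $PATH.
# Returns non-zero if none of the candidates exist.
# (first_available is a hypothetical helper name.)
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Example (commented out so nothing is launched):
# OFFICE_BIN=$(first_available soffice soffice.bin) || echo "LibreOffice not found" >&2
```

The candidates are tried in the order given, so list the name you prefer first.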
Converting your first document
Let's start with the goal of converting a single document. Let's say we want to convert the following HTML document (test.html) into a PDF.
For this step you'll need to either use the program screen or simply open two terminals. Open the first terminal and start the LibreOffice service.
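If you script this instead of using two terminals, the conversion has to wait until the service is actually listening on its port. Here is a rough sketch of one way to do that. It assumes bash, because it relies on bash's /dev/tcp redirection; wait_for_port is a hypothetical helper name and the default retry values are arbitrary.

```shell
#!/bin/bash
# Poll HOST:PORT until a TCP connection succeeds or the attempts run out.
# Usage: wait_for_port HOST PORT [ATTEMPTS] [DELAY_SECONDS]
# (wait_for_port is a hypothetical helper; /dev/tcp is a bash feature.)
wait_for_port() {
    local host="$1" port="$2" attempts="${3:-10}" delay="${4:-1}"
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        # The subshell opens the connection and closes it again when it exits.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (commented out; requires the service to be starting up):
# wait_for_port 127.0.0.1 2220 30 1 && echo "service is ready"
```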
Now convert test.html into test.pdf. With unoconv you can connect to the LibreOffice API through the network socket and use the existing service to do the heavy lifting.
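The service and unoconv have to agree on the same host and port, so it can help to build the connection string in one place. A small sketch follows; uno_connection is a hypothetical helper, and the string layout simply mirrors the one used in this post.

```shell
#!/bin/sh
# Build the connection string passed to unoconv's --connection option.
# Usage: uno_connection HOST PORT
# (uno_connection is a hypothetical helper name.)
uno_connection() {
    printf 'socket,host=%s,port=%s,tcpNoDelay=1;urp;StarOffice.ComponentContext\n' "$1" "$2"
}

# Example (commented out; requires a running LibreOffice service):
# unoconv --connection "$(uno_connection 127.0.0.1 2220)" -f pdf test.html
```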
If all goes well you should now see test.pdf in the same directory as test.html. You can also use shell globbing, such as *.html, to convert multiple files in one run.
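When converting a batch, it is handy to check that every input actually produced an output. The sketch below assumes each PDF is written next to its source file with only the extension swapped, as with test.html and test.pdf above. pdf_name and missing_pdfs are hypothetical helpers.

```shell
#!/bin/sh
# Map an input file name to the PDF expected next to it,
# e.g. docs/test.html -> docs/test.pdf. Assumes the name has an extension.
pdf_name() {
    printf '%s.pdf\n' "${1%.*}"
}

# Print every expected PDF that does not exist for the given inputs.
# Returns non-zero if any are missing.
missing_pdfs() {
    rc=0
    for src in "$@"; do
        out=$(pdf_name "$src")
        if [ ! -f "$out" ]; then
            printf '%s\n' "$out"
            rc=1
        fi
    done
    return "$rc"
}

# Example (commented out; requires a running LibreOffice service):
# unoconv --connection 'socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.ComponentContext' -f pdf *.html
# missing_pdfs *.html || echo "some conversions failed" >&2
```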
To see the list of formats that unoconv supports for the -f option, run unoconv --show.
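A plain grep for "pdf" also matches any line that merely mentions the word. Here is a sketch of a stricter filter. It assumes each format line has the shape "  pdf      - Portable Document Format [.pdf]" (format name, a dash, then a description); check that against the output of your unoconv version. has_format is a hypothetical helper.

```shell
#!/bin/sh
# Read a format listing on stdin and succeed if FORMAT appears as a format name.
# Assumes lines shaped like "  pdf      - Portable Document Format [.pdf]".
# (has_format is a hypothetical helper name.)
has_format() {
    grep -q -E "^[[:space:]]*$1[[:space:]]+-[[:space:]]"
}

# Example (commented out; the redirection sends the listing into the pipe):
# unoconv --show 2>&1 1>/dev/null | has_format pdf && echo "pdf is supported"
```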
Daemonize the LibreOffice service
You could start the LibreOffice daemon (as the current user) with the following command.
It will now run until a shutdown command for the service has been issued. To issue a shutdown command do the following.
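The start and shutdown commands differ only in --accept versus --unaccept, so a small wrapper can generate both. This is a sketch under that assumption; office_service_cmd is a hypothetical name, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Print the soffice command line that starts or stops the listening service.
# Usage: office_service_cmd start|stop [HOST] [PORT]
# (office_service_cmd is a hypothetical helper; defaults mirror this post.)
office_service_cmd() {
    case "$1" in
        start) flag=--accept ;;
        stop)  flag=--unaccept ;;
        *) echo "usage: office_service_cmd start|stop [HOST] [PORT]" >&2; return 2 ;;
    esac
    host=${2:-127.0.0.1}
    port=${3:-2220}
    printf "soffice --nologo --headless --nofirststartwizard %s='socket,host=%s,port=%s,tcpNoDelay=1;urp;StarOffice.Service'\n" \
        "$flag" "$host" "$port"
}
```

Printing the command first makes it easy to inspect before you run it by hand.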
That's it! Alternatively, you could start the service as another user like so (in this case the user is "sam")...
Pretty cool LibreOffice tricks I learned today. If you want to know more then look up the LibreOffice/OpenOffice API. That is technically what is being used. Also, you could google terms like "StarOffice.Service" or "StarOffice.ComponentContext".
SAM
Code (system environment):
Tue Feb 12 23:00:54 EST 2013
Ubuntu 12.04.2 LTS
Linux 3.2.0-37-generic x86_64 GNU/Linux
GNU bash, version 4.2.24(1)-release (x86_64-pc-linux-gnu)
LibreOffice 3.5 (Version: 1:3.5.4-0ubuntu1.1)
unoconv 0.4
Python 2.7.3
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Code (test.html):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>This is a test</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
  <h1>This is my title</h1>
  <p>This is some text in the page</p>
  <p><a href="http://www.gleske.net/">Visit Gleske Homepage</a></p>
  <ul>
    <li><a href="http://www.tldp.org/">Linux Documentation Project</a></li>
    <li>This is some text in a bullet.</li>
    <li><a href="http://www.gimp.org/">GIMP, An image manipulation program!</a></li>
  </ul>
</body>
</html>
Code (start the LibreOffice service in the first terminal):
soffice --nologo --headless --nofirststartwizard --accept='socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp'
Code (convert test.html with unoconv in the second terminal):
unoconv --connection 'socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.ComponentContext' -f pdf test.html
Code (list the formats unoconv supports):
unoconv --show
# grep for certain formats
unoconv --show 2>&1 1>/dev/null | grep pdf
Code (start the LibreOffice daemon as the current user):
soffice --nologo --headless --nofirststartwizard --accept='socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.Service'
Code (issue the shutdown command):
soffice --nologo --headless --nofirststartwizard --unaccept='socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.Service'
Code (start the service as the user "sam"):
sudo su - sam -c "soffice --nologo --headless --nofirststartwizard --accept='socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.Service'"