How can I show and edit the metadata of txt-format ebooks in Linux? I can do what I need with calibre, but I want to automate the process with command-line utilities instead.
Write a simple script with the following:
wget - to download the HTML of the URL
strstr() - to search for a string (the metadata) in that file
You could also try curl.
That should help you.
You lost me at wget; for many of these files there is no URL to download that I'm aware of.
Calibre will create the minimum necessary metadata for them (all I actually have to do is supply calibre with the author name), without any connection to the net.
I believe this added metadata is inserted in the txt files headers by calibre, because:
if I pass the unmodified txt files to my ereader, it organizes them in its database by filename and modification date, which is not very useful for me;
but if I add the author name to these txt files using calibre's Metadata Edit process and then pass just the modified txt files to my ereader, its database now makes files available by filename, author or modification date.
This information is not contained in the body of the txt file concerned. The only place it can be is in the file header, so all I'm actually seeking is a command line tool that will show and edit the content of file headers.
I have not tried this from the command line...
but it's quite easy with simple code; it's easy to extract anything from files.
Do you have links to those files, so that I can look at them? I have had some experience with these meta tags in the past; back then I wrote code based on wget and some string functions.
We can extract anything from those files with simple scripts.
You don't need wget here,
because you already have the files.
Inside the code:
use fopen() / open()
read the file character by character
search for whatever you want,
like
author name,
last revised,
and whatever else you want,
so that you can show them and also edit them.
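The open, read and search approach described above could look roughly like this in Python. It is a minimal sketch, not a tested tool: the function name `find_fields` is made up for illustration, and it assumes the file literally contains `Label: value` lines such as `Title:` or `Author:` near the top, which many plain-text files do not.

```python
from pathlib import Path


def find_fields(path, labels=("Title", "Author"), max_lines=50):
    """Return {label: value} for 'Label: value' lines near the top of a text file.

    Hypothetical helper: it only finds metadata that is written in the
    body of the file itself, in the assumed 'Label: value' form.
    """
    found = {}
    with Path(path).open(encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh):
            if lineno >= max_lines:
                break  # only the top of the file is scanned
            for label in labels:
                prefix = label + ":"
                # Keep the first match for each label and ignore later ones.
                if line.startswith(prefix) and label not in found:
                    found[label] = line[len(prefix):].strip()
    return found
```

If nothing matches, the function returns an empty dictionary, which is the expected result for a text file that carries no such lines.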
Last edited by gangadhar402; 04-08-2013 at 07:06 AM.
We seem to be talking at cross purposes. What scripting language are you referring to? Neither open nor fopen exists in bash, the only scripting language I am familiar with. Nothing I know of, not even hex editors like bpe, actually shows the file header. touch can modify certain content in the header but does not show its layout.
Quote:
Originally Posted by porphyry5
I believe this added metadata is inserted in the txt files headers by calibre, because:
if I pass the unmodified txt files to my ereader, it organizes them in its database by filename and modification date, which is not very useful for me;
but if I add the author name to these txt files using calibre's Metadata Edit process and then pass just the modified txt files to my ereader, its database now makes files available by filename, author or modification date.
I'm under the impression that your ereader is using the filename of the text files to do the sorting, since Calibre will rename any file you add to its database based on the author and title provided.
Quote:
Originally Posted by porphyry5
This information is not contained in the body of the txt file concerned. The only place it can be is in the file header, so all I'm actually seeking is a command line tool that will show and edit the content of file headers.
AFAIK, text files do not have any metadata nor header, they're just plain text.
Quote:
Originally Posted by Diantre
I'm under the impression that your ereader is using the filename of the text files to do the sorting, since Calibre will rename any file you add to its database based on the author and title provided.
Correct, but some text files that I obtained from Project Gutenberg are not accepted by Calibre at all; it will not add them to its library, giving the error message "Failed to read metadata from the following ...". Other text files it makes no objection to.
Quote:
Originally Posted by Diantre
AFAIK, text files do not have any metadata nor header, they're just plain text.
Every file has a header of some sort, with at least access, modification and creation dates, size data etc. But I think you are correct about text files having no metadata. Examining what is actually on my ereader with ls, rather than what the ereader claims is there, reveals the presence of 2 files, metadata.calibre and driveinfo.calibre. If I remove them, the reader reverts to filename-only organization of its database.
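For what it's worth, the dates and size mentioned above are kept by the filesystem (in the inode on typical Linux filesystems), not in a header inside the file, which is why a hex editor does not show them. A minimal Python sketch of reading them follows. The helper name `describe` is hypothetical, and note that `st_ctime` on Linux is the inode change time rather than a true creation time.

```python
import os
import time


def _fmt(timestamp):
    """Format a POSIX timestamp as local date and time."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))


def describe(path):
    """Return the size and timestamps that the filesystem keeps for a file."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "accessed": _fmt(st.st_atime),
        "modified": _fmt(st.st_mtime),
        # On Linux this is the inode change time, not the creation time.
        "inode_changed": _fmt(st.st_ctime),
    }
```

None of these values travel with the file's contents, so they say nothing about author or title.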
But as it turns out, as all I wanted was to have the files organized by filename within author, the simplest way of achieving that effect was to rename the files, making the author name the first part of every filename. That way I get the desired effect with just the database organized by filename. So I'm marking this thread as solved even though I've not got the original question answered. I thank you all for your help.
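The renaming workaround described above can be scripted. Here is a minimal Python sketch; the function name `prefix_with_author` and the " - " separator are assumptions chosen for illustration, not anything the ereader requires.

```python
from pathlib import Path


def prefix_with_author(path, author, sep=" - "):
    """Rename 'title.txt' to '<author> - title.txt' and return the new path."""
    src = Path(path)
    if src.name.startswith(author + sep):
        return src  # already prefixed, nothing to do
    dst = src.with_name(author + sep + src.name)
    if dst.exists():
        # Refuse to overwrite an existing file with the same target name.
        raise FileExistsError(dst)
    src.rename(dst)
    return dst
```

Calling it on every txt file in a directory, with the right author for each, gives the "filename within author" ordering on a reader that sorts by filename only.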
I know this thread is kinda old, but I just spent a while trying to solve the same problem, for a slightly different reason.
I found, as porphyry5 did, that Calibre doesn't store the metadata in the text file itself. The only thing that Calibre reads from the text file is the filename, but if you have a second file in the same directory with the same basename, but with extension "opf" then it will read all the metadata from it. That is, if your text file is "test.txt" and a file named "test.opf" is in the same directory, then when Calibre reads the text file (for example if you grab it with your mouse and drop it onto Calibre's window) then Calibre will get the metadata from the opf file.
The opf file is a standard (rather cumbersome) XML ebook metadata file. You can examine one by clicking on the text "Click to open" next to the label "Path:" in the right-hand column (panel?) when a file is selected in Calibre's file list. When you add a text file it will automatically generate an opf file. If the ebook added is an epub file then it will copy the opf file from the one inside the epub ebook (which is really just a zip file containing html+images+metadata files).
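To look at such a sidecar file without opening Calibre, something like the following Python sketch might do. It assumes the opf file uses the usual Dublin Core `dc:title` and `dc:creator` elements; the function name `read_sidecar` is hypothetical, and only those two fields are read.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Dublin Core namespace, assumed to be the one used for title and creator.
DC = "{http://purl.org/dc/elements/1.1/}"


def read_sidecar(text_path):
    """Return title and authors from '<basename>.opf' beside a text file, or None."""
    opf = Path(text_path).with_suffix(".opf")
    if not opf.exists():
        return None
    root = ET.parse(str(opf)).getroot()
    title = root.find(f".//{DC}title")
    authors = [c.text for c in root.iter(f"{DC}creator") if c.text]
    return {
        "title": title.text if title is not None else None,
        "authors": authors,
    }
```

For "test.txt" this looks for "test.opf" in the same directory, matching the basename rule described above.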
As an aside, I'll explain why I wanted to store metadata for a text file:
Having suffered, over the years, some computer crashes that have trashed disks and having helped other people who've had their disks trashed, I've become very interested in recovering files from broken filing systems. One of the biggest problems is that much important data (filename, date, owner, etc) are held in the directory pointers, and often not in the files themselves. So I've begun a system of saving as much metadata in the actual files as possible to aid in recovery if ever needed.
With some files that's easy. Most picture formats let you store metadata inside them, for example jpeg EXIF data. Some sound file formats let you store metadata -- notably mp3 lets you store tremendous amounts of metadata in their ID3v2 tags, even pictures! Well, I'm a writer and wanted to store useful metadata in my text files. I'd begun storing the filename in the first line, the full directory path on the second line, and the date and time the file was created on the third line.
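The three-line header scheme described above could be applied with a small script. This is a sketch under stated assumptions: `stamp_header` is a hypothetical name, the file is UTF-8 text, and the modification time stands in for the creation time, which Linux does not reliably expose through `os.stat()`.

```python
import time
from pathlib import Path


def stamp_header(path):
    """Prepend filename, directory and a timestamp as the first three lines."""
    p = Path(path).resolve()
    body = p.read_text(encoding="utf-8")
    # Assumption: modification time is used in place of the creation time.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.stat().st_mtime))
    header = f"{p.name}\n{p.parent}\n{stamp}\n"
    p.write_text(header + body, encoding="utf-8")
    return header
```

The sketch does not check whether a header is already present, so running it twice on the same file stamps it twice.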
This morning I was writing a quick bash script to fix the author and title metadata in some epub ebooks, so I don't have to open them each time in Calibre to time-consumingly do it by hand (it's very easy to script things for Calibre using its "ebook-meta" command), and suddenly it occurred to me that Calibre might have a standard way to store metadata in text files. Sadly, it doesn't. As I mentioned above, it stores the data in a separate opf file, which is fine if your directory structure is intact, but if you're recovering a filing system after a bad crash then it is totally useless. [sigh]
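The script mentioned above was written in bash; the same idea is sketched here in Python by building an `ebook-meta` command line and running it. The helper names are hypothetical, and the sketch assumes Calibre's command-line tools are on the PATH and that `--title` and `--authors` are the options wanted (multiple authors joined with "&").

```python
import subprocess


def ebook_meta_cmd(path, title=None, authors=None):
    """Build an ebook-meta command line that sets title and/or authors."""
    cmd = ["ebook-meta", str(path)]
    if title:
        cmd += ["--title", title]
    if authors:
        # ebook-meta expects multiple authors separated by '&'.
        cmd += ["--authors", " & ".join(authors)]
    return cmd


def set_metadata(path, title=None, authors=None):
    """Run ebook-meta; fails if Calibre's tools are not installed."""
    subprocess.run(ebook_meta_cmd(path, title, authors), check=True)
```

Keeping the command construction separate from the call makes it possible to print and check the command before letting it modify any files.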