LinuxQuestions.org

zWaR 11-11-2007 11:23 AM

Database engine for book digitalization
 
Hello!

I am working on the digitalization of a massive amount of paper publications (books, magazines etc.). They must be cross-referenced, and a user-friendly search and an index are needed. Is there any database engine which is customized for such needs? I would also need some import tools to load data from text formats (e.g. *.doc or *.txt) into the database. Do such tools exist at all?

In the end the final application must be as cross-system compatible as possible. This means Windows platforms (95/98/2000/XP/Vista) should be supported, but the application should also work on Symbian, Windows Mobile and Palm OS.

Thank you in advance!

tronayne 11-12-2007 07:24 AM

An application, no; a method, maybe.

A couple of software manufacturers and more than a few government entities have made their entire document store available as PDF documents. This has two advantages: PDF files are totally platform independent and are searchable. The way I've seen this done is a browser interface with table-of-contents links to "books" (entire manuals, for example) and to individual documents, plus a search box (Adobe Reader can search multiple documents for a given pattern). This is easiest when you have the original document files to work with.
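To give a rough idea of what the search side involves once you have plain text to work with, here is a minimal sketch in Python of a word index over several documents. It is not how Adobe Reader or any particular product does it; the function names and the sample documents are made up for illustration, and a real collection would read the text from the exported files instead of a hard-coded dict.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of {name: full text}.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data, for illustration only.
documents = {
    "manual.txt": "Installing the printer driver on Windows",
    "magazine.txt": "A review of printer models and scanner models",
}
index = build_index(documents)
print(search(index, "printer"))         # ['magazine.txt', 'manual.txt']
print(search(index, "printer driver"))  # ['manual.txt']
```

The same word-to-documents mapping is also a starting point for the printed-style index and for cross-references, since it already records which publications mention a given term.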

The other way I've seen is digitized individual pages; take a look at http://www.thebookstandard.com/books..._id=1002035592 for an overview of what you're getting into.

Hope this helps.

graemef 11-12-2007 04:18 PM

How about Greenstone? It may fit your requirements. They have an examples section; the first two, in Afghanistan, are ones I helped to set up. Those are not true digitised collections, but there are other examples that fit the bill.

They have a tool for importing documents, and since the user can view the collection with a browser it is "fairly" cross-platform.


All times are GMT -5.