LinuxQuestions.org
Old 04-27-2021, 03:03 PM   #1
IOACHIMVS
LQ Newbie
 
Registered: Jul 2020
Posts: 6

Rep: Reputation: Disabled
Best file format for scanned documents/pictures?


Hi,

Quite some time ago I started toying with the idea of digitizing my family's most important documents and photographs. I got to work and scanned some of them, but I've reached the point where I'm questioning my choice of file format.
I've been saving the scans as TIFF. That's probably not a bad idea in itself, but if I ever finish scanning loads and loads of documents, I'm not sure it will be a good idea to have them at 5-20 MiB each. I'm also not sure that enabling JPEG compression inside TIFF is the wisest idea either: at the end of the day it would just be JPEG wrapped in another file format.

After some quick research I came across the DjVu file format. It looks like a good format to work with: it's lighter than TIFF, and since I would always view the files on a GNU/Linux device, compatibility shouldn't be a problem.
However, I don't have much experience with file formats. I'm sure there are a dozen out there that could be good candidates.

Which do you consider the best file format for preserving old documents? Please feel free to suggest any others.
We're talking about both manuscripts and pictures, so keeping them legible and faithful to the originals must be the priority (I don't rule out using a different file format for each purpose if you suggest it). Keeping them lightweight would be a plus.
 
Old 04-27-2021, 03:57 PM   #2
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,453

Rep: Reputation: 447
Hi

For archives, I recommend a lossless format and the highest resolution the scanner can handle. That means big files. But disk capacity is huge nowadays, so you could fit a million pictures on a big hard disk, and I bet you don't have that many.
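
As a rough sanity check of that claim, here is a small Python sketch of the arithmetic. The 5-20 MiB range comes from the first post; the 8 TB disk is only an assumed example size, and file system overhead is ignored.

```python
MIB = 1024 ** 2   # bytes in one mebibyte (file sizes in the first post)
TB = 1000 ** 4    # bytes in one decimal terabyte (how drives are usually labelled)


def scans_per_disk(disk_bytes: int, scan_bytes: int) -> int:
    """How many whole scans of scan_bytes fit into disk_bytes (overhead ignored)."""
    return disk_bytes // scan_bytes


if __name__ == "__main__":
    disk = 8 * TB  # assumed disk size, purely for illustration
    for size_mib in (5, 20):
        count = scans_per_disk(disk, size_mib * MIB)
        print(f"{size_mib} MiB per scan: about {count:,} scans on an 8 TB disk")
```

Under those assumptions, a million scans at the small end of the range fit on a single disk, while at the large end the same disk holds a few hundred thousand.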

Of course, that means they're "heavy". But you typically don't use them as they are. If you're making a web page, down-sample and convert to JPEG. It's easy to make smaller files from a good master with compression and down-sampling, but if all you have is a low-quality file, you're simply stuck.
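
A minimal sketch of that workflow in Python, assuming the Pillow library is installed. The function names, the 1600-pixel limit and the quality value of 85 are illustrative choices, not anything recommended in this thread, and unusual masters (16-bit or CMYK TIFFs, for example) may need extra handling.

```python
def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side; never upscale."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def make_web_copy(master_path: str, jpeg_path: str,
                  max_side: int = 1600, quality: int = 85) -> None:
    """Write a down-sampled JPEG copy; the lossless master file is left untouched."""
    from PIL import Image  # Pillow, imported here so fit_within works without it

    with Image.open(master_path) as img:
        size = fit_within(img.width, img.height, max_side)
        # JPEG has no alpha channel, so convert to plain RGB before saving.
        img.convert("RGB").resize(size, Image.LANCZOS).save(
            jpeg_path, "JPEG", quality=quality
        )
```

Calling `make_web_copy("scan_0001.tif", "scan_0001_web.jpg")` (hypothetical file names) would produce the small copy, and the master stays available for any later use.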

I've made that mistake in the past: I scanned to JPEG and down-sampled to what I needed at the time, made MP3s of music and AVIs of movies. Now the originals are lost or destroyed, and I regret it.

You don't always know what you will use them for later. For example, if you want to make a poster of a picture, you need high resolution and the best quality possible. And who knows what the future will bring. But I guess that 100 years from now they will still want the highest quality possible, and a few megabytes or gigabytes won't matter.
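
To make the poster example concrete, here is a small worked calculation in Python. The 6x4 inch photo and the 300 dpi print target are assumptions chosen for illustration (300 dpi is a common rule of thumb for close viewing, not a fixed requirement).

```python
def scan_pixels(width_in: float, height_in: float, scan_dpi: int) -> tuple[int, int]:
    """Pixel dimensions produced by scanning a print of the given size."""
    return round(width_in * scan_dpi), round(height_in * scan_dpi)


def print_size_inches(px_w: int, px_h: int, print_dpi: int) -> tuple[float, float]:
    """Largest print size (in inches) that keeps the given print resolution."""
    return px_w / print_dpi, px_h / print_dpi


if __name__ == "__main__":
    for scan_dpi in (150, 600):
        w, h = scan_pixels(6, 4, scan_dpi)
        pw, ph = print_size_inches(w, h, 300)
        print(f"{scan_dpi} dpi scan: {w}x{h} px, prints {pw:g}x{ph:g} in at 300 dpi")
```

With these numbers, the 600 dpi scan can be printed at twice the original size without dropping below the target, while the 150 dpi scan cannot even reproduce the original size at that quality.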
 
Old 04-27-2021, 06:10 PM   #3
IOACHIMVS
LQ Newbie
 
Registered: Jul 2020
Posts: 6

Original Poster
Rep: Reputation: Disabled
Thanks, Guttorm, I'm glad you shared your experience. I should have considered that scenario before.
Well, after all, storage prices have dropped significantly over the last few decades. It would be a shame to have only a curtailed version of such important information available in the future just to save a few bucks.
I'll make a fair investment to make sure future generations can enjoy those documents in their best form.

I guess for me the case is closed, but I'm leaving the thread open a little longer in case someone wants to add anything else.
 
Old 04-27-2021, 06:22 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
For me, scanning old slides was for memory jogging, not advancing the academic knowledge level of the species. Many of the slides were also faulty - scratches, mould, ...
So for me, jpeg was fine. The bigger issue for me was the amount of time needed to harvest those worth keeping. Before digital, I used to take a heap of photos just to make sure I got one worth looking at. No deleting in the camera with film. An awful lot never made it to the scanner.
My document scanning needs are almost non-existent, so no thoughts there.
 
Old 04-28-2021, 11:09 AM   #5
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Debian
Posts: 6,142

Rep: Reputation: 2314
DjVu may be an endangered species. Its support is largely confined to Linux, and the Internet Archive stopped accepting it four years ago. It was more popular 20 years ago, when it had the advantage of smaller file sizes than PDF and PDF was still patented.
 
Old 04-28-2021, 11:56 AM   #6
Michael Uplawski
Senior Member
 
Registered: Dec 2015
Posts: 1,622
Blog Entries: 40

Rep: Reputation: Disabled
Even lossless file formats can be compressed into archives. In terms of space, you gain as much as, or more than, with JPEG compression alone. We could now turn this into a discussion of compressors, but that is not necessary nowadays.
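
The key property of archive compression is that the data comes back bit for bit. A small sketch using Python's standard `lzma` module illustrates this; the "scan" is synthetic data invented for the example (a white page with a few dark lines), so its size reduction says nothing about real scans, which contain noise and usually shrink far less.

```python
import lzma


def make_fake_scan(width: int = 1000, height: int = 200) -> bytes:
    """Synthetic 8-bit greyscale page: white background with a few dark lines."""
    rows = []
    for y in range(height):
        row = bytearray([255]) * width          # one white row
        if y % 20 == 0:
            row[100:900] = bytes([30]) * 800    # a dark "text line"
        rows.append(bytes(row))
    return b"".join(rows)


original = make_fake_scan()
packed = lzma.compress(original)      # lossless compression
restored = lzma.decompress(packed)    # exact reconstruction

if __name__ == "__main__":
    print("identical after round trip:", restored == original)
    print(f"{len(original)} bytes -> {len(packed)} bytes")
```

How much space is actually saved depends heavily on the image content and on the compressor, so it is worth trying on a handful of real scans before deciding.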

For me, the original file format depends on the character of the data and its potential uses. I usually scan to TIFF, too, especially text documents and forms, as they are quickly turned into multi-page PDF files (via multi-page TIFFs) on the command line.
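
The post does not say which tools are used for this. One possible route, sketched below in Python, assumes the `tiffcp` and `tiff2pdf` utilities from libtiff are installed (some distributions ship them in a separate package, and newer libtiff releases may no longer include `tiff2pdf`). All file names are hypothetical.

```python
import subprocess


def tiff_to_pdf_commands(pages: list[str], combined_tiff: str,
                         pdf_path: str) -> list[list[str]]:
    """Build the two command lines: merge pages into one TIFF, then wrap it as PDF."""
    if not pages:
        raise ValueError("need at least one page")
    merge = ["tiffcp", *pages, combined_tiff]             # tiffcp in1 in2 ... out
    to_pdf = ["tiff2pdf", "-o", pdf_path, combined_tiff]  # tiff2pdf -o out.pdf in.tif
    return [merge, to_pdf]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmds = tiff_to_pdf_commands(["page1.tif", "page2.tif"], "all.tif", "form.pdf")
    for cmd in cmds:
        print(" ".join(cmd))
    # run_all(cmds)  # uncomment once the input files and the tools exist
```

The single-page TIFF masters stay as they are; the PDF is just a convenient derived copy.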

Images are a different topic. I still experiment a lot with JPEG 2000 and would replace JPEG and TIFF with it immediately if I were sure of finding decent viewers in the future. There are some that come with the conversion tools. But nobody talks about it and I am ... “chicken-hearted” (??? Really? That is an adjective?)

Last edited by Michael Uplawski; 04-28-2021 at 11:58 AM. Reason: language. An ongoing process...
 
  



Tags
archiving, djvu, file format, scanning, tiff




All times are GMT -5.
