LinuxQuestions.org
10-30-2013, 12:12 PM   #1
neymac
Member
 
Registered: May 2009
Distribution: Slackware64-14.1
Posts: 138

Rep: Reputation: 19
Diverse encodings of text files in Slackware-current mirrors


I notice that the text files in the Slackware mirror directories have different encodings. Is there any reason for this, or does it not matter?

I did this check:
Code:
bash-4.2$ file --mime-encoding *.txt
ChangeLog.txt: iso-8859-1
bash-4.2$ file --mime-encoding *.TXT
BOOTING.TXT:           us-ascii
CHANGES_AND_HINTS.TXT: us-ascii
COPYRIGHT.TXT:         us-ascii
CRYPTO_NOTICE.TXT:     us-ascii
FILELIST.TXT:          us-ascii
PACKAGES.TXT:          unknown-8bit
README_CRYPT.TXT:      us-ascii
README_LVM.TXT:        us-ascii
README_RAID.TXT:       iso-8859-1
README.TXT:            us-ascii
README_UEFI.TXT:       us-ascii
SPEAK_INSTALL.TXT:     us-ascii
SPEAKUP_DOCS.TXT:      us-ascii
UPGRADE.TXT:           us-ascii
 
10-30-2013, 01:16 PM   #2
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,374

Rep: Reputation: Disabled
Well, I don't know how it's actually possible to determine the encoding of these files, as the characters in them are almost all part of US-ASCII, which is 7-bit and a subset of ISO-8859-1 (and also of UTF-8, by the way).

Anyhow, that doesn't matter at all, IMO.
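
To see which lines make "file" report something other than us-ascii, one can list the lines that contain bytes outside printable 7-bit ASCII. This is only a rough sketch: the function name show_odd_bytes is made up for illustration, and it assumes a grep that supports -a (GNU grep does).

```shell
# Hypothetical helper: list lines holding bytes that are neither printable
# 7-bit ASCII nor whitespace. In the C locale this covers every byte >= 0x80,
# and also stray control characters such as ESC.
show_odd_bytes() {
    # -a: treat the file as text even if it contains unusual bytes
    # -n: prefix each matching line with its line number
    LC_ALL=C grep -an '[^[:print:][:space:]]' "$1"
}
```

For example, "show_odd_bytes ChangeLog.txt" prints the offending lines with their line numbers. A file that produces no output here is byte-for-byte valid as US-ASCII, ISO-8859-1 and UTF-8 alike, so the label a tool picks for it carries no extra information.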
 
10-30-2013, 03:21 PM   #3
neymac
Member
 
Registered: May 2009
Distribution: Slackware64-14.1
Posts: 138

Original Poster
Rep: Reputation: 19
@Didier Spaier: the "file" command shows the file type. I did another check, running "file" in the Slackware mirror folder and filtering for text files:

Code:
bash-4.2$ file * | grep text
ANNOUNCE.14_1:          ASCII text
BOOTING.TXT:            ASCII text, with CRLF line terminators
ChangeLog.txt:          ISO-8859 text
CHANGES_AND_HINTS.TXT:  ASCII text, with CRLF line terminators
CHECKSUMS.md5:          ASCII text
COPYING:                Pascal source, ASCII text
COPYING3:               Pascal source, ASCII text
COPYRIGHT.TXT:          ASCII text, with CRLF line terminators
CRYPTO_NOTICE.TXT:      ASCII text, with CRLF line terminators
FILELIST.TXT:           ASCII text
GPG-KEY:                ASCII text
PACKAGES.TXT:           Non-ISO extended-ASCII text
README_CRYPT.TXT:       Pascal source, ASCII text, with CRLF line terminators
README.initrd:          ASCII text
README_LVM.TXT:         ASCII text, with CRLF line terminators
README_RAID.TXT:        ISO-8859 text, with CRLF line terminators
README.TXT:             ASCII text, with CRLF line terminators
README_UEFI.TXT:        ASCII text
RELEASE_NOTES:          ISO-8859 text
Slackware-HOWTO:        Pascal source, ASCII text, with CRLF line terminators
SPEAK_INSTALL.TXT:      assembler source, ASCII text, with CRLF line terminators
SPEAKUP_DOCS.TXT:       C source, ASCII text, with CRLF line terminators
UPGRADE.TXT:            ASCII text, with CRLF line terminators

Last edited by neymac; 10-30-2013 at 03:38 PM.
 
10-30-2013, 03:48 PM   #4
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,374

Rep: Reputation: Disabled
Quote:
Originally Posted by neymac
@Didier Spaier: the "file" command shows the file type. I did another check, running "file" in the Slackware mirror folder and filtering for text files:
That simply shows that the algorithm used by "file" to determine the file type is not perfect, so you should not trust it blindly.

More specifically, there is in general no way to determine a file's encoding with certainty, so the algorithm has to rely on heuristics and assumptions, which can succeed or fail.
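
To make the point about heuristics concrete, here is a minimal sketch of the kind of guessing involved. It is not how "file" is implemented; the function name guess_encoding and the three labels it prints are assumptions chosen for illustration, and it relies on iconv being installed.

```shell
# Hypothetical helper: guess a file's encoding from its bytes alone.
guess_encoding() {
    # Count the bytes >= 0x80 by deleting every 7-bit byte (octal 000-177).
    high=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' ')
    if [ "$high" -eq 0 ]; then
        # Only 7-bit bytes: plain US-ASCII.
        echo "us-ascii"
    elif iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1; then
        # iconv exits non-zero when the input is not well-formed UTF-8.
        echo "utf-8"
    else
        # Some single-byte charset, but the bytes alone cannot say which one.
        echo "unknown-8bit"
    fi
}
```

The last branch is where certainty ends: most byte values are equally valid in ISO-8859-1, ISO-8859-15 and CP1252, so a tool can only pick one by making further assumptions about the text.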

For instance, we have man pages in various encodings, and sometimes all you can do is change the value of GROFF_ENCODING before running "man" until the page is displayed properly.
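
A rough sketch of that trial-and-error approach follows. The function name try_man_encodings, the list of candidate encodings, and the MAN_CMD override (there only so the loop can be exercised without a real man page) are all assumptions for illustration, not something "man" or groff provides.

```shell
# Hypothetical helper: view a man page under several candidate encodings.
try_man_encodings() {
    page=$1
    # Candidate list is an assumption; adjust it to the encodings you suspect.
    for enc in UTF-8 ISO-8859-1 ISO-8859-15; do
        printf '== GROFF_ENCODING=%s ==\n' "$enc"
        # The assignment applies to this one command only.
        GROFF_ENCODING=$enc ${MAN_CMD:-man} "$page"
    done
}
```

Running "try_man_encodings somepage" opens the page once per encoding; quit the pager to move on to the next candidate and keep the one that displays correctly.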

Last edited by Didier Spaier; 10-30-2013 at 04:21 PM.
 
  


All times are GMT -5.
