LinuxQuestions.org
Old 09-29-2016, 04:35 PM   #1
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Rep: Reputation: Disabled
Font File Examination Utility


Hi:

I have a large number of fonts from which I need to identify only those which cover specific Unicode blocks (scripts).

For example, I would like to be able to run some command (similar to a grep perhaps?) which would return a list of font file names in a directory (along with its subdirectories if need be, so it needs to be recursive) that contained (for instance) Greek and Thai scripts or something like that.

If that isn't feasible, I'd like to have the command search for any such files that contain one or more specific characters/glyphs (in this example perhaps λ or โ, Greek lambda or Thai sara o).

I'm aware of the Linux font configuration utility (see man fonts-conf) but, whether because of deficiencies in this utility or (I suspect) misconfigured font files themselves, this doesn't always provide information that is correct. Hence, my wish to just grep the ttf (or whatever) font file itself.

Does anyone have any suggestion?

Thanks in advance.
 
Old 09-30-2016, 01:36 AM   #2
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 17,250
Blog Entries: 10

Rep: Reputation: 5160
hmm.
my first thought upon seeing the thread title was "fontforge".
typing "font" into a terminal, pressing TAB, gives a few more options, one of them is called fontlint.
i have to hurry now, will look into it tonight.
 
Old 09-30-2016, 09:15 AM   #3
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Original Poster
Rep: Reputation: Disabled
Great idea: typing font and pressing tab; I hadn't even thought of that, BUT:

I'm familiar with fontlint (George Williams is the author of FontForge which, sadly, no longer seems to be updated). Like its namesake "lint", it reviews a font file (using routines from FontForge) for structural errors and omissions. In my experience many (if not most) fonts are not fully compliant with the various specs for their format. Fontlint is therefore useful for designers, but doesn't address what I'm looking for.

But thanks very much for the idea ...
 
Old 09-30-2016, 09:50 AM   #4
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,374

Rep: Reputation: 403
I'm not really sure what you are after. There are 2 commands "ttfdump" and "otfdump" that will dump tons of information about a font. The output is very different, but they both work on both ttf and otf fonts. This command should install both of them:

apt-get install libotf-bin texlive-binaries

Maybe look at the output and see if you can grep the output?
 
Old 09-30-2016, 11:26 AM   #5
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Original Poster
Rep: Reputation: Disabled
Much thanks... ttfdump is one more command I never heard of and would likely never have stumbled upon - and it was already installed on my system!

So I can use ttfdump to see whether a font contains either a Greek character (e.g. U+03A3, capital sigma) or a Thai character (U+0E01, ko kai) with the following:

ttfdump -t cmap FreeSerif.ttf | grep '0x03a3\|0x0e01'

which returns the following two lines:

Char 0x03a3 -> Index 860
Char 0x0e01 -> Index 2495

It seems like all (?) that's needed now is to create a recursive loop to examine all of my *ttf files (I haven't tried otfdump yet, but assume it will be similar) - adding the file name to each successful "find".
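The recursive loop described here could look something like the sketch below. It assumes ttfdump is on the PATH and prints one "Char 0x.... -> Index N" line per mapped character, as in the output shown above; the function name list_fonts_with is made up for illustration and is not part of any tool.

```shell
# list_fonts_with DIR PATTERN
# Print the path of every *.ttf file under DIR (recursively) whose
# "ttfdump -t cmap" output contains a line matching PATTERN (a grep regex).
# Assumption: file names do not contain newlines.
list_fonts_with() {
    dir=$1
    pattern=$2
    find "$dir" -type f -name '*.ttf' | while IFS= read -r font; do
        # </dev/null keeps the dump command from eating the list of file names
        if ttfdump -t cmap "$font" 2>/dev/null </dev/null | grep -q "$pattern"; then
            printf '%s\n' "$font"
        fi
    done
}

# Example (paths are placeholders):
#   list_fonts_with /usr/share/fonts 'Char 0x03a3 '
```

Matching on "Char 0x03a3 " rather than the bare number is meant to avoid hits on other lines of the dump that happen to contain the same hex value.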

It would be nice to figure out which fonts support "both", but I've never figured out how to do an AND in grep, so generally just pipe the output from one grep into a second grep; that works but gets rather tedious if ANDing more than two things.
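One way to get an AND without chaining greps is to capture the dump once and test each code point in turn, failing on the first miss. A minimal sketch, assuming the same "Char 0x.... -> Index N" output format as above; has_all_chars is a hypothetical helper name:

```shell
# has_all_chars HEX...
# Read a "ttfdump -t cmap" listing on stdin and succeed (exit status 0)
# only if every code point given as four-digit hex is mapped.
has_all_chars() {
    dump=$(cat)
    for hex in "$@"; do
        # -i so that 03A3 and 03a3 are treated alike
        printf '%s\n' "$dump" | grep -qi "Char 0x$hex " || return 1
    done
    return 0
}

# Example:
#   ttfdump -t cmap FreeSerif.ttf | has_all_chars 03a3 0e01 && echo "has both"
```

Adding a third or fourth requirement is then just another argument rather than another grep in the pipeline.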

At any rate, your suggestion looks like it has the potential to do what I need. It does seem like more work than should be required in the twenty-first century though, so if anyone knows of a better alternative, I'd love to hear it ...

So, again, thanks much.

P.S. I've always wanted to visit Trondheim because of its great musical culture and heritage, but never had the chance; I'm a little jealous.
 
Old 10-01-2016, 08:05 AM   #6
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 17,250
Blog Entries: 10

Rep: Reputation: 5160
i just tried otfdump like this:
Code:
 otfdump Cantarell-Bold.otf|grep -i enc
      (platformID 1) (encodingID 0) (languageID 0) (nameID 0)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 1)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 2)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 3)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 4)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 5)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 6)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 9)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 11)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 12)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 13)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 14)
      (platformID 1) (encodingID 0) (languageID 0) (nameID 20)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 0)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 1)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 2)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 3)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 4)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 5)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 6)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 9)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 11)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 12)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 13)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 14)
      (platformID 3) (encodingID 1) (languageID 1033) (nameID 20)
    (EncodingRecord (0) (platformID 0) (encodingID 3)
    (EncodingRecord (1) (platformID 1) (encodingID 0)
    (EncodingRecord (2) (platformID 3) (encodingID 1)
so to me it looks like these numerical IDs hold the key to what you're after - does the font support thai or greek encoding?
you'd just have to find out what these numbers mean.
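For what it's worth, these IDs are defined in the OpenType specification and describe how each name or cmap record is encoded, not which scripts the font covers, so on their own they probably won't answer the Greek/Thai question. A small lookup sketch covering only the combinations that appear in the output above (the function name describe_encoding is made up for illustration; languageID 1033 is US English):

```shell
# describe_encoding PLATFORM_ID ENCODING_ID
# Print a human-readable meaning for the platform/encoding pairs seen above.
describe_encoding() {
    case "$1/$2" in
        0/3) echo "Unicode platform, Unicode 2.0+ (BMP only)" ;;
        1/0) echo "Macintosh platform, Roman" ;;
        3/1) echo "Windows platform, Unicode BMP" ;;
        *)   echo "not covered by this sketch" ;;
    esac
}

# Example:
#   describe_encoding 3 1
```

In other words, all three encoding records in the Cantarell output say "this table is indexed by Unicode (or Mac Roman) values"; the actual coverage still has to be read from the cmap contents.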
 
Old 10-01-2016, 11:44 AM   #7
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Debian
Posts: 5,774

Rep: Reputation: 2133
One quick approach is to use LibreOffice Writer. The insert-special-character option will only show you the characters available in the selected font.
 
Old 10-01-2016, 11:59 AM   #8
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Original Poster
Rep: Reputation: Disabled
For Ondoho: Quite possibly! I've been playing with Guttorm's suggestions and, as with much of Linux, it seems as if there's an embarrassment of approaches to explore. I'll try to report back if I settle on something that's quick and useful.

For David: Using Writer would be anything but "quick" since I'd have to choose each available font, examine the special character dialog, etc. Above and beyond that, Writer doesn't handle such things well at all (although I suspect that's partly because of flaws in Linux's reading of font characteristics or errors in the formatting of the fonts themselves). It's quite well known that, in Linux, Writer will often substitute the font used for some characters even if the current font contains those characters - great program, but tricky to use when intermixing more than one script at a time, particularly since you sometimes need to fight with its "Complex Text Layout."

My actual objective is to identify the fonts that can be used when putting together a document that uses several different combinations of scripts.

And Thanks to you both for the responses - it's becoming evident that this can be done - I just need to work out what's the best approach for my purpose.
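Since the goal is script coverage rather than a single character, one rough approach is to count how many mapped code points fall inside a script's Unicode block, for example Greek and Coptic (U+0370 to U+03FF) or Thai (U+0E00 to U+0E7F). A sketch, assuming ttfdump cmap output in the "Char 0x.... -> Index N" form shown earlier in the thread arrives on stdin; count_in_range is a hypothetical helper:

```shell
# count_in_range LOW_HEX HIGH_HEX
# Read a "ttfdump -t cmap" listing on stdin and print how many distinct
# mapped code points lie in the inclusive range LOW_HEX..HIGH_HEX.
count_in_range() {
    lo=$((0x$1))
    hi=$((0x$2))
    n=0
    # sort -u because the same character may be listed once per cmap subtable
    for cp in $(sed -n 's/.*Char 0x\([0-9A-Fa-f][0-9A-Fa-f]*\) .*/\1/p' | sort -u); do
        v=$((0x$cp))
        if [ "$v" -ge "$lo" ] && [ "$v" -le "$hi" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example:
#   ttfdump -t cmap FreeSerif.ttf | count_in_range 0e00 0e7f
```

A count near the size of the block suggests real support for the script, while a count of one or two usually means a stray symbol; where to draw that line is a judgment call.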
 
Old 10-02-2016, 10:52 AM   #9
DavidMcCann
LQ Veteran
 
Registered: Jul 2006
Location: London
Distribution: PCLinuxOS, Debian
Posts: 5,774

Rep: Reputation: 2133
Quote:
Originally Posted by CVAlkan
It's quite well known that, in Linux, Writer will often substitute the font used for some characters even if the current font contains those characters
Only when outputting text, not in the choice box.

I don't know how many fonts you use (I only use 16), but I compiled a text giving sample alphabets in each available variant (bold, small-caps, etc.) and other details. So Cardo has no variants, is a Garald with a small x-height, the 12 point is actually 12 on 15, it includes Old Italic and Hebrew, I've added the Gothic and Old Cyrillic, etc. Fun to compile and quite useful.
 
Old 10-02-2016, 11:57 AM   #10
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Original Poster
Rep: Reputation: Disabled
Correct, but then outputting text is really the point, isn't it?

I use quite a few fonts depending on the particular project; among the needs are several written scripts (mostly English/Latin/Roman/lower ASCII, Thai and Greek, but several projects have included more) and a number of specialty scripts such as music symbols. For any ONE project, though, the number of primary fonts usually isn't more than two or three.

It's interesting that you mention the "12 on 15" sort of designator: this becomes interesting when mixing multiple scripts, since languages such as Thai have actual characters (i.e. vowels and tone marks as well as simple diacritics) that are placed above and below other characters. Getting acceptable line spacing with mixed language sentences (e.g. "the Greek phrase 'xxx' means 'yyy' in Thai") while keeping the size of the base characters reasonably similar can sometimes be "interesting."

Mixing languages is also interesting when the distinctions among serif/sans-serif/italic are utterly irrelevant to one or more of them.

These and other considerations are what is driving this effort: I decided I was going to compile some sample docs such as you described (one for each group of languages that needed to be used together), but soon discovered that searching for the fonts that had a particular group of scripts was quickly becoming extremely tedious.

Based on the earlier responses to this post I'm beginning to see that such a utility can be constructed, and I'm now messing around with exactly how I want to use it - leading, as usual, to completely ignoring the actual project that drove me to this in the first place so that I can design the "perfect" solution - always fun when the problem itself is constantly changing.

But, that's how we Linux users get our kicks I suppose.

Again, thanks much.
 
Old 11-02-2016, 02:22 PM   #11
CVAlkan
Member
 
Registered: Nov 2012
Location: Northwest suburbs of Chicago
Distribution: Ubuntu & Mint LTS, Manjaro Rolling; Android
Posts: 218

Original Poster
Rep: Reputation: Disabled
In an attempt to "solve" my original question ("How do I generate a list of fonts that contain support for specific scripts/languages?"), I've written a prototype bash shell script to do that; there is also an attached pdf that explains why it exists, how to use it, how to modify it and so forth.

I should mention that this need is primarily driven by experiencing unwanted font substitutions in LibreOffice Writer and other applications. The primary cause of these seems to be the use of fonts that either don't have or don't correctly report their coverage and other capabilities. This is particularly annoying when intermingling multiple scripts/languages within a single sentence, paragraph, or document.

The shell script will help to select fonts that are appropriate for a given combination of scripts/languages, as well as identify which fonts should perhaps be retired to the great font foundry in the sky and replaced with better ones.

Comments are welcome.

Frank
Attached Files
File Type: pdf Evaluating-fonts-for-multilingual-use.pdf (238.3 KB, 13 views)
File Type: pdf Evaluating-fonts-for-multilingual-use-SCRIPT.pdf (141.6 KB, 16 views)
 
  



Tags
fontconfig, unicode

