Setting up CID fonts for Ghostscript – technical notes
While working on the CJK font integration into Ghostscript, we had to navigate a lot of unclear waters, at least unclear to me. While there are some pages describing the final setup, the reasons behind it were often unclear to me. Here I want to collect my notes on why things have to be done this way.
Font formats
I will not go into the details of font formats; there are thousands of pages out there doing this. I only want to list the font formats with which we are concerned, namely OpenType fonts. Due to their history, these fonts come in two variations: the more standard TTF form, containing TrueType outlines using quadratic Bézier curves, and the OTF/CFF form, containing CFF outlines using cubic Bézier curves. Basically, OpenType fonts are TrueType fonts with some additional tables, and in the case of OTF fonts one special table contains the CFF outlines.
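To tell the two variations apart without a font tool, one can look at the first four bytes of the file, which carry the sfnt version tag. The following is only a minimal sketch: the function names are made up for illustration, and a collection file (tag ttcf) can contain fonts of either kind, so it is reported separately.

```python
# sfnt version tags found in the first four bytes of a font file.
SFNT_FLAVORS = {
    b"\x00\x01\x00\x00": "TrueType outlines (TTF)",
    b"true": "TrueType outlines (TTF)",      # older Apple-style tag
    b"OTTO": "CFF outlines (OTF/CFF)",
    b"ttcf": "font collection (TTC/OTC)",
}


def sniff_flavor(header: bytes) -> str:
    """Classify a font by the first four bytes of its file."""
    return SFNT_FLAVORS.get(header[:4], "unknown")


def sniff_file(path: str) -> str:
    """Read the tag from a font file on disk and classify it."""
    with open(path, "rb") as fh:
        return sniff_flavor(fh.read(4))
```

For example, sniff_file("A-OTF-RyuminPro-Medium.otf") would be expected to report CFF outlines if the file is an OTF/CFF font. Note that this says nothing about whether the CFF data is CID-keyed.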
CID-keyed fonts were developed to deal with the huge number of glyphs in CJK fonts. For each registry (mostly Adobe), each ordering (for example Japan1 or CNS1), and each supplement (a number), there is a defined list that assigns each glyph a CID (an integer). The detailed (and big) lists can be seen here for Adobe-Japan1-6, Adobe-CNS1-6, Adobe-GB1-5, Adobe-Korea1-2.
Furthermore, the kind of font stored in an OTF/CFF file can also vary: there are CID fonts stored in OTF/CFF files, and non-CID fonts.
In the following, when referring to OTF/CFF we will assume it contains actual CID fonts, and not any other font data.
CID fonts and encodings
To actually access glyphs in a CID font, one usually applies an encoding to it. The idea is that for each encoding there is a file (a CMap) that maps the character codes of that encoding to the CIDs of the glyphs. Encodings can be Unicode, or some local standard, or the CID values themselves.
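As a toy illustration of what applying an encoding means, the sketch below splits a byte string into two-byte codes and looks each one up in a table. The table values are made up for illustration and are not taken from any real CMap file; passing no table mimics the case where the codes are the CID values themselves.

```python
# Hypothetical code-to-CID table; real CMaps hold thousands of entries.
TOY_CMAP = {0x0041: 100, 0x0042: 101}  # made-up values, not from a real CMap


def codes_to_cids(data: bytes, cmap=None):
    """Map a string of 2-byte big-endian codes to a list of CIDs.

    With cmap=None the codes are used directly as CIDs (identity mapping).
    Codes missing from the table map to CID 0, the .notdef glyph.
    """
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    codes = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    if cmap is None:
        return codes
    return [cmap.get(code, 0) for code in codes]
```

The same byte string therefore selects different glyphs depending on which table is applied, which is why a font always has to be paired with an encoding.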
Ghostscript and (CID) fonts
Ghostscript’s Usage documentation (Use.html) describes the usage of CID fonts in Sections 8.2 and 8.3. Fonts and encodings in Post(Ghost)Script are specified by
CIDResource-Encoding
and Section 8.2 states that if the CID resource is available, the font resource will be constructed automatically.
As it turned out, this is not the case for OTF/CFF fonts copied/linked into the CIDFont directory of the Ghostscript Resource directory. According to one of the developers the reason is the following:
We try to parse out the registry and ordering without actually loading the entire CIDFont (which can be very time consuming), but the code that does that, doesn’t understand OTTO fonts, and probably doesn’t understand CFF fonts either.
(Here OTTO fonts are, I guess, the OTF/CFF fonts: OTTO is the four-byte tag at the start of OpenType files with CFF outlines.)
TTF fonts
For OTF/TTF fonts, that is, those that look like normal TTF fonts and contain quadratic splines, the encoding is again handled differently. The reason is that Ghostscript is in principle a PostScript interpreter, so TTF fonts are not really within its realm. Well, that changed long ago, but still.
So for TTF fonts, one has to specify an Explicit CIDFont Substitution (again, see Use.html, Section 8.4) by providing some code that loads the TTF font and encodes it properly.
Here we were advised not to put the TTF fonts into the Resource/Font directory, because they are not font resources in the PostScript sense.
How to set up the fonts
So, how do we piece all that together to make a CID font available to Ghostscript? The procedure varies according to font type.
OTF/CFF fonts
Again, assuming the font file contains a real CID font, there are two steps involved: copying or linking the file into Ghostscript’s Resource/CIDFont directory under the correct name, and creating pre-made snippets for the necessary font/encoding pairs.
- Resource/CIDFont: One needs to find out the PostScript name. Normally the file name without extension is the PostScript name, but this is not always the case. To find out the PostScript name, use otfinfo from the LCDF TypeTools package, normally available in most distributions:
$ otfinfo --postscript-name A-OTF-RyuminPro-Medium.otf
RyuminPro-Medium
(This is an example where the name differs!) Having found the correct PostScript name, copy or link the OTF file to the CIDFont directory, where the link name is the PostScript name, without extension. In our case that would be
Resource/CIDFont/RyuminPro-Medium -> /path/to/A-OTF-RyuminPro-Medium.otf
- Font snippets: Here we have to create a small file in Resource/Font for each combination of CIDFont and needed Encoding. Note that CIDFont here means the PostScript name. One such file, named CIDFONT-ENCODING, looks like:
%!PS-Adobe-3.0 Resource-Font
%%DocumentNeededResources: ENCODING (CMap)
%%IncludeResource: ENCODING (CMap)
%%BeginResource: Font (CIDFONT-ENCODING)
(CIDFONT-ENCODING)
(ENCODING) /CMap findresource
[(CIDFONT) /CIDFont findresource]
composefont
pop
%%EndResource
%%EOF
Here all CIDFONT and ENCODING need to be replaced by proper values: For CIDFONT there should be a file Resource/CIDFont/CIDFONT, and for ENCODING a file Resource/CMap/ENCODING.
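The two steps for OTF/CFF fonts can be sketched in Python as follows. This is only an illustration under some assumptions: the function names are made up, otfinfo must be on the PATH for postscript_name to work, and the snippet template writes the header with the usual single %! and the structuring comments with %%.

```python
import os
import subprocess

# Font snippet; {font} is the PostScript name, {enc} the CMap name.
SNIPPET_TEMPLATE = """%!PS-Adobe-3.0 Resource-Font
%%DocumentNeededResources: {enc} (CMap)
%%IncludeResource: {enc} (CMap)
%%BeginResource: Font ({font}-{enc})
({font}-{enc})
({enc}) /CMap findresource
[({font}) /CIDFont findresource]
composefont
pop
%%EndResource
%%EOF
"""


def postscript_name(font_path):
    """Ask otfinfo (LCDF TypeTools) for the PostScript name of a font file."""
    result = subprocess.run(
        ["otfinfo", "--postscript-name", font_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def link_cidfont(resource_dir, font_path, ps_name):
    """Symlink font_path as <resource_dir>/CIDFont/<ps_name>, without extension."""
    cid_dir = os.path.join(resource_dir, "CIDFont")
    os.makedirs(cid_dir, exist_ok=True)
    link = os.path.join(cid_dir, ps_name)
    if os.path.lexists(link):
        os.remove(link)  # replace a stale link from an earlier run
    os.symlink(os.path.abspath(font_path), link)
    return link


def write_snippet(resource_dir, ps_name, encoding):
    """Write <resource_dir>/Font/<ps_name>-<encoding> from the template."""
    font_dir = os.path.join(resource_dir, "Font")
    os.makedirs(font_dir, exist_ok=True)
    path = os.path.join(font_dir, f"{ps_name}-{encoding}")
    with open(path, "w", encoding="ascii") as fh:
        fh.write(SNIPPET_TEMPLATE.format(font=ps_name, enc=encoding))
    return path
```

A typical call sequence would be name = postscript_name(path), then link_cidfont("Resource", path, name), then write_snippet("Resource", name, enc) for every CMap name enc that is needed and present in Resource/CMap.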
With these two things in place, the font should be available to Ghostscript by using
/CIDFONT-ENCODING findfont
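To check this without writing a test document, one can let Ghostscript run the findfont call directly. The sketch below builds such a command line using the standard switches -q, -dNODISPLAY, -dBATCH and -c; the function names are made up, and treating a non-zero exit status or the word Error: in the output as failure is an assumption about how Ghostscript reports the problem.

```python
import subprocess


def findfont_command(font_name, gs="gs"):
    """Build a Ghostscript command that only tries to load one font."""
    return [gs, "-q", "-dNODISPLAY", "-dBATCH",
            "-c", f"/{font_name} findfont pop quit"]


def font_loads(font_name, gs="gs"):
    """Return True if Ghostscript appears to load the font without an error."""
    result = subprocess.run(findfont_command(font_name, gs),
                            capture_output=True, text=True)
    output = result.stdout + result.stderr
    return result.returncode == 0 and "Error:" not in output
```

Here font_name is the combined CIDFONT-ENCODING name, for instance the PostScript name of the font followed by a hyphen and the name of a CMap.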
TTF fonts
There are two steps involved: Creating entries in the file Resource/Init/cidfmap, and creating pre-made snippets for the necessary font/encoding pairs.
- cidfmap entries: For each TTF font, one needs to know the ordering and the supplement. This is not trivial, and I don’t know how to find that out automatically. If you know them, say ORDER for the ordering and SUPP for the supplement, add a line
/NAME << /FileType /TrueType /Path (PATH-TO-TTF-FILE) /SubfontID 0 /CSI [(ORDER) SUPP] >> ;
Note that for TrueType collections (.ttc) you need to select the proper Subfont ID.
- Font snippets: As above, with NAME as the first part, i.e. files named NAME-ENCODING. (This might not be strictly necessary, though.)
With this in place, the TTF fonts should be available, too, in the same way as the OTF/CFF fonts.
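Writing the cidfmap lines by hand gets tedious for many fonts, so here is a small sketch that formats an entry and appends it to a cidfmap file only if it is not there yet. The function names are made up for illustration; the ordering and supplement still have to be supplied by you, since nothing here detects them.

```python
def ps_string(text):
    """Wrap text as a PostScript string literal, escaping \\, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def cidfmap_entry(name, path, ordering, supplement, subfont_id=0):
    """Format one cidfmap line for a TrueType font file."""
    return (f"/{name} << /FileType /TrueType /Path {ps_string(path)} "
            f"/SubfontID {subfont_id} /CSI [({ordering}) {supplement}] >> ;")


def append_entry(cidfmap_path, entry):
    """Append entry to the cidfmap file unless it is already present."""
    try:
        with open(cidfmap_path, encoding="utf-8") as fh:
            existing = fh.read()
    except FileNotFoundError:
        existing = ""
    if entry in existing:
        return False
    with open(cidfmap_path, "a", encoding="utf-8") as fh:
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        fh.write(entry + "\n")
    return True
```

For a TrueType collection, pass the index of the wanted font as subfont_id.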
Addition: using a relative path for TTF fonts
In case you want to create setups in a way that they are self-contained, i.e., not referring to fonts somewhere in the system, you might copy or link the TTF fonts into the directory Resource/CIDFSubst, and then use the following code to load it:
/NAME << /FileType /TrueType /Path pssystemparams /GenericResourceDir get (CIDFSubst/FILENAME.ttf) concatstrings /SubfontID 0 /CSI [(ORDER) SUPP] >> ;
Here the FILENAME.ttf is the link/copy of your font.
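A sketch of this self-contained variant: copy the font into Resource/CIDFSubst and produce the matching cidfmap line with the path computed at run time from GenericResourceDir. The function names are hypothetical, and the file name is assumed to contain no parentheses or backslashes.

```python
import os
import shutil


def install_substitute(resource_dir, ttf_path):
    """Copy ttf_path into <resource_dir>/CIDFSubst and return its file name."""
    subst_dir = os.path.join(resource_dir, "CIDFSubst")
    os.makedirs(subst_dir, exist_ok=True)
    filename = os.path.basename(ttf_path)
    shutil.copyfile(ttf_path, os.path.join(subst_dir, filename))
    return filename


def relative_cidfmap_entry(name, filename, ordering, supplement, subfont_id=0):
    """Format a cidfmap line whose path is relative to GenericResourceDir."""
    return (f"/{name} << /FileType /TrueType "
            f"/Path pssystemparams /GenericResourceDir get "
            f"(CIDFSubst/{filename}) concatstrings "
            f"/SubfontID {subfont_id} /CSI [({ordering}) {supplement}] >> ;")
```

Because the path is resolved relative to the resource directory, the whole Resource tree can be moved or packaged without touching the cidfmap file.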
Conclusion
Having done all the above, all the fonts should be available to Ghostscript. But that doesn’t mean they are actually working. Unfortunately, especially with TTF fonts, I often see problems where the font is loaded but the resource is not found. A typical error message I receive from Ghostscript for some TTF fonts is:
Error: /invalidfont in .pickcmap_with_xlatmap
Other than that, I can use most of the fonts.
Now, if that all sounds too complicated for you, take a look at my CJK font integration script, which tries to automate this for a large list of CJK fonts.
Enjoy. And please leave comments concerning errors, improvements, etc. Thanks.