Embed TrueType fonts in PDF

TrueType fonts and their newer variant, OpenType, are ubiquitous on current computers. It is therefore desirable to use these fonts in PDFs. Most of these fonts have a liberal enough licence to allow embedding.

This text is for developers who implement PDF. Embedding involves creating two objects: /Font and /FontDescriptor. While /Font is document dependent, defining the encoding and widths, the /FontDescriptor depends only on information from the font itself. This page aims to summarize how to gather the correct information from a TrueType font; that information is distributed among several specifications.

Disclaimer: I develop on Windows and therefore have neither experience with nor knowledge of Mac or Linux environments.

For the /Font object, see PDF spec (2.0: 9.6.2.1, 1.7: 9.6.1) “Type 1 fonts - General” and PDF spec (9.6.3) “TrueType fonts”.

<< 
  /Type           /Font
  /Subtype        /TrueType
  /BaseFont       <Name>
  /FirstChar      <Integer>
  /LastChar       <Integer>
  /Widths         <array of Integer>
  /Encoding       <an encoding>
  /ToUnicode      <a CMap>
  /FontDescriptor <a FontDescriptor>
>>

The /BaseFont name is the PostScript name in the “name” table of the font:

name -> ID 6

The PostScript name is special, since it never contains blanks.
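As an illustration, here is a minimal Python sketch of how name ID 6 could be read from the raw bytes of a font file. The function names find_table and postscript_name are made up for this example. It assumes a single font (not a collection), takes the first record with name ID 6, and treats Macintosh platform strings as single-byte text, which is a simplification.

```python
import struct

def find_table(font: bytes, tag: bytes):
    """Return (offset, length) of the table with the given 4-byte tag, or None."""
    num_tables = struct.unpack_from(">H", font, 4)[0]
    for i in range(num_tables):
        # Table records start after the 12-byte header; each record is 16 bytes.
        rec_tag, _checksum, offset, length = struct.unpack_from(
            ">4sLLL", font, 12 + 16 * i)
        if rec_tag == tag:
            return offset, length
    return None

def postscript_name(font: bytes):
    """Return the PostScript name (name ID 6), or None if the font has none."""
    table = find_table(font, b"name")
    if table is None:
        return None
    base, _length = table
    _format, count, string_offset = struct.unpack_from(">HHH", font, base)
    for i in range(count):
        platform, _enc, _lang, name_id, length, offset = struct.unpack_from(
            ">6H", font, base + 6 + 12 * i)
        if name_id != 6:
            continue
        start = base + string_offset + offset
        raw = font[start:start + length]
        # Unicode (0) and Windows (3) platforms store UTF-16BE strings.
        # Assumption: anything else is decoded as single-byte text.
        encoding = "utf-16-be" if platform in (0, 3) else "latin-1"
        return raw.decode(encoding)
    return None
```

A None result corresponds to the fallback case described by the spec, where the name has to be derived from the operating system.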

The spec states:

In the absence of a PostScript language name in the "name" table, a PostScript language name should be derived from the name by which the font is known in the host operating system.

No further information is given. I examined all TrueType/OpenType fonts installed on my Windows 10 computer. Of the 526 fonts, 313 start with the right identifier (first 4 bytes: 0, 1, 0, 0). All 313 have a name table with a PostScript name, so the fallback was never needed. Because of this lack of test cases, I left a #halt in the code for this case; maybe some day such a font will come by.
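The identifier check can be sketched in a few lines of Python. The function name sfnt_kind and its return strings are invented for this example. The additional tags OTTO (OpenType with CFF outlines) and ttcf (font collection) come from the OpenType specification and are only a guess at what the remaining fonts on such a system might be.

```python
def sfnt_kind(font: bytes) -> str:
    """Classify a font file by its first 4 bytes."""
    tag = font[:4]
    if tag == b"\x00\x01\x00\x00":
        return "truetype"        # TrueType outlines, the case handled here
    if tag == b"OTTO":
        return "opentype-cff"    # OpenType with CFF outlines
    if tag == b"ttcf":
        return "collection"      # several fonts in one file
    return "unknown"
```

Only files classified as "truetype" match the identifier described above.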

The only hint on how to find the right OS name on Windows is an undocumented gdi32 function: https://stackoverflow.com/questions/42135461/c-powershell-undocumeted-winapi-function-getfontresourceinfow.

/Encoding is /WinAnsiEncoding or an encoding dictionary (see PDF spec …). I don't know the details yet.

/Widths lists all the widths of the glyphs of the current encoding starting with code /FirstChar up to /LastChar.
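A possible way to build the array, sketched in Python: PDF expects the widths in units of 1/1000 of the text size, while a TrueType font stores advance widths in its own design units (unitsPerEm), so the values need scaling. The function widths_array and the dictionary advance_by_code are hypothetical; the sketch assumes the advance widths have already been looked up per character code.

```python
def widths_array(advance_by_code, units_per_em, first_char, last_char,
                 missing_width=0):
    """Build the /Widths entries for the codes first_char..last_char.

    advance_by_code maps a character code to the glyph's advance width
    in font design units. Codes without a glyph get missing_width.
    """
    scale = 1000 / units_per_em
    return [
        round(advance_by_code.get(code, missing_width) * scale)
        for code in range(first_char, last_char + 1)
    ]
```

For a font with 2048 units per em, an advance width of 1024 design units becomes 500 in the array.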

For accessibility, add /ToUnicode with a CMap that maps each code to Unicode.
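The following Python sketch shows what generating such a CMap might look like for a font with single-byte codes. The function name to_unicode_cmap and its input dictionary are assumptions made for the example. The surrounding boilerplate follows the commonly used ToUnicode CMap layout, and bfchar sections are split into chunks of at most 100 entries.

```python
def to_unicode_cmap(code_to_unicode):
    """Return the text of a ToUnicode CMap for single-byte codes.

    code_to_unicode maps a character code (0..255) to a Unicode string.
    """
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",
        "endcodespacerange",
    ]
    items = sorted(code_to_unicode.items())
    for start in range(0, len(items), 100):
        chunk = items[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        for code, text in chunk:
            # The target is written as UTF-16BE in hexadecimal.
            target = text.encode("utf-16-be").hex().upper()
            lines.append(f"<{code:02X}> <{target}>")
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"
```

The returned text would become the content of the stream referenced by /ToUnicode.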

The required attributes for embedding are:

<<
  /Type /FontDescriptor
  /FontName    <Name>     "same as /Font/BaseFont"
  /Flags       <Integer>
  /FontBBox    <Array with 4 numbers>
  /ItalicAngle <Number>
  /Ascent      <Number>
  /Descent     <Number>
  /CapHeight   <Number>   "not for symbolic fonts"
  /StemV       <Number>
  /FontFile2   <Stream>                           "for TrueType"
  /FontFile3   <Stream with /Subtype /OpenType>   "for OpenType"
>>

/Flags is a bit field, with bit 1 as the lowest-order bit:

Bit 1: FixedPitch

Bit 2: Serif

Bit 3: Symbolic

Set if the font contains any characters that are not in the Standard Latin character set (PDF spec D.2).

Bit 4: Script

Bit 6: Nonsymbolic

Must be the inverse of bit 3 (Symbolic).

Bit 7: Italic

Bit 17: AllCap

Bit 18: SmallCap

Bit 19: ForceBold
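Combining the bits into the /Flags integer can be sketched in Python as follows. Bit n contributes the value 2^(n-1), since the PDF spec numbers bits from 1 starting at the lowest-order bit. The names FLAG_BITS and font_flags are made up for this example.

```python
FLAG_BITS = {
    "FixedPitch": 1,
    "Serif": 2,
    "Symbolic": 3,
    "Script": 4,
    "Nonsymbolic": 6,
    "Italic": 7,
    "AllCap": 17,
    "SmallCap": 18,
    "ForceBold": 19,
}

def font_flags(*names):
    """Combine the named flags into the /Flags integer."""
    # Exactly one of Symbolic and Nonsymbolic must be set.
    if ("Symbolic" in names) == ("Nonsymbolic" in names):
        raise ValueError("set exactly one of Symbolic and Nonsymbolic")
    value = 0
    for name in names:
        value |= 1 << (FLAG_BITS[name] - 1)
    return value
```

For example, font_flags("Nonsymbolic") gives 32, which is the value for a plain sans-serif Latin text font.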

  • truetypeembedding.txt
  • Last modified: 2023/07/21 18:35
  • by christian