Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
cmap [2020/02/23 13:37] christian [Monster from the wild] |
cmap [2020/02/23 14:33] (current) christian [CMap] |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== CMap ====== | ====== CMap ====== | ||
- | CMaps(([[https:// | + | CMaps(([[https:// |
CMaps provide a very general mechanism that can describe any mapping, including Unicode, which was developed later. Input codes of variable length (1, 2, 3 or more bytes) can be mapped to characters. | CMaps provide a very general mechanism that can describe any mapping, including Unicode, which was developed later. Input codes of variable length (1, 2, 3 or more bytes) can be mapped to characters. | ||
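As a rough illustration of variable-length input codes, the following Python sketch splits a byte string into codes using a set of codespace ranges. The ranges in `CODESPACE` and the function name `split_codes` are assumptions made for this example and do not come from any particular CMap. The matching strategy (shortest range first, one byte consumed when nothing matches) is a simplification, not the full algorithm a PDF reader would use.

```python
from typing import List, Tuple

# Hypothetical codespace: one 1-byte range and one 2-byte range.
CODESPACE: List[Tuple[bytes, bytes]] = [
    (b"\x00", b"\x80"),          # <00> <80>
    (b"\x81\x40", b"\x9f\xfc"),  # <8140> <9FFC>
]


def split_codes(data: bytes, codespace=CODESPACE) -> List[bytes]:
    """Split data into input codes of varying length (simplified sketch)."""
    codes = []
    pos = 0
    while pos < len(data):
        # Try the shortest ranges first.
        for lo, hi in sorted(codespace, key=lambda r: len(r[0])):
            chunk = data[pos:pos + len(lo)]
            # Every byte must lie within the range given for its position.
            if len(chunk) == len(lo) and all(
                l <= b <= h for l, b, h in zip(lo, chunk, hi)
            ):
                break
        else:
            # No range matched: consume a single byte as an invalid code.
            chunk = data[pos:pos + 1]
        codes.append(chunk)
        pos += len(chunk)
    return codes


codes = split_codes(b"\x41\x81\x40\x42")
```

With the assumed ranges, the input is split into a 1-byte code, a 2-byte code, and another 1-byte code.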
Line 127: | Line 127: | ||
CMaps are not well defined. Therefore, there are some interesting variations of them in the wild. Here is a small selection of some issues. | CMaps are not well defined. Therefore, there are some interesting variations of them in the wild. Here is a small selection of some issues. | ||
- | ==== Mappings outside the codespace | + | ==== Codespace problems |
+ | |||
+ | === Wrong code length | ||
<code postscript> | <code postscript> | ||
Line 146: | Line 148: | ||
This can be seen often. These illegal mappings are collected into the ''# | This can be seen often. These illegal mappings are collected into the ''# | ||
+ | |||
+ | === Mappings outside the codespace === | ||
+ | |||
+ | <code postscript> | ||
+ | %... | ||
+ | 1 begincodespacerange | ||
+ | < | ||
+ | endcodespacerange | ||
+ | 11 beginbfchar | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | %... | ||
+ | </ | ||
+ | |||
+ | Here, only the first mapping matches the codespace. All others fall outside of it, because the second byte has to be between <00> and <04>.
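The byte-wise nature of this check can be sketched in Python. The range `<0000> <FF04>` and the source codes below are hypothetical values chosen for illustration, since they are not the ones from the CMap above, and `in_codespace` is a name invented for this sketch. The point is that a code is compared byte by byte against the range and not as a single number.

```python
def in_codespace(code: bytes, lo: bytes, hi: bytes) -> bool:
    """Return True if every byte of code lies within the range for its position."""
    # A code of the wrong length can never match the range.
    if not (len(code) == len(lo) == len(hi)):
        return False
    return all(l <= c <= h for l, c, h in zip(lo, code, hi))


# Hypothetical codespace range <0000> <FF04>:
# the first byte may be 00..FF, the second byte only 00..04.
LO, HI = bytes.fromhex("0000"), bytes.fromhex("FF04")

# Hypothetical bfchar source codes.
sources = ["0003", "0105", "0010"]

outside = [s for s in sources if not in_codespace(bytes.fromhex(s), LO, HI)]
```

In this sketch `<0105>` is rejected even though 0x0105 lies numerically between 0x0000 and 0xFF04, because its second byte exceeds `<04>`.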
==== Wrong PostScript ==== | ==== Wrong PostScript ==== | ||
Line 196: | Line 216: | ||
</ | </ | ||
- | It looks as if two codes (<24> and <50>) are mapped to a string of 2-byte characters. | + | Two codes (<24> and <50>) are mapped to a string of 2-byte characters. |
+ | |||
+ | Seen in a PDF with the '' |