Differences
This shows you the differences between two versions of the page.
cmap [2020/02/23 10:29] christian [Decoding] |
cmap [2020/02/23 11:15] christian [Wrong PostScript] |
||
---|---|---|---|
Line 127: | Line 127: | ||
Sometimes CMaps define mappings which are not covered by the codespace ranges. This can be seen very often in the wild. These illegal mappings are collected into the ''# | Sometimes CMaps define mappings which are not covered by the codespace ranges. This can be seen very often in the wild. These illegal mappings are collected into the ''# | ||
- | ===== Examples | + | ===== Monster |
+ | |||
+ | ==== Mappings outside the codespace | ||
single byte mappings in a double byte codespace | single byte mappings in a double byte codespace | ||
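A minimal sketch of how such mappings could be separated from the legal ones. It assumes the usual CMap rule that a code matches a codespace range when it has the same length and every byte lies within the corresponding byte range. The names `in_codespace` and `split_mappings` are hypothetical and not taken from this page's implementation.

```python
def in_codespace(code: bytes, ranges: list[tuple[bytes, bytes]]) -> bool:
    """Return True if `code` is covered by at least one codespace range."""
    for lo, hi in ranges:
        # Lengths must agree, and each byte must lie within its own range.
        if len(code) == len(lo) == len(hi) and all(
            l <= c <= h for l, c, h in zip(lo, code, hi)
        ):
            return True
    return False


def split_mappings(mappings: dict[bytes, str], ranges: list[tuple[bytes, bytes]]):
    """Split mappings into (legal, illegal) according to the codespace ranges."""
    legal: dict[bytes, str] = {}
    illegal: dict[bytes, str] = {}
    for code, target in mappings.items():
        bucket = legal if in_codespace(code, ranges) else illegal
        bucket[code] = target
    return legal, illegal


# Hypothetical data: a double-byte codespace with one single-byte mapping.
ranges = [(b"\x00\x00", b"\xff\xff")]
mappings = {b"\x00\x41": "A", b"\x41": "A"}
legal, illegal = split_mappings(mappings, ranges)
```

With this data the single-byte code `<41>` ends up in `illegal`, because no codespace range has length 1.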
- | using /find instead of / | + | ==== Wrong PostScript ==== |
+ | |||
+ | using /find instead of / | ||
+ | |||
+ | See [[postscript# | ||
+ | ==== Prevent copying ==== | ||
+ | |||
+ | <code postscript> | ||
+ | %... | ||
+ | 1 begincodespacerange | ||
+ | < | ||
+ | endcodespacerange | ||
+ | 100 beginbfchar | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | %... | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | %... | ||
+ | </ | ||
+ | |||
+ | Here, all codes map to the same character (the substitute character, Ctrl-Z, 0x1A) to prevent text extraction. Also notable is the ordering of the entries by their second byte, which forced me to redesign the object structure to avoid exponential processing time. | ||
- | preventing copying | + | Seen in [[https:// |
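One way to recognise this pattern is to check whether every code maps to the same target. The following is only a sketch under that assumption. The function name `maps_everything_to_one_target` and the sample data are hypothetical, and a real check would probably need further heuristics.

```python
def maps_everything_to_one_target(mappings: dict[bytes, str]) -> bool:
    """Return True if there are several codes and all share a single target."""
    return len(mappings) > 1 and len(set(mappings.values())) == 1


SUB = "\x1a"  # substitute character, Ctrl-Z

# Hypothetical mapping tables, not taken from the PDF mentioned above.
protected = {b"\x00\x01": SUB, b"\x01\x01": SUB, b"\x00\x02": SUB}
normal = {b"\x00\x41": "A", b"\x00\x42": "B"}
```

Here `maps_everything_to_one_target(protected)` is true, while the call with `normal` is false.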
+ | ==== Char to string mapping ==== | ||
+ | <code postscript> | ||
+ | %... | ||
+ | /CMapType 2 def | ||
+ | 1 begincodespacerange | ||
+ | < | ||
+ | endcodespacerange | ||
+ | 1 beginbfchar | ||
+ | < | ||
+ | endbfchar | ||
+ | 1 beginbfchar | ||
+ | < | ||
+ | endbfchar | ||
+ | 50 beginbfrange | ||
+ | < | ||
+ | %... | ||
+ | </ | ||
It looks as if two codes (<24> and <50>) are mapped to a string of 2-byte characters. I have not found anything about this in the documentation. Seen in a PDF with the '' |
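If this CMap is used as a ToUnicode CMap (an assumption; the snippet above does not say so), the destination of a `bfchar` entry can be read as a UTF-16BE string, which may contain more than one character. A minimal sketch of decoding such a destination follows. The helper name `decode_bf_target` and the hex values are hypothetical, since the original values are elided above.

```python
def decode_bf_target(hex_string: str) -> str:
    """Decode the hex digits of a bfchar/bfrange destination as UTF-16BE."""
    return bytes.fromhex(hex_string).decode("utf-16-be")


# Hypothetical destinations: two 2-byte units give a two-character string,
# and a surrogate pair gives one character outside the BMP.
two_chars = decode_bf_target("00660066")
non_bmp = decode_bf_target("D840DC3E")
```

With these inputs `two_chars` is the string `ff` and `non_bmp` is the single code point U+2003E.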