Differences
This shows you the differences between two versions of the page.
cmap [2020/02/23 10:53] christian [Char to string mapping] |
cmap [2020/02/23 11:17] christian [Monster from the wild] |
||
---|---|---|---|
Line 124: | Line 124: | ||
* the mappings are ordered. This is not strictly prescribed, but recommended by the specifications. | * the mappings are ordered. This is not strictly prescribed, but recommended by the specifications. | ||
- | ==== Handling malformed CMaps ==== | ||
- | |||
- | Sometimes CMaps define mappings which are not covered by the codespace ranges. This can be seen very often in the wild. These illegal mappings are collected into the ''# | ||
===== Monster from the wild ===== | ===== Monster from the wild ===== | ||
+ | |||
==== Mappings outside the codespace ==== | ==== Mappings outside the codespace ==== | ||
Line 137: | Line 135: | ||
using /find instead of / | using /find instead of / | ||
+ | See [[postscript# | ||
==== Prevent copying ==== | ==== Prevent copying ==== | ||
+ | <code postscript> | ||
+ | %... | ||
+ | 1 begincodespacerange | ||
+ | < | ||
+ | endcodespacerange | ||
+ | 100 beginbfchar | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | %... | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | %... | ||
+ | </ | ||
+ | |||
+ | Here, all codes map to the same character (the substitute character, Ctrl-Z, U+001A) to prevent text extraction. The entries are also ordered by the second byte, which forced me to redesign the object structure to avoid exponential processing time. | ||
+ | |||
+ | Seen in [[https:// | ||
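A minimal sketch of how such a copy-protection CMap could be detected once its bfchar entries are parsed. It assumes the mappings are already available as a dictionary from source code bytes to target strings. The names `looks_copy_protected` and `bfchar` are hypothetical and are not taken from the parser described on this page.

```python
# Hypothetical helper: flags a CMap whose bfchar entries all map to U+001A.
SUBSTITUTE = "\u001a"  # substitute character, Ctrl-Z


def looks_copy_protected(bfchar, min_entries=2):
    """Return True if every source code maps to the substitute character.

    bfchar: dict of source code (bytes) -> target string (assumed layout).
    min_entries: smallest number of mappings that counts as a pattern.
    """
    if len(bfchar) < min_entries:
        return False
    # Collapse all targets into a set; one shared target leaves one element.
    return set(bfchar.values()) == {SUBSTITUTE}


# The dictionary is keyed by the full code, so the order in which entries
# arrive (for example sorted by the second byte) does not affect the check.
protected = {
    b"\x00\x01": SUBSTITUTE,
    b"\x01\x01": SUBSTITUTE,
    b"\x00\x02": SUBSTITUTE,
}
normal = {b"\x00\x01": "A", b"\x00\x02": SUBSTITUTE}
```

With these sample dictionaries, `looks_copy_protected(protected)` is true and `looks_copy_protected(normal)` is false. A caller could use the result to warn that extracted text will be meaningless rather than silently emitting control characters.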
==== Char to string mapping ==== | ==== Char to string mapping ==== | ||