Differences

This shows you the differences between two versions of the page.

cmap [2020/02/23 10:29]
christian [Decoding]
cmap [2020/02/23 11:17]
christian [Mappings outside the codespace]
Line 124: Line 124:
   * the mappings are ordered. This is not strictly prescribed, but recommended by the specifications.   * the mappings are ordered. This is not strictly prescribed, but recommended by the specifications.
  
-==== Handling malformed CMaps ====+===== Monster from the wild =====
  
-Sometimes CMaps define mappings which are not covered by the codespace ranges. This can be seen very often in the wild. These illegal mappings are collected into the ''#unmapped'' variable of a Mappings object. + 
-===== Examples from the wild =====+==== Mappings outside the codespace ====
  
 single byte mappings in a double byte codespace single byte mappings in a double byte codespace
  
-using /find instead of /findresource+Sometimes CMaps define mappings which are not covered by the codespace ranges. This can be seen very often in the wild. These illegal mappings are collected into the ''#unmapped'' variable of a Mappings object. 
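The separation of such mappings can be sketched roughly as follows. This is only an illustration in Python, not the actual Mappings object: the names ''in_codespace'' and ''split_mappings'' are hypothetical, and the sketch assumes that a code belongs to a codespace range when it has the same byte length as the range and each of its bytes lies between the corresponding bytes of the range bounds.

```python
def in_codespace(code: bytes, ranges) -> bool:
    """Return True if `code` falls inside at least one codespace range.

    Assumption: a code matches a range when it has the same number of
    bytes and every byte lies within the bounds at the same position.
    """
    for low, high in ranges:
        if len(code) != len(low):
            continue
        if all(l <= b <= h for l, b, h in zip(low, code, high)):
            return True
    return False


def split_mappings(mappings, ranges):
    """Split mappings into (mapped, unmapped) by codespace membership."""
    mapped, unmapped = {}, {}
    for code, target in mappings.items():
        bucket = mapped if in_codespace(code, ranges) else unmapped
        bucket[code] = target
    return mapped, unmapped


# A double byte codespace with one legal and one single byte mapping.
ranges = [(bytes.fromhex("0000"), bytes.fromhex("FFFF"))]
mappings = {
    bytes.fromhex("0041"): "A",  # two bytes: inside the codespace
    bytes.fromhex("41"): "A",    # one byte: outside the codespace
}
mapped, unmapped = split_mappings(mappings, ranges)
```

Here the single byte code ends up in ''unmapped'', which plays the role of the ''#unmapped'' variable described above.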
 + 
 +==== Wrong PostScript ==== 
 + 
 +using /find instead of /findresource  
 + 
 +See [[postscript#exception_handling_example]] 
 +==== Prevent copying ==== 
 + 
 +<code postscript> 
 +%... 
 +1 begincodespacerange 
 +<0000> <FFFF> 
 +endcodespacerange 
 +100 beginbfchar 
 +<0000> <001A> 
 +<0100> <001A> 
 +<0200> <001A> 
 +<0300> <001A> 
 +<0400> <001A> 
 +%... 
 +<4900> <001A> 
 +<4A00> <001A> 
 +<0001> <001A> 
 +<0101> <001A> 
 +<0201> <001A> 
 +<0301> <001A> 
 +<0401> <001A> 
 +%... 
 +</code> 
 + 
 +Here, all codes map to the same character (U+001A, the substitute character, Ctrl-Z) to prevent text extraction. The ordering by the second byte is also notable: it forced me to redesign the object structure to avoid exponential processing time.
  
-preventing copying+Seen in [[https://github.com/adobe-type-tools/Adobe-CNS1/raw/master/Adobe-CNS1-7.pdf|The Adobe-CNS1-7 Character Collection]]. 
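A CMap of this kind can be recognised by counting how often each destination occurs. The following Python sketch is a heuristic under stated assumptions, not part of the implementation described on this page: the function names and the 0.9 threshold are made up for illustration, and the mappings are assumed to be held in a dictionary keyed by the full code.

```python
from collections import Counter


def dominant_target(mappings):
    """Return (target, share) for the most frequent destination."""
    if not mappings:
        return None, 0.0
    counts = Counter(mappings.values())
    target, count = counts.most_common(1)[0]
    return target, count / len(mappings)


def looks_copy_protected(mappings, threshold=0.9):
    """Heuristic: nearly all codes share a single destination.

    The threshold is an arbitrary assumption for this sketch.
    """
    _, share = dominant_target(mappings)
    return len(mappings) > 1 and share >= threshold


# Codes in the order they appear in the CMap (ordered by second byte).
codes = ["0000", "0100", "0200", "0300", "0400", "0001", "0101", "0201"]
mappings = {bytes.fromhex(code): "\x1a" for code in codes}
target, share = dominant_target(mappings)
protected = looks_copy_protected(mappings)
```

Because the dictionary is keyed by the complete code, the order in which the entries arrive does not affect the cost of adding one; whether that matches the redesign mentioned above is not known.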
 +==== Char to string mapping ====
  
 +<code postscript>
 +%...
 +/CMapType 2 def
 +1 begincodespacerange
 +<00><FF>
 +endcodespacerange
 +1 beginbfchar
 +<24><0009 000d 0020 00a0>
 +endbfchar
 +1 beginbfchar
 +<50><002d 00ad 2010>
 +endbfchar
 +50 beginbfrange
 +<21><21><0050>
 +%...
 +</code>
  
 +It looks as if two codes (<24> and <50>) are mapped to a string of 2-byte characters. I have not found anything about this in the documentation. Seen in a PDF with the ''Producer'' "Mac OS X 10.7.1 Quartz PDFContext".
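One way to read such destinations is to treat the hex string as a sequence of UTF-16BE code units. The Python sketch below does this under that assumption, which this page does not confirm; ''decode_destination'' is a hypothetical helper name. White space inside the hex string is skipped, as it is when a PostScript hex string is read.

```python
def decode_destination(hex_string: str) -> str:
    """Decode a bfchar destination given as hex digits.

    Assumption: the bytes are UTF-16BE code units. bytes.fromhex
    skips the spaces between the groups of hex digits.
    """
    raw = bytes.fromhex(hex_string)
    return raw.decode("utf-16-be")


# The two destinations from the example above.
whitespace_like = decode_destination("0009 000d 0020 00a0")
hyphen_like = decode_destination("002d 00ad 2010")
```

Under this reading, code <24> stands for a tab, a carriage return, a space and a no-break space, and code <50> for a hyphen-minus, a soft hyphen and a hyphen (U+2010).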
  • cmap.txt
  • Last modified: 2020/02/23 14:33
  • by christian