There are many object types in the PDF specification, and only some of them are defined in the library so far. This page explains in detail how to add a new object type to the library and why you should do so.
Why bother to define a type? The usual objects are dictionaries which can be processed and viewed as they are.
The attribute /BS of a /FreeText annotation holds an untyped dictionary.
To make it nice in the PDFExplorer!
Adding an object type allows for a customized presentation with a printString, an icon, attribute documentation and order etc. (see below for all the details).
/W of a /BorderStyle object shows its documentation.
Usually, you inspect a PDF with the PDFExplorer and find some object which is not documented. To define an object type it is important to have an example open in the PDFExplorer so that you can see the changes. In our example this is the object in the attribute /BS in a /FreeText annotation object.
To find out what the object is about, look up the relevant section of the PDF specification. In our case this is a border style dictionary, described in chapter “12.5.4 Border Styles” on page 386.
The new class can be defined with the
The package is [PDF Interactive Features], because it is related to /Annot, which is defined there.
The namespace should be Graphics.PDF, since this is the only namespace for the runtime code of the library.
As the name for this example, I use BorderStyle. Ideally, the name should be the same as the one used in the PDF specification. If the name does not match the name in the specification, be it because the name is already defined or for aesthetic reasons, the class method
subtype, depending on the type inference mechanism) needs to be implemented.
Most often, the superclass will be Dictionary, or TypedDictionary if the object has the common attribute /Type. It can also be something more exotic, such as Name or something else (see later).
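As a rough sketch, the settings above correspond to a class definition like the following. This assumes a VisualWorks image (where classes are defined with the `defineClass:…` message) and that the library's Dictionary class lives in the Graphics.PDF namespace. The choice of Dictionary rather than TypedDictionary as superclass is an assumption made for illustration only, as is the empty category.

```smalltalk
"Sketch only: superclass and category are assumptions, not taken from the library."
Smalltalk.Graphics.PDF defineClass: #BorderStyle
	superclass: #{Graphics.PDF.Dictionary}
	indexedType: #none
	private: false
	instanceVariableNames: ''
	classInstanceVariableNames: ''
	imports: ''
	category: ''
```

The class itself needs no instance variables here, because the attributes are stored in the dictionary and exposed through accessor methods.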
The first line of the class comment should give the reference to the PDF specification, followed by the first paragraph of the description in the specification. I usually edit this text to add line breaks after sentences and to remove any cross references to other parts of the specification:
    PDF border style dictionary as defined in PDF 32000_2008.pdf, section 12.5.4, pp. 386.
    An annotation may optionally be surrounded by a border when displayed or printed.
    If present, the border shall be drawn completely inside the annotation rectangle.
    In PDF 1.1, the characteristics of the border shall be specified by the Border entry in the annotation dictionary.
    Beginning with PDF 1.2, the border characteristics for some types of annotations may instead be specified in a border style dictionary designated by the annotation’s BS entry.
    Such dictionaries may also be used to specify the width and dash pattern for the lines drawn by line, square, circle, and ink annotations.
    If neither the Border nor the BS entry is present, the border shall be drawn as a solid line with a width of 1 point.
Two more bits of information should be added as methods on the class side.
documentationPlace defines the section in the PDF specification. This is a more recent addition, intended to make it possible to jump directly to the corresponding place in the specification PDF from the code browser or the PDFExplorer. The jump itself has not been implemented yet and most objects do not have this method, but I add it for new objects. Eventually, I will add it to all objects.
    documentationPlace
        ^#(12 5 4)
If the object type was not part of the original PDF specification 1.0, the version should be added.
version notes the minor part of the PDF version in which this feature first occurred, allowing for computing the minimal version for a PDF.
The version is usually mentioned in the specification of the object. After I add this method, I remove the corresponding text from the class comment.
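For the border style dictionary, which the specification introduces with PDF 1.2, the class-side method could look like the following sketch. The method comment is illustrative and not taken from the library.

```smalltalk
version
	"Class side of BorderStyle.
	Border style dictionaries were introduced with PDF 1.2,
	so the minor version is 2."

	^2
```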
Since a new type is defined, the object types have to be reset with
This clears the cache for all the object types (Smalltalk classes - 137 at the time of writing). On next access, the cache is filled with all known types, including the newly defined ones, so that the new type can be found.
This has to be done only when a new class is defined.
The new type can now be used. To do so, set the type of the attribute that contains the object to the new type. In the example, the <type:> pragma of the method BS should be changed from
    BS
        <type: #Dictionary>
        <version: 6>
        <attribute: 9 documentation: 'A border style dictionary specifying the line width and dash pattern that shall be used in drawing the annotation’s border. The annotation dictionary’s AP entry, if present, takes precedence over the BS entry'>
        ^self objectAt: #BS ifAbsent: [Dictionary empty]

to

    BS
        <type: #BorderStyle>
        <version: 6>
        <attribute: 9 documentation: 'A border style dictionary specifying the line width and dash pattern that shall be used in drawing the annotation’s border. The annotation dictionary’s AP entry, if present, takes precedence over the BS entry'>
        ^self objectAt: #BS ifAbsent: [BorderStyle empty]
The result looks like this in the PDFExplorer (after hitting F5 for refresh):
The style is recognized as BorderStyle and it shows the right version (PDF-1.2), but the required field Type is red (error) and the W field is pink (not known).
Attributes are added as methods in the protocol accessing entries. Each method is named exactly like the key in the definition, including the capital letter, although this is not common Smalltalk style.
The first two attributes (of 4) look like this in the PDF specification:
The corresponding methods look like this:
    Type
        <type: #Name>
        <attribute: 1 documentation: 'The type of PDF object that this dictionary describes.'>
        ^self objectAt: #Type ifAbsent: [#Border asPDF]

    W
        <type: #Number>
        <attribute: 2 documentation: 'The border width in points. If this value is 0, no border shall be drawn.'>
        ^self objectAt: #W ifAbsent: [1 asPDF]
An attribute method consists of a number of describing pragmas and the code for access.
The <type: aSymbol> pragma is mandatory. Its argument is the name, as a symbol, of the Smalltalk class implementing the PDF type. This is derived from the “Type” column of the definition table. For more information about typing and the possible type pragmas, see typing.
The documentation is specified in the <attribute: anInteger documentation: aString> pragma. The first parameter defines the order of the attribute, so that the attributes can be displayed in the same order as they are defined by the PDF specification. The first attribute shall be 1 and the next ones are numbered consecutively.
The documentation is taken directly from the specification and edited to remove all information that is already expressed by the method itself. In our example, the “(Optional)” is removed, because this is implied. If the attribute is required, the <required> pragma is used to express this fact.
The description of the default value is also removed, because this is evident from the access code.
Also references to other parts of the specification are removed (which is not the case in the example).
Often, new attributes were added with later PDF versions. The version of an attribute, if it is higher than the version of the type, can be noted with the <version: anInteger> pragma, where the argument is the minor version of the PDF (i.e. 2 for version PDF-1.2).
The access code can be either
^self objectAt: #Type ifAbsent: [#Border asPDF]
for optional attributes with a default value, or
^self objectAt: #Type
for a required attribute. This form raises an error if the attribute is not present in the object.
Such a method returns the object that is the value of the attribute. The attribute may hold the object directly or a reference to it; in either case, the object itself is returned. To access the raw value instead (object or reference), the following methods can be used:
^self at: #Type ifAbsent: [#Border asPDF]
^self at: #Type
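Putting the pragmas and the access code together, the third attribute of the border style dictionary, /S, might be written as in the sketch below. It follows the pattern of the Type and W methods above. The documentation string is abridged from section 12.5.4 and should be checked against the specification, and the default #S (solid) is the default value given there. None of this is taken from the library's source.

```smalltalk
S
	"Sketch of the third attribute; the documentation text is abridged."
	<type: #Name>
	<attribute: 3 documentation: 'The border style:
S (Solid) A solid rectangle surrounding the annotation.
D (Dashed) A dashed rectangle surrounding the annotation.
B (Beveled) A simulated embossed rectangle.
I (Inset) A simulated engraved rectangle.
U (Underline) A single line along the bottom of the annotation rectangle.'>

	^self objectAt: #S ifAbsent: [#S asPDF]
```

Because the attribute is optional and has a default, the method uses the objectAt:ifAbsent: form and carries no <required> pragma. No <version:> pragma is needed either, since the attribute is as old as the type itself.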
Now, the PDF type is sufficiently defined to be usefully displayed in the PDFExplorer. But more can be done by defining some of the following methods.
listText returns a short Text used in the treeview of the PDFExplorer. The method titleText is used for the display of the selected object on the right side.
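As a purely hypothetical illustration, a listText for BorderStyle could show the border width. The wording of the label and the use of printString on the value of W are assumptions; the library may offer a better way to render a PDF number.

```smalltalk
listText
	"Hypothetical sketch: show the border width in the treeview.
	asText converts the String to a Text."

	^('border width: ' , self W printString) asText
```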
toolListIcon can be defined on the class side to get an icon for the class in the Smalltalk browser. A PDF type class has the method listIcon on the instance side which, by default, returns the toolListIcon of the class. By overriding listIcon, it is therefore possible to select an icon depending on the object's state.
Some attributes clutter the treeview on the left side of the PDFExplorer. For example, every TypedDictionary has the attribute /Type, which is usually used as the name of the object itself.
By defining the method displayKeysToOmit, such attributes can be excluded from the children of the object in the treeview. For the class TypedDictionary the method looks like this:
    displayKeysToOmit
        ^super displayKeysToOmit , #(#Type)