This is an old revision of the document!
Monsters from the wild
Some PDF writers produce PDFs which are not correct according to the specification. The term monster
refers to lakatosian monsters as coined by Imre Lakatos to refer to counterexamples of a theory.
Software trying to read real PDF files, cannot just throw an error when something is wrong. Instead, it should deal with wrong structures and try to use as much information as possible from the file.
Generally, situations like this will raise a proceedable specific error. Therefore, the error could be treated by the reading software, but could also be ignored if it is not important.
This page describes some of the problems encountered in real PDFs from the wild and discusses ways to deal with such situations.
Missing object
An attribute of an object has a reference pointing to a free reference in the cross references, i.e. the object is not in the file.
When writing out such an object, a string (The original object is missing. It should have been a <Type>)
is written instead. This preserves the correct reference which may be used in several places. Subsequently, this will result in a type mismatch when reading that PDF (unless the original object was a string as well, which is unlikely).
Seen in Producer
Microsoft® Excel® for Microsoft 365
Bluebeam PDF Library 18
Incorrect stream length
The dictionary part of a stream has a Length
which is different from the number of bytes between the stream
and endstream
tokens.
Seen in Producer
Bluebeam PDF Library 18