
PDF – Basics – Cheat Sheet

PDF Terminology

ISO 32000-2:2020

Name      ISO standard  Description
PDF 2.0   ISO 32000     Latest core PDF specification that applies to all PDF files. Fully vendor-neutral specification of every non-obsoleted PDF feature since PDF 1.0.
PDF/A     ISO 19005     Archival. PDF for the long-term preservation of static page appearance.
PDF/X     ISO 15930     eXchange. PDF for graphic arts and professional printing workflows including packaging and labelling.
PDF/UA    ISO 14289     Universal Accessibility. PDF supporting accessibility and assistive technology such as screen readers for those with vision impairments.
PDF/R     ISO 23504     Raster. Simplified PDF that uses banded images and that is easy to create by low-resource embedded devices such as consumer flatbed scanners.
PDF/VT    ISO 16612     Variable/Transactional for high-speed variable data (VDP) and transactional printing in the graphic arts and professional printing industries. Builds on PDF/X.
PDF/VCR   ISO 16613     Variable Content Replacement. Templatized PDF/X supporting late-stage variable content merging, such as adding batch numbers on pharmaceutical packaging.
PDF/E     ISO 24517     Engineering. PDF 1.6 based subset to support engineering-centric 3D PDF workflows such as aerospace and automotive engineering. Superseded by PDF/A-4.

Subset names and conformance levels (examples: PDF/raster, PDF/X-5pg, PDF/A-4f)
  PDF subset: uppercase letters = ISO, lowercase = industry.
  Version of subset: optional.
  Subset conformance level(s): optional, lowercase letters.
A single PDF can conform to multiple subsets and conformance levels.

MIME types
application/pdf    Official MIME type for all PDF files. See RFC 8118.
application/fdf    Official MIME type for FDF (Forms Data Format) files. See ISO 32000 for FDF file specification.
application/xfdf   Official MIME type for XFDF (XML Forms Data Format) files. See ISO 19444-1 for XFDF file format specification.

Key PDF features by version

Version             Year  Features
PDF 1.0             1993  Pure text. Only RGB. Resolution independent graphics.
PDF 1.1             1996  Device-independent color. Article threads. External links. Security. Multimedia. Actions.
PDF 1.2             1996  Prepress features: CMYK, spot color, halftoning, overprinting, OPI (Open Prepress Interface). Flate compression. New types of annotations. Interactive forms.
PDF 1.3             2000  DeviceN color. 2-byte CID fonts. Smooth shadings. More types of annotations. Large media. Page labels. Digital signatures. JavaScript. Alternate images. Masked images.
PDF 1.4             2001  Transparency and blend modes. Improved security. More prepress features. JBIG2 images.
PDF 1.5             2003  JPEG 2000. Layers (optional content). Tagged PDF. Object streams and cross-reference streams for better compression.
PDF 1.6             2004  OpenType. Ultra-large media. Watermarking. Visibility expressions. AES encryption. Interactive 3D with U3D. Measurement properties.
PDF 1.7             2006  Portable collections (packages). 3D enhancements. Redaction annotations. Standardized as ISO 32000-1:2008.
PDF 2.0             2017  ISO 32000-2. UTF-8 strings. 256-bit AES-CBC encryption. Unicode passwords. Black point compensation. Rich media annotations. PAdES. PRC for 3D. Geospatial features. Document parts. Associated Files. Metadata streams. Deprecation of older encryption and other legacy features, including XFA.
PDF 2.0 Extensions        256-bit AES-GCM encryption. Hash algorithm, elliptic curve, and digital signature updates. Integrity protection via MACs. STEP for 3D. Clarification on PDF 1.7 and PDF 2.0 namespaces.

Common terms for PDF features

Bookmarks        Outlines which use Actions or Destinations.
Comments         Markup annotations.
Compression      Filters on streams. Object streams. Cross-reference streams.
Fast Web View    Linearization.
Files            Embedded Files and File Attachment annotations.
Forms            Widget annotations and Fields. Also referred to as AcroForm.
Hyperlinks       Link annotations, URI actions. Actions and Destinations.
JavaScript (JS)  ECMAScript for PDF (ISO 21757). ECMAScript Actions.
Layers           Optional Content (OC), Visibility Expressions. Marked Content.
Multimedia       3D, Movie, Screen, and RichMedia annotations with Actions.
Page size        The page MediaBox.
Portfolios       Collections and Navigators with Embedded Files.
Properties       Document Information dictionary and XMP Metadata streams.
Scanned PDF      Contains images of content such as produced by a scanner or camera. Often has OCR-ed invisible text (Text render mode 3) placed on top of the image allowing text selection by users.
Security         Encryption, Crypt filters, and Digital Signatures.
Tags             Tagged PDF, including Marked Content and Logical Structure.

Glossary

Action                  PDF feature enabling automatic behaviours triggered by a user interaction or event (e.g., change to a different page when a bookmark is clicked, follow a URL hyperlink, etc.).
AT                      Assistive Technology. Associated with PDF/UA and Tagged PDF.
BBox                    Bounding Box. A common key name.
Conformance level       Represented by letter designators after a PDF subset acronym (e.g., PDF/A-1b, PDF/X-5pg, PDF/VT-2s). Each Conformance Level has its own specialized set of rules and requirements.
COS                     Carousel Object Syntax. The syntax used by PDF and FDF files. "Carousel" was the codename for Acrobat 1.0.
Cross-reference stream  (PDF 1.5 and later only) Cross-reference information stored in a stream instead of a standard cross-reference (xref) table. Trailer dictionary entries are in the cross-reference stream dictionary.
Destination             An object defining a view of a document, comprising a page, the location of the document window on that page, and zoom factor.
Direct object           PDF object that occurs inline where it is defined and does not have its own object identifier (object number and generation number pair).
FDF                     Forms Data Format. File format to store interactive form data.
Hybrid-reference PDF    (PDF 1.5 and later only) PDF containing objects referenced by conventional cross-reference tables in addition to objects in object streams referenced by cross-reference streams.
Linearized PDF          Commonly referred to as Fast Web View.
obj                     Object abbreviation. A reserved PDF keyword.
Object stream           (PDF 1.5 and later only) A stream in which indirect objects may be stored, as an alternative to being stored in PDF body sections.
OCG                     Optional Content Group. A selectable “layer” of page content.
Owner Password          Password with full (owner) access, including ability to change passwords and access permissions of the PDF document.
PAdES                   PDF Advanced Electronic Signatures. ETSI standard EN 319 142.
Page Label              Optional descriptive text for referring to pages that can be shown on-screen (e.g., i, ii, iii, …, Chapter 1, Chapter 2, etc.). This contrasts with the zero-based integer page index used internally in PDF files.
startxref               Reserved PDF keyword that occurs just before the %%EOF end-of-file comment marker along with the byte offset to the cross-reference data for the PDF file (expressed as an integer in ASCII).
trailer                 The trailer dictionary is required in every PDF and defines special objects (e.g., largest object number, the Document Catalog root). Also a reserved keyword.
User Password           Password with restricted access permissions (as set by an author).
Widget                  A subtype of PDF annotation used with interactive forms that represents the GUI “widgets” through which data entry is done.
XFA                     XML Forms Architecture. Proprietary XML-based specifications supporting dynamic forms. Deprecated in PDF 2.0.
XFDF                    XML-based version of FDF defined by ISO 19444-1.
XMP                     eXtensible Metadata Platform. XML-based metadata standard (ISO 16684) used by many file formats. Required by PDF subsets and PDF 2.0.
xref                    Reserved PDF keyword that indicates the start of a standard cross-reference table. Often shorthand for “cross-reference table”.
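
The subset naming scheme on this page (a subset name, an optional version number, and optional lowercase conformance letters, as in PDF/X-5pg or PDF/A-4f) can be illustrated with a small parser. This is only a sketch: the function name `parse_subset_name`, the regular expression, and the returned dictionary keys are assumptions made for illustration and do not come from any PDF standard or library.

```python
import re

# Hypothetical pattern: "PDF/" + subset letters, then optionally
# "-" + version digits + lowercase conformance letters.
SUBSET_RE = re.compile(
    r"^PDF/(?P<subset>[A-Za-z]+)(?:-(?P<version>\d+)(?P<levels>[a-z]*))?$"
)


def parse_subset_name(name):
    """Split a subset identifier such as 'PDF/X-5pg' into its parts."""
    match = SUBSET_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognisable PDF subset name: {name!r}")
    subset = match.group("subset")
    version = match.group("version")
    return {
        "subset": subset,
        # Per the cheat sheet: uppercase letters = ISO, lowercase = industry.
        "origin": "ISO" if subset.isupper() else "industry",
        "version": int(version) if version else None,
        "levels": match.group("levels") or "",
    }


if __name__ == "__main__":
    for example in ("PDF/X-5pg", "PDF/A-4f", "PDF/raster"):
        print(example, parse_subset_name(example))
```

The sketch keeps the conformance letters together as one string rather than guessing how a given subset standard groups them.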
Report errata at https://github.com/pdf-association/pdf-issues/

© 2023, PDF Association, https://www.pdfa.org (v1.00)


Lexical Rules

End-of-Line (EOL) Sequences
0x0D        Carriage Return (CR) only
0x0A        Line Feed (LF) only
0x0D 0x0A   Carriage Return (CR) followed by Line Feed (LF) only

White space
0x00   Null byte
0x09   Horizontal Tab (HT)
0x0C   Form Feed (FF)
0x20   SPACE
EOL    Any End-of-Line sequence (see above)
% …    PDF comments (starting from % to EOL) are treated as single white space

Token Delimiter symbols
(       Literal string start token
)       Literal string end token
<, <<   Hex string start token / dictionary start (<<) token
>, >>   Hex string end token / dictionary end (>>) token
[       Array start token
]       Array end token
/       PDF name
%       Comment to end-of-line (outside of a string or inside a content stream)
{       Only in Type 4 PostScript calculator functions
}       Only in Type 4 PostScript calculator functions

PDF Basics

Reserved Keywords (case sensitive)
endobj, endstream, f, false, n, null, obj, R, startxref, stream, trailer, true, xref

Objects

Boolean
    Examples: true, false
    • Case sensitive keywords.

Integer
    Examples: 0, 123, +45, -67890
    • Signed decimal integer.
    • No hexadecimal or octal integers.

Real Number
    Examples: 1.23, -45.6, +7.8, -.9, 0.
    • Signed decimal floating-point numbers.
    • No exponential or scientific formats.
    • Integers can be used for real numbers.

String (literal string)
    Examples: (balanced () ok), (unbalanced \(), (line \ break), (line \nbreak), (octal \234 code)
    • Encrypted PDFs encrypt string objects.
    • Unicode strings with byte order markers.
    • Backslash escape sequences for literal strings:
        Sequence  Meaning
        \n        LF (0x0A)
        \r        CR (0x0D)
        \t        Horizontal Tab (0x09)
        \b        Backspace (0x08)
        \f        Formfeed (0x0C)
        \(        Left parenthesis
        \)        Right parenthesis
        \\        Backslash
        \ddd      Octal code. 1-3 digits

Hex string
    Examples: <48656c6c6F>, <41424> % 0 added

String types
    text string: PDFDocEncoding, UTF-16BE (BoM: 0xFE 0xFF), or UTF-8 (BoM: 0xEF 0xBB 0xBF)
    ASCII string
    byte string

Name
    Examples: /CaseSensitive12, /HashSign#23
    • Start with / SOLIDUS (0x2F).
    • Can use # followed by 2 hex digits.

Array
    Examples: [/AName true null -1.23 10 0 R … ], [] % empty array
    • One dimensional ordered collection with zero or more elements.
    • Array elements can be any type of object.

Dictionary
    Example:
        <<
        /KeyName value
        …
        >>
    • Associative unordered table containing key/value pairs known as an entry.
    • Keys must be unique direct name objects.
    • If value is null then same as if key does not exist.

Stream
    Example:
        10 0 obj
        << /Length int
        >>
        stream
        …stream data…
        endstream
        endobj
    • Contains zero or more stream data bytes.
    • Always need a stream dictionary.
    • Stream data can be compressed and/or encrypted using Filters.
    • Always an indirect object.
    • Cannot be in object streams.
    • Encrypted PDFs encrypt stream data.

Null object
    Example: null
    • Case sensitive reserved keyword.

Indirect Reference
    Example: 10 0 R
    • Object number then generation number.
    • Method to refer to another object.

Document Structure

[Figure: tree rooted at the Document catalog. Nodes: Page tree (Page: Content stream, Thumbnail image, Annotations), Outline hierarchy (Outline entry), Structure tree (Structure element), Metadata, Names dictionary (Embedded files, JavaScripts), Interactive form, Collections, Document parts, Information and controls.]

Painting Graphics

Content stream level
    Allowed operators: General graphics state, Special graphics state, Colour, Text state, Marked content, Compatibility.
Path object (entered with m, re; left with path-painting operators)
    Allowed operators: Path construction.
Clipping path object (entered from a path object with W, W*; left with path-painting operators)
    Allowed operators: None.
Text object (BT … ET)
    Allowed operators: General graphics state, Colour, Text state, Text-showing, Text-positioning, Marked-content, Compatibility.
Shading object (sh, immediate)
    Allowed operators: None.
External object (Do, immediate)
    Allowed operators: None.
Inline image object (BI … EI)
    Allowed operators: ID.

File Structure
(when not using cross-reference streams (PDF 1.5))

Header
    %PDF-2.0
    %âãÏÓ
Linearization data (optional)
    ...
Body
    4 0 obj
    ...
    endobj
    5 0 obj
    ...
    endobj
    ...
Cross-reference Table
    xref
    0 7
    0000000000 65535 f
    0000000016 00000 n
    0000000096 00000 n
    0000000157 00000 n
    0000000289 00000 n
    ...
Trailer
    trailer
    <<
    /Root 1 0 R
    /Size 7
    ...
    >>
    startxref
    3987
    %%EOF

Cross-reference table notes
● Subsection = object number, number of objects in next xref subsection.
● 10-digit byte offset to object
● 5-digit generation number
● f = free object
● n = in-use object
● object 0 is start of free list
● every xref entry is exactly 20 bytes including EOL
● no comments in cross-reference sections

Trailer notes
● trailer is special dictionary
● always immediately after end of cross-reference table.
● /Root = Document Catalog
● /Size = number of objects + 1
● startxref = byte offset to xref keyword

Multiple Incremental Updates append additional Body, Cross-reference Table and Trailer sections to a PDF file, allowing edits and changes without rewriting the full PDF. The link to the previous PDF state is the Prev entry in the trailer dictionary, which points to the previous xref.
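
The cross-reference layout described under File Structure (a `startxref` offset near the end of the file, subsection headers, and fixed 20-byte entries) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a robust PDF reader: the helper names `find_startxref` and `parse_xref_table` and the `SAMPLE` bytes are invented for this example, and the code ignores cross-reference streams, hybrid-reference files, and the Prev chain used by incremental updates.

```python
import re

# One xref entry: 10-digit offset, 5-digit generation, then "f" or "n".
ENTRY_RE = re.compile(rb"(\d{10}) (\d{5}) ([fn])")
# One subsection header: first object number and entry count, then EOL.
HEADER_RE = re.compile(rb"(\d+) (\d+)[ ]*(?:\r\n|\r|\n)")


def find_startxref(data):
    """Return the integer that follows the last 'startxref' keyword."""
    idx = data.rfind(b"startxref")
    if idx < 0:
        raise ValueError("no startxref keyword found")
    match = re.match(rb"startxref\s+(\d+)", data[idx:])
    if match is None:
        raise ValueError("startxref is not followed by a byte offset")
    return int(match.group(1))


def parse_xref_table(data, offset):
    """Parse a classic xref table; returns {obj_num: (offset, gen, in_use)}."""
    if not data.startswith(b"xref", offset):
        raise ValueError("no xref keyword at the given offset")
    pos = offset + 4
    while data[pos:pos + 1] in (b"\r", b"\n", b" "):
        pos += 1  # skip the EOL after the keyword
    entries = {}
    while True:
        header = HEADER_RE.match(data, pos)
        if header is None:
            break  # no further subsection (typically the trailer keyword)
        first, count = int(header.group(1)), int(header.group(2))
        pos = header.end()
        for i in range(count):
            raw = data[pos:pos + 20]  # every entry is exactly 20 bytes
            entry = ENTRY_RE.match(raw)
            if entry is None or len(raw) != 20:
                raise ValueError(f"malformed xref entry at byte {pos}")
            entries[first + i] = (
                int(entry.group(1)),
                int(entry.group(2)),
                entry.group(3) == b"n",
            )
            pos += 20
    return entries


# Synthetic fragment for demonstration only; it is not a complete PDF.
SAMPLE = (
    b"xref\n0 3\n"
    b"0000000000 65535 f \n"
    b"0000000016 00000 n \n"
    b"0000000096 00000 n \n"
    b"trailer\n<< /Root 1 0 R /Size 3 >>\n"
    b"startxref\n0\n%%EOF\n"
)

if __name__ == "__main__":
    print(parse_xref_table(SAMPLE, find_startxref(SAMPLE)))
```

Reading fixed 20-byte slices, rather than splitting on line endings, relies directly on the rule that every entry is exactly 20 bytes including its EOL.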

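The literal-string escape sequences and the hex-string padding rule from the Objects table can be checked against the cheat sheet's own examples with a short decoder. This is a hedged sketch: `decode_literal_string` and `decode_hex_string` are hypothetical helper names, both expect the bytes between the outer delimiters, and the literal decoder does not normalise unescaped end-of-line sequences inside the string.

```python
# Single-character escapes from the table: \n \r \t \b \f \( \) \\
ESCAPES = {
    ord("n"): b"\n", ord("r"): b"\r", ord("t"): b"\t",
    ord("b"): b"\b", ord("f"): b"\f",
    ord("("): b"(", ord(")"): b")", ord("\\"): b"\\",
}


def decode_literal_string(body):
    """Decode the bytes between the outer ( and ) of a literal string."""
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        c = body[i]
        if c != 0x5C:  # ordinary byte, not a backslash
            out.append(c)
            i += 1
            continue
        i += 1  # step past the backslash
        if i >= n:
            break  # lone trailing backslash is dropped
        c = body[i]
        if c in ESCAPES:
            out += ESCAPES[c]
            i += 1
        elif 0x30 <= c <= 0x37:  # \ddd octal code, 1 to 3 digits
            j = i
            while j < n and j < i + 3 and 0x30 <= body[j] <= 0x37:
                j += 1
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        elif c in (0x0D, 0x0A):  # backslash + EOL continues the line
            i += 1
            if c == 0x0D and i < n and body[i] == 0x0A:
                i += 1
        else:
            out.append(c)  # unknown escape: the backslash is ignored
            i += 1
    return bytes(out)


def decode_hex_string(body):
    """Decode the bytes between < and > of a hex string."""
    digits = bytes(b for b in body if b not in b" \t\r\n\f\x00")
    if len(digits) % 2:
        digits += b"0"  # an odd final digit is padded with 0
    return bytes.fromhex(digits.decode("ascii"))


if __name__ == "__main__":
    print(decode_literal_string(rb"octal \234 code"))
    print(decode_hex_string(b"48656c6c6F"))
    print(decode_hex_string(b"41424"))
```

Both functions return raw bytes; interpreting them as PDFDocEncoding, UTF-16BE, or UTF-8 text is a separate step that depends on the byte order marker.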
Date: 2023.08.23 13:08:59 -04'00'
© 2023, PDF Association, https://www.pdfa.org (v1.20)
