Of note for accessibility is PDF/UA (Universal Accessibility), which became an ISO standard in July 2012 and was updated in 2014 (see PDF/UA (ISO 14289-1:2014)). PDF/UA is not meant to be a techniques (how-to) specification, but rather a set of guidelines for creating more accessible PDF. The specification describes the required and prohibited components and the conditions governing their inclusion in or exclusion from a PDF file in order for the file to be available to the widest possible audience, including those with disabilities. The mechanisms for including the components in the PDF stream are left to the discretion of the individual developer, PDF generator, or PDF viewing agent. PDF/UA also specifies the rules governing the behavior of a conforming reader.
The Portable Document Format (PDF) is a file format for representing documents in a manner independent of the application software, hardware, and operating system used to create them, as well as of the output device on which they are to be displayed or printed. PDF files specify the appearance of pages in a document in a reliable, device-independent manner. The PDF specification was introduced by Adobe Systems in 1993 as a publicly available standard. In July 2008, PDF 1.7 became an ISO standard (ISO 32000-1) [ISO32000].
Tagged PDF (PDF 1.4) is a stylized use of PDF that builds on PDF’s logical structure framework. It defines a set of standard structure types and attributes that allow page content (text, graphics, and images) to be extracted and reused for other purposes. It is intended for use by tools that perform the following types of operations:
Simple extraction of text and graphics for pasting into other applications.
Automatic reflow of text and associated graphics to fit a page of a different size than was assumed for the original layout.
Conversion to other common file formats (such as HTML, XML, and RTF) with document structure and basic styling information preserved.
The logical structure of a document is described by a hierarchy of objects called the structure hierarchy or structure tree. At the root of the hierarchy is a dictionary object called the structure tree root, located by means of the StructTreeRoot entry in the document catalog. See Section 14.7.2 (“Structure Hierarchy”) in PDF 1.7 (ISO 32000-1); Table 322 shows the entries in the structure tree root dictionary. The K entry specifies the immediate children of the structure tree root, which are structure elements.
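As a rough illustration of the lookup described above, the following sketch models PDF dictionaries as plain Python dicts. It is a toy model under stated assumptions, not a PDF parser: the sample catalog is hypothetical, and the function names `structure_tree_root` and `immediate_children` are invented for this example. The S entry (the structure type of an element) and the fact that K may hold either a single dictionary or an array come from ISO 32000-1 rather than from the text above.

```python
# Toy model only: PDF dictionaries are represented as plain Python dicts.
# A real file would need a PDF parser to obtain the document catalog.

def structure_tree_root(catalog):
    """Return the structure tree root, or None if StructTreeRoot is absent."""
    return catalog.get("StructTreeRoot")

def immediate_children(node):
    """Return the node's K entry as a list (K may hold one child or an array)."""
    kids = node.get("K")
    if kids is None:
        return []
    return kids if isinstance(kids, list) else [kids]

# Hypothetical catalog with a small structure tree.
catalog = {
    "Type": "Catalog",
    "StructTreeRoot": {
        "Type": "StructTreeRoot",
        "K": {
            "Type": "StructElem",
            "S": "Document",
            "K": [
                {"Type": "StructElem", "S": "H1"},
                {"Type": "StructElem", "S": "P"},
            ],
        },
    },
}

root = structure_tree_root(catalog)
top_level = [elem["S"] for elem in immediate_children(root)]
second_level = [
    elem["S"] for elem in immediate_children(immediate_children(root)[0])
]
print(top_level)     # ['Document']
print(second_level)  # ['H1', 'P']
```

Normalizing K to a list keeps the calling code the same whether the root has one child or several.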
A PDF document’s logical structure is stored separately from its visible content, with pointers from each to the other. This separation allows the ordering and nesting of logical elements to be entirely independent of the order and location of graphics objects on the document’s pages.
PDF logical structure shares basic features with standard document markup languages such as HTML, SGML, and XML. A document’s logical structure is expressed as a hierarchy of structure elements, each represented by a dictionary object. Like their counterparts in other markup languages, PDF structure elements can have content and attributes. In PDF, rendered document content takes over the role occupied by text in HTML, SGML, and XML.
PDF’s logical structure features (introduced in PDF 1.3) provide a mechanism for incorporating structural information about a document’s content into a PDF file. Such information might include, for example, the organization of the document into chapters, headings, paragraphs and sections or the identification of special elements such as figures, tables, and footnotes. The logical structure features are extensible, allowing applications that produce PDF files to choose what structural information to include and how to represent it, while enabling PDF consumers to navigate a file without knowing the producer’s structural conventions.
PDF includes several features in support of accessibility of documents to users with disabilities. The core of this support lies in the ability to determine the logical order of content in a PDF document, independently of the content’s appearance or layout, through logical structure and Tagged PDF. Applications can extract the content of a document for presentation to users with disabilities by traversing the structure hierarchy and presenting the contents of each node. For this reason, producers of PDF files must ensure that all information in a document is reachable by means of the structure hierarchy.
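The traversal mentioned above can be sketched as a depth-first walk. This is a minimal model, assuming a structure tree held in plain Python dicts and page content keyed by integer marked-content identifiers; the sample data and the names `kids` and `reading_order` are hypothetical, and a real assistive application would obtain both the tree and the page content from a PDF parser.

```python
# Hypothetical page content, keyed by marked-content ID and listed in the
# order it is painted on the page (the sidebar note happens to come first).
page_content = {0: "Sidebar note", 1: "Chapter 1", 2: "First paragraph."}

# Hypothetical structure tree: integer kids point at page content,
# dict kids are nested structure elements.
struct_tree_root = {
    "Type": "StructTreeRoot",
    "K": [
        {"S": "Document", "K": [
            {"S": "H1", "K": [1]},
            {"S": "P", "K": [2]},
            {"S": "Note", "K": [0]},
        ]},
    ],
}

def kids(node):
    """Return the node's K entry as a list."""
    k = node.get("K", [])
    return k if isinstance(k, list) else [k]

def reading_order(node, content):
    """Yield (structure type, text) pairs in logical order, depth first."""
    for kid in kids(node):
        if isinstance(kid, int):
            # Leaf: a pointer from the structure tree into the page content.
            yield node.get("S", ""), content[kid]
        else:
            yield from reading_order(kid, content)

result = list(reading_order(struct_tree_root, page_content))
for structure_type, text in result:
    print(structure_type, text)
```

In this sample the heading is presented first and the note last, even though the note is painted first on the page. Content that no structure element points to would never be yielded, which is why all information must be reachable through the structure hierarchy.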
PDF File Production and Accessibility
PDF files may be produced either directly by application programs or indirectly by conversion from other file formats or imaging models. In addition, tools exist for inspecting, checking, and repairing PDF files for accessibility. The following sections provide representative lists of applications and tools typically used for these functions.
These notes do not, and cannot, provide an exhaustive list, nor do they endorse particular applications and tools. Rather they provide a snapshot of tools in fairly wide use at the time the WCAG Working Group undertook to review and publish techniques for producing PDF documents. As with any software, application support for PDF accessibility will vary with different versions, with the formatting requirements of specific PDF documents, and with actual usage of the application. For example, the results depend on whether the tools are used properly to produce appropriate tags.
In general, newer tools will provide greater support than earlier ones. The tools’ vendors are the source of authoritative information about their support for PDF accessibility.
Generating PDF Files
Many applications can generate PDF files directly, and some can import
them as well. This direct approach is preferable, since it gives the
application access to the full capabilities of PDF, including the imaging
model and the interactive and document interchange features. Alternatively,
applications that do not generate PDF directly can produce PDF output
indirectly. There are two principal indirect methods:
The application describes its printable output by making calls
to an application programming interface (API) such as GDI in Microsoft®
Windows® or QuickDraw in the Apple Mac OS. A software component called
a printer driver intercepts these calls and interprets them to generate
output in PDF form.
The application produces printable output directly in some other
file format, such as PostScript, PCL, HPGL, or DVI, which is converted
to PDF by a separate translation program.
Although these indirect strategies are often the easiest way to obtain
PDF output from an existing application, the resulting PDF files may
not make the best use of the high-level PDF imaging model relied upon to expose the semantics of the document. This is
because the information embodied in the application’s API calls or
in the intermediate output file often describes the desired results
at too low a level. Any higher-level information maintained by the
original application has been lost and is not available to the printer
driver or translator.
For example, since the printer driver or translator targets correct visual output, information about the semantics of the document may not be sent at all or may be ignored when creating the PDF output. As a result, headings may not be tagged as such, or link text may not be associated with its link object. Check with the vendor of any PDF authoring tool in order to understand how to use the tool in a way that produces the best tagged output.
PDF Authoring Tools that Provide Accessibility Support
Adobe Acrobat’s PDFMaker – PDFMaker is a part of Adobe Acrobat
that adds macros to many business applications, such as Microsoft
Office, AutoCAD and Lotus Notes, to support the conversion of content
from the original format to tagged PDF.
Adobe FrameMaker – Desktop publishing application from Adobe Systems
that directly exports tagged PDF and provides support for alternative
text descriptions.
Adobe InDesign – Page layout and desktop publishing application
from Adobe Systems that directly exports tagged PDF and provides
support for alternative text descriptions.
Adobe LiveCycle Designer – Windows-based forms design application
from Adobe Systems that directly exports tagged PDF interactive forms
and provides support for alternative text descriptions; can be invoked
standalone or from within Acrobat Pro.
LibreOffice – Open-source word processing software from The Document Foundation that can export tagged PDF.
Lotus Symphony Documents – Word-processing software from IBM that can export tagged PDF.
Microsoft® Word – Word processing application from Microsoft Corporation
that can export tagged PDF using the save as XPS or PDF utility.
OpenOffice.org Writer – Open source word-processing software from
Sun Microsystems Inc. that can export tagged PDF using the Export
as PDF utility.
CommonLook Office for Microsoft Office from Netcentric Technologies is an add-in to Microsoft® Word and PowerPoint that makes it possible to create tagged PDF documents. CommonLook Office provides tools to allow content authors to run accessibility tests in the Microsoft Word and PowerPoint environments and to remediate accessibility issues prior to conversion to PDF.
Xenos Axess™ for Accessible Statements – PDF software that integrates with an organization’s existing enterprise content management (ECM) infrastructure to capture high-volume print streams and automatically transform them into tagged PDFs.
WordPerfect® Office – Word-processing software from Corel that can be used to create, mark up, and share tagged PDF documents.
Microsoft Office 10 – a suite of desktop office applications that creates tagged PDF.
Note: Care should be taken when choosing PDF creation tools from the many available, as some may not support creation of tagged PDF files.
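One quick sanity check on a tool's output is to see whether the document catalog carries the markers of a tagged file. The sketch below again models the catalog as a plain Python dict; the function name `looks_tagged` and the sample catalogs are hypothetical. The MarkInfo dictionary and its Marked entry are not described in these notes; they are taken from the Tagged PDF requirements in ISO 32000-1. Passing this check only shows that tags are present, not that they are correct or complete.

```python
def looks_tagged(catalog):
    """Return True if the catalog has the basic markers of a tagged PDF.

    Assumes the catalog has already been read into a plain dict by some
    PDF parser; this function does not open or parse files itself.
    """
    mark_info = catalog.get("MarkInfo", {})
    return mark_info.get("Marked") is True and "StructTreeRoot" in catalog

# Hypothetical catalogs, as two different export tools might produce them.
tagged_export = {
    "Type": "Catalog",
    "MarkInfo": {"Marked": True},
    "StructTreeRoot": {"Type": "StructTreeRoot", "K": []},
}
print_driver_export = {"Type": "Catalog"}

print(looks_tagged(tagged_export))        # True
print(looks_tagged(print_driver_export))  # False
```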
Accessibility Checking and Repair
Acrobat Pro. Adobe Acrobat Pro is an application that creates and edits PDF files.
It has a number of tools for evaluating and repairing the accessibility
of PDF files, including access to the structure root through the
tags panel, the ability to directly manipulate the reading order
through the order panel, a built-in accessibility checker, and the
Touch Up Reading Order tool which provides a graphical mechanism for
assessing and repairing the accessibility of a PDF document.
CommonLook PDF. CommonLook PDF is a plug-in for Adobe Acrobat Pro from Netcentric Technologies. CommonLook PDF helps identify, report and correct the most common accessibility problems, including the proper tagging of images, tables, forms and other non-textual objects.
API Inspection Tools
aDesigner – a disability simulator from the Eclipse Foundation that helps designers ensure that content is accessible and usable by visually impaired users.
inspect32 – part of the Microsoft Windows Software Development Kit (SDK) that allows developers and testers to examine the accessible properties of UI components.
PDDOMView – part
of Acrobat_Accessibility_9.1.zip which contains files that can be
used by Windows clients of the accessibility interfaces described
in the Accessibility API Reference document.
UISpy – part of the Microsoft Windows Software Development Kit (SDK) that allows developers and testers to view and interact with the user interface (UI) elements of an application.