In the first post of this series, we covered why integrating accessibility at the very first steps of a project can be beneficial to the process of PDF remediation. We also covered why it is always better to plan accessibility in advance, by building it into the source document rather than bolting it onto the PDF document.
The most efficient way to build in accessibility and to ensure properly tagged content in a PDF document is to make sure the proper semantics are brought into the document prior to its conversion to PDF. This means these efforts need to be implemented in the authoring tool, rather than in Acrobat Pro. It is always possible to wait and do everything in Acrobat Pro, of course, but doing so takes more time and makes the work more complex.
Unfortunately, due to poor authoring techniques and an even poorer conversion process, most PDF documents end up being untagged, meaning that they lack any kind of markup that makes content accessible to assistive technologies. Just as is the case with HTML, untagged PDF documents are frustrating and difficult to use for people who cannot see and need to use screen readers in order to discover their content.
Some of the most common, and perhaps easiest to understand, accessibility problems that untagged PDF documents impose on users are related to missing or poor:
- Document titles
- Alternate text for images
- Content structure
- Language indicators
- Navigation patterns
The following is meant as a reminder of why these elements are crucial to help users better experience PDF content.
1. Missing or poor document titles – The title is the first piece of information conveyed to screen reader users when they open a PDF document. It lets them confirm which document they are currently viewing, especially when more than one document is open at the same time. Confusion can often arise when that information is not properly disclosed or when the title of the document does not clearly reflect its content.
2. Missing or poor alternate text for images – Despite what the adage says, a picture isn’t always worth a thousand words. Graphical elements like pictures, icons, illustrations, charts, and the like convey information that only exists in a graphical format. When informative images are not described in text, this content is lost to screen reader users. On the other hand, decorative images need to be ignored by screen readers, in order to reduce noise. Unless the relevant information is efficiently transferred to screen readers, users will run into major problems trying to make sense of the content.
3. Missing or poor content structure – The general structure of a document, including such components as paragraphs, lists, and headings, is normally represented visually, thus helping users understand how a document is organized. Screen reader users, being unable to see the visual formatting of a document, cannot rely on this information to understand its organization. They need that visual organization conveyed semantically, using PDF tags. When such information is missing or poorly implemented, screen readers will not be able to reliably convey the structure to the users. As a result, it can easily become very difficult, if not impossible, for a user to consume content.
4. Missing or poor language indicators – Screen readers cannot reliably guess which language the content of a PDF is in. For screen reading software to be able to read the content of a document in the proper language, this information needs to be part of the document. When the main language of a document is undefined, screen readers will default to reading the content using their base settings, which may or may not use the appropriate speech synthesizer for the content. When the default setting is not appropriate (for instance, a French voice attempting to read a document written in English), the content will be for the most part unintelligible to screen reader users, no matter how much accessibility has been built into it.
5. Missing or poor navigation patterns – Sighted users normally rely on their sight to quickly navigate a document, isolate the section they are interested in, and jump directly to it. Screen reader users, who cannot see the pages, cannot jump directly to a precise position on a page without first listening to all the preceding content. To make it easier for users to navigate through a document, navigation patterns need to be integrated in the document, in the form of bookmarks and internal links. Lacking such patterns makes quick navigation in a document much more difficult for non-sighted users.
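For readers who like to think in code, the five barriers above can be sketched as a simple checklist. This is only an illustration under stated assumptions: `PdfSummary` and `find_barriers` are hypothetical names invented for this example, the values would in practice come from a PDF library or an accessibility checker, and the sketch only detects what is missing, not what is poorly done.

```python
from dataclasses import dataclass


@dataclass
class PdfSummary:
    """Hypothetical summary of a PDF document (not a real library type).

    In practice these values would be extracted by a PDF library or an
    accessibility checker; here they are filled in by hand.
    """
    title: str = ""
    language: str = ""             # for example "en-US" or "fr-CA"
    is_tagged: bool = False        # True when the document has a tag tree
    informative_images: int = 0    # images that carry content
    images_with_alt_text: int = 0  # informative images that have alternate text
    bookmark_count: int = 0
    page_count: int = 1


def find_barriers(pdf: PdfSummary) -> list[str]:
    """Return the barriers from the list above that this summary reveals."""
    barriers = []
    if not pdf.title.strip():
        barriers.append("document title")
    if pdf.images_with_alt_text < pdf.informative_images:
        barriers.append("alternate text for images")
    if not pdf.is_tagged:
        barriers.append("content structure")
    if not pdf.language.strip():
        barriers.append("language indicator")
    # Assumption for illustration: a multi-page document with no bookmarks
    # offers no quick way to navigate.
    if pdf.page_count > 1 and pdf.bookmark_count == 0:
        barriers.append("navigation patterns")
    return barriers


# A typical untagged export: nothing but pages and images.
untagged = PdfSummary(page_count=12, informative_images=3)
for barrier in find_barriers(untagged):
    print("Missing:", barrier)
```

A check like this can only flag absences. Whether a title actually reflects the content, or whether alternate text actually describes an image, still takes human review.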
These are just a few of the common accessibility barriers people with disabilities often run into when trying to consume untagged PDF content. The main causes are a lack of knowledge and awareness on the author’s part, along with unreliable tools and techniques. Through our next posts, we hope to cover some of these issues and help improve PDF navigation.
In the upcoming posts, we will go over best practices to help prevent these barriers in Microsoft Word and OpenOffice. We will then conclude by going over the basics of an efficient conversion process to PDF, so all the accessibility features included in the authoring source are automatically and reliably transferred to PDF.
The blog posts in this series include the following and will be published over the next few weeks:
- Requirements for an Accessible PDF: Part 1
- Requirements for an Accessible PDF: Part 2 – Common Accessibility Barriers (this post)
- Requirements for an Accessible PDF: Part 3 – Authoring Best Practices
- Requirements for an Accessible PDF: Part 4 – Proper Conversion Process
We hope you enjoyed this second part of our series on PDF Accessibility Requirements. Do you feel these barriers are the most important ones? Do you feel other barriers should have been included? Don’t hesitate to leave a comment so we can keep this conversation going.