PDF Redaction Fails and How to Avoid Them

Published July 10, 2022

You’ve probably heard about it – in the news, at work, from your lawyer…

Redaction is the process of removing sensitive or classified information from a document before it is published, so that the information cannot reach the wrong hands and expose you or your company.

Examples of sensitive information include:

  • Social Security numbers
  • Names
  • Addresses
  • Phone numbers
  • Legal information
  • Financial records

It sounds simple enough, but did you know that some tools that claim to remove sensitive data from a document only place black bars over the text? That’s not a complete redaction: the underlying text and data are still in the file, where they can be selected, searched, and copied. See the example below:

redaction1.png

This is a PDF redaction fail.

If you are working with PDF documents, it is not enough to use an editor to draw a black line or black box over a few sentences in a PDF document and then save the file. The content is still there, underneath, just waiting to be discovered. This means anyone with access to the document can copy the text you “redacted,” paste it into another document and read it there instead (not ideal!).
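To make this concrete, here is a minimal sketch of why a black box is not a redaction. It uses a hand-written, uncompressed fragment of a PDF content stream (the toy stream, the sample number, and the helper name extract_text_strings are assumptions made for illustration, not output from any particular tool). The text is drawn first, then a filled black rectangle is painted over it. Real PDFs usually compress their streams and use more varied text operators, so treat this as an illustration rather than a detector.

```python
import re

# Toy, uncompressed content stream (an assumption for illustration):
# a line of text is drawn, then a black rectangle is painted on top of it.
CONTENT_STREAM = b"""
BT
/F1 12 Tf
72 700 Td
(SSN: 123-45-6789) Tj
ET
0 0 0 rg
70 695 140 16 re
f
"""

# Matches a simple literal string followed by the Tj (show text) operator.
TEXT_SHOW = re.compile(rb"\(((?:[^()\\]|\\.)*)\)\s*Tj")


def extract_text_strings(stream: bytes) -> list[str]:
    """Return every literal string shown with Tj in an uncompressed stream."""
    return [m.group(1).decode("latin-1") for m in TEXT_SHOW.finditer(stream)]


# The rectangle only changes what is painted on the page;
# the text operator is still sitting in the file.
print(extract_text_strings(CONTENT_STREAM))
```

The rectangle hides the text on screen, but anything that reads the stream (copy and paste, search, a text extractor) still sees it.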

Read 5 Ways to Secure a PDF to learn other ways to protect your PDFs.

Failing to redact information correctly also carries major risks. Strictly enforced rules apply to documents that contain classified or sensitive personal information and identifiers. Strict privacy and disclosure laws apply in healthcare (Health Insurance Portability and Accountability Act), government (Freedom of Information Act and the U.S. Privacy Act of 1974), and other areas such as the legal and financial sectors.

You risk exposure to potential litigation and fines if sensitive information is not fully removed before being released.

In other words, incomplete redaction = non-compliance = major legal fees. It can be a costly mistake and another example of a PDF redaction fail.

Proper redaction is key to compliance

When a PDF document is redacted properly, the sensitive information that you mark is completely removed from the page. A black box appears in the place where the erased content used to be. Metadata from the document can also be permanently removed. This related process, called sanitization, removes objects that were added to the PDF document but are not part of the document content itself. For example, you could use sanitization to remove the name of the author and the date that the document was last updated. When the process is finished, it is impossible to recover the redacted characters and metadata.
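As a rough sketch of the difference, the snippet below deletes the text itself from a toy, uncompressed content stream and drops the author and last-modified entries from a plain dictionary that stands in for the document's metadata. The function names redact_text and sanitize_info, the sample stream, and the sample values are all hypothetical, and the sketch skips drawing the black box. A production tool also has to handle compressed streams, fonts, images, annotations, and other places where data can hide.

```python
import re

# Matches a simple literal string shown with Tj, plus trailing whitespace.
TEXT_SHOW = re.compile(rb"\((?:[^()\\]|\\.)*\)\s*Tj\s*")


def redact_text(stream: bytes, sensitive: "re.Pattern[bytes]") -> bytes:
    """Delete every text-show operation whose string matches `sensitive`."""
    def drop_if_sensitive(match: "re.Match[bytes]") -> bytes:
        shown = match.group(0)
        return b"" if sensitive.search(shown) else shown

    return TEXT_SHOW.sub(drop_if_sensitive, stream)


def sanitize_info(info: dict) -> dict:
    """Drop the author and last-modified date from a metadata dictionary."""
    identifying = {"Author", "ModDate"}
    return {key: value for key, value in info.items() if key not in identifying}


# Toy inputs (assumptions for illustration only).
stream = (
    b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET\n"
    b"BT /F1 12 Tf 72 680 Td (Public notice) Tj ET\n"
)
ssn_pattern = re.compile(rb"\d{3}-\d{2}-\d{4}")

cleaned = redact_text(stream, ssn_pattern)
info = sanitize_info({"Title": "Report", "Author": "J. Doe", "ModDate": "D:20220710"})
```

After this runs, the sensitive string no longer exists anywhere in cleaned, while the unrelated line is untouched. That is the property to look for in a real redaction tool.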

Warning: beware of corrupted documents

Some tools can also distort the text layout in the PDF document, so that content that was never meant to be redacted is removed or modified. Make sure the tool you use to redact your PDFs is reliable and redacts completely, without touching anything else.
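One simple way to check for this kind of damage is to compare the text extracted before and after redaction and flag anything that disappeared without being on your redaction list. The sketch below assumes you already have both lists of strings from a text extractor of your choice. The function name unexpected_removals and the sample data are hypothetical.

```python
from collections import Counter


def unexpected_removals(before: list, after: list, intended: set) -> list:
    """Return strings that vanished but were not meant to be redacted."""
    # Counter subtraction keeps only strings that occur fewer times afterwards.
    missing = Counter(before) - Counter(after)
    return sorted(text for text in missing.elements() if text not in intended)


# Sample data (an assumption for illustration only).
before = ["Name: Jane Doe", "SSN: 123-45-6789", "Total due: $40"]
after = ["Name: Jane Doe"]
intended = {"SSN: 123-45-6789"}

# Any entry in this list is content the tool removed by mistake.
problems = unexpected_removals(before, after, intended)
```

An empty result does not prove the file is clean, but a non-empty one is a clear sign that the tool changed more than it should have.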

Check out what the PDF Association has to say about redaction.

Download a free trial of Adobe PDF Library to see our PDF redaction technology in action.