Cracking the Code: Converting PDF to Office Files

Published June 29, 2023

One of the latest features we added to the Adobe PDF Library is the ConverttoOffice function, which lets users convert PDF files to Microsoft Office formats (Word, Excel, and PowerPoint).

Download a free trial of Adobe PDF Library 

Common Use Cases for Converting PDF to Office

Converting PDF files to Office formats is useful in several scenarios. Common use cases include:

  • Editing and Modifying Content: PDF files are generally not designed for easy editing, while Office formats like Word, Excel, and PowerPoint provide robust editing capabilities. Converting a PDF to an Office format lets you change the content, such as editing text, modifying tables, or updating presentations.

  • Extracting Data: PDFs often contain structured data, such as tables or forms. Converting PDFs to Excel spreadsheets lets you extract tabular data and perform calculations and analysis, or import the data into other systems for further processing.

  • Preserving Formatting: PDFs are known for retaining formatting across devices and platforms. Converting them to Office formats is beneficial when you need an editable document that preserves the layout, styling, and formatting of the original, especially for complex documents like reports, brochures, or manuals.

  • Repurposing Content: Converting PDFs to Office formats lets you reuse content in new contexts. For example, you can extract text and images from a PDF and use them in a new Word document or PowerPoint presentation instead of recreating the content from scratch.

Watch "Converting PDFs to Office Files with Adobe PDF Library" to learn more about PDF to Office conversion.

 

The ConverttoOffice Code Sample

The ConverttoOffice code sample shows how the Adobe PDF Library converts PDF files to Office formats. The following sample is written in C++, but we have additional samples available in .NET, .NET Framework, and Java.

 

First, it includes the headers it needs, including the helper that initializes the library:

#include <iostream>
#include <sstream>

#include "InitializeLibrary.h"
#include "APDFLDoc.h"
#include "DLExtrasCalls.h"

 

Then, it defines the default input and output file names:

#define DIR_LOC "../../../../Resources/Sample_Input/"
#define DEF_INPUT_WORD "Word.pdf"
#define DEF_INPUT_EXCEL "Excel.pdf"
#define DEF_INPUT_POWERPOINT "PowerPoint.pdf"
#define DEF_OUTPUT_WORD "ConvertToWord-out.docx"
#define DEF_OUTPUT_EXCEL "ConvertToExcel-out.xlsx"
#define DEF_OUTPUT_POWERPOINT "ConvertToPowerPoint-out.pptx"
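
One way to tie these defaults to a target format is a small lookup helper. The sketch below is only an illustration: the OfficeType enum, the ConversionJob struct, and the defaultJob function are hypothetical names that are not part of the Adobe PDF Library, and it assumes the default input PDFs live under DIR_LOC.

```cpp
#include <string>

// Same defaults as the sample's #define block.
#define DIR_LOC "../../../../Resources/Sample_Input/"
#define DEF_INPUT_WORD "Word.pdf"
#define DEF_INPUT_EXCEL "Excel.pdf"
#define DEF_INPUT_POWERPOINT "PowerPoint.pdf"
#define DEF_OUTPUT_WORD "ConvertToWord-out.docx"
#define DEF_OUTPUT_EXCEL "ConvertToExcel-out.xlsx"
#define DEF_OUTPUT_POWERPOINT "ConvertToPowerPoint-out.pptx"

// Hypothetical helper types for illustration only (not library API).
enum class OfficeType { Word, Excel, PowerPoint };

struct ConversionJob {
    std::string inputFile;   // default input PDF, assumed to live under DIR_LOC
    std::string outputFile;  // default Office output file name
};

// Returns the default input and output file names for a target format.
// Adjacent string literals (DIR_LOC DEF_INPUT_WORD) are joined at compile time.
inline ConversionJob defaultJob(OfficeType type) {
    switch (type) {
    case OfficeType::Word:
        return {DIR_LOC DEF_INPUT_WORD, DEF_OUTPUT_WORD};
    case OfficeType::Excel:
        return {DIR_LOC DEF_INPUT_EXCEL, DEF_OUTPUT_EXCEL};
    case OfficeType::PowerPoint:
        return {DIR_LOC DEF_INPUT_POWERPOINT, DEF_OUTPUT_POWERPOINT};
    }
    return {};  // not reached for valid enum values
}
```

For example, defaultJob(OfficeType::Excel) yields the Excel.pdf input path and the ConvertToExcel-out.xlsx output name, which could then be passed on to the conversion step.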

 

Then, it converts the PDF to the requested Office format:

    // Build library path objects for the input PDF and the output Office file.
    ASPathName inputPathName = APDFLDoc::makePath(inputFileName);
    ASPathName outputPathName = APDFLDoc::makePath(outputFileName);

    ASFileSys fileSys = ASGetDefaultFileSys();

    ASBool result = false;

    // Call the conversion function that matches the requested Office format.
    if (officeType == Word)
    {
        result = ConvertPDFToWord(inputPathName, outputPathName, fileSys);
    }
    else if (officeType == Excel)
    {
        result = ConvertPDFToExcel(inputPathName, outputPathName, fileSys);
    }
    else if (officeType == PowerPoint)
    {
        result = ConvertPDFToPowerPoint(inputPathName, outputPathName, fileSys);
    }

    // Report the outcome.
    std::cout << "Conversion of file " << inputFileName << " has ";

    if (result)
    {
        std::cout << "been successfully Converted to " << outputFileName << std::endl;
    }
    else
    {
        // Failure branch added to complete the excerpt; the message text is an assumption.
        std::cout << "failed." << std::endl;
    }
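
The excerpt does not show how officeType gets its value. As a rough sketch of one possible approach, an application could infer the target format from the extension of the output file name. The DetectedFormat enum and the formatFromFileName function below are hypothetical and are not part of the Adobe PDF Library or of the sample itself.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical result type for illustration only (not library API).
enum class DetectedFormat { Word, Excel, PowerPoint, Unknown };

// Maps an output file name to a target format by its extension.
// The comparison is case-insensitive; anything else maps to Unknown.
inline DetectedFormat formatFromFileName(const std::string& fileName) {
    const std::string::size_type dot = fileName.find_last_of('.');
    if (dot == std::string::npos) {
        return DetectedFormat::Unknown;  // no extension at all
    }

    std::string ext = fileName.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });

    if (ext == "docx") return DetectedFormat::Word;
    if (ext == "xlsx") return DetectedFormat::Excel;
    if (ext == "pptx") return DetectedFormat::PowerPoint;
    return DetectedFormat::Unknown;
}
```

A caller could check for DetectedFormat::Unknown before converting and report an unsupported output type instead of attempting the conversion.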

  

View the entire PDFtoOffice C++ sample on GitHub

Start your free trial today 

Experience the reliability of Adobe PDF technology and integrate powerful PDF functionalities into your applications.