Comparing de.redsix.pdf and Java: Font Embedding Issues with Aspose

This article addresses a known issue that arises when DOCX files are converted to PDF with Aspose Words and the resulting PDFs are then read with Aspose PDF. The problem occurs frequently and may be related to how fonts are embedded during the conversion.

Aspose Words to PDF Conversion and Font Embedding

The following snippet shows the DOCX to PDF conversion using Aspose Words:

// Refresh fields, list labels, layout, and word count before saving
document.updateFields();
document.updateListLabels();
document.updateTableLayout();
document.updatePageLayout();
document.updateWordCount();

// Fallback font used when a font referenced by the document cannot be found
FontSettings.setDefaultFontName("Droid Sans Fallback");

com.aspose.words.PdfSaveOptions opts = new com.aspose.words.PdfSaveOptions();
opts.setWarningCallback(new WordsWarningCallback("",""));
// Full font embedding is left disabled
//opts.setEmbedFullFonts(true);
document.save(tempFile.getAbsolutePath(), opts);

Notably, the line //opts.setEmbedFullFonts(true); is commented out, so full font embedding is not enabled. With this option left at its default of false, Aspose Words embeds only a subset of each font (the glyphs the document actually uses) rather than the complete font file. This subsetting might contribute to the subsequent issues when the PDF is read with Aspose PDF.
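One rough way to check whether a generated PDF contains any embedded font programs is to scan its raw bytes for the /FontFile, /FontFile2, and /FontFile3 keys that PDF font descriptors use to reference embedded font data. The sketch below is a heuristic that uses only the Java standard library; the class and method names are hypothetical, and it will undercount if the PDF stores its objects in compressed object streams. It also cannot tell a subset from a fully embedded font.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any Aspose API.
final class EmbeddedFontHeuristic {

    private static final String KEY = "/FontFile";

    // Counts occurrences of "/FontFile" in the raw bytes.
    // The prefix also matches /FontFile2 and /FontFile3.
    static int countFontFileKeys(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so no data is lost.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        int count = 0;
        int from = raw.indexOf(KEY);
        while (from != -1) {
            count++;
            from = raw.indexOf(KEY, from + KEY.length());
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: EmbeddedFontHeuristic <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("FontFile references found: " + countFontFileKeys(bytes));
    }
}
```

A count of zero on an uncompressed PDF suggests that no font data was embedded; any other result only shows that some font data is present.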

Aspose PDF Text Extraction and the NullPointerException

The following code attempts to extract text from the generated PDF using Aspose PDF:

// Set text extraction options
com.aspose.pdf.TextExtractionOptions textExtOptions = new com.aspose.pdf.TextExtractionOptions(com.aspose.pdf.TextFormattingMode.Pure);
com.aspose.pdf.devices.TextDevice txtDevice = new com.aspose.pdf.devices.TextDevice(textExtOptions);
txtDevice.setEncoding(Charset.forName("UTF-8"));

// Extract the text of the selected page and write it to the output stream
txtDevice.process(document.getPages().get_Item(convertPage), bos);

This call throws a java.lang.NullPointerException inside the Aspose PDF library during text extraction. The stack trace points to Aspose PDF's internal handling of font information that may be malformed or incomplete.
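To narrow down which pages trigger the exception, the extraction call can be run page by page, with failures recorded rather than propagated. The following is a minimal sketch using only the Java standard library; PageExtractor and findFailingPages are hypothetical names, and the lambda passed in is assumed to wrap the txtDevice.process(...) call shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Aspose API.
final class PageByPageExtraction {

    // Stands in for whatever performs the real extraction of one page.
    @FunctionalInterface
    interface PageExtractor {
        String extract(int pageNumber) throws Exception;
    }

    // Tries pages 1..pageCount and returns the numbers of the pages that failed.
    static List<Integer> findFailingPages(int pageCount, PageExtractor extractor) {
        List<Integer> failing = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            try {
                extractor.extract(page);
            } catch (Exception e) {
                // NullPointerException is a RuntimeException, so it is caught here too.
                failing.add(page);
            }
        }
        return failing;
    }
}
```

If only some pages fail, comparing the fonts used on those pages with the fonts on the pages that succeed may help confirm or rule out the font-embedding theory.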

Potential Solutions and Considerations

The observed NullPointerException suggests a problem with how font information is handled between Aspose Words and Aspose PDF. Possible solutions include:

  • Enabling Full Font Embedding: Uncommenting the line opts.setEmbedFullFonts(true); in the Aspose Words conversion code might resolve the issue by ensuring that all necessary font data is included in the PDF.

  • Verifying Font Availability: Ensure that the “Droid Sans Fallback” font, set as the default, is correctly installed and accessible on the system performing the conversion.

  • Updating Aspose Libraries: The latest versions of both Aspose Words and Aspose PDF may include bug fixes related to font handling that resolve the problem. The provided version information (e.g., PDF version 9.5.0.0) suggests that newer releases are available.

  • Font Substitution: Investigate whether the issue is specific to “Droid Sans Fallback” and try using alternative fonts known to be compatible with Aspose PDF.
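As a first check for the font-availability point above, the JVM can be asked which font families it sees. This is a sketch using java.awt only; the class and method names are hypothetical, and Aspose Words may locate fonts through its own font settings rather than through the JVM, so a positive result here is an indication, not proof.

```java
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of any Aspose API.
final class FontAvailability {

    // Returns true if the JVM reports a font family with the given name.
    static boolean isFamilyAvailable(String familyName) {
        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames(Locale.ROOT);
        return containsIgnoreCase(families, familyName);
    }

    // Case-insensitive lookup, kept separate so it can be checked without AWT.
    static boolean containsIgnoreCase(String[] names, String wanted) {
        return Arrays.stream(names).anyMatch(name -> name.equalsIgnoreCase(wanted));
    }

    public static void main(String[] args) {
        String family = args.length > 0 ? args[0] : "Droid Sans Fallback";
        System.out.println(family + " visible to the JVM: " + isFamilyAvailable(family));
    }
}
```

Running the same check with an alternative family name is a quick way to pick a candidate for the font substitution test.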

Conclusion

The java.lang.NullPointerException encountered when reading Aspose Words-generated PDFs with Aspose PDF is likely linked to font embedding. Enabling full font embedding, verifying font availability, updating the Aspose libraries, and trying alternative fonts are reasonable troubleshooting steps that may allow text extraction from the converted PDFs to succeed. Testing with specific font configurations is recommended to pinpoint the root cause and choose the most effective fix.
