Apache PDFBox and PDF/A-3

Is it possible to process PDF/A-3 documents with Apache PDFBox? (In particular, to change field values?)

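The PDFBox 1.8 Cookbook says that PDF/A-1 documents can be created with pdfaid.setPart(1);

Can I apply pdfaid.setPart(3) to get a PDF/A-3 document?
If not: is it possible to read in a PDF/A-3 document, change some field values, and save it, so that no creation of or conversion to PDF/A-3 is needed but the document is still PDF/A-3?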

2016-08-16 hagem

Your question has already been answered correctly (and nicely) on the PDFBox users mailing list. –

Great, thanks! I have quoted the answer below. – hagem

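PDFBox supports it, but be aware that PDFBox is a low-level library: there is no "Save as PDF/A-3", so you have to ensure conformance yourself. You may want to have a look at http://www.mustangproject.org, which uses PDFBox to support ZUGFeRD (electronic invoices), which also requires PDF/A-3.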
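Since conformance has to be checked by hand, one cheap first check after changing field values and saving is that the XMP packet still declares PDF/A-3. The following is a minimal sketch using only the JDK; the class and method names (PdfaIdReader, readPdfaLevel) are made up for illustration. It assumes you hand it the XMP packet as a stream; with PDFBox 2.0.x that would presumably come from doc.getDocumentCatalog().getMetadata().exportXMPMetadata(). It reads only the declared pdfaid:part and pdfaid:conformance values and proves nothing about actual conformance.

```java
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of PDFBox.
class PdfaIdReader
{
    static final String PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the declared level, e.g. "3B", or null if no pdfaid:part is present. */
    static String readPdfaLevel(InputStream xmp)
    {
        try
        {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document dom = factory.newDocumentBuilder().parse(xmp);

            String part = property(dom, "part");
            String conformance = property(dom, "conformance");
            if (part == null)
            {
                return null;
            }
            return part + (conformance == null ? "" : conformance);
        }
        catch (Exception e)
        {
            throw new IllegalArgumentException("Could not parse XMP packet", e);
        }
    }

    private static String property(Document dom, String name)
    {
        // XMP may serialize a simple property as a child element ...
        NodeList elements = dom.getElementsByTagNameNS(PDFAID_NS, name);
        if (elements.getLength() > 0)
        {
            return elements.item(0).getTextContent().trim();
        }
        // ... or as an attribute on rdf:Description.
        NodeList descriptions = dom.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++)
        {
            Element description = (Element) descriptions.item(i);
            if (description.hasAttributeNS(PDFAID_NS, name))
            {
                return description.getAttributeNS(PDFAID_NS, name).trim();
            }
        }
        return null;
    }
}
```

If this returns anything other than the level you started with, the save step has touched the metadata. If it returns the expected level, the file still needs to go through a real PDF/A validator, because fonts, colour spaces, and attachments are not examined here.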


2016-08-16 12:13:31 hagem

How to create a valid PDF/A: this example renders a PDF page to an image and then creates a valid PDF/A-x-y from that image. PDFBox 2.0.x

import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Calendar;

import javax.xml.transform.TransformerException;

import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDMetadata;
import org.apache.pdfbox.pdmodel.documentinterchange.logicalstructure.PDMarkInfo;
import org.apache.pdfbox.pdmodel.documentinterchange.logicalstructure.PDStructureTreeRoot;
import org.apache.pdfbox.pdmodel.graphics.color.PDOutputIntent;
import org.apache.pdfbox.pdmodel.graphics.image.LosslessFactory;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.pdmodel.interactive.viewerpreferences.PDViewerPreferences;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.xmpbox.XMPMetadata;
import org.apache.xmpbox.schema.DublinCoreSchema;
import org.apache.xmpbox.schema.PDFAIdentificationSchema;
import org.apache.xmpbox.type.BadFieldValueException;
import org.apache.xmpbox.xml.XmpSerializer;

public class CreatePDFA
{
    public static void main(String[] args) throws IOException, TransformerException
    {
        String resultFile = "result/PDFA-x.PDF";

        // try-with-resources closes both documents even if a later step fails
        try (PDDocument docSource = PDDocument.load(new File("src/PDFOrigin.PDF"));
             PDDocument doc = new PDDocument())
        {
            PDPage page = new PDPage();
            doc.addPage(page);
            doc.setVersion(1.7f);

            /*
            // A PDF/A file needs to have the font embedded if the font is used for text rendering
            // in rendering modes other than text rendering mode 3.
            //
            // This requirement includes the PDF standard fonts, so don't use the static PDType1Font
            // constants such as PDType1Font.HELVETICA.
            //
            // As there are many different font licenses it is up to the developer to check if the
            // license terms for the font loaded allow embedding in the PDF.

            String fontfile = "/org/apache/pdfbox/resources/ttf/ArialMT.ttf";
            PDFont font = PDType0Font.load(doc, new File(fontfile));
            if (!font.isEmbedded())
            {
                throw new IllegalStateException("PDF/A compliance requires that all fonts used for"
                        + " text rendering in rendering modes other than rendering mode 3 are embedded.");
            }
            */

            // Render the first page of the source PDF to a bitmap at 200 DPI
            PDFRenderer pdfRenderer = new PDFRenderer(docSource);
            int numPage = 0;
            BufferedImage imagePage = pdfRenderer.renderImageWithDPI(numPage, 200);
            PDImageXObject pdfXOImage = LosslessFactory.createFromImage(doc, imagePage);

            // Draw the bitmap across the whole media box of the new page;
            // failures are no longer swallowed, they propagate to the caller
            try (PDPageContentStream contents = new PDPageContentStream(doc, page))
            {
                contents.drawImage(pdfXOImage, 0, 0,
                        page.getMediaBox().getWidth(), page.getMediaBox().getHeight());
            }

            // add XMP metadata
            XMPMetadata xmp = XMPMetadata.createXMPMetadata();
            PDDocumentCatalog catalogue = doc.getDocumentCatalog();
            Calendar cal = Calendar.getInstance();

            try
            {
                DublinCoreSchema dc = xmp.createAndAddDublinCoreSchema();
                // dc.setTitle(file);
                dc.addCreator("My APPLICATION Creator");
                dc.addDate(cal);

                // (the "PFA" spelling is the actual method name in xmpbox 2.0.x)
                PDFAIdentificationSchema id = xmp.createAndAddPFAIdentificationSchema();
                id.setPart(3); // value => 2|3
                id.setConformance("A"); // value => A|B|U

                XmpSerializer serializer = new XmpSerializer();
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                serializer.serialize(xmp, baos, true);

                PDMetadata metadata = new PDMetadata(doc);
                metadata.importXMPMetadata(baos.toByteArray());
                catalogue.setMetadata(metadata);
            }
            catch (BadFieldValueException e)
            {
                throw new IllegalArgumentException(e);
            }

            // sRGB output intent
            try (InputStream colorProfile = CreatePDFA.class.getResourceAsStream(
                    "../../../pdmodel/sRGB.icc"))
            {
                if (colorProfile == null)
                {
                    throw new IOException("sRGB.icc not found on the classpath");
                }
                PDOutputIntent intent = new PDOutputIntent(doc, colorProfile);
                intent.setInfo("sRGB IEC61966-2.1");
                intent.setOutputCondition("sRGB IEC61966-2.1");
                intent.setOutputConditionIdentifier("sRGB IEC61966-2.1");
                intent.setRegistryName("http://www.color.org");
                catalogue.addOutputIntent(intent);
            }
            catalogue.setLanguage("en-US");

            // Viewer preferences need their own dictionary; wrapping the page dictionary
            // would write DisplayDocTitle into the page object instead
            PDViewerPreferences pdViewer = new PDViewerPreferences(new COSDictionary());
            pdViewer.setDisplayDocTitle(true);
            catalogue.setViewerPreferences(pdViewer);

            // Mark the document as tagged (the structure tree is left empty here)
            PDMarkInfo mark = new PDMarkInfo();
            PDStructureTreeRoot treeRoot = new PDStructureTreeRoot();
            catalogue.setMarkInfo(mark);
            catalogue.setStructureTreeRoot(treeRoot);
            catalogue.getMarkInfo().setMarked(true);

            PDDocumentInformation info = doc.getDocumentInformation();
            info.setCreationDate(cal);
            info.setModificationDate(cal);
            info.setAuthor("My APPLICATION Author");
            info.setProducer("My APPLICATION Producer");
            info.setCreator("My APPLICATION Creator");
            info.setTitle("PDF title");
            info.setSubject("PDF to PDF/A{2,3}-{A,U,B}");

            doc.save(resultFile);
        }
    }
}


2017-10-18 09:00:46

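Your answer is probably OK (I'd have to run the code through a validator to be sure), but decompressing the JPEG file is inefficient. Use JPEGFactory.createFromStream() instead, which uses the JPEG file as it is. It would be nice if you could change your code, so that all the copy & paste people don't pick up that part. Also, if you want to decode a JPEG into a BufferedImage, a single line is enough: ImageIO.read(). Your many lines are either old or very new :-) –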

The purpose here is not to decompress a JPEG. Otherwise, with PDFBox you can create a BufferedImage directly from a PDF page using PDFRenderer.renderImageWithDPI(...). Besides, the result has been validated by pdf-online. –

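I know that the purpose is to create a PDF. My remark is about the image in the PDF. Using LosslessFactory on a JPEG file is slower, because it decompresses the JPEG and recompresses it with Flate compression. It also usually produces a larger PDF than using JPEGFactory with stream input. –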
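The size argument in the last comment can be tried out without PDFBox. The following JDK-only sketch, with made-up names (JpegVersusFlate, sampleJpeg, decodeAndDeflate), encodes a synthetic noisy image as JPEG, then decodes it and Flate-compresses the raw RGB samples, which roughly mimics what happens when an already-compressed JPEG is routed through a lossless encoder. Random noise is only a stand-in for photographic content, and this is not how LosslessFactory is implemented internally, so treat the numbers as an illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.DeflaterOutputStream;
import javax.imageio.ImageIO;

// Hypothetical demo class; it does not use PDFBox at all.
class JpegVersusFlate
{
    /** Builds a noisy RGB image (fixed seed) and returns it encoded as JPEG. */
    static byte[] sampleJpeg(int width, int height)
    {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Random random = new Random(42);
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                image.setRGB(x, y, random.nextInt(0x1000000));
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try
        {
            if (!ImageIO.write(image, "jpg", out))
            {
                throw new IllegalStateException("No JPEG writer available");
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decodes the JPEG and Flate-compresses the raw RGB samples. */
    static byte[] decodeAndDeflate(byte[] jpeg)
    {
        try
        {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(jpeg));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (DeflaterOutputStream deflate = new DeflaterOutputStream(out))
            {
                for (int y = 0; y < image.getHeight(); y++)
                {
                    for (int x = 0; x < image.getWidth(); x++)
                    {
                        int rgb = image.getRGB(x, y);
                        deflate.write((rgb >> 16) & 0xFF); // red
                        deflate.write((rgb >> 8) & 0xFF);  // green
                        deflate.write(rgb & 0xFF);         // blue
                    }
                }
            }
            return out.toByteArray();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        byte[] jpeg = sampleJpeg(256, 256);
        byte[] flate = decodeAndDeflate(jpeg);
        System.out.println("JPEG bytes:  " + jpeg.length);
        System.out.println("Flate bytes: " + flate.length);
    }
}
```

Comparing the two printed sizes shows the effect the comment describes. It only matters when the input really is a JPEG file; in the answer above the image comes from PDFRenderer, so there is no JPEG stream to pass through unchanged.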
