Merging PDFs and converting PDF to PNG image in .NET Core 2.0
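I am looking for a third-party .dll that can merge several PDFs into one and also convert the merged PDF into a single .PNG image file.
I know that Ghostscript and PDFsharp support the .NET Framework, but not .NET Core 2.0.
Can anyone suggest a third-party dll that can merge all the PDFs and also convert the merged PDF to a PNG image in .NET Core 2.0?
Any help or suggestions for meeting this requirement?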
You can use iTextSharp.LGPLv2.Core to merge PDF files; it works well. Please check this tutorial. It also supports .NETStandard.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    namespace HelveticSolutions.PdfLibrary
    {
        public static class PdfMerger
        {
            /// <summary>
            /// Merge pdf files.
            /// </summary>
            /// <param name="sourceFiles">PDF files being merged.</param>
            /// <returns>The merged PDF as a byte array.</returns>
            public static byte[] MergeFiles(List<byte[]> sourceFiles)
            {
                Document document = new Document();
                using (MemoryStream ms = new MemoryStream())
                {
                    PdfCopy copy = new PdfCopy(document, ms);
                    document.Open();
                    int documentPageCounter = 0;

                    // Iterate through all pdf documents
                    for (int fileCounter = 0; fileCounter < sourceFiles.Count; fileCounter++)
                    {
                        // Create pdf reader
                        PdfReader reader = new PdfReader(sourceFiles[fileCounter]);
                        int numberOfPages = reader.NumberOfPages;

                        // Iterate through all pages
                        for (int currentPageIndex = 1; currentPageIndex <= numberOfPages; currentPageIndex++)
                        {
                            documentPageCounter++;
                            PdfImportedPage importedPage = copy.GetImportedPage(reader, currentPageIndex);
                            PdfCopy.PageStamp pageStamp = copy.CreatePageStamp(importedPage);

                            // Write header
                            ColumnText.ShowTextAligned(pageStamp.GetOverContent(), Element.ALIGN_CENTER,
                                new Phrase("PDF Merger by Helvetic Solutions"),
                                importedPage.Width / 2, importedPage.Height - 30,
                                importedPage.Width < importedPage.Height ? 0 : 1);

                            // Write footer
                            ColumnText.ShowTextAligned(pageStamp.GetOverContent(), Element.ALIGN_CENTER,
                                new Phrase(String.Format("Page {0}", documentPageCounter)),
                                importedPage.Width / 2, 30,
                                importedPage.Width < importedPage.Height ? 0 : 1);

                            pageStamp.AlterContents();
                            copy.AddPage(importedPage);
                        }

                        copy.FreeReader(reader);
                        reader.Close();
                    }

                    document.Close();

                    // ToArray() returns only the bytes written;
                    // GetBuffer() would also include unused buffer capacity.
                    return ms.ToArray();
                }
            }
        }
    }
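One detail worth checking when returning the merged bytes from a MemoryStream: GetBuffer() exposes the stream's whole internal buffer, which can be larger than the data written, while ToArray() copies only the written bytes. The following is a minimal sketch that uses only System.IO; the class name MemoryStreamBytes and the method Compare are made up for illustration and are not part of iTextSharp.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class MemoryStreamBytes
{
    // Writes `count` zero bytes into a fresh MemoryStream and returns the
    // lengths of the arrays produced by ToArray() and GetBuffer().
    public static (int ToArrayLength, int GetBufferLength) Compare(int count)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(new byte[count], 0, count);

            // ToArray() copies exactly the bytes written so far.
            int exact = ms.ToArray().Length;

            // GetBuffer() returns the internal buffer, whose length is the
            // stream's capacity and may exceed the number of bytes written.
            int raw = ms.GetBuffer().Length;

            return (exact, raw);
        }
    }
}
```

Calling MemoryStreamBytes.Compare(100) should report exactly 100 for ToArray() and a value of at least 100 for GetBuffer(). Any extra bytes end up as trailing zeros after the end of the PDF data, which some readers tolerate and others may not.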
I have been struggling with this recently and could not find a library that suited my needs, so I built one around …
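I am only answering the part about rendering a PDF and converting it to an image in .NET Core 3.1, which took me a few days to figure out. I ended up using phuldr's Docnet.Core to get the image bytes and Magick.NET-Q16-AnyCpu to save them to an image file.
There is some extra work to rearrange the image bytes into RGBA order and to turn transparent pixels into a specific color (white in my case). Here is my code, in case it helps: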
    // Namespaces used by this snippet:
    // using System.IO;
    // using Docnet.Core;
    // using Docnet.Core.Models;
    // using ImageMagick;

    public MemoryStream PdfToImage(byte[] pdfBytes /* the PDF file bytes */)
    {
        MemoryStream memoryStream = new MemoryStream();
        MagickImage imgBackdrop;
        MagickColor backdropColor = MagickColors.White; // replace transparent pixels with this color
        int pdfPageNum = 0; // first page is 0

        using (IDocLib pdfLibrary = DocLib.Instance)
        {
            using (var docReader = pdfLibrary.GetDocReader(pdfBytes, new PageDimensions(1.0d)))
            {
                using (var pageReader = docReader.GetPageReader(pdfPageNum))
                {
                    var rawBytes = pageReader.GetImage(); // Returns image bytes as B-G-R-A ordered list.
                    rawBytes = RearrangeBytesToRGBA(rawBytes);
                    var width = pageReader.GetPageWidth();
                    var height = pageReader.GetPageHeight();

                    // specify that we are reading a byte array of colors in R-G-B-A order.
                    PixelReadSettings pixelReadSettings = new PixelReadSettings(width, height, StorageType.Char, PixelMapping.RGBA);
                    using (MagickImage imgPdfOverlay = new MagickImage(rawBytes, pixelReadSettings))
                    {
                        // turn transparent pixels into backdrop color using composite: http://www.imagemagick.org/Usage/compose/#compose
                        imgBackdrop = new MagickImage(backdropColor, width, height);
                        imgBackdrop.Composite(imgPdfOverlay, CompositeOperator.Over);
                    }
                }
            }
        }

        imgBackdrop.Write(memoryStream, MagickFormat.Png);
        imgBackdrop.Dispose();
        memoryStream.Position = 0;
        return memoryStream;
    }

    private byte[] RearrangeBytesToRGBA(byte[] BGRABytes)
    {
        var max = BGRABytes.Length;
        var RGBABytes = new byte[max];
        var idx = 0;
        byte r;
        byte g;
        byte b;
        byte a;

        while (idx < max)
        {
            // get colors in original order: B G R A
            b = BGRABytes[idx];
            g = BGRABytes[idx + 1];
            r = BGRABytes[idx + 2];
            a = BGRABytes[idx + 3];

            // re-arrange to be in new order: R G B A
            RGBABytes[idx] = r;
            RGBABytes[idx + 1] = g;
            RGBABytes[idx + 2] = b;
            RGBABytes[idx + 3] = a;

            idx += 4;
        }

        return RGBABytes;
    }
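To make the two extra steps concrete (byte reordering and replacing transparency with a backdrop color), here is a minimal sketch that does both per pixel with plain arithmetic, using the usual "over" rule out = src * a + backdrop * (1 - a). It is not the code of either library; PixelFlattening and its methods are hypothetical names, and it assumes the input uses straight (non-premultiplied) alpha, which should be verified for the bytes your renderer actually returns.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class PixelFlattening
{
    // Converts B-G-R-A bytes into fully opaque R-G-B-A bytes by blending
    // every pixel over a solid backdrop color.
    // Assumption: straight (non-premultiplied) alpha in the input.
    public static byte[] BgraOverBackdropToRgba(byte[] bgra, byte bgR, byte bgG, byte bgB)
    {
        if (bgra == null) throw new ArgumentNullException(nameof(bgra));
        if (bgra.Length % 4 != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(bgra));

        var rgba = new byte[bgra.Length];
        for (int i = 0; i < bgra.Length; i += 4)
        {
            int alpha = bgra[i + 3];
            rgba[i]     = Blend(bgra[i + 2], bgR, alpha); // R comes from offset 2
            rgba[i + 1] = Blend(bgra[i + 1], bgG, alpha); // G stays in place
            rgba[i + 2] = Blend(bgra[i],     bgB, alpha); // B comes from offset 0
            rgba[i + 3] = 255;                            // result is opaque
        }
        return rgba;
    }

    // Integer form of src * a + backdrop * (1 - a), rounded to nearest.
    private static byte Blend(int src, int backdrop, int alpha)
    {
        return (byte)((src * alpha + backdrop * (255 - alpha) + 127) / 255);
    }
}
```

With a white backdrop (255, 255, 255), a fully transparent pixel becomes white and a fully opaque pixel keeps its color with the red and blue channels swapped into R-G-B-A order. Doing the blend up front is an alternative to compositing two images; which one is preferable depends on how much of the rest of the pipeline already goes through an imaging library.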
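Take a look at the Docotic.Pdf library. It supports .NET Core and has no external dependencies and no unsafe code.
Docotic's PDF-to-image renderer does not depend on GDI+ (System.Drawing). This matters for running code reliably in an ASP.NET context or on Linux.
Merging PDF documents: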
    public void MergeDocuments(string firstPath, string secondPath)
    {
        using (var pdf = new PdfDocument(firstPath))
        {
            pdf.Append(secondPath); // or append stream or byte array

            pdf.ReplaceDuplicateObjects(); // useful when merged files contain common objects like fonts and images

            pdf.Save("merged.pdf");
        }
    }
Converting a PDF page to a PNG image:
    using (var pdf = new PdfDocument(@"merged.pdf"))
    {
        PdfDrawOptions options = PdfDrawOptions.Create();
        options.Compression = ImageCompressionOptions.CreatePng();
        options.BackgroundColor = new PdfRgbColor(255, 255, 255);
        options.HorizontalResolution = 600;
        options.VerticalResolution = 600;

        pdf.Pages[0].Save("result.png", options);
    }
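Before choosing a resolution such as 600, it can help to estimate how large the rendered image will be. The sketch below is plain arithmetic, not part of any PDF library: it assumes the default PDF user-space unit of 1/72 inch per point, and RenderSize with its two methods is a hypothetical name. The exact pixel size a given renderer produces may differ by a pixel because of rounding.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RenderSize
{
    // Converts a length in PDF points to pixels at the given resolution.
    // Assumption: default user-space unit, 1 point = 1/72 inch.
    public static int PointsToPixels(double points, double dpi)
    {
        if (points < 0 || dpi <= 0) throw new ArgumentOutOfRangeException();
        return (int)Math.Round(points / 72.0 * dpi, MidpointRounding.AwayFromZero);
    }

    // Rough size of the uncompressed bitmap at 4 bytes per pixel.
    public static long UncompressedRgbaBytes(double widthPoints, double heightPoints, double dpi)
    {
        long width = PointsToPixels(widthPoints, dpi);
        long height = PointsToPixels(heightPoints, dpi);
        return width * height * 4;
    }
}
```

For an A4 page (595 x 842 points) at 600 dpi this gives 4958 x 7017 pixels, or roughly 139 MB of raw RGBA data before PNG compression. A lower resolution such as 150 is often enough for on-screen previews.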
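More samples for converting PDF to images
You mentioned converting the merged PDF document into a single PNG image. PNG does not support multi-frame images, so the only option is to draw all pages onto a single page and render that page as PNG.
Here is a sample for that case (2 pages combined into one page and saved as PNG):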
    using (var other = new PdfDocument(@"merged.pdf"))
    {
        using (var pdf = new PdfDocument())
        {
            PdfXObject firstXObject = pdf.CreateXObject(other.Pages[0]);
            PdfXObject secondXObject = pdf.CreateXObject(other.Pages[1]);

            PdfPage page = pdf.Pages[0];
            double halfOfPage = page.Width / 2;
            page.Canvas.DrawXObject(firstXObject, 0, 0, halfOfPage, 400, 0);
            page.Canvas.DrawXObject(secondXObject, halfOfPage, 0, halfOfPage, 400, 0);

            PdfDrawOptions options = PdfDrawOptions.Create();
            options.BackgroundColor = new PdfRgbColor(255, 255, 255);
            page.Save("result.png", options);
        }
    }
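The sample above hard-codes positions for two pages placed side by side. For an arbitrary number of pages, the positions have to be computed first. The following is a library-independent sketch of one possible arrangement, stacking pages top to bottom at their original size; PageStackLayout and StackVertically are hypothetical names, and the resulting offsets would still need to be passed to whatever drawing API is used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class PageStackLayout
{
    // Given page sizes in points, returns the size of a canvas that holds all
    // pages stacked top to bottom, plus the top offset of each page.
    // `gap` is optional spacing inserted between consecutive pages.
    public static (double Width, double Height, double[] Tops) StackVertically(
        IReadOnlyList<(double Width, double Height)> pages, double gap = 0)
    {
        if (pages == null) throw new ArgumentNullException(nameof(pages));

        var tops = new double[pages.Count];
        double canvasWidth = 0;
        double y = 0;

        for (int i = 0; i < pages.Count; i++)
        {
            if (i > 0) y += gap;          // spacing only between pages
            tops[i] = y;                  // where this page starts
            y += pages[i].Height;         // advance past this page
            canvasWidth = Math.Max(canvasWidth, pages[i].Width);
        }

        return (canvasWidth, y, tops);
    }
}
```

For a portrait A4 page followed by a landscape A4 page with a gap of 10 points, the canvas is 842 x 1447 points and the pages start at offsets 0 and 852. Note that the offsets are measured from the top; a drawing API with a bottom-left origin would need the values flipped against the canvas height.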
DynamicPDF Rasterizer (NuGet package ID: ceTe.DynamicPDF.Rasterizer.NET) converts PDF to PNG and runs on .NET Core. You can also merge PDFs with DynamicPDF Merger (NuGet package ID: ceTe.DynamicPDF.CoreSuite.NET). Here is an example:
    // Merging existing PDFs using DynamicPDF Merger for .NET product.
    // Verbatim strings (@"...") do not need escaped backslashes.
    MergeDocument mergeDocument = new MergeDocument();
    mergeDocument.Append(@"D:\temporary\DocumentB.pdf");
    mergeDocument.Append(@"D:\temporary\DocumentC.pdf");
    mergeDocument.Append(@"D:\temporary\DocumentD.pdf");

    // Draw the merged output into byte array or save it to disk (by specifying the file path).
    byte[] byteData = mergeDocument.Draw();

    // Convert the merged PDF into PNG image format using DynamicPDF Rasterizer for .NET product.
    InputPdf pdfData = new InputPdf(byteData);
    PdfRasterizer rastObj = new PdfRasterizer(pdfData);
    rastObj.Draw(@"C:\temp\MyImage.png", ImageFormat.Png, ImageSize.Dpi150);
More information on the Rasterizer output formats can be found here:
http://docs.dynamicpdf.com/NET_Help_Library_19_08/DynamicPDFRasterizerProgrammingWithOutputImageFormat.html
More information on deploying DynamicPDF Merger and Rasterizer to .NET Core 2.0 can be found here:
http://docs.dynamicpdf.com/NET_Help_Library_19_08/DynamicPDFRasterizerProgrammingWithReferencingTheAssembly.html
http://docs.dynamicpdf.com/NET_Help_Library_19_08/Merger%20Referencing%20the%20Assembly%20and%20Deployment.html
I think the PDFSharp (for dotnetcore) library is the best choice for all PDF operations.
Here is an example of merging PDFs:
    List<string> documents = new List<string>() { "sample1.pdf", "sample2.pdf", "sample3.pdf" };
    string basePath = env.ContentDocumentPath();
    string savePath = System.IO.Path.Combine(basePath, "merged.pdf");

    if (System.IO.File.Exists(savePath))
        System.IO.File.Delete(savePath);

    PdfDocument outputDocument = new PdfDocument();
    foreach (string documentName in documents)
    {
        PdfDocument inputDocument = PdfReader.Open(System.IO.Path.Combine(basePath, documentName), PdfDocumentOpenMode.Import);

        int count = inputDocument.PageCount;
        for (int idx = 0; idx < count; idx++)
        {
            PdfPage page = inputDocument.Pages[idx];
            outputDocument.AddPage(page);
        }
    }

    outputDocument.Save(savePath);