empira / pdfsharp-1.5

A .NET library for processing PDF

License: MIT License

C# 100.00%

pdfsharp-1.5's Issues

[Feature request] Support for changing FontResolver

Once GlobalFontSettings.FontResolver has been set, it cannot be changed.

I have a project that receives input from different sources, each of which can bring its own fonts.
Because different sources can use the same name for different fonts, the easiest approach would be to create one font resolver per input and then switch GlobalFontSettings.FontResolver. The library would then automatically invalidate all caches.

I currently see only two workarounds:

1. Create a lookup that maps every font to a unique generated font name, so that no two fonts from different sources share the same internal name.
2. Create a new AppDomain and tear it down afterwards.

Unfortunately, I can't use option 2, and option 1 will eventually run out of memory, because the internal cache only grows and never shrinks until the process is terminated.
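
The first workaround could be sketched roughly as follows. This is only an illustration: FontNameRegistry, sourceId and the "font-N" naming scheme are hypothetical names, not part of PDFsharp, and the sketch does nothing about the growing internal cache described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a PDFsharp type): maps (source, family name) pairs
// to internal font names that are unique across all input sources.
public sealed class FontNameRegistry
{
    private readonly Dictionary<(string Source, string Family), string> _names =
        new Dictionary<(string Source, string Family), string>();

    // Returns the same internal name for repeated requests of the same pair,
    // so each distinct font gets exactly one generated name.
    public string GetInternalName(string sourceId, string familyName)
    {
        var key = (sourceId, familyName);
        if (!_names.TryGetValue(key, out string internalName))
        {
            internalName = $"font-{_names.Count + 1}";
            _names.Add(key, internalName);
        }
        return internalName;
    }
}
```

A single, never-replaced font resolver would then be asked for the generated names only, so "Arial" from one source and "Arial" from another never collide.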

PDF Core Build

Hello,

Your project is very useful and clearly written. Thanks a lot!
Do you have any plans to complete PDF Core Build to make the libraries work on Linux using Mono?

PDFSharp losing annotations

Hi there,
this bug is still active!

https://stackoverflow.com/questions/38733190/pdfsharp-losing-annotations

Best regards,
Maurice

Bookmarks are encrypted when PDF is secured via PdfSecuritySettings

There appears to be an issue with bookmarks when a PDF document is encrypted via PdfSecuritySettings.

Take the example of adding a bookmark to a newly built PDF [via the AddBookmark() method] using MigraDoc. After that PDF is rendered [via the PdfDocumentRenderer.RenderDocument() method] if a password is set in the OwnerPassword property via the PdfSecuritySettings object in order to secure the PDF (and the DocumentSecurityLevel property is set to PdfDocumentSecurityLevel.Encrypted128Bit), then all bookmark text appears encrypted in the final, saved PDF when viewed via Adobe Acrobat. If a MigraDoc-constructed PDF is not secured, then bookmark text appears in clear text when viewed via Adobe Acrobat.

As a control, if a PDF is manually secured (by adding an owner password, for example) while using Adobe Acrobat, then bookmark text in that PDF remains readable (i.e., in clear text when viewed via Adobe Acrobat).

Possible to isolate OpenType in a separate project?

I need the function
public OpenTypeFontface CreateFontSubSet(Dictionary<int, object> glyphs, bool cidFont)
found in
OpenType\Fonts.OpenType\OpenTypeFontface.cs
in another project, together with XFontMetrics.

I've isolated the OpenType code for my purposes here:
https://github.com/ststeiger/PdfSharpNetStandard/tree/master/OpenType

However, it is intertwined with the PdfSharp code (XFont, XFontSource, XGlyphTypeface, XPrivateFontCollection, FontFactory, some enums, etc).

Would it be possible to refactor and isolate OpenType so that it doesn't depend on PdfSharp?
Extra bonus points if GDI components could be avoided as well (GDI doesn't work on Azure Web App).

AcroForm: cannot set PdfCheckBoxField when they are used as multiple choice

I'm trying to fill in a PDF with an AcroForm. Everything works, but I have a problem setting the value of PdfCheckBoxField objects that share the same name and are used as multiple-choice options.
See the "Controllo impianto istallazione interna II" field in the attached PDF: there are 3 fields with the values 1/2/3.

I looked at the PDFsharp source code, but it seems to handle only the specific situation with 2 options.

Any idea how to work around the problem?
Thank you.
Davide.

2 DM Libretto Allegato II_CTI 1.0 04-03.pdf

Error 'Object already in table' on PdfReader.Open(FileName, PdfDocumentOpenMode.Import)

I get this error when trying to open the included PDF using the command below. I am trying to open this file in order to combine it with other files.

PdfReader.Open(FileName, PdfDocumentOpenMode.Import)

Object already in table

PDFsharp-IssueSubmission.zip

Thanks.

Resources

The official project web site:
http://pdfsharp.net/

The official peer-to-peer support forum:
http://forum.pdfsharp.net/

Reporting an Issue Here

Expected Behavior

Actual Behavior

Steps to Reproduce the Behavior

We strongly recommend using the IssueSubmissionTemplate to make sure we can replicate the issue.
http://www.pdfsharp.net/wiki/IssueSubmissions.ashx

PdfDocument::Flatten() improperly named; doesn't flatten form fields

https://github.com/empira/PDFsharp/blob/5aa7afeb13270aaca36ad21edcf3cc62d6c5446c/src/PdfSharp/Pdf/PdfDocument.cs#L854-L863

Form flattening has a specific meaning in PDFs where the form fields will be removed as a result of the process. In many circumstances, flattening is just used to disable editing but it has other side effects such as changing the document structure and reducing file size that specific use cases are dependent on. Form flattening is non-trivial to implement, and while this workaround accomplishes the desired effect in some use cases it introduces confusing bugs when the user is performing actions dependent on the form being truly "flattened". In my opinion this function is useful but should be given a different name such as MakeReadOnly, ReadOnly, DisableEditing, etc. that doesn't conflict with the common meaning of flatten.

Expected Behavior

The form fields will be removed and replaced with their contents as regular markup objects

Actual Behavior

The form fields are set to read only mode.

This is a great project that has been extremely useful for me, and I'm definitely nitpicking here but I think it's worth considering so that other consumers of the library don't end up down the same confusing debugging road I did. Thanks for the great work!

InvalidOperationException when attempting to save a valid PdfDocument

I discovered a bug in the PdfDocument.Save functions.

Expected Behavior

Saving without exception.

Actual Behaviour

An InvalidOperationException("Cannot save a PDF document with no pages.") is triggered on a freshly opened PdfDocument, even if it contains pages.

This behaviour is only triggered in the compiled programme or during uninterrupted code runs in the debugger (I am using VS 2017). When using step by step debugging (F10), the exception is not triggered.

Steps to Reproduce the Behavior

Calling the following Test() function with a valid PDF byte array triggers the InvalidOperationException when reaching the pdfDoc.Save(pdfFilePath) instruction. All the other checks before are passed successfully.

    public static void Test(this byte[] pdf, string pdfFilePath)
    {
        if (pdf == null) { throw new ArgumentNullException(nameof(pdf)); }

        PdfDocument pdfDoc;
        try
        {
            pdfDoc = PdfReader.Open(pdf.ToPdfMemoryStream(), PdfDocumentOpenMode.Import);
        }
        catch (FormatException)
        {
            MessageBox.Show("Error: Invalid or empty PDF.");
            return;
        }

        // The InvalidOperationException is thrown here.
        pdfDoc.Save(pdfFilePath);
    }

    public static MemoryStream ToPdfMemoryStream(this byte[] pdf)
    {
        if (pdf == null) { throw new ArgumentNullException(nameof(pdf)); }

        PdfDocument outputDocument = new PdfDocument();
        using (MemoryStream inputStream = new MemoryStream(pdf))
        {
            try
            {
                // Copy every page of the input into a fresh document.
                PdfDocument inputDocument = PdfReader.Open(inputStream, PdfDocumentOpenMode.Import);
                foreach (PdfPage page in inputDocument.Pages)
                {
                    outputDocument.AddPage(page);
                }
            }
            catch (InvalidOperationException)
            {
                throw new FormatException("Not a valid PDF document.");
            }
        }
        if (outputDocument.PageCount == 0) { throw new FormatException("PDF is empty"); }

        MemoryStream outputStream = new MemoryStream();
        outputDocument.Save(outputStream);
        return outputStream;
    }

Workaround

Insert the following line in front of pdfDoc.Save(pdfFilePath);:
if (pdfDoc.PageCount == 0) { throw new FormatException("PDF is empty"); }

Possible Fix

The exception is thrown when in the void DoSave(PdfWriter writer) function of the PdfDocument class the following condition is met:
if (_pages == null || _pages.Count == 0)
The bug is caused by the _pages variable not being initialised. This also explains the workaround: Calling PdfDocument.PageCount triggers the initialisation of _pages.

A possible fix is to replace _pages with Pages in the offending line, which triggers the initialisation. Here is a compact version of the fix that avoids attempting the initialisation twice:
if ((Pages?.Count ?? 0) == 0)
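
The difference between checking the field and checking the property can be shown with a simplified stand-in. LazyDocument and its members are hypothetical and are not the PDFsharp implementation; the sketch only assumes, as described above, that the page collection is created on first access of the property.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for a document with a lazily created page collection.
// NOT the PDFsharp implementation; all names here are illustrative.
public class LazyDocument
{
    private List<string> _pages;              // stays null until Pages is first read
    private readonly string[] _storedPages;   // what the file actually contains

    public LazyDocument(params string[] storedPages) { _storedPages = storedPages; }

    // Reading the property creates the collection on demand.
    public List<string> Pages => _pages ?? (_pages = new List<string>(_storedPages));

    // Mirrors the reported check: reads the backing field directly.
    public bool LooksEmptyViaField() => _pages == null || _pages.Count == 0;

    // Mirrors the proposed fix: goes through the property.
    public bool LooksEmptyViaProperty() => (Pages?.Count ?? 0) == 0;
}
```

For new LazyDocument("page 1"), LooksEmptyViaField() reports an empty document until something reads Pages, while LooksEmptyViaProperty() does not. If the real class behaves the same way, a debugger that evaluates properties while stepping may hide the bug in the same manner as the PageCount workaround.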

Zero-width Joiner Not Handled Properly

Greetings,

In testing the PdfSharp package, we found that certain characters were not rendered properly. For example, the character ප්‍ර is being rendered as ප්‍ ර. The problem appears to be linked to characters that use a zero-width joiner in their combination.

I've attached an example to demonstrate the issue. Thank you for any support you can provide.

Regards.

PDFsharp-IssueSubmission.zip

NullReferenceException in PdfReader.Open

Hi guys,
Currently I am getting a null reference exception when calling the static function
PdfReader.Open(Stream stream, PdfDocumentOpenMode mode).

In my case the mode is PdfDocumentOpenMode.Import.

Did you guys encounter this error before? And did I miss something?

Thanks.

Editing any AcroForm causes Adobe Reader to show an "extended features disabled" error/warning.

Whenever I edit an AcroForm (for example, change text stream properties, use checkboxes, or adjust listboxes), a warning pops up when I open the file in Adobe Acrobat. When I open it in Foxit, I don't seem to have these issues.

Expected Behavior

I can open the adjusted file in adobe acrobat and see the changes made.

Actual Behavior

The warning shown above pops up in Adobe Acrobat and the form reverts to its original configuration. The changes seem to be fine in Foxit.

Steps to Reproduce the Behavior

Adjust an acroform's elements properties, for example, a radiobutton's elements["V"] property.

Surrogate characters not working

Surrogate characters (characters that do not fit in 2 bytes) are not drawn correctly.

Expected Behavior

Drawing string with surrogate characters (e.g. 🅐) should draw the correct glyph.

Actual Behavior

Two unrecognizable characters are printed. The surrogate pair is interpreted as two separate characters.

Steps to Reproduce the Behavior

You can reproduce this with the minimal sample repository
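
For context, the sketch below shows how a UTF-16 string can be walked code point by code point, so that a surrogate pair is treated as one character instead of two. It uses only .NET base class library calls; CodePoints.Enumerate is a hypothetical helper name and is not part of PDFsharp.

```csharp
using System;
using System.Collections.Generic;

public static class CodePoints
{
    // Splits a UTF-16 string into Unicode code points,
    // combining each high/low surrogate pair into a single value.
    public static List<int> Enumerate(string text)
    {
        var result = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            bool isPair = char.IsHighSurrogate(text[i])
                          && i + 1 < text.Length
                          && char.IsLowSurrogate(text[i + 1]);
            if (isPair)
            {
                result.Add(char.ConvertToUtf32(text[i], text[i + 1]));
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                result.Add(text[i]); // BMP character (or an unpaired surrogate)
            }
        }
        return result;
    }
}
```

For the string "A🅐" the string length is 3 chars, but the helper yields two code points: U+0041 and U+1F150.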

Relative coordinates are drawn in invalid way

Expected Behavior

Relative coordinates are drawn normally

Actual Behavior

Shapes positioned with relative coordinates are drawn at larger coordinate values than the ones requested.

Steps to Reproduce the Behavior

using System;
using System.Diagnostics;
using System.IO;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

namespace DrawRectangle
{
    class Program
    {
        static void Main()
        {
            var srcPdf = @"C:\Users\Sergey Kuchuk\Desktop\TEST.pdf";
            var tmpPdf = @"C:\Users\Sergey Kuchuk\Desktop\TEST-REZ.pdf";

            if (File.Exists(tmpPdf))
                File.Delete(tmpPdf);
            File.Copy(srcPdf, tmpPdf);


            var inputPdfDocument = PdfReader.Open(tmpPdf, PdfDocumentOpenMode.Import);
            var pdfDocument = new PdfDocument();
            foreach (var page in inputPdfDocument.Pages)
                pdfDocument.AddPage(page);
            inputPdfDocument.Close();

            double x = 0.18900810339655708;
            double y = 0.0638677225513861;
            double w = 0.21725069355926102;
            double h = 0.028283526804211694;

            using (var pageGraphics = XGraphics.FromPdfPage(pdfDocument.Pages[0]))
            {
                Console.WriteLine("Width (points) {0}; Height: {1}", pdfDocument.Pages[0].Width.Point, pdfDocument.Pages[0].Height.Point);
                DrawRectangle(pageGraphics, XColors.Green, 1,
                    x * pdfDocument.Pages[0].Width, y * pdfDocument.Pages[0].Height,
                    w * pdfDocument.Pages[0].Width, h * pdfDocument.Pages[0].Height);
            }

            pdfDocument.Save(tmpPdf);
            Process.Start(tmpPdf);
        }

        public static void DrawRectangle(XGraphics pageGraphics, XColor color, double penWidth, double x, double y, double width, double height)
        {
            var pen = new XPen(color, penWidth)
            {
                LineCap = XLineCap.Round,
                LineJoin = XLineJoin.Bevel
            };
            Console.WriteLine(pageGraphics.PageUnit);
            pageGraphics.DrawRectangle(pen, new XRect(x, y, width, height));
        }
    }
}

HelloWorld.pdf (sample) cannot be opened in Import mode

When testing the ConcatenateDocuments sample, I get a null reference exception on opening HelloWorld.pdf.

(I switched from .NET 2.0 to .NET 4 before running the sample)

Can't build source code

I downloaded the PDFsharp source code from SourceForge. When I open the solution 'BuildAll-PdfSharp.sln' in Visual Studio and try to build, I get the following build errors:

Error 14 Metadata file 'D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp-gdi\bin\Debug\PdfSharp-gdi.dll' could not be found D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp.Charting-gdi\CSC PdfSharp.Charting-gdi
Error 15 Metadata file 'D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp-wpf\bin\Debug\PdfSharp-wpf.dll' could not be found D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp.Charting-wpf\CSC PdfSharp.Charting-wpf
Error 13 Metadata file 'D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\bin\Debug\PdfSharp.dll' could not be found D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp.Charting\CSC PdfSharp.Charting
Error 1 The name 'nameof' does not exist in the current context D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfPageInheritableObjects.cs 67 88 PDFsharp
Error 2 The name 'nameof' does not exist in the current context D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Drawing\XUnit.cs 71 78 PDFsharp
Error 3 The name 'nameof' does not exist in the current context D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfContents.cs 131 49 PDFsharp
Error 4 The name 'nameof' does not exist in the current context D:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Content.Objects\CObjects.cs 775 53 PDFsharp
Error 5 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfPageInheritableObjects.cs 67 88 PdfSharp-gdi
Error 6 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Drawing\XUnit.cs 71 78 PdfSharp-gdi
Error 7 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfContents.cs 131 49 PdfSharp-gdi
Error 8 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Content.Objects\CObjects.cs 775 53 PdfSharp-gdi
Error 9 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfPageInheritableObjects.cs 67 88 PdfSharp-wpf
Error 10 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Drawing\XUnit.cs 71 78 PdfSharp-wpf
Error 11 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Advanced\PdfContents.cs 131 49 PdfSharp-wpf
Error 12 The name 'nameof' does not exist in the current context d:\eProof\testProjects\PDFSharp_Github\PDFsharp\src\PdfSharp\Pdf.Content.Objects\CObjects.cs 775 53 PdfSharp-wpf

Can anyone help me out with building this?

Issues with Crop Boxes, XPdfForm, and XGraphics.DrawImage

Resources

Issue Submission Code
PNM PDFSharp Issue Submission 2018-04-18.zip

We have a PDF with a crop box that doesn't match the media box and we want to create a new PDF without any cropping but whose appearance matches the original when opened in Acrobat or any other reader.

Expected Behavior

Expected behavior is that

XPdfForm.PointWidth and XPdfForm.PointHeight should take the crop box into account
Calling XGraphics.DrawImage when the source image is an XPdfForm should take the crop box into account
Even if those functions don't take the crop box into account, I ought to be able to use the crop box as the srcRect for XGraphics.DrawImage

Actual Behavior

The source page always reports dimensions based on the media box and never the crop box
When I attempt to use the crop box as the srcRect for XGraphics.DrawImage, the image is squished.

Steps to Reproduce the Behavior

See issue submission template for code.

Use XPdfForm.FromStream to open the source PDF and then set the page index.
Create a new destination PDF and PDF page using the width and height of the source page's crop box
Use the XPdfForm object as the source image in a call to XGraphics.DrawImage
Use the source crop box dimensions as the srcRect

Note

I tried this with both the current stable version as well as the new version (PDFSharp-GDI), both from nuget. In the attached code, the WPF project is using the new PDFSharp-GDI. The other projects are using the current stable version. So this is still an issue in the current release candidate.

Thanks!

Fixes for NetStandard-version

I've been porting the code to NetStandard.
https://github.com/ststeiger/PdfSharpNetStandard

Could you move the code in PdfSharp.Forms and PdfSharp.Windows into a separate shared project?
https://github.com/ststeiger/PdfSharpNetStandard/tree/master/PdfSharp_Removed

Also, same thing with Rendering.Forms and Rendering.Windows in MigraDoc.Rendering
https://github.com/ststeiger/PdfSharpNetStandard/tree/master/MigraDoc_Rendering_Removed

Then it would be very simple to have a NetStandard-Version.

Also, if you used partial classes in conjunction with a shared project for GDI and WPF, you could get rid of all the #ifs that make the project unreadable, and you wouldn't need to symlink files either.

Images and encryption break the resulting document

The issue is pretty simple to reproduce: add an image and enable encryption for the document. The result is a broken document (Adobe Reader reports an error) and no image is displayed. Other objects seem to render fine, though.

What I found is that stream objects are RC4-encrypted twice, which effectively means no encryption at all. After removing one of the two encryption passes, the resulting document seems okay.

The first encryption is performed here: https://github.com/empira/PDFsharp/blob/b84018e1ef6c646a4062c7bb4f53561c4027d48f/src/PdfSharp/Pdf.Security/PdfStandardSecurityHandler.cs#L170

The second one here, during the document saving (Save method to file): https://github.com/empira/PDFsharp/blob/b84018e1ef6c646a4062c7bb4f53561c4027d48f/src/PdfSharp/Pdf.IO/PdfWriter.cs#L428

At this point, the question is: which of the two is the better one to remove?
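
Why a double pass cancels out can be seen with a textbook RC4 implementation: RC4 XORs the data with a keystream, so applying it twice with the same key returns the original bytes. The sketch below is a generic illustration under that assumption, not the code from PdfStandardSecurityHandler; Rc4Demo is a hypothetical name.

```csharp
using System;

public static class Rc4Demo
{
    // Textbook RC4: key scheduling followed by XOR with the keystream.
    public static byte[] Apply(byte[] key, byte[] data)
    {
        // Key-scheduling algorithm: permute the state array using the key.
        var s = new byte[256];
        for (int i = 0; i < 256; i++) s[i] = (byte)i;
        for (int i = 0, j = 0; i < 256; i++)
        {
            j = (j + s[i] + key[i % key.Length]) & 0xFF;
            byte t = s[i]; s[i] = s[j]; s[j] = t;
        }

        // Pseudo-random generation: XOR each data byte with a keystream byte.
        var output = new byte[data.Length];
        for (int n = 0, a = 0, b = 0; n < data.Length; n++)
        {
            a = (a + 1) & 0xFF;
            b = (b + s[a]) & 0xFF;
            byte t = s[a]; s[a] = s[b]; s[b] = t;
            output[n] = (byte)(data[n] ^ s[(s[a] + s[b]) & 0xFF]);
        }
        return output;
    }
}
```

Calling Rc4Demo.Apply(key, Rc4Demo.Apply(key, data)) yields data again. A viewer that then decrypts such a stream once more would end up with scrambled bytes, which would be consistent with the broken image described above.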

Support for XFont to check if character is supported.

Not all fonts have the same glyphs. It would be nice if one could check with a method on an XFont object whether a specific character has a glyph in the font, so that calling libraries can switch to a fallback font.

The best method I've found is CharCodeToGlypheIndex. If it returns 0, no glyph was found. But this method is only accessible internally, so calling libraries can't use it.

NullReferenceException XGraphics.DrawImage() when XImage Pixel format is Format1bppIndexed

I am trying to put QRCode on a PDF file. Here's my code:

        PdfDocument document = new PdfDocument();
        PdfPage page = document.AddPage();
        page.Orientation = PdfSharp.PageOrientation.Portrait;
        page.Width = XUnit.FromInch(8.5);
        page.Height = XUnit.FromInch(11);
        XGraphics gfx = XGraphics.FromPdfPage(page);
        XImage xImage = XImage.FromGdiPlusImage(image);
        gfx.DrawImage(xImage, 10, 10, 290, 290);
        document.Save("test.pdf");

I'm getting a null reference exception; here's the stack trace:
   at PdfSharp.Pdf.Advanced.PdfImage.ReadIndexedMemoryBitmap(Int32 bits)
   at PdfSharp.Pdf.Advanced.PdfImage..ctor(PdfDocument document, XImage image)
   at PdfSharp.Pdf.Advanced.PdfImageTable.GetImage(XImage image)
   at PdfSharp.Pdf.PdfPage.GetImageName(XImage image)
   at PdfSharp.Drawing.Pdf.XGraphicsPdfRenderer.Realize(XImage image)
   at PdfSharp.Drawing.Pdf.XGraphicsPdfRenderer.DrawImage(XImage image, Double x, Double y, Double width, Double height)
   at PdfSharp.Drawing.XGraphics.DrawImage(XImage image, Double x, Double y, Double width, Double height)
   at UniWallet.Services.ApiApplication.Extensions.ImageExtensions.ToPDFFileByteArray(Image image) in C:\CODES\UniWallet\UniWallet_Dev\UniWallet.Services\Api\UniWallet.Services.ApiApplication\Extensions\ImageExtensions.cs:line 48
   at UniWallet.Services.ApiApplication.Test.Extensions.ImageExtensionsTest.Create() in C:\CODES\UniWallet\UniWallet_Dev\UniWallet.Services\Api\UniWallet.Services.ApiApplication.Test\Extensions\ImageExtensionsTest.cs:line 36

I tried to follow the stack trace, and here's where I ended up:
PDFImage.cs
case PixelFormat.Format1bppIndexed: ReadIndexedMemoryBitmap(1/*, ref hasMask*/); break;

I tried images with other pixel formats, meaning pictures with some color, and the code worked. It looks like something goes wrong when an image with only two colors (black and white) is used.

THANKS.

Error on opening pdf document with 1.50.4740-beta5

Document - pdf.pdf

PdfSharp.Pdf.IO.PdfReaderException was unhandled
  HResult=-2146233088
  Message=Unexpected character '0x0017' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file.
  Source=PdfSharp-gdi
  StackTrace:
       at PdfSharp.Internal.ParserDiagnostics.HandleUnexpectedCharacter(Char ch)
       at PdfSharp.Pdf.IO.Lexer.ScanNextToken()
       at PdfSharp.Pdf.IO.Parser.ReadInteger(Boolean canBeIndirect)
       at PdfSharp.Pdf.IO.Parser.ReadObjectNumber(Int32 position)
       at PdfSharp.Pdf.IO.Parser.ReadXRefStream(PdfCrossReferenceTable xrefTable)
       at PdfSharp.Pdf.IO.Parser.ReadXRefTableAndTrailer(PdfCrossReferenceTable xrefTable)
       at PdfSharp.Pdf.IO.Parser.ReadTrailer()
       at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider passwordProvider)
       at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, PdfDocumentOpenMode openmode)

NullReferenceException when saving with owner password

Steps to reproduce:

Create a new console application;
Add a NuGet reference: Install-Package PDFsharp-gdi -Version 1.50.4619-beta4c;
Copy the code from the "Hello World" sample application;
Set an owner password, a user password, or both before the call to Save;

Expected result:

The document should be saved and protected with a password.

Actual result:

System.NullReferenceException: Object reference not set to an instance of an object.
    at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.PrepareRC4Key(Byte[] key, Int32 offset, Int32 length)
    at PdfSharp.Pdf.Internal.PdfEncoders.FormatStringLiteral(Byte[] bytes, Boolean unicode, Boolean prefix, Boolean hex, PdfStandardSecurityHandler securityHandler)
    at PdfSharp.Pdf.IO.PdfWriter.WriteDocString(String text)
    at PdfSharp.Pdf.PdfDate.WriteObject(PdfWriter writer)
    at PdfSharp.Pdf.PdfDictionary.WriteDictionaryElement(PdfWriter writer, PdfName key)
    at PdfSharp.Pdf.PdfDictionary.WriteObject(PdfWriter writer)
    at PdfSharp.Pdf.PdfDocument.DoSave(PdfWriter writer)
    at PdfSharp.Pdf.PdfDocument.Save(Stream stream, Boolean closeStream)
    at PdfSharp.Pdf.PdfDocument.Save(String path)

Code:

static void Main()
{
    // Create a new PDF document
    PdfDocument document = new PdfDocument();
    document.Info.Title = "Created with PDFsharp";

    // Create an empty page
    PdfPage page = document.AddPage();

    // Get an XGraphics object for drawing
    XGraphics gfx = XGraphics.FromPdfPage(page);

    // Create a font
    XFont font = new XFont("Times New Roman", 20, XFontStyle.BoldItalic);

    // Draw the text
    gfx.DrawString("Hello, World!", font, XBrushes.Black,
        new XRect(0, 0, page.Width, page.Height),
        XStringFormats.Center);

    // Set the password(s):
    document.SecuritySettings.OwnerPassword = "password";
    document.SecuritySettings.UserPassword = "password";

    // Save the document...
    const string filename = "HelloWorld_tempfile.pdf";
    document.Save(filename);

    // ...and start a viewer.
    Process.Start(filename);
}

Notes:

It doesn't matter whether you set the owner password, the user password, or both.
It doesn't matter what you set the password to, so long as it's not an empty string.
It doesn't matter whether you're creating a new document, or modifying an existing document.

System details:

Microsoft Visual Studio Professional 2017 version 15.4.2
Microsoft .NET Framework version 4.7.02556
Windows 10 v1709 build 16299.19

Sign pdf file

Hello,

Can I use this library to sign an existing PDF file with a certificate?

Thanks,
Frederico

Can't open "secured" PDFs

I have an issue with PDF documents that have an owner password and/or some odd content compression, like this one:

https://www.swissfunddata.ch/sfdpub/docs/kid-8059_08_05-20180208-en.pdf

PDFsharp fails to open them (and also crashes when looking up the exception message). I'm not entirely sure what goes wrong and whether it would all be fine if the owner password was available.

Btw, my use case is that I need to combine multiple documents like this one into one big document which can be printed more easily (i.e. the user doesn't have to open and print them one by one).

Support for TextFormat.StrikeThrough

(MigraDoc) Would it be possible to add TextFormat.StrikeThrough? Thanks!

Wrong order of arguments in Exceptions and other checks

Please have a look at this commit. It covers places where the order of arguments for ArgumentException is switched, places where a variable is used before its null check, and the use of the wrong variable in an Equals method.
jnyrup@7d219d2

Create PDFSharp, MigraDoc .NET Standard 2.0 version with ImageSharp

First, I would like to thank you for the 2 great libraries PDFSharp and MigraDoc.

I have a suggestion.

Use Case Description

I would like to use these libraries cross-platform in ASP.NET Core, so that any ASP.NET Core application deployed anywhere can create PDF documents on the fly with good performance.

Possible Solution

Build PDFSharp and MigraDoc with .NET Standard 2.0
Remove System.Drawing and use ImageSharp (goes RC next month)
Remove the WPF and GDI bits
Only support PDF, no RTF at the start
Remove the preview function in the first version
Remove all the #if switches in the code
Start afresh and drop support for older deployment types; the existing PDFSharp supports these anyway.

Implementation Path Suggestion

I suggest creating a new repository which will implement this and port the following:

MigraDoc.DocumentObjectModel
MigraDoc.Rendering
PdfSharp
PdfSharp.Charting

@ststeiger @YetaWF have also created .NET Standard 2.0 partial ports, maybe they could help.

What do you think? Would you be interested, and do you see this as a good thing? Maybe we could all do it together.

@ststeiger @YetaWF @JimBobSquarePants

Greetings Damien

Feature: Export to string

Hi,
I would like to see a feature to extract all text to string like PdfBox:

PDFTextStripper pdfTextStripper = new PDFTextStripper();
string contentOfAllPages = pdfTextStripper.getText(PDDocument.load(pdfFileName));

I have already found some code to generate content out of pages, but the result has too many line breaks. In particular, words are sometimes split across multiple lines.

How to decode Hex Strings?

I was taking a look at your software and I must say it's very good.
So I decided to give it a try...

I have one of these PDFs that contains a vector image inside.
I managed to extract the specific stream for the vector content:

This is what I got:

q
1 0 0 1 340.9799957 298.8000031 cm
1 g
0 0 m
20.04 0 l
20.04 -11.46 l
0 -11.46 l
0 0 l
h
f*
Q
BT
/C2_0 10.121 Tf
-0.175 Tc 342.06 289.56 Td
<0004000500060004>Tj
ET

(...)

I'm able to render it.
But I'm having some difficulties rendering the text.

For example:

BT
/C2_0 10.121 Tf
-0.175 Tc 342.06 289.56 Td
<0004000500060004>Tj
ET

The hex string does not seem to be a valid string.
My guess is that it's an index into the font's code page, in this case the font referred to by /C2_0.

0004000500060004 => 0004 0005 0006 0004
Depending on the representation, I'm assuming 2 bytes per code.
I don't know where to check that information. (I know that simple fonts only take one byte per code.)
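
The grouping above can be sketched as a small helper. It assumes exactly what the guess above assumes, namely 2 bytes per code, and it additionally assumes a hex string without embedded whitespace; HexStringCodes is a hypothetical name. The resulting numbers are font-specific codes, not Unicode: mapping them to text generally requires the font's encoding or its /ToUnicode CMap, which this sketch does not read.

```csharp
using System;
using System.Collections.Generic;

public static class HexStringCodes
{
    // Splits a PDF hex string such as "<0004000500060004>" into 2-byte codes.
    // Assumption: the font referenced by the Tf operator uses 2-byte codes.
    public static List<int> Split(string hexString)
    {
        // Strip the surrounding angle brackets.
        string hex = hexString.Trim().TrimStart('<').TrimEnd('>');
        if (hex.Length % 4 != 0)
            throw new FormatException("Expected a multiple of four hex digits.");

        var codes = new List<int>();
        for (int i = 0; i < hex.Length; i += 4)
        {
            // Each group of four hex digits is one 16-bit code.
            codes.Add(Convert.ToInt32(hex.Substring(i, 4), 16));
        }
        return codes;
    }
}
```

For the string in the example, HexStringCodes.Split("<0004000500060004>") yields the codes 4, 5, 6 and 4.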

The question is how can I have access to the font and respective code page information to extract the text.

Or better yet if there's a simpler way to get all of this without me having to parse the vector data myself.
Getting the Objects directly... For example PdfLine, PdfText, PdfCircle, etc...

Thanks.

I have a replacement for GetFontData on Linux

I have a replacement for GetFontData, which works on all platforms.

https://gist.github.com/ststeiger/273341aebd29009f2b272b822b69563f

This uses the C# FreeType wrapper from
https://github.com/Robmaister/SharpFont

SharpFont doesn't work on Windows x64 out of the box; this correction is required:
https://gist.github.com/ststeiger/9e2eb98e29a3c987aca739045af1d2ce

I think you have been working with OpenType before.
I'm pretty sure it would be possible to remove the dependency on FreeType.

[feature] support for Visitor pattern

When iterating through a PDF file, a common approach is to write some kind of recursive method.

A visitor class would make this simpler, something like this:

public class PdfCObjectVisitor
{
    // the CObject class should contain a virtual Accept method
    public void Accept(CObject @object) => VisitObject(@object);
    
    public virtual void VisitName(CName name)
    {
    }
    public virtual void VisitString(CString @string)
    {
    }
    public virtual void VisitOperator(COperator @operator)
    {
        VisitSequence(@operator.Operands);
    }
    public virtual void VisitComment(CComment comment)
    {
    }
    public virtual void VisitArray(CArray array)
    {
        foreach (var @object in array)
        {
            VisitObject(@object);
        }
    }
    public virtual void VisitInteger(CInteger integer)
    {
    }
    public virtual void VisitReal(CReal real)
    {
    }
    public virtual void VisitNumber(CNumber number)
    {
    }
    public virtual void VisitSequence(CSequence sequence)
    {
        foreach (var @object in sequence)
        {
            VisitObject(@object);
        }
    }
    public virtual void VisitObject(CObject @object)
    {
        switch (@object)
        {
            case CName name:
                VisitName(name);
                break;
            case CString @string:
                VisitString(@string);
                break;
            case COperator @operator:
                VisitOperator(@operator);
                break;
            case CComment comment:
                VisitComment(comment);
                break;
            case CArray array:
                VisitArray(array);
                break;
            case CInteger integer:
                VisitInteger(integer);
                break;
            case CReal real:
                VisitReal(real);
                break;
            case CNumber number:
                VisitNumber(number);
                break;
            case CSequence sequence:
                VisitSequence(sequence);
                break;
        }
    }
}

With that in place, a class that extracts all of the text from a PDF is simple to write:

public class TextExtractorPdfVisitor : PdfCObjectVisitor
{
    public StringBuilder Builder { get; } = new StringBuilder();
    public override void VisitOperator(COperator @operator)
    {
        if (@operator.OpCode.OpCodeName != OpCodeName.TJ
            && @operator.OpCode.OpCodeName != OpCodeName.Tj)
        {
            return;
        }
        base.VisitOperator(@operator);
    }
    public override void VisitString(CString @string)
    {
        Builder.Append(@string.Value);
    }
}

ContentReader.ReadContent(page) unicode issues

Some characters are not extracted correctly (e.g. the Unicode hyphen variant U+2013, EN DASH, UTF-8 bytes e2 80 93).
The CString seems to be empty.

"Behind" should read "In Front"?

For the summary on XGraphicsPdfPageOptions, shouldn't Append's summary tag read "The new content is appended in front of the old content and any subsequent drawing in done above the existing graphic."?

It threw me for a loop when I read "The new content is inserted behind the old content and any subsequent drawing in done above the existing graphic." because that would be the exact opposite of an append.

https://github.com/empira/PDFsharp/blob/5aa7afeb13270aaca36ad21edcf3cc62d6c5446c/src/PdfSharp/Drawing/enums/XGraphicsPdfPageOptions.cs#L38

Support for "ImageSharp"

For cross-platform applications (such as .NET Core 2.0), I'd suggest adding support for ImageSharp as a graphics interface.

I've been able to get it working, but only because someone else had made it work before me!

My current implementation is meant to be temporary until a better one is available. Let me know if you're interested in my attempt.

How to use it

NullReferenceException when opening a PDF/A Document

When opening a PDF/A document with the following code, I get a NullReferenceException:

PdfReader.Open(pdfFileName, PdfDocumentOpenMode.ReadOnly);

The exception is:
System.NullReferenceException : Object reference not set to an instance of an object.
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.PrepareRC4Key(Byte[] key, Int32 offset, Int32 length)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.PrepareKey()
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptString(PdfString value)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptDictionary(PdfDictionary dict)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptArray(PdfArray array)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptDictionary(PdfDictionary dict)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptObject(PdfObject value)
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.EncryptDocument()
at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider passwordProvider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider provider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, PdfDocumentOpenMode openmode)
at Garaio.REM.Business.Printing.PdfAppenderFixture.PageCountOf(String pdfFileName) in C:\projects\garaio\REM\04_Development\Garaio.REM.REWE.Tester\Garaio.REM.Business\Printing\PdfAppenderFixture.cs:line 29

Incorrect value of PDF boolean object

I am writing a PDF file using PDFsharp. For some reason, the value of a boolean object is written as 'False' instead of 'false' (note the upper-case 'F').

As a result, when I read the file again, I get the following error in PDFsharp:
"Unexpected token 'False' in PDF stream. The file may be corrupted. If you think this is a bug in PDFsharp, please send us your PDF file."

PDFSharp version: Assembly PdfSharp.dll, v1.50.4740.0
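
One way this kind of output can arise in C# is formatting a bool with ToString(), which yields the capitalized "True"/"False", whereas the PDF syntax defines the boolean keywords as lowercase true and false. The sketch below only illustrates that difference. The helper name ToPdfBoolean is hypothetical and not part of PDFsharp, and the report does not confirm that this is the actual cause.

```csharp
using System;

public static class PdfBooleanFormatting
{
    // Hypothetical helper (not a PDFsharp API): PDF boolean objects are
    // the lowercase keywords "true" and "false".
    public static string ToPdfBoolean(bool value) => value ? "true" : "false";

    public static void Main()
    {
        // bool.ToString() returns "True"/"False", which a PDF lexer
        // does not recognize as a boolean token.
        Console.WriteLine(false.ToString());     // .NET formatting
        Console.WriteLine(ToPdfBoolean(false));  // PDF keyword
    }
}
```

A writer that emits the result of ToPdfBoolean instead of the default string conversion would produce a token that readers accept.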

Is it possible to use PDFSharp to create diagrams that PSTricks can generate?

I love creating diagrams with PSTricks (a library of PostScript instructions wrapped for TeX users).
I also love C#.

I am new to PDFsharp. My question is:

Is it possible to use PDFSharp for drawing diagrams that PSTricks can produce?

Please navigate to this link to know what PSTricks is.

Cannot add custom font

When I try to add a custom font using PDFSharp.Drawing.XPrivateFontCollection.AddFont(), I get a warning that says the method is deprecated, and I should use Add(). There is no Add method. When I run it I get an exception.

I downloaded the code, and I can see in the XPrivateFontCollection.AddFont() method where the first line throws an exception. I also see the Add() method, but it is commented out.

I see you have just put RC1 out. It would be great if custom font support would work in the next release.

SharpZipBaseException on calling XGraphics.FromPdfPage

For a lot of PDF pages I get a SharpZipBaseException when calling XGraphics.FromPdfPage.
The exception is thrown in InflaterInputStream.Fill() with message "Unexpected EOF".

I can 'fix' this problem using a hack in InflaterInputStream.Read:

public override int Read(byte[] buffer, int offset, int count)

   ....
if (inf.IsNeedingInput)
{
    try
    {
        Fill();
    }
    catch(SharpZipBaseException)
    { // WB! early EOF: apparently not a big deal for some PDF pages: break out of the loop.
        break; 
    }
}
     ...

Square brackets "[]" in custom property name corrupts pdf file on save

If a PDF has a custom property whose name contains square brackets the file is corrupted when it is saved by PDFsharp. On the next open, PDFsharp gives an UnexpectedToken error and is unable to open the file. One is able to open the file in Acrobat Reader, but not able to view the file's properties.

I have included an example solution.
CustomPropertyIssue.zip
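
For context, the PDF name syntax treats [ and ] as delimiters, so inside a name they have to be written as # followed by their two-digit hex code. The sketch below shows that escaping rule in isolation. EscapeName is a hypothetical helper, not a PDFsharp API, and whether missing escaping is what corrupts the file here is an assumption that merely fits the symptom.

```csharp
using System;
using System.Text;

public static class PdfNameEscaping
{
    // PDF delimiter characters; these may not appear literally in a name.
    const string Delimiters = "()<>[]{}/%";

    // Hypothetical helper: writes a name with non-regular characters as #hh.
    public static string EscapeName(string raw)
    {
        var sb = new StringBuilder("/");
        foreach (byte b in Encoding.UTF8.GetBytes(raw))
        {
            char c = (char)b;
            // Printable ASCII other than '#' and the delimiters is written as-is.
            bool regular = b >= 0x21 && b <= 0x7E && c != '#' && Delimiters.IndexOf(c) < 0;
            if (regular)
                sb.Append(c);
            else
                sb.Append('#').Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(EscapeName("My[Key]")); // prints "/My#5BKey#5D"
    }
}
```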

Highlight Annotation - Help needed

I am creating a new class PdfHighlightAnnotation for rendering a transparent LightYellow rectangle in order to highlight any text in an existing PDF.

I don't have any idea what values to set in the following members. For example, what should the value of Keys.Subtype be for highlighting text? And what values can I set for the Open and Name string constants of the Keys class?

This code is from PdfTextAnnotation.cs

        void Initialize()
        {
            Elements.SetName(Keys.Subtype, "/Text");
            // By default make a yellow comment.
            Icon = PdfTextAnnotationIcon.Comment;
            //Color = XColors.Yellow;
        }

        internal new class Keys : PdfAnnotation.Keys
        {
            [KeyInfo(KeyType.Boolean | KeyType.Optional)]
            public const string Open = "/Open";

            [KeyInfo(KeyType.Name | KeyType.Optional)]
            public const string Name = "/Name";

            public static DictionaryMeta Meta
            {
                get { return _meta ?? (_meta = CreateMeta(typeof(Keys))); }
            }
            static DictionaryMeta _meta;
        }

Can anyone help?

Signing functionality

Hi,

Thank you for sharing this wonderful library.

My company has added signing functionality to it and would like to contribute it to your project. Before creating a pull request, I need some information:

Would you accept a pull request with this functionality?
If you accept the pull request, when would the next release be scheduled?

Best,
Paul

Support for .NetCore

I'm starting to use your library to parse PDFs and extract data from them. This works perfectly on Windows, but I can't even compile against .NET Core because of the base library's dependencies on GUI libraries (such as Silverlight or WinForms).
Do you think it would be possible to have a base PdfSharp.Core library that contains only PDF parsing and creation?
I would be happy to help if needed.

Use the "CodePagesEncodingProvider" where codepages are not supported

If you try to port the sources to .NET Core (.NET Standard 2.0), you'll hit an exception when an external OpenType font is loaded.

The exception is a "NotSupportedException", and it's thrown by the Encoding.GetEncoding(1252) call. The reason is well explained here.

namespace PdfSharp.Pdf.Internal
{
    /// <summary>
    /// Groups a set of static encoding helper functions.
    /// </summary>
    internal static class PdfEncoders
    {
        ...

        /// <summary>
        /// Gets the Windows 1252 (ANSI) encoding.
        /// </summary>
        public static Encoding WinAnsiEncoding
        {
            get
            {
                if (_winAnsiEncoding == null)
                {
#if !SILVERLIGHT && !NETFX_CORE && !UWP
                    // Use .net encoder if available.
                    _winAnsiEncoding = CodePagesEncodingProvider.Instance.GetEncoding(1252);
                    //_winAnsiEncoding = Encoding.GetEncoding(1252);
#else
                    // Use own implementation in Silverlight and WinRT
                    _winAnsiEncoding = new AnsiEncoding();
#endif
                }
                return _winAnsiEncoding;
            }
        }

        ...
    }
}

By following the suggestion here, the problem seems solved.

However, I believe it should be made active via a proper conditional switch.
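
As a point of comparison, an alternative to calling the provider directly is to register it once, after which the original Encoding.GetEncoding(1252) call works unchanged. The sketch below assumes a runtime where CodePagesEncodingProvider is available (on older .NET Core versions this requires the System.Text.Encoding.CodePages package). The class and method names are illustrative only and not part of PDFsharp.

```csharp
using System;
using System.Text;

public static class WinAnsiEncodingDemo
{
    // Illustrative helper: makes code page 1252 resolvable on .NET Core.
    public static Encoding GetWinAnsi()
    {
        // Registering the provider adds the Windows code pages to the
        // encodings that Encoding.GetEncoding can find.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        Encoding winAnsi = GetWinAnsi();
        // The euro sign (U+20AC) maps to the single byte 0x80 in Windows-1252.
        byte[] bytes = winAnsi.GetBytes("\u20AC");
        Console.WriteLine(bytes[0].ToString("X2")); // prints "80"
    }
}
```

Registering at startup keeps the call sites identical across target frameworks, so the conditional switch could be limited to the registration line.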

Paging in Footer like Page x of y.

I didn't find any documentation on how to implement page numbering in the footer.

Reader.Open

I get an "Object already in table" error. I've been able to manipulate a lot of PDFs, but this issue suddenly arose for one specific PDF.

It looks like the reader thinks it is not done?

I'm assuming that after the first loop the variable "prev" is supposed to be zero, so that it breaks out of the loop?
Is there something specific I am supposed to look for in the PDF itself?

License information

After some research and several clicks I found that:

The site https://archive.codeplex.com/?p=pdfsharp says that the project was migrated to GitHub.
That page also links to the project page http://www.pdfsharp.net/, where I found that it is licensed under MIT.

Can the license information and the project homepage be added to the GitHub README.md?
It would be very helpful to have that information close to the code.

Benefits:
GitHub's search filter will work better when searching by license.
Fewer clicks are needed to find the license information.
It will be much clearer to GitHub users which license this project is under.

Parser.cs(ReadXRefStream()): Is Debug.Assert(generation == 0) necessary?

Parser.cs/ReadXRefStream(): Is line 1189, Debug.Assert(generation == 0), really necessary?

In my PDF file there is an XRef stream object with generation 1:
168 1 obj
<</DecodeParms <</Columns 5/Predictor 12>>/Filter /FlateDecode/ID [(K\267M\221U\321\254>\315\377,Q\372\207KL) (K\267M\221U\321\254>\315\377,Q\372\207KL)]/Info 1 0 R/Length 329/Root 2 0 R/Size 176/Type /XRef/W [1 3 1]>>
stream
