unidoc / unipdf-examples

Examples for creating and processing PDF files with UniPDF https://github.com/unidoc/unipdf


unipdf-examples's Introduction

Examples

This example repository demonstrates many use cases for UniDoc's UniPDF library. Example code should make it easy for users to get started with UniPDF. Feel free to add to this by submitting a pull request.

While most examples are written in pure Go, a few demonstrate additional functionality that requires CGO and external dependencies. Those examples are identified by the filename suffix "_cgo.go".

License codes

UniPDF requires license codes to operate. There are two options:

Metered License API keys: Free ones can be obtained at https://cloud.unidoc.io
Offline Perpetual codes: Can be purchased at https://unidoc.io/pricing

Most of the examples load the Metered License API key from the environment variable UNIDOC_LICENSE_API_KEY.
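
As a rough illustration of the environment-variable approach, the sketch below reads the key and rejects a blank value. The helper name meteredKey is hypothetical, and the step that hands the key to UniPDF's license package is deliberately left out so the code builds with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// envKeyName is the variable the examples read the metered API key from.
const envKeyName = "UNIDOC_LICENSE_API_KEY"

// meteredKey (hypothetical helper) looks the key up through the given
// function (os.Getenv in practice) and rejects a blank value.
func meteredKey(lookup func(string) string) (string, error) {
	key := strings.TrimSpace(lookup(envKeyName))
	if key == "" {
		return "", errors.New(envKeyName + " is not set")
	}
	return key, nil
}

func main() {
	key, err := meteredKey(os.Getenv)
	if err != nil {
		fmt.Println("license:", err)
		return
	}
	// A real example would now pass the key to the library's license
	// package; that call is omitted so this sketch needs only the
	// standard library.
	fmt.Printf("found a %d-character key\n", len(key))
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the check easy to exercise without touching the real environment.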

Examples for Offline Perpetual License Key loading can be found in the license subdirectory.

Build all examples

Building with go modules:

Simply run the build script, which builds all the binaries into the bin/ subfolder:

$ ./build_examples.sh

Building with GOPATH:

Building with GOPATH requires a slightly different approach due to the /v3 semantic import portion of the unipdf import paths.

First, download the packages:

$ go get github.com/unidoc/unipdf/...

Then choose one of the following two options:

Remove the /v3/ in the unipdf import paths, e.g. use github.com/unidoc/unipdf/core instead of github.com/unidoc/unipdf/v3/core
Alternatively create a symbolic link from the v3 subdirectory of unipdf to the unipdf repository, i.e.

ln -s $GOPATH/src/github.com/unidoc/unipdf $GOPATH/src/github.com/unidoc/unipdf/v3

or move/copy the unipdf folder to unipdf/v3 if symbolic links are not an option.

Once this is done, the examples can be built with the build script as well:

$ ./build_examples.sh

or build individual examples as desired.


unipdf-examples's Issues

Cloud Providers

Hi,

I'm planning on using the license version in the future. I'm interested in the digital signatures functionality.

However, I don't see examples for different cloud providers. Is it possible to add an example integrating with AWS KMS and Google Cloud KMS?

Tried the external example but it seems it doesn't do the trick.

Replaced text looks incorrect

An issue was received via email.

Regarding text replacement, I checked it with our document. The searched text is replaced, but some symbols are displayed incorrectly. For example, I tried to replace the word TYPE with A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, and the symbols J, K, V and Y are not displayed. Instead of these symbols I get empty space. The same happens with lower case. Also, all of the special symbols !@#$%^&*(){}_-+=[] are displayed incorrectly.

This applies to the advanced text replacement example.
An example PDF template was provided.

Add README.md to each example subdirectory

The README.md in each subdirectory should give a broad overview of the category and description of each example item.

[Question] Underline

Is there any example on how to underline text in a PDF?

Test file for https://github.com/unidoc/unipdf-examples/pull/69

green_jobs.pdf

example to flatten form-filled pdf to remove fillable fields?

Hi there -

Nice library! Is there an example that shows how to flatten a form-filled PDF?

I think I figured out how to fill forms (using acroForm.Fields and the field.V=...), although when the pdf is saved, it's saved with fillable fields (with my values). It would be best to somehow save form filled pdf into a file without any form fields present, with just values (flatten the final pdf in other words).

Thanks!

[Question] Documentation for Office tools

Is there any documentation for the office suite?

Create example of filling form with custom font

Would be good to have an example showcasing
unidoc/unipdf#204
with the feature which was added by
unidoc/unipdf#368

forms/pdf_form_fill_json.go debug logs sometimes pollute JSON output

One of the example steps in forms/pdf_form_fill_json.go is to run:

go run pdf_form_fill_json.go input.pdf > formdata.json

Because the example enables debug logging to os.Stdout, any debug messages end up in the output JSON file and break it.

For example, one of my PDFs produces this output:

$ go run pdf_form_fill_json.go in.pdf > data.json
$ head -6 data.json
[DEBUG]  parser.go:754 Pdf version 1.3
[
    {
        "name": "MyField",
        "value": "foo123"
    },

Perhaps logging should be disabled? Alternatively, the loggers in "github.com/unidoc/unipdf/v3/common" could write to os.Stderr (instead of os.Stdout), so that a caller can redirect only the stdout stream to the JSON file.
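
To make the stderr suggestion concrete, here is a minimal standard-library sketch that keeps diagnostics and JSON on separate streams. The names FieldData, writeFields and capture are hypothetical, and the standard log package stands in for the library's own logger.

```go
package main

import (
	"encoding/json"
	"io"
	"log"
	"os"
	"strings"
)

// FieldData mirrors the name/value pairs shown in the JSON output above.
type FieldData struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// writeFields sends diagnostics to logOut and the JSON payload to dataOut,
// so the two streams never mix.
func writeFields(dataOut, logOut io.Writer, fields []FieldData) error {
	logger := log.New(logOut, "[DEBUG] ", 0)
	logger.Printf("exporting %d field(s)", len(fields))

	enc := json.NewEncoder(dataOut)
	enc.SetIndent("", "    ")
	return enc.Encode(fields)
}

// capture runs writeFields against in-memory buffers, which makes it easy
// to check that the data stream holds nothing but JSON.
func capture(fields []FieldData) (data, logs string, err error) {
	var d, l strings.Builder
	err = writeFields(&d, &l, fields)
	return d.String(), l.String(), err
}

func main() {
	fields := []FieldData{{Name: "MyField", Value: "foo123"}}
	// Data goes to stdout and diagnostics to stderr, so redirecting
	// stdout to a file captures clean JSON.
	if err := writeFields(os.Stdout, os.Stderr, fields); err != nil {
		log.Fatal(err)
	}
}
```

With this split, redirecting stdout to formdata.json leaves the debug lines on the terminal.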

Page rotation flattening example

Create an example for rotation flattening.
If a page's rotation is not 0, the viewer rotates the page, yet the coordinate system is unchanged.
In some cases, it is desirable to work with the PDF in the same way as it appears in the viewer, i.e. that point (0,0) is the same upper left corner as shown in the viewer.

The rotation flattening example should analyze each page, and if the Rotate number is not 0, then add code to the contentstream (via creator) to rotate the page contents, and then set the page's Rotate to 0. There should be no visual difference when opening the PDF file in the viewer.
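
The transformation such an example would need can be sketched without the library. The code below computes the cm matrix that bakes a clockwise Rotate value into the content, along with the page size after flattening. It assumes the media box starts at (0, 0) and that the rotation is a multiple of 90; Matrix and flattenRotation are hypothetical names, not UniPDF API.

```go
package main

import "fmt"

// Matrix holds the six operands of the PDF cm operator, [a b c d e f],
// mapping (x, y) to (a*x + c*y + e, b*x + d*y + f).
type Matrix [6]float64

// Apply transforms a point by the matrix.
func (m Matrix) Apply(x, y float64) (float64, float64) {
	return m[0]*x + m[2]*y + m[4], m[1]*x + m[3]*y + m[5]
}

// flattenRotation returns the matrix that reproduces a clockwise Rotate
// value in the content itself, plus the page size after flattening.
// Assumes a media box anchored at (0, 0).
func flattenRotation(rotate int, w, h float64) (m Matrix, newW, newH float64, err error) {
	switch ((rotate % 360) + 360) % 360 {
	case 0:
		return Matrix{1, 0, 0, 1, 0, 0}, w, h, nil
	case 90:
		return Matrix{0, -1, 1, 0, 0, w}, h, w, nil
	case 180:
		return Matrix{-1, 0, 0, -1, w, h}, w, h, nil
	case 270:
		return Matrix{0, 1, -1, 0, h, 0}, h, w, nil
	}
	return Matrix{}, 0, 0, fmt.Errorf("unsupported rotation %d", rotate)
}

func main() {
	m, w, h, err := flattenRotation(90, 612, 792)
	if err != nil {
		fmt.Println(err)
		return
	}
	x, y := m.Apply(0, 792) // where the original top-left corner lands
	fmt.Printf("cm %v, new size %.0fx%.0f, top-left -> (%.0f, %.0f)\n", m, w, h, x, y)
}
```

After prepending the matrix to the page contents, the page's Rotate would be set to 0 and its media box swapped to the new width and height.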

example of text search and replace

I can't see how to do this with the current text examples.

My use case is that I have a PDF that is a CAD output with text in it.
It's in English and I want to replace the text with German text.
The English-to-German translation comes from the Google Translate API.

I can see how to loop over the content streams from the examples:


	fmt.Printf("--------------------\n")
	fmt.Printf("PDF to text extraction:\n")
	fmt.Printf("--------------------\n")
	for i := 0; i < numPages; i++ {
		pageNum := i + 1

		page, err := pdfReader.GetPage(pageNum)
		if err != nil {
			return err
		}

		contentStreams, err := page.GetContentStreams()
		if err != nil {
			return err
		}

		// If the value is an array, the effect shall be as if all of the streams in the array were concatenated,
		// in order, to form a single stream.
		pageContentStr := ""
		for _, cstream := range contentStreams {
			pageContentStr += cstream
		}

		fmt.Printf("Page %d - content streams %d:\n", pageNum, len(contentStreams))
		cstreamParser := pdfcontent.NewContentStreamParser(pageContentStr)
		txt, err := cstreamParser.ExtractText()

		if err != nil {
			return err
		}
		fmt.Printf("\"%s\"\n", txt)
	}

But the insert text example cannot actually replace existing text:
https://github.com/unidoc/unidoc-examples/blob/master/pdf/text/pdf_insert_text.go

Please help...

Problem filling field choice

Hi,

I am trying to fill a PDF with a select/choice field by using pdf_form_fill_json.go, but the field is not filled.

I don't get any error, just a log message:
[DEBUG] logging.go:125 Unexpected string for button/choice field. Converting to name: 'RJ'

I am using the attached PDF and JSON:
teste.pdf
teste.json.txt

JSON format for fjson.LoadFromJSONFile

What format does the JSON need to be in to fill a pdf form?

[
{
"name": "KEY",
"value": "VALUE"
}
...
]
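
For reference, the array-of-objects layout above can be decoded with a plain struct. This is only a sketch of the shape: fieldValue and parseFieldData are hypothetical names, and it assumes every value is a string, which may not hold for all field types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldValue matches one {"name": ..., "value": ...} entry.
// Assumption: values are strings; other field types may differ.
type fieldValue struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// parseFieldData decodes the JSON array into a name -> value map.
func parseFieldData(data []byte) (map[string]string, error) {
	var entries []fieldValue
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, err
	}
	values := make(map[string]string, len(entries))
	for _, e := range entries {
		values[e.Name] = e.Value
	}
	return values, nil
}

func main() {
	sample := []byte(`[{"name": "KEY", "value": "VALUE"}]`)
	values, err := parseFieldData(sample)
	if err != nil {
		fmt.Println("invalid field data:", err)
		return
	}
	for name, value := range values {
		fmt.Printf("%s = %s\n", name, value)
	}
}
```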

PDF Replace text example that works independently of encoding.

The example
https://github.com/unidoc/unipdf-examples/blob/master/text/pdf_search_replace.go

relies on the charcodes being non-encoded and operates on them directly, without accounting for the encoding of the search or replacement text.

The problem with this is that text is frequently encoded and in that case the example fails.

A couple of proposal options are put forward below. The most immediate is probably the first option. The second option could be feasible in the future when we have a more complete set of graphics extractors.

Proposal option 1:

Take encoding into account when decoding text and encoding replacement.
Need to access the font that generated the text and its encoding properties.

Proposal option 2:

Implement using the extractor and TextMarks.

Load the text via TextMarks
See
https://github.com/unidoc/unipdf-examples/blob/f523c3497ec967013956e0c62bca09192b604231/text/pdf_to_csv.go
https://github.com/unidoc/unipdf-examples/blob/master/text/pdf_text_locations.go
May need to do grouping and such in case glyphs are individually displayed (proximity analysis). Then match against the search text and replace with the target. Take care that the target is encoded and displayed with same font as original.

We could make it optional to shift texts if there is any extra space to make it more natural, but by default that is not supported. Need to get some use cases/examples before adding that. Text could shift in a complicated manner (across lines and even pages), so it's not a trivial problem.

Add the text to page with Paragraph (creator).
Add non-textual content to Page straight from the original page.
Might make sense to start by filtering the original page contents and remove text and apply that straight to the page, then draw the text on top of it.
Can potentially lead to some unintended z-index issues (i.e. text above/under some graphic that it should not be)

At the moment this is tricky because the extractor only supports TextMarks and not extraction of other graphics. Ideally would extract all Graphics (including TextMarks), filter them and then regenerate the output.

unidoc/unipdf#83
unidoc/unipdf#9

Creates PDF file that Adobe Reader says is bad.

When you take PR #52 and run the following command on the input PDF file README.pdf

go run pdf_text_locations.go README.pdf github

the output PDF file marked_up_README.pdf is not readable by Adobe Reader.

Update to unipdf 3.6.0

Update modules and make sure everything compiles as expected.

Location in digital signature

I tried running the example pdf_sign_appearance.go with the Location parameter:
signature.SetLocation("Brookline, MA")
In the resulting document I see the Reason field, but not the Location field.

Update image extraction examples to use extractor package

Demonstrate

Easy interface for image extraction
Image extraction with coordinates

See
https://github.com/unidoc/unipdf/blob/bc0edcb8dd96020029acf9c0b536316128f7d5fd/extractor/image_test.go#L253

Digital signature

Hi, how can I use the SHA2 digest algorithm to sign?

Add optimization examples (v3) - PDF compression

Add a basic example of PDF optimization (compression).

pdf/compress/pdf_optimize.go . Syntax should be: ./pdf_optimize input.pdf output.pdf
Some typical settings can be specified in the code.

Can be based on our CLI:
https://github.com/unidoc/unicli/blob/master/cmd/optimize.go
https://github.com/unidoc/unicli/blob/master/pdf/optimize.go

4up pages example

Create an example demonstrating how to load N pages, scale them and place on one page. Similar to creating handouts, i.e. 4 pages scaled down and put on 1 page with a box around each.

This would demonstrate the capability to easily work with and manipulate page contents with the creator package.
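
The layout arithmetic behind such a handout can be sketched independently of the creator package. The code below computes a uniform scale and a top-left position for each cell of a grid. Placement and layoutGrid are hypothetical names, and the top-left origin is an assumption about the drawing convention.

```go
package main

import "fmt"

// Placement is where one scaled source page goes on the target page.
// X and Y give the top-left corner of the scaled page, measured from the
// top-left of the target (an assumed convention for this sketch).
type Placement struct {
	X, Y, Scale float64
}

// layoutGrid fits cols*rows source pages onto one target page, keeping the
// aspect ratio and centering each page in its cell. Cells are returned
// row by row, left to right.
func layoutGrid(srcW, srcH, dstW, dstH, margin float64, cols, rows int) []Placement {
	if cols <= 0 || rows <= 0 || srcW <= 0 || srcH <= 0 {
		return nil
	}
	cellW := (dstW - margin*float64(cols+1)) / float64(cols)
	cellH := (dstH - margin*float64(rows+1)) / float64(rows)
	if cellW <= 0 || cellH <= 0 {
		return nil
	}
	// Use the smaller ratio so the page fits in both directions.
	scale := cellW / srcW
	if s := cellH / srcH; s < scale {
		scale = s
	}
	out := make([]Placement, 0, cols*rows)
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			x := margin + float64(c)*(cellW+margin) + (cellW-srcW*scale)/2
			y := margin + float64(r)*(cellH+margin) + (cellH-srcH*scale)/2
			out = append(out, Placement{X: x, Y: y, Scale: scale})
		}
	}
	return out
}

func main() {
	// Four US Letter pages on one US Letter page with a 10 pt gutter.
	for i, p := range layoutGrid(612, 792, 612, 792, 10, 2, 2) {
		fmt.Printf("page %d: x=%.1f y=%.1f scale=%.3f\n", i+1, p.X, p.Y, p.Scale)
	}
}
```

Each placement gives the rectangle for the box around the page as well: its corner is (X, Y) and its size is the source size times Scale.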

TestAppenderSignMultiple does not work well yet

I have tried the example from the test method in the model appender_test.go file, and only the first file comes out right. Can you check it?

Large output sizes from split PDF

I am currently testing this with a 5-page PDF. I am doing a very simple task in Go:

First, I split each page into its own PDF. Second, I take each single-page PDF and merge them all into another PDF.

When I compare the file size of the original PDF (200 KB) with that of the merged PDF (500 KB), the merged PDF is larger.

Is there a way to reduce the size of the merged PDF?

Fill form by json, but cannot display English and digital charcode

We created a PDF with Chinese characters that contains forms. We used pdf_form_fill_json.go to fill the forms, but the filled PDF cannot display English letters and digits.

How can we resolve this? It is urgent.

debug info
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='1'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='1'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='1'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='a'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='s'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='d'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='f'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='a'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='s'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='d'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='f'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune='['
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune=']'
[DEBUG] encoder.go:72 Failed to map rune to charcode. rune=' '

fill.json
[
{
"name": "合同",
"value": "合同11111111asdfasdf[] () （）【】。."
}
]

test.pdf

[Question] How do blocks work

Can we use Block in a PDF in order to place content at a specific position? Any example would be nice.

Error version not found

unipdf-examples/image/pdf_watermark_image.go

Line 41 in 7caf256

 func addWatermarkImage(inputPath string, outputPath string, watermarkPath string) error { 

Error: version not found
I am unable to resolve this issue.

Example for StyledParagraph

Add examples of creating formatted text paragraphs using the StyledParagraph component.

how to read pdf page size ?

Missing basic examples

Such as:

Writing text
Barcodes
Basic drawing
- Rects
- Circles
Transformations
- Rotate

Grayscale conversion: FailingPDF.pdf - Range check error - Pattern colorspace issue

Problem with
[0 0 0 /P1] scn fails

Output from grayscale_convert_bench
$ go run pdf_grayscale_convert_bench.go -d -g /tmp/mybla15 /Users/ghall/pdfdb_small/FailingPDF.pdf
compDir=compare.pdfs/dir.000
0 of 1 FailingPDF.pdf (236279->[DEBUG] parser.go:677 Pdf version 1.3
[DEBUG] pdf_grayscale_convert_bench.go:322 ^^^^page 1
[DEBUG] pdf_grayscale_convert_bench.go:757 Name=(*core.PdfObjectName)(0xc42030a6a0)=Im1
[DEBUG] pdf_grayscale_convert_bench.go:767 xtype=1 pdf.XObjectTypeImage=1
[DEBUG] pdf_grayscale_convert_bench.go:757 Name=(*core.PdfObjectName)(0xc42032a2c0)=Im2
[DEBUG] pdf_grayscale_convert_bench.go:767 xtype=1 pdf.XObjectTypeImage=1
[DEBUG] colorspace.go:2020 ERROR: Unable to convert color via underlying cs: Range check
[DEBUG] processor.go:414 ERROR: Fail to get color from params: [0 0 0 P1] (CS is Pattern)
[DEBUG] processor.go:241 Processor handling error (scn): Range check
[DEBUG] processor.go:242 Operand: "scn"
[ERROR] pdf_grayscale_convert_bench.go:883 processor.Process returned: err=Range check
[ERROR] pdf_grayscale_convert_bench.go:363 transformContentStreamToGrayscale failed. err=Range check
[ERROR] pdf_grayscale_convert_bench.go:172 transformPdfFile failed. err=Range check
, bad
1 files 1 bad 0 pass 1 fail

The problem is in the code
if patternCS.UnderlyingCS != nil {
	// Swap out for a gray colorspace.
	patternCS.UnderlyingCS = pdf.NewPdfColorspaceDeviceGray()
}

To handle properly, need to use the actual underlying colorspace...

Add table reporting examples

Create examples showcasing the powerful table functionality

pdf/report/pdf_tables.go showing basic tables with a header wrapping across columns and an image inside the table (based on TestTableWithImage, TestTableHeaderTest).

Could maybe split into chapters and show different examples in each chapter.

pdf/report/pdf_subtables.go showcasing more complex tables using subtables (based off TestTableSubtables)

https://github.com/unidoc/unidoc/blob/0d2e2fa2cda8451aedaf0c3cd519feb91ddf173d/pdf/creator/table_test.go#L410

Support for annotations in grayscale conversions

Annotations can have their appearance defined either via appearance streams or through the PDF viewer's own interpretation and display of the contents.

Clearly, appearance streams are preferred for reliability.

The most robust approach would be to convert the appearance streams to grayscale, as well as any colors defined within the annotation, which requires handling for each type of annotation.

How to get all fonts in a Pdf file???

position information of Image and Text

Hey,

I see the examples can help us quickly get all the text and images from a PDF, but how can I get the position (BBox) information for each image and character?
Also the font information may be needed to analyse the text :)

Thanks a lot!

Digital signature examples for v3

Create examples for signatures. Already started with a draft in #35. The example cases should be:

Basic example with private/public key in PKCS12 (.p12/.pfx file). e.g. ./pdf_sign_pkcs12 file.p12 input.pdf input_signed.pdf
Example of signing with a blank signature first and then replacing blank signature with actual signature contents.
Example of signing via PKCS11 / HSM such as already in #35
Example of signature appearances
Example for signature validation, e.g. ./pdf_sign_validate file.pdf prints out signature info and validation

Advanced search/replace issue

First, this example was a more elegant solution to the problem as I described in unidoc/unipdf#267 . Thank you!

For the documents I'm interested in, here are some attributes from them since I can't share the originals nor can I create my own documents:

Created with Adobe PDF Library 9.0
Acrobat PDFMaker 9.0 for Word
PDF 1.5
Optimized

The advanced search/replace almost works, which is great.

The problem I have is that my letters are segmented to an individual letter per text object, which means this example puts all of the text in the first object, and sets all the other objects to empty. Unfortunately, my pdf viewer moves all the text over to the left, so the new text overlaps with the text next to the old text by quite a bit. Not good.

The solution is to walk each text segment, and in each segment replace the existing characters with the corresponding characters from the replacement text. There are some visible problems when the search text is larger than the replace text, but it's pretty good otherwise.

I've created a modified version of the example, but it was pretty straightforward to do and I don't want to sign the CLA at this time. The fix involves rewriting the inner loop of the 'replace' function, and just modifying the way that you change the chunks. Pseudocode is as follows:


// loop 1
  // loop 2
    // loop 3
      .. existing 'continue' code
      // chunkOffset += 1

      // keep track of consumed characters

      // chunkoffset loop
        // ensure the first chunk retains some of the original content

        // middle chunks: replace all of this chunks content, increment chunkOffset

        // last chunk: append any remaining content
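
One possible reading of the pseudocode above, sketched on plain strings instead of content stream operands: the replacement is spread over the chunks that held the match, and the last of those chunks receives whatever is left. replaceInChunks is a hypothetical helper, and font encoding is ignored entirely.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// replaceInChunks replaces the first occurrence of search, which may span
// several chunks, and spreads repl over the chunks that held the match.
// The input slice is left untouched.
func replaceInChunks(chunks []string, search, repl string) ([]string, bool) {
	out := append([]string(nil), chunks...)
	joined := strings.Join(chunks, "")
	byteIdx := strings.Index(joined, search)
	if search == "" || byteIdx < 0 {
		return out, false
	}
	start := utf8.RuneCountInString(joined[:byteIdx])
	end := start + utf8.RuneCountInString(search)
	replRunes := []rune(repl)
	consumed := 0 // replacement runes already written

	pos := 0 // rune offset of the next chunk within the joined text
	for i, chunk := range chunks {
		rs := []rune(chunk)
		chunkStart := pos
		pos += len(rs)
		lo, hi := chunkStart, pos
		if lo < start {
			lo = start
		}
		if hi > end {
			hi = end
		}
		if lo >= hi {
			continue // this chunk holds no part of the match
		}
		// Middle chunks swap rune for rune; the last matched chunk takes
		// everything that remains of the replacement.
		take := hi - lo
		remaining := len(replRunes) - consumed
		if hi == end || take > remaining {
			take = remaining
		}
		piece := replRunes[consumed : consumed+take]
		consumed += take
		out[i] = string(rs[:lo-chunkStart]) + string(piece) + string(rs[hi-chunkStart:])
	}
	return out, true
}

func main() {
	chunks := []string{"xT", "Y", "P", "Ey"}
	replaced, ok := replaceInChunks(chunks, "TYPE", "12345")
	fmt.Printf("%q %v\n", replaced, ok)
}
```

When the replacement is shorter than the search text, the trailing chunks end up empty, which matches the visible gaps the issue mentions.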

pdf_grayscale_convert_bench.go: Grayscale conversion not working on specific shading dictionary

The grayscale conversion failed on the file: GLType_Stats-Large.pdf.

There was no unidoc error, but color analysis revealed that the output was colored.

go run pdf_grayscale_convert_bench.go -g /tmp/bla1 -d ~/pdfdb_small/GLType_Stats-Large.pdf
compDir=compare.pdfs/dir.000
0 of 1 GLType_Stats-Large.pdf ( 12116->[DEBUG] parser.go:677 Pdf version 1.3
[DEBUG] pdf_grayscale_convert_bench.go:290 ^^^^page 1
[DEBUG] pdf_grayscale_convert_bench.go:723 Name=(*core.PdfObjectName)(0xc42023ac10)=Fm1
[DEBUG] pdf_grayscale_convert_bench.go:733 xtype=2 pdf.XObjectTypeImage=1
[DEBUG] pdf_grayscale_convert_bench.go:809 XObject Form: Fm1
[DEBUG] pdf_grayscale_convert_bench.go:905 Converting shading to gray - cs: Separation
[DEBUG] pdf_grayscale_convert_bench.go:908 Already 1 component - no action
[DEBUG] pdf_grayscale_convert_bench.go:723 Name=(*core.PdfObjectName)(0xc420291ec0)=Fm2
[DEBUG] pdf_grayscale_convert_bench.go:733 xtype=2 pdf.XObjectTypeImage=1
[DEBUG] pdf_grayscale_convert_bench.go:809 XObject Form: Fm2
21503 177%) 1 pages 0.005 sec => /tmp/bla1/GLType_Stats-Large.pdf[ERROR] pdf_grayscale_convert_bench.go:200 isPdfColor: 1 Color pages
, fail
1 files 0 bad 0 pass 1 fail
total duration (everything): 0 seconds
0 bad
0 pass
1 fail
0 /Users/ghall/pdfdb_small/GLType_Stats-Large.pdf - color fail: 1 color pages / 1 total

Digital signing lags when opening pdf with Adobe Reader

I just self-signed a document using your library, and it has problems in Adobe Reader. It loads extremely slowly on Linux, and on Windows and OS X it gets stuck or even crashes. Do you know if this is a problem with the unipdf lib or with Adobe Reader?

Grayscale conversion: Handle 1 component colorspaces generally (not assume is gray)

1-component colorspaces should not simply be ignored. Indexed colorspaces often have 1 component and need to be handled more generally.

Multiple failures in pdf_grayscale_convert_bench.go are caused by this check:
if ximg.ColorSpace.GetNumComponents() == 1 {
	return nil
}

Example file that was failing due to this is kdchart-1, which was using an Indexed colorspace for image data.
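
To illustrate why the component count alone is not enough: an Indexed image stores one index per sample, but the palette behind it can hold any colors. The sketch below converts a palette to gray and leaves the sample indices alone. RGB, luma, grayPalette and hasColor are hypothetical names, and the BT.601 weights are an assumption, not necessarily what the library uses.

```go
package main

import "fmt"

// RGB is an 8-bit color triple as it might be stored in an Indexed palette.
type RGB struct{ R, G, B uint8 }

// luma converts a color to gray using the ITU-R BT.601 weights
// (an assumed choice for this sketch), rounded to the nearest level.
func luma(c RGB) uint8 {
	return uint8((299*int(c.R) + 587*int(c.G) + 114*int(c.B) + 500) / 1000)
}

// grayPalette maps every palette entry to a gray level. The image samples
// are indices into the palette, so they do not need to change.
func grayPalette(palette []RGB) []uint8 {
	out := make([]uint8, len(palette))
	for i, c := range palette {
		out[i] = luma(c)
	}
	return out
}

// hasColor reports whether any palette entry is not neutral, i.e. whether
// a 1-component indexed image can still render in color.
func hasColor(palette []RGB) bool {
	for _, c := range palette {
		if c.R != c.G || c.G != c.B {
			return true
		}
	}
	return false
}

func main() {
	palette := []RGB{{255, 255, 255}, {255, 0, 0}, {0, 0, 255}}
	fmt.Println("palette has color:", hasColor(palette))
	fmt.Println("gray levels:", grayPalette(palette))
}
```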

404 on some examples

Invoice examples are missing.

Add a basic example that illustrates composite symbolic fonts (Chinese, Japanese, Korean)

Show how to use NewCompositePdfFontFromTTFFile to load the font and create basic contents.

[QUESTION] Replace text with same width as original

Question received via email.

One more question about this example: https://github.com/unidoc/unipdf-examples/blob/development/text/pdf_search_replace_advanced.go

Is it possible to generate a replacement string that has the same width as the original?

For example, I have a model.PdfFont object and a font size, and I need to generate a string of spaces (U+0020).

Example data was provided.
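
The arithmetic behind this question can be sketched with a plain width table. Glyph advances are assumed to be given in 1/1000 of a text-space unit, the usual PDF convention. textWidth and spacesToMatch are hypothetical helpers, and the widths in main are illustrative values rather than data read from a real font.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// textWidth returns the width of s at fontSize. widths holds glyph
// advances in 1/1000 of a text-space unit. The second result is false
// when a glyph is missing from the table.
func textWidth(s string, widths map[rune]float64, fontSize float64) (float64, bool) {
	total := 0.0
	for _, r := range s {
		w, ok := widths[r]
		if !ok {
			return 0, false
		}
		total += w
	}
	return total * fontSize / 1000, true
}

// spacesToMatch returns the run of spaces whose width comes closest to
// the width of orig in the same font and size.
func spacesToMatch(orig string, widths map[rune]float64, fontSize float64) (string, bool) {
	target, ok := textWidth(orig, widths, fontSize)
	space, okSpace := textWidth(" ", widths, fontSize)
	if !ok || !okSpace || space <= 0 {
		return "", false
	}
	n := int(math.Round(target / space))
	return strings.Repeat(" ", n), true
}

func main() {
	// Illustrative advances only; a real program would read them from the font.
	widths := map[rune]float64{' ': 278, 'T': 611, 'Y': 667, 'P': 667, 'E': 667}
	spaces, ok := spacesToMatch("TYPE", widths, 12)
	fmt.Printf("%d spaces, ok=%v\n", len(spaces), ok)
}
```

A whole number of spaces rarely matches the original width exactly, so the result is the nearest fit.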

digital signature filled with image

I want to fill the signature with an image when I sign digitally. At the moment I only see:


		field, err := annotator.NewSignatureField(
			signature,
			[]*annotator.SignatureLine{
				annotator.NewSignatureLine("Name", "John Doe11"),
				annotator.NewSignatureLine("Date", "2019.15.03"),
				annotator.NewSignatureLine("Reason", "External signature test"),
			},
			opts,
		)
		field.T = core.MakeString("External signature")

Example of digital sign?

Is there any digital signing example?

Wrong Operator symbol in Search and Replace example

This operator should be ", a double quote symbol, not two single quotes.

unipdf-examples/text/pdf_search_replace.go

Line 142 in f0a0927

case `''`:
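
For context, PDF defines four text-showing operators: Tj, TJ, a single quote and a double quote. A minimal sketch of the corrected check follows; isTextShowOperator is a hypothetical helper, not the example's actual code.

```go
package main

import "fmt"

// isTextShowOperator reports whether op is one of the four PDF
// text-showing operators. The last one is a single double-quote
// character, not two single quotes.
func isTextShowOperator(op string) bool {
	switch op {
	case "Tj", "TJ", "'", `"`:
		return true
	}
	return false
}

func main() {
	for _, op := range []string{"Tj", "TJ", "'", `"`, "''", "Td"} {
		fmt.Printf("%-3q text-showing: %v\n", op, isTextShowOperator(op))
	}
}
```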

example of digital sign on an pdf that already sign with another cert

I want to sign a PDF that already carries a completed digital signature. The two signing certificates are issued by the same parent certificate. Is there an example?

Add invoicing example

Create a simple invoicing example. Can base on:
https://unidoc.io/news/simple-invoices
https://github.com/unidoc/unipdf/blob/master/creator/invoice_test.go

Adding "metadata" or something similar to a PDF

Hi guys,

Awesome library. I was wondering if there's a function to allow adding something like metadata to a PDF file. For example, I'd like to tag/label a certain file, but not as part of the file name and in no place that is visible to the user.

Does this exist in PDF? Or is it something that must be implemented at the filesystem level?

Thanks

Add examples for getting and creating page outlines

Would be nice if the getter can return a JSON of the outlines, that can then be edited and reapplied to a new file or modify existing.
Related #73
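
A rough sketch of what an editable JSON form of the outlines could look like. OutlineItem and its fields are hypothetical and are not the library's types. The point is only that a nested structure survives a round trip through JSON and can be edited in between.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutlineItem is a hypothetical, simplified outline entry: a title, a
// target page number and optional nested entries.
type OutlineItem struct {
	Title    string        `json:"title"`
	Page     int           `json:"page"`
	Children []OutlineItem `json:"children,omitempty"`
}

// outlinesToJSON renders the outline tree as indented JSON for editing.
func outlinesToJSON(items []OutlineItem) (string, error) {
	b, err := json.MarshalIndent(items, "", "  ")
	return string(b), err
}

// outlinesFromJSON reads an edited outline tree back in.
func outlinesFromJSON(s string) ([]OutlineItem, error) {
	var items []OutlineItem
	err := json.Unmarshal([]byte(s), &items)
	return items, err
}

func main() {
	items := []OutlineItem{
		{Title: "Introduction", Page: 1, Children: []OutlineItem{
			{Title: "Scope", Page: 2},
		}},
	}
	encoded, err := outlinesToJSON(items)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(encoded)
}
```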

Emoji's Break Text input

When an emoji is passed to the unidoc NewParagraph function, the text is not displayed correctly (see screenshots).

When entering this character sequence:

p = creator.NewParagraph(text)

Ç % { < a@E£@!&%@&^!@($**))_))!@)(£&^%^!
Without the emoji, all the characters are printed.

Ç % { < a@E£@!&%@&^!@($**))_))!@)(£&^%^! /😀
With an emoji at the end, the text is laid out vertically and the emoji is not printed.