sicos1977 / chromiumhtmltopdf

Convert HTML to PDF with a Chromium based browser

ChromiumHtmlToPdf

What is ChromiumHtmlToPdf?

ChromiumHtmlToPdf is a 100% managed C# .NET Standard 2.0 library and .NET Core 3.1 console application (which also works on Linux and macOS) that converts HTML to PDF using a Chromium-based browser (Google Chrome or Microsoft Edge).

From version 4.0 onward the library is fully async, but you can still use the synchronous methods if you prefer.

Why did I make this?

I needed a replacement for wkHtmlToPdf. It is a great tool, but the project is archived on GitHub, no new features are being added, and it is not 100% compatible with HTML5.

License Information

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NON INFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

Installing via NuGet

The easiest way to install ChromiumHtmlToPdf is via NuGet. (Yes, I know the NuGet package has a different name; that is because a package with the new name already exists.)

In Visual Studio's Package Manager Console, simply enter the following command:

Install-Package ChromeHtmlToPdf

Converting a file or url from code

var pageSettings = new PageSettings();
using (var converter = new Converter())
{
    converter.ConvertToPdf(new Uri("http://www.google.nl"), @"c:\google.pdf", pageSettings);
}

// Show the PDF
System.Diagnostics.Process.Start(@"c:\google.pdf");

Or, if you prefer the async API:

var pageSettings = new PageSettings();
using var converter = new Converter();
await converter.ConvertToPdfAsync(new Uri("http://www.google.nl"), @"c:\google.pdf", pageSettings);

// Show the PDF
System.Diagnostics.Process.Start(@"c:\google.pdf");

Converting from Internet Information Services (IIS)

Download Google Chrome or Microsoft Edge portable and extract it
Let your website run under the ApplicationPool identity
Copy the files to the same location as where your project exists on the webserver
Reference the ChromeHtmlToPdfLib.dll from your webproject
Call the converter.ConvertToPdf method from code

That's it.

If you get strange errors when starting Google Chrome or Microsoft Edge, this is most likely caused by the account that is used to run your site. I had a similar problem and solved it by hosting ChromiumHtmlToPdf in a Windows service and making calls to it with a WCF service.

Converting from the command line

ChromiumHtmlToPdfConsole --input https://www.google.com --output c:\google.pdf

Console app exit codes

0 = successful, 1 = an error occurred

Installing on Linux or macOS

Installing .NET

See this url about how to install .NET on Linux

https://docs.microsoft.com/en-us/dotnet/core/install/linux

And this url about how to install .NET on macOS

https://docs.microsoft.com/en-us/dotnet/core/install/macos

Installing Chrome

See this url about how to install Chrome on Linux

https://support.google.com/chrome/a/answer/9025903?hl=en

And this url about how to install Chrome on macOS

https://support.google.com/chrome/a/answer/7550274?hl=en

Example installing Chrome on Linux Ubuntu

wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -

sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list'

sudo apt-get update

sudo apt-get install google-chrome-stable

google-chrome --version

google-chrome --no-sandbox --user-data-dir

Pre compiled binaries

You can find pre compiled binaries for Windows, Linux and macOS over here

Latest version (.net 6)

https://github.com/Sicos1977/ChromiumHtmlToPdf/releases/download/4.2.1/ChromiumHtmlToPdf_4_2_1.zip

.NET 6.0 for the console app

The console app needs .NET 6 to run, you can download this framework from here

https://dotnet.microsoft.com/en-us/download/dotnet/6.0

Older versions (.net core 3.1 - end of life)

.NET Core 3.1 for the console app (end of life)

The console app needs .NET Core 3.1 to run, you can download this framework from here

https://dotnet.microsoft.com/en-us/download/dotnet/3.1

Installing it with the Scoop package manager

See this for more information about scoop --> https://scoop.sh/#/

Just run the command from any PowerShell window (thanks to https://github.com/arnos-stuff)

scoop install https://gist.githubusercontent.com/arnos-stuff/4f9b2d92d812b25d0ee8335c543cba78/raw/cfa861ab3078a20c69157ab45daf33f26005fd63/chrome-html-to-pdf.json

Logging

From version 2.5.0, ChromiumHtmlToPdfLib uses the Microsoft ILogger interface (https://docs.microsoft.com/en-us/dotnet/api/microsoft.extensions.logging.ilogger?view=dotnet-plat-ext-5.0). You can use any logging library that implements this interface.

ChromiumHtmlToPdfLib has some built-in loggers that can be found in the ChromiumHtmlToPdfLib.Loggers namespace.

For example

// logFile is the path of the log file; leave it null or empty to log to the console
var logger = !string.IsNullOrWhiteSpace(logFile)
                ? new ChromiumHtmlToPdfLib.Loggers.Stream(File.OpenWrite(logFile))
                : new ChromiumHtmlToPdfLib.Loggers.Console();

Setting a common Google Chrome or Microsoft Edge cache directory

You cannot share a cache directory between Google Chrome or Microsoft Edge instances, because the first instance that uses the cache directory locks it for its own use. The most efficient approach is to create a separate cache directory for each instance that you are running.

I'm using Google Chrome from a WCF service and used the class below to make optimal use of cache directories. The class will create an instance id that I use to create a cache directory for each running Chrome instance. When the instance shuts down the instance id is put back in a stack so that the next executing instance can use this directory again.

using System;
using System.Collections.Concurrent;

public static class InstanceId
{
    #region Fields
    private static readonly ConcurrentStack<string> ConcurrentStack;
    #endregion

    #region Constructor
    static InstanceId()
    {
        ConcurrentStack = new ConcurrentStack<string>();

        for(var i = 100000; i > 0; i--)
            ConcurrentStack.Push(i.ToString().PadLeft(6, '0'));
    }
    #endregion

    #region Pop
    /// <summary>
    /// Returns an instance id and pops it from the <see cref="ConcurrentStack"/>
    /// </summary>
    /// <returns></returns>
    public static string Pop()
    {
        if (ConcurrentStack.TryPop(out var instanceId))
            return instanceId;

        throw new Exception("Instance id stack is empty");
    }
    #endregion

    #region Push
    /// <summary>
    /// Pushes the <paramref name="instanceId"/> back on top of the <see cref="ConcurrentStack"/>
    /// </summary>
    /// <param name="instanceId"></param>
    public static void Push(string instanceId)
    {
        ConcurrentStack.Push(instanceId);
    }
    #endregion
}
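
As a minimal sketch of how such an id pool might be tied to a cache directory, the helper below leases an id, creates a directory for it, and returns the id to the pool when disposed. The CacheDirectoryLease class, the pool size of 100 and the directory layout are assumptions made for illustration; none of them are part of ChromiumHtmlToPdfLib.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical helper (not part of ChromiumHtmlToPdfLib): leases a per-instance
// cache directory and gives the instance id back to the pool on Dispose.
public sealed class CacheDirectoryLease : IDisposable
{
    // Pool of free ids; the size of 100 is an arbitrary choice for this sketch
    private static readonly ConcurrentStack<string> FreeIds = CreateIds(100);
    private bool _disposed;

    public string InstanceId { get; }
    public string CacheDirectory { get; }

    public CacheDirectoryLease(string rootDirectory)
    {
        if (!FreeIds.TryPop(out var id))
            throw new InvalidOperationException("Instance id stack is empty");

        InstanceId = id;
        CacheDirectory = Path.Combine(rootDirectory, id);
        Directory.CreateDirectory(CacheDirectory);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        // The directory is kept on disk so that the next lease can reuse it
        FreeIds.Push(InstanceId);
    }

    private static ConcurrentStack<string> CreateIds(int count)
    {
        var stack = new ConcurrentStack<string>();
        for (var i = count; i > 0; i--)
            stack.Push(i.ToString().PadLeft(6, '0'));
        return stack;
    }
}
```

Wrapping the lease in a using statement around one conversion means the id goes back on the stack even when the conversion throws. How the directory is then handed to the browser depends on your converter setup and is left out here.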

Using it in a Docker container

# Suppress an apt-key warning about standard out not being a terminal. Use in this script is safe.
ENV APT_KEY_DONT_WARN_ON_DANGEROUS_USAGE=DontWarn

# export DEBIAN_FRONTEND="noninteractive"
ENV DEBIAN_FRONTEND noninteractive

# Install deps + add Chrome Stable + purge all the things
RUN apt-get update && apt-get install -y \
	apt-transport-https \
	ca-certificates \
	curl \
	gnupg \
	--no-install-recommends \
	&& curl -sSL https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
	&& echo "deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list \
	&& apt-get update && apt-get install -y \
	google-chrome-stable \
	--no-install-recommends \
	&& apt-get purge --auto-remove -y curl gnupg \
	&& rm -rf /var/lib/apt/lists/*

# Chrome Driver
RUN apt-get update && \
    apt-get install -y unzip wget && \
    wget https://chromedriver.storage.googleapis.com/2.31/chromedriver_linux64.zip && \
    unzip chromedriver_linux64.zip && \
    mv chromedriver /usr/bin && rm -f chromedriver_linux64.zip

See this issue for more information --> #39

Using this library on Linux or from a docker container

To make the library work, the flag --no-sandbox is set by default (on Windows this flag is not set). The library automatically detects which system the code is running on and sets the flag when needed. If for whatever reason you get a conversion error, check whether this flag is set and, if not, add it manually.

converter.AddChromiumArgument("--no-sandbox");

When Chrome crashes for unknown reasons in a docker container

On most desktop Linux distributions, the default /dev/shm partition is large enough. However, on many cloud providers using Docker containers (such as the Google App Engine Flexible Environment) or Heroku, the default /dev/shm size is appreciably smaller (64MB and 5MB, respectively). On these platforms it's impossible to change the size of /dev/shm, which makes using Chrome difficult or impossible. This is particularly an issue for those who want to take advantage of its new headless mode.

If it is not possible to change the partition size, then add the flag --disable-dev-shm-usage to tell Chrome not to use this partition.

converter.AddChromiumArgument("--disable-dev-shm-usage");

Core Team

Sicos1977 (Kees van Spelde)

Reporting Bugs

Have a bug or a feature request? Please open a new issue.

Before opening a new issue, please search for existing issues to avoid submitting duplicates.

chromiumhtmltopdf's Issues

Binaries for Windows

Is it possible to download the binaries for Windows?
I'm unable to compile the source-code.
Thanks in advance.

.NET Core

Any plans to make this .NET Core compatible?

Now that we can write cross-platform command line tools, ASP.NET Core web apps and actually UI apps that run everywhere (at least with the help of Avalonia UI), it would be great if ChromeHtmlToPdf would be usable on all platforms. What do you think?

Convert HTML file to PDF

I was trying to convert to PDF by file path, but it doesn't work.

Hanging upon generation

Hello!
I'm trying to run the CMD through code (because for some reason, using the code in the example hangs up on me without any error).

Basically running it via
Process p = new Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.RedirectStandardOutput = true;
p.StartInfo.RedirectStandardError = true;
p.StartInfo.RedirectStandardInput = true;
p.EnableRaisingEvents = true;
p.StartInfo.FileName = @"C:\Windows\system32\cmd.exe";
p.StartInfo.Arguments = $@"/c C:\Users\<path>\ChromeHtmlToPdf --input https://www.google.com/ --output {pdfOutput}";
p.Start();
p.BeginErrorReadLine();
p.WaitForExit();

where pdfOutput is basically a path pointing to where "testPdf.pdf" would be, and it just hangs up when I try to run it.
What's weird is that it works perfectly if I just run it straight from the CMD, I'm not really getting any error on the Process itself.
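
One possible explanation, which this thread does not confirm, lies in the redirected streams: the snippet redirects standard output and standard input but only reads standard error, and a child process can block once an unread output pipe fills up. The sketch below drains both output streams asynchronously. The ProcessRunner class is a hypothetical helper, not part of the library.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper: runs a process and drains stdout and stderr so that
// neither pipe can fill up and block the child process.
public static class ProcessRunner
{
    public static (int ExitCode, string Output, string Error) Run(string fileName, string arguments)
    {
        var output = new StringBuilder();
        var error = new StringBuilder();

        using var process = new Process();
        process.StartInfo.FileName = fileName;
        process.StartInfo.Arguments = arguments;
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.RedirectStandardOutput = true;
        process.StartInfo.RedirectStandardError = true;
        // Standard input is not redirected because nothing is written to it

        process.OutputDataReceived += (_, e) =>
        {
            if (e.Data != null) lock (output) output.AppendLine(e.Data);
        };
        process.ErrorDataReceived += (_, e) =>
        {
            if (e.Data != null) lock (error) error.AppendLine(e.Data);
        };

        process.Start();
        process.BeginOutputReadLine();
        process.BeginErrorReadLine();

        // The parameterless overload also waits for the async stream handlers to finish
        process.WaitForExit();

        return (process.ExitCode, output.ToString(), error.ToString());
    }
}
```

Calling the console app directly as fileName, instead of going through cmd.exe /c, also removes one layer of quoting and makes the exit code easier to interpret.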

How to convert html string to Pdf

How to convert html as string to pdf instead of from ConvertUri?
Thanks

When the url returns a 500 error code, I do not want to generate a PDF file

I don't want a PDF file that might display as an error message.
Thank you, It's a great project

Output file extension

I am wondering why in list mode the output file gets the same name as the input file. Shouldn't the extension be changed to ".pdf"? This created some major confusions in my tests because suddenly, without noticing it, the output files became input again...

The relevant code is in Program.cs line 85:

var outputFile = inputUri.IsFile
                            ? Path.GetFileName(inputUri.AbsolutePath)
                            : FileManager.RemoveInvalidFileNameChars(inputUri.ToString());
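
A minimal sketch of the suggested behaviour, assuming the output name should always end in .pdf. OutputNames.ForInput is a hypothetical helper, and the character replacement only stands in for the FileManager.RemoveInvalidFileNameChars call in the snippet above.

```csharp
using System;
using System.IO;

// Hypothetical helper: derives an output file name that always ends in ".pdf"
public static class OutputNames
{
    public static string ForInput(Uri inputUri)
    {
        // Local file: keep the file name but swap its extension for .pdf
        if (inputUri.IsFile)
            return Path.ChangeExtension(Path.GetFileName(inputUri.LocalPath), ".pdf");

        // Web address: replace characters that are invalid in file names,
        // then append .pdf instead of changing an "extension" such as ".com"
        var safeName = string.Join("_", inputUri.ToString().Split(Path.GetInvalidFileNameChars()));
        return safeName + ".pdf";
    }
}
```

With this approach an input file named report.html maps to report.pdf, so an output file can no longer be picked up as input on the next run.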

Make the Converter take an ILogger interface instead of a plain stream, at least as an alternative

The plain stream makes it needlessly difficult to have the library integrate with logging systems like Serilog or the Microsoft.Extensions.Logging pipelines.

WebSocket is creating an Exception / Exception flow out of normal code in test case with Blazor Server Side

Hi there,

A very exciting project, and very useful considering wkhtmltopdf does not support HTML 5 and renders, in my opinion, a poorer quality document than Chrome does with print to PDF. I'm also frustrated that I cannot seem to get FontAwesome Icons to display consistently in the rendered output.

I tested your example solution in isolation and it worked fine (running on IIS Express on my dev host), but when I've tried to get it to work with a URL on a Blazor server app (the page to convert is https://www.fleet-serv.app/Reports/RCI/13 - this is a razor page, not one that uses SignalR), it throws an exception in the hub circuit (I've try--catched the block where the conversion is happening and no Exception is caught) When I've managed to get an exception, it's from the WebSocket connection OnError method, which states that a Message was interrupted, but I cannot find where in the code there is any interruption to this, and this only happens when I'm breaking the execution at every message send and receive.

I've tried various options:

Changing the default parameters to run Chrome in Headful mode - does not work at all
Adding a responseType of Base64Encoded string (or whatever it was from the chrome headless dev-tools API) did not change the behaviour at all
It's after the Page.printToPDF Message is sent. It appears the Task is not set to a Task.CompletedTask as the Exception happens outside of the code flow I can see.
I've tried running the debugger in Chrome and in Edge, with Script Debugging enabled and disabled and get the same result.

I will look into this in more detail when time allows, I'll possibly need to bring in the source for WebSocketSharp into the debugger to get a better feel for where the exception is being thrown.

I've tried running in the debugger both as Kestrel and as IIS Express. I've also changed the target of the https request in the Web API sample, which works fine.

So it looks like it's a peculiarity of working with Blazor and SignalR at this point.

I will need to refactor the entire project in future to work with Blazor WebAssembly, to enable offline use and use in areas where signal connection is sub-optimal, but that's a way away yet, and I had hoped that this very promising project would be my silver bullet.

Thanks for all your hard work on this.

Thanks
Sean

Converter.ConvertToImage Stuck at 'Getting page frame tree'

Hello, thanks for this project. I really need it.
I need to convert a html code (which it's inside a stringbuilder) with this structure:

<html>
<head>
<style>
    table style...
</style>
</head>
    <body>
        <table>...</table>
    </body>
</html>

The HTML is well formed, because I can render it on https://codebeautify.org/htmlviewer (though it is not indented, since it is created at runtime). This is my code:

                var logger = !string.IsNullOrWhiteSpace("logg.txt")
                  ? new ChromeHtmlToPdfLib.Loggers.Stream(File.OpenWrite(@"c:\users\jbalducci\desktop\asd.txt"))
                  : new ChromeHtmlToPdfLib.Loggers.Console();
                MemoryStream stream = new();
                using (ChromeHtmlToPdfLib.Converter converter = new())
                {
                    converter.ConvertToImage(sbHtml.ToString(), stream, new 
                    ChromeHtmlToPdfLib.Settings.PageSettings(ChromeHtmlToPdfLib.Enums.PaperFormat.FitPageToContent), logger: logger);
                }
                var imageBytes = stream.ToArray();
                stream.Dispose();

The method ConvertToImage doesn't do its job, the program is stuck, in the log file I have these lines:

2021-12-16T23:53:10.672 - Starting Chrome from location 'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe' with working directory 'C:\Program Files (x86)\Google\Chrome\Application'
2021-12-16T23:53:10.693 - "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --headless --disable-gpu --hide-scrollbars --mute-audio --disable-background-networking --disable-background-timer-throttling --disable-default-apps --disable-extensions --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-translate --metrics-recording-only --no-first-run --disable-crash-reporter --remote-debugging-port="0" --window-size="1366,768"
2021-12-16T23:53:10.718 - Chrome process started
2021-12-16T23:53:11.525 - Received Chrome error data: 'DevTools listening on ws://127.0.0.1:49528/devtools/browser/9a9c5b58-755f-4ca1-b4a9-86e4b82a16f1'
2021-12-16T23:53:11.525 - Connecting to dev protocol on uri 'ws://127.0.0.1:49528/devtools/browser/9a9c5b58-755f-4ca1-b4a9-86e4b82a16f1'
2021-12-16T23:53:11.559 - Creating new websocket connection to url 'ws://127.0.0.1:49528/devtools/browser/9a9c5b58-755f-4ca1-b4a9-86e4b82a16f1'
2021-12-16T23:53:11.569 - Opening websocket connection with a timeout of 30 seconds
2021-12-16T23:53:11.714 - Websocket opened
2021-12-16T23:53:11.926 - Creating new websocket connection to url 'ws://127.0.0.1:49528/devtools/page/1509782EA4467F24112ABA2A90976106'
2021-12-16T23:53:11.926 - Opening websocket connection with a timeout of 30 seconds
2021-12-16T23:53:11.928 - Websocket opened
2021-12-16T23:53:11.928 - Connected to dev protocol
2021-12-16T23:53:11.929 - Chrome started
2021-12-16T23:53:11.929 - Getting page frame tree

I opened Visual Studio as administrator.
Can you please help me fixing this? Thanks!

Add option to merge multiple html files to one pdf

IIS: Chrome exited unexpectedly, One or more arguments are invalid

I am trying to get this running in the IIS but am stuck with this error:
Chrome exited unexpectedly, One or more arguments are invalid (Exception from HRESULT: 0x80000003)

I have tried to place the ChromePortable in the following locations:

Root of WebApp
WebAppBin\Bin

I have tried to copy the entire folder "GoogleChromePortable" to those two locations, and to copy the entire content of the "GoogleChromePortable" folder to the two locations. Same error in all scenarios.

Would it be possible for you to either add a path for the ChromePortable as a parameter or check for a specific path before trying to spin up ChromePortable? If the files are not found at the location, you could throw an error that points the developer in the right direction.

Extra details:
at ChromeHtmlToPdfLib.Converter.StartChromeHeadless() in C:\tfs\ChromeHtmlToPdf\ChromeHtmlToPdfLib\Converter.cs:line 380
at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, Stream outputStream, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable`1 conversionTimeout, Nullable`1 mediaLoadTimeout, Stream logStream) in C:\tfs\ChromeHtmlToPdf\ChromeHtmlToPdfLib\Converter.cs:line 983
at HtmlToPdfApi.Controllers.HtmlToPdfController.PostConvertHtmlToPdf(ConversionRequest req) in c:\Projekter\documendo\Tools\HtmlToPdfApi\Controllers\HtmlToPdfController.cs:line 109

Support HTML conversion image

Valid file extensions

Hi!

I'm using version 1.2 of your awesome library to convert vector graphics .svg files to .pdf and it works great. But it seems like the new version is not accepting .svg files anymore. Adding .svg in the code solves the problem:

switch (ext.ToLowerInvariant())
{
case ".htm":
case ".html":
case ".svg":
// This is ok
break;
default:
...
}

It seems the
PreWrapExtensions.Add(".svg")
is not working for this file extension at least.

Can the conversion be faster?

I found that every time I convert a pdf, Chrome will be launched. So when I have a lot of conversions, the efficiency is relatively low. Is there a way to reuse Chrome that has been started?

Multiple input files

Hi,
conversion with a single file works great.
Now I would like to convert multiple html files into a single pdf file. My first question regards the format of --input when --input-is-list is set. The documentation for the --input parameter should say that it needs to be a file that lists the input files; that wasn't clear to me, and I first tried to pass multiple files separated by semicolon, space, etc.

So do I understand it correctly that for multiple input files each file would result in its own pdf file? I am just wondering because as far as I remember, wkhtmltopdf does support combining multiple files into a single pdf. My use case is creating a single pdf report from multiple html files that I generate dynamically.

Unable to resolve WebSocketSharp dependency

Install-Package : Unable to resolve dependency 'WebSocketSharp'. Source(s) used: 'nuget.org', 'Microsoft Visual Studio Offline Packages'.
Any solution for this is highly appreciated.

Unable to open a page protected by authentication

Hi,

I am trying to print pdf a page which is currently protected by authentication. Can you please assist with this query

Regards,
TC

Add delay option

Sometimes we need to export charts and other information rendered with JavaScript, and need a delay to wait for the page to finish loading.

Have no response when using in a winform button-click event

I tried to put the converter.ConvertToPdf method in a WinForms button-click event, but it just keeps running and never responds.

Here's my code:

private void button_Click(object sender, EventArgs e)
{
    ChromeHtmlToPdfLib.Settings.PageSettings pageSettings = new ChromeHtmlToPdfLib.Settings.PageSettings();

    using (var converter = new Converter())
    {
        converter.ConvertToPdf(new ConvertUri("https://starbounder.org/Starbound_Wiki"), @"d:\starbound.pdf", pageSettings);
    }
}

I have to pass the third parameter to converter.ConvertToPdf (only two are used in README.md), otherwise it won't work.
The method works successfully in a console app, but it is another story in WinForms.

Fatal error

Hi Kees,

I tried installing version 211 today on Win2016 and get the following on the command line: A fatal error occurred. The required library hostfxr.dll could not be found.
If this is a self-contained application, that library should exist in [C:\ChromeHtmlToPdf].

Any tips? Thx!

best regards Kjetil

Margin is cutting off content

When I use the margins in the PageSettings like this:

converter.ConvertToPdf(new ConvertUri(_pathManager.PrintFile), output.OutputFilePath,
                            new PageSettings
                            {
                                PrintBackground = true,
                                MarginBottom        = document.Margin.Bottom,
                                MarginLeft          = document.Margin.Left,
                                MarginRight         = document.Margin.Right,
                                MarginTop           = document.Margin.Top
                            });

The content of the page is being cut off on the right side and if I increase the margins on the other sides they are also cutting off content. Like this:

Everything else is working like it should as far as I can tell.

Chrome Portable - "Chrome exited unexpectedly"

I downloaded Chrome portable from the Chrome portable site (both the 32-bit and 64-bit versions). I placed the extracted folder in the solution folder and gave the path to the converter, and I always get the message "Chrome exited unexpectedly".

If I start the process (chromeportable.exe) from my console application, Chrome opens. Additionally, if I don't pass anything to the Converter constructor, it works fine and uses the Chrome installed on my machine.

Can you help me understand what is missing when using Chrome portable?

I'm using the latest version (2.0.6).

The directory 'd:\ff' does not exists

C:\Projects\ChromeHtmlToPdf_v1.2>ChromeHtmlToPdf.exe version
ChromeHtmlToPdf 1.2.0.0
Copyright c 2017 - Kees van Spelde

running it

C:\Projects\ChromeHtmlToPdf_v1.2>ChromeHtmlToPdf.exe --input result.htm --output result.pdf
2018-11-27T00:19:49 - Stopping Chrome
2018-11-27T00:19:49 - The directory 'd:\ff' does not exists
2018-11-27T00:19:49 - Stopping Chrome

there seems to be some hardcoded reference to d:\ff in the release that is available from GitHub. I will download the source code and give that a try.

How to reuse Chrome

I use a service for PDF conversion, which simply creates a static converter variable. Then, when the request is received, converttopdf() is called for conversion. However, I found the following anomalies occasionally:

An unhandled exception has occurred while executing the request.
System.NullReferenceException: Object reference not set to an instance of an object.
   at ChromeHtmlToPdfLib.Browser.PrintToPdf(PageSettings pageSettings, CountdownTimer countdownTimer)
   at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, Stream outputStream, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable`1 conversionTimeout, Nullable`1 mediaLoadTimeout, Stream logStream)
   at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, String outputFile, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable`1 conversionTimeout, Nullable`1 mediaLoadTimeout, Stream logStream)

I can't find the reason.
Is it because of multithreading, or does chrome hibernate itself after it starts?

Please help me. Thank you
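
If concurrent calls on the shared converter turn out to be the cause (the thread does not confirm this), one generic option is to let only one request at a time use the shared instance. The sketch below shows the gating pattern on its own. SerializedWorker is a hypothetical helper, and the delegate passed to RunAsync would wrap the actual conversion call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets only one caller at a time run work against a
// shared resource, such as a single static converter instance.
public sealed class SerializedWorker
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public async Task<T> RunAsync<T>(Func<Task<T>> work)
    {
        // Wait until no other caller is inside the gate
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            return await work().ConfigureAwait(false);
        }
        finally
        {
            // Always release, even when the work throws
            _gate.Release();
        }
    }
}
```

This trades throughput for safety. A pool of several converters, each behind its own gate, would be the next step if one conversion at a time is too slow.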

Make crossplatform Chrome finder

Exception in an event could crash the host application

Hi
Thanks for the great library, but I've stumbled upon an issue that I think you might be interested in looking into.

It seems like any exception raised in the _chromeProcess events (StartChromeHeadless() method - ErrorDataReceived and probably Exited and OutputDataReceived as well) will cause the host application to terminate unexpectedly.

Since these are handled synchronously with a ManualResetEvent in the method, I think it would make sense to catch them in the event and rethrow them in the StartChromeHeadless()-method directly. That would allow us to wrap converter.ConvertToPdf in a try-catch and handle the error in the application.

Steps to reproduce:

Set a breakpoint at the two Environment.Exit lines in Program.cs.
Throw any exception in _chromeProcess.ErrorDataReceived
The application will terminate without hitting any of the breakpoints.

This causes an issue when running in a Windows service since it'll shut down the service (and probably restart the application pool for an IIS hosted application).

Best regards
Johan
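
The catch-and-rethrow idea from this report can be sketched in general terms as follows. EventExceptionRelay is a hypothetical helper and does not reflect how the library is implemented: the event handler stores any exception and signals, and the waiting method rethrows it on the caller's thread, where a normal try-catch can handle it.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

// Hypothetical helper: captures an exception thrown inside an event handler
// and rethrows it on the thread that is waiting for the event.
public sealed class EventExceptionRelay
{
    private readonly ManualResetEventSlim _signal = new ManualResetEventSlim(false);
    private ExceptionDispatchInfo _captured;

    // Call this from the event handler, passing the handler body as a delegate
    public void Guard(Action handlerBody)
    {
        try
        {
            handlerBody();
        }
        catch (Exception exception)
        {
            // Store the exception instead of letting it escape the handler
            _captured = ExceptionDispatchInfo.Capture(exception);
        }
        finally
        {
            _signal.Set();
        }
    }

    // Call this from the method that waits for the event
    public void WaitAndRethrow(TimeSpan timeout)
    {
        if (!_signal.Wait(timeout))
            throw new TimeoutException("The event was not raised in time");

        // Rethrows with the original stack trace when an exception was captured
        _captured?.Throw();
    }
}
```

ExceptionDispatchInfo is used rather than a plain throw so that the stack trace from the event handler survives the hop between threads.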

Access denied

I think there is a bug in line 124 of Program.cs

using (var output = File.OpenWrite(options.Output))

When using --input-is-list then --output is supposed to be a directory and this cannot be opened for writing. This gets me an "Access denied" exception.

Make it run in IIS

Hi,

How can I make it run under IIS?

Thanks.

Feature request: Add options for headerTemplate, footerTemplate, etc...

It would be nice if this could match the options offered by the latest stable version of chrome dev tools available in Chrome v65+ (now in stable channel)

Options that would be nice for pdf conversion are especially the header and footer template ones.

All properties are described here: https://chromedevtools.github.io/devtools-protocol/1-3/Page/#method-printToPDF

Stuck in - Chrome Process Started running from the Command line

I'm running:

C:\TEMP\ChromeHTMLToPDF>ChromeHtmlToPdfConsole.exe --input "https://www.bing.com" --output google.pdf

C:\TEMP\ChromeHTMLToPDF>ChromeHtmlToPdfConsole.exe --input "https://www.bing.com" --output google.pdf
2021-08-17T11:32:15.527 - Starting Chrome from location 'C:\TEMP\ChromeHTMLToPDF\chrome.exe' with working directory 'C:\TEMP\ChromeHTMLToPDF'
2021-08-17T11:32:15.531 - "C:\TEMP\ChromeHTMLToPDF\chrome.exe" --headless --disable-gpu --hide-scrollbars --mute-audio --disable-background-networking --disable-background-timer-throttling --disable-default-apps --disable-extensions --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-translate --metrics-recording-only --no-first-run --disable-crash-reporter --remote-debugging-port="0" --window-size="1366,768"
2021-08-17T11:32:15.540 - Chrome process started

Cursor blinks and blinks, and it never gets past the point of "- Chrome process started"

On the other hand, Chrome.exe alone opens up fine when run from the same CMD window. No firewall issues, can browse just fine.

Windows 10 Build 1809 / Portable Chrome Version 92.0.4515.159 (Official Build) (32-bit) / Running as Administrator

Thank you!

wait-for-window-status option

I found the interesting feature 'wait-for-window-status' but I can't get it to work. On my web page I need to wait for all content loaded via ajax, so I wanted to use 'wait-for-window-status' to wait. I set in javascript:
window.status = 'report_ready';
and on the command line:
--wait-for-window-status report_ready
but it doesn't recognise it and waits until the timeout. I use Chrome. Can you help me?

Nuget: install-package : Unable to resolve dependency 'WebSocketSharp'. Source(s) used: 'Nuget'

I tried adding the package on a fresh MVC solution with .net 4.6.2 on VS 2017 and I got the following error:
install-package : Unable to resolve dependency 'WebSocketSharp'. Source(s) used: 'Nuget', 'Microsoft Visual Studio Offline Packages'.

I got the exact same error using VS 2019.

A workaround is to first install WebSocketSharp through the Package Manager Console:
Install-Package WebSocketSharp -Version 1.0.3-rc11

Then run the following through the console or through the nuget manager:
install-package chromehtmltopdf

Btw. the documentation states dependencies against .NETFramework 4.6.1, but you are actually using .NETFramework 4.6.2 ;)

Not able to find the pre compiled binaries for macOS and Linux

README.md mentioned

Pre compiled binaries
You can find pre compiled binaries for Windows, Unix and macOS over here

https://github.com/Sicos1977/ChromeHtmlToPdf/releases/download/2.0.11/ChromeHtmlToPdf_211.zip
https://github.com/Sicos1977/ChromeHtmlToPdf/releases/download/2.1.6/ChromeHtmlToPdf_216.zip

but I was not able to find them from these two links. Any ideas? Thanks.

Running from docker throws exception

Hi,

I have a .net core project that is using ChromeHtmlToPdf. It works fine locally on windows. But it throws "no process is associated with this object" exception when running from docker container (which is linux based image).

The code is below:

var localPath = System.AppDomain.CurrentDomain.BaseDirectory + "testpage.html";
converter.ConvertToPdf(new ConvertUri(localPath), output, setting);

Does the project support docker?

Input-is-list fails

This tool works great on single local html files. I have an issue trying to pass multiple files in. It's possible I don't have the arguments structured right. I can provide stripped down sample html files if it would help. Thanks and keep up the good work!

C:\Users\212549737\Downloads\ChromeHtmlToPdf_216\ChromeHtmlToPdf\ChromeHtmlToPdfConsole --logfile C:\Users\212549737\Desktop\testing\log.txt --input-is-list --input C:\Users\212549737\Desktop\testing\DOCUMENTATION_DOCUMENTATION_1_1.html C:\Users\212549737\Desktop\testing\DOCUMENTATION_DOCUMENTATION_1_2.html C:\Users\212549737\Desktop\testing\DOCUMENTATION_DOCUMENTATION_1_3.html --output C:\Users\212549737\Desktop\testing\Documentation.pdf

log.txt

Run Chrome with --no-sandbox (Linux only?)

There is another thing for (at least) Linux. I have a web app that allows the user to create PDF reports that are created via ChromeHtmlToPdf from dynamically created HTML files. Chrome doesn't like to be started headless from the root account without the --no-sandbox argument.

I've seen that you had it in the default arguments, probably for testing, but now it is commented out. Any chance to bring it back via an argument to ChromeHtmlToPdf?

I recompiled the code to include it for a test and with --no-sandbox my web app is able to create the reports. Without it, no chance :-( Background is that the web app runs as a daemon with root privileges. I'm not sure if it makes any difference when I use a dedicated account but it would need higher privileges nevertheless. With my regular user account I can of course start ChromeHtmlToPdf manually but that account can not be used for the daemon.

Xamarin Support

Hi,
Congratulations on a wonderful project.
Is there any way you could support Xamarin, where Google Chrome may or may not be available?

Stripped down version of Chrome portable?

I'm trying to find a better solution than wkHtmlToPdf on Windows.
I gave ChromeHtmlToPdf a try and it outputs PDFs nicely, but I need a solution for when Chrome is not installed on the computer.
Hence I was wondering where you suggest to grab a portable Chrome and if such a setup can be stripped down in order to remove anything that doesn't have to do with html rendering?

Thanks in advance

Samples to use in MVC C# ASP.NET

Hi Kees,

Do you have a sample to render a MVC page to PDF?
I have troubles using wkhtmltopdf and a page that is using jquery scripting ...

Thanks.
Joop

Using --user on Linux

Hey Kees,

great to see that ChromeHtmlToPdf now works on Linux.
There is one thing that does not currently work on Linux: running as a different user.

The problem is that your current code also looks for a domain to fill ProcessStartInfo.
But accessing the Domain property of ProcessStartInfo throws an exception on Linux. The
same goes for Password and LoadUserProfile. You can see that here. Both getter and setter throw!
https://github.com/eerhardt/corefx/blob/master/src/System.Diagnostics.Process/src/System/Diagnostics/ProcessStartInfo.Unix.cs

Could you pack Domain, Password and LoadUserProfile into an environment check to only evaluate it on Windows?

Then I could still use the UserName field on Linux.
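
To make the request concrete, here is a minimal sketch of the kind of environment check I mean. The class name `StartInfoHelper` and the method `Create` are hypothetical and not part of ChromeHtmlToPdf; the only assumption is that the library builds a `ProcessStartInfo` somewhere before starting Chrome.

```csharp
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.Security;

// Hypothetical helper, not part of ChromeHtmlToPdf.
public static class StartInfoHelper
{
    public static ProcessStartInfo Create(
        string fileName, string userName, string domain, SecureString password)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = fileName,
            UseShellExecute = false // required when starting as another user
        };

        // Set the user name on every platform where one was given.
        if (!string.IsNullOrEmpty(userName))
            startInfo.UserName = userName;

        // Domain, Password and LoadUserProfile are Windows-only, so they are
        // only touched after the platform check.
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            startInfo.Domain = domain;
            startInfo.Password = password;
            startInfo.LoadUserProfile = true;
        }

        return startInfo;
    }
}
```

With something like this, `StartInfoHelper.Create("chrome", "svc", null, null)` would only set the user name on Linux and leave the Windows-only properties alone.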

Nicolas

Allow overriding wrapping of content

Please see attached patch for allowing unwrapped content. We use this for XML. Regards.

AllowNoWrapping.zip

Async methods

Hey there, PrintToPdf constantly ends in a deadlock for me. The way you are calling SendAsync is prone to deadlocks (see this stackoverflow answer).

Changing it to this fixes the deadlock issue (together with making the method, as well as ConvertToPdf, async):

var result = countdownTimer == null ? await _pageConnection.SendAsync(message) : await _pageConnection.SendAsync(message).Timeout(countdownTimer.MillisecondsLeft);
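
For readers wondering what a non-blocking timeout can look like, here is a minimal sketch. `WithTimeout` is a hypothetical name I made up for illustration; it is not the library's own `Timeout` extension, whose implementation I have not reproduced here. The point is only that the wait is awaited all the way up instead of being blocked on with `.Result` or `.Wait()`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, shown only to illustrate awaiting with a timeout.
public static class TaskTimeoutExtensions
{
    public static async Task<T> WithTimeout<T>(this Task<T> task, int milliseconds)
    {
        // Race the real work against a timer without blocking the caller's thread.
        var finished = await Task.WhenAny(task, Task.Delay(milliseconds))
                                 .ConfigureAwait(false);

        if (finished != task)
            throw new TimeoutException(
                $"The operation did not complete within {milliseconds} ms.");

        // The task has already completed; awaiting it returns the result
        // or rethrows its exception.
        return await task.ConfigureAwait(false);
    }
}
```

A caller would then write `var result = await connection.SendAsync(message).WithTimeout(5000);` inside an async method, where `connection` stands in for whatever object exposes `SendAsync`.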

Thanks a ton for this btw! I was getting so frustrated with wkHtmlToPdf with wonky formatting and this solves my issue completely!

Side note, I know you've been working on the library and may not have gotten to it yet - but don't forget to update the 'Converting a file or url from code' section to take into account ConvertUri/Stream and PageSettings!

edit:
Also, before I forget, HttpUtility I think is only for ASP apps so it wasn't compiling for me - I used 'WebUtility' instead and it worked

Support For Local Files (Not Hosted)

Hi,

I'm going to give it a try, but I wondered if this could work for a simple "index.html" file present on the local hard disk. So instead of using a hosted web page, either under a www. domain or simply on a local web server, does this support local HTML files? It is not stated in the readme, so it might be worth mentioning it there too.

Great work ! Thanks

Not working with IIS

I saw the other issue about getting it to work with IIS, and I'm having the same problem: it could not start Chrome and retried 5 times.

I grabbed Chrome Portable from https://www.chrome-portable.com/ and put it in the same directory, but it gives the same error.

I have tried calling the exe from the command line:

            System.Diagnostics.Process process = new System.Diagnostics.Process();
            System.Diagnostics.ProcessStartInfo startInfo = new 
            System.Diagnostics.ProcessStartInfo();
            startInfo.RedirectStandardOutput = true;
            startInfo.RedirectStandardError = true;
            startInfo.UseShellExecute = false;
            startInfo.CreateNoWindow = true;
            startInfo.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
            startInfo.FileName = "cmd.exe";
            startInfo.Arguments = "/C ";
            startInfo.Arguments += System.Web.HttpContext.Current.Server.MapPath("/ChromeHtmlToPdf/ChromeHtmlToPdf.exe");
            startInfo.Arguments += " --print-background --input \"";
            startInfo.Arguments += Url;
            startInfo.Arguments += "\" --output ";
            startInfo.Arguments += System.Web.HttpContext.Current.Server.MapPath("/Reports/export.pdf");

            process.StartInfo = startInfo;
            process.Start();
            
            process.WaitForExit();

It gives me this.

2019-04-25T16:29:27.624 -    at ChromeHtmlToPdfLib.Converter.StartChromeHeadless()
   at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, Stream outputStream, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable`1 conversionTimeout, Nullable`1 mediaLoadTimeout, Stream logStream)
   at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, String outputFile, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable`1 conversionTimeout, Nullable`1 mediaLoadTimeout, Stream logStream)
   at ChromeHtmlToPdf.Program.Convert(Options options)
   at ChromeHtmlToPdf.Program.Main(String[] args), Chrome exited unexpectedly, One or more arguments are invalid (Exception from HRESULT: 0x80000003)

I also tried:

            using (var converter = new Converter(System.Web.HttpContext.Current.Server.MapPath("/chrome64/GoogleChromePortable.exe")))
            {
                var sets = new PageSettings(ChromeHtmlToPdfLib.Enums.PaperFormat.A4);
                sets.PrintBackground = true;
                converter.ConvertToPdf(new Uri(Url), System.Web.HttpContext.Current.Server.MapPath("/Reports/test.pdf"), sets, false);
            }

Same error.

Conversion runs for a long time and returns nothing

converter.ConvertToPdf(new ChromeHtmlToPdfLib.ConvertUri("http://www.baidu.com", Encoding.UTF8), saveFile, new ChromeHtmlToPdfLib.Settings.PageSettings() { },string.Empty,2000,1000,2000);

It has not returned any data or thrown an exception after five minutes.
However, when I used the command line "ChromeHtmlToPdf --input http://www.baidu.com --output D:\1.pdf --timeout 5000 --chrome-location C:\Users\xdq\AppData\Local\Google\Chrome\Application\chrome.exe", it worked well.
Environment:
Chrome 74.0.3729.131;
Windows 10 64-bit;

at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
at System.Environment.get_StackTrace()
at ChromeHtmlToPdfLib.Connection.<SendAsync>d__20.MoveNext() in F:\Repos\Libray\4-Demos\Printer\ChromeHtmlToPdfLib\Connection.cs:line 133
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.Start[TStateMachine](TStateMachine& stateMachine)
at ChromeHtmlToPdfLib.Connection.SendAsync(Message message)
at ChromeHtmlToPdfLib.Browser.d__6.MoveNext() in F:\Repos\Libray\4-Demos\Printer\ChromeHtmlToPdfLib\Browser.cs:line 311
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder1.Start[TStateMachine](TStateMachine& stateMachine)
at ChromeHtmlToPdfLib.Browser.PrintToPdf(PageSettings pageSettings, CountdownTimer countdownTimer)
at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, Stream outputStream, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable1 conversionTimeout, Nullable1 mediaLoadTimeout, Stream logStream) in F:\Repos\Libray\4-Demos\Printer\ChromeHtmlToPdfLib\Converter.cs:line 951
at ChromeHtmlToPdfLib.Converter.ConvertToPdf(ConvertUri inputUri, String outputFile, PageSettings pageSettings, String waitForWindowStatus, Int32 waitForWindowsStatusTimeout, Nullable1 conversionTimeout, Nullable1 mediaLoadTimeout, Stream logStream) in F:\Repos\Libray\4-Demos\Printer\ChromeHtmlToPdfLib\Converter.cs:line 1008
at PrintPreview4.HtmlToPdf.HtmlToPdf_Load(Object sender, EventArgs e) in F:\Repos\Libray\4-Demos\Printer\PrintPreview4\HtmlToPdf.cs:line 28
at System.Windows.Forms.Form.OnLoad(EventArgs e)
at Syncfusion.WinForms.Controls.SfForm.OnLoad(EventArgs args)
at System.Windows.Forms.Form.OnCreateControl()
at System.Windows.Forms.Control.CreateControl(Boolean fIgnoreVisible)
at System.Windows.Forms.Control.CreateControl()
at System.Windows.Forms.Control.WmShowWindow(Message& m)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.Form.WmShowWindow(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.SafeNativeMethods.ShowWindow(HandleRef hWnd, Int32 nCmdShow)
at System.Windows.Forms.SafeNativeMethods.ShowWindow(HandleRef hWnd, Int32 nCmdShow)
at System.Windows.Forms.Control.SetVisibleCore(Boolean value)
at System.Windows.Forms.Form.SetVisibleCore(Boolean value)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
at PrintPreview4.Program.Main() in Program.cs:line 19

_tempDirectory null

Hi, I'm not sure if the temp directory is expected to be explicitly set but when I don't set it, I get an exception "Object reference not set". I changed the code in Converter.cs for ConvertToPdf slightly to check for null:

    ....
finally
{
	if (_tempDirectory != null) //*** NEW ***
	{
		_tempDirectory.Refresh(); //*** 'Object reference' exception when _tempDirectory==null
		if (_tempDirectory.Exists)
		{
			WriteToLog($"Deleting temporary folder '{_tempDirectory.FullName}'");
			_tempDirectory.Delete(true);
		}
	}
}

That way my PDF conversion runs through, but I am wondering how this ever worked for others. Or am I doing something wrong?

It looks like the background property is not working at all.

I have tried both passing in HTML with CSS classes that set a background and passing in the URL of a page that has a background set, and neither option worked.

Example:
https://www.w3schools.com/cssref/css_colors.asp
Test.pdf

Invalid URI: The Authority/Host could not be parsed.

macOS, Visual Studio, .NET Core 2.2

Error in converting in container

I am trying to convert inside a container and I get an exception.

I use this code:

                using (var converter = new Converter(""))
                {
                    var pageSettings = GeneratePageSetting(document.ExtraParams);
                    var inputUri = new ConvertUri(document.Url);
                    converter.ConvertToPdf(inputUri, outputPdfFilePath, pageSettings);
                }

Exception is:

System.InvalidOperationException: No process is associated with this object.
at System.Diagnostics.Process.EnsureState(System.Diagnostics.State state) at offset 65
at System.Diagnostics.Process.get_HasExited() at offset 16
at ChromeHtmlToPdfLib.Converter.get_IsChromeRunning() at offset 10
at ChromeHtmlToPdfLib.Converter.Dispose() at offset 36
at ..Convertors.ChromePdfConvertor.ConvertHtmlToPdf(.*.Models.PdfDocument document, ..Models.PdfConvertEnvironment environment,
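
As far as I can tell from the trace, the exception comes from reading `HasExited` on a `Process` object that was never started. Below is a minimal sketch of a defensive check. `ProcessGuard` and `IsRunning` are hypothetical names of my own; I have not verified how `IsChromeRunning` is actually implemented in the library.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not the library's actual IsChromeRunning code.
public static class ProcessGuard
{
    public static bool IsRunning(Process process)
    {
        if (process == null)
            return false;

        try
        {
            return !process.HasExited;
        }
        catch (InvalidOperationException)
        {
            // Thrown when no OS process is associated with the object,
            // for example because Start() was never called or it failed.
            return false;
        }
    }
}
```

A check like this would only hide the symptom in Dispose(); the underlying question of why Chrome did not start in the container would still be open.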

Google Chrome is installed in the container.

Here is the Dockerfile:

FROM mcr.microsoft.com/dotnet/core/runtime:2.2-stretch-slim AS base

RUN apt-get update && apt-get install -y gnupg2

RUN apt-get update && \
    apt-get install -y curl

RUN curl -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
  echo "deb http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list && \
  apt-get update && \
  apt-get install -y 'google-chrome-stable' && \
  rm -rf /var/lib/apt/lists/*


ENTRYPOINT ["dotnet", "Console.dll"]

Please help me solve this exception.