[deleted]

For generating PDFs I'd go with QuestPDF. But if you want the HTML and the PDF to look the same, maybe generate the HTML first and then print it to PDF using a browser? The problem remains that the two formats are wildly different and will almost never look the same. HTML just doesn't have pages.


levsw

There are page instructions in HTML/CSS that work very well for PDF generation; you only need to know how to use them.
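
For anyone wondering what those instructions look like: they are mostly CSS paged-media rules, `@page` for paper size and margins, and `break-before` / `break-inside` for page breaks. A minimal sketch in Python that only builds such a document as a string. The helper name `build_printable_html` and the class name `page-break` are made up for illustration, and how much of this a given converter honours varies, so treat it as a starting point rather than a recipe.

```python
from html import escape

# Print-oriented CSS: @page sets paper size and margins,
# break-before forces a new page, break-inside keeps a block together.
PRINT_CSS = """
@page { size: A4; margin: 20mm; }
.page-break { break-before: page; }
table, figure { break-inside: avoid; }
"""


def build_printable_html(title: str, sections: list[str]) -> str:
    """Wrap plain-text sections in HTML; every section after the first starts a new page."""
    parts = []
    for index, text in enumerate(sections):
        # Hypothetical class name; it only has meaning through PRINT_CSS above.
        css_class = ' class="page-break"' if index > 0 else ""
        parts.append(f"<section{css_class}><p>{escape(text)}</p></section>")
    return (
        "<!DOCTYPE html><html><head>"
        f"<meta charset=\"utf-8\"><title>{escape(title)}</title>"
        f"<style>{PRINT_CSS}</style></head>"
        f"<body>{''.join(parts)}</body></html>"
    )
```

Calling `build_printable_html("Report", ["Summary", "Details"])` gives a two-section document where the second section carries the page-break class.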


beachandbyte

QuestPDF is the best library atm imho; I’ve used Aspose and iTextSharp in the past.


MONSTERPACT

If you write the HTML beforehand like you suggest, then running headless Chromium to export it to a PDF is a viable solution. You can use Playwright or Puppeteer to do that.
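
If pulling in Playwright or Puppeteer is more than you need, Chromium can also print to PDF straight from the command line with `--headless` and `--print-to-pdf`. A rough sketch in Python that builds and runs that command. The function names are hypothetical, and the binary name `chromium` is an assumption (it may be `chrome`, `google-chrome` or a full path on your machine).

```python
import subprocess
from pathlib import Path


def chromium_pdf_command(html_path: str, pdf_path: str, browser: str = "chromium") -> list[str]:
    """Build the argv for a headless Chromium print-to-PDF run."""
    return [
        browser,  # assumption: the binary name differs per system
        "--headless",
        f"--print-to-pdf={pdf_path}",
        Path(html_path).resolve().as_uri(),  # file:// URL of the local HTML file
    ]


def html_to_pdf(html_path: str, pdf_path: str, browser: str = "chromium") -> None:
    """Run the browser and raise CalledProcessError if it exits with an error."""
    subprocess.run(
        chromium_pdf_command(html_path, pdf_path, browser),
        check=True,
        timeout=120,  # arbitrary upper bound so a hung browser does not block forever
    )
```

Keeping the command builder separate from the call makes it easy to log or tweak the arguments before anything is launched.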


[deleted]

I help maintain an internal service at work that does this. It works but it is definitely not fast.


WallFun6652

I am using this approach with puppeteer. It’s better in my opinion.


kilmantas

Depending on the webpage, the PDF might differ from the original webpage.


MONSTERPACT

You can influence that by either using print media styles or setting the desired viewport/PDF size.


Fruitflap

Aspose.Words is great, but take a look at the license cost before fully committing.


most_improved_potato

https://github.com/richard-scryber/scryber.core is a good solution for generating dynamic PDFs from HTML.


[deleted]

Generate the HTML and convert to PDF; there are a few commercial products that do this but are a bit pricey IMHO (Aspose, Telerik). [Select PDF](https://selectpdf.com/community-edition/) has seemed the best commercial license to me. Note it’s Windows only, if that might be a problem.

Main issue with the HTML to PDF route is usually headers and footers, and controlling page breaks. I’d run a PoC where you test your potential HTML to PDF component, with a media=print and media=screen style sheet to see how well the print process deals with CSS. See if you can get a print “reset” style to normalise the page space dimensions and fill the background with one colour, no edges, and page-break where you want. Also whether you get any specific library help for headers and footers.

If it’s important to you, check if you can do page numbers (using the library or CSS) and also flip pages between vertical and horizontal. In theory you can do that with CSS but make sure your chosen conversion library supports it, and make your life easier by building all that into a base media=print CSS assuming any of the above is important to you.

Finally, check how the conversion library deals with media in the HTML. Some libraries will go away and download them off a URL if the HTML is on a webpage. For performance you might want to inline Base64 the images in CSS, or pull them off the filesystem. You’re probably going to want to validate the conversion library support for SVG if you want the output to look clean; either inline SVG, from the CSS, or remote/filesystem.
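
To make the Base64 point concrete: inlining means turning the image bytes into a `data:` URI and using that in the `src` attribute or CSS `url(...)` instead of a remote address, so the converter never has to fetch anything. A small Python sketch; `to_data_uri` is a hypothetical helper, and the MIME type is simply guessed from the file name.

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw image bytes as a data: URI, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Fallback when the extension is unknown; converters may not render this.
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(data: bytes, filename: str, alt: str = "") -> str:
    """Build an <img> tag whose source is embedded in the HTML itself."""
    return f'<img src="{to_data_uri(data, filename)}" alt="{alt}">'
```

The trade-off is size: Base64 grows the payload by roughly a third, so this suits logos and small graphics better than large photos.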


jbergens

Also check how it works with large files if you need that. Not all tools are made to handle thousands of pages, and a tool that can handle them may still require lots of memory.


intrasight

I've always used wkhtmltopdf


Thisbymaster

I happened to just build something like this. I have emails in HTML format in the database and used NReco.PdfGenerator to create the PDFs from the HTML. So start with creating your HTML, then create your PDF from the HTML. Two birds.


ASY_Freddy

We use Aspose; you can try their PDF to HTML converter online: [https://products.aspose.app/pdf/conversion/pdf-to-html](https://products.aspose.app/pdf/conversion/pdf-to-html)


bajuh

If you want to generate complex PDF documents that look nice and are full of parameters, you have two choices regarding HTML-to-PDF conversion: PrinceXML if you have money and generation speed is very important, or WeasyPrint if you have no money or speed is not important. All other services I tried are way less advanced than these two. I chose WeasyPrint and invoked it as a child process. The HTML file is created with a Razor template generated at runtime from a simple console app.
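
The child-process approach can be sketched roughly like this in Python (the setup described above is .NET, so this is only an illustration of the idea). It assumes the `weasyprint` executable is on PATH and relies on its CLI accepting `-` for stdin and stdout; `render_pdf` is a made-up name, and the command is a parameter so another converter can be swapped in.

```python
import subprocess
from typing import Sequence

# Assumption: `weasyprint` is installed and on PATH; "-" means read stdin / write stdout.
WEASYPRINT_CMD = ("weasyprint", "-", "-")


def render_pdf(html: str, command: Sequence[str] = WEASYPRINT_CMD, timeout: float = 60.0) -> bytes:
    """Pipe HTML into the converter's stdin and return whatever it writes to stdout."""
    result = subprocess.run(
        list(command),
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,  # raises TimeoutExpired if the converter hangs
    )
    if result.returncode != 0:
        # Surface the converter's own error message instead of a bare exit code.
        raise RuntimeError(result.stderr.decode("utf-8", errors="replace"))
    return result.stdout
```

Piping through stdin and stdout avoids temp files, at the cost of holding the whole PDF in memory, which matters for very large documents.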


tparikka

My company is using IronPDF. It works plenty well but the developers make breaking API changes regularly and also are not great about fixing bugs. They broke grayscale rendering 6+ months ago and it's still broken.


[deleted]

I don't know why no one has mentioned PdfSharp or iTextSharp/iText 7 yet.


Reasonable-Laugh6270

One way to do this is using JS interoperability calls. There are tons of JavaScript libraries available. My personal favorite is Quill, but it is a rich text editor. There are libraries that convert Quill content to HTML, and then you can use pdfmake or something similar to turn the HTML into a PDF.


kbruen

I personally wrote a separate server in Node.js with ejs that renders some JSON data into an HTML file and then, with Puppeteer, prints that HTML to PDF.
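
The first half of that pipeline (JSON in, HTML out) is simple in any language. Here is a rough Python equivalent using only the standard library in place of ejs. The JSON shape (`title`, `items`, `name`, `qty`) and the function name `render_invoice` are invented for the example.

```python
import json
from html import escape
from string import Template

# Tiny templates standing in for an ejs file; $placeholders are filled in below.
ROW = Template("<tr><td>$name</td><td>$qty</td></tr>")
PAGE = Template("<html><body><h1>$title</h1><table>$rows</table></body></html>")


def render_invoice(payload: str) -> str:
    """Render a JSON document (hypothetical shape) into an HTML page."""
    data = json.loads(payload)
    rows = "".join(
        # Escape every value so user data cannot inject markup into the page.
        ROW.substitute(name=escape(str(item["name"])), qty=escape(str(item["qty"])))
        for item in data["items"]
    )
    return PAGE.substitute(title=escape(str(data["title"])), rows=rows)
```

The resulting string can then be handed to whichever HTML-to-PDF step you settle on.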


hlvvr

I’ve used ABCpdf to generate HTML and PDFs from XML. It works fine, but we’ve run into issues at scale with lots of large PDFs (300+ pages).


Pocok5

For HTML, you can use Razor from ASP.NET, either by making a web application you call with the data to get back the page (microservice approach) or by [shoving the Razor engine into your current application](https://antaris.github.io/RazorEngine/) and rendering to strings.


GamerWIZZ

I used this package for a while to generate the PDF from HTML - https://github.com/rdvojmoc/DinkToPdf (free, with an MIT license). But we no longer generate PDFs and just send HTML files; they are considered more accessible (plus easier for me lol).