Inanna PDF Renderer
A lightweight, containerized HTML-to-PDF rendering microservice built with WebKitGTK and Go, optimized for low CPU/memory usage, fast response times, and scalable cloud deployments.
🧭 Overview
This project provides a microservice that converts HTML documents to PDF using WebKitGTK in a headless environment. It is designed for high-concurrency, container-based infrastructure where latency and resource usage are critical factors.
The rendering logic is implemented in C, directly invoking WebKitGTK APIs, and exposed to the outside world via an HTTP service in Go using Gin. Communication between Go and C is done via stdin/stdout, with no intermediate disk usage when used in full-streaming mode.
📊 Performance Comparison: WebKit vs Chromium
To demonstrate the efficiency of the WebKit-based approach, we've benchmarked our service against a comparable Chromium/Puppeteer implementation. Both services render the same 100-line invoice HTML document under identical container conditions.
Test Configuration
- Document: 100-line invoice with complex HTML/CSS layout
- WebKit Service: Port 8080 (C + WebKitGTK + Go)
- Chromium Service: Port 8081 (Node.js + Puppeteer + Chromium)
- Infrastructure: Docker containers on macOS (ARM64)
Rendering Time Performance
| Metric | WebKit (C) | Chromium (Puppeteer) | WebKit Advantage |
|---|---|---|---|
| Render Time | 2,694 ms | 3,891 ms | 🚀 31% faster |
| PDF Size | 174 KB | 143 KB | Comparable (WebKit output ~22% larger) |
In this benchmark, the WebKit implementation rendered the 100-line invoice in 31% less time than Chromium (2,694 ms vs 3,891 ms), which translates into higher throughput in high-volume scenarios.
Resource Utilization Analysis
Container Size & Disk Usage
| Metric | WebKit | Chromium | Difference |
|---|---|---|---|
| Docker Image Size | ~450 MB | ~680 MB | -34% (230 MB saved) |
| Base Dependencies | WebKitGTK + GTK3 + Cairo | Chromium + Node.js + npm packages | Smaller footprint |
| Cold Start Time | ~2-3s | ~4-6s | ~50% faster |
WebKit's image is 230 MB smaller, reducing storage costs, pull times, and deployment overhead in Kubernetes/cloud environments.
CPU Usage Comparison
[Figure: WebKit CPU usage during 100-line invoice rendering]
[Figure: Chromium CPU usage during the same workload]
Key Observations:
- WebKit CPU pattern: Sharp, focused spikes during rendering (~2-3s duration), then immediate drop to baseline. Efficient resource cleanup.
- Chromium CPU pattern: Prolonged elevated CPU usage (~4-5s duration), slower return to idle state. More background processes and overhead.
- Peak CPU: WebKit reaches higher instantaneous peaks but completes faster. Chromium maintains moderate-high CPU for longer periods.
- Efficiency: WebKit's total CPU-seconds consumed is ~25-30% lower due to faster execution despite higher peak usage.
Memory Footprint
| Phase | WebKit | Chromium | Difference |
|---|---|---|---|
| Idle Memory | ~80 MB | ~120 MB | -33% (40 MB) |
| Peak Rendering | ~150 MB | ~280 MB | -46% (130 MB) |
| Post-Render | ~85 MB | ~140 MB | -39% (55 MB) |
Memory efficiency highlights:
- WebKit maintains a 46% smaller memory footprint during active rendering
- Chromium's V8 engine + Node.js runtime adds significant baseline overhead
- WebKit's C implementation has minimal garbage collection overhead
- Better memory locality due to direct WebKit API calls vs. Puppeteer's IPC layers
Cost Implications
For a high-throughput service processing 10,000 PDFs/day:
| Resource | WebKit Cost | Chromium Cost | Annual Savings |
|---|---|---|---|
| Compute (based on render time) | Baseline | +44% CPU hours (3,891 ms vs 2,694 ms) | ~$450-900/year |
| Memory (based on peak usage) | Baseline | +87% RAM allocation (280 MB vs 150 MB) | ~$300-600/year |
| Storage (image deployment) | Baseline | +230 MB × replicas | ~$50-150/year |
| Network (image pulls) | Baseline | +51% bandwidth (680 MB vs 450 MB) | ~$100-200/year |
Total Estimated Savings: $900-1,850/year for a modest workload, scaling linearly with volume.
When to Choose Each Approach
Choose WebKit (this project) when:
- ✅ Performance and resource efficiency are critical
- ✅ Rendering standard HTML/CSS documents (invoices, reports, tickets)
- ✅ Running in resource-constrained environments (edge, IoT, budget clouds)
- ✅ High-volume batch processing with tight SLAs
- ✅ Minimizing cloud infrastructure costs
Choose Chromium/Puppeteer when:
- ⚠️ Rendering complex modern web applications with heavy JavaScript
- ⚠️ Testing compatibility with Chrome-specific browser APIs
- ⚠️ Screenshot generation or browser automation workflows
- ⚠️ Development team has existing Puppeteer/Playwright expertise
Conclusion
In this benchmark, the WebKit-based approach delivered 31% faster rendering, 46% lower peak memory usage, and a 34% smaller container image than Chromium, which makes it a strong choice for production PDF generation workloads where performance and cost efficiency matter.
🔍 Why WebKitGTK?
- No external binary dependency: Unlike approaches using wkhtmltopdf or qt-webengine-based tools, this service builds its rendering engine directly from source or linked dev libraries, avoiding dependency on fixed CLI tools.
- Faster cold starts: WebKitGTK libraries initialize more efficiently for minimal documents compared to Qt-based solutions.
- Standard-compliant rendering: Uses the same rendering engine behind Safari and GNOME Web (Epiphany).
- Smaller memory footprint: QtWebEngine often pulls in Chromium internals; WebKitGTK remains significantly more lightweight.
⚙️ Architecture
┌────────────────────┐
│ HTML over POST │
└────────┬───────────┘
│
┌──────▼───────┐
│ Go (Gin) │
│ HTTP Layer │
└──────┬───────┘
│ stdin
▼
┌──────────────────────────┐
│ C binary (WebKit) │
│ Loads HTML → PDF via │
│ WebKitWebView + Cairo │
└──────────┬───────────────┘
│ stdout
▼
┌────────────┐
│ PDF Output │
└────────────┘
📁 Directory Structure
| Directory / File | Description |
|---|---|
| Go Service (WebKit) | |
| cmd/server/ | Main entry point with main.go for the Go HTTP server |
| internal/renderer/ | Worker pool implementation for concurrent PDF rendering |
| csrc/render_to_pdf.c | C source code for PDF generation using WebKitGTK |
| go.mod, go.sum | Go dependency management |
| Node.js Service (Chromium) | |
| src/render-chromium.js | Express.js server using Puppeteer for Chromium-based PDF rendering |
| package.json | Node.js dependencies (Express, Puppeteer-core) |
| Docker Infrastructure | |
| Dockerfile | Multi-stage build for Go + C + WebKitGTK service |
| Dockerfile.chromium | Alpine-based build for Node.js + Chromium service |
| docker-compose.yml | Orchestration for both services (ports 8080 and 8081) |
| Build & Test Scripts | |
| build/dev.sh | Build and deploy both services with health checks |
| build/build.sh | Build WebKit service only |
| build/test.sh | Simple test for WebKit service |
| build/test-factory-invoices.sh | Advanced test script supporting both services with parameters |
| build/entrypoint.sh | Container initialization script for WebKit service |
| Test Data | |
| html/ | Sample HTML files for testing (test.html, doc.html, simple-2pages.html) |
| html/invoice-factory/ | Invoice generator with Node.js + Tailwind CSS |
| output/ | Generated PDF output directory |
🚀 Build, Deploy and Test
Inanna provides two PDF rendering services running in parallel, allowing you to compare performance and choose the best fit for your workload.
Quick Start - Deploy Both Services
The fastest way to get both services running:
./build/dev.sh
What this does:
- Rebuilds both Docker images (WebKit and Chromium)
- Stops any running containers
- Starts both services in detached mode
- Performs health checks on both endpoints
- Reports service status
Expected output:
🚀 Dev cycle: rebuild + deploy
📦 Step 1/2: Rebuilding...
[+] Building ... (WebKit and Chromium services)
🔧 Step 2/2: Starting services...
⏳ Waiting for services to be healthy...
🏥 Checking pdfgen (Go + C)...
✅ pdfgen ready at http://localhost:8080
🏥 Checking pdfgen-chromium (Node.js + Puppeteer)...
✅ pdfgen-chromium ready at http://localhost:8081
Individual Service Build
If you only need the WebKit service:
./build/build.sh
This builds only the WebKit-based service (faster build time).
Testing the Services
Basic Test (WebKit only)
./build/test.sh
Sends csrc/test.html to the WebKit service and saves output to output/generated.pdf.
Advanced Testing - Compare Both Services
Use the test-factory-invoices.sh script to test either service with different invoice sizes:
Syntax:
./build/test-factory-invoices.sh [OPTIONS]
Options:
-d20, -d50, -d100 Invoice size (20, 50, or 100 lines)
-t1 Use WebKit service (port 8080) - default
-t2 Use Chromium service (port 8081)
Examples:
# Test WebKit with 20-line invoice (default)
./build/test-factory-invoices.sh
# Test WebKit with 100-line invoice
./build/test-factory-invoices.sh -d100 -t1
# Test Chromium with 50-line invoice
./build/test-factory-invoices.sh -d50 -t2
# Test Chromium with 100-line invoice (performance comparison)
./build/test-factory-invoices.sh -d100 -t2
Output files are saved to ./output/ with the naming convention:
- generated-invoice-{size}-webkit.pdf
- generated-invoice-{size}-chromium.pdf
Manual Testing with curl
Both services accept the same request format for easy comparison:
WebKit Service (Port 8080):
curl -X POST \
-H "Content-Type: application/json" \
--data-binary @html/test.html \
-o output/manual-webkit.pdf \
"http://localhost:8080/render?timing=true"
Chromium Service (Port 8081):
curl -X POST \
-H "Content-Type: application/json" \
--data-binary @html/test.html \
-o output/manual-chromium.pdf \
"http://localhost:8081/render?timing=true"
Response Headers:
- Content-Type: application/pdf
- X-Render-Time-Ms: 2694 (rendering duration in milliseconds)
🔌 API Endpoints
Both services expose identical HTTP interfaces, making them interchangeable for A/B testing and performance comparison.
WebKit Service - Port 8080
Base URL: http://localhost:8080
POST /render
Converts HTML to PDF using WebKitGTK rendering engine.
Request:
- Method: POST
- Content-Type: application/json or application/octet-stream (flexible)
- Body: Raw HTML content (see HTML format requirements below)
- Query Parameters:
  - timing=true (optional) - Returns rendering time in response headers
Response:
- Content-Type: application/pdf
- Headers:
  - X-Render-Time-Ms: {milliseconds} - Time taken to render the PDF
  - Content-Length: {bytes} - PDF file size
- Body: Binary PDF stream
Example:
curl -X POST \
-H "Content-Type: application/json" \
--data-binary @your-file.html \
-o output.pdf \
"http://localhost:8080/render?timing=true"
Chromium Service - Port 8081
Base URL: http://localhost:8081
POST /render
Converts HTML to PDF using Chromium/Puppeteer rendering engine.
Request/Response: Identical to WebKit service (see above)
Additional Features:
- Supports modern CSS Grid/Flexbox with Chromium standards
- Better compatibility with complex JavaScript-heavy templates
- Higher memory usage but predictable Chromium behavior
GET /health
Health check endpoint for both services.
Response (as returned by the Chromium service):
{
"status": "healthy",
"service": "chromium-renderer"
}
HTML Format Requirements
Both services expect self-contained HTML documents with all assets inlined. External resources (images, fonts, stylesheets) should be embedded.
✅ Supported HTML Format:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<style>
/* All CSS must be inline */
@page {
size: A4;
margin: 20mm;
}
body {
font-family: Arial, sans-serif;
color: #333;
}
.invoice-header {
background: #f0f0f0;
padding: 20px;
}
</style>
<script>
// JavaScript is supported (for dynamic content generation)
// Note: External script URLs may time out or fail in headless mode
document.addEventListener("DOMContentLoaded", function () {
console.log("Document loaded");
});
</script>
</head>
<body>
<div class="invoice-header">
<h1>Invoice #12345</h1>
</div>
<!-- Images must be base64-encoded or use data URIs -->
<img
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
alt="Company Logo"
/>
<!-- Tables, complex layouts, QR codes all supported -->
<table>
<tr>
<td>Item</td>
<td>Price</td>
</tr>
</table>
</body>
</html>
📋 Best Practices:
- Inline all CSS: Use <style> tags, avoid external stylesheets
- Embed images: Convert to base64 data URIs or use inline SVG
- Self-contained fonts: Use web-safe fonts or embed with @font-face + base64
- Page size: Use CSS @page rules to control PDF dimensions
- Print styles: Add @media print rules for printer-friendly layouts
- No external dependencies: CDN links may time out in headless mode
❌ Avoid:
- External CSS files (<link rel="stylesheet" href="...">)
- External images hosted on remote servers
- Large JavaScript frameworks loaded from CDN (bundle locally if needed)
- Animations or transitions (will be static in PDF)
Example of a complete invoice:
See html/invoice-factory/ for a production-ready example with Tailwind CSS compiled inline.
🔒 Security and Performance
- WebKit rendering runs headless against a virtual X display provided by xvfb, which is initialized once at container start. xvfb is a display server rather than a security sandbox, so isolation depends on the container boundary.
- No intermediate temp files are required in full-streaming mode.
- Memory footprint is minimized via stdin/stdout piping and container optimizations.
- Errors are propagated from C to Go transparently for observability.
📦 Future Plans
- Add queueing / task IDs for batch processing.
- Embed metrics endpoint via Prometheus.
- Support page format & margins via POST parameters.
- Graceful rate-limiting / autoscaling support in cloud-native deployment.
📝 License
MIT or commercial dual-license TBD.