Slamd64

Please see www.slamd64.com for information on Slamd64.

Converting PDF to HTML
(posted by Fred Emmott at 2007-10-25 18:34:08)

Earlier today, I saw the question "Can imagemagick convert pdf to html?" on IRC. In response, I present the world's worst PDF->HTML converter:

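#!/bin/sh
# Takes the path of one PDF file as its only argument.
# Note: none of the expansions below are quoted, so file names containing spaces will break.
BN=$(basename $1 .pdf)
# Convert the PDF to PostScript, then render each page as an 800-pixel-wide PPM image.
pdf2ps $1
pstopnm -xsize 800 $BN.ps
# Start the HTML file.
cat > $BN.html <<EOF
<html>
<head>
<title>$1</title>
</head>
<body>
EOF
# Convert each page image to PNG and embed it inline as a base64 data URI.
for file in $BN*.ppm; do
	out=$(basename $file .ppm).png
	convert $file $out
	echo "<img src='data:image/png;base64,$(base64 $out)'>" >> $BN.html
done
# Close the HTML file.
cat >> $BN.html <<EOF
</body>
</html>
EOF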

Comments

Updated version

Posted at 2007-10-25 19:37:59 GMT +0000 by "Fred Emmott" (openid http://fredemmott.co.uk/)

An updated version that actually produces readable text is at http://files.fredemmott.co.uk/pdf2html.sh

Quotes

Posted at 2007-10-26 07:37:03 GMT +0000 by "Anonymous"

You're missing quotes around the filenames. Try passing a file with spaces in its name to your script to see what I mean. And pdftohtml has been around for ages, so your script is not really the first thing to do that...
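
To illustrate the quoting point, here is a minimal sketch. The helper name html_name and the sample file name are made up for the example and are not part of the posted script; it only shows how the first step (deriving the output name) behaves once every expansion is quoted.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): derive the HTML output
# name from a PDF path, with every expansion in double quotes.
html_name() {
	# Unquoted, $1 would be split on whitespace and basename would see
	# several arguments; quoted, the whole name is passed as one word.
	bn=$(basename "$1" .pdf)
	printf '%s.html\n' "$bn"
}

# A file name with a space in it survives intact.
html_name "my report.pdf"
```

The same change (double quotes around $1, $BN, $file and $out) would apply to every other line of the script that touches a file name.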

Re: Quotes

Posted at 2007-10-26 15:10:43 GMT +0000 by "Fred Emmott" (openid http://fredemmott.co.uk/)

It's not meant to be of practical use. It's a joke; there's a reason I labelled it "the world's worst".
