PDF to HTML Conversion Open Source
pdftohtml is a utility based on the Xpdf package that converts PDF documents into HTML and XML formats.
The latest release is 0.36. It is based on Xpdf 2.02 by Derek Noonburg.
(In 2001-02 I had a big collection of PDFs from the web, over 4 GB, thanks to Google's filetype:pdf. I had to search for text in them and there was no tool for that at the time, so I used pdftohtml to convert every PDF into HTML. It puts an HTML text file alongside each PDF, following the folder structure. Then I searched the HTML with grep and grep32 on Windows 98, since only XP had a built-in grep-like tool, which I think was "findstr" or something.
Then in 2003 I made a VB.NET GUI for this DOS setup. It is a pretty clumsy PDF search that I rigged up, but it solved my problem and found the datasheet for any part number. PDF Search - Dotnet Program with Source.)
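The convert-then-grep workflow described above can be sketched in a few lines of Python. This is only a rough illustration, not the original DOS batch or the VB.NET program: the function names (`find_pdfs`, `build_command`, `convert_all`, `search_html`) are made up for this example, and it assumes a `pdftohtml` binary on the PATH that accepts `-noframes` and, when given no output name, writes the HTML next to the PDF. Check `pdftohtml -h` on your version before relying on that.

```python
import subprocess
from pathlib import Path


def find_pdfs(root):
    """Return every PDF under root, matching the extension case-insensitively."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def build_command(pdf_path):
    """Build the pdftohtml command line for one PDF.

    Assumption: -noframes produces a single <name>.html, and with no
    output name given the file lands next to the PDF.
    """
    return ["pdftohtml", "-noframes", str(pdf_path)]


def convert_all(root):
    """Run pdftohtml on each PDF under root; return the PDFs that failed."""
    failed = []
    for pdf in find_pdfs(root):
        result = subprocess.run(build_command(pdf), capture_output=True)
        if result.returncode != 0:
            failed.append(pdf)
    return failed


def search_html(root, term):
    """Return the PDF paths whose sibling .html file contains term.

    This stands in for the grep step: a plain case-insensitive
    substring search over the converted HTML text.
    """
    needle = term.lower()
    hits = []
    for html in sorted(Path(root).rglob("*.html")):
        text = html.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            hits.append(html.with_suffix(".pdf"))
    return hits
```

Typical use would be to call `convert_all("datasheets")` once, then `search_html("datasheets", "LM317")` for each part number. Note that the search maps each matching HTML file back to a lowercase `.pdf` name, so a file saved as `X.PDF` would need extra handling on a case-sensitive filesystem.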




