Thursday, August 14, 2008

PDF to HTML Conversion Open Source

Pdftohtml is a utility based on the Xpdf package that converts PDF documents into HTML and XML formats.

The latest release is 0.36. It is based on Xpdf 2.02 by Derek Noonburg.

(In 2001-2002 I had a big collection of PDFs from the web, over 4 GB, gathered thanks to Google's filetype:pdf search. I had to search for text in them, and there was no tool for that at the time. So I used pdftohtml to convert all the PDFs into HTML; it puts an HTML text file alongside each PDF, following the folder structure. Then I used grep and grep32 on Win98, since only XP had a built-in equivalent, which I think is "findstr" or something similar.

Then in 2003 I made a VB .NET GUI for this DOS setup. It is a pretty clumsy PDF search that I rigged up, but it solved my problem and found my datasheets for any part number. See PDF Search - Dotnet Program with Source.)
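
The convert-then-grep workflow described above can be sketched roughly in Python. This is only an illustration, not the original DOS batch setup: the function names are made up for this example, and the sketch assumes that a pdftohtml binary is on the PATH and accepts an input PDF followed by an output base name. Check your own build before relying on that.

```python
import os
import subprocess


def find_pdfs(root):
    """Yield the path of every .pdf file under root, walking subfolders."""
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".pdf"):
                yield os.path.join(folder, name)


def convert_all(root, converter="pdftohtml"):
    """Run the converter on each PDF, writing the output next to the PDF.

    Assumption: the converter takes "<input.pdf> <output-base>" arguments.
    """
    for pdf in find_pdfs(root):
        out_base = os.path.splitext(pdf)[0]
        subprocess.run([converter, pdf, out_base], check=False)


def search_html(root, text):
    """Return sorted paths of .html/.htm files under root containing text.

    The match is case-insensitive, roughly what "grep -i -l" would report.
    """
    needle = text.lower()
    hits = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith((".html", ".htm")):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if needle in handle.read().lower():
                    hits.append(path)
    return sorted(hits)
```

Calling convert_all on the top folder once and then search_html for each part number gives the same two-step flow: the slow conversion happens once, and every later search only reads plain text files.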

Tomahawk PDF - Native Winds

Tomahawk PDF+ is the freeware version of Tomahawk Gold. The freeware version may not have all the bells and whistles of the full version of Tomahawk Gold, but it is still one of the most advanced word processing/PDF creation software packages available for the Windows platform. This comfortably designed program, with a familiar and easy-to-use interface, allows documents to be produced and saved in a number of formats, including rich text, exported as HTML, or converted to PDF.

Tomahawk PDF+ runs on all Windows systems from Win9x up to Win2003.
And now Tomahawk PDF+ also runs on Linux systems using Wine 0.9.19!

Native Winds

Jack started his programming experience on an old "Trash-80" in 1982 using Basic as a programming language. His first program was a small address database that a local veterans organization in Colorado Springs had requested. As computer technology advanced, Jack upgraded his equipment and programming software.

Tomahawk PDF - Native Winds LLC

Wednesday, August 13, 2008

BlockNote wysiwyg page editor

Extremely easy to use Web page editor. Designed for people who create content for the Web, BlockNote is compact, fast and as easy to use as a word processor. Its simple, user-friendly interface enables even the most novice computer user to import, edit and format web pages without having to learn any HTML!

  • View and Edit your web pages in a friendly WYSIWYG (What You See Is What You Get) environment.
  • Format text, including font, size, color, bullets and more.
  • Insert and Format professional looking tables.
  • Select size, shade, borders, custom background image, and other table and cell attributes.
  • Insert images, controlling size, alignment, and borders.
  • Insert hyperlinks, or apply a hyperlink to an image, for a professional-looking web document.
  • Utilize BlockNote's intelligent cut-and-paste feature to transfer formatted text or data.
  • Rely on BlockNote's powerful Spellcheck engine for error free content.


The right-click "Copy HTML Source" command avoids messing up the code, which is useful for blog posts. It also cleans up the code, and even cleans some Unicode errors left by other editors, which may cause problems with some hosts. In the left pane you can browse your folder of HTML docs and edit them. Get it here at BlockNote.

When HTML with embedded CSS and JS is copied, pasted and saved in BlockNote, the code is cleaned down to simple HTML, including Unicode cleaning, which is useful for posting clean HTML. It is also useful to clean HTML this way before BBCode conversion for forum posts. The BBCode conversion can be done with TextPad macros or ReplaceEm.

The best feature is the right-click Copy HTML Source, which is great for posting in blogs. The file folder view on the left and the tabs are also priceless.

Tuesday, August 12, 2008

The UNIX System

The Open Group is a vendor- and technology-neutral consortium, whose vision of Boundaryless Information Flow will enable access to integrated information within and between enterprises based on open standards and global interoperability.

Wednesday, August 06, 2008

AI Applications to query Search Engines

AI, or artificial intelligence, applications could be used to generate the search string for search engines. This may help ordinary people get what they want on the internet without having to rack their grey matter to improve their search results. The strength of an AI application depends on a human-built database of knowledge, with a capability to learn through the internet or by collaborative computing.
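
As a very small sketch of the idea, the Python below expands a plain request into a search string using a hand-built table of related terms, and lets the table be taught new terms. It is not intelligent by itself; it only shows the shape of "human-built knowledge plus learning". The table contents and the function names build_query and learn are made up for this illustration.

```python
# Hypothetical hand-built knowledge base: plain word -> related search terms.
KNOWLEDGE = {
    "fix": ["repair", "troubleshooting"],
    "pc": ["computer", "desktop"],
}


def build_query(request, knowledge=KNOWLEDGE):
    """Turn a plain-language request into a search string.

    Each word the knowledge base recognises is expanded into an OR group;
    unknown words pass through unchanged.
    """
    parts = []
    for word in request.lower().split():
        related = knowledge.get(word)
        if related:
            parts.append("(" + " OR ".join([word] + related) + ")")
        else:
            parts.append(word)
    return " ".join(parts)


def learn(word, term, knowledge=KNOWLEDGE):
    """Teach the knowledge base one more related term for a word."""
    terms = knowledge.setdefault(word.lower(), [])
    if term not in terms:
        terms.append(term)
```

The person only types the plain request; the program adds the alternative words a practised searcher would have tried.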

I also have a small DOS-based AI program made by Greg Leedberg, called Daisy, which works pretty well. I taught it many things for over a year, and now it comes up with interesting things I have never heard about.

(Idea - Oct-05, Posted in Ideas Blog, Revised again - 4 April 2008, Nature - Open Source SAN Gear )

Tuesday, August 05, 2008

Secure Authenticated RSS-XML Feeds and Feedreaders

Suppose you want a simple means of communicating online with a known individual, without the routine task of checking or sorting mail. There is a solution we already know: instant text messaging. Spam there is very rare.

Now suppose an alert comes by email about some secure site. You first have to find out whether it is real or a phishing spoof, so you go to the secure site in your browser and verify. Communication of this kind should therefore be a combination of instant messaging and email. If the secure site sends an IM plus an email, then you know it is real, as you have only known contacts on your IM.

Now go one step further: the secure site where you have your account could offer a secure, encrypted RSS-XML feed of your account that can only be read with a user ID and password.

Then you have a feed reader for secure sites, with provision for authentication. You add your account's feed and set up the login. Once a day or once a week it reads the encrypted private feed, and you are updated about the status of your secure account. The secure site can also grant access to the client feed reader itself, by noting the program's serial number, vendor number and the NIC number.
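
A minimal sketch of such a feed reader in Python follows. It makes assumptions the idea above does not fix: that the site protects the feed with HTTP Basic authentication over HTTPS, and that the feed is plain RSS 2.0. The function names and any feed URL you pass in are hypothetical, and the serial-number and NIC check is left out.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_feed_request(feed_url, user_id, password):
    """Build a request for a private feed using HTTP Basic authentication.

    Assumption: the site protects the feed with Basic auth over HTTPS.
    """
    if not feed_url.lower().startswith("https://"):
        raise ValueError("private feeds should only be read over HTTPS")
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8"))
    request = urllib.request.Request(feed_url)
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def item_titles(rss_xml):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="") for item in root.iter("item")]


def read_private_feed(feed_url, user_id, password):
    """Fetch the feed and return its item titles (needs network access)."""
    request = build_feed_request(feed_url, user_id, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return item_titles(response.read())
```

Running read_private_feed from a daily or weekly scheduled task gives the "once a day or once a week" check. Refusing plain http:// URLs keeps the user ID and password from travelling unencrypted.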

(Idea - Wednesday, December 27, 2006, Revised - 31 March 2007, Nature - Open Source, Person - SAN Gear)

Coat Pocket Computer - dapj Open Design

I found that many handheld computers are difficult to use, and the user experience is a bit like wearing shoes two sizes too small.

So I thought I would give some feedback on how small a computer could be without any compromise on the present PC OS or program base, so that we can use the same things we have on the home PC.

That led to an open concept design which is quite possible. I noted the points that can give some relief from the suffocation of small sizes while still maintaining mobility, for people used to wearing big coats.

From Nomadic Computing

Coat Pocket Computer - dapj Open Design - Thursday, October 11, 2007

This post is mirrored here as it is related to software too


I saw a review of the Sharp/Willcom W-ZERO3 (image below). The layout is good but the size is a bit small. If it were 60% bigger, it could use a regular desktop OS and programs (muntzed). As HDDs, CPUs and batteries have become smaller, more efficient and better, I think this is close to the ideal handheld.

From Nomadic Computing

Saturday, August 02, 2008

PDF Search - Dotnet Program with Source

If you have a collection of many PDF files, this tool will be useful for searching them for a text string. The tool first does a PDF-to-HTML conversion and then searches the result. I built it for learning.

The tool is made with VB .NET (VB7). The source code of PDF Search is open source; the components used in this tool are not open source.

From SAN Gear - Software Developer and Web Design Resource


The computer should be Win98/Pentium II or better, with the .NET Framework installed. This is a 20 MB install file, dotnetfx.exe; it should be on the Microsoft site, or you can find it at PCWorld, Tucows, WebAttack or through Google. You also need the Windows Installer files InstMsiA.exe and InstMsiW.exe, which you will have to get from elsewhere.

The program is alpha and may look odd to many, but it works and does its job with a lot of workarounds.

The Source Code is this pdfser_net.zip on this Page as an Attachment. - Web Deuce - DotNet

For this software to work, the folder names and PDF file names must not contain spaces. For example, you can't use the 'My Documents' folder. Make a folder like c:\pdf_books, put as many folders as you want inside it, nested up to 3 deep, and put your PDF files in them. Rename any PDF files that have spaces in their names: if you have a file good book.pdf, rename it as good_book.pdf.
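
Renaming by hand gets tedious for a big collection. A small Python sketch that does the same renaming is below; it is not part of the original program, and the function name remove_spaces is made up for this example.

```python
import os


def remove_spaces(root):
    """Rename files and folders under root so that no name contains a space.

    Walks bottom-up so that children are renamed before their parent
    folder's path changes. Returns a list of (old_path, new_path) pairs.
    The root folder itself is not renamed.
    """
    renamed = []
    for folder, subdirs, files in os.walk(root, topdown=False):
        for name in files + subdirs:
            if " " not in name:
                continue
            old_path = os.path.join(folder, name)
            new_path = os.path.join(folder, name.replace(" ", "_"))
            if os.path.exists(new_path):
                continue  # never overwrite an existing file or folder
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

Try it on a copy of the folder first. It renames every file and folder whose name contains a space, not only PDFs, and skips a name if the underscore version already exists.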

The command line components used in this program are

pdf to html PDFTOHTML conversion program

grep32 or try

(please note this was done many years back - ananth )

If you are learning like me, download the source code. It is not much, as I can only write a few lines, and you are free to use it any way you like.

Copernic Desktop Search (CDS) came soon after this program of mine, and other desktop search tools came later. If that's true, I am an inventor! Or at least a brainstormer.

URL mdb - Document Bookmarks Database

This is also open source. It is a browser with a bookmark manager that uses an mdb file, the database file format used in MS Access. The database of bookmarks can be searched and updated using the built-in browser. It can also be used as a document manager.

This software can store your keywords with each bookmark, and it can also store offline copies of web pages and record them in the database. Documents on your hard disk can be indexed and stored as local URLs in the same database, with keywords. It is a .NET program I made to build a website database and save a snapshot of each web page. It uses SQL.
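
The bookmark table and keyword search can be sketched in a few lines. Note the swap: this sketch uses Python's built-in sqlite3 module instead of the Access mdb file and Jet engine the real program uses. The table layout, column names and function names are assumptions for illustration, not the schema of url_data.mdb.

```python
import sqlite3


def open_bookmarks(path=":memory:"):
    """Open (or create) the bookmark database and return the connection."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks ("
        " url TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " keywords TEXT NOT NULL DEFAULT '',"
        " snapshot_path TEXT)"
    )
    return db


def add_bookmark(db, url, title, keywords="", snapshot_path=None):
    """Insert a bookmark, or replace it if the URL is already stored."""
    db.execute(
        "INSERT OR REPLACE INTO bookmarks VALUES (?, ?, ?, ?)",
        (url, title, keywords, snapshot_path),
    )
    db.commit()


def search_bookmarks(db, word):
    """Return (url, title) rows whose title or keywords contain word."""
    pattern = "%" + word + "%"
    rows = db.execute(
        "SELECT url, title FROM bookmarks"
        " WHERE title LIKE ? OR keywords LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return rows.fetchall()
```

Web pages and local documents share one table: a file on the hard disk is simply stored with a file:// URL, and snapshot_path points at the saved offline copy when there is one.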

From SAN Gear - Software Developer and Web Design Resource


You will need the below things in your computer for this program to work

  • win98/pentium 2 or better
  • IE5 or better
  • dot net framework dotnetfx.exe
  • ms data access components mdac_typ.exe
  • ms jet data base engine jetsetup.exe
  • windows installer InstMsiA.exe and InstMsiW.exe

You may have to get the above from the internet or from magazine CDs.

The Source Code is urlmdb.zip on this Page as an Attachment. - Web Deuce - DotNet

Note - To start it as an exe, create a file url_data.mdb with a few records and put it in /bin

You are free to use the source code of the program the way you like.

Friday, August 01, 2008

Google Reader in Content Presentation

Google Reader can be used as a dynamic content presentation tool in plain HTML pages. It has to be used in combination with the Ajax API, with your blog's feed set to Full. It has the advantage that only browsers with JavaScript enabled can view the content. As a result, search engines may not be able to index it.

Each site has to have its own unique API key. A key for a top-level domain may work on folders and subdomains; this is what I have learned. Once you make tags in Reader, make the tags public and use the feed URL of each tag on different pages or sections. Now, just by tagging your posts in Reader, you present your pages to visitors. A small site can be managed without editing any page! But I am not sure if this can be a CMS idea.

Reader also has a "Share with note" feature that can be published with the Ajax API. By using the bookmarklet "Note in Reader", you can comment on nice pages on the web. Your note and a teaser snippet of that site will be seen on the Ajax page you made. Set Description to full.

( Idea date :11:21 Dec-15 2007, Person : Anantha Narayan, Type : Open Source)

Computer Hardware Checklist

If you are buying a new PC, make sure it has the essential specs below.

1. Serial Port-RS232 port, 1No., for modem, mouse or peripheral.
2. Parallel-Centronics port, 1No., for printer, external drive, peripheral.
3. PS2 ports- 2 Nos., one for keyboard and one for mouse.
4. LAN Card 10-100 Mbps, with windows, linux and unix drivers in floppy.
5. Mouse with web scroll wheel and optical ball-less sensor for PS2.
6. CDROM-RW drive 48X or less with ability to burn normal CDROM and RW.
7. CDROM-48x drive with DVD reading capability if you need.
8. Speakers with woofer and a small microphone.
9. Motherboard should have integrated video and sound.
10. PS2 keyboard with buttons for cut-paste, windows control.
11. Monitor that supports 1024*768, flat, black, anti-glare.
12. Sturdy Cabinet with reset and power buttons in front and good SMPS.
13. USB ports 2 in back and 2 in front of PC or use a USB Hub.
14. Floppy drive for start up, testing and troubleshooting.
15. Modem internal or external 56 kbps with drivers for linux and windows.
16. Motherboard with 256MB ram with 4 PCI slots and drivers for linux and windows.
17. UPS with built in mains voltage regulator, surge suppressor, and high voltage cut off.

Others : printer cable, serial cable, USB hub, printer, scanner, midi port.

Revised January 2004 - this list may be partially outdated