As you may remember from my prior attempt at using AltaVista Search, I ran out of space, and found out that it only serves pages on 127.0.0.1:6688 and is pretty much hardcoded to do so. It’s a “fine” hybrid Java 1.01 application, with the bulk of it being Java. I finally got around to setting up a VM, unpacking all of the utzoo archives, and indexing them. I should have done something about the IO, because this took too long (KVM).
So to cheat the system, I installed stunnel as a simple HTTPS-to-HTTP proxy, which let me access my search VM from anywhere. However, it still embedded 127.0.0.1 in all the pages.
Enter an Apache reverse proxy to talk to stunnel to talk to AltaVista search!
First, enable a few modules. The directives below need at least mod_proxy, mod_proxy_http, and mod_substitute.
Then add this to the config:
ProxyPass "/altavista/" "https://10.12.0.16/"
ProxyPassReverse "/altavista/" "https://10.12.0.16/"
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|file:///C:\Program Files\DIGITAL\AltaVista Search\My Computer\images\|http://debian7/images/|n"
Substitute "s|<a href=http://debian7/altavista/?pg=q&what=0&fmt=d|<!--|n"
This let me redirect all of those requests into a VM called debian7 on the /altavista path. I also copied the images to the Apache server, and now I get something that looks correct!
I cut the results short… But here is a search of something simple:
I also killed all the ‘working’ URLs, which simply open a desktop application on the index ‘server’. Naturally it was meant as a personal service, but as a server this isn’t any good. As a result, you can’t click on any search results now. I need some way to take the result blocks like “u:\b128\comp\databases\2852” and turn them into URLs.
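One possible way to do that mapping, as a rough sketch only: treat the drive letter as the archive root and turn the rest of the path into a URL path. The base URL http://debian7/utzoo/ and the function name are made up for illustration, and this assumes the unpacked archive would also be served over HTTP, which is not part of the setup above.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote

# Hypothetical values: the archive root is taken from the sample result
# path, and the base URL is an assumption (nothing serves it yet).
ARCHIVE_ROOT = PureWindowsPath("u:\\")
BASE_URL = "http://debian7/utzoo/"


def result_path_to_url(result_path, root=ARCHIVE_ROOT, base_url=BASE_URL):
    """Turn a Windows result path into a URL under base_url.

    Raises ValueError if result_path is not located under root.
    """
    relative = PureWindowsPath(result_path).relative_to(root)
    # Percent-encode each component so odd file names stay valid in a URL.
    return base_url + "/".join(quote(part) for part in relative.parts)


if __name__ == "__main__":
    print(result_path_to_url(r"u:\b128\comp\databases\2852"))
```

The rewrite itself would still have to happen somewhere between AltaVista and the browser; this only shows the path arithmetic.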
Also, when I re-index, it would be best to cut off the headers, or most of them, so the preview lines make sense. Xref, Path, even From & Newsgroups don’t interest me.
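A minimal sketch of that header trimming, assuming each article is a text file with "Name: value" header lines, a blank line, and then the body. The oldest utzoo articles may not follow that layout, so treat this as a starting point. The function name and the drop list are my own choices.

```python
# Header names to drop, compared case-insensitively. This list just
# mirrors the ones mentioned above; adjust to taste.
DROP_HEADERS = {"xref", "path", "from", "newsgroups"}


def strip_headers(article_text, drop=DROP_HEADERS):
    """Return the article with the unwanted header lines removed."""
    # Headers end at the first blank line; everything after is the body.
    head, sep, body = article_text.partition("\n\n")
    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Folded continuation line: it shares the fate of its header.
            if not skipping:
                kept.append(line)
            continue
        name = line.split(":", 1)[0].strip().lower()
        skipping = name in drop
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + sep + body
```

Running this over a copy of the archive before indexing would leave the originals untouched in case the guess about the format is wrong.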
I hate to leave it as ‘good enough’ but if anyone has a solution…. I’ll be glad to make this wonderful resource available!