Converting Relative URLs to Absolute URLs

  Apr 3, 2007 #1


    (Within all the pages in a folder, that is.)

    I downloaded a number of pages containing links to other pages that I want to download with a utility such as HTTrack or DownThemAll. The problem is that the links in those pages are all relative. The pages are temporary PHP search pages that HTTrack could not mirror, so I had to save them to an external server, and now the relative URLs resolve against that server and point to pages I cannot access.

    For example...


    is converted to...

    http://students.washington.edu/achen89/kong/viewtopic.php?t=10151 (where I saved the search page to).

    So is there a program that converts all relative URLs in an HTML page to absolute URLs, so that I can then run HTTrack/DownThemAll on the saved page and mirror every link in it? Does the program have to be written in a particular language (C, Perl, Python)? I searched for one and found it at perlmonks.com, but couldn't turn it into a working .exe file (I have no experience compiling Perl).
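
    For what it's worth, the conversion does not need a compiled program. Below is a minimal Python sketch using only the standard library. It is not the perlmonks script; the function names, the regex approach, and the base URL http://example.com/forum/search.php are all placeholders I made up for illustration. You would substitute the address the page was originally fetched from. A regex will miss unquoted attributes and links built by JavaScript, so treat it as a starting point.

```python
import re
from pathlib import Path
from urllib.parse import urljoin

# Matches quoted href="..." / src="..." attributes (single or double quotes).
ATTR_RE = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''',
                     re.IGNORECASE | re.DOTALL)


def absolutize(html: str, base_url: str) -> str:
    """Rewrite every quoted href/src value as an absolute URL.

    base_url is the address the page was originally served from
    (an assumption: the caller has to know it).
    """
    def repl(match):
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        # urljoin leaves already-absolute URLs (http:, mailto:, ...) unchanged.
        return f"{prefix}{quote}{urljoin(base_url, url.strip())}{quote}"

    return ATTR_RE.sub(repl, html)


def absolutize_folder(folder: str, base_url: str) -> None:
    """Apply absolutize() in place to every .htm/.html file in a folder."""
    for path in Path(folder).glob("*.htm*"):
        # latin-1 round-trips arbitrary bytes, so the file's real
        # encoding does not need to be known.
        text = path.read_bytes().decode("latin-1")
        path.write_bytes(absolutize(text, base_url).encode("latin-1"))


if __name__ == "__main__":
    # Hypothetical base URL, for illustration only.
    page = '<a href="viewtopic.php?t=10151">topic</a>'
    print(absolutize(page, "http://example.com/forum/search.php"))
```

    Calling absolutize_folder("saved_pages", base) on a copy of the folder would rewrite every page in it; "saved_pages" is again just a placeholder name.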
  Apr 3, 2007 #2


    On a side note, an alternative would be to "trick" the website mirroring tool into interpreting the base directory of the URLs as http://the-scholars.com.
    Look for webpage mirroring programs that support this. Any good mirroring program should do this automatically, though.
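
    One way to attempt that trick by hand is the standard HTML <base href="..."> element, which tells a browser which address relative links should resolve against. The sketch below inserts one into a saved page. The function name and the URL in the example call are made up for illustration, and whether a particular mirroring tool honors <base> is an assumption you would need to check.

```python
import re


def add_base_tag(html: str, base_url: str) -> str:
    """Insert <base href="base_url"> right after the opening <head> tag.

    If the page already has a <base> tag it is returned unchanged.
    If there is no <head>, the tag is prepended to the document.
    """
    if re.search(r"<base\b", html, re.IGNORECASE):
        return html

    tag = f'<base href="{base_url}">'
    new_html, count = re.subn(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + tag,  # keep the original <head ...> as written
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    return new_html if count else tag + html


if __name__ == "__main__":
    # Hypothetical base URL, for illustration only.
    page = "<html><head><title>Search</title></head><body></body></html>"
    print(add_base_tag(page, "http://example.com/forum/"))
```

    Unlike rewriting each link, this leaves the links themselves untouched and changes only how they are resolved.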

    I used to know of a good one that did this automatically, but I've long since forgotten about it. If I run across it somewhere, I'll let you know.
