
Converting Relative URLs to Absolute URLs

  1. Apr 3, 2007 #1



    That is, for every page within a folder.

    I downloaded a number of pages containing links to other pages that I want to download (using a utility such as HTTrack or DownThemAll). The problem is that the links in those pages are all relative. When I save the pages to an external server (which I have to do, since they are temporary PHP search pages that HTTrack could not mirror), the links resolve against that server instead of the original site, so they point to pages that I cannot access.

    For example...


    is converted to...

    http://students.washington.edu/achen89/kong/viewtopic.php?t=10151 [Broken] (the location where I saved the search page).

    So is there a program that can rewrite all relative URLs in an HTML page as absolute URLs, so that I can then run HTTrack/DownThemAll on the saved page and mirror every link in it? Does the program have to be written in a particular language (C, Perl, Python)? I searched and found one at perlmonks.com, but I couldn't turn it into a working .exe file (I have no experience compiling Perl).
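    The language should not matter much, since most scripting languages ship a URL-joining routine. As a minimal sketch (not the perlmonks script mentioned above), the following Python uses the standard library's `urljoin` to rewrite quoted `href`, `src`, and `action` attributes. The base URL `http://the-scholars.com/`, the folder name `saved_pages`, and the function names are assumptions for illustration; the real base would be the directory the pages were originally served from. The regular expression is a simplification: it skips unquoted attribute values and does not parse the HTML properly.

```python
import re
from pathlib import Path
from urllib.parse import urljoin

# Matches href="...", src='...' or action="..." with a quoted value.
# Simplification: unquoted attribute values are not handled.
ATTR_RE = re.compile(
    r"""(?P<attr>\b(?:href|src|action)\s*=\s*)(?P<quote>["'])(?P<url>.*?)(?P=quote)""",
    re.IGNORECASE | re.DOTALL,
)


def absolutize(html: str, base_url: str) -> str:
    """Return html with relative URLs resolved against base_url.

    URLs that are already absolute are left unchanged by urljoin.
    """
    def repl(match):
        absolute = urljoin(base_url, match.group("url").strip())
        quote = match.group("quote")
        return f"{match.group('attr')}{quote}{absolute}{quote}"

    return ATTR_RE.sub(repl, html)


def absolutize_folder(folder: str, base_url: str, pattern: str = "*.html") -> int:
    """Rewrite every file in folder matching pattern, in place.

    Returns the number of files processed. latin-1 is used only because
    it round-trips arbitrary bytes, so pages in other encodings are not
    corrupted by the read/write cycle.
    """
    count = 0
    for path in Path(folder).glob(pattern):
        text = path.read_text(encoding="latin-1")
        path.write_text(absolutize(text, base_url), encoding="latin-1")
        count += 1
    return count
```

    Run with a Python interpreter, a call such as `absolutize_folder("saved_pages", "http://the-scholars.com/")` would process every saved page in one pass, with no need to build an .exe. It overwrites files in place, so it is safer to work on a copy of the folder.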
    Last edited by a moderator: May 2, 2017
  3. Apr 3, 2007 #2



    On a side note, an alternative would be to "trick" the mirroring tool into treating http://the-scholars.com as the base directory for the relative URLs.
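    One possible way to do this is the HTML `<base href="...">` element, which tells a browser to resolve every relative link in the page against the given URL. Whether HTTrack or DownThemAll honor `<base>` is an assumption that would need testing. The sketch below inserts the tag right after the opening `<head>` tag; the function name `add_base_tag` is hypothetical.

```python
import html as html_lib
import re

HEAD_RE = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
BASE_RE = re.compile(r"<base\b", re.IGNORECASE)


def add_base_tag(page: str, base_url: str) -> str:
    """Insert <base href="base_url"> after <head> unless a <base> already exists."""
    if BASE_RE.search(page):
        return page  # leave an existing <base> element alone
    tag = f'<base href="{html_lib.escape(base_url, quote=True)}">'
    match = HEAD_RE.search(page)
    if match is None:
        # No <head> found: prepend the tag (assumes a lenient parser).
        return tag + page
    return page[:match.end()] + tag + page[match.end():]
```

    Unlike rewriting each link, this leaves the original attributes untouched, but it only helps if the tool reading the page respects the tag.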
  4. Apr 5, 2007 #3
    Look for a web page mirroring program that supports this. Any good mirroring program will convert the links automatically, though.

    I used to know of a good one that did this automatically, but I've long since forgotten about it. If I run across it somewhere, I'll let you know.
    Last edited: Apr 5, 2007