Converting Relative URLs to Absolute URLs

  • Thread starter: Simfish
  • Tags: Absolute, Relative
AI Thread Summary
The discussion focuses on the challenge of downloading web pages with relative URLs using utilities like HTTrack or DownThemAll. When saving these pages to an external server, the relative URLs become inaccessible, leading to broken links. The main inquiry is about finding a program that can convert relative URLs to absolute URLs within HTML pages, enabling successful mirroring of linked content. There is mention of potential programming languages for creating such a tool, including C, Perl, and Python. Additionally, there is a suggestion to explore mirroring programs that automatically handle this conversion. The user recalls previously knowing a suitable program but cannot remember its name.
Simfish
I need to do this for every page within a folder.

I downloaded a number of pages that link to other pages I want to download (using a utility such as HTTrack or DownThemAll). The problem is that the URLs in those pages are all relative, so when I save the pages to an external server (I have to do that, since the pages are temporary PHP search pages that HTTrack could not mirror), the URLs resolve against that server and point to pages that I cannot access.

For example, a link that should point to...

http://the-scholars.com/viewtopic.php?t=10151

instead resolves to...

http://students.washington.edu/achen89/kong/viewtopic.php?t=10151 (which is where I saved the search page).

So is there a program that converts all relative URLs to absolute URLs within an HTML page? (That way I could run HTTrack/DownThemAll on the saved page and mirror all the links within it.) Does the program have to be coded in a particular language (C, Perl, Python)? I searched for one and found a script at perlmonks.com, but couldn't turn it into a working .exe file (I have no experience in compiling Perl).
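For what it's worth, here is a rough Python sketch of this kind of conversion, not a finished tool. It rewrites quoted `href`, `src`, and `action` attributes using the standard library's `urllib.parse.urljoin`. The function names `absolutize` and `absolutize_folder`, the `*.html` file pattern, and the UTF-8 encoding are assumptions made for illustration, and the regex only handles quoted attribute values; a real HTML parser would be more robust.

```python
import re
from pathlib import Path
from urllib.parse import urljoin

# Matches href="...", src='...' or action="..." with either quote style.
# Assumption: attribute values are quoted; unquoted values are left alone.
_URL_ATTR = re.compile(
    r'''\b(href|src|action)\s*=\s*(["'])(.*?)\2''',
    re.IGNORECASE | re.DOTALL,
)


def absolutize(html: str, base_url: str) -> str:
    """Return html with relative URLs resolved against base_url."""
    def fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        # urljoin leaves URLs that are already absolute unchanged.
        return f"{attr}={quote}{urljoin(base_url, url)}{quote}"

    return _URL_ATTR.sub(fix, html)


def absolutize_folder(folder: str, base_url: str) -> int:
    """Rewrite every *.html file in folder in place; return the file count."""
    count = 0
    for path in Path(folder).glob("*.html"):
        text = path.read_text(encoding="utf-8")  # assumes UTF-8 files
        path.write_text(absolutize(text, base_url), encoding="utf-8")
        count += 1
    return count
```

The `base_url` should be the address the page was originally served from, for example a (hypothetical) `http://the-scholars.com/search.php`. Calling `absolutize_folder("kong", "http://the-scholars.com/search.php")` would then rewrite each saved page in the `kong` folder. It overwrites files, so work on a copy.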
 
On a side note, an alternative would be to "trick" the mirroring tool into treating http://the-scholars.com as the base URL of the saved page.
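One standard way to set the base URL is the HTML `<base href="...">` element, which tells a browser to resolve every relative URL in the page against the given address. The sketch below inserts such a tag into a saved page. The function name `add_base_tag` is made up for illustration, and whether HTTrack or DownThemAll honor `<base>` is an assumption that should be checked against the tool itself.

```python
import re


def add_base_tag(html: str, base_url: str) -> str:
    """Insert <base href=...> right after the opening <head> tag."""
    tag = f'<base href="{base_url}">'
    # \b keeps the pattern from matching <header>; only the first <head> is used.
    new_html, count = re.subn(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + tag,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        # No <head> found: fall back to putting the tag at the very start.
        new_html = tag + html
    return new_html
```

This leaves the links themselves untouched, so it is less invasive than rewriting every URL, but it only helps with tools that respect the tag.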
 
Look for webpage mirroring programs that support this. Any good mirroring program will do this automatically though.

I used to know of a good one that did this automatically, but I've long since forgotten about it. If I run across it somewhere, I'll let you know.
 