Replace Broken links with archived sources

  • #1
I've seen some broken external links on Physics Forums whose text has been changed to "BROKEN" by the moderators. Is replacing the broken links with archived links (e.g., from archive.org) a bad idea?
 

Answers and Replies

  • #2
18,502
8,417
No, it's not a bad idea, but it's a matter of scale. We have nearly 700k threads. We'd need an army to go through them, check the links, and then replace the broken ones with archived links.
 
  • #3
No, it's not a bad idea, but it's a matter of scale. We have nearly 700k threads. We'd need an army to go through them, check the links, and then replace the broken ones with archived links.
Is there a bot changing the text of broken links to "broken"? If so, can't you make the bot check whether the link is available on archive.org through their API and point the link to the archived copy?
 
  • #4
18,502
8,417
Is there a bot changing the text of broken links to "broken"? If so, can't you make the bot check the availability of the broken page and point the link to the archived copy?
That was automated, but a one-time thing. It's my understanding that archive.org doesn't archive everything and is organized by snapshot date. How would a bot know what date a page was archived on, if it was archived at all? Sure, it's likely possible programmatically, but it's a lot of work, and we'd likely be blocked after sending archive.org hundreds of thousands of requests.
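To put the request-volume worry in concrete terms, here is a minimal sketch of a throttled lookup loop. The one-second spacing is purely an assumption for illustration (I don't know what rate archive.org actually tolerates), and `throttled_lookups` and its parameters are hypothetical names. At one request per second, 700,000 lookups would take roughly 700000 / 86400 ≈ 8.1 days.

```python
import time


def throttled_lookups(urls, lookup, min_interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Yield (url, lookup(url)) pairs, spacing calls at least
    min_interval seconds apart.

    min_interval=1.0 is an illustrative assumption, not a documented limit.
    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    last_call = None
    for url in urls:
        if last_call is not None:
            # Wait out whatever remains of the interval since the last call.
            remaining = min_interval - (clock() - last_call)
            if remaining > 0:
                sleep(remaining)
        last_call = clock()
        yield url, lookup(url)
```

Any function that takes a URL and returns a result can be passed as `lookup`, so the pacing stays separate from whatever service is being queried.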

Also, during that first run there were false positives. Servers can return nonstandard responses that confuse our simple checker. It's not something I want to rely on doing all the time.
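One way to cut down on false positives like these is to mark a link as broken only on an unambiguous answer and treat everything else as "unknown". The sketch below is an assumption about how such a checker could look, not a description of the checker PF actually ran. The function names and the User-Agent string are made up for illustration.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code (or None for a network error) to a verdict."""
    if status is None:
        return "unknown"   # DNS failure, timeout, etc.: could be temporary
    if 200 <= status < 400:
        return "alive"
    if status in (404, 410):
        return "broken"    # the server clearly says the page is gone
    return "unknown"       # 403, 405, 429, 5xx...: not proof of a dead link


def fetch_status(url, method="HEAD", timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    req = urllib.request.Request(
        url, method=method,
        headers={"User-Agent": "link-checker-sketch/0.1"},  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return None


def check_link(url, fetch=fetch_status):
    """Try HEAD first; if that does not say 'alive', retry once with GET."""
    verdict = classify_status(fetch(url, "HEAD"))
    if verdict != "alive":
        # Some servers reject or mishandle HEAD but answer GET normally.
        verdict = classify_status(fetch(url, "GET"))
    return verdict
```

Links that come back "unknown" could be queued for a later retry or a human look rather than being rewritten automatically.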
 
  • #5
Wrichik Basu
Insights Author
Gold Member
2020 Award
1,780
1,613
That was automated, but a one-time thing. It's my understanding that archive.org doesn't archive everything and is organized by snapshot date. How would a bot know what date a page was archived on, if it was archived at all? Sure, it's likely possible programmatically, but it's a lot of work, and we'd likely be blocked after sending archive.org hundreds of thousands of requests.
Maybe you don't need the date. You can make a bot that takes a link from PF and looks it up using the search option in the Wayback Machine; if the search returns any results, the bot copies the URL of the latest snapshot and places it in PF.
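A rough sketch of that lookup, assuming the Wayback Machine availability endpoint at `https://archive.org/wayback/available` and the JSON shape I understand it to return (an `archived_snapshots` object with an optional `closest` entry). To my understanding, omitting the timestamp returns the most recent snapshot, but treat that as an assumption to verify. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def closest_snapshot_url(payload):
    """Extract the snapshot URL from a decoded availability response.

    Returns None when the response lists no available snapshot.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")


def lookup_snapshot(url, timeout=10.0):
    """Ask the availability endpoint for a snapshot of url (network call)."""
    query = urllib.parse.urlencode({"url": url})
    request_url = f"{AVAILABILITY_ENDPOINT}?{query}"
    with urllib.request.urlopen(request_url, timeout=timeout) as resp:
        return closest_snapshot_url(json.load(resp))
```

Keeping the parsing in `closest_snapshot_url` separate from the network call means the "no results" case can be checked offline with a hand-written dictionary.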
 
  • #6
18,502
8,417
Maybe you don't need the date. You can make a bot that takes a link from PF and looks it up using the search option in the Wayback Machine; if the search returns any results, the bot copies the URL of the latest snapshot and places it in PF.

Let me know when it's ready :-p

The easiest solution: if a broken link is found, simply include a link to archive.org, and they can do the rest :wink:
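For that manual route, a minimal sketch of building the archive.org link by hand. It assumes the usual `https://web.archive.org/web/<timestamp>/<url>` pattern. As far as I know, leaving the timestamp out redirects to the most recent capture, and `*` lists all captures. `wayback_link` is a hypothetical helper name.

```python
def wayback_link(original_url, timestamp=""):
    """Build a Wayback Machine link for original_url.

    timestamp is optional, e.g. "2015" or "*"; the URL pattern is assumed
    from how archive.org links usually look.
    """
    prefix = "https://web.archive.org/web/"
    if timestamp:
        return f"{prefix}{timestamp}/{original_url}"
    return f"{prefix}{original_url}"
```

This makes no request at all, so there is nothing to rate-limit; whether a snapshot actually exists is only discovered when someone follows the link.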
 
