Capturing a website to view offline

  • Thread starter: DaveC426913
AI Thread Summary
To capture an entire flat HTML website for offline viewing, several tools and methods are recommended. Teleport Pro is highlighted as a top choice for crawling and downloading websites. WinHTTrack is another effective site-capture tool that can handle large websites, with one user reporting a download of over 700MB and 15,800 files. For Linux users, wget is suggested, which is also available for Windows. However, it's important to consider the impact on the website's server, as excessive crawling can lead to significant load spikes and potential costs for the site owner. Webmasters often dislike private crawling due to concerns about content theft and server strain. Additionally, methods to prevent hotlinking of images are discussed, including server configurations that redirect requests from unauthorized sites. Overall, while there are convenient tools for capturing websites, ethical considerations regarding server load and content ownership are emphasized.
DaveC426913
Anyone know of a convenient way to capture a whole (flat HTML) website so it can be viewed offline? I mean, other than file by file and image by image.
 
Google on "copy website" for options. There are applications that will crawl through entire websites and download them to your HD.
 
If all you want is a screen image:
Use Alt+PrintScreen to copy the active window.
Paste the image into Paint, Imaging, Photoshop, etc.
 
Do you use Firefox, Dave?

https://addons.mozilla.org/firefox/427/
 
If it's a large web site, the owner may not appreciate having a robot crawl all over and suck up hundreds of megabytes of content at once. It puts a huge load spike on his server, and he may have to pay his provider based on traffic above a certain threshold.
 
Never mind, I found WinHTTrack, a site-capture tool.

Wow, and just as well, this site is monstrous. I had no idea. It's a yearbook site, spanning 75 years. I'm over 100 MB / 10,000 files so far.

All this, so my dad can look at it from a CD, rather than online...

The things I do...
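
(For what it's worth, HTTrack also comes as a command-line program, httrack, on Linux and in the Windows package. A rough sketch of the same mirror job, with the site URL and output folder made up as placeholders:

  httrack "http://www.example.com/yearbooks/" -O "yearbooks-mirror" "+*.example.com/*" -v

Here -O sets the output folder, the "+*.example.com/*" filter keeps the crawl on that domain, and -v prints progress as it goes.)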
 
If you are using Linux you can use wget. Actually you can do this in Windows too if you download it.
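
A typical invocation looks something like this (the URL here is just a placeholder):

  wget --mirror --convert-links --page-requisites --no-parent --wait=1 http://www.example.com/yearbooks/

--mirror recurses through the whole site, --convert-links rewrites the links so the copy works offline, --page-requisites pulls in the images and stylesheets each page needs, --no-parent stops the crawl from wandering above the starting directory, and --wait=1 pauses a second between requests so you don't hammer the server quite as hard.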
 
Phew. 700 MB, 15,800 files, 6 hours to download.

I'll bet the site owner hates me.
 
  • #10
jtbell said:
If it's a large web site, the owner may not appreciate having a robot crawl all over and suck up hundreds of megabytes of content at once. It puts a huge load spike on his server, and he may have to pay his provider based on traffic above a certain threshold.

Agreed! Private crawling is hated among webmasters, because most of the time it's some dude trying to rip or copy the site and then put up a copy on another site. There is actually a hack for vBulletin that lets you block hundreds of common private crawlers.
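
(The vBulletin hack itself isn't shown here, but the same idea can be applied at the web-server level by matching the crawler's User-Agent string; a minimal Apache sketch, with the agent names chosen only as examples:

  SetEnvIfNoCase User-Agent "HTTrack" bad_bot
  SetEnvIfNoCase User-Agent "Teleport" bad_bot
  Order Allow,Deny
  Allow from all
  Deny from env=bad_bot

Of course, a crawler can always lie about its User-Agent, so this only stops the polite ones.)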
 
  • #11
In my case, my personal hobby site (which has a large gallery of pictures) is on one of my college's Web servers, and I don't want to impact normal academic use.

So I watch for robots and for my other pet peeve: people on forums who hotlink to several of my pictures in a single posting, which causes several hits on my server every time someone opens that thread.

To counter this, first I set up my server to examine the referring URL whenever someone fetched a picture, and if it was from one of the offending sites, I sent instead a GIF with the red-bar-in-circle logo over the word "Hotlinking", and the URL of my terms of usage below.

Then I saw that someone had started a thread titled "The scariest thing in the world!" which hotlinked directly to that GIF! So I took a thumbnail-sized JPEG of Alfred E. Neuman (the MAD magazine character) and substituted that. The thread became hilarious for a while, with new viewers seeing Alfred while previous viewers (including, of course, the original poster) still had my "scary" GIF in their browser caches. "What, me scary?"

Eventually someone caught on and said, "hey dudes, refresh your cache!" but it was fun in the meantime. :biggrin:
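
(The thread doesn't say exactly how the server was configured, but one common way to do this kind of referrer check is with Apache's mod_rewrite in an .htaccess file; the forum domain and image name below are made-up placeholders:

  RewriteEngine On
  # requests for images referred from the offending forum
  RewriteCond %{HTTP_REFERER} ^https?://(www\.)?offending-forum\.example\.com/ [NC]
  # don't rewrite the substitute image itself, or the rule would loop
  RewriteCond %{REQUEST_URI} !hotlinking\.gif [NC]
  RewriteRule \.(gif|jpe?g|png)$ /hotlinking.gif [L]

The RewriteCond lines match requests whose Referer header comes from the offending site, and the RewriteRule serves the substitute "Hotlinking" image in place of whatever picture was asked for.)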
 
  • #12
jtbell said:
To counter this, first I set up my server to examine the referring URL whenever someone fetched a picture, and if it was from one of the offending sites, I sent instead a GIF with the red-bar-in-circle logo over the word "Hotlinking", and the URL of my terms of usage below.

Try to hotlink one of PF's images :smile:
 
  • #13
Oh Yes You Can!

:rolleyes: http://hot-text.ath.cx/img/offline.gif http://hot-text.ath.cx/img/offline1.gif
Customize it
http://hot-text.ath.cx/img/offline2.gif http://hot-text.ath.cx/img/offline3.gif

http://hot-text.ath.cx/img/offline4.gif http://hot-text.ath.cx/img/offline5.gif
:biggrin:
 
  • #14
How do I delete it?

http://hot-text.ath.cx/img/offline-1.gif http://hot-text.ath.cx/img/offline-2.gif

http://hot-text.ath.cx/img/offline-3.gif http://hot-text.ath.cx/img/offline-4.gif
o:)
 
  • #15
DaveC426913 said:
Anyone know of a convenient way to capture a whole (flat HTML) website so it can be viewed offline? I mean, other than file by file and image by image.

Hmmm...
To capture a whole (flat HTML) website:
  1. File → Save As...
  2. Save as type:
  3. Webpage, Complete (*.htm, *.html)
 