What is the use of http headers?

  • Thread starter: adjacent

Discussion Overview

The discussion revolves around the use of HTTP headers in the context of downloading files, particularly images, using a Python script. Participants explore the necessity and implications of including headers in HTTP requests, as well as the potential identification of bots by websites based on header usage.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant questions the necessity of including headers in a Python script for downloading an image, noting no observable difference when headers are omitted.
  • Another participant provides a link to a list of HTTP header fields and mentions that while some headers are required, others are optional and can lead to default or cached values being used by the browser.
  • It is noted that the only mandatory header in HTTP 1.1 is "Host: ", but sending additional headers for encoding and connection management may be beneficial.
  • A participant emphasizes the importance of headers for authentication purposes, suggesting that access credentials may be required depending on the website's configuration.
  • There are inquiries about whether a website can identify a user as a bot if headers are not used, with some suggesting that this is rare and typically related to copyright protection.
  • Participants suggest using tools like netcat, packet capture software, or browser debugging tools (such as Firebug or the F12 key) to inspect the headers sent by browsers.

Areas of Agreement / Disagreement

Participants express varying opinions on the necessity of HTTP headers, with some asserting their importance for certain functionalities while others believe they may not be crucial in all cases. The discussion remains unresolved regarding the extent to which headers affect bot identification.

Contextual Notes

Some assumptions about the necessity of headers may depend on specific website configurations and the nature of the requests being made. The discussion does not reach a consensus on the implications of omitting headers.

adjacent
For example, I am downloading an image file from example.com using a Python script.
Is there any need to include headers in the script? Nothing seems to change if I leave them out.
So what are they needed for?
 
As far as I'm aware, the only mandatory header in HTTP 1.1 is "Host: ". However, it might be better if your script also sends appropriate headers to define the encoding and to request that the TCP connection be closed or kept open.
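
As a rough illustration of the point above, here is a minimal sketch of what such a request looks like on the wire. The function name `build_get_request` and the path `/image.png` are made up for the example, and using `Accept-Encoding` for "define the encoding" is only one reading of that remark.

```python
def build_get_request(host: str, path: str = "/", keep_alive: bool = False) -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",              # the one header HTTP/1.1 requires
        "Accept-Encoding: identity",  # assumption: ask for an uncompressed body
        f"Connection: {'keep-alive' if keep_alive else 'close'}",
    ]
    # Every header line ends with CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    # "/image.png" is a hypothetical path used only for display.
    print(build_get_request("example.com", "/image.png").decode("ascii"))
```

Libraries such as `urllib` add `Host` automatically, which is probably why leaving the headers out of a script appears to make no difference.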
 
[Edit: I thought you were only trying to download an image :-p ]

You need HTTP headers especially when the website requires, e.g., your credentials for access (username, password, API key, etc.). Which request headers are appropriate also depends on the method you use to send your request, e.g. a partial versus a full download, or the request string your client sends to the server in the transaction.

If, for example, you want to download an image from example.com, you just need to GET the image location, provided that you have been granted access to the site.
If you want to query a site that provides, e.g., public API methods, these are surely documented on its website as well; visit it and follow the examples.
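
The two cases mentioned above (credentials and partial downloads) might look like this with the standard library. This is only a sketch: the helper names `make_request` and `download`, the URL, and the choice of HTTP Basic authentication are assumptions for illustration; a real site may expect a different scheme, such as an API key header described in its own documentation.

```python
import base64
import urllib.request


def make_request(url, username=None, password=None, byte_range=None):
    """Prepare a GET request with optional credentials and byte range."""
    req = urllib.request.Request(url, method="GET")
    if username is not None and password is not None:
        # HTTP Basic auth: base64 of "username:password" (assumed scheme).
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        req.add_header("Authorization", f"Basic {token}")
    if byte_range is not None:
        # Ask for part of the resource only, e.g. (0, 1023) for the first KiB.
        start, end = byte_range
        req.add_header("Range", f"bytes={start}-{end}")
    return req


def download(req):
    """Send the prepared request and return the response body as bytes."""
    with urllib.request.urlopen(req) as response:
        return response.read()
```

For a plain public image, `download(make_request("http://example.com/image.png"))` with no extra headers would be enough, which matches the original observation.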
 
Thanks for the answers. Can the site identify me as a bot if I don't use the headers?
 
Can the site Identify me as a bot if I don't use the headers?

Some sites do use headers for that, but very few, and in most cases only to protect copyrighted material (e.g. videos). So most likely you won't have trouble.
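
One header a site could look at is `User-Agent`. Even when a script sets no headers, Python's `urllib` sends a default one that names the library. The sketch below shows that default and how a script might override it; the helper names and the `ExampleFetcher/1.0` string are invented for the example.

```python
import urllib.request


def default_user_agent() -> str:
    """Return the User-Agent that urllib sends when none is set."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs applied to every request.
    return dict(opener.addheaders)["User-agent"]


def request_with_agent(url: str, user_agent: str) -> urllib.request.Request:
    """Prepare a request that announces a custom User-Agent instead."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    print(default_user_agent())  # begins with "Python-urllib/"
```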

If you notice that it doesn't work (and you're not doing any nasty things) I'd suggest you use netcat in server mode and send a browser request to it. You'll be able to see on screen all the headers that your browser sends. Alternatively, you can use packet capture software to inspect your browser's requests to the real site.
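
If netcat is not at hand, roughly the same experiment can be sketched in Python: listen on a local port, point the browser at it, and print what arrives. The function names and port 8080 are arbitrary choices for this example, and the parser is deliberately simplistic (it keeps only the last value of a repeated header).

```python
import socket


def parse_request_headers(raw: bytes) -> dict:
    """Turn the header block of a raw HTTP request into a dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line ("GET / ...")
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return headers


def capture_one_request(port: int = 8080) -> bytes:
    """Accept a single connection on localhost and return the raw request."""
    with socket.create_server(("127.0.0.1", port)) as server:
        conn, _addr = server.accept()
        with conn:
            data = b""
            # Read until the blank line that ends the header block.
            while b"\r\n\r\n" not in data:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                data += chunk
            # Reply with an empty response so the browser does not hang.
            conn.sendall(b"HTTP/1.1 204 No Content\r\nConnection: close\r\n\r\n")
            return data
```

Calling `parse_request_headers(capture_one_request())` and then opening `http://127.0.0.1:8080/` in the browser should show the headers that browser sends.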
 
martiandawn said:
If you notice that it doesn't work (and you're not doing any nasty things) I'd suggest you use netcat in server mode and send a browser request to it. You'll be able to see on screen all the headers that your browser sends. Alternatively, you can use packet capture software to inspect your browser's requests to the real site.

The Firebug add-on for Firefox can also be used to inspect the request and response headers.
 
The 'firebug' add-on for firefox can also be used to find the request and response headers.

Thanks for the tip!
 
All of the major browsers (Firefox, Chrome, IE) have debugging capabilities that can be accessed from the F12 key. Using the F12 debugging tools you can see the request and response headers and quite a bit more.
 
Mark44 said:
All of the major browsers (Firefox, Chrome, IE) have debugging capabilities that can be accessed from the F12 key. Using the F12 debugging tools you can see the request and response headers and quite a bit more.

I've played with web apps for years and never knew about the F12 key. Thx.
 
