Curious about PF robots

    Staff: Mentor

    The right column of the PF home page says "Robots 221, members 62, guests 1266" under "members online now". How interesting.

    Does a web crawler count as a robot?
    How do you detect robots as opposed to guests?
    What are those robots doing, and what motivates those who send them?
    Does PF send robots to monitor other sites?

    The number of guests is also remarkable. PF members need to be aware of that, especially in controversial or dangerous threads. 95% of those viewing PF are silent and not identifiable.
    What do you think we mentors are? Humans???
    Not Greg, but here are some answers:

    Web crawlers count as robots, and they identify themselves as robots (otherwise they are counted as guests). They crawl the forums for search engines and similar tools.
    PF doesn't operate search engines (outside the forums) or anything like that, so there is no need to crawl other websites.
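
    The robot/guest split described above can be sketched roughly like this. It is only an illustration, not how the PF forum software actually does it: the marker list and the `classify_visitor` name are made up here, and real forum software presumably keeps a much longer list of known crawlers.

```python
# Hypothetical marker list (an assumption for illustration only).
ROBOT_MARKERS = ("bot", "crawler", "spider")


def classify_visitor(user_agent, logged_in=False):
    """Return 'member', 'robot', or 'guest' for a single request."""
    if logged_in:
        return "member"
    ua = (user_agent or "").lower()
    # A crawler is only counted as a robot if it says so in its User-Agent.
    if any(marker in ua for marker in ROBOT_MARKERS):
        return "robot"
    # Everything else, including crawlers that pose as browsers, is a guest.
    return "guest"


if __name__ == "__main__":
    print(classify_visitor(
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ))
    print(classify_visitor(
        "Mozilla/5.0 (Windows NT 10.0; rv:50.0) Gecko/20100101 Firefox/50.0"
    ))
```

    A crawler that sends an ordinary browser User-Agent would fall through to "guest" here, which is why the guest count can include some unannounced robots.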

    Most visitors are guests, yes. Most of them come from search engines.
    Polite robots identify themselves in the User-Agent field of their requests to web servers. For example, here are some requests for the home page of my web site, from my server log:
    Code (Text):
    - - [27/Dec/2016:17:44:52 -0500] "GET / HTTP/1.1" 200 4442 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
    - - [27/Dec/2016:18:14:05 -0500] "GET / HTTP/1.1" 200 4442 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
    - - [27/Dec/2016:20:27:54 -0500] "GET / HTTP/1.1" 200 4442 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    - - [27/Dec/2016:22:47:29 -0500] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0 (compatible; SeznamBot/3.2; +http://napoveda.seznam.cz/en/seznambot-intro/)"
    The ones shown above are for search engines. There are also companies that crawl the web and collect statistics that they sell to website owners, e.g. statistics about who links to your site, and what your site links to. There is at least one site (archive.org) that crawls the web in order to maintain a historical archive of the web, where you can look up what a website looked like in the past.
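
    If you want to pull the robot names out of log lines like the ones quoted above, something along these lines might work. It is a rough sketch: the regular expressions assume the quoted layout (timestamp, request, status, size, referer, User-Agent) and the "compatible; Name/version" convention those four bots happen to use, and `bot_name` is just a name picked for this example.

```python
import re

# Matches the part of a log line from the timestamp onward.
LOG_TAIL = re.compile(
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Naive pattern for User-Agents of the form "(compatible; Name/version; ...)".
BOT_NAME = re.compile(r'compatible; (?P<name>[^/;]+)/')


def bot_name(line):
    """Return the self-declared bot name in a log line, or None."""
    entry = LOG_TAIL.search(line)
    if entry is None:
        return None
    declared = BOT_NAME.search(entry.group("agent"))
    return declared.group("name") if declared else None


if __name__ == "__main__":
    sample = (
        '- - [27/Dec/2016:22:47:29 -0500] "GET / HTTP/1.1" 304 - "-" '
        '"Mozilla/5.0 (compatible; SeznamBot/3.2; '
        '+http://napoveda.seznam.cz/en/seznambot-intro/)"'
    )
    print(bot_name(sample))
```

    Robots that announce themselves in some other format would be missed by this pattern, so treat it as a starting point only.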
    Last edited by a moderator: May 8, 2017
    I'm one-quarter lawn gnome on my mother's side.
