How to write a program to retrieve Web data

  1. Apr 25, 2013 #1
    Hello! :smile:

    So I have some basic programming skills, but I have never done anything that interacts with the web. Here at work, we have a website that we have to go to in order to check the statuses of all of the jobs we have open. The website is awful in that you cannot run a report on all of the jobs at once. I want to develop a tool that goes to the website, loops through all of the jobs, and pulls the necessary data.

    I just need a starting point for now. Is this something I can do with Python? Any thoughts are helpful.

  3. Apr 25, 2013 #2



    Staff: Mentor

    Basically every programming language has some library for fetching web pages and reading their HTML content, so you can pick your favorite one.
    It looks easy with Python.
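
    A minimal sketch of the fetching step, assuming Python 3 and a page that can be read without logging in. The function name `fetch_html` and whatever URL you pass to it are placeholders, not details of the actual job site; it uses only `urllib.request` from the standard library.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download one page and return its HTML as a string."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares; fall back to UTF-8 (an assumption).
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

    To cover every job, you would call `fetch_html` once per job page inside a loop and hand each result to a parser. If the site requires a login, this sketch is not enough on its own.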
  4. Apr 25, 2013 #3
    I've used Beautiful Soup and minidom (both Python libraries) to do this. What you are really doing is parsing the DOM of the webpage you want: you look for the information located in some div by traversing the document's tree structure, and extract it. If the webpage changes structure, you might have to recode your solution, but it's pretty easy programming.
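
    The traversal described above might look roughly like this with `xml.dom.minidom`. The sample markup, the `job` and `status` class names, and the `extract_job_statuses` helper are all made up for illustration; the real site's structure will differ. Note that minidom only accepts well-formed XML, so it works on clean XHTML but will reject much real-world HTML, which is where a more forgiving parser such as Beautiful Soup helps.

```python
from xml.dom import minidom

# Hypothetical, well-formed page fragment; the real site's markup will differ.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="job" id="job-101"><span class="status">Open</span></div>
    <div class="job" id="job-102"><span class="status">Closed</span></div>
    <div class="footer">Contact support</div>
  </body>
</html>
"""


def extract_job_statuses(markup):
    """Return {job id: status text} for every <div class="job"> in the markup."""
    dom = minidom.parseString(markup.strip())
    statuses = {}
    for div in dom.getElementsByTagName("div"):
        if div.getAttribute("class") != "job":
            continue  # skip divs that do not describe a job
        for span in div.getElementsByTagName("span"):
            if span.getAttribute("class") == "status" and span.firstChild is not None:
                statuses[div.getAttribute("id")] = span.firstChild.data.strip()
    return statuses


print(extract_job_statuses(SAMPLE_PAGE))
# prints {'job-101': 'Open', 'job-102': 'Closed'}
```

    If the page layout changes (say the status moves out of the span), only the two attribute checks inside the loop should need updating.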
  5. Apr 26, 2013 #4



    Staff: Mentor

    I agree Python is a simple way to do it. I wrote a similar parser without using any additional libraries beyond those installed automatically with Python (and with some ancient 2.x Python version). But I don't have access to the code at the moment, so I can't give you details.
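
    This is not the code mentioned above, just a sketch of what a standard-library-only parser can look like in Python 3 using `html.parser` (in Python 2.x the module was named `HTMLParser`). It assumes, purely for illustration, that the job data sits in an HTML table with one job per row; `JobTableParser` and `parse_job_rows` are hypothetical names.

```python
from html.parser import HTMLParser


class JobTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made only of <th> cells stay empty
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_job_rows(html_text):
    """Return a list of rows, each a list of the row's <td> texts."""
    parser = JobTableParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows
```

    Feeding it a page's HTML returns one list per table row, which can then be written to a CSV file or printed as a report.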
  6. Apr 26, 2013 #5
    OK, thanks guys! Parsing and Python... The two P's... I'll tell my boss I'll be PP'ing all next week!