How can I log into a webpage using python and access data?

  • Context: Python
  • Thread starter: dacruick
  • Tags: Python

Discussion Overview

The discussion revolves around logging into a webpage using Python to access data, with a focus on using libraries such as smtplib for email and urllib for web requests. Participants explore various methods for connecting to servers and handling authentication.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant shares a Python script for sending emails using the smtplib library but encounters a socket error when trying to connect to the SMTP server.
  • Another participant suggests that the user may not have an SMTP server running locally and recommends using an ISP's SMTP server or Gmail's smtp.gmail.com.
  • There is a discussion about how to find the correct SMTP server, with one participant noting that it varies based on the email provider.
  • A participant expresses a desire to learn how to access data from a server using Python, indicating that the email exercise was not directly applicable to their goal.
  • Multiple potential solutions are proposed for accessing logged data, including using an FTP server, having the server send logs via email, or utilizing remote logging capabilities.
  • One participant shares a code snippet for logging into a webpage using urllib, urllib2, and cookielib, but the login does not go through when the login button's field is submitted with the form data.
  • There is a request for assistance regarding the login button's functionality, specifically related to the HTML structure of the button.

Areas of Agreement / Disagreement

Participants generally agree on the need for a proper SMTP server for email functionality, but there are multiple competing views on how to effectively log into a webpage and access data. The discussion remains unresolved regarding the specific issue with the login button.

Contextual Notes

Participants express uncertainty about the correct approach to logging into a webpage, particularly regarding the interaction with HTML elements like buttons. There are also limitations in the provided code snippets that may affect their functionality.

Who May Find This Useful

This discussion may be useful for individuals interested in web scraping, server communication, and Python programming, particularly in the context of handling authentication and data retrieval from web services.

dacruick
Hello ladies and gentlemen,

I have grabbed some code for sending email using Python. This isn't my end goal, but I think it's a logical place to start and get used to the smtplib library. Here is the code:

Code:
import smtplib

def prompt(prompt):
    return raw_input(prompt).strip()

fromaddr = prompt("From: ")
toaddrs  = prompt("To: ").split()
print "Enter message, end with ^D (Unix) or ^Z (Windows):"

# Add the From: and To: headers at the start!
msg = ("From: %s\r\nTo: %s\r\n\r\n"
       % (fromaddr, ", ".join(toaddrs)))
while 1:
    try:
        line = raw_input()
    except EOFError:
        break
    if not line:
        break
    msg = msg + line

print "Message length is " + repr(len(msg))

server = smtplib.SMTP('localhost')
server.set_debuglevel(1)
server.sendmail(fromaddr, toaddrs, msg)
server.quit()

I'm getting a socket error when connecting to the server, on the line "server = smtplib.SMTP('localhost')".
It tells me that the connection was reset by the peer. Does anyone know what that means and what I can do to get rid of the error?
 
Do you have an SMTP server running on your PC? Most people don't.

If not, you'd maybe want to install one for testing purposes. Or you can find out what your normal SMTP server is and use that. Your ISP should have one. GMail also provides one.
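One way to check the first question is to test whether anything accepts TCP connections on the SMTP port of the local machine. The sketch below is written for Python 3 (the code in this thread is Python 2). The helper name `can_connect` and the default port 25 are assumptions for illustration, and a successful connection only shows that something is listening, not that it is a working SMTP server.

```python
import socket


def can_connect(host, port=25, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (refused, reset, timeout, ...)
        # when nothing usable is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical usage: check the local machine's SMTP port.
    print("SMTP port reachable on localhost:", can_connect("localhost"))
```

If this prints False, smtplib.SMTP('localhost') cannot be expected to work either, and a remote SMTP server is the more likely route.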

Other than that, learning Python is on my to-do list, so I haven't looked closely at the code.
 
How would I find my "normal" SMTP server?
 
dacruick said:
How would I find my "normal" SMTP server?
That's what I'd call "opening a can of worms". It depends on your usual email account. There are so many possibilities that it's impossible to say for sure without more information.

Do you use GMail? If so, then the server is probably smtp.gmail.com. Most servers are authenticated and require you to give them a username and password. There's such a call in smtplib:

SMTP.login(user, password)

You may also, possibly, need to use SSL to connect to gmail's smtp server. There's a call in smtplib for that as well.
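A minimal sketch of the Gmail case described above, assuming Python 3 (the script in this thread is Python 2). The function names `build_message` and `send_via_gmail` are made up for illustration. Port 587 with STARTTLS is one of the ways smtplib can encrypt the connection; depending on the account's security settings, Gmail may reject a plain account password.

```python
import smtplib
from email.message import EmailMessage


def build_message(fromaddr, toaddrs, subject, body):
    """Build a plain-text email with From, To and Subject headers."""
    msg = EmailMessage()
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_gmail(msg, user, password):
    """Send msg through smtp.gmail.com using STARTTLS and a login."""
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()             # upgrade the connection to TLS
        server.login(user, password)  # the SMTP.login call mentioned above
        server.send_message(msg)


if __name__ == "__main__":
    # Hypothetical addresses; sending is left commented out on purpose.
    message = build_message("me@example.com", ["you@example.com"],
                            "Test", "Hello from Python")
    print(message)
    # send_via_gmail(message, "me@example.com", "password-goes-here")
```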

If you use your ISP or generally go through a mail application on your PC, then you should be able to go in there and find the information on your account, which includes the smtp server. Every mail package is different, so where to find it, I can't say. It's quite likely to require a login as well. A server that doesn't require logging in is rare, which is good. Open mail servers are often abused by spammers, and admins that don't secure their mail servers are idiots. So you should expect that your mail server requires logging in.

If you use hotmail or something, I have no idea what the smtp server is. Like I said, there are so many possibilities I can't possibly enumerate them all.
 
Hi Grep,

I am using Gmail, and you were right about smtp.gmail.com. Everything works fine. This didn't end up being as useful as I wanted it to be, though. I wanted to use this as an exercise to get acquainted with accessing the internet with Python. But I didn't use any lower-level code to do this, so it's not really transferable to what I want to do.

I have data being logged on a server and I want to write a Python program to grab it every hour and format it. I just don't know how to tell Python to go to a certain site and log in.
 
dacruick said:
Hi Grep,

I am using Gmail, and you were right about smtp.gmail.com. Everything works fine. This didn't end up being as useful as I wanted it to be, though. I wanted to use this as an exercise to get acquainted with accessing the internet with Python. But I didn't use any lower-level code to do this, so it's not really transferable to what I want to do.

I have data being logged on a server and I want to write a Python program to grab it every hour and format it. I just don't know how to tell Python to go to a certain site and log in.
Hm, one solution would be an FTP server. If you put one up on the server (with proper security) and have the log copied into the ftp server directory, you could use python's FTP library to get it fairly simply. Without getting into details, that's one way.
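The FTP idea could be sketched roughly as follows, assuming Python 3 and that the server offers FTP over TLS (plain ftplib.FTP would be used otherwise). The host, credentials and remote path are placeholders, and the function names are invented for this example.

```python
import ftplib
import io


def download_log(ftp, remote_path):
    """Return the contents of remote_path as bytes from an open FTP session."""
    buffer = io.BytesIO()
    # retrbinary calls buffer.write once for each chunk it receives.
    ftp.retrbinary("RETR " + remote_path, buffer.write)
    return buffer.getvalue()


def fetch_log(host, user, password, remote_path):
    """Log in over FTP with TLS and download a single file."""
    with ftplib.FTP_TLS(host) as ftp:
        ftp.login(user, password)
        ftp.prot_p()  # also encrypt the data channel
        return download_log(ftp, remote_path)


if __name__ == "__main__":
    # Placeholder values; replace with a real server to try this out.
    # data = fetch_log("ftp.example.com", "user", "password", "logs/data.csv")
    # print(data.decode("utf-8", errors="replace"))
    pass
```

Keeping download_log separate from the connection setup means the download step can be exercised with a stand-in object, without a live server.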

Or you could have the server send an email containing the log to a specific account (perhaps putting up an SMTP server on the receiving end). Then you could have that server parse the log out of the mail messages.

Or perhaps you can use some kind of remote logging capability? On Unix type systems, syslogd (the system logging daemon) can do it, I think. Maybe Windows can as well.

There's always sharing the directory with the logs, and accessing it from another system.

Just tossing out ideas to consider. Might be of use. I'm not sure what resources you have available to you.
 
Yes, thank you for all of your ideas. This is useful. I am going to see what I can do about this tomorrow, but is it all right if I repost on this thread in case I run into some problems?
 
dacruick said:
Yes, thank you for all of your ideas. This is useful. I am going to see what I can do about this tomorrow, but is it all right if I repost on this thread in case I run into some problems?
Not a problem. :approve:
 
Hi Grep,

I'm going to go in a different direction with this. I have to log into a webpage using python before I can access any of the data.

Code:
import urllib, urllib2, cookielib

username = 'username'
password = 'password'

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'password' : password, 'commit' : 'Log in' })
opener.open('site', login_data)
resp = opener.open('site I want to go to')

It seems that I can fill out the username and password fields with no problem, but the login button is giving me trouble.

The HTML for the login button is this:
Code:
<input class="button" name="commit" onclick="alertForExplorerBrowserVersion(7, 3);" type="submit" value="Log in">

I figured that the 'commit' : 'Log in' key/value pair would activate the button, but apparently not. Could you shed some light on this?

EDIT: Hobolink.com is the site in case it is of any help
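The thread does not settle why the login fails, so the following is only a sketch of one thing to try, not a confirmed fix. urllib does not run JavaScript, so the onclick handler never fires. Including 'commit' in the POST data is already how a browser reports which submit button was pressed. A possible cause (an assumption, not verified for this site) is that the login form contains other fields, such as hidden inputs, that the server expects back. The Python 3 sketch below first fetches the login page, collects every named input, then fills in the credentials and posts the result with the same cookie jar. The two URL parameters are placeholders, and the parser collects inputs from every form on the page, which may be too broad for some sites.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the name/value pair of every <input> that has a name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are sent as empty strings.
            self.fields[name] = attributes.get("value") or ""


def extract_form_fields(html):
    """Return a dict of input names and values found in html."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


def log_in(login_page_url, post_url, username, password):
    """Fetch the login form, post it back, and return the cookie-aware opener."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    with opener.open(login_page_url) as page:
        html = page.read().decode("utf-8", errors="replace")
    fields = extract_form_fields(html)   # includes hidden inputs and 'commit'
    fields["username"] = username        # field names taken from the post above
    fields["password"] = password
    data = urllib.parse.urlencode(fields).encode("utf-8")
    opener.open(post_url, data).close()  # session cookies end up in the jar
    return opener
```

The returned opener can then be used for later requests, as in resp = opener.open('site I want to go to') above. The form's action attribute shows which URL the POST should go to; it is not necessarily the same as the address of the login page.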
 
