[Python] Keeping the session alive

  1. Aug 28, 2014 #1

    adjacent

    Gold Member

    I am experimenting with CAPTCHA images. I have a Capatcha.php on my localhost that generates an image, and that image is put into the form.

    Here is my Python code to get the image, extract the text in it, and send it back to the form, and finally save the resulting page as HTML.
    The form has two fields, Number and Code, and it returns a table of things.
    But it's not working. The HTML saved by Python does not have the table. I don't see any problem with the code :confused:
    Code (Text):

    import os
    import requests

    p = requests.session()
    q = p.get('http://localhost/Test/Capatcha.php')
    with open('data/a.png', 'wb') as f:
        f.write(q.content)
    os.system("tesseract C:\\Users\\Me\\Desktop\\Test\\data\\a.png C:\\Users\\Me\\Desktop\\Test\\data\\a")
    with open("data\\a.txt") as cap:
        capData = cap.read()
    print("Capatcha line:"+capData)
    num = input("Please enter the number :")
    payload = {
        'Code': capData,
        'q': num
    }

    url = "http://localhost/Test/index.php"
    r = p.post(url, data=payload)


    with open("data\\log.html", "w") as file:
        log = file.write(r.text)

     
     
    Last edited by a moderator: May 6, 2017
  3. Aug 29, 2014 #2
    I have no idea what you are talking about, but is there, by any chance, a missing ".txt" extension in the system call? That might be why your a.txt file does not contain what it is supposed to.
     
  4. Aug 29, 2014 #3

    adjacent

    Gold Member

    Why? Can you tell me what you didn't understand?
    I have a folder on my localhost called 'Test'. In that folder, I have two PHP files: Capatcha.php and Index.php. I also have a folder called 'data' in the 'Test' folder.
    What capatcha.php does is start a session and generate a random image with numbers.

    index.php has two input fields: one is called 'Code' and the other is called 'q'.
    What I am trying to do is download the CAPTCHA image from capatcha.php and get the text in the image with an OCR engine (Tesseract).
    The command in os.system() saves the resulting text in a text file called a.txt in the 'data' folder, and I then read that text into a variable called 'capData'.
    Then the Python program asks for a number, which I enter manually. It is saved in a variable called 'num'.
    Then the Python program connects to index.php and submits 'q', set to the value of 'num', and 'Code', set to the value of 'capData'.
    If the 'Code' matches the one in the CAPTCHA image, the PHP file generates a table containing a name and the number.

    My problem is that it does not generate the table, and I think the reason is that when the Python program connects to index.php, it is not in the CAPTCHA session; it has started a new session.

    So my question is how to keep the session alive. I don't see any problem in my code above. However, it does not work.
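
One way to test the suspicion that the session is being lost is to look at the cookies held by the requests session object after the first request. The sketch below is only an illustration: `session_id` is a hypothetical helper, and `PHPSESSID` is PHP's default session cookie name, assumed here because the PHP configuration is not shown.

```python
import requests


def session_id(session, cookie_name="PHPSESSID"):
    """Return the stored session cookie value, or None if there is none.

    PHPSESSID is PHP's default session cookie name; the real server may be
    configured to use a different one (assumption).
    """
    return session.cookies.get(cookie_name)


if __name__ == "__main__":
    p = requests.Session()
    p.get("http://localhost/Test/Capatcha.php")
    # None here means no cookie with that name was stored for reuse.
    print("Session cookie after GET:", session_id(p))
```

If a value is printed, the same `p` object should send that cookie back on the later POST, which would point to a cause other than a lost session.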


    'Tesseract' (OCR engine) will automatically save it as a text file.



    Here is some PHP code that does the same thing I am trying to do.
    Code (Text):

    <?php
    include("simple_html_dom.php");
    $tmp_fname = tempnam("data", "COOKIE");

    $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
    $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    $header[] = "Cache-Control: max-age=0";
    $header[] = "Connection: keep-alive";
    $header[] = "Keep-Alive: 300";
    $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $header[] = "Accept-Language: en-us,en;q=0.5";
    $header[] = "Pragma: ";

    $curl_handle = curl_init("http://localhost/Test/Capatcha.php");
    curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0');
    curl_setopt($curl_handle, CURLOPT_HTTPHEADER, $header);
    curl_setopt($curl_handle, CURLOPT_REFERER, 'http://localhost/');
    curl_setopt($curl_handle, CURLOPT_COOKIEJAR, $tmp_fname);
    curl_setopt($curl_handle, CURLOPT_COOKIEFILE, $tmp_fname);
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($curl_handle);

    file_put_contents("data/a.png", $output);
    $path = "C:/xampp/htdocs/edi/";
    try
    {
       exec($path."Tesseract-OCR/tesseract.exe ".$path."data/a.png ".$path."data/a", $msg);
       $capture = trim(file_get_contents("data/a.txt"));
       echo $capture;
       unlink("data/a.txt");
    }
    catch (Exception $e)
    {
       echo $e;
    }

    $curl_handle = curl_init("http://localhost/Test/index.php");
    curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0');
    curl_setopt($curl_handle, CURLOPT_HTTPHEADER, $header);
    curl_setopt($curl_handle, CURLOPT_REFERER, 'http://localhost/');
    curl_setopt($curl_handle, CURLOPT_COOKIEFILE, $tmp_fname);
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl_handle, CURLOPT_POSTFIELDS, array("captcha"=>$capture, "number"=>$_REQUEST['q'], "submit"=>""));
    $output = curl_exec($curl_handle);

    $html = str_get_html($output);

    ?>
     
     
    Last edited by a moderator: May 6, 2017
  5. Aug 29, 2014 #4

    adjacent

    Gold Member

    I have solved this by putting everything inside a "with requests.session() as s:" block.
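
A minimal sketch of what that might look like, assuming the same URLs and form fields as the original script. The helper names `ocr_image` and `submit_form` are hypothetical, and the `.strip()` on the OCR text is an addition that mirrors the `trim()` call in the PHP version; it is not confirmed to be part of the actual fix.

```python
import subprocess
from pathlib import Path

import requests

CAPTCHA_URL = "http://localhost/Test/Capatcha.php"
FORM_URL = "http://localhost/Test/index.php"


def ocr_image(image_path):
    """Run Tesseract on image_path and return the recognised text, stripped."""
    out_base = image_path.with_suffix("")  # Tesseract appends ".txt" itself
    subprocess.run(["tesseract", str(image_path), str(out_base)], check=True)
    return out_base.with_suffix(".txt").read_text().strip()


def submit_form(session, number, read_captcha=ocr_image, data_dir=Path("data")):
    """Fetch the captcha and post the form through the same session object."""
    image = session.get(CAPTCHA_URL)
    image.raise_for_status()
    data_dir.mkdir(parents=True, exist_ok=True)
    image_path = data_dir / "a.png"
    image_path.write_bytes(image.content)

    # Field names 'Code' and 'q' are taken from the original Python script.
    payload = {"Code": read_captcha(image_path), "q": number}
    reply = session.post(FORM_URL, data=payload)
    reply.raise_for_status()
    return reply.text


if __name__ == "__main__":
    # The with-block closes the session once both requests are done.
    with requests.Session() as s:
        html = submit_form(s, input("Please enter the number: "))
    Path("data/log.html").write_text(html, encoding="utf-8")
```

The same session object makes both the GET and the POST, so any cookie set by Capatcha.php is sent back to index.php.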

    Thanks anyway. :D
     