Why Does My Python URL Extraction Code Not Work?

Python URL extraction code can fail for several reasons. The code itself may contain errors, such as incorrect syntax or missing modules. The website or URL may be at fault, for example a broken link or a redirect. The code may also lack error handling, so that it fails when it meets unexpected data. Debugging the code and checking the URL and website for problems can help identify and resolve the issue.
  • #1
doktorwho
Thread moved from a technical forum, so homework template is missing
I am supposed to write code that prints out the URL of the link given below. The URL is defined to start where the first " appears and end where the " that closes the URL appears, counting from start_link. It's actually the last project from Lecture 1 in Udacity and my first code... but it's wrong, haha.
Here it goes:
# Write Python code that assigns to the
# variable url a string that is the value
# of the first URL that appears in a link
# tag in the string page.
# Your code should print http://udacity.com
# Make sure that if page were changed to

# page = '<a href="http://udacity.com">Hello world</a>'

# that your code still assigns the same value to the variable 'url',
# and therefore still prints the same thing.

# page = contents of a web page
page =('<div id="top_bin"><div id="top_content" class="width960">'
'<div class="udacity float-left"><a href="http://udacity.com">')

start_link = page.find('<a href=')
new_page=page[start_link:]
num_ofstart=new_page.find(' " ')
new_page1=new_page[(num_ofstart+1):]
num_ofend=new_page1.find(' " ')
url=new_page[(num_ofstart):(num_ofend)]
print(url)
It prints out only "http://ud
What's wrong?
 
  • #2
doktorwho said:
num_ofend=new_page1.find(' " ')

url=new_page[(num_ofstart):(num_ofend)]
In between these two lines add code to print out the values of new_page, num_ofstart, and num_ofend to make sure that you and your code are operating in sync.
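A minimal sketch of that check, using the code exactly as it appears in the first post. The print labels and the use of repr() are illustrative choices, not part of the original; repr() simply makes quote characters and stray spaces visible.

```python
page = ('<div id="top_bin"><div id="top_content" class="width960">'
        '<div class="udacity float-left"><a href="http://udacity.com">')

start_link = page.find('<a href=')
new_page = page[start_link:]
num_ofstart = new_page.find(' " ')
new_page1 = new_page[(num_ofstart + 1):]
num_ofend = new_page1.find(' " ')

# Inspect every intermediate value before it is used for slicing.
print('new_page    =', repr(new_page))
print('new_page1   =', repr(new_page1))
print('num_ofstart =', num_ofstart)
print('num_ofend   =', num_ofend)
```

If either index is not what you expected, the slice that builds url cannot be right either.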
 
  • #3
In addition to what @NascentOxygen said, you have a problem with these two lines of code:
Python:
num_ofstart=new_page.find(' " ')
.
.
.
num_ofend=new_page1.find(' " ')
In each case, the character you should be searching for is ". What you are actually doing is searching for <space>"<space>. In other words, in the argument to the find() function, you have extra space characters before and after the double-quote. The string you're searching in doesn't contain a substring of <space>"<space>, so both calls to find() are returning -1.
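To illustrate, here is a short sketch that compares the two searches on the link text from the first post. It also shows a second thing to watch, which is an observation added here and not a point made in the thread: an index found in the shortened string new_page1 is relative to new_page1, so it should be used to slice new_page1, not new_page. Treat this as one possible fix under those assumptions.

```python
new_page = '<a href="http://udacity.com">'

# With padding spaces nothing matches, and str.find reports that as -1.
print(new_page.find(' " '))   # -1
# Without the spaces, the opening quote is found.
print(new_page.find('"'))     # 8

num_ofstart = new_page.find('"')
new_page1 = new_page[num_ofstart + 1:]   # text after the opening quote
num_ofend = new_page1.find('"')          # index relative to new_page1
url = new_page1[:num_ofend]              # slice the same string the index came from
print(url)                               # http://udacity.com
```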

One more thing - when you post code, especially Python code, surround your code with code tags.
What I did above looks like this:
[code=python]
num_ofstart=new_page.find(' " ')
.
.
.
num_ofend=new_page1.find(' " ')[/code]
 
  • #4

1. What is the purpose of my code?

The purpose of your code should be clearly defined before you start writing it. This will help you stay focused and ensure that your code is solving the intended problem.

2. Why is my code not working?

There can be several reasons why your code is not working. Some common issues include syntax errors, incorrect variable names, or missing punctuation. Make sure to carefully review your code and use debugging tools to identify and fix any errors.

3. How can I make my code more efficient?

Improving code efficiency can often involve refactoring, which is the process of restructuring existing code to make it more organized and streamlined. This can include removing redundant code, optimizing loops, or using built-in functions instead of writing custom code.

4. What is the best way to document my code?

Documenting your code is important for others to understand and use it. The best way to document your code is to use clear and concise comments throughout, explaining the purpose of each section and any complex logic. In Python, docstrings serve this purpose for functions and modules. You can also use documentation tools such as Sphinx or Doxygen.

5. How can I improve the readability of my code?

Readability is important for maintaining and updating code in the future. Some ways to improve readability include using proper indentation and spacing, using meaningful variable names, and breaking down complex logic into smaller functions or methods.
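As a rough sketch of how these points could apply to the URL exercise in this thread, the search can be wrapped in one small, documented function. The function name extract_first_url and the choice to return None when no link is found are assumptions made for illustration; the optional start argument of str.find is a built-in feature that removes the need for intermediate slices.

```python
def extract_first_url(page):
    """Return the first quoted href value in page, or None if there is none."""
    start_link = page.find('<a href=')
    if start_link == -1:
        return None
    # Search for the opening quote from the link tag onward.
    start_quote = page.find('"', start_link)
    if start_quote == -1:
        return None
    # Search for the closing quote just past the opening one.
    end_quote = page.find('"', start_quote + 1)
    if end_quote == -1:
        return None
    return page[start_quote + 1:end_quote]


page = '<a href="http://udacity.com">Hello world</a>'
print(extract_first_url(page))
```

Because every index refers to the same string, page, there is no risk of mixing up positions from two different slices.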
