Paul Colby

Perhaps it's worth asking what the SFrame-esque ML library for Python 3 is. Sometimes names change to protect the innocent; other times newer options have sprung up. Data mining the web is a going concern, so it seems odd that no Python 3 options are included.

P.S. Are you using pip? Have you looked at virtual environments? Both are essential IMO for Python development (not that I've done much, but I know people who have).


Any suggestions on web scraping? I am kind of worried about the associated legal issues. Should I stick to using APIs?

Paul Colby

Any suggestions on web scraping? I am kind of worried about the associated legal issues.
If you're just scraping sites that are publicly visible anyway, I don't see why there would be any legal issues; everything on the page is there for the public to see. The site might throttle you if it detects that you're requesting too much data in too short a time, but that's a technical issue, not a legal issue; you just need to scrape at a slow enough rate, the same way the web crawlers for Google and other search engines do.
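To make the "scrape at a slow enough rate" point concrete, here is a minimal sketch using only the Python 3 standard library. The names `Throttle` and `fetch_all`, the 5-second default interval, and the user-agent string are all made up for illustration; a suitable delay depends on the site, so treat the numbers as placeholders rather than recommendations.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable so the logic can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Sleep just long enough that min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, min_interval=5.0, user_agent="example-scraper"):
    """Fetch each URL in turn, pausing between requests. Returns {url: bytes}."""
    throttle = Throttle(min_interval)
    pages = {}
    for url in urls:
        throttle.wait()
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(request, timeout=30) as response:
            pages[url] = response.read()
    return pages
```

Calling `fetch_all(["https://example.com/a", "https://example.com/b"])` would fetch the two pages at least five seconds apart. The delay is measured from the previous request rather than added as a fixed sleep, so time spent downloading or parsing counts toward the interval.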

