Using urllib2 with SOCKS proxy

Is it possible to fetch pages with urllib2 through a SOCKS proxy on a one-SOCKS-server-per-opener basis? I’ve seen the solution using the setdefaultproxy method, but I need to use different SOCKS proxies in different openers.

There is the SocksiPy library, which works great, but it has to be used this way:

import socks
import socket
socket.socket = socks.socksocket
import urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)

That is, it sets the same proxy for ALL urllib2 requests. How can I have different proxies for different openers?


Answers

== EDIT == (an old HTTP-proxy example was here..)

My fault: urllib2 has no built-in support for SOCKS proxying.

There are some ‘hacks’ that add SOCKS support to urllib2 (or to the socket object in general) here, but I doubt they will work with multiple proxies the way you require.

As long as you don’t want to hook into or subclass urllib2.ProxyHandler, I would suggest going with pycurl.

All openers share the same socket module, and SocksiPy works by patching at the socket level, so you can’t give each opener its own SOCKS proxy that way.
I suggest you use the pycurl library instead; it is much more flexible.

Try with pycurl:

import pycurl

# First handle: fetch through the SOCKS5 proxy listening on port 8080
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

# Second handle: fetch through a different SOCKS5 proxy on port 8081
c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c1.perform()
c2.perform()
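
If you also want the response body in a string instead of having pycurl write it to stdout, you can add a write callback to the same setup. A minimal sketch (the URL and proxy values are just examples):

import pycurl
from StringIO import StringIO

buf = StringIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'http://www.google.com')
c.setopt(pycurl.PROXY, 'localhost')
c.setopt(pycurl.PROXYPORT, 8080)
c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)
c.setopt(pycurl.WRITEFUNCTION, buf.write)  # collect the body instead of printing it
c.perform()
print buf.getvalue()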

You might be able to use threading locks if there aren’t too many connections being made at once and you need access from multiple threads:

import socks
import socket
import thread
import urllib2

lock = thread.allocate_lock()
socket.socket = socks.socksocket  # monkey-patch: every new socket is a SocksiPy socksocket

def GetConn():
    # Hold the lock while the global default proxy is changed and the
    # connection is established, so concurrent threads don't interfere.
    lock.acquire()
    try:
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)
        conn = urllib2.urlopen(ARGUMENTS HERE)
    finally:
        lock.release()
    return conn

You might also be able to use something like this every time you need to get a connection:

import imp
# load a private copy of the urllib2 module so its socket reference can be swapped out
urllib2 = imp.load_module('urllib2_private', *imp.find_module('urllib2'))
urllib2.socket = dummy_class()  # dummy_class needs the socket module's methods

These are obviously not fantastic solutions, but I’ve put in my 2¢ anyway 🙂

You could do it by setting the HTTP_PROXY environment variable in the following format:

user:password@proxyhost:port

or, if you use a bat/cmd script, add this before calling your script:

set HTTP_PROXY=user:password@proxyhost:port

I use a cmd file like this to make easy_install work behind a proxy.
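
For what it’s worth, urllib2 normally picks up HTTP_PROXY on its own, but you can also build an opener from the same value explicitly. A small sketch, where the fallback proxy address and credentials are made up:

import os
import urllib2

# fall back to a made-up example value if HTTP_PROXY isn't set
proxy = os.environ.get('HTTP_PROXY', 'http://user:password@proxy.example.com:8080')
opener = urllib2.build_opener(urllib2.ProxyHandler({'http': proxy}))
print opener.open('http://www.google.com').read()[:200]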

A cumbersome but working solution for using a SOCKS proxy is to set up Privoxy with proxy chaining, so that it forwards HTTP requests to the SOCKS proxy, and then point HTTP_PROXY at Privoxy via a system variable or any other way.
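
For example, the chaining step might look like this in Privoxy’s config file (assuming the SOCKS proxy is Tor listening locally on port 9050; adjust to your setup):

forward-socks5 / 127.0.0.1:9050 .

and then, with Privoxy on its default port 8118:

set HTTP_PROXY=http://127.0.0.1:8118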

Yes, you can. I’ll repeat my answer from How can I use a SOCKS 4/5 proxy with urllib2?: you need to create an opener for every proxy, just as you would with an HTTP proxy. The code that adds this feature on top of SocksiPy is available as a gist at https://gist.github.com/869791 and is as simple as:

opener = urllib2.build_opener(SocksiPyHandler(socks.PROXY_TYPE_SOCKS4, 'localhost', 9999))
print opener.open('http://www.whatismyip.com/automation/n09230945.asp').read()

For more information, I’ve written an example that runs multiple Tor instances to behave like a rotating proxy: Distributed Scraping With Multiple Tor Circuits
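
In case the gist link ever goes away: the handler boils down to roughly the following sketch, built on SocksiPy’s socksocket.setproxy API (the real gist may differ in details):

import socks
import httplib
import urllib2

class SocksiPyConnection(httplib.HTTPConnection):
    def __init__(self, proxytype, proxyaddr, proxyport=None, rdns=True,
                 username=None, password=None, *args, **kwargs):
        self.proxyargs = (proxytype, proxyaddr, proxyport, rdns, username, password)
        httplib.HTTPConnection.__init__(self, *args, **kwargs)

    def connect(self):
        # Each connection builds its own socksocket, so every opener can point
        # at a different SOCKS server without touching the global default proxy.
        self.sock = socks.socksocket()
        self.sock.setproxy(*self.proxyargs)
        self.sock.connect((self.host, self.port))

class SocksiPyHandler(urllib2.HTTPHandler):
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kw = kwargs
        urllib2.HTTPHandler.__init__(self)

    def http_open(self, req):
        def build(host, port=None, strict=None, timeout=0):
            return SocksiPyConnection(*self.args, host=host, port=port,
                                      strict=strict, timeout=timeout, **self.kw)
        return self.do_open(build, req)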