I am trying to send Scrapy requests through a proxy that requires authorization. I updated the process_request method of the default middleware (middleware.py).
I tried several ways to achieve it, but every time I get the following error message: ERROR: Gave up retrying <GET https://api.ipify.org/> (failed 3 times): Could not open CONNECT tunnel with proxy proxy_ip:proxy_port [{'status': 407, 'reason': b'Proxy Authentication Required'}]
Here is what I tried:
def process_request(self, request, spider):
    request.meta['proxy'] = 'http://proxy_ip:proxy_port'
    proxy_user_pass = "username:password"
    encoded_user_pass = base64.encodestring(proxy_user_pass.encode()).decode()
    request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
    return None
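One thing worth checking in the snippet above: base64.encodestring (deprecated, and removed in Python 3.9) appends a trailing newline to its output, which can corrupt the Proxy-Authorization header value and produce a 407 even with correct credentials. A minimal sketch of the difference, using placeholder credentials:

```python
import base64

# Hypothetical credentials for illustration only.
proxy_user_pass = "username:password"

# base64.b64encode returns a single clean token with no trailing
# newline, unlike the legacy base64.encodestring.
encoded = base64.b64encode(proxy_user_pass.encode()).decode()
header_value = "Basic " + encoded
print(header_value)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

The resulting header value contains no embedded whitespace, so it can be assigned directly to request.headers['Proxy-Authorization'].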
I also tried other ways of encoding the header, such as:
From: https://www.zyte.com/blog/scrapy-proxy/
request.headers['Proxy-Authorization'] = basic_auth_header("username", "password")
From: https://github.com/aivarsk/scrapy-proxies/blob/master/scrapy_proxies/randomproxy.py
encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
From: Scrapy cookies not working when sending Proxy-Authorization header
request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass.strip()
The username/password combination has been tested and works properly. If I whitelist my current IP (no authorization required), I can send a request using only request.meta['proxy'] = 'http://proxy_ip:proxy_port'. However, this is not a solution, as I do not control the IP from which the request is sent.
Any idea what goes wrong with my request?
source https://stackoverflow.com/questions/73561410/scrapy-setting-authorization-header-for-proxy-in-middleware