
How to use proxies with the Requests library
Lately a lot of friends doing data collection have been asking how to use a proxy with Python's requests library without getting blocked. It is simple enough, but there are a few pitfalls that deserve special attention. Let's take ipipgo's proxy service as an example; read to the end and you'll be able to do it yourself.
Basic proxy configuration (don't underestimate this step)
Many newcomers trip up on proxy configuration, yet the core is really just three lines of code. Taking an HTTP proxy as an example:
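import requests

# Replace username, password and the port with the values from your own account
proxies = {
    'http': 'http://username:password@gateway.ipipgo.com:9020',
    'https': 'http://username:password@gateway.ipipgo.com:9020',
}
response = requests.get('destination URL', proxies=proxies)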
Here's one big pitfall: the proxy server is reached over the http channel, which is why the 'https' entry above also uses an http:// proxy URL. ipipgo's proxy ports vary by package, so remember to check the current port number in the dashboard after purchase.
Dynamic proxies are king
A single proxy gets blocked easily, so you need to rotate through an IP pool. Let's use ipipgo's dynamic forwarding service as an example:
import requests
from random import choice

# Gateway ports to rotate through (check the dashboard for the ports in your package)
proxy_list = [
    'gateway.ipipgo.com:9020',
    'gateway.ipipgo.com:9021',
    'gateway.ipipgo.com:9022',
]

def get_with_retry(url):
    # Try up to three times, picking a random gateway for each attempt
    for _ in range(3):
        try:
            proxy = f'http://username:password@{choice(proxy_list)}'
            return requests.get(url, proxies={'http': proxy, 'https': proxy}, timeout=8)
        except requests.RequestException:
            continue
    return None
Pay close attention to the timeout setting: somewhere between 8 and 15 seconds is recommended. ipipgo's response time is around 200 ms, but if the timeout is set too short it is easy to misjudge a working proxy as a failed one. Their dynamic IP pool automatically changes the IP on every request, which suits scenarios that need high-frequency rotation.
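As a variation on the retry loop above, here is a minimal sketch that walks the gateways in round-robin order instead of picking at random, so consecutive retries never land on the same port. The function name `get_with_rotation`, the injected `fetch` parameter, and the `(8, 15)` connect/read timeout split are assumptions made for illustration; requests does accept a `(connect, read)` tuple for `timeout`, but the right values depend on your target site.

```python
from itertools import cycle


def get_with_rotation(url, gateways, fetch, attempts=3, timeout=(8, 15)):
    """Try `url` through successive gateways; return the first response or None.

    `fetch` is any callable with the same keyword arguments as requests.get.
    Injecting it is an illustrative choice that keeps the rotation logic
    testable without network access.
    """
    pool = cycle(gateways)  # round-robin over the gateway list
    for _ in range(attempts):
        # username/password are placeholders, as in the article's examples
        proxy = f"http://username:password@{next(pool)}"
        try:
            return fetch(url, proxies={"http": proxy, "https": proxy}, timeout=timeout)
        except OSError:
            # requests.RequestException derives from IOError (OSError),
            # so connection errors and timeouts are caught here
            continue
    return None
```

In real use you would pass `fetch=requests.get`, for example `get_with_rotation(url, proxy_list, fetch=requests.get)`.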
A practical guide to avoiding pitfalls
A few hard-won lessons:
| Symptom | Solution |
|---|---|
| 407 error returned | Check whether the username or password contains special characters; URL-encoding them is recommended |
| Frequent connection timeouts | Contact ipipgo customer service to check the node status; don't tinker with it yourself! |
| Lag or stalling | Try switching proxy protocols (e.g. http to socks5) |
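For the 407 row, a minimal sketch of URL-encoding the credentials with the standard library's `urllib.parse.quote` before they go into the proxy URL. The helper name `build_proxy_url` is made up for illustration, and the sample password is invented.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with the credentials percent-encoded.

    safe="" makes quote() encode every reserved character, including
    '@', ':' and '/', which would otherwise break URL parsing.
    """
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Hypothetical credentials containing characters that need encoding
proxy = build_proxy_url("user", "p@ss:w/rd", "gateway.ipipgo.com", 9020)
proxies = {"http": proxy, "https": proxy}
```

The resulting `proxies` dictionary can be passed to `requests.get` exactly as in the basic configuration.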
Recently I found that some people use requests' Session object without ever closing their connections, and it ended up crashing the proxy server. Remember to add response.close()!
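One way to make sure nothing is left open is to let context managers do the closing. The sketch below assumes a hypothetical helper `fetch_all`; both `requests.Session` and the response object support `with`, which releases connections even when an exception is raised. The proxies are passed on each request rather than stored on the session, since the requests documentation notes that session-level proxies can be overridden by proxy settings in the environment.

```python
import requests


def fetch_all(urls, proxies, timeout=10):
    """Fetch each URL through the given proxies and return the status codes."""
    statuses = []
    # The session is closed when the outer block exits
    with requests.Session() as session:
        for url in urls:
            # Each response is closed when its inner block exits,
            # which is equivalent to calling response.close()
            with session.get(url, proxies=proxies, timeout=timeout) as response:
                statuses.append(response.status_code)
    return statuses
```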
QA time
Q: Do I need to install drivers locally to use ipipgo's proxy?
A: Not at all! Their proxies use the standard http protocol; just fill in the proxies parameter and it will work.
Q: Why does my proxy pass the test but the actual collection fail?
A: The target website may be doing fingerprint detection. Try adding a 'User-Agent' to the request headers, or contact ipipgo about a high-anonymity package.
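Q: How do I choose nodes for overseas proxies?
A: Select "Smart Routing" in the ipipgo dashboard and it will automatically match the lowest-latency node. Personally tested and effective for cross-border e-commerce collection.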
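Regarding the fingerprint-detection answer above, here is a minimal sketch of setting a 'User-Agent' once on a session so that every request carries it. The helper name `make_session` and the browser string are illustrative assumptions; a User-Agent alone may not be enough against stricter detection.

```python
import requests

# Example browser-style User-Agent string (an assumption; use one that suits your case)
EXAMPLE_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def make_session(user_agent=EXAMPLE_UA):
    """Return a Session whose requests all send the given User-Agent header."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session
```

Typical use would be `with make_session() as session: session.get(url, proxies=proxies, timeout=8)`.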
A few words from the heart
With proxies, it's three parts technology and seven parts service. I used a few cheap providers before, and their IPs failed at the drop of a hat. Later I switched to ipipgo's exclusive package and used their API to fetch proxies dynamically, and collection efficiency doubled outright. Their automatic rejection of anomalous IPs in particular saves a great deal of manual maintenance time.
One last reminder: don't hard-code proxy configuration in your code! It is better to store account information in environment variables. If the code ever ends up on GitHub, it will be too late for tears. Well, I've nagged through everything that needed saying and plenty that didn't; if you run into problems, contact ipipgo customer service, which will get you further than asking me~
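A minimal sketch of that last tip, reading the credentials from environment variables instead of the source file. The variable names `IPIPGO_USER`, `IPIPGO_PASS` and `IPIPGO_GATEWAY` are invented for this example, and the default gateway is simply the one used earlier in this article.

```python
import os
from urllib.parse import quote


def proxies_from_env(env=os.environ):
    """Build a requests-style proxies dict from environment variables.

    Raises KeyError if the credentials are missing, so a misconfigured
    environment fails loudly instead of sending requests without a proxy.
    """
    user = quote(env["IPIPGO_USER"], safe="")      # hypothetical variable name
    password = quote(env["IPIPGO_PASS"], safe="")  # hypothetical variable name
    gateway = env.get("IPIPGO_GATEWAY", "gateway.ipipgo.com:9020")
    proxy = f"http://{user}:{password}@{gateway}"
    return {"http": proxy, "https": proxy}
```

Set the variables in your shell or deployment configuration, then call `requests.get(url, proxies=proxies_from_env())`.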

