
A hands-on guide to tucking proxy IPs into Python's pocket
Veteran crawler writers know that the Requests library is like the key that starts the excavator, but without proxy IP support, a site's security (its anti-crawling mechanism) will catch you within minutes. Today let's talk about how to tuck ipipgo proxy IPs into Python's pocket.
import requests

# The right way to set up a proxy IP
proxies = {
    "http": "http://user:password@gateway.ipipgo.com:9020",
    "https": "http://user:password@gateway.ipipgo.com:9020",
}
# Placeholder URL: replace it with the target website
response = requests.get("https://example.com", proxies=proxies)
Pay attention here: replace the username and password with the credentials you got from ipipgo, and don't copy the port number from this example, because each package is assigned a different port. I fell into this trap last time: I copied the port straight from the documentation and spent half an hour debugging blind.
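To keep real credentials out of the script, here is a minimal sketch that builds the same proxies dict from environment variables. The variable names `IPIPGO_USER`, `IPIPGO_PASSWORD` and `IPIPGO_PORT` and the helper `build_proxies` are made up for this example, not something ipipgo defines. It also percent-encodes the credentials, which matters if your password contains characters like `@` or `:`.

```python
import os
from urllib.parse import quote


def build_proxies(user, password, host, port, scheme="http"):
    """Return a requests-style proxies dict for one authenticated gateway."""
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    proxy_url = f"{scheme}://{auth}@{host}:{port}"
    # The same proxy URL serves both plain-HTTP and HTTPS targets.
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical environment variable names; the fallbacks are the article's placeholders.
proxies = build_proxies(
    user=os.environ.get("IPIPGO_USER", "user"),
    password=os.environ.get("IPIPGO_PASSWORD", "password"),
    host="gateway.ipipgo.com",
    port=int(os.environ.get("IPIPGO_PORT", "9020")),
)
```

The resulting `proxies` can be passed to `requests.get(..., proxies=proxies)` exactly like the hand-written dict.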
Setting up a SOCKS5 proxy
Some special scenarios require the SOCKS5 protocol, and for that you have to bolt a small motor onto Requests. First, install the dependency:
pip install "requests[socks]"
The configuration is slightly different:
proxies = {
    "http": "socks5://user:password@gateway.ipipgo.com:9021",
    "https": "socks5://user:password@gateway.ipipgo.com:9021",
}
Here's a pitfall: ipipgo's SOCKS port and HTTP port are separate, so don't mix them up. I once swapped 9020 and 9021, and the program stalled like a broken-down tractor.
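One way to avoid swapping the two ports is to keep the protocol-to-port pairing in a single place. The sketch below assumes the example ports used in this article (9020 for HTTP, 9021 for SOCKS5); your own package may use different ones. The helper `proxies_for` is hypothetical. The `socks5h` scheme is a Requests feature that asks the proxy to resolve hostnames instead of resolving them locally.

```python
# Example ports taken from the snippets above; your package may use others.
PORTS = {"http": 9020, "socks5": 9021, "socks5h": 9021}


def proxies_for(protocol, user, password, host="gateway.ipipgo.com"):
    """Build a requests-style proxies dict, pairing each protocol with its port."""
    if protocol not in PORTS:
        raise ValueError(f"unsupported protocol: {protocol!r}")
    url = f"{protocol}://{user}:{password}@{host}:{PORTS[protocol]}"
    return {"http": url, "https": url}


socks_proxies = proxies_for("socks5", "user", "password")
```

Passing a `timeout=` argument to `requests.get` along with these proxies also keeps a wrong port from hanging the program indefinitely.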
Guerrilla tactics with dynamic IPs
If you're using dynamic residential proxies, remember to add a random switching mechanism:
import random
import requests

def get_random_proxy():
    # Gateways to rotate through; replace them with the endpoints from your own package
    proxy_list = [
        "http://user:password@gateway2.ipipgo.com:9020",
        "http://user:password@gateway3.ipipgo.com:9020",
    ]
    # Pick once so HTTP and HTTPS traffic use the same proxy for this request
    proxy = random.choice(proxy_list)
    return {"http": proxy, "https": proxy}

url = "https://example.com"  # placeholder target
response = requests.get(url, proxies=get_random_proxy())
This way every request goes out wearing different armor, and the anti-crawling system has a much harder time recognizing you. But note that ipipgo's dynamic packages are billed by traffic, so don't accidentally write an infinite loop and burn through the whole package.
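Since traffic is billed, retries are safer with a hard cap. Here is a minimal sketch of random rotation with a bounded number of attempts. The function `fetch_with_rotation` and its `fetch` parameter are invented for this example. The request itself is passed in as a callable so the retry logic stays independent of how you call Requests.

```python
import random


def fetch_with_rotation(url, proxy_urls, fetch, max_attempts=3):
    """Try up to max_attempts times, picking a random proxy for each attempt.

    `fetch` is any callable taking (url, proxies) and returning a response,
    for example: lambda u, p: requests.get(u, proxies=p, timeout=10)
    """
    last_error = None
    for _ in range(max_attempts):  # hard cap, so a bad target cannot loop forever
        proxy = random.choice(proxy_urls)
        try:
            return fetch(url, {"http": proxy, "https": proxy})
        except Exception as exc:  # narrow to requests.RequestException in real use
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

When every attempt fails, the function raises instead of retrying again, so a dead target costs at most `max_attempts` requests' worth of traffic.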
Q&A first aid kit
Q: What can I do if the proxy won't connect?
A: Check the three basics first: ① the username and password are correct; ② the port number matches the protocol; ③ the local network allows outbound connections.
Q: What if it's as slow as a snail?
A: Try switching to the TK line, or change to a static residential IP. A poorly chosen regional node can also slow things down; for example, don't pick a European node when crawling an Asian site.
Q: What if I need to run multiple crawlers at the same time?
A: Use ipipgo's dedicated static package and assign each crawler its own IP so they don't fight over the IP pool.
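The last answer can be sketched as a simple one-to-one mapping from crawlers to static proxies. The crawler names and the `static-ip-*.example` endpoints below are placeholders made up for this example; substitute the static IPs actually issued to your account.

```python
def assign_proxies(crawler_names, static_proxy_urls):
    """Map each crawler to its own proxy; refuse to share one IP between two."""
    if len(crawler_names) > len(static_proxy_urls):
        raise ValueError("not enough static IPs for one per crawler")
    return {
        name: {"http": url, "https": url}
        for name, url in zip(crawler_names, static_proxy_urls)
    }


# Hypothetical static endpoints; replace them with the ones issued to your account.
STATIC_PROXIES = [
    "http://user:password@static-ip-1.example:9020",
    "http://user:password@static-ip-2.example:9020",
]
assignment = assign_proxies(["news_crawler", "price_crawler"], STATIC_PROXIES)
```

Each crawler then passes its own entry, such as `assignment["news_crawler"]`, as the `proxies` argument, so no two crawlers ever share an exit IP.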
ipipgo package selection guide
| Package type | Applicable scenarios | Price |
|---|---|---|
| Dynamic residential (Standard) | Daily data collection | 7.67 yuan/GB/month |
| Dynamic residential (Enterprise) | High-frequency access | 9.47 yuan/GB/month |
| Static residential | Long-term fixed operations | 35 yuan/IP/month |
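For a rough feel of the numbers, here is a small sketch of the arithmetic behind the table. It assumes the static package is billed only per IP with no traffic charge, which the table does not state, so treat the break-even figure as an estimate rather than a quote.

```python
# Prices copied from the table above, in yuan.
DYNAMIC_STANDARD_PER_GB = 7.67
STATIC_PER_IP = 35.0


def dynamic_cost(gigabytes, price_per_gb=DYNAMIC_STANDARD_PER_GB):
    """Monthly cost of a traffic-billed dynamic package."""
    return gigabytes * price_per_gb


def break_even_gb(price_per_gb=DYNAMIC_STANDARD_PER_GB, static_price=STATIC_PER_IP):
    """Traffic volume at which one static IP costs the same as dynamic traffic."""
    return static_price / price_per_gb
```

Under that assumption, a single static IP costs about the same as roughly 4.6 GB of standard dynamic traffic per month (35 / 7.67).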
Newcomers are advised to practice with the dynamic Standard edition first, then switch to the Enterprise edition once business volume picks up. Their API extraction feature really does come in handy: I wrote a script that refreshes IPs automatically, and it works well with crontab.
One last bit of nagging: for complex scenarios, go straight to customer service for a one-on-one plan; it saves more time than fumbling around on your own. Last time I had a cross-border collection job, the cross-border line they set up saved 30% on traffic costs.

