
Hands-On: Teaching Local JSON Some New Proxy Tricks
Anyone who has spent time writing crawlers has hit this situation: a script you worked hard on suddenly stops, and the logs show your IP has been blacklisted by the target site. If, at that moment, you have a batch of proxy IPs that are alive and kicking, you can relax and reach for a Red Bull. Today we'll use Python + JSON, a golden pairing, to show you how to combine local data handling with proxy IPs to keep your crawler flying.
I. Configuring the Proxy Pool with Local JSON
Let's start with a proxy_config.json file that lays out our proxy IPs in a clear way:
{
  "ipipgo_proxies": [
    "121.36.77.198:8000",
    "112.85.129.61:8000",
    "117.90.5.138:8000"
  ],
  "retry_times": 3,
  "timeout": 8
}
Note that we're using the quality proxies provided by ipipgo; their IP survival rate can reach 99%, much more reliable than random IPs picked up off free lists. The code to load the configuration looks like this:
import json

with open('proxy_config.json', encoding='utf-8') as f:
    config = json.load(f)
proxy_pool = config['ipipgo_proxies']
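If the config file might be missing or hand-edited into invalid JSON, a defensive loader keeps the crawler from crashing at startup. A minimal sketch (the function name and default values here are illustrative, not part of the article's original code):

```python
import json
import os

# Fallback values used when the file is absent or unparseable (illustrative)
DEFAULT_CONFIG = {"ipipgo_proxies": [], "retry_times": 3, "timeout": 8}

def load_config(path='proxy_config.json'):
    """Load the proxy config, falling back to sane defaults if the
    file is missing or contains invalid JSON."""
    if not os.path.exists(path):
        return dict(DEFAULT_CONFIG)
    try:
        with open(path, encoding='utf-8') as f:
            loaded = json.load(f)
    except json.JSONDecodeError:
        return dict(DEFAULT_CONFIG)
    # Merge onto the defaults so missing keys still get filled in
    config = dict(DEFAULT_CONFIG)
    config.update(loaded)
    return config
```

This way a half-written config during a manual edit degrades to defaults instead of a traceback.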
II. Dynamic IP Switching in Practice
With the proxy pool in place, we can do the random-switching trick. Here's a demonstration with the requests library:
import random
import requests

def get_with_proxy(url):
    for _ in range(config['retry_times']):
        # Pick a random proxy for each attempt
        proxy = random.choice(proxy_pool)
        try:
            response = requests.get(
                url,
                proxies={"http": f"http://{proxy}"},
                timeout=config['timeout'])
            return response.text
        except Exception:
            print(f"{proxy} is down, moving on to the next one!")
    return None
This routine is especially suited to long-running collection tasks. For example, if you monitor price fluctuations of goods, a fixed IP gets you recognized in minutes, but with ipipgo's dynamic IPs it's like wearing an invisibility cloak.
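The long-running collection pattern can be sketched as a simple polling loop. Here `fetch` stands in for the `get_with_proxy` function above (it's injected as a parameter so the loop is testable on its own), and the poll count and interval are illustrative:

```python
import time

def monitor(url, fetch, polls=3, interval=0):
    """Poll `url` a fixed number of times via the given fetch function,
    collecting the non-empty responses. The fetch function is expected
    to rotate proxies internally, like get_with_proxy above."""
    results = []
    for _ in range(polls):
        html = fetch(url)
        if html is not None:
            results.append(html)
        time.sleep(interval)  # pace requests so the target site isn't hammered
    return results
```

In a real monitoring job you'd set `interval` to minutes rather than seconds, and parse each response for the price field you care about.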
III. Tips for handling exceptions
Keep these three points in mind when dealing with common proxy problems:
| Symptom | Fix |
|---|---|
| Connection timeout | Increase the timeout to 8-10 seconds as needed |
| Authentication failure | Check that the proxy address is formatted correctly |
| Slow responses | Refresh the proxy pool promptly |
It's recommended to reload the configuration file automatically every 2 hours to keep the IPs fresh. ipipgo's API can fetch the latest IPs in real time and write them straight into the JSON.
IV. Practical Q&A
Q: What should I do if the JSON file is loaded and reports an encoding error?
A: Specify utf-8 with the encoding parameter: open('file.json', encoding='utf-8')
Q: What should I do if the proxy IP suddenly hangs?
A: Hurry over to the ipipgo official website and grab fresh IPs; their 24-hour customer service responds faster than a takeout delivery guy.
Q: How can I tell if a proxy is highly anonymous?
A: Use the test site http://httpbin.org/ip. If it returns the proxy IP instead of your local IP, the anonymity of ipipgo's proxy is holding up.
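That httpbin check can be automated: fetch your IP without a proxy, then through the proxy, and compare the two. A sketch (the function name is mine, and `get` is injectable so the logic can be exercised without a live network; a real check needs a working proxy):

```python
def is_anonymous(proxy, get=None, timeout=8):
    """Return True if httpbin reports the proxy's IP rather than our own.
    `proxy` is a "host:port" string. `get` defaults to requests.get but
    can be replaced, e.g. with a stub for offline testing."""
    if get is None:
        import requests  # the HTTP library used throughout this article
        get = requests.get
    # What the world sees when we connect directly
    real = get("http://httpbin.org/ip", timeout=timeout).json()["origin"]
    # What the world sees when we connect through the proxy
    proxied = get("http://httpbin.org/ip",
                  proxies={"http": f"http://{proxy}"},
                  timeout=timeout).json()["origin"]
    return proxied != real
```

If the two IPs match, the proxy is leaking your address and shouldn't be trusted for anonymity.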
V. Advanced Tips
Record proxy results to a JSON file for later analysis:
def log_proxy(proxy, status):
    with open('proxy_log.json', 'r+') as f:
        data = json.load(f)
        data[proxy] = status
        f.seek(0)
        json.dump(data, f, indent=2)
        f.truncate()  # drop leftover bytes if the new JSON is shorter
Analyze the log files regularly and kick the IPs that keep failing out of the proxy pool. Using ipipgo's dedicated IP package will save you a lot of grief: a single IP can handle an average of 50,000 requests a day, far more durable than a shared IP.
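Pruning the pool from the log can be as simple as dropping every proxy whose most recent status isn't "ok". A sketch, assuming the proxy-to-status mapping written by log_proxy above (the function name and the "ok"/"fail" status values are my convention):

```python
import json

def prune_pool(pool, log_path='proxy_log.json'):
    """Return a copy of `pool` without the proxies whose last logged
    status is not 'ok'. Proxies with no log entry are kept, since
    they haven't failed yet."""
    try:
        with open(log_path, encoding='utf-8') as f:
            log = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return list(pool)  # no usable log: keep the pool as-is
    return [p for p in pool if log.get(p, 'ok') == 'ok']
```

Run this on each config reload and the pool heals itself: dead IPs drop out, and fresh ones pulled from the API take their place.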
One last thing: don't chase rock-bottom proxy services. I once used a 9.9-a-month bargain-bin proxy where 8 out of 10 IPs were dead. ipipgo isn't the cheapest, but it wins on stability and peace of mind, and customer service responds in seconds when something goes wrong, which makes it the right fit for anyone running a serious project.

