
I. JSON and Proxy IPs: The Basics
Anyone who processes web data has run into JSON, a structure that nests like a Russian doll. For example, when you use ipipgo's API to extract proxy IPs, the server returns data in this format:
{
    "code": 200,
    "data": [
        {"ip": "1.1.1.1", "port": 8888},
        {"ip": "2.2.2.2", "port": 9999}
    ]
}
So the question is: how do you pull out the IP address and port you need? Many newcomers make the mistake of slicing the string directly, only to have the code crash as soon as the data format changes. The right approach is to use Python's built-in json module, a Swiss Army knife made for opening exactly this kind of parcel.
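As a minimal sketch of that approach, the snippet below parses a response shaped like the sample above and builds "ip:port" strings from it. The field names code, data, ip and port are taken from the sample; a real response may differ.

```python
import json

# Sample response in the shape shown above (field names come from the example).
raw = '{"code": 200, "data": [{"ip": "1.1.1.1", "port": 8888}, {"ip": "2.2.2.2", "port": 9999}]}'

payload = json.loads(raw)

# Build "ip:port" strings from every entry in the "data" list.
endpoints = [f"{item['ip']}:{item['port']}" for item in payload["data"]]
print(endpoints)
```

Because the data is parsed into ordinary dicts and lists, a change in whitespace or key order in the response does not break the extraction, which is exactly where string slicing falls over.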
II. Opening the Parcel, Step by Step
Let's look at a real scenario first: getting a list of proxy IPs through ipipgo's API. Suppose we've got the returned JSON data:
import json

# Simulated proxy IP data from ipipgo
response_text = '''
{
    "status": "success",
    "proxies": [
        {"host": "11.22.33.44", "port": 30001},
        {"host": "55.66.77.88", "port": 30002}
    ]
}
'''
data = json.loads(response_text)
print(data['proxies'][0]['host'])  # output: 11.22.33.44
Watch out for one pitfall here: the difference between json.loads() and json.load(). The former parses a string, the latter reads from a file object. A coworker of mine once mixed the two up and spent a whole afternoon debugging thin air...
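To make the difference concrete, here is a small sketch. io.StringIO stands in for an opened file, and the sample text is made up for illustration.

```python
import io
import json

text = '{"status": "success", "proxies": [{"host": "11.22.33.44", "port": 30001}]}'

# json.loads() takes a str (or bytes) that already holds the JSON text.
from_string = json.loads(text)

# json.load() takes a file-like object and calls its read() method.
# io.StringIO plays the role of an opened file here.
from_file = json.load(io.StringIO(text))

# Mixing them up fails fast: json.load() on a plain string raises
# AttributeError because str has no read() method.
mixup_error = None
try:
    json.load(text)
except AttributeError as exc:
    mixup_error = type(exc).__name__

print(from_string == from_file, mixup_error)
```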
III. Exception Handling: A Guide to Not Crashing
Three failure points you will run into often in practice:
| Error type | Fix |
|---|---|
| JSONDecodeError | Print the raw response text and confirm it is valid JSON before calling json.loads() |
| KeyError | Use the get() method instead of indexing the key directly |
| TypeError | Verify that the data types are what you expect |
Here is an example of defensive code:
try:
    # data is the dict parsed from the response earlier
    proxy_list = data.get('proxies', [])
    first_ip = proxy_list[0].get('host') if proxy_list else None
except Exception as e:
    print(f"Rollover! Reason for error: {e}")
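A slightly fuller sketch of the same idea covers all three error types from the table in one function. The name first_proxy and the host/port/proxies field names are illustrative assumptions based on the sample data, not part of any ipipgo SDK.

```python
import json


def first_proxy(response_text):
    """Return (host, port) of the first proxy, or None if anything is off."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as exc:
        # exc.lineno and exc.colno point at where parsing broke.
        print(f"Not valid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")
        return None
    except TypeError:
        # json.loads() only accepts str, bytes or bytearray.
        return None

    # Guard against unexpected shapes instead of letting TypeError escape.
    if not isinstance(data, dict):
        return None
    proxies = data.get("proxies")  # get() avoids KeyError on a missing key
    if not isinstance(proxies, list) or not proxies:
        return None
    first = proxies[0]
    if not isinstance(first, dict):
        return None

    host, port = first.get("host"), first.get("port")
    if host is None or port is None:
        return None
    return host, port
```

Returning None for every bad case keeps the caller simple: one check decides whether to use the proxy or request a fresh list.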
IV. Practical Proxy IP Tips
When working with ipipgo, it is recommended to include an Authorization field in the request headers. Here is a lesser-known detail: their API can return several protocol types at once, so remember to specify the protocol you want in the parameters.
import requests
headers = {
"Authorization": "Bearer your_api_key"
}
params = {
"protocol": "socks5",
"count": 5
}
response = requests.get("https://api.ipipgo.com/getproxy", headers=headers, params=params)
proxy_data = response.json()
Be sure to check the response status code, since network fluctuations can sometimes cause a request to fail. It is a good idea to build a retry mechanism into the code, like an airbag for your program.
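One way such an airbag might look is sketched below. The helper call_with_retry is hypothetical and deliberately independent of any HTTP library: fetch is any zero-argument callable that returns a (status_code, body) pair, so the retry logic can be tried out without touching the network.

```python
import time


def call_with_retry(fetch, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns (status_code, body) with status 200.

    fetch is a stand-in for the real HTTP request. Returns the body on
    success and raises RuntimeError if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            status, body = fetch()
            if status == 200:
                return body
            last_error = f"HTTP {status}"
        except OSError as exc:  # connection resets, timeouts and similar
            last_error = repr(exc)
        if attempt < attempts:
            sleep(delay * attempt)  # wait a little longer each round
    raise RuntimeError(f"gave up after {attempts} attempts: {last_error}")
```

With requests, fetch could be a small function that calls requests.get(..., timeout=10) and returns (response.status_code, response.json()). The exceptions raised by requests derive from OSError, so the except clause above should catch them as well.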
V. Frequently Asked Questions
Q: Why does my JSON extraction always fail?
A: Eight times out of ten the data itself is malformed. Print the raw data with print() first, then check it with an online JSON validator.
Q: Do ipipgo's proxy IPs need special handling?
A: Their API returns standard JSON, so just process it the usual way. Keep an eye on how long each IP remains valid; refreshing the list regularly is recommended.
Q: Which package is the best value?
A: For crawling work, the dynamic residential (standard) package is enough, and at 7.67 yuan/GB the price is affordable. Consider the static residential packages only if your business needs fixed IPs.
VI. Lessons Learned: Avoiding Pitfalls
A few final hard-won lessons:
1. Do not use eval() to parse JSON directly; it is a security risk.
2. When dealing with nested data, consider the jsonpath-ng library, which saves a lot of hassle.
3. Check ipipgo's API documentation regularly, because parameter formats are sometimes adjusted.
4. Remember to set a timeout when batch processing, so the program does not hang.
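To illustrate the first point, here is a small sketch of what goes wrong with eval(). The "hostile" payload is deliberately harmless and made up for this example; a real attacker could put any Python expression there.

```python
import json

# 1) Valid JSON is not always valid Python: true/false/null are unknown names.
valid_json = '{"host": "11.22.33.44", "alive": true}'
try:
    eval(valid_json)
    eval_outcome = "ok"
except NameError:
    eval_outcome = "NameError"
parsed = json.loads(valid_json)  # json.loads() maps true to Python's True

# 2) eval() runs whatever expression it is handed...
hostile_text = '"pwned-" + str(6 * 7)'
evaluated = eval(hostile_text)

# ...while json.loads() refuses anything that is not plain JSON.
try:
    json.loads(hostile_text)
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(eval_outcome, parsed["alive"], evaluated, rejected)
```

So eval() is both less capable on legitimate JSON and more dangerous on untrusted input, which is why json.loads() is the safer default.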
If your business needs a large number of proxy IPs, you can simply ask ipipgo's tech support for code samples. What they provide is far more reliable than the random snippets you find online; don't ask me how I know...

