
Parsing Proxy IP JSON Data with Python: A Hands-On Guide, Like Unpacking a Parcel
Anyone who writes crawlers has run into this: you finally get a response from the proxy IP API, and the JSON that comes back looks like a mess. Today we use ipipgo's API as an example and show how to handle this data as easily as unpacking a parcel.
    import requests

    # Demo using ipipgo's API (do not use as-is; replace the key with your own)
    proxy_api = "https://api.ipipgo.com/get?apikey=YOUR_API_KEY"
    response = requests.get(proxy_api)
    json_data = response.json()

    # Note: the result is a dictionary that contains a list of dictionaries
    print(json_data['data'][0]['ip'])
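Indexing straight into `json_data['data'][0]['ip']` raises an exception when the list is empty or a key is absent. Here is a minimal sketch of a more defensive lookup, assuming the response has the shape used above (a `data` list of objects with an `ip` field). The sample payload and the helper name `first_proxy_ip` are made up for illustration and are not real ipipgo output:

```python
# Hypothetical sample payload; the real response fields may differ.
sample = {"code": 0, "data": [{"ip": "203.0.113.10", "port": 8080}]}

def first_proxy_ip(payload):
    """Return the first proxy's IP, or None if the payload has no usable entry."""
    if not isinstance(payload, dict):
        return None
    entries = payload.get("data")
    if not isinstance(entries, list) or not entries:
        return None
    first = entries[0]
    return first.get("ip") if isinstance(first, dict) else None

print(first_proxy_ip(sample))        # 203.0.113.10
print(first_proxy_ip({"data": []}))  # None
```

Returning `None` instead of raising keeps the calling code simple: it only has to check one value before using the proxy.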
The Go-To Moves for Parsing the Data: Must-Know Tips for Beginners
First move: look at the packaging before you unpack. When you get the JSON, don't rush to parse it; print it with json.dumps() first to see the structure:
    import json

    print(json.dumps(json_data, indent=2))
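When a response holds hundreds of proxies, the full dump gets noisy. One possible alternative is to print only the keys and value types. The sketch below is an illustration, not part of any library: `describe` is a hypothetical helper, the sample payload is invented, and it assumes every list is homogeneous, so it inspects only the first element.

```python
def describe(value, depth=0, max_depth=3):
    """Return lines outlining the keys and value types of a parsed JSON value."""
    indent = "  " * depth
    if depth >= max_depth:
        return [f"{indent}..."]
    if isinstance(value, dict):
        lines = []
        for key, child in value.items():
            lines.append(f"{indent}{key}: {type(child).__name__}")
            if isinstance(child, (dict, list)):
                lines.extend(describe(child, depth + 1, max_depth))
        return lines
    if isinstance(value, list):
        if not value:
            return [f"{indent}(empty list)"]
        # Only the first element is inspected, assuming the list is homogeneous.
        lines = [f"{indent}[0]: {type(value[0]).__name__} ({len(value)} total)"]
        if isinstance(value[0], (dict, list)):
            lines.extend(describe(value[0], depth + 1, max_depth))
        return lines
    return [f"{indent}{type(value).__name__}"]

# Hypothetical sample payload; the real response fields may differ.
sample = {"code": 0, "data": [{"ip": "203.0.113.10", "port": 8080}]}
print("\n".join(describe(sample)))
```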
Second move: put exception handling in place. Network jitter and interface changes happen from time to time, so it is prudent to write it this way:
    try:
        proxies = [item['ip'] + ':' + str(item['port']) for item in json_data['data']]
    except KeyError as e:
        print(f"Field is missing: {e}")
    except TypeError:
        print("Returned data is not in the right format!")
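A list comprehension is all-or-nothing: one malformed entry raises and the whole batch is lost. Here is a hedged sketch of a variant that skips bad entries instead, assuming the same `data` / `ip` / `port` layout as above. `parse_proxies` and the sample payload are illustrative names and data, not ipipgo's:

```python
def parse_proxies(payload):
    """Build 'ip:port' strings, skipping entries that lack a field."""
    try:
        entries = payload["data"]
    except (KeyError, TypeError):
        # Missing 'data' key, or the payload is not a dict at all.
        return []
    if not isinstance(entries, list):
        return []
    proxies = []
    for item in entries:
        try:
            proxies.append(f"{item['ip']}:{item['port']}")
        except (KeyError, TypeError):
            continue  # malformed entry: skip it rather than lose the batch
    return proxies

# Hypothetical payload with one good entry and two broken ones.
sample = {"data": [{"ip": "203.0.113.10", "port": 8080}, {"ip": "203.0.113.11"}, None]}
print(parse_proxies(sample))  # ['203.0.113.10:8080']
```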
Practical Case: Adding Proxy IPs to Your Crawler Toolbox
Using ipipgo's dynamic residential proxy as an example, let's write a script that automatically updates the proxy pool:
    import requests

    def update_proxy_pool():
        # Dynamic residential interface (the Enterprise package is more stable)
        api_url = "https://api.ipipgo.com/dynamic?type=enterprise"
        try:
            res = requests.get(api_url, timeout=10)
            res.raise_for_status()
            return [f"{p['ip']}:{p['port']}" for p in res.json()['proxies']]
        except Exception as e:
            print(f"Update failed, maybe it was a network hiccup: {e}")
            return []
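The pool is a list of `ip:port` strings, while `requests` expects a `proxies` mapping from URL scheme to proxy URL. Below is a small sketch of the conversion, assuming the proxies speak plain HTTP (a SOCKS proxy would need a different scheme and extra dependencies). `to_requests_proxies` is a hypothetical helper name:

```python
def to_requests_proxies(proxy, scheme="http"):
    """Turn 'ip:port' into the mapping that requests accepts for `proxies=`."""
    proxy_url = f"{scheme}://{proxy}"
    # The same proxy URL is used for both plain HTTP and HTTPS targets.
    return {"http": proxy_url, "https": proxy_url}

print(to_requests_proxies("203.0.113.10:8080"))

# Usage (needs network access and a live proxy), where `pool` is a list of 'ip:port' strings:
# requests.get("https://example.com", proxies=to_requests_proxies(pool[0]), timeout=10)
```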
Package Selection Guide: Which One Is Right for You?
| Package Type | Applicable Scenario | Price |
|---|---|---|
| Dynamic Standard Edition | Daily data collection | 7.67 yuan/GB/month |
| Dynamic Enterprise Edition | High-concurrency requirements | 9.47 yuan/GB/month |
| Static Residential | Fixed IP required | $35/IP/month |
Frequently Asked Questions: First Aid Kit
Q: What should I do if my proxy IP suddenly fails?
A: ipipgo's dynamic IPs are refreshed automatically every 15 minutes by default, and it is best to pair them with a retry-on-error mechanism. The Enterprise Edition package supports a real-time refresh API.
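A retry-on-error mechanism can be as simple as rotating to the next proxy when a request fails. The sketch below shows one possible shape. It is not ipipgo's mechanism, and `fetch_with_retry` and `flaky_fetch` are hypothetical names. The `fetch` argument stands in for whatever performs the real request, for example a wrapper around `requests.get`.

```python
import time

def fetch_with_retry(fetch, proxies, retries=3, delay=0.0):
    """Call fetch(proxy) with successive proxies until one succeeds.

    `fetch` is any callable that takes an 'ip:port' string and raises on failure.
    """
    if not proxies:
        raise ValueError("proxy pool is empty")
    last_error = None
    for attempt in range(retries):
        proxy = proxies[attempt % len(proxies)]  # rotate through the pool
        try:
            return fetch(proxy)
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if delay:
                time.sleep(delay)
    raise RuntimeError(f"all {retries} attempts failed") from last_error

# Stand-in for a real request: proxies on port 1 are treated as dead.
def flaky_fetch(proxy):
    if proxy.endswith(":1"):
        raise ConnectionError("dead proxy")
    return f"ok via {proxy}"

print(fetch_with_retry(flaky_fetch, ["203.0.113.10:1", "203.0.113.11:8080"]))
# ok via 203.0.113.11:8080
```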
Q: The returned data always fails to parse. What should I do?
A: First check the data format with an online JSON validation tool. ipipgo's API documentation has a complete description of the response fields.
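The same check can be done locally with the standard `json` module, which reports where parsing stopped. Here is a minimal sketch with a hypothetical helper, `check_json`. In practice the input would be the raw response body, such as `response.text`.

```python
import json

def check_json(text):
    """Return (parsed, None) on success or (None, message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, f"{exc.msg} at line {exc.lineno}, column {exc.colno}"

print(check_json('{"data": []}'))
# An HTML error page is a common reason a "JSON" response fails to parse.
print(check_json('<html>Too Many Requests</html>'))
```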
Q: What can I do if the proxies are not fast enough?
A: You can specify a region parameter, for example to get only domestic proxy nodes. The Enterprise Edition package provides a dedicated high-speed channel.
Tips for Avoiding Pitfalls
1. Remember to run a connectivity test every time you get a new proxy IP.
2. Check the proxy protocol type when handling HTTPS requests.
3. Remember to manually release a static residential IP when you are done with it, otherwise it will continue to be billed.
4. For high-traffic projects, contact ipipgo customer service first to get a customized plan.
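The connectivity test from the first tip can be sketched as follows. This version uses only the standard library so it has no extra dependency; the same idea works with `requests.get(..., proxies=...)`. The test URL, the helper names `check_proxy` and `filter_alive`, and the assumption that the proxies speak plain HTTP are all illustrative choices.

```python
import http.client
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def check_proxy(proxy, test_url="https://example.com", timeout=5):
    """Return True if a request through the proxy gets an HTTP 200."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, timeouts and socket errors derive from OSError;
        # malformed proxy replies surface as HTTPException.
        return False

def filter_alive(proxies, check=check_proxy, workers=8):
    """Check proxies concurrently and keep the ones that pass."""
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, proxies))
    return [p for p, ok in zip(proxies, results) if ok]

# Usage (needs network access), where `pool` is a list of 'ip:port' strings:
# alive = filter_alive(pool)
```

Passing `check` as a parameter keeps the filtering logic testable without a network: any function that takes a proxy string and returns a boolean can be swapped in.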
One last nag: don't use free proxies just to save money. Data security aside, your script may simply die mid-run one day. I have used ipipgo's Enterprise Edition package for half a year, and its stability really is better than the several providers I used before. Their cross-border lines in particular work very well for overseas business.

