
When Proxy IPs Meet Python Dictionaries: How to Handle Them Without Breaking Things
If you capture proxy IP data with Python, you have certainly run into JSON. It is like a parcel: you know there is something good inside, but if you do not know how to open it, you can easily break the contents. This article takes a down-to-earth look at handling proxy IP data with Python dictionaries.
import json

# Assume this is the raw response from the ipipgo interface
proxy_data = '''
{
    "code": 0,
    "data": [
        {"ip": "112.95.235.86", "port": 8080, "protocol": "http"},
        {"ip": "120.79.169.139", "port": 8888, "protocol": "https"}
    ]
}
'''

# Step 1: unpack the package (parse the JSON text into a dictionary)
data_dict = json.loads(proxy_data)
Proxy IP Data Anatomy Guide
Don't rush to use the JSON data; first work out its structure. The data returned by ipipgo usually looks like this: the outer layer holds the status code and the payload, and the inner layer holds the actual IP list. Peel it back layer by layer, like an onion:
| Field name | Meaning |
|---|---|
| code | Status code (0 for success) |
| data | Proxy IP Array |
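Before converting anything, it can help to look at the shape of the payload programmatically. The following is a minimal sketch, assuming the response has the `code` and `data` fields described in the table; `describe_payload` is a hypothetical helper name, not part of any ipipgo SDK.

```python
import json

# Sample payload shaped like the one above; real responses may carry more fields.
raw = '{"code": 0, "data": [{"ip": "112.95.235.86", "port": 8080, "protocol": "http"}]}'

def describe_payload(text):
    """Return (status code, number of entries, sorted field names of the first entry)."""
    payload = json.loads(text)
    # Treat a missing or null "data" field as an empty list.
    entries = payload.get("data") or []
    first_fields = sorted(entries[0]) if entries else []
    return payload.get("code"), len(entries), first_fields

print(describe_payload(raw))
# (0, 1, ['ip', 'port', 'protocol'])
```

Using `.get()` here means a response without a `data` field is reported as empty rather than raising a `KeyError`.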
Hands-on: turning proxy IPs into a usable format
Many beginners stumble at the data conversion step. For example, to turn the proxy IPs returned by ipipgo into a format the requests library can use, you can do this:
proxies_list = []
for item in data_dict['data']:
    # One dictionary per proxy: the protocol is the key, "ip:port" is the value
    proxies_list.append({
        item['protocol']: f"{item['ip']}:{item['port']}"
    })
print(proxies_list)
Output results:
[{'http': '112.95.235.86:8080'}, {'https': '120.79.169.139:8888'}]
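One possible next step is to group the entries by protocol, so that picking a proxy for a given protocol becomes a plain dictionary lookup. This is only a sketch: `group_by_protocol` and `pick_proxy` are hypothetical helper names, and the entries are the sample values used above.

```python
import random
from collections import defaultdict

entries = [
    {"ip": "112.95.235.86", "port": 8080, "protocol": "http"},
    {"ip": "120.79.169.139", "port": 8888, "protocol": "https"},
]

def group_by_protocol(items):
    """Map each protocol to a list of 'ip:port' strings."""
    pool = defaultdict(list)
    for item in items:
        pool[item["protocol"]].append(f"{item['ip']}:{item['port']}")
    return dict(pool)

def pick_proxy(pool, protocol):
    """Return a one-entry mapping shaped like the items of proxies_list, or None."""
    candidates = pool.get(protocol)
    if not candidates:
        return None
    return {protocol: random.choice(candidates)}

pool = group_by_protocol(entries)
print(pool)
# {'http': ['112.95.235.86:8080'], 'https': ['120.79.169.139:8888']}
print(pick_proxy(pool, "https"))
# {'https': '120.79.169.139:8888'}
```

With more than one candidate per protocol, `random.choice` spreads requests across the pool; a round-robin scheme would work just as well.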
Guide to avoiding pitfalls: Don't be lazy about exception handling
The most dreaded problem when handling proxy IP data is a malformed response. For example, ipipgo's interface occasionally returns a maintenance notice, so you have to guard against it:
try:
    # Parse inside the try block so that malformed JSON is caught below
    data_dict = json.loads(proxy_data)
    if data_dict['code'] != 0:
        raise ValueError("Interface returned exception")
    # Follow-up logic...
except KeyError as e:
    print(f"Field does not exist: {e}")
except json.JSONDecodeError:
    print("Data format error")
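The same checks can be wrapped in one function so that every caller gets either a clean list or a single, well-defined error. The sketch below assumes the `code`/`data` layout shown earlier; `ProxyDataError` and `parse_proxy_response` are hypothetical names introduced for illustration.

```python
import json

REQUIRED_FIELDS = ("ip", "port", "protocol")

class ProxyDataError(Exception):
    """Hypothetical error type: the response could not be used."""

def parse_proxy_response(text):
    """Return the well-formed proxy entries, or raise ProxyDataError."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ProxyDataError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ProxyDataError("top-level value is not a JSON object")
    if payload.get("code") != 0:
        raise ProxyDataError(f"interface reported code {payload.get('code')!r}")
    data = payload.get("data")
    if not isinstance(data, list):
        raise ProxyDataError("'data' is missing or not a list")
    # Keep only entries that carry every required field; skip the rest.
    return [
        item for item in data
        if isinstance(item, dict) and all(key in item for key in REQUIRED_FIELDS)
    ]

try:
    parse_proxy_response("<html>maintenance</html>")
except ProxyDataError as err:
    print(f"Skipping this response: {err}")
```

Silently skipping incomplete entries is a design choice; if a partial list is unacceptable in your setting, raise instead.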
Q&A time: frequently asked questions
Q: Why do connections through my proxy IP keep timing out?
A: First check that the IP is still valid. It is recommended to use ipipgo's real-time verification interface; they state an IP survival rate of 95% or higher.
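Before spending a network round trip on a liveness check, a cheap local check can weed out entries that are malformed to begin with. The sketch below uses only the standard-library `ipaddress` module; `is_well_formed` is a hypothetical helper, and it verifies format only, not whether the proxy is actually reachable.

```python
import ipaddress

def is_well_formed(entry):
    """True if the entry has a syntactically valid IP and a port in 1..65535."""
    try:
        ipaddress.ip_address(entry["ip"])
    except (KeyError, ValueError):
        # Missing "ip" field, or a string that is not an IPv4/IPv6 address.
        return False
    port = entry.get("port")
    return isinstance(port, int) and 0 < port < 65536

candidates = [
    {"ip": "112.95.235.86", "port": 8080, "protocol": "http"},
    {"ip": "120.79.139", "port": 8080, "protocol": "http"},      # only three octets
    {"ip": "120.79.169.139", "port": 88888, "protocol": "https"},  # port out of range
]
usable = [c for c in candidates if is_well_formed(c)]
print([c["ip"] for c in usable])
# ['112.95.235.86']
```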
Q: How do I handle authentication for a proxy IP?
A: Embed the username and password in the proxy URL stored in the dictionary, and include the scheme, for example:
{'http': 'http://user:pass@112.95.235.86:8080'}
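Passwords often contain characters such as `@` or `:` that would break the URL if inserted as-is. A minimal sketch of one way to build the value is shown here, percent-encoding the credentials with the standard library; `build_auth_proxy` is a hypothetical helper, and the explicit `http://` scheme follows the form used in the requests documentation.

```python
from urllib.parse import quote

def build_auth_proxy(ip, port, user, password, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including '@' and ':'.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"{scheme}://{user_enc}:{pass_enc}@{ip}:{port}"

proxy_url = build_auth_proxy("112.95.235.86", 8080, "user", "p@ss:word")
print({"http": proxy_url})
# {'http': 'http://user:p%40ss%3Aword@112.95.235.86:8080'}
```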
Q: Any tips for batch processing thousands of IPs?
A: Use a generator instead of a list. ipipgo's interface supports paging, so fetch the data page by page, and remember to add a delay between requests to avoid being blocked.
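The paging-plus-delay idea can be sketched as a generator. Everything about the fetcher is an assumption: `fetch_page` stands for whatever function calls the paged interface and returns a list of entries (empty when there are no more pages); `fake_fetch` below is a stand-in so the example runs without network access.

```python
import time

def iter_proxies(fetch_page, delay=1.0, max_pages=None):
    """Yield proxy entries one at a time, fetching a page only when needed."""
    page = 1
    while max_pages is None or page <= max_pages:
        items = fetch_page(page)
        if not items:
            return  # an empty page means there is nothing left
        yield from items
        page += 1
        time.sleep(delay)  # pause between pages to avoid being blocked

def fake_fetch(page):
    """Stand-in for a real paged API call (hypothetical data)."""
    pages = {
        1: [{"ip": "112.95.235.86", "port": 8080, "protocol": "http"}],
        2: [{"ip": "120.79.169.139", "port": 8888, "protocol": "https"}],
    }
    return pages.get(page, [])

# delay=0 keeps the demo fast; use a real delay against a live interface.
for entry in iter_proxies(fake_fetch, delay=0):
    print(entry["ip"])
```

Because entries are yielded as they arrive, memory use stays flat no matter how many pages the interface returns.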
Final advice: choosing the right tool saves a lot of effort
Anyone who has wrestled with proxy IPs knows that maintaining your own IP pool is a lot of work. A professional service provider such as ipipgo not only offers a ready-made API, but also returns data in a standardized format. Their technical documentation includes Python sample code, and you can reach technical support when you run into problems, which beats fumbling around on your own.
Finally, a reminder: handling JSON data is like stir-frying; you have to master both the heat (exception handling) and the seasoning (data conversion). The next time you run into a proxy IP data problem, have a glass of water, calm down, and work through this guide step by step; it should save you a lot of detours.

