
Hands-on with JSON data in Python
Nine out of ten developers who crawl data have worked with proxy IPs. Recently I've noticed that many people integrating a proxy provider's API get tripped up by the JSON data it returns. Today we'll walk through, in plain language, how to handle proxy IP JSON data with Python.
Practical case: parsing an ipipgo API response
Let's say we get this return data from the ipipgo API:
{
  "status": 200,
  "data": [
    {"ip": "45.88.66.12", "port": 8866, "expire_time": "2024-03-10 12:00:00"},
    {"ip": "103.88.44.91", "port": 3128, "expire_time": "2024-03-10 12:30:00"}
  ]
}
Here is how to unpack this data with Python:
import json

response = '{"status":200,...}'  # stands in for the raw data returned by the API
proxy_data = json.loads(response)
if proxy_data['status'] == 200:
    for item in proxy_data['data']:
        print(f"Available proxy: {item['ip']}:{item['port']}")
        print(f"Expires at: {item['expire_time']}")
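For a version that runs end to end, here is a minimal sketch that uses the sample response from this section as its input. The helper name extract_proxies is made up for illustration and is not part of any ipipgo SDK; the field names are taken from the sample only.

```python
import json

# Sample payload mirroring the response shown in this section (illustrative values).
response = """
{
  "status": 200,
  "data": [
    {"ip": "45.88.66.12", "port": 8866, "expire_time": "2024-03-10 12:00:00"},
    {"ip": "103.88.44.91", "port": 3128, "expire_time": "2024-03-10 12:30:00"}
  ]
}
"""

def extract_proxies(raw):
    """Return a list of 'ip:port' strings, or an empty list if status is not 200."""
    payload = json.loads(raw)
    if payload.get("status") != 200:
        return []
    return [f"{item['ip']}:{item['port']}" for item in payload.get("data", [])]

addresses = extract_proxies(response)
print(addresses)
```

Using .get() for "status" and "data" means a response missing either key yields an empty list instead of a KeyError.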
Proxy IP Configuration Automation Tips
Experienced crawler developers know that changing proxy settings by hand is exhausting. We can build a simple automatic switcher:
import requests
from random import choice

def get_proxies():
    # Call the ipipgo API here to get a list of proxies.
    proxies_list = [{'ip':'x.x.x.x','port':xxx},...]  # placeholder, fill in real entries
    return choice(proxies_list)

target_url = "https://example.com"
current_proxy = get_proxies()
resp = requests.get(
    target_url,
    proxies={
        "http": f"http://{current_proxy['ip']}:{current_proxy['port']}",
        "https": f"http://{current_proxy['ip']}:{current_proxy['port']}"
    }
)
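The proxy dictionary is built the same way for every request, so it can be pulled into a small helper. This is only a sketch: build_proxies and pick_proxies are hypothetical names, and the entry format ({'ip': ..., 'port': ...}) is assumed from the sample response rather than from ipipgo's documentation.

```python
from random import choice

def build_proxies(proxy):
    """Turn one {'ip': ..., 'port': ...} entry into a requests-style proxies dict."""
    address = f"http://{proxy['ip']}:{proxy['port']}"
    return {"http": address, "https": address}

def pick_proxies(proxies_list):
    """Pick a random entry from a non-empty list and format it for requests."""
    if not proxies_list:
        raise ValueError("proxy list is empty")
    return build_proxies(choice(proxies_list))

# Illustrative entry taken from the sample response.
sample = [{"ip": "45.88.66.12", "port": 8866}]
print(pick_proxies(sample))
```

The result can be passed straight to requests, for example requests.get(target_url, proxies=pick_proxies(sample), timeout=10). Setting a timeout is a suggestion here, so that a dead proxy does not hang the crawler.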
Pitfall guide: common failure scenarios
Scenario 1: JSON parsing error
API responses often contain stray special characters that break parsing. Wrapping the parse in error handling keeps things stable:
try:
    data = json.loads(raw_data)
except json.JSONDecodeError as e:
    print(f"Data parsing failed! Error message: {e}")
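If the same check is needed in several places, it can be wrapped in a function that returns None on failure. A minimal sketch follows; safe_parse is a hypothetical name, and it only covers malformed JSON text, not non-string input.

```python
import json

def safe_parse(raw_data):
    """Return the decoded object, or None if raw_data is not valid JSON text."""
    try:
        return json.loads(raw_data)
    except json.JSONDecodeError as e:
        print(f"Data parsing failed! Error message: {e}")
        return None

good = safe_parse('{"status": 200, "data": []}')
bad = safe_parse('{"status": 200, "data": [')  # truncated on purpose
print(good, bad)
```

Callers then only need to test for None before touching the data.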
Scenario 2: The proxy suddenly stops working
Check the proxy's expiration time before each request, like this:
from datetime import datetime

expire_time = datetime.strptime(item['expire_time'], "%Y-%m-%d %H:%M:%S")
if datetime.now() > expire_time:
    print("This proxy has expired, move on to the next one!")
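The same check can filter a whole list before picking a proxy. The sketch below assumes expire_time is in the same local time zone as datetime.now(), which the sample response does not state. The names is_expired and live_proxies are made up for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

def is_expired(item, now=None):
    """True if the entry's expire_time is earlier than `now` (default: current time)."""
    now = now or datetime.now()
    return now > datetime.strptime(item["expire_time"], TIME_FORMAT)

def live_proxies(items, now=None):
    """Keep only the entries that have not expired yet."""
    return [item for item in items if not is_expired(item, now)]

items = [
    {"ip": "45.88.66.12", "port": 8866, "expire_time": "2024-03-10 12:00:00"},
    {"ip": "103.88.44.91", "port": 3128, "expire_time": "2024-03-10 12:30:00"},
]
# A fixed reference time keeps the example reproducible.
still_valid = live_proxies(items, now=datetime(2024, 3, 10, 12, 15, 0))
print([p["ip"] for p in still_valid])
```

Passing now explicitly is only for the demonstration; in a crawler you would normally omit it.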
Q&A: frequently asked questions
Q: How do I ensure that requests are not interrupted when I use the Dynamic Residential Package?
A: Build an automatic replacement mechanism into your code that switches to a new IP as soon as a 403 status code comes back. ipipgo's dynamic residential package can change IPs 5 times per second, which is more than enough.
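One way such a replacement mechanism might look is sketched below. It is an assumption-laden example, not ipipgo's recommended code: fetch_with_rotation is a hypothetical helper, and the request itself is passed in as a callable so the logic can be tried without a network connection.

```python
def fetch_with_rotation(fetch, proxies_list, max_attempts=3):
    """Try proxies in order and move to the next one whenever the status is 403.

    `fetch` is any callable that takes one proxy entry and returns an object
    with a `status_code` attribute (for example a requests.Response).
    Returns the first non-403 response, or the last 403 response if every
    attempt was blocked. Returns None if proxies_list is empty.
    """
    last_response = None
    for proxy in proxies_list[:max_attempts]:
        last_response = fetch(proxy)
        if last_response.status_code != 403:
            return last_response
    return last_response

# Stand-in for a real HTTP call: the first proxy is blocked, the second works.
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code

calls = []

def fake_fetch(proxy):
    calls.append(proxy["ip"])
    return FakeResponse(403 if proxy["ip"] == "45.88.66.12" else 200)

pool = [
    {"ip": "45.88.66.12", "port": 8866},
    {"ip": "103.88.44.91", "port": 3128},
]
result = fetch_with_rotation(fake_fetch, pool)
print(result.status_code, calls)
```

With requests, fetch could be a small function that calls requests.get with the proxies dictionary built from the entry it receives.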
Q: What if I need a long-term fixed IP?
A: Go straight for the static residential package at $35 a month. It suits scenarios that need a stable IP, such as long-running unattended tasks.
ipipgo package selection guide
Pick according to your actual needs:
- Tight budget: choose Dynamic Residential Standard ($7.67/GB)
- Enterprise-level requirements: use Dynamic Residential Enterprise Edition ($9.47/GB)
- Must have a fixed IP: go directly to the Static Residential Package ($35 each)
Finally, when handling proxy IP JSON data, remember to handle exceptions properly. Network requests are like opening a mystery box: you never know what will go wrong. If you use ipipgo and run into technical problems, their customer service responds very quickly; in my own test, a ticket submitted at two in the morning got a reply within seconds.

