
Basic Python operations for processing JSON files
Network requests often return JSON data. For example, when the requests library is used to call a proxy IP provider's API, the response is almost always in JSON format. An API response from ipipgo is structured like this:
import json
# Pretend this is the response data obtained from ipipgo
proxy_data = '''
{
  "status": "success",
  "data": [
    {"ip": "112.95.208.11", "port": 8000},
    {"ip": "183.32.125.90", "port": 8080}
  ]
}
'''
# Convert the JSON string to a dictionary
parsed_data = json.loads(proxy_data)
print(parsed_data['data'][0]['ip'])  # Output: 112.95.208.11
A common pitfall: the json module parses JSON numbers into Python numbers, so the port 8000 becomes an int. Some scenarios need the port as a string, in which case add an explicit type conversion: str(parsed_data['data'][0]['port'])
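As a small sketch of that conversion, the snippet below turns every port into a string in one pass. The sample payload mirrors the structure of the response above, and the helper name ports_as_strings is made up for illustration.

```python
import json

# Sample payload with the same shape as the response shown above.
proxy_data = (
    '{"status": "success", "data": ['
    '{"ip": "112.95.208.11", "port": 8000}, '
    '{"ip": "183.32.125.90", "port": 8080}]}'
)


def ports_as_strings(raw_json):
    """Parse the payload and return its entries with every port as a str."""
    parsed = json.loads(raw_json)
    return [
        {"ip": entry["ip"], "port": str(entry["port"])}
        for entry in parsed["data"]
    ]


entries = ports_as_strings(proxy_data)
print(entries[0]["port"], type(entries[0]["port"]).__name__)
```

Converting once, right after parsing, keeps the rest of the code from having to remember which type the port currently has.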
Hands-on tips for proxy IP scenarios
When you need to manage proxy IPs in bulk, it is recommended to save the IP list to a local file. For example, save the proxies extracted by ipipgo as proxies.json:
import json
proxy_list = [
    {"http": "http://112.95.208.11:8000"},
    {"http": "http://183.32.125.90:8080"}
]
# Write the file with the indent parameter to make it more readable
with open('proxies.json', 'w', encoding='utf-8') as f:
    json.dump(proxy_list, f, indent=2)
# Be aware of encoding issues when reading, especially on Windows
with open('proxies.json', 'r', encoding='utf-8') as f:
    proxies = json.load(f)
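The API response uses ip/port entries, while the saved file uses requests-style proxy dictionaries. A minimal sketch of bridging the two might look like the following. The helper names (to_requests_proxies, save_proxies, load_proxies) are assumptions for illustration, and the file is written to a temporary directory so the example does not touch a real proxies.json.

```python
import json
import os
import tempfile


def to_requests_proxies(entries):
    """Turn {"ip": ..., "port": ...} entries into requests-style proxy dicts."""
    return [{"http": f"http://{e['ip']}:{e['port']}"} for e in entries]


def save_proxies(proxy_list, path):
    """Write the list as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(proxy_list, f, indent=2)


def load_proxies(path):
    """Read the list back, using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


entries = [
    {"ip": "112.95.208.11", "port": 8000},
    {"ip": "183.32.125.90", "port": 8080},
]
path = os.path.join(tempfile.mkdtemp(), "proxies.json")
save_proxies(to_requests_proxies(entries), path)
print(load_proxies(path))
```

Passing encoding='utf-8' on both the write and the read avoids relying on the platform default, which is where Windows usually differs.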
Advanced usage: switching proxies dynamically
Combined with ipipgo's automatic IP replacement function, you can build a smart switching system. Here is a simple rotation scheme that picks a random proxy for each request:
import json
import random
import requests
with open('proxies.json', encoding='utf-8') as f:
    ip_pool = json.load(f)
def get_random_proxy():
    return random.choice(ip_pool)
# Send the request through a randomly chosen proxy
response = requests.get(
    'https://example.com',  # placeholder for the target website
    proxies=get_random_proxy(),
    timeout=5
)
Important: remember to add exception handling to the code, and promptly remove any IP that fails from the list. ipipgo's availability rate can reach 99%, which saves a lot of effort compared with maintaining a self-built proxy pool.
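One way to sketch that failure handling is shown below. It is only an illustration under a few assumptions: the function name fetch_with_failover is hypothetical, and the actual request is passed in as a callable so the example runs without network access. In real use the callable could be something like lambda u, p: requests.get(u, proxies=p, timeout=5), and the except clause would more reasonably catch requests.RequestException.

```python
import random


def fetch_with_failover(url, ip_pool, fetch, max_attempts=3):
    """Try up to max_attempts proxies, dropping each one that fails.

    `fetch` is any callable taking (url, proxy) that raises on failure.
    """
    for _ in range(max_attempts):
        if not ip_pool:
            break
        proxy = random.choice(ip_pool)
        try:
            return fetch(url, proxy)
        except Exception:
            # The proxy failed: remove it so it is not picked again.
            ip_pool.remove(proxy)
    raise RuntimeError("no working proxy found")


# Stand-in for a real request: the "bad" proxy always fails.
def fake_fetch(url, proxy):
    if "bad" in proxy["http"]:
        raise ConnectionError("proxy down")
    return f"ok via {proxy['http']}"


pool = [{"http": "http://bad:1"}, {"http": "http://good:2"}]
result = fetch_with_failover("https://example.com", pool, fake_fetch)
print(result)
```

The pool is modified in place, so later calls that share the same list automatically skip proxies that have already failed.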
Frequently Asked Questions
Q: How do I fix a json.decoder.JSONDecodeError?
A: Most of the time the data contains special characters that are not escaped, or the interface did not return standard JSON. Output the raw data with print() to check it, or pass strict=False to json.loads() so that control characters inside strings are accepted.
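The following sketch shows both suggestions together. The raw string is a made-up response containing a literal newline inside a string value, which strict JSON parsing rejects, and parse_response is a hypothetical helper name. Note that strict=False only relaxes the rule about control characters in strings; other kinds of malformed JSON will still raise.

```python
import json

# The \n below becomes a real newline inside the JSON string value,
# which is invalid under strict parsing.
raw = '{"status": "success", "note": "line one\nline two"}'


def parse_response(text):
    """Parse JSON, falling back to strict=False after showing the problem."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Report where parsing failed and the raw text around that position.
        print(f"strict parse failed: {exc.msg} at char {exc.pos}")
        print(repr(text[max(0, exc.pos - 20):exc.pos + 20]))
        # strict=False permits control characters inside strings.
        return json.loads(text, strict=False)


data = parse_response(raw)
print(data["status"])
```

Printing with repr() makes invisible characters such as \n or \t show up in the output, which is usually the fastest way to spot the offending byte.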
Q: What should I do if my proxy IP suddenly fails?
A: It is recommended to use ipipgo's dynamic tunneling proxy, which automatically changes the IP for each request. If you use a traditional static proxy, remember to set up a retry mechanism:
from retrying import retry
# Retry up to 3 times; each attempt picks a new random proxy
@retry(stop_max_attempt_number=3)
def safe_request(url):
    return requests.get(url, proxies=get_random_proxy())
Q: How do I verify that the proxy is in effect?
A: Call this check interface to see the egress IP currently in use:
requests.get('http://ipipgo.com/checkip', proxies=proxy).text
Efficiency Optimization Tips
When dealing with large files, don't load everything at once with json.load(); parse the file incrementally as a stream with ijson instead:
import ijson
with open('big_data.json', 'rb') as f:
    # Parse only the proxies in the data field, one item at a time
    proxies = ijson.items(f, 'data.item')
    for proxy in proxies:
        print(proxy['ip'])
If you often need to read and write configurations, it is recommended to store the data returned by ipipgo's API directly in a database, which is more reliable than manipulating files. The advantage is especially clear when you need to manage multiple projects at the same time.
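As a rough sketch of the database approach, the example below uses SQLite from the standard library. The table name, column layout, and helper names (save_to_db, load_from_db) are assumptions chosen for illustration, and an in-memory database is used so the snippet leaves nothing on disk; a real project would connect to a file or a database server instead.

```python
import sqlite3


def save_to_db(conn, entries):
    """Insert ip/port entries, silently skipping ones already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS proxies ("
        "ip TEXT NOT NULL, port INTEGER NOT NULL, PRIMARY KEY (ip, port))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO proxies (ip, port) VALUES (:ip, :port)",
        entries,
    )
    conn.commit()


def load_from_db(conn):
    """Return the stored proxies as requests-style dicts."""
    rows = conn.execute("SELECT ip, port FROM proxies ORDER BY ip").fetchall()
    return [{"http": f"http://{ip}:{port}"} for ip, port in rows]


conn = sqlite3.connect(":memory:")
entries = [
    {"ip": "112.95.208.11", "port": 8000},
    {"ip": "183.32.125.90", "port": 8080},
]
save_to_db(conn, entries)
save_to_db(conn, entries)  # the second call adds no duplicate rows
print(load_from_db(conn))
```

The composite primary key together with INSERT OR IGNORE means the same API response can be stored repeatedly without the pool filling up with duplicates.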

