
What the heck is a JSON file?
Anyone who writes crawlers has probably run into files with the .json suffix. A JSON file is essentially a plain-text file with a fixed structure. For example, the proxy IP list you export from the ipipgo dashboard is, nine times out of ten, in this format. Open one and you will see a structure like this:
    {
        "proxies": [
            {"ip": "123.45.67.89", "port": 8080},
            {"ip": "98.76.54.32", "port": 3128}
        ]
    }
Pay attention to the curly braces wrapping square brackets: an object that contains an array. This is the standard way of writing JSON. You have to get this structure straight first, whether you are dealing with proxy IPs or any other data.
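To make the brace-and-bracket structure concrete, here is a minimal sketch that parses the sample above from a string and checks which Python type each layer maps to. The values are just the sample ones; nothing here reflects ipipgo's real output beyond what is shown above.

```python
import json

# The sample structure from above, held in a string.
raw = '''
{
    "proxies": [
        {"ip": "123.45.67.89", "port": 8080},
        {"ip": "98.76.54.32", "port": 3128}
    ]
}
'''

data = json.loads(raw)

# Curly braces become a dict, square brackets become a list.
outer_type = type(data).__name__
inner_type = type(data["proxies"]).__name__
first_proxy = data["proxies"][0]

print(outer_type, inner_type, first_proxy["ip"], first_proxy["port"])
```

Note that the port comes back as an integer, not a string, because it is unquoted in the JSON.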
How does Python eat this bowl of "JSON rice"?
Processing JSON files with Python is actually dead simple. It comes down to three main steps:
    import json

    # Step 1: Open the box (parse the file into a Python dict)
    with open('ipipgo_proxies.json', 'r') as f:
        data = json.load(f)

    # Step 2: Pick and sort (loop over the proxy list)
    for proxy in data['proxies']:
        print(f"Available proxies: {proxy['ip']}:{proxy['port']}")

    # Step 3: Update the inventory (in ipipgo format for example)
    data['proxies'].append({"ip": "76.135.28.41", "port": 8888})
    with open('new_proxies.json', 'w') as f:
        json.dump(data, f, indent=4)
Here's the key point! When you use ipipgo's proxy service, the JSON returned by their API is well organized and the field names are all fixed lowercase, which makes batch processing much easier.
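With fixed field names, batch processing reduces to a loop or a comprehension. A minimal sketch, assuming every entry carries the `ip` and `port` keys shown earlier; `to_addresses` is a hypothetical helper name, not part of any ipipgo SDK:

```python
def to_addresses(data):
    """Turn {"proxies": [{"ip": ..., "port": ...}, ...]} into "ip:port" strings."""
    return [f"{p['ip']}:{p['port']}" for p in data["proxies"]]


sample = {
    "proxies": [
        {"ip": "123.45.67.89", "port": 8080},
        {"ip": "98.76.54.32", "port": 3128},
    ]
}

addresses = to_addresses(sample)
print(addresses)
```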
How do proxy IPs and JSON hook up?
Veterans of data collection know that proxy IPs and JSON files are a golden pair. Take a real scenario:
    import requests
    from json.decoder import JSONDecodeError

    # Credentials are embedded directly in the proxy address
    proxies = {
        'http': 'http://ipipgo_username:ipipgo_password@gateway.ipipgo.com:9021',
        'https': 'https://ipipgo_username:ipipgo_password@gateway.ipipgo.com:9021'
    }

    try:
        response = requests.get('https://api.example.com/data', proxies=proxies)
        data = response.json()  # automatically converted to a dictionary
        print(data['results'][0]['ip_address'])
    except JSONDecodeError:
        print("This site is not returning proper JSON!")
Here's a hidden tip: ipipgo's proxies support username and password authentication written directly in the proxy address. This design really saves time, because you don't have to handle authentication manually on every request.
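One thing to watch when embedding credentials in the address: characters such as `@` or `:` inside a password normally need percent-encoding, otherwise the URL can be split in the wrong place. A small sketch using only the standard library; `build_proxy_url` is a hypothetical helper, the password is made up, and the host and port are simply the ones from the example above:

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Made-up password containing characters that would confuse URL parsing.
proxy_url = build_proxy_url("ipipgo_username", "p@ss:word", "gateway.ipipgo.com", 9021)
print(proxy_url)
```

The resulting string can be used as a value in the `proxies` dictionary in place of a hand-written address.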
A practical guide to avoiding pitfalls
Here are a few traps that newbies commonly fall into:
| Pitfall | Fix |
|---|---|
| json.load() raises an encoding error | Add encoding='utf-8' to open() |
| KeyError: field not found | Use data.get('field name') to fetch the value safely first |
| Request fails because the proxy IP has expired | Use ipipgo's automatic switching function |
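The `KeyError` row deserves a quick illustration. A minimal sketch of safe access with `dict.get()` and defaults; the second entry's missing `port` and the absent `results` key are made up for the example:

```python
data = {
    "proxies": [
        {"ip": "123.45.67.89", "port": 8080},
        {"ip": "98.76.54.32"},  # "port" is deliberately missing
    ]
}

# data["results"] would raise KeyError; .get() returns the default instead.
results = data.get("results", [])

# Entries without a "port" field yield None rather than crashing the loop.
ports = [proxy.get("port") for proxy in data.get("proxies", [])]

print(results, ports)
```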
Q&A
Q: Why should I use ipipgo's proxy with JSON?
A: Because their API returns a standardized format and also supports bulk access and status queries. The response converts directly into a dictionary and is ready to use.
Q: What should I do if I run out of memory when handling large files?
A: Use the ijson library to read the file as a stream, or call ipipgo's paging API directly. Don't pull all the data at once.
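The paging idea can be sketched as a generator that yields proxies one page at a time, so only a single page sits in memory. The details of ipipgo's paging API are not given here, so `fetch_page` is a hypothetical callable you would supply yourself: it takes a page number and returns a parsed dict with a `proxies` list, which is assumed to be empty once the pages run out.

```python
def iter_proxies(fetch_page, start_page=1):
    """Yield proxies page by page until an empty page comes back."""
    page = start_page
    while True:
        batch = fetch_page(page).get("proxies", [])
        if not batch:
            return
        yield from batch
        page += 1


# Stand-in for a real API call, for demonstration only.
fake_pages = {
    1: {"proxies": [{"ip": "123.45.67.89", "port": 8080}]},
    2: {"proxies": [{"ip": "98.76.54.32", "port": 3128}]},
}


def fake_fetch(page):
    return fake_pages.get(page, {"proxies": []})


ips = [proxy["ip"] for proxy in iter_proxies(fake_fetch)]
print(ips)
```

In real use, `fake_fetch` would be replaced by a function that performs the HTTP request and returns `response.json()`.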
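Q: Why does Chinese text come out garbled when I save it with json.dump?
A: Two parameters keep the peace: pass ensure_ascii=False to json.dump(), and encoding='utf-8' to open(). (In Python 3, json.dump() itself does not take an encoding argument.)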
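A short sketch of what `ensure_ascii` changes. In Python 3 the file encoding is set on `open()`, since `json.dump()` has no `encoding` parameter. The file name and sample data below are made up for the demonstration:

```python
import json
import os
import tempfile

data = {"city": "北京"}

escaped = json.dumps(data)                       # default: non-ASCII is escaped
readable = json.dumps(data, ensure_ascii=False)  # characters are kept as-is

# When writing to disk, set the encoding on open(), not on json.dump().
path = os.path.join(tempfile.mkdtemp(), "cities.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

# Read it back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    round_trip = json.load(f)

print(escaped)
print(readable)
```

Both forms are valid JSON and load back to the same data; the difference is only in how readable the saved file is.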
Finally, a piece of trivia: ipipgo's proxy list JSON hides a secret field called "region_code". With it you can precisely select exit IPs from a specific region. I don't tell just anyone!
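Filtering on that field could look like the sketch below. The format of the `region_code` values is not documented here, so the two-letter codes are an assumption, and `filter_by_region` is a hypothetical helper name:

```python
def filter_by_region(data, region_code):
    """Return only the proxies whose region_code matches."""
    return [p for p in data["proxies"] if p.get("region_code") == region_code]


# Sample data with assumed two-letter region codes.
sample = {
    "proxies": [
        {"ip": "123.45.67.89", "port": 8080, "region_code": "US"},
        {"ip": "98.76.54.32", "port": 3128, "region_code": "DE"},
    ]
}

us_proxies = filter_by_region(sample, "US")
print(us_proxies)
```

Using `.get()` here means entries that lack the field are skipped instead of raising `KeyError`.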

