
Essential Python Techniques for Working with JSON Data
Anyone who has spent time collecting data over the network has run into this pitfall: the target site suddenly bans your IP. That is when a proxy IP service earns its keep. Using ipipgo's service as the example, this article shows how to use Python's json module to cleanly handle the data an API returns.
import json
import requests
# Remember to replace the ipipgo credentials below with your own.
proxy = {
    'http': 'http://username:password@gateway.ipipgo.com:9020',
    'https': 'https://username:password@gateway.ipipgo.com:9020'
}
resp = requests.get('https://api.example.com/data', proxies=proxy)
data = json.loads(resp.text)
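When a site bans or throttles you, the response body is often an HTML error page rather than JSON, and `json.loads` raises `json.JSONDecodeError`. The following is a minimal sketch of guarding the parsing step. The helper name `parse_json_body` and the sample strings standing in for `resp.text` are hypothetical, not part of any ipipgo API.

```python
import json


def parse_json_body(text):
    """Return the parsed JSON value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # A ban page or proxy error page is usually HTML, not JSON.
        print(f"Response is not valid JSON: {exc}")
        return None


# Illustrative inputs standing in for resp.text
ok = parse_json_body('{"status": "ok", "items": [1, 2, 3]}')
banned = parse_json_body('<html><body>Access denied</body></html>')
```

In real code you would pass `resp.text` to the helper and decide whether to retry or switch nodes when it returns `None`.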
Proxy IP in Practice: A Guide to Avoiding Pitfalls
Many newcomers stumble at the proxy authentication step. ipipgo's proxy address format is fixed and must strictly follow the form username:password@gateway-address:port. Here is a reference table of common errors:
| Symptom | Solution |
|---|---|
| 407 Proxy Authentication Required | Check the password for special characters; they must be URL-encoded |
| Connection timeout | Try switching to a different ipipgo data-center node |
| Garbled response data | Add an Accept-Encoding header to the request headers |
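The URL-encoding fix from the first row can be sketched with the standard library's `urllib.parse.quote`. The credentials and the helper name `build_proxy_url` below are made up for illustration; the gateway host and port are the ones used earlier in this article.

```python
from urllib.parse import quote

# Hypothetical credentials; the password contains characters that are
# not safe inside the user-info part of a URL.
username = "user01"
password = "p@ss:w/rd#1"


def build_proxy_url(user, pwd, scheme="http", host="gateway.ipipgo.com", port=9020):
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/".
    return f"{scheme}://{quote(user, safe='')}:{quote(pwd, safe='')}@{host}:{port}"


proxy_url = build_proxy_url(username, password)
proxy = {
    "http": proxy_url,
    "https": build_proxy_url(username, password, scheme="https"),
}
print(proxy_url)
```

Without the encoding step, the `@`, `:` and `/` inside the password would be read as URL delimiters, which is a typical cause of the 407 error.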
Handy JSON Processing Tips
Don't rush to process the data as soon as you get it. First use json.dumps() to pretty-print it:
# The raw data may be squeezed onto a single line
print(json.dumps(data, indent=2, ensure_ascii=False))
# If you run into a raw Unix timestamp, you can convert it like this
from datetime import datetime
timestamp = data['create_time']
print(datetime.fromtimestamp(timestamp))
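Note that `datetime.fromtimestamp` expects seconds, while many APIs return milliseconds, which produces an error or a wildly wrong date. A hedged sketch that accepts both follows. The 1e11 threshold is a heuristic assumption, the helper name `to_datetime` is hypothetical, and the sample values are illustrative rather than real API output.

```python
from datetime import datetime, timezone


def to_datetime(ts):
    """Convert a Unix timestamp in seconds or milliseconds to an aware UTC datetime."""
    ts = float(ts)
    # Heuristic (an assumption): second-based timestamps stay below 1e11
    # for thousands of years, so larger values are treated as milliseconds.
    if ts > 1e11:
        ts /= 1000
    return datetime.fromtimestamp(ts, tz=timezone.utc)


print(to_datetime(1700000000))       # value given in seconds
print(to_datetime(1700000000000))    # same instant given in milliseconds
```

Passing `tz=timezone.utc` keeps the result independent of the machine's local time zone, which matters when the same script runs on servers in different regions.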
Quick Fixes for Frequently Asked Questions
Q: Why are requests slower when going through the ipipgo proxy?
A: In about 80% of cases, persistent connections are not being used. Sending requests through a Session, which reuses connections via keep-alive, can improve speed by around 30%.
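As a rough sketch of that answer: a `requests.Session` pools connections and reuses them for later requests to the same host, so no extra keep-alive parameter is needed. The credentials are placeholders and the helper name `fetch_all` is hypothetical.

```python
import requests

# Placeholder credentials; replace with your own ipipgo account details.
PROXIES = {
    "http": "http://username:password@gateway.ipipgo.com:9020",
    "https": "https://username:password@gateway.ipipgo.com:9020",
}

session = requests.Session()
session.proxies.update(PROXIES)


def fetch_all(urls, timeout=10):
    """Fetch several URLs through one Session so connections can be reused."""
    results = []
    for url in urls:
        resp = session.get(url, timeout=timeout)
        resp.raise_for_status()
        results.append(resp.json())
    return results
```

Calling `requests.get` repeatedly creates a fresh connection each time, whereas `fetch_all` lets the session's connection pool carry the connection from one request to the next.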
Q: How should null values in the returned JSON be handled?
A: The json module automatically converts null to None. Reading values with get() is safer than indexing, though its default applies only when the key is missing, not when the value is null:
data.get('price', 0)
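One subtlety is worth seeing in code: when the key is present but its value is null, `get()` returns `None` and ignores the default. The sketch below uses a made-up payload, and `get_or_default` is a hypothetical helper, not part of the json module.

```python
import json

data = json.loads('{"name": "node-a", "price": null}')

# The key exists with a null value, so the default is NOT used.
print(data.get("price", 0))        # None
# The key is missing, so the default is used.
print(data.get("discount", 0))     # 0


def get_or_default(mapping, key, default):
    """Return default when the key is missing or its value is None."""
    value = mapping.get(key)
    return default if value is None else value


print(get_or_default(data, "price", 0))   # 0
```

The explicit `is None` check is used instead of `value or default` so that legitimate falsy values such as `0` or an empty string are kept.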
A Hidden ipipgo Feature
ipipgo offers an intelligent routing feature that automatically selects the fastest node. You enable it by adding a request header in your code:
headers = {
    'X-Proxy-Mode': 'smart',
    'Authorization': 'Bearer your_token'
}
In testing, this feature proved especially effective for high-concurrency data collection, where it is much more stable than the normal polling mode. New users currently also receive a 10 GB traffic package on registration, so it is worth claiming.
One last tip: when dealing with deeply nested JSON data, try the jsonpath-ng library, which is much cleaner than writing a pile of for loops. If you hit a problem you cannot solve, open a ticket in ipipgo's support system to reach their technical staff. They are still online at 2:00 a.m., the night watchmen of the programming community.

