Remote reading of JSON files in Python using proxy IP program

What do you do when your crawler gets blocked?

Anyone who has done data crawling knows the most dreaded situation: the code is running fine, and then it suddenly stops working. Eight times out of ten you have triggered the target site's anti-crawling mechanism, and your IP address has been thrown into the penalty box. That is when you need a stand-in to do the work for you, which is today's topic: the proxy IP.

For example, say you want to fetch remote JSON data with Python's requests library:


import requests

url = 'https://api.example.com/data.json'
response = requests.get(url)
print(response.json())

Run it a few times and you may start getting 403 errors. That is the moment to bring in a proxy IP, so the server thinks a different visitor is making the request.

The right way to use a proxy IP

Here's the key point! Using a proxy IP is not just a matter of finding a random address and filling it in; it takes some strategy. I recommend ipipgo's service: their IP pool is as big as a seafood market, and they can give you a fresh identity with every request.

The modified code looks like this:


import requests

url = 'https://api.example.com/data.json'

# Replace username and password with your own ipipgo credentials
proxies = {
    'http': 'http://username:password@gateway.ipipgo.com:9020',
    'https': 'http://username:password@gateway.ipipgo.com:9020'
}

try:
    response = requests.get(url, proxies=proxies, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx responses
    data = response.json()
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

Note the username:password authentication format. Many newcomers fill in only the IP address without the credentials, and then they cannot connect. ipipgo's proxy address format is simple; just copy it from their documentation.

A practical guide to avoiding pitfalls

Here are a few places where it's easy to trip up:

1. IP lifetime: free proxies often die after a couple of uses. ipipgo's dynamic short-lived proxies, which switch IP automatically on every request, are a safer choice.
2. Timeout settings: don't forget the timeout parameter; 5-10 seconds is recommended.
3. Exception handling: network requests are never 100% reliable, so always wrap them in try-except.
4. JSON parsing: sometimes the response is not valid JSON; print response.text first to look at the raw data.
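
Point 4 above can be sketched roughly as follows. This is only a minimal illustration using the standard json module; the helper name parse_json_text and the 200-character preview are assumptions for the example, not part of requests or ipipgo.

```python
import json


def parse_json_text(text, preview_chars=200):
    """Return the parsed JSON, or None after printing a preview of the raw body."""
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        print("Not valid JSON. The body starts with:")
        print(text[:preview_chars])
        return None


# A well-formed body parses normally
print(parse_json_text('{"status": "ok"}'))

# A block page served as HTML is reported instead of crashing the script
print(parse_json_text('<html>Access denied</html>'))
```

In a real script you would pass response.text to the helper instead of the literal strings.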

Beginner Q&A first-aid kit

Q: What should I do if my proxy IP always times out?
A: First check the format of the proxy address; in particular, special characters in the username and password must be URL-encoded. If the format is correct, contact ipipgo customer service to check the node status.

Q: Do I need to manually change my IP every time?
A: Not with ipipgo's rotating package. The IP is switched automatically at the gateway level, so the proxy address in your code stays the same.

Q: What should I do if I encounter an SSL certificate error?
A: You can pass verify=False to requests.get(), but that turns off certificate verification and is not safe. It is better to check the system root certificates, or switch to ipipgo's HTTPS-only proxy channel.
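
The URL-encoding advice from the first answer can be sketched like this. It is a minimal example, assuming the gateway host and port shown earlier in this article; the helper name build_proxies and the sample credentials are made up for illustration.

```python
from urllib.parse import quote


def build_proxies(username, password, host="gateway.ipipgo.com", port=9020):
    """Build a requests-style proxies dict with URL-encoded credentials."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    proxy_url = f"http://{user}:{pwd}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Hypothetical credentials containing characters that would break the URL
proxies = build_proxies("user", "p@ss:w/rd")
print(proxies["http"])
# prints http://user:p%40ss%3Aw%2Frd@gateway.ipipgo.com:9020
```

The resulting dict can be passed straight to the proxies argument of requests.get().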

Why ipipgo?

This isn't a hard sell; it comes from painful experience. I went through seven or eight providers before settling on ipipgo, for four reasons:

1. Response times are excellent, mostly within 200 ms.
2. Lines in 200+ cities across the country, which is very handy when you need an IP from a specific area.
3. The management dashboard shows real-time usage, so you don't have to worry about overruns.
4. Technical support is staffed by real people; the last time I opened a ticket at two in the morning, it was answered within seconds.

They also recently launched an intelligent routing feature that automatically selects the fastest line. For scenarios that need to read JSON data reliably, it feels like cheating. New users also get 5 GB of traffic on registration, which is enough for testing.

The Ultimate Solution

A complete solution for those who just want to copy and paste:


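import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
session.mount('http://', HTTPAdapter(max_retries=3))  # retry failed connections
session.mount('https://', HTTPAdapter(max_retries=3))

def get_proxy():
    # Placeholder: replace the body with a call to ipipgo's API to get the
    # latest proxy. The API call itself is not shown here; see their docs.
    gateway = 'http://username:password@gateway.ipipgo.com:9020'
    return {'http': gateway, 'https': gateway}

def fetch_json(url):
    proxies = get_proxy()
    try:
        # timeout=(3, 7): 3 seconds to connect, 7 seconds to read
        response = session.get(url, proxies=proxies, timeout=(3, 7))
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
    try:
        return response.json()
    except ValueError:  # every JSON decode error requests raises is a ValueError
        print("The returned data is not in JSON format.")
        return None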

This solution adds three layers of insurance: connection retries, automatic acquisition of a new IP, and exception handling. Using ipipgo's API you can fetch the latest available proxy address directly, which is far less work than maintaining your own IP pool.

Finally, to be honest, with proxy IPs you get what you pay for. If the project is important, don't skimp on the budget. After all, the cost of downtime from being blocked by the server can be far higher than the proxy fee.
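
Since fetch_json returns None on failure, one optional extra layer is to call it a few times with a growing pause between attempts. The sketch below is not part of the original solution: the name fetch_with_retries, the three attempts, and the 0.5-second base delay are all assumptions picked for illustration. The demo uses a stand-in callable so it runs without any network access.

```python
import time


def fetch_with_retries(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it returns something other than None.

    Waits base_delay seconds after the first failure and doubles the wait
    after each further failure. Returns None if every attempt fails.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None


# Demo with a stand-in that fails twice and then succeeds.
# sleep is replaced with a no-op so the demo finishes instantly.
outcomes = iter([None, None, {"status": "ok"}])
print(fetch_with_retries(lambda: next(outcomes), sleep=lambda seconds: None))
```

In a real script the call would look like fetch_with_retries(lambda: fetch_json(url)), so that each attempt goes through the proxy again.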

