Python crawler proxy IP settings: requests library and scrapy framework configuration

Hands-On Cloak for Crawlers

Anyone who writes crawlers knows you will not get far without a couple of disguises. Today we will walk through how to use requests and Scrapy to put your crawler behind an ipipgo proxy. A handful of free proxies cannot match this; professional work calls for a professional provider.

Requests library proxy setup in practice

Let's start with requests, the most commonly used library. Adding a proxy takes only a couple of lines of code, but there are details to get right, such as session persistence and automatic IP rotation. ipipgo's dynamic residential proxies are a good fit for both.


import requests

# Example of a dynamic residential proxy (replace the placeholders with your own account details)
proxies = {
    'http': 'http://username:password@gateway.ipipgo.com:port',
    'https': 'http://username:password@gateway.ipipgo.com:port'
}

# Session that retries failed connections up to 3 times
session = requests.Session()
adapter = requests.adapters.HTTPAdapter(max_retries=3)
session.mount('http://', adapter)
session.mount('https://', adapter)

# timeout=10 gives up on an attempt after 10 seconds without a response
response = session.get('https://target-website', proxies=proxies, timeout=10)

Key point: use the proxy endpoint that ipipgo gives you, since the access domain may differ between plans. With dynamic proxies, creating a new session for each request is recommended so that automatic IP rotation goes more smoothly.
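The "new session per request" advice can be sketched as follows. This is a minimal illustration, not ipipgo's official client: the helper names `new_session` and `fetch` are made up for this example, and whether a fresh connection actually yields a fresh exit IP depends on how the provider's gateway rotates addresses.

```python
import requests
from requests.adapters import HTTPAdapter


def new_session(retries=3):
    """Build a fresh Session with its own connection pool and retry policy."""
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retries)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch(url, proxy_url, timeout=10):
    """Fetch url through proxy_url using a throwaway session.

    Because the session is closed after the request, no pooled connection
    to the gateway is reused by the next call.
    """
    proxies = {"http": proxy_url, "https": proxy_url}
    with new_session() as session:
        return session.get(url, proxies=proxies, timeout=timeout)
```

A call would look like `fetch('https://target-website', 'http://username:password@gateway.ipipgo.com:port')`, with the placeholders replaced by your own values. The proxies are passed on each request rather than stored on the session, because proxy settings from environment variables can override `session.proxies`.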

Scrapy framework proxy configuration trick

Scrapy's proxy configuration is slightly more complex, but it is easy once you get the hang of it. A downloader middleware is the recommended approach. Here is an example with a static residential proxy:


# Add to settings.py
DOWNLOADER_MIDDLEWARES = {
    'yourprojectname.middlewares.ProxyMiddleware': 543,
}

# middlewares.py
class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # Route this request through the static proxy (replace the placeholders)
        request.meta['proxy'] = "http://username:password@static.ipipgo.com:port"
        # Set the IP expiration time (in seconds).
        # Note: bind_time is a custom meta key, not one of Scrapy's built-in keys.
        request.meta['bind_time'] = 180

When using ipipgo static proxies, remember to enable long-lived (keep-alive) connections. Their IPs can stay alive for up to 24 hours, which suits scenarios that need a fixed IP. For data collection, the Enterprise Edition plan is recommended, as its concurrency can reach 500+.

Troubleshooting common problems

Q: What should I do if the proxy connection keeps timing out?
A: First check the whitelist settings, since ipipgo requires you to bind your server IP. If you use username and password authentication, make sure special characters in the URL are escaped.
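Escaping special characters in the credentials could look like the sketch below. It uses the standard library's `urllib.parse.quote`; the helper name `build_proxy_url` is made up for this example, and the host and port arguments are whatever your plan provides.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with the credentials percent-encoded."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"
```

With this helper, a password such as `p@ss:w/rd` is embedded as `p%40ss%3Aw%2Frd`, so the `@`, `:` and `/` characters can no longer be mistaken for URL delimiters.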

Q: What if I need to specify the IP of a city in the US?
A: Just add parameters after the proxy address, such as ...gateway.ipipgo.com:port?country=US&city=Los_Angeles. Check their dashboard for the specific parameter values.
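Appending such parameters by hand is error-prone, so a small helper may be useful. The sketch below only builds the string. The parameter names `country` and `city` are taken from the answer above, the function name `with_geo_params` is made up, and it is an assumption that the gateway accepts percent-encoded values, so check against the provider's dashboard.

```python
from urllib.parse import urlencode


def with_geo_params(proxy_address, **params):
    """Append targeting parameters to a proxy address as a query string."""
    if not params:
        return proxy_address
    # urlencode() percent-encodes characters such as "&" inside values.
    return f"{proxy_address}?{urlencode(params)}"
```

For instance, `with_geo_params('gateway.ipipgo.com:port', country='US', city='Los_Angeles')` reproduces the address shown above. A value containing `&`, such as `AT&T`, is encoded as `AT%26T`, which keeps it from being read as a parameter separator.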

Q: How do I manage dynamic and static proxies together?
A: Split them into two middlewares and use Scrapy's priority settings to order them. Handle high-frequency requests with dynamic proxies and critical tasks with static ones, so that neither cost nor efficiency suffers.
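One way to split the work across two middlewares is sketched below. This is an illustration under assumptions rather than a recommended ipipgo setup: the `use_static_proxy` meta key, the class names, and the priority 542 are all made up for this example, and the proxy URLs are placeholders to replace.

```python
# middlewares.py (sketch; the meta key and class names are hypothetical)

DYNAMIC_PROXY = "http://username:password@gateway.ipipgo.com:port"
STATIC_PROXY = "http://username:password@static.ipipgo.com:port"


class StaticProxyMiddleware:
    """Assign the static proxy to requests flagged as critical."""

    def process_request(self, request, spider):
        if request.meta.get("use_static_proxy"):
            request.meta["proxy"] = STATIC_PROXY
        return None  # let Scrapy continue down the middleware chain


class DynamicProxyMiddleware:
    """Assign the dynamic proxy to any request that has no proxy yet."""

    def process_request(self, request, spider):
        request.meta.setdefault("proxy", DYNAMIC_PROXY)
        return None


# settings.py: process_request runs in increasing priority order, so the
# static middleware claims flagged requests before the dynamic one fills in the rest.
DOWNLOADER_MIDDLEWARES = {
    "yourprojectname.middlewares.StaticProxyMiddleware": 542,
    "yourprojectname.middlewares.DynamicProxyMiddleware": 543,
}
```

In a spider, a critical request would then be issued as `scrapy.Request(url, meta={'use_static_proxy': True})`, while all other requests fall through to the dynamic proxy.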

Why ipipgo?

Test data from our own crawling business: the success rate with ordinary proxies was under 60%, and after switching to ipipgo dynamic residential proxies it jumped to 92%. Their city-level targeting is accurate, down to the district level when doing localized collection. There is also a hidden feature: adding isp=AT&T to the proxy parameters lets you specify the carrier, which works especially well against certain strict anti-scraping measures.

One last pitfall: do not buy proxies from a small-time vendor just because they are cheap. In an earlier test of one such provider, 8 out of 10 IPs were on blacklists. ipipgo costs more, but it has a real residential IP pool, and for serious work a reliable provider is the better choice.

This article was originally published or organized by ipipgo. https://www.ipipgo.com/en-us/ipdaili/47178.html
