
Setting a proxy IP in PySpider: a guide to configuring and using proxy IPs in the PySpider crawler framework


Hands-on: running PySpider behind a proxy

What do crawlers fear most? Getting their IP blocked is definitely in the top three! Today, let's talk about how to give a PySpider crawler a disguise: using a proxy IP to keep it safe. Don't be intimidated by complicated tutorials; configuring a proxy is actually simpler than cooking noodles.

Why do I have to use a proxy IP?

Here's an example: you go to the supermarket every day to grab the discounted eggs, wearing the same red dress three days in a row, and on the fourth day the security guard stops you at the door. A proxy IP is a wardrobe for your crawler, so it can wear something different every time it goes out. Using ipipgo's proxies is like renting an entire clothing store, with "outfits" from 200+ countries around the world to choose from.

Proxy Configuration in Three Steps


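# First, import the essentials
from pyspider.libs.base_handler import *

class MyCrawler(BaseHandler):
    crawl_config = {
        'proxy': 'http://username:password@proxy_ip:port',  # fill in the proxy address provided by ipipgo
        'headers': {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
        }
    }

    @every(minutes=24 * 60)  # re-run on_start once a day
    def on_start(self):
        # Placeholder URL: replace it with the target website
        self.crawl('http://example.com', callback=self.index_page)

    def index_page(self, response):
        # Minimal callback so the handler runs end to end
        return {'url': response.url, 'title': response.doc('title').text()}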

Key point: when you get the proxy address from the ipipgo dashboard, remember to select the HTTP/HTTPS protocol format. For dynamic residential IPs, the Dynamic Residential (Standard) package is recommended; at $7.67/GB it is very friendly to beginners.
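A typo in the proxy string (a missing port, a leftover placeholder) usually only shows up as failed requests later. As a rough sketch, the address can be sanity-checked before it goes into crawl_config. The helper name `describe_proxy` and the sample credentials and address are made up for illustration; they are not part of PySpider or ipipgo.

```python
from urllib.parse import urlsplit

def describe_proxy(proxy_url):
    """Split a proxy URL into its parts and fail early on obvious mistakes."""
    parts = urlsplit(proxy_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Accessing .port also raises ValueError if the port is not a number
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy URL needs both a host and a port")
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }

# Hypothetical credentials and address, for illustration only
info = describe_proxy("http://user01:secret@203.0.113.10:8000")
print(info["host"], info["port"])
```

If the check passes, the same string can be used as the 'proxy' value in crawl_config.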

Trick: switching dynamic IPs automatically

To change the IP automatically on every request, fetch proxies from ipipgo's API and use them right away:


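import random
from pyspider.libs.base_handler import *

def get_proxy():
    # Call the ipipgo API here; a placeholder list stands in for its result
    proxy_list = ["ip1:port", "ip2:port", "ip3:port"]
    return random.choice(proxy_list)

class Handler(BaseHandler):

    def make_request(self, url, callback):
        # self.crawl accepts a per-request proxy, so every request gets a fresh outfit
        return self.crawl(url,
                          callback=callback,
                          proxy=get_proxy())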
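The snippet above leaves the API call itself as a placeholder. As a minimal sketch of the step that follows the call, assume the API returns plain text with one `ip:port` entry per line; that response format is an assumption, not something documented here, and `parse_proxy_lines` and `pick_proxy` are hypothetical helper names.

```python
import random

def parse_proxy_lines(text):
    """Turn 'ip:port' lines into a clean list, skipping blank or malformed rows."""
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        host, sep, port = line.rpartition(":")
        if sep and host and port.isdigit():
            proxies.append(line)
    return proxies

def pick_proxy(proxies):
    """Pick one proxy at random; raise if the list is empty."""
    if not proxies:
        raise RuntimeError("no usable proxies returned")
    return random.choice(proxies)

# Sample response body in the assumed one-entry-per-line format
sample = "203.0.113.10:8000\n\n203.0.113.11:8001\nnot-a-proxy\n"
pool = parse_proxy_lines(sample)
print(pool)
```

A get_proxy() function could then return pick_proxy(pool) instead of choosing from a hard-coded list.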

Pitfall guide (Q&A)

Q: What should I do if the proxy suddenly stops working?
A: The ipipgo client comes with heartbeat detection. When it finds that an IP has gone down, it automatically switches to a new one, much like a phone reconnecting to WiFi on its own.
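If you manage the proxy list yourself instead of relying on the client, a similar effect can be approximated in the crawler. The sketch below is one possible approach, not ipipgo's mechanism: a small pool that stops handing out a proxy after repeated failures. The class name `ProxyPool` and the threshold of two failures are arbitrary choices for illustration.

```python
import random

class ProxyPool:
    """Hand out proxies at random and retire those that keep failing."""

    def __init__(self, proxies, max_failures=2):
        self._failures = {proxy: 0 for proxy in proxies}
        self._max_failures = max_failures

    def alive(self):
        """Proxies that have not yet reached the failure limit."""
        return [p for p, n in self._failures.items() if n < self._max_failures]

    def get(self):
        candidates = self.alive()
        if not candidates:
            raise RuntimeError("all proxies have been retired")
        return random.choice(candidates)

    def report_failure(self, proxy):
        if proxy in self._failures:
            self._failures[proxy] += 1

    def report_success(self, proxy):
        # A successful request resets the failure count
        if proxy in self._failures:
            self._failures[proxy] = 0

# Placeholder addresses, for illustration only
pool = ProxyPool(["203.0.113.10:8000", "203.0.113.11:8001"])
pool.report_failure("203.0.113.10:8000")
pool.report_failure("203.0.113.10:8000")
print(pool.alive())
```

A callback would call report_failure when a request through that proxy errors out, and report_success otherwise.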

Q: How do I test if the proxy is working?
A: Add a test step to the crawler:


# Inside on_start (or any other callback):
self.crawl('http://httpbin.org/ip', callback=self.check_ip)

def check_ip(self, response):
    print(response.text)  # the IP shown here should now be the proxy's IP
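Reading the printed output by eye works, but the check can also be automated. httpbin.org/ip answers with a JSON object whose "origin" field holds the IP (or a comma-separated list of IPs) that the server saw. A minimal sketch, where `real_ip` is your own public address that you supply yourself and the helper names are hypothetical:

```python
import json

def seen_ips(body):
    """Extract the IP list from an httpbin.org/ip response body."""
    origin = json.loads(body)["origin"]
    return [ip.strip() for ip in origin.split(",")]

def proxy_is_active(body, real_ip):
    """True when the server did not see our real IP."""
    return real_ip not in seen_ips(body)

# Sample body in the shape httpbin returns; both addresses are placeholders
sample_body = '{"origin": "203.0.113.10"}'
print(proxy_is_active(sample_body, real_ip="198.51.100.7"))
```

Inside check_ip, response.text can be passed as the body argument.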

How to choose a package without stepping on a landmine

Business type | Recommended package | Applicable scenario
High-frequency data acquisition | Static Residential | 35/IP for a whole month; suited to long-term monitoring
Enterprise crawler | Dynamic Residential (Business) | 9.47/GB with a VIP channel; faster data capture
Individual small projects | Dynamic Residential (Standard) | 7.67/GB, a bargain price; first choice for practice
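For the two per-GB packages, the cost difference is simple arithmetic: traffic in GB times the price per GB. A small sketch using the prices listed above; the 10 GB traffic figure is only an example, and any extra fees or discounts are not modeled.

```python
# Per-GB prices taken from the package list above
PRICE_PER_GB = {"standard": 7.67, "business": 9.47}

def estimate_cost(plan, traffic_gb):
    """Estimated cost for a per-GB plan, rounded to two decimals."""
    return round(PRICE_PER_GB[plan] * traffic_gb, 2)

for plan in PRICE_PER_GB:
    print(plan, estimate_cost(plan, 10))
```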

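One last bit of nagging: don't waste time on free proxies. When I tested them earlier, 8 out of 10 free proxies were dead. ipipgo's TK dedicated line measured under 200 ms in practice, about as fast as a local network. Their customer service can also put together custom plans; last time, someone who needed to crawl Southeast Asian e-commerce data was set up with a cross-border dedicated line.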

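Our products are supported only in overseas network environments (except for the TikTok dedicated line). Any actions users carry out with IPIPGO do not represent the will or views of IPIPGO, and IPIPGO assumes no legal liability.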
