IPIPGO ip proxy PHP Crawl Example: CURL Capture Code Template

A hands-on guide to web scraping with PHP

The biggest worry in data collection is getting your IP blocked. Today, let's talk about how to use PHP's cURL with a proxy IP to stay out of trouble. First, a true story: a friend of mine runs a price comparison site and scraped directly without a proxy. By the next day, the target site had blacklisted his server IP. Since switching to ipipgo's proxy pool, he has not been blocked again.

Basic scraping template (with proxy)


function crawlWithProxy($url) {
    $ch = curl_init();

    // Key step: route the request through the proxy server
    curl_setopt($ch, CURLOPT_PROXY, 'proxy.ipipgo.com:9021');
    curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'username:password');

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // skip HTTPS certificate verification (insecure)

    $output = curl_exec($ch);
    if (curl_errno($ch)) {
        // Read the error and release the handle before throwing
        $error = curl_error($ch);
        curl_close($ch);
        throw new Exception('Crawling error: ' . $error);
    }
    curl_close($ch);
    return $output;
}

// Example usage
try {
    // Placeholder URL: replace with the target site
    $html = crawlWithProxy('http://example.com');
    echo $html;
} catch (Exception $e) {
    echo $e->getMessage();
}

Pay close attention to the proxy settings section. It uses the proxy address provided by ipipgo, which is generally in the format domain:port. Remember to change the username and password to the ones you registered with. The advantage of this proxy is that the IP changes automatically on each request, so the target site has a hard time noticing your scraper.

Advanced Configuration Tips

Want more stable scraping? These parameters are worth tuning:


// Set the timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

// Disguise browser headers
$headers = [
    'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept-Language: zh-CN,zh;q=0.9'
];
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

// Automatically handle redirects
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

Special note: when using ipipgo's long-lasting static proxies, remember to set up the whitelist in the dashboard. If you use a dynamic proxy pool, their API can return the latest proxy list directly, which is discussed later.

Common real-world pitfalls: Q&A

Q: What should I do if the proxy connection keeps timing out?
A: First check that the proxy address and port are correct, then try adjusting the CURLOPT_CONNECTTIMEOUT parameter. If this happens with ipipgo, their customer service responds very quickly: submit a ticket in the dashboard and you should get a reply within about 5 minutes.
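
Beyond raising the timeout, occasional proxy timeouts can also be absorbed by retrying the request. The following is a minimal sketch, not part of the original template: the name `fetchWithRetry` and the doubling delay are assumptions made for illustration, and the fetch itself is passed in as a callable.

```php
<?php
// Hypothetical helper: retries a fetch callable a few times before giving up.
function fetchWithRetry(callable $fetch, int $maxAttempts = 3, int $baseDelayMs = 500): string
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1');
    }
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $fetch();
        } catch (Exception $e) {
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                // Wait longer after each failure: base, 2x base, 4x base, ...
                usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
            }
        }
    }
    throw new RuntimeException(
        "Giving up after {$maxAttempts} attempts: " . $lastError->getMessage(),
        0,
        $lastError
    );
}
```

With the template above, a call might look like `fetchWithRetry(function () use ($url) { return crawlWithProxy($url); })`, since `crawlWithProxy` throws an exception when cURL reports an error.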

Q: What should I pay attention to when scraping HTTPS sites?
A: Set CURLOPT_SSL_VERIFYPEER to false and CURLOPT_SSL_VERIFYHOST to 0. This is not secure, but it gets past certificate errors. A safer option is to download the CA certificate from the official ipipgo website and specify the path to it (for example with CURLOPT_CAINFO).
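
The two approaches in this answer can be sketched as a single helper that returns the matching cURL options. This is an illustration under assumptions: `buildTlsOptions` is a hypothetical name, and the certificate path is whatever local file you saved the CA certificate to.

```php
<?php
// Hypothetical helper: returns TLS-related cURL options.
// $caBundlePath is an assumed local path to a downloaded CA certificate file.
function buildTlsOptions(?string $caBundlePath = null): array
{
    if ($caBundlePath === null) {
        // Insecure fallback: disables certificate and host name checks.
        return [
            CURLOPT_SSL_VERIFYPEER => false,
            CURLOPT_SSL_VERIFYHOST => 0,
        ];
    }
    if (!is_readable($caBundlePath)) {
        throw new InvalidArgumentException("CA file not readable: {$caBundlePath}");
    }
    // Safer variant: verify the peer against the given CA certificate.
    return [
        CURLOPT_SSL_VERIFYPEER => true,
        CURLOPT_SSL_VERIFYHOST => 2,
        CURLOPT_CAINFO         => $caBundlePath,
    ];
}
```

The returned array can be applied to a handle with `curl_setopt_array($ch, buildTlsOptions($path))`.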

Q: How do I switch proxy IPs automatically?
A: ipipgo's dynamic proxy service has this built in. In your code, just fetch the proxy address from their API. For example:


// Replace YOUR_TOKEN with your own token
$proxy = file_get_contents('https://api.ipipgo.com/dynamic?token=YOUR_TOKEN');
curl_setopt($ch, CURLOPT_PROXY, $proxy);
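
Note that `file_get_contents` returns `false` on failure, and a response body may carry trailing whitespace, so it can help to check the value before handing it to cURL. The sketch below assumes the API answers with a single plain-text `host:port` line. That format is a guess, not something documented here, and `parseProxyResponse` is a hypothetical name.

```php
<?php
// Hypothetical helper: cleans up the raw body returned by the proxy API.
// Assumption: the API answers with a single "host:port" line of plain text.
function parseProxyResponse($raw): string
{
    if ($raw === false) {
        throw new RuntimeException('Proxy API request failed');
    }
    $proxy = trim((string) $raw);
    if (!preg_match('/^[A-Za-z0-9.-]+:\d{1,5}$/', $proxy)) {
        throw new RuntimeException('Unexpected proxy API response: ' . $proxy);
    }
    return $proxy;
}
```

If the real API returns JSON or several addresses per call, the parsing step would need to change accordingly.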

Tips for using ipipgo

Their proxies come in three plans, chosen according to your needs:

Package Type | Applicable Scenarios | Recommended Configuration
Dynamic rotation | High-frequency scraping | Automatic IP change per request
Static long-lasting | Fixed IP required | 24-hour validity period
Customized exclusive | Enterprise requirements | Exclusive IP pool + customized strategy

New users get a 2G free traffic pack on registration, which is enough for testing. There is also a hidden benefit: you can use their alternate proxy domain, proxy2.ipipgo.net, in your code. When the main domain is blocked by some sites, this one can be used instead.
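
Falling back to the alternate domain can be automated by trying each proxy address in turn. This is only a sketch: `fetchViaFirstWorkingProxy` is a hypothetical name, the fetch is passed in as a callable that receives the proxy address, and the port for proxy2.ipipgo.net is assumed to match the main domain, which is not confirmed here.

```php
<?php
// Hypothetical helper: tries each proxy address in order and returns the first success.
function fetchViaFirstWorkingProxy(array $proxies, callable $fetchVia): string
{
    $errors = [];
    foreach ($proxies as $proxy) {
        try {
            return $fetchVia($proxy);
        } catch (Exception $e) {
            // Remember why this proxy failed, then move on to the next one.
            $errors[] = $proxy . ': ' . $e->getMessage();
        }
    }
    throw new RuntimeException('All proxies failed: ' . implode('; ', $errors));
}

// Assumption: the alternate domain listens on the same port as the main one.
$proxies = ['proxy.ipipgo.com:9021', 'proxy2.ipipgo.net:9021'];
```

The callable would set `CURLOPT_PROXY` to the address it receives and throw an exception on failure, as the basic template does.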

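One last trick: when you run the scraping script on a schedule with crontab, remember to add a random sleep(mt_rand(1,5)) in the code. This mimics a real person's behavior and helps avoid triggering the target site's risk-control mechanisms. Combined with ipipgo's proxies, scraping goes essentially unnoticed. Tested and it works!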
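
The random pause might be wired into a crawl loop as in the sketch below. The function name `crawlAllWithPauses` and the callable parameter are assumptions made for illustration. The pause length uses the same `mt_rand` range as in the tip above.

```php
<?php
// Hypothetical helper: fetches each URL in turn, sleeping a random number of
// seconds between requests (but not before the first one).
function crawlAllWithPauses(array $urls, callable $fetch, int $minPause = 1, int $maxPause = 5): array
{
    $results = [];
    $isFirst = true;
    foreach ($urls as $url) {
        if (!$isFirst) {
            sleep(mt_rand($minPause, $maxPause));
        }
        $isFirst = false;
        $results[$url] = $fetch($url);
    }
    return $results;
}
```

Passing `'crawlWithProxy'` as the callable would reuse the basic template for each URL.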

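Our products are supported only in network environments outside mainland China (except for the TikTok dedicated line). Any actions users take with IPIPGO do not represent the will or views of IPIPGO, and IPIPGO bears no legal responsibility for them.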
