PHP Web Crawler Proxy: Integrating Proxy IPs in PHP with Real-World Code

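Step by Step: Plugging ipipgo Proxies into a PHP Crawler

Anyone who writes crawlers knows that going online without a proxy IP is like running around naked. If you scrape with PHP, wrapping curl in a few extra options is all it takes to connect to ipipgo's proxy service. Start with the simplest configuration: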


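$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'target URL');
curl_setopt($ch, CURLOPT_PROXY, 'proxy IP:port'); // e.g. 121.40.88.66:8000
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'username:password');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 15); // give up after 15 seconds
$result = curl_exec($ch); // false on failure; inspect curl_error($ch)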

One pitfall to watch out for: do not hard-code the proxy IP. It is better to fetch the IP pool dynamically through ipipgo's API. The API returns JSON, which json_decode handles easily.
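As a minimal sketch of the json_decode step, the helper below parses a response body and extracts the proxy address. The function name parseProxyResponse is hypothetical, and the assumption that the address sits under a 'proxy' key in host:port form follows this article's own example rather than any documented format, so check the actual API response before relying on it.

```php
<?php
// Hypothetical helper: extract a "host:port" proxy from a JSON response body.
// Assumes the address is stored under the 'proxy' key (an assumption).
function parseProxyResponse(string $json): string
{
    $data = json_decode($json, true); // null when the body is not valid JSON
    if (!is_array($data) || !isset($data['proxy']) || !is_string($data['proxy'])) {
        throw new RuntimeException('Unexpected proxy API response');
    }
    // Basic sanity check: a host part, a colon, then a numeric port.
    if (!preg_match('/^[^:\s]+:\d{1,5}$/', $data['proxy'])) {
        throw new RuntimeException('Malformed proxy address: ' . $data['proxy']);
    }
    return $data['proxy'];
}

// Sample body for illustration only.
$proxy = parseProxyResponse('{"proxy":"121.40.88.66:8000"}');
echo $proxy, PHP_EOL;
```

Validating the response up front means a broken or empty API reply fails loudly here instead of surfacing later as a confusing curl error.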

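Smart Proxy IP Switching

In practice, many beginners get stuck when a proxy stops working. Here is a trick: wrap the request in try...catch and switch to a new proxy as soon as a timeout or 403 error is caught. Note that curl_exec itself does not throw; it returns false on failure, so the request code has to check the result and the HTTP status and throw the exception itself.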


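function getProxy() {
    // Call ipipgo's API to fetch a fresh IP
    $api = 'https://api.ipipgo.com/get?type=dynamic';
    $body = file_get_contents($api);
    if ($body === false) {
        throw new RuntimeException('Proxy API request failed');
    }
    $ipData = json_decode($body, true);
    return $ipData['proxy'];
}

$retries = 0;
$maxRetries = 3;
do {
    try {
        $proxy = getProxy();
        // ... run the curl request; throw an Exception on timeout or HTTP 403 ...
        break;
    } catch (Exception $e) {
        // Log the failure, count the attempt, then retry with a new proxy
        error_log($e->getMessage());
        $retries++;
    }
} while ($retries < $maxRetries);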
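The elided request step in the loop above could be filled in along the following lines. This is a sketch under stated assumptions, not the article's implementation: requestViaProxy and fetchWithRotation are hypothetical names, and the proxy source and request function are passed in as callables so the retry logic can be tried without network access.

```php
<?php
// Sketch: one proxied request that throws on transport errors (timeouts included) and on HTTP 403.
function requestViaProxy(string $url, string $proxy, int $timeout = 15): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => $timeout,
    ]);
    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("curl failed: $error");
    }
    if ($status === 403) {
        throw new RuntimeException('HTTP 403 via ' . $proxy);
    }
    return $body;
}

// Ask for a fresh proxy on every attempt and stop at the first success.
function fetchWithRotation(string $url, callable $getProxy, callable $request, int $maxAttempts = 3): string
{
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $proxy = $getProxy();
        try {
            return $request($url, $proxy);
        } catch (RuntimeException $e) {
            $lastError = $e; // remember the failure and try the next proxy
        }
    }
    throw new RuntimeException("All $maxAttempts attempts failed", 0, $lastError);
}
```

With the getProxy function from the article in scope, a call might look like fetchWithRotation('https://example.com/', 'getProxy', 'requestViaProxy'). Injecting the two callables also makes it easy to swap in a cached proxy pool later.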

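Must-Know Anti-Ban Techniques

Rotating IPs alone is not enough; the crawler also has to look like a real person browsing. These options are worth adding: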


curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // request gzip and decompress the response automatically
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Accept-Language: zh-CN,zh;q=0.9',
    'User-Agent: Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36...'
]);
curl_setopt($ch, CURLOPT_REFERER, 'https://www.google.com/');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt'); // send the cookies stored in this file

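ipipgo's dynamic residential proxies have a hidden feature: setting a session_id keeps a session on the same IP. This is especially useful for sites that require login. Just append session= followed by a custom string to the proxy address.

Q&A: Common Failure Scenarios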

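Symptom                         | Solution
Connections always time out     | Check whether the wrong proxy type is selected; dynamic proxies must use the _http type
Cloudflare verification appears | Switch to ipipgo's static residential proxies and lower the request rate
Blank page returned             | Add CURLOPT_FOLLOWLOCATION to follow redirects
Account banned                  | Enable automatic proxy switching and set random request intervals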
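The random request interval suggested in the last row can be sketched as below. The function name randomPause and the millisecond bounds are illustrative assumptions; suitable values depend on the target site.

```php
<?php
// Sleep for a random interval between $minMs and $maxMs milliseconds
// and return the delay that was used.
function randomPause(int $minMs = 500, int $maxMs = 3000): int
{
    if ($minMs < 0 || $maxMs < $minMs) {
        throw new InvalidArgumentException('Expected 0 <= minMs <= maxMs');
    }
    $delayMs = random_int($minMs, $maxMs);
    usleep($delayMs * 1000); // usleep takes microseconds
    return $delayMs;
}

foreach (['https://example.com/a', 'https://example.com/b'] as $url) {
    // ... fetch $url through the current proxy here ...
    $waited = randomPause(100, 300);
    echo "Waited {$waited} ms after {$url}", PHP_EOL;
}
```

Calling the pause between requests, rather than sleeping a fixed amount, avoids the perfectly regular timing that makes automated traffic easy to spot.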

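How to Choose a Plan

Recommendations based on hands-on testing:

  • For general scraping, use Dynamic Residential (Standard); the IP rotates automatically every hour
  • For scenarios that need a fixed IP (such as API calls), choose Static Residential
  • For enterprise-scale scraping, go straight to Dynamic Residential (Business), which supports double the concurrency

A final reminder: when using ipipgo, turn on their IP Survival Detection feature, which filters out dead proxies in advance. To do so, add the &check=1 parameter when requesting a proxy; only proxies that come back with status code 200 are usable.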
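A small sketch of adding the check parameter to the API request follows. The base URL and the type value are taken from the getProxy example earlier in the article; buildProxyApiUrl is a hypothetical helper, and how the API reports the 200 status in its response is not shown here because the source does not specify it.

```php
<?php
// Build the proxy API URL from a base address and query parameters.
// The parameter names mirror the article; the full parameter set is an assumption.
function buildProxyApiUrl(string $base, array $params): string
{
    return $base . '?' . http_build_query($params);
}

$url = buildProxyApiUrl('https://api.ipipgo.com/get', ['type' => 'dynamic', 'check' => 1]);
echo $url, PHP_EOL; // https://api.ipipgo.com/get?type=dynamic&check=1
```

Building the query with http_build_query keeps the parameters in one array, so toggling the liveness check is a one-line change.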

This article was originally published or compiled by ipipgo: https://www.ipipgo.com/en-us/ipdaili/46932.html
