
Crawler keeps getting its IP blocked? Try this life-saving trick!
Anyone who has done data scraping for a while knows that the biggest headache is the target site suddenly blocking your IP. Two days ago a friend who works on e-commerce price comparison complained to me that his IP was blacklisted after his program had run for only half an hour. Honestly, you can't blame the site for being harsh: no server can put up with that many requests. This is when proxy IPs come to the rescue.
How to use proxy IPs reliably
There are plenty of proxy service providers on the market, but what matters is picking the right type. Here is a comparison table:
| Type | Anonymity | Speed | Applicable scenarios |
|---|---|---|---|
| Transparent proxy | Low | Fast | General access |
| Anonymous proxy | Medium | Moderate | Routine collection |
| High-anonymity proxy | High | Stable | High-frequency crawling |
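The three levels in the table are usually told apart by what the target server can see in the incoming request. Here is a rough sketch, assuming the common convention (not a formal standard) that a transparent proxy forwards your real IP in X-Forwarded-For, an anonymous proxy hides the IP but still sends proxy headers such as Via, and a high-anonymity proxy sends neither. The function name and the header list are my own illustration, not any provider's API:

```php
<?php
// Classify a proxy from the headers an echo endpoint reports back.
// Assumption: the header conventions below are typical, not guaranteed.
function classifyProxyAnonymity(array $seenHeaders, string $realClientIp): string
{
    // Header names are case-insensitive, so normalize them first.
    $headers = array_change_key_case($seenHeaders, CASE_LOWER);

    // Transparent: the real client IP is passed along to the target.
    $forwarded = array_map('trim', explode(',', $headers['x-forwarded-for'] ?? ''));
    if (in_array($realClientIp, $forwarded, true)) {
        return 'transparent';
    }

    // Anonymous: the IP is hidden, but proxy-specific headers still show up.
    foreach (['via', 'x-forwarded-for', 'proxy-connection'] as $name) {
        if (isset($headers[$name])) {
            return 'anonymous';
        }
    }

    // High anonymity: nothing in the headers hints at a proxy.
    return 'high-anonymity';
}
```

To try it, send a request through the proxy to an endpoint you control that echoes the received headers, then pass those headers and your real public IP to the function.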
My recommendation is to go straight to ipipgo's high-anonymity proxies. Their IP pool is refreshed with 200,000+ IPs every day, and in my tests the request success rate reached 98%. They are especially suitable for long-running, unattended collection jobs. Don't ask how I know; let's just say I scraped product data from a certain big e-commerce site with their IPs and it ran rock steady.
Hands-on: using a proxy from PHP
Taking the most commonly used cURL library as an example, adding the proxy options is actually very simple. Look at this code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'Target URL');
// Proxy server address and port obtained from ipipgo
curl_setopt($ch, CURLOPT_PROXY, 'proxy server address:port');
curl_setopt($ch, CURLOPT_PROXYUSERPWD, 'Username:Password');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$output = curl_exec($ch);
if ($output === false) {
    error_log('Proxy request failed: ' . curl_error($ch));
}
curl_close($ch);
Note that you have to replace the proxy address with the dedicated endpoint you get from the ipipgo backend. Their documentation includes ready-made code examples, so you can copy, paste, and just change the parameters. Don't set the timeout too long; about 30 seconds is appropriate.
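The snippet above does not actually set a timeout, so here is a minimal sketch of how the 30-second limit could be wired in. The function names and the separate 10-second connect timeout are my own assumptions; the proxy host, credentials, and URL are placeholders you would replace with your own values:

```php
<?php
// Build the cURL options for a proxied request.
// Assumption: a 10-second cap on the connect phase is a reasonable default.
function buildProxyOptions(string $url, string $proxy, string $userPwd, int $timeout = 30): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,      // "host:port" of the proxy
        CURLOPT_PROXYUSERPWD   => $userPwd,    // "username:password"
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_CONNECTTIMEOUT => min(10, $timeout), // seconds to establish the connection
        CURLOPT_TIMEOUT        => $timeout,    // seconds for the whole request
    ];
}

// Send the request and throw if cURL reports a failure.
function fetchThroughProxy(array $options): string
{
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);
    if ($body === false) {
        throw new RuntimeException("Proxy request failed: $error");
    }
    return $body;
}
```

A call would look like `fetchThroughProxy(buildProxyOptions('https://example.com/', 'proxy.example.com:8080', 'user:pass'))`. Keeping the options in a plain array makes it easy to adjust the timeout per site without touching the request code.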
A practical guide to avoiding pitfalls
Some sites detect proxy fingerprints in the request headers. Here is a neat trick: strip the Via field from the request headers, then use the random UA generator that ipipgo provides. In my tests this got past 90% of anti-scraping checks.
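As a sketch of the header cleanup, the helper below drops any Via entry and swaps in a randomly chosen User-Agent. I don't know the interface of ipipgo's UA generator, so `$userAgents` is a hypothetical local list standing in for it, and the function name is my own. Keep in mind this only controls the headers your client sends; whether the proxy adds its own Via on the way out depends on the proxy.

```php
<?php
// Return headers in the "Name: value" form that CURLOPT_HTTPHEADER expects,
// without Via and with a random User-Agent.
// Assumption: $userAgents is a local list replacing a real UA generator.
function buildRequestHeaders(array $headers, array $userAgents): array
{
    if ($userAgents === []) {
        throw new InvalidArgumentException('At least one User-Agent is required');
    }

    $lines = [];
    foreach ($headers as $name => $value) {
        $lower = strtolower($name);
        if ($lower === 'via' || $lower === 'user-agent') {
            continue; // Via is dropped; User-Agent is replaced below
        }
        $lines[] = "$name: $value";
    }

    // Pick one User-Agent at random for this request.
    $lines[] = 'User-Agent: ' . $userAgents[array_rand($userAgents)];
    return $lines;
}
```

The result can be passed straight to cURL with `curl_setopt($ch, CURLOPT_HTTPHEADER, buildRequestHeaders($headers, $userAgents));`, calling the helper once per request so the User-Agent changes each time.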
Also remember to set up a retry-on-failure mechanism; polling through the IP pool is the recommended approach. For example:
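$proxyList = ['IP1:PORT', 'IP2:PORT', 'IP3:PORT']; // IP pool from ipipgo
$maxRetry = 3;
for ($i = 0; $i < $maxRetry; $i++) {
    $proxy = $proxyList[$i % count($proxyList)]; // poll the pool in order
    try {
        $ch = curl_init('Target URL');
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $output = curl_exec($ch);
        $error = curl_error($ch);
        curl_close($ch);
        if ($output === false) { // cURL returns false instead of throwing
            throw new RuntimeException("Request via $proxy failed: $error");
        }
        break; // success, stop retrying
    } catch (Exception $e) {
        error_log($e->getMessage()); // record the error log
    }
}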
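If several collectors need the same retry logic, one option is to pull the polling loop into a function and pass the actual request in as a callable. This is only a sketch: `requestWithRotation` and the `$send` callable are names I made up, and `$send` is assumed to perform the request through the given proxy and throw on failure.

```php
<?php
// Try proxies from the pool in round-robin order until one attempt succeeds.
// Assumption: $send(string $proxy) performs the request and throws on failure.
function requestWithRotation(array $proxyList, int $maxRetry, callable $send)
{
    $proxyList = array_values($proxyList); // ensure 0-based indexes
    if ($proxyList === []) {
        throw new InvalidArgumentException('Proxy pool is empty');
    }

    $lastError = null;
    for ($i = 0; $i < $maxRetry; $i++) {
        $proxy = $proxyList[$i % count($proxyList)];
        try {
            return $send($proxy); // first success ends the loop
        } catch (Exception $e) {
            $lastError = $e;
            error_log('Attempt ' . ($i + 1) . " via $proxy failed: " . $e->getMessage());
        }
    }

    // Every attempt failed: surface the last error to the caller.
    throw new RuntimeException("All $maxRetry attempts failed", 0, $lastError);
}
```

Because the request itself is injected, the retry logic can be checked with a fake `$send` that fails for chosen proxies, without touching the network.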
Frequently Asked Questions
Q: Does using a proxy IP slow things down?
A: It comes down to choosing the right provider. With ipipgo's BGP-line proxies, for example, latency stays basically within 200 ms, more than 10 times faster than some free proxies.
Q: What if the website still detects that I am using a proxy?
A: Try their dynamic port feature, which automatically changes the port on every request. If that still doesn't help, use their API to fetch the latest IPs in real time; that works.
Q: How do I choose a reliable proxy service provider?
A: Focus on three points: 1. IP lifetime; 2. the concurrent connection limit; 3. after-sales technical support. ipipgo, for example, offers 24-hour technical support, so problems get resolved promptly.
Lastly, a reminder for newcomers: don't use free proxies just because they cost nothing, since they may leak your data or let you be traced back. Leave professional work to professional tools. ipipgo's newcomer package at 50 cents per day covers small and medium-sized collection needs and is far less hassle than building your own proxy pool.

