
Hands-on: scraping data in PHP through a proxy IP
Anyone who writes web crawlers has run into the dreaded 403 Forbidden. When that happens, a proxy IP is a lifesaver. Today we take the most straightforward route and show how to add proxy IP support to cURL in PHP.
Why not crawl without a proxy?
Many sites have anti-crawler mechanisms, for example:
- Frequent visits from the same IP get that IP blacklisted
- Servers can recognize data-center IP ranges
- IPs from certain regions receive extra scrutiny
Pairing the crawler with ipipgo's proxy IP pool at this point is like putting a gas mask on it: it helps the crawler avoid these detection traps.
Hands-on code
Let's look at a basic configuration:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "Destination URL");
curl_setopt($ch, CURLOPT_PROXY, "Proxy IP address:port");   // proxy host and port
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "account:password"); // proxy credentials
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);                // return the body instead of printing it
$output = curl_exec($ch);                                   // false on failure
if ($output === false) {
    echo "Request failed: " . curl_error($ch);
}
Focus on these three options:
| Option | Purpose |
|---|---|
| CURLOPT_PROXY | Address of the proxy server |
| CURLOPT_PROXYTYPE | Proxy type (optional; defaults to HTTP) |
| CURLOPT_PROXYUSERPWD | Authentication credentials, as account:password |
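The three options in the table can also be collected into one array and applied with `curl_setopt_array`. The following is a minimal sketch, not a definitive implementation: the helper name `buildProxyOptions`, the address `127.0.0.1:1080`, and the credentials are all placeholders invented for illustration, and `CURLPROXY_SOCKS5` is used only to show a non-default proxy type.

```php
<?php
// Hypothetical helper (not part of any SDK): builds the cURL options for one proxy.
function buildProxyOptions(string $hostPort, ?string $userPwd = null, int $type = CURLPROXY_HTTP): array
{
    $options = [
        CURLOPT_PROXY          => $hostPort, // "host:port"
        CURLOPT_PROXYTYPE      => $type,     // CURLPROXY_HTTP is cURL's default
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
    ];
    if ($userPwd !== null) {
        $options[CURLOPT_PROXYUSERPWD] = $userPwd; // "account:password"
    }
    return $options;
}

// Placeholder target, proxy address, and credentials.
$ch = curl_init("https://example.com/");
curl_setopt_array($ch, buildProxyOptions("127.0.0.1:1080", "user:pass", CURLPROXY_SOCKS5));
// Calling curl_exec($ch) at this point would send the request through the SOCKS5 proxy.
```

Keeping the options in a plain array makes it easy to swap proxies without repeating a run of `curl_setopt` calls.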
Automatic IP switching
For a crawler that has to run stably over a long period, you need to switch IPs automatically. ipipgo's Dynamic Proxy Service is recommended here; its API returns the latest IPs in real time:
// Get the proxy IP pool from ipipgo
$ipPool = json_decode(file_get_contents("https://api.ipipgo.com/getips?type=php"));
foreach ($ipPool as $proxy) {
    curl_setopt($ch, CURLOPT_PROXY, $proxy->ip . ":" . $proxy->port);
    $output = curl_exec($ch); // send the request through this proxy
    // Add further error handling logic here
    if (curl_errno($ch) == 0) {
        break; // break out of loop on success
    }
}
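The same rotation idea can be wrapped in a function so that the retry logic stays separate from the HTTP call. This is a sketch under stated assumptions: `fetchWithRotation` and `curlFetcher` are hypothetical names, the pool entries are assumed to expose `ip` and `port` properties as in the loop above, and the 10-second timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper: try each proxy in turn until one request succeeds.
// $fetch receives "ip:port" and must return the response body, or false on failure.
function fetchWithRotation(array $pool, callable $fetch)
{
    foreach ($pool as $proxy) {
        $body = $fetch($proxy->ip . ":" . $proxy->port);
        if ($body !== false) {
            return $body; // first proxy that works wins
        }
    }
    return false; // every proxy in the pool failed
}

// Hypothetical cURL-backed fetcher for a fixed target URL.
function curlFetcher(string $url): callable
{
    return function (string $hostPort) use ($url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_PROXY          => $hostPort,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 10, // assumed value
        ]);
        return curl_exec($ch); // string on success, false on transport error
    };
}
```

A call such as `fetchWithRotation($ipPool, curlFetcher("https://example.com/"))` would then walk the pool; passing the fetcher as a callable also lets the rotation be exercised without network access.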
Pitfalls you need to know about
1. Don't be lazy with timeout settings: set CURLOPT_TIMEOUT to 8-15 seconds; anything shorter tends to flag working proxies as failed.
2. Remember to cover your tracks: set CURLOPT_USERAGENT so the request looks like it comes from a browser.
3. Validate your proxies: periodically check the response status code with curl_getinfo.
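The three tips above can be sketched as a pair of small helpers. The names `hardenedOptions` and `isProxyHealthy` are hypothetical, the User-Agent string and the 5-second connect timeout are assumed example values, and treating only 2xx responses as healthy is one possible policy rather than a rule from cURL itself.

```php
<?php
// Hypothetical helper covering tips 1 and 2: a timeout and a browser-like User-Agent.
function hardenedOptions(int $timeoutSeconds = 10): array
{
    return [
        CURLOPT_TIMEOUT        => $timeoutSeconds, // total time allowed for the request
        CURLOPT_CONNECTTIMEOUT => 5,               // assumed value for the connect phase
        CURLOPT_USERAGENT      => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Hypothetical helper covering tip 3: a proxy counts as healthy only when the
// transfer finished without a cURL error and the server answered with a 2xx status.
function isProxyHealthy(int $curlErrno, int $httpCode): bool
{
    return $curlErrno === 0 && $httpCode >= 200 && $httpCode < 300;
}

// Usage with a real handle (placeholders: $url, $hostPort):
// $ch = curl_init($url);
// curl_setopt_array($ch, hardenedOptions() + [CURLOPT_PROXY => $hostPort]);
// curl_exec($ch);
// $ok = isProxyHealthy(curl_errno($ch), curl_getinfo($ch, CURLINFO_HTTP_CODE));
```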
Frequently Asked Questions
Q: My proxy IPs stop working partway through a job?
A: In this case, ipipgo's dynamic short-lived proxies are recommended; their IP lifetime can be controlled down to the minute.
Q: The returned data is always incomplete?
A: Try setting the CURLOPT_ENCODING option, since some proxies compress the data. Setting it to an empty string makes cURL accept every encoding it supports and decompress the response automatically.
Q: How can I tell if a proxy is anonymous?
A: Use the detection interface provided by ipipgo, which returns the X-Forwarded-For header. If your real IP shows up in it, the proxy is not anonymous.
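The anonymity check from the last answer can be sketched as a pure function. Everything about the input here is an assumption: the response format of ipipgo's detection interface is not described in this article, so the sketch simply takes the echoed request headers as an associative array, and `proxyLeaksRealIp` is a hypothetical name.

```php
<?php
// Hypothetical check: $echoedHeaders holds the request headers as seen by a
// detection endpoint (format assumed), $realIp is your own public IP.
function proxyLeaksRealIp(array $echoedHeaders, string $realIp): bool
{
    foreach ($echoedHeaders as $name => $value) {
        if (strtolower((string) $name) !== 'x-forwarded-for') {
            continue; // only the header named in the answer is inspected
        }
        // X-Forwarded-For is a comma-separated list of addresses.
        $ips = array_map('trim', explode(',', (string) $value));
        if (in_array($realIp, $ips, true)) {
            return true; // the proxy forwarded the real IP, so it is not anonymous
        }
    }
    return false;
}
```

Comparing whole list entries rather than substrings avoids treating `11.2.3.45` as a match for `1.2.3.4`.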
Finally, to be honest, maintaining a proxy IP pool on your own is both costly and exhausting. Professional service providers like ipipgo not only offer tens of millions of IP resources; their PHP SDK is also well packaged, and the integration documentation is written in plain language, so using it directly is recommended.

