PHP Crawling: Proxy IP to resolve request limitations

How to use proxy IPs to get past website access restrictions

Anyone who writes web crawlers has run into this mess: the script is running along and suddenly stalls, because the site has either started popping up CAPTCHAs or blocked your IP outright. That is when it's time to bring out the life-saver: the proxy IP. Today we'll take PHP and show how to use ipipgo's proxy service to deal with these website restrictions.

Why does your crawler always get caught?

Webmasters are no pushovers. They keep an eye on the access logs, and when they spot one IP hammering the site with requests, they ban it on the spot. An ordinary user loads a few pages per minute, while a crawler may send dozens of requests per second. At that frequency, anyone can see something is off.


// Example of a typical crawler that gets itself banned
for ($i = 0; $i < 1000; $i++) {
    // Placeholder URL; 1000 back-to-back requests from one IP with no delay
    $html = file_get_contents('https://example.com/');
    // Parse the data...
}

Run it like this and within half an hour your IP is all but guaranteed to be blacklisted. This is where proxy IPs come in: by rotating through them, you make the site think it is being visited by different users.

Real-world PHP proxy configuration

Here are two common methods, demonstrated with ipipgo's proxy service (their API is particularly easy to integrate).

Method 1: Setting a proxy with cURL


$proxy = 'host:port'; // placeholder: proxy address and port assigned by ipipgo
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://example.com/'); // placeholder destination URL
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
// It is recommended to add a timeout so a dead proxy cannot hang the script
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$output = curl_exec($ch);
if ($output === false) {
    // curl_error() reports why the request failed
    echo 'Request failed: ' . curl_error($ch);
}
curl_close($ch);

Method 2: Setting a proxy with a stream context


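// $proxy is the 'host:port' string assigned by ipipgo, as in Method 1
$context = stream_context_create([
    'http' => [
        'proxy' => 'tcp://' . $proxy,  // the http wrapper expects a tcp:// proxy URI
        'request_fulluri' => true      // send the full URL in the request line
    ]
]);
// Placeholder destination URL
$response = file_get_contents('https://example.com/', false, $context);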

How to choose a reliable proxy IP?

Proxy providers on the market vary widely in quality, so here I have to recommend ipipgo. Here is a list of their advantages for comparison:

| Feature | Typical provider | ipipgo |
| Connection speed | Frequent lag | 5G leased line |
| IP pool size | Thousands | Millions |
| IP replacement | Manual | Automatic switching via API |
| After-sales service | Can't reach anyone | Online 24 hours |

Common pitfalls and how to avoid them

Q: What should I do if a proxy IP stops working partway through?
A: Set up a retry-on-failure mechanism. ipipgo's API supports fetching new IPs automatically, and changing the proxy every 20 requests is recommended.
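
The retry-and-rotate advice above could be sketched roughly like this. It is only a minimal sketch, not ipipgo's API: `ProxyRotator`, `fetchWithRetry`, and the `$fetch` callable are hypothetical names, and the proxy list is assumed to have been fetched already from your provider.

```php
<?php
// Hypothetical helper: hands out proxies and moves on to the next one
// after a fixed number of requests (20 by default, per the advice above).
final class ProxyRotator
{
    /** @var string[] list of "host:port" strings (assumed non-empty) */
    private array $proxies;
    private int $maxUses;
    private int $index = 0;
    private int $used = 0;

    public function __construct(array $proxies, int $maxUses = 20)
    {
        $this->proxies = array_values($proxies);
        $this->maxUses = $maxUses;
    }

    // Return the current proxy, rotating first if it has reached its quota.
    public function current(): string
    {
        if ($this->used >= $this->maxUses) {
            $this->rotate();
        }
        $this->used++;
        return $this->proxies[$this->index];
    }

    // Move to the next proxy in the list and reset the usage counter.
    public function rotate(): void
    {
        $this->index = ($this->index + 1) % count($this->proxies);
        $this->used = 0;
    }
}

// Try a request up to $maxAttempts times, switching proxy after each failure.
// $fetch is any callable taking (url, proxy) and returning the body or false.
function fetchWithRetry(string $url, ProxyRotator $rotator, callable $fetch, int $maxAttempts = 3)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = $fetch($url, $rotator->current());
        if ($body !== false) {
            return $body;
        }
        $rotator->rotate(); // this proxy failed, so do not reuse it
    }
    return false; // every attempt failed
}
```

In a real crawler, `$fetch` would wrap the cURL code from Method 1 and pass the given proxy to `CURLOPT_PROXY`. Keeping it as a callable means the retry logic can be tried out without touching the network.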

Q: Why am I still getting blocked even though I'm using a proxy?
A: Check whether your request headers look like a browser's, and don't use a User-Agent that is obviously a crawler. Also don't let the request frequency get too crazy: keeping it within 3 requests per second is recommended.
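
As a rough illustration of both points, the sketch below builds browser-like cURL options and computes how long to wait to stay under 3 requests per second. The function names are made up for this example, and the User-Agent and header values are only sample strings, not values recommended by ipipgo.

```php
<?php
// Build cURL options that look more like an ordinary browser visit.
// The User-Agent and headers below are example values only.
function browserLikeOptions(string $url, string $proxy): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
            . 'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language: en-US,en;q=0.9',
        ],
    ];
}

// Seconds to sleep so that requests stay at or below $maxPerSecond.
// $lastRequestAt and $now are timestamps as returned by microtime(true).
function throttleDelay(float $lastRequestAt, float $now, float $maxPerSecond = 3.0): float
{
    $minInterval = 1.0 / $maxPerSecond;
    $elapsed = $now - $lastRequestAt;
    return max(0.0, $minInterval - $elapsed);
}
```

Before each request, something like `usleep((int) round(throttleDelay($last, microtime(true)) * 1000000))` would enforce the pause, and `curl_setopt_array($ch, browserLikeOptions($url, $proxy))` would apply the options to a cURL handle.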

Q: What should I do if my proxy IP responds slowly?
A: Choose a "high-speed channel" node in the ipipgo dashboard, or try servers in different regions. Sometimes a node that is physically closer is faster.
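
If you have several nodes to choose from, one way to compare them is to time a test request through each. The sketch below is an assumption-laden example rather than anything ipipgo provides: `curlProbe` and `rankProxiesBySpeed` are hypothetical names, and the test URL is whatever page you consider representative.

```php
<?php
// Time one request through a proxy with cURL.
// Returns the elapsed seconds, or null if the request failed.
function curlProbe(string $proxy, string $testUrl, int $timeout = 5): ?float
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $testUrl);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    $ok = curl_exec($ch) !== false;
    $elapsed = curl_getinfo($ch, CURLINFO_TOTAL_TIME);
    curl_close($ch);
    return $ok ? (float) $elapsed : null;
}

// Probe every proxy and return them sorted fastest first,
// dropping the ones whose probe failed.
// $probe is any callable taking a proxy and returning ?float.
function rankProxiesBySpeed(array $proxies, callable $probe): array
{
    $timings = [];
    foreach ($proxies as $proxy) {
        $seconds = $probe($proxy);
        if ($seconds !== null) {
            $timings[$proxy] = $seconds;
        }
    }
    asort($timings); // sort by elapsed time, keeping proxy => time pairs
    return array_keys($timings);
}
```

A call such as `rankProxiesBySpeed($proxies, fn ($p) => curlProbe($p, 'https://example.com/'))` would then give the proxies in order of measured speed. A single probe is noisy, so treat the ranking as a hint.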

Honest advice for newbies

If you are just starting to play with crawlers, it's best to practice with ipipgo's free trial package first. New users get 1 GB of traffic, which is enough to test the basic functions. Remember a few key points:

1. Pick a proxy at random from the IP pool before each request
2. Record how many times each IP has been used
3. Switch IPs immediately when a response looks abnormal
4. Test proxy availability periodically
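
The four points above could be combined into a small in-memory pool along these lines. This is a hypothetical sketch: the `ProxyPool` class and its method names are invented for illustration, and how you detect an "abnormal" response or a dead proxy is left to the caller.

```php
<?php
// Hypothetical in-memory proxy pool covering the four points above.
final class ProxyPool
{
    /** @var array<string,int> proxy => number of times it has been handed out */
    private array $uses = [];

    /** @param string[] $proxies "host:port" strings */
    public function __construct(array $proxies)
    {
        foreach ($proxies as $proxy) {
            $this->uses[$proxy] = 0;
        }
    }

    // Points 1 and 2: draw a random proxy and count the use.
    public function pick(): ?string
    {
        if ($this->uses === []) {
            return null; // pool is empty
        }
        $proxy = array_rand($this->uses);
        $this->uses[$proxy]++;
        return $proxy;
    }

    // Point 3: drop a proxy as soon as a response looks abnormal.
    public function discard(string $proxy): void
    {
        unset($this->uses[$proxy]);
    }

    // Point 4: run a health check over the pool and drop dead proxies.
    // $isAlive is any callable taking a proxy and returning bool.
    public function prune(callable $isAlive): void
    {
        foreach (array_keys($this->uses) as $proxy) {
            if (!$isAlive($proxy)) {
                $this->discard($proxy);
            }
        }
    }

    // How many times the given proxy has been handed out so far.
    public function useCount(string $proxy): int
    {
        return $this->uses[$proxy] ?? 0;
    }

    // Number of proxies still in the pool.
    public function size(): int
    {
        return count($this->uses);
    }
}
```

A crawler loop would call `pick()` before each request, `discard()` when the response is a block page or an error, and `prune()` on a timer with a callable that sends a quick test request through each proxy.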

Finally, a heartfelt word: don't trust free proxies, nine out of ten of them are traps. Leave professional work to professionals. A paid service like ipipgo does cost money, but it saves you a lot of time fiddling around, and not being let down at the critical moment is what makes it really cost-effective.

This article was originally published or compiled by ipipgo: https://www.ipipgo.com/en-us/ipdaili/36854.html
