IPIPGO IP proxy: PHP crawling example and web scraping tutorial

A hands-on guide to scraping web pages with PHP without getting your IP blocked

Ever had your IP blocked by a site while you were scraping data? Today we'll talk through how proxy IPs solve this headache. Using our own ipipgo service as the example, we'll walk step by step through how to do it in PHP.

Why do I need a proxy IP to capture data?

Here's an example: if you go to the supermarket ten times in a row to buy snacks and show the same membership card every time, the cashier is bound to get suspicious. Anti-scraping systems work the same way: frequent visits from the same IP are likely to get that IP blocked. This is where a proxy IP comes in. It is the equivalent of swapping your membership card on every trip to the supermarket.

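// Normal request (easily blocked)
// 'http://target-site.example' is a placeholder for the site you want to scrape
$html = file_get_contents('http://target-site.example');

// Request through a proxy IP (safer)
$context = stream_context_create([
    'http' => [
        'proxy' => 'tcp://ipipgo-proxy.com:8080', // proxy address in tcp://host:port form
        'request_fulluri' => true                 // send the full URI, which most HTTP proxies expect
    ]
]);
$html = file_get_contents('http://target-site.example', false, $context);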

The three-piece PHP proxy toolkit in practice

Here's a configuration checklist to follow:

Tool               Purpose                                    Recommended option
IP pool            Provide multiple IP addresses              ipipgo Dynamic Residential Proxy
Request headers    Simulate browser access                    Randomized User-Agent generation
Request interval   Avoid triggering rate-based risk control   sleep(rand(1,3))
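The last two rows of the checklist can be sketched in a few lines. This is only a minimal illustration: the helper names (`randomUserAgent`, `buildProxyContext`, `politeDelay`) and the User-Agent pool are made up for this example, so swap in whatever strings and timings suit your target site.

```php
<?php
// Hypothetical pool of User-Agent strings; replace with current browser strings.
const USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
];

// Pick one User-Agent at random from the pool.
function randomUserAgent(): string
{
    return USER_AGENTS[array_rand(USER_AGENTS)];
}

// Build a stream context that goes through the given proxy
// and sends a randomly chosen User-Agent.
function buildProxyContext(string $proxyUri)
{
    return stream_context_create([
        'http' => [
            'proxy' => $proxyUri,
            'request_fulluri' => true,
            'user_agent' => randomUserAgent(),
            'timeout' => 30,
        ],
    ]);
}

// Sleep for a random number of seconds and return how long we slept.
function politeDelay(int $minSeconds = 1, int $maxSeconds = 3): int
{
    $seconds = rand($minSeconds, $maxSeconds);
    sleep($seconds);
    return $seconds;
}

// Usage sketch (not executed here):
// foreach ($urls as $url) {
//     $html = file_get_contents($url, false, buildProxyContext('tcp://ipipgo-proxy.com:8080'));
//     politeDelay();
// }
```

Building a fresh context per request means each request gets its own randomly picked User-Agent, and the delay keeps the request rate irregular.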

Real-world example: scraping e-commerce prices

Recently a friend who runs a price comparison website came to me and said his PHP scraper kept getting blocked. I set him up with an ipipgo-based solution, and it has now been running stably for two months. The key code looks like this:


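// Get the latest proxy IP from ipipgo
$proxy = json_decode(file_get_contents('https://api.ipipgo.com/getproxy'));

$options = [
    CURLOPT_URL => 'http://target-site.example', // placeholder: the page to scrape
    CURLOPT_RETURNTRANSFER => true,              // return the body instead of printing it
    CURLOPT_PROXY => $proxy->ip,
    CURLOPT_PROXYPORT => $proxy->port,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTPHEADER => [
        'User-Agent: Mozilla/5.0 (Windows NT 10.0) Turnip Head Browser'
    ]
];

$ch = curl_init();
curl_setopt_array($ch, $options);
$data = curl_exec($ch); // false on failure; see curl_error($ch) for the reason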
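The snippet above trusts that the API call succeeds and returns JSON with `ip` and `port` fields. A slightly more defensive sketch might validate the response first. The function name `parseProxyResponse` is hypothetical, and the response shape is assumed only from the two fields the code above reads.

```php
<?php
// Returns ['ip' => string, 'port' => int], or null if the payload is unusable.
// Accepts the raw return value of file_get_contents(), which is false on failure.
function parseProxyResponse($json): ?array
{
    if (!is_string($json) || $json === '') {
        return null;
    }

    $data = json_decode($json, true);
    if (!is_array($data) || empty($data['ip']) || empty($data['port'])) {
        return null;
    }

    // Reject ports outside the valid TCP range.
    $port = (int) $data['port'];
    if ($port < 1 || $port > 65535) {
        return null;
    }

    return ['ip' => (string) $data['ip'], 'port' => $port];
}

// Usage sketch (not executed here):
// $proxy = parseProxyResponse(@file_get_contents('https://api.ipipgo.com/getproxy'));
// if ($proxy === null) { /* back off and try again later */ }
```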

Frequently Asked Questions

Q: What should I do if my proxy IP stops working?
A: This is why we recommend ipipgo's dynamic IP service: its IP pool automatically rotates to a fresh batch every 5 minutes, which is much more stable than no-name providers.

Q: What if the crawl is too slow?
A: You can try concurrent requests, but keep the pace under control. ipipgo's enterprise version supports dedicated multi-threaded channels, which can increase speed by more than 3 times.
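As a rough sketch of "concurrent, but paced", PHP's `curl_multi_*` functions can run a small batch of requests in parallel and pause between batches. The function name `fetchInBatches`, the batch size, and the pause length are all assumptions for illustration; this uses plain cURL, not any ipipgo-specific channel.

```php
<?php
// Fetch URLs through one proxy, at most $batchSize at a time.
// Returns [url => body string, or null when no 2xx response was received].
function fetchInBatches(array $urls, string $proxyHost, int $proxyPort, int $batchSize = 5): array
{
    $results = [];
    $batches = array_chunk(array_unique($urls), max(1, $batchSize));

    foreach ($batches as $index => $batch) {
        $multi = curl_multi_init();
        $handles = [];

        foreach ($batch as $url) {
            $ch = curl_init($url);
            curl_setopt_array($ch, [
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_PROXY => $proxyHost,
                CURLOPT_PROXYPORT => $proxyPort,
                CURLOPT_TIMEOUT => 30,
            ]);
            curl_multi_add_handle($multi, $ch);
            $handles[$url] = $ch;
        }

        // Drive all transfers in this batch until none are still running.
        do {
            $status = curl_multi_exec($multi, $running);
            if ($running > 0) {
                curl_multi_select($multi, 1.0);
            }
        } while ($running > 0 && $status === CURLM_OK);

        foreach ($handles as $url => $ch) {
            $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            $results[$url] = ($code >= 200 && $code < 300) ? curl_multi_getcontent($ch) : null;
            curl_multi_remove_handle($multi, $ch);
        }
        curl_multi_close($multi);

        // Pause between batches (but not after the last one) to control the pace.
        if ($index < count($batches) - 1) {
            sleep(rand(1, 3));
        }
    }

    return $results;
}
```

Keeping the batch size small and sleeping between batches trades some speed for a lower chance of tripping rate limits.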

Q: How do I get past a CAPTCHA when I run into one?
A: That is an advanced form of protection. We suggest adding an automatic recognition module to your code, or contacting ipipgo's technical support for a customized solution.

Pitfalls to avoid

The most common pitfall for newbies is poor proxy IP quality. Some free proxies look like they work, but in reality 8 out of 10 are dead. In my own tests, ipipgo's commercial-grade proxies reached a success rate of up to 98%, while free proxies did not even reach 30%.

One last tip: add an exception retry mechanism. If a request fails, automatically switch to the next IP and try again. ipipgo's API returns a list of IPs with availability ratings; prioritize the highly rated ones and you will avoid a lot of detours.
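A retry mechanism along these lines could look like the sketch below. The field names `ip`, `port`, and `score` are assumptions, since the exact shape of the rated IP list is not shown here, and `sortByScore`, `fetchWithRetry`, and `curlFetch` are hypothetical helper names.

```php
<?php
// Sort proxies so the highest availability score comes first.
// Each proxy is assumed to be ['ip' => ..., 'port' => ..., 'score' => ...].
function sortByScore(array $proxies): array
{
    usort($proxies, fn(array $a, array $b) => $b['score'] <=> $a['score']);
    return $proxies;
}

// Try proxies in score order until $fetch returns something other than false.
// $fetch is any callable taking ($url, $proxy) and returning the body or false.
function fetchWithRetry(string $url, array $proxies, callable $fetch, int $maxAttempts = 3)
{
    $attempts = 0;
    foreach (sortByScore($proxies) as $proxy) {
        if ($attempts++ >= $maxAttempts) {
            break;
        }
        try {
            $body = $fetch($url, $proxy);
            if ($body !== false) {
                return $body;
            }
        } catch (Throwable $e) {
            // Treat an exception like a failed request and move to the next IP.
        }
    }
    return false;
}

// One possible $fetch implementation built on cURL.
function curlFetch(string $url, array $proxy)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_PROXY => $proxy['ip'],
        CURLOPT_PROXYPORT => (int) $proxy['port'],
        CURLOPT_TIMEOUT => 30,
    ]);
    return curl_exec($ch); // body string on success, false on failure
}

// Usage sketch (not executed here):
// $html = fetchWithRetry('http://target-site.example', $proxies, 'curlFetch');
```

Passing the fetch step in as a callable keeps the retry logic independent of cURL, so it can be exercised with a fake fetcher before pointing it at real proxies.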

This article was originally published or compiled by ipipgo: https://www.ipipgo.com/en-us/ipdaili/34323.html
