Adding custom headers with curl: a crawling case study

A hands-on guide to crawling data with curl and custom headers without getting blocked

Recently a few readers asked me what to do when a site keeps blocking their IP while they crawl it with curl. Let's talk about that today, focusing on one effective trick: custom headers combined with proxy IPs, a combination I have personally tested and found to work.

First, a real case: a price-monitoring script for an e-commerce platform got banned in less than half an hour when it sent ordinary curl requests. After browser-like request headers were added and the script was paired with ipipgo's dynamic proxy pool, it ran for three days without trouble. Here's how to do it.

The right way to add headers with curl

Let's start with a typical failure:

curl https://target-site.example

With a bare request like this, the server can tell at a glance that a bot is at work. We need to put a disguise on curl:

curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
     -H "Accept-Language: zh-CN,zh;q=0.9" \
     -H "Referer: https://www.google.com/" \
     https://target-site.example

Note the three key Headers:

Header name        Purpose              Example value
User-Agent         Imitate a browser    Latest version of Chrome or Firefox
Accept-Language    Language setting     zh-CN first
Referer            Source page          Simulate arriving from a search engine
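
If you send these headers on every request, it can help to wrap them in a small shell function. The following is only a minimal sketch: the function name browser_curl is made up for illustration, and the header values are simply the ones from the example above.

```shell
#!/bin/sh
# browser_curl: hypothetical helper that adds the three browser-like
# headers from the table to every request. Extra arguments (the URL,
# more curl options) are passed through unchanged.
browser_curl() {
  # -sS hides the progress meter but still prints error messages
  curl -sS \
    -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
    -H "Accept-Language: zh-CN,zh;q=0.9" \
    -H "Referer: https://www.google.com/" \
    "$@"
}
```

Calling browser_curl https://example.com/ (a placeholder URL) then behaves like the longer command, and any extra curl option can be appended after the URL.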

The right way to use a proxy IP

Changing the headers alone is not enough; you also need a proxy IP to stay fully hidden. Here we recommend ipipgo's service, which offers a dedicated anti-blocking package. Usage looks like this:

curl -x http://USERNAME:PASSWORD@proxy.ipipgo.com:PORT \
     -H "User-Agent: Mozilla/5.0..." \
     https://target-site.example

Watch out for these two pitfalls:

  1. Don't use free proxies: 99% of them come from public IP pools that websites blacklisted long ago
  2. Residential proxies are harder to detect than data-center proxies, and ipipgo's Dynamic Residential IP package has a higher success rate
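
To keep the account details out of the script itself, one option is to read them from environment variables. A rough sketch, assuming the variable names PROXY_USER, PROXY_PASS, PROXY_HOST and PROXY_PORT (all hypothetical, pick your own) and using curl's -U option to pass the proxy credentials separately from the proxy URL:

```shell
#!/bin/sh
# proxied_curl: hypothetical helper that sends a request through an
# authenticated HTTP proxy. The variable names are assumptions made
# for this sketch, not something the proxy provider defines.
proxied_curl() {
  # Abort with a message if a required variable is missing
  : "${PROXY_USER:?set PROXY_USER}" \
    "${PROXY_PASS:?set PROXY_PASS}" \
    "${PROXY_PORT:?set PROXY_PORT}"
  # -x sets the proxy address, -U supplies its user name and password
  curl -sS \
    -x "http://${PROXY_HOST:-proxy.ipipgo.com}:${PROXY_PORT}" \
    -U "${PROXY_USER}:${PROXY_PASS}" \
    -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
    "$@"
}
```

After exporting the variables once in your shell, proxied_curl https://example.com/ (placeholder URL) goes through the proxy without the password appearing in the script.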

A practical guide to avoiding pitfalls

The strangest ban I've ever encountered: a site that actually checked font rendering parameters in cookies! Here are a few sneaky tricks to share:

  • Regularly change the value of the Accept-Encoding header
  • Randomly insert meaningless but legal fields into the request header, such as X-Requested-With: XMLHttpRequest
  • Use ipipgo's session hold feature to maintain a reasonable access frequency for the same IP
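
The first two tricks can be sketched in plain shell. Note the assumptions: the function names are invented for this example, the three Accept-Encoding values are just plausible choices, and the rotation is a simple counter-based cycle rather than true randomness.

```shell
#!/bin/sh
# pick_encoding N: return an Accept-Encoding value for request number N,
# cycling through three illustrative values.
pick_encoding() {
  case $(( $1 % 3 )) in
    0) echo "gzip, deflate, br" ;;
    1) echo "gzip, deflate" ;;
    2) echo "gzip" ;;
  esac
}

# rotating_curl N URL...: hypothetical wrapper that varies the headers
# with the request number. Even-numbered requests also carry the extra
# X-Requested-With field.
rotating_curl() {
  n=$1
  shift
  enc=$(pick_encoding "$n")
  if [ $(( n % 2 )) -eq 0 ]; then
    # --compressed asks curl to decode the compressed response
    curl -sS --compressed \
      -H "Accept-Encoding: $enc" \
      -H "X-Requested-With: XMLHttpRequest" \
      "$@"
  else
    curl -sS --compressed -H "Accept-Encoding: $enc" "$@"
  fi
}
```

In a crawl loop you would pass an increasing counter as the first argument and sleep between requests so the access frequency stays reasonable.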

Frequently asked questions

Q: What should I do if I still get blocked after adding headers?
A: Check whether the Cache-Control field is missing. Adding Cache-Control: max-age=0 is recommended to simulate browser behavior.

Q: How do I deal with slow proxy IPs?
A: ipipgo's Intelligent Routing feature automatically selects the fastest node. You can also add -m 30 to set a 30-second timeout.

Q: What if I need to handle cookies?
A: First use curl's -c cookie.txt option to save the cookies, then send them on subsequent requests with -b cookie.txt
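
The three answers above can be combined in one small wrapper. This is a sketch only: the function name faq_curl and the COOKIE_JAR variable are made up here, and pointing -b and -c at the same file is just one convenient way to keep a session going.

```shell
#!/bin/sh
# File used to store cookies between requests (defaults to cookie.txt)
COOKIE_JAR=${COOKIE_JAR:-cookie.txt}

# faq_curl: hypothetical wrapper combining the FAQ answers.
faq_curl() {
  # -m 30                  give up if the whole request takes over 30 seconds
  # Cache-Control header   mimics what a browser sends on a reload
  # -b / -c                send cookies from the jar, then save new ones to it
  curl -sS \
    -m 30 \
    -H "Cache-Control: max-age=0" \
    -b "$COOKIE_JAR" \
    -c "$COOKIE_JAR" \
    "$@"
}
```

The first call creates the cookie file, and later calls with the same COOKIE_JAR reuse whatever the site set earlier.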

The ultimate fail-safe template

Finally, here is a general-purpose template. Remember to replace the credentials with those of your own ipipgo account:

curl -x http://vipuser:123456@proxy.ipipgo.com:8899 \
     -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
     -H "Accept: text/html,application/xhtml+xml" \
     -H "Accept-Encoding: gzip, deflate, br" \
     --compressed \
     https://target-site.example

This template has three key design points:

  1. It uses ipipgo's Enterprise Proxy Channel
  2. It emulates full browser features
  3. It enables compressed transfer to save traffic
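
For a monitoring script, the template can be turned into a function that saves the page to a file and reports failure through its exit status. Treat this as a sketch under assumptions: fetch_page and the PROXY_USER, PROXY_PASS and PROXY_PORT variables are names invented for this example, and --fail is added so that HTTP error responses (400 and above) make curl exit with a non-zero status.

```shell
#!/bin/sh
# fetch_page URL OUTFILE: hypothetical wrapper around the template.
# Credentials come from environment variables instead of being
# hard-coded in the command.
fetch_page() {
  url=$1
  outfile=$2
  : "${PROXY_USER:?set PROXY_USER}" \
    "${PROXY_PASS:?set PROXY_PASS}" \
    "${PROXY_PORT:?set PROXY_PORT}"
  # --fail: exit non-zero on HTTP errors; -o: write the body to a file
  curl -sS --fail \
    -x "http://${PROXY_USER}:${PROXY_PASS}@proxy.ipipgo.com:${PROXY_PORT}" \
    -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
    -H "Accept: text/html,application/xhtml+xml" \
    -H "Accept-Encoding: gzip, deflate, br" \
    --compressed \
    -o "$outfile" \
    "$url"
}
```

A caller can then write fetch_page https://example.com/ page.html || echo "request failed, possibly blocked" >&2 and decide whether to back off or switch IPs.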

If you run into a particularly difficult website, you can contact ipipgo technical support for a dedicated, customized anti-scraping solution. Their engineers have dealt with all sorts of nasty anti-scraping tactics and can handle things like TLS fingerprint verification and browser fingerprint detection.

This article was originally published or compiled by ipipgo: https://www.ipipgo.com/en-us/ipdaili/31436.html
