
The hidden role of headers in requests
Many people who use curl for data collection keep running into sites that block them. Besides changing the proxy IP, the request header settings are the real key to getting through. For example, some websites check whether your User-Agent looks like a browser; if you send curl's default headers, the request is flagged as automated almost immediately.
curl -x http://user:pass@proxy.ipipgo.cn:8080 \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0)..." \
  -H "Accept-Language: zh-CN" \
  "https://target-website.com"
Note that user:pass in the proxy address must be replaced with your own credentials, generated in the ipipgo dashboard. Their proxy servers support multiple authentication methods, which is particularly convenient for users who need to run requests in bulk.
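As a minimal sketch of the same request in script form, the function below passes the credentials through curl's --proxy-user option instead of embedding them in the proxy URL, so a password containing a character such as @ does not need URL-encoding. The variable names PROXY_HOST, PROXY_USER, PROXY_PASS and USER_AGENT are assumptions made for illustration; neither curl nor ipipgo defines them.

```shell
#!/bin/sh
# Sketch only. PROXY_HOST (host:port), PROXY_USER, PROXY_PASS and
# USER_AGENT are hypothetical variables you export before calling this.

fetch_via_proxy() {
    # $1 = URL to fetch through the proxy
    curl --silent --show-error \
        --proxy "http://${PROXY_HOST}" \
        --proxy-user "${PROXY_USER}:${PROXY_PASS}" \
        -H "User-Agent: ${USER_AGENT}" \
        -H "Accept-Language: zh-CN" \
        "$1"
}
```

After exporting the four variables, a call would look like fetch_via_proxy "https://target-website.com". Keeping the credentials in the environment also keeps them out of scripts you might share.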
Three layers of disguise to look like a real visitor
Simply changing the User-Agent is not enough; you need a full set of disguises. These are the three headers you must set:
| Header | Recommended value | Purpose |
|---|---|---|
| Accept-Encoding | gzip, deflate | Mimics a browser's compression support |
| Referer | An address on the same site | Makes the request look like it came from an on-site page |
| Connection | keep-alive | Keeps the connection open, so the traffic is less distinctive |
Remember to leave a random interval of 1-3 seconds between requests. ipipgo's proxy pool can switch the exit IP automatically, so combined with varied header values the anti-blocking effect is at its strongest.
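The pacing advice can be sketched as a small loop. This is one possible approach, not the author's script: random_delay, fetch_all and the REFERER variable are hypothetical names. The sketch uses curl's --compressed option, which sends an Accept-Encoding header and also decodes the compressed body; setting Accept-Encoding by hand with -H would leave the response compressed.

```shell
#!/bin/sh
# Sketch: fetch several URLs with a random 1-3 second pause after each.
# random_delay, fetch_all and REFERER are illustrative names only.

random_delay() {
    # Read 2 random bytes as an unsigned integer (0..65535), map to 1..3.
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % 3 + 1 ))
}

fetch_all() {
    # $1 = proxy URL, remaining arguments = URLs to fetch in order
    proxy=$1
    shift
    for url in "$@"; do
        curl --silent --show-error --compressed \
            -x "$proxy" \
            -H "Referer: ${REFERER}" \
            -H "Connection: keep-alive" \
            "$url"
        sleep "$(random_delay)"
    done
}
```

Whole-second delays keep the sketch portable, since fractional arguments to sleep are not supported everywhere.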
Tricks for tough sites in practice
Try this combo when you come across a particularly difficult site:
curl -x "http://<dynamic-auth>.rotating.ipipgo.net:9021" \
  -H "Cookie: <cookie copied from a real browser session>" \
  -H "X-Forwarded-For: <random public IP>" \
  --connect-timeout 10 \
  "https://<site-with-strict-anti-scraping>"
There are two key points here:
1. With ipipgo's dynamic authentication proxy, you do not need to assemble credentials yourself.
2. X-Forwarded-For should be set to a public IP address in the same region as the proxy IP.
Common failure scenarios: Q&A
Q: What should I do if I am still recognized even though I have set all the header information?
A: Start with the detection tools ipipgo offers and look at the real request headers; some sites require a specific header parameter.
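Independently of any vendor tool, curl's own verbose mode shows what actually left your machine. A minimal sketch follows; the function name show_sent_headers is hypothetical. In verbose output, curl prefixes the lines it sends with "> ", so filtering on that prefix leaves only the outgoing request line and headers.

```shell
#!/bin/sh
# Sketch: print only the request lines curl sent (the "> " lines of -v).
# show_sent_headers is an illustrative name, not a curl feature.

show_sent_headers() {
    # Arguments are passed straight to curl (URL, -x, -H, ...).
    curl --silent --verbose --output /dev/null "$@" 2>&1 \
        | tr -d '\r' \
        | sed -n 's/^> \(..*\)$/\1/p'
}
```

Calling it with the same -x and -H options as the failing request lets you compare the result line by line against the request headers shown in your browser's developer tools.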
Q: What if connections through the proxy IP often time out?
A: Set the --connect-timeout parameter to 15 seconds or more. ipipgo's enterprise lines are recommended; their BGP lines have a success rate of 99.2%.
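A longer connect timeout pairs naturally with a bounded number of retries, since curl treats a timeout as a transient error when --retry is set. The sketch below assumes illustrative values for everything except the 15-second connect timeout; fetch_with_retry is a hypothetical name.

```shell
#!/bin/sh
# Sketch: 15 s connect timeout plus a few retries on transient failures.
# All numbers other than 15 are assumptions; tune them for your target.

fetch_with_retry() {
    # $1 = proxy URL, $2 = target URL
    # --max-time caps each transfer; --retry-delay is the pause between tries.
    curl --silent --show-error \
        -x "$1" \
        --connect-timeout 15 \
        --max-time 60 \
        --retry 3 \
        --retry-delay 2 \
        "$2"
}
```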
Q: How do I handle sites that require cookies?
A: Specify the cookie file with curl's -b parameter, and make sure every request uses the same proxy IP; ipipgo's session hold function is made for exactly this.
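Note that -b only reads cookies; to save what the server sets, curl also needs -c pointing at a file to write. A minimal sketch that uses one file for both, assuming a hypothetical COOKIE_JAR variable and function name:

```shell
#!/bin/sh
# Sketch: reuse one cookie file across requests in a session.
# COOKIE_JAR and fetch_with_session are illustrative names; the default
# path is arbitrary.

COOKIE_JAR=${COOKIE_JAR:-./cookies.txt}

fetch_with_session() {
    # $1 = proxy URL (keep it the same for the whole session)
    # $2 = target URL
    curl --silent --show-error \
        -x "$1" \
        -b "$COOKIE_JAR" \
        -c "$COOKIE_JAR" \
        "$2"
}
```

Calling the function for the login page first and for the data pages afterwards carries the session cookies forward, provided the proxy URL stays the same between calls.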
Why ipipgo?
After testing a dozen or so vendors, I settled on ipipgo for three reasons:
1. Self-built domestic data centers, unlike vendors that resell second-hand overseas IPs
2. Support for proxy channels with custom headers, a feature that is really not available elsewhere
3. Customer service responds to tickets within seconds; the last time I hit a problem debugging a script in the middle of the night, it was solved in 5 minutes.
Finally, here is a complete configuration template. Save the following parameters as a config file and call it directly when needed:
# Save as curl_config.txt
user-agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36..."
referer = "https://www.google.com/"
proxy = "http://auto:<dynamic-key>@gateway.ipipgo.com:8899"
Then pass the file with the -K parameter:
curl -K curl_config.txt "<destination URL>"
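If the dynamic key changes often, the config file can be generated rather than edited by hand. The sketch below is one way to do that; write_curl_config and its argument order are assumptions. It restricts the file to the owner because the proxy URL contains a credential. Values containing double quotes or backslashes would need escaping, which this sketch does not handle.

```shell
#!/bin/sh
# Sketch: write a curl config file from shell arguments.
# write_curl_config is an illustrative helper, not part of curl.

write_curl_config() {
    # $1 = output path, $2 = proxy URL (with credentials), $3 = User-Agent
    cat > "$1" <<EOF
# Generated file; load it with: curl -K $1 <URL>
user-agent = "$3"
referer = "https://www.google.com/"
proxy = "$2"
connect-timeout = 15
EOF
    # The proxy URL holds a secret, so keep the file private.
    chmod 600 "$1"
}
```

Regenerating curl_config.txt with a fresh proxy URL before each run keeps the -K call from the template unchanged.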

