
Hands-on: disguising cURL requests as browser traffic
Anyone who writes crawlers has run into this: the code is clearly fine, yet the target site suddenly blocks your IP. That is when you bring out the two standard tools: proxy IPs and request header masquerading. cURL is one of the most widely used tools for the job, so today we will use it as the example and walk through both tricks.
Why bother with proxy IPs?
Here is an analogy. Suppose you go to the neighborhood supermarket every day to buy eggs and wear red clothes three days in a row. On the fourth day the owner simply says, "No sales to the one in red!" A proxy IP is like wearing a different color every day, so the owner cannot tell it is the same person.
Using ipipgo's proxy service is like having an entire closet of clothes to change into at will. Their dynamic IP pool is deep enough to give you a fresh outfit, that is, a new IP, with every request, which is much more stable than small-shop proxies.
Basic cURL camouflage
Let's start by looking at a bare-bones piece of code:
curl https://example.com
A request like this is like walking down the street with no clothes on: the server recognizes it as machine access at a glance. We need to dress it up:
curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36" \
     -H "Accept-Language: zh-CN,zh;q=0.9" \
     -H "Referer: https://www.google.com/" \
     https://example.com
These request headers act like ID information, disguising the crawler as an ordinary Internet user. Note: for User-Agent, go with a common, current browser version rather than an outdated antique.
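To avoid retyping the headers on every call, they can be wrapped in a small shell function. This is only a sketch: `browser_curl` is a hypothetical helper name, and the header values are simply the ones from the command above.

```shell
# browser_curl [curl args...] URL
# Hypothetical wrapper (not part of cURL) that sends the same browser-like
# headers on every request. Header values are copied from the example above.
# Any extra arguments (URL, -o, -x, ...) are passed straight through to curl.
browser_curl() {
  curl --silent --show-error \
    -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36" \
    -H "Accept-Language: zh-CN,zh;q=0.9" \
    -H "Referer: https://www.google.com/" \
    "$@"
}
```

Call it as `browser_curl https://example.com`. Adding `-v -o /dev/null` prints the outgoing request headers (the lines starting with `>`), which is a quick way to confirm what the server actually sees.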
Putting a proxy vest on cURL
It's not enough to disguise it, you have to hide the real IP. Using ipipgo's proxy service is like getting a middleman to run your errands for you:
curl -x "http://username:password@proxy.ipipgo.cc:8080" \
     -H "User-Agent: Mozilla/5.0..." \
     https://target-site.com
Three things to note here:
- Don't mistype the proxy address; ipipgo's user dashboard has a ready-made generator for it.
- Replace the username and password with your own; don't copy the ones in the example.
- Test that the proxy works first: use curl to visit ip.ipipgo.com through the proxy and check that the returned IP is the proxy's, not your own.
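The connectivity test in the last point can be sketched as a small function that requests the same IP-echo page with and without the proxy and compares the answers. `check_proxy` is a hypothetical name, and the sketch assumes that ip.ipipgo.com answers with the caller's IP in the response body.

```shell
# check_proxy PROXY_URL [ECHO_URL]
# Hypothetical helper: prints the response seen without and with the proxy.
# Returns non-zero if either request fails or both responses are identical,
# which would mean the proxy is not actually in effect.
# ECHO_URL defaults to ip.ipipgo.com (assumed to echo the caller's IP).
check_proxy() {
  proxy_url=$1
  echo_url=${2:-ip.ipipgo.com}
  direct_ip=$(curl -s --max-time 10 "$echo_url") || return 1
  proxied_ip=$(curl -s --max-time 10 -x "$proxy_url" "$echo_url") || return 1
  printf 'direct:  %s\nproxied: %s\n' "$direct_ip" "$proxied_ip"
  [ -n "$proxied_ip" ] && [ "$direct_ip" != "$proxied_ip" ]
}
```

For example, `check_proxy "http://username:password@proxy.ipipgo.cc:8080" && echo "proxy OK"` only prints "proxy OK" when the proxied request returns something different from the direct one.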
Advanced camouflage techniques
Some sites are craftier and check more parameters. That is when you need a more thorough disguise:
| Request header | Example value | Notes |
|---|---|---|
| Accept-Encoding | gzip, deflate, br | Only list compression methods your client can actually decode |
| Connection | keep-alive | Mimics a browser's persistent connection |
| Sec-Fetch-* | Set according to the scenario | Metadata that newer browsers add automatically |
The code looks like this when fully armed:
curl -x http://ipipgo_proxy \
     -H "User-Agent: Mozilla/5.0..." \
     -H "Accept: text/html,application/xhtml+xml..." \
     -H "Accept-Encoding: gzip, deflate, br" \
     -H "Connection: keep-alive" \
     --compressed \
     https://target-site.com
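The command above does not set the Sec-Fetch headers from the table. As a rough sketch, the values below are the ones a browser typically sends for a top-level page navigation started by the user; `fetch_page` is a hypothetical helper name. The function relies on `--compressed` alone, which makes curl request the encodings its own build can decode, and prints the HTTP status so a block (for example 403 or 429) is easy to spot.

```shell
# fetch_page URL OUTFILE [extra curl args...]
# Hypothetical helper: saves the body to OUTFILE and prints the HTTP status.
# Sec-Fetch-Site "none" means a user-initiated navigation (typed URL or
# bookmark); if you also send a Referer from another site, "cross-site"
# is the self-consistent value instead.
fetch_page() {
  url=$1
  outfile=$2
  shift 2
  curl -s --compressed -o "$outfile" -w '%{http_code}\n' \
    -H "Sec-Fetch-Dest: document" \
    -H "Sec-Fetch-Mode: navigate" \
    -H "Sec-Fetch-Site: none" \
    -H "Sec-Fetch-User: ?1" \
    "$@" "$url"
}
```

Proxy and User-Agent options are passed through unchanged, for example `fetch_page https://target-site.com page.html -x http://ipipgo_proxy -H "User-Agent: Mozilla/5.0..."`.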
Frequently Asked Questions
Q: I used a proxy but still got blocked. Why?
A: Check two things: 1. whether the request headers are complete; 2. the quality of the proxy IP. We recommend ipipgo's high-quality dynamic proxies. Their IPs are short-lived but high quality, which suits high-frequency requests.
Q: What should I do if my proxy is slow?
A: Prioritize nodes that are geographically close. ipipgo's smart routing feature automatically matches the fastest routes, which saves you a lot of work compared to switching manually.
Q: What if I need a multi-region IP?
A: Just add a region parameter to the proxy address generated in the ipipgo dashboard, such as &region=shanghai to specify the Shanghai node, or &city=random to switch cities randomly.
Pitfalls to avoid
A common mistake newcomers make is overdoing the disguise, for example stuffing the request with all kinds of header parameters, which ends up exposing anomalies instead of hiding them. Remember three principles:
- Parameter values should be logical (e.g., a mobile User-Agent should not carry Windows system information).
- The headers should be self-consistent (e.g., Accept and Content-Type should match).
- Keep parameters up to date (e.g., update the browser version number every quarter).
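The first principle can be checked mechanically before a request is sent. The function below is a deliberately simple heuristic under assumed rules, not a complete validator: `ua_is_consistent` is a hypothetical name, and it only rejects a User-Agent that combines a Windows desktop token with a mobile token.

```shell
# ua_is_consistent UA_STRING
# Hypothetical heuristic: returns 1 if the User-Agent mixes "Windows NT"
# (desktop) with Android/iPhone/Mobile tokens, otherwise returns 0.
ua_is_consistent() {
  case "$1" in
    *"Windows NT"*)
      case "$1" in
        *Android*|*iPhone*|*Mobile*) return 1 ;;
      esac
      ;;
  esac
  return 0
}
```

A script could call `ua_is_consistent "$UA" || echo "suspicious User-Agent" >&2` before issuing the curl command, where `UA` is whatever variable holds the header value.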
Lastly, a word about ipipgo's Browser Fingerprint Emulation feature: it automatically generates a matching set of request header parameters, which is much more convenient than configuring them by hand. For long-term collection projects in particular, consider going straight to the enterprise plan, which bundles automatic IP rotation and request header replacement.

