
How to add request headers when using cURL through a proxy
Anyone who writes crawlers knows that some sites are especially strict: going through a proxy is not enough, and we also have to work on the request headers to get through. Today we'll use cURL to show how to customize request headers in proxied requests.
Basic Proxy Configuration
First, the simplest possible proxy setup, using our ipipgo proxy as an example:
curl -x http://user:pass@proxy.ipipgo.com:8000 https://target-site.com
Pay attention to the format that follows the -x parameter, and don't mistype the colons. If you are using a SOCKS5 proxy, replace http with socks5; the port number depends on the information given in your specific package.
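As a rough sketch of the note above, the snippet below builds the proxy URL from its parts so that switching between http and socks5 is a one-word change. The helper names proxy_url and fetch_with_scheme, the variables, and the port value are illustrative assumptions; use the host and port from your own package. Credentials are left out here to keep the example short.

```shell
# Hypothetical proxy endpoint; take the real host and port from your package details.
PROXY_HOST="proxy.ipipgo.com"
PROXY_PORT="8000"

# Build "<scheme>://<host>:<port>" for curl's -x option.
proxy_url() {
    printf '%s://%s:%s' "$1" "$PROXY_HOST" "$PROXY_PORT"
}

# Run a request through the proxy using the given scheme (http or socks5).
# Any remaining arguments are passed straight to curl.
fetch_with_scheme() {
    scheme=$1
    shift
    curl -x "$(proxy_url "$scheme")" "$@"
}
```

For example, `fetch_with_scheme socks5 https://target-site.com` sends the same request as the command above, but over SOCKS5.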
Practical tips for disguising request headers
Some websites check the User-Agent header, so the request needs to look like it comes from a normal browser. Try this configuration:
curl -x http://proxy.ipipgo.com:8000 \
  -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
  -H "Accept-Language: zh-CN,zh;q=0.9" \
  https://target-site.com
The key is the -H parameter, which can be repeated as many times as needed. I usually save common headers into a config file and load it with the --config parameter, which saves typing them out by hand every time.
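A minimal sketch of that config-file habit follows. The file location, the CURL_CONFIG variable, and the two helper function names are assumptions made for illustration; the file itself uses curl's config syntax, where each line is a long option name without the leading dashes.

```shell
# Hypothetical location for the shared options file.
CURL_CONFIG="${CURL_CONFIG:-$HOME/.curl-proxy.conf}"

# Write the proxy and the common headers once.
# Each "header" line is equivalent to one -H on the command line.
write_curl_config() {
    cat > "$CURL_CONFIG" <<'EOF'
proxy = "http://proxy.ipipgo.com:8000"
header = "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
header = "Accept-Language: zh-CN,zh;q=0.9"
EOF
}

# Reuse the saved options for any request; extra arguments go to curl as-is.
fetch_with_config() {
    curl --config "$CURL_CONFIG" "$@"
}
```

After running `write_curl_config` once, `fetch_with_config https://target-site.com` should behave like the longer command with two -H options.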
Don't put credentials on the command line
A common beginner mistake is to write the account and password directly on the command line, which is both insecure and hard to maintain. It is better to manage them in a .netrc file:
Create a .netrc file in your home directory:
machine proxy.ipipgo.com
login your_account
password your_password
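curl matches .netrc entries against the host name of the requested URL, so confirm with --verbose that the proxy login is actually sent, and fall back to --proxy-user if it is not. With the file in place, add the --netrc parameter and the command becomes much cleaner: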
curl --netrc -x http://proxy.ipipgo.com:8000 ...
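The file holds a password, so it should be readable only by its owner. Below is a small sketch of creating it that way; the NETRC_PATH variable, the helper names, and the demo credentials in the usage line are placeholders, and --netrc-file is used only so the example does not overwrite an existing ~/.netrc.

```shell
# Hypothetical path; with plain --netrc, curl reads ~/.netrc instead.
NETRC_PATH="${NETRC_PATH:-$HOME/.netrc-proxy}"

# Write one entry: write_netrc <machine> <login> <password>
write_netrc() {
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077
      printf 'machine %s\nlogin %s\npassword %s\n' "$1" "$2" "$3" > "$NETRC_PATH" )
    # Also tighten permissions in case the file already existed.
    chmod 600 "$NETRC_PATH"
}

# --netrc-file makes curl read the given file rather than ~/.netrc.
fetch_with_netrc() {
    curl --netrc-file "$NETRC_PATH" -x http://proxy.ipipgo.com:8000 "$@"
}
```

Usage would look like `write_netrc proxy.ipipgo.com your_account your_password` followed by `fetch_with_netrc https://target-site.com`.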
Set timeouts and retries
Network fluctuations are unavoidable when using proxies, and these parameters can save you at critical moments:
--connect-timeout 30   # connection timeout: 30 seconds
--max-time 120         # overall timeout: 2 minutes
--retry 3              # automatically retry up to 3 times on failure
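Put together, the three options might be wrapped like this. The function name fetch_with_retry and the PROXY_URL variable are assumptions for the sketch. Note that curl's --retry only kicks in for errors it considers transient, such as timeouts or HTTP 408, 429, 500, 502, 503 and 504 responses.

```shell
# Hypothetical proxy endpoint; replace with your own.
PROXY_URL="${PROXY_URL:-http://proxy.ipipgo.com:8000}"

# Fetch through the proxy with a 30 s connect timeout, a 120 s overall
# timeout, and up to 3 retries. --silent hides the progress meter while
# --show-error still prints failures.
fetch_with_retry() {
    curl -x "$PROXY_URL" \
        --connect-timeout 30 \
        --max-time 120 \
        --retry 3 \
        --silent --show-error \
        "$@"
}
```

Calling `fetch_with_retry https://target-site.com` then applies the same safety net to every request.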
Practical Q&A: common pitfalls
Q: The proxy is set up correctly, but the website still returns 403?
A: The request headers are probably giving you away. Try adding Referer and Cookie headers, and use the --verbose parameter to see the complete request process.
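A sketch of such a debugging request is shown below. The function name debug_403 and the PROXY_URL, REFERER and COOKIE variables, along with their default values, are made up for illustration; substitute values captured from a real browser session. In the --verbose output, lines starting with ">" are the headers curl sent and lines starting with "<" are the response headers.

```shell
# Send one request with extra browser-like headers and print the full
# exchange to stderr. The response body is discarded.
debug_403() {
    url=$1
    curl -x "${PROXY_URL:-http://proxy.ipipgo.com:8000}" \
        -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
        -H "Referer: ${REFERER:-https://target-site.com/}" \
        -H "Cookie: ${COOKIE:-session=example}" \
        --verbose --silent --output /dev/null \
        "$url"
}
```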
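Q: HTTPS requests always report certificate errors?
A: If the error comes from an HTTPS proxy's certificate, add --proxy-insecure to the command, or specify the CA certificate path with --proxy-cacert. These options apply to the TLS connection to the proxy itself; for the target site's certificate, the equivalents are --insecure and --cacert.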
Q: How do I batch test a proxy pool?
A: Write the proxy addresses into a txt file and loop over them, using the -K parameter to load the shared options for each call, and remember to combine this with randomized request headers.
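One way to sketch that loop, assuming a text file with one proxy URL per line: the function names, the three User-Agent strings, and the timeout values are all illustrative. To stay portable, the sketch rotates through the User-Agent list by line number rather than picking one at random.

```shell
# Return one of three sample User-Agent strings, selected by the number in $1.
pick_user_agent() {
    case $(( $1 % 3 )) in
        0) echo "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" ;;
        1) echo "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15" ;;
        2) echo "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36" ;;
    esac
}

# check_proxies <list-file> <test-url>
# Prints "OK   <proxy>" or "FAIL <proxy>" for every proxy in the file.
check_proxies() {
    list_file=$1
    test_url=$2
    n=0
    while IFS= read -r proxy || [ -n "$proxy" ]; do
        # Skip blank lines and comment lines.
        case $proxy in ''|'#'*) continue ;; esac
        n=$((n + 1))
        ua=$(pick_user_agent "$n")
        # </dev/null keeps curl from touching the list being read by the loop.
        if curl -x "$proxy" -H "User-Agent: $ua" \
                --connect-timeout 10 --max-time 30 \
                --silent --output /dev/null "$test_url" </dev/null; then
            echo "OK   $proxy"
        else
            echo "FAIL $proxy"
        fi
    done < "$list_file"
}
```

With the addresses saved in, say, proxies.txt, `check_proxies proxies.txt https://target-site.com` reports which entries currently respond.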
Why we recommend ipipgo proxies
This is the proxy service we use ourselves, so here are a few real advantages:
| Package Type | Applicable Scenarios | Price |
|---|---|---|
| Dynamic Residential (Standard) | Daily data collection | 7.67 Yuan/GB/month |
| Dynamic Residential (Business) | High-frequency access requirements | 9.47 Yuan/GB/month |
| Static Residential | Long-term fixed operations | 35 Yuan/IP/month |
A special shout-out goes to their TK line. API extraction is also convenient: you can curl their interface directly to get fresh proxies, which saves you from maintaining an IP pool yourself.
Lastly, I'd like to emphasize that proxy configuration is a matter of trial and error. Don't rush to change proxies when you run into strange problems. First use --trace-ascii to save the request logs and analyze them; very often the real cause is a parameter that was not set correctly. If you have any specific questions, feel free to ask, and we will give you straight answers.

