
Hands-On: Setting cURL's Accept Header When Using Proxy IPs
Anyone who does data crawling knows that some sites are very shrewd: a proxy IP alone is not enough, and the request headers have to look like they come from a real visitor. Today we'll walk through how to dress up a cURL request with a proper Accept header while routing it through ipipgo's proxy service.
Why bother with the Accept header?
Many websites now run "security gates" that inspect the identity each request presents. For example:
- A request carrying cURL's default Accept header (*/*) is easily flagged and blocked as a bot
- Mobile and desktop pages expect different Accept values; mix them up and you get caught
- Some API endpoints require a specific MIME type
Last year I helped a friend with an e-commerce price-comparison crawler. Because the Accept header was not set correctly, we were still blocked after switching through three proxy IPs, and only then realized the header information was the problem.
Hands-On Practice
First make sure cURL is installed on your computer; if not, get the latest version from the official website. ipipgo's proxy is used for the demo here; their dynamic IP pool is large enough that IPs are not banned easily.
Basic template (remember to substitute your own username and password, and replace the placeholder target URL)
curl -x http://username:password@gateway.ipipgo.com:9021 \
  -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9" \
  https://target-site.example
Mobile-specific version
curl -x http://username:password@gateway.ipipgo.com:9021 \
  -H "Accept: application/json, text/javascript, */*; q=0.01" \
  https://m.target-site.example
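If you run these templates often, it can help to wrap them in a small shell function so the account details stay out of the command line you type. The following is only a sketch under assumptions: the function name fetch_via_proxy and the variables PROXY_USER, PROXY_PASS, PROXY_PORT and DRY_RUN are made up for illustration and are not ipipgo settings; the gateway host and port are the ones from the templates above.

```shell
# Hypothetical helper: fetch a URL through the proxy with a chosen Accept header.
# PROXY_USER, PROXY_PASS, PROXY_PORT and DRY_RUN are illustrative names only.

PROXY_HOST="gateway.ipipgo.com"    # gateway host from the templates above
PROXY_PORT="${PROXY_PORT:-9021}"   # port from the templates above

fetch_via_proxy() {
  # $1: Accept header value, $2: target URL
  _accept=$1
  _url=$2
  # Rebuild the positional parameters as the curl argument list.
  set -- --silent --show-error \
    --proxy "http://${PROXY_HOST}:${PROXY_PORT}" \
    --proxy-user "${PROXY_USER:-username}:${PROXY_PASS:-password}" \
    --header "Accept: ${_accept}" \
    "$_url"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command, one argument per line, instead of running it.
    printf '%s\n' curl "$@"
  else
    curl "$@"
  fi
}
```

Setting DRY_RUN=1 before calling fetch_via_proxy 'application/json' 'https://example.com' prints the arguments that would be passed to curl, which is a cheap way to check the header before spending any proxy traffic.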
How do you choose Accept values without tripping up?
| Scenario | Accept value | Applicable websites |
|---|---|---|
| General web pages | text/html,application/xhtml+xml | Portal sites |
| API endpoints | application/json | Data interfaces |
| Image resources | image/webp,image/apng | Image gallery sites |
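The table can be turned into a small lookup so a script never has to retype the values. This is a minimal sketch: the function name accept_for and the scenario keywords page, api and image are invented for the example; the Accept values themselves are copied from the table.

```shell
# Hypothetical lookup: map a scenario keyword to the Accept value from the table.
accept_for() {
  case "$1" in
    page)  printf '%s\n' 'text/html,application/xhtml+xml' ;;
    api)   printf '%s\n' 'application/json' ;;
    image) printf '%s\n' 'image/webp,image/apng' ;;
    *)
      # Unknown keyword: report on stderr and signal failure to the caller.
      printf 'unknown scenario: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

A call such as curl -H "Accept: $(accept_for api)" ... then picks up the matching value, and a typo in the keyword fails loudly instead of silently sending the wrong header.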
Common Pitfalls: Q&A
Q: I set the Accept header, but the request is still detected. Why?
A: Most likely other headers are missing. Remember to set User-Agent, Referer, and the rest so that they all match.
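One way to keep a matching set of headers together is a curl config file, which curl can read with -K (--config), including from standard input when the file name is given as -. The sketch below is an assumption-laden example: the function name browser_profile is made up, and the User-Agent string is just a sample desktop Chrome value, not something the source prescribes; copy real values from your own browser where you can.

```shell
# Hypothetical helper: print a curl config with Accept, User-Agent and Referer
# that belong together. The User-Agent below is only an illustrative sample.
browser_profile() {
  # $1: the Referer to present, e.g. the page that links to the target
  cat <<EOF
header = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9"
user-agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
referer = "$1"
EOF
}
```

It could be used as browser_profile "https://example.com/" | curl -K - -x http://username:password@gateway.ipipgo.com:9021 https://example.com/page, where example.com stands in for the real target.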
Q: The ipipgo proxy suddenly can't connect?
A: First check whether your account is still valid: their plans are billed by the hour, and service stops automatically when the balance runs out. Then try an alternate port; 9021-9030 are all supported.
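Trying the alternate ports by hand gets tedious, so here is a rough sketch of how the check could be scripted. The port range 9021-9030 and the gateway host come from the text above; the function names, the PROXY_USER and PROXY_PASS variables, and the 10-second timeout are assumptions made for the example.

```shell
# Hypothetical helpers for walking the alternate ports 9021-9030.
candidate_proxies() {
  # Print one proxy URL per supported port.
  port=9021
  while [ "$port" -le 9030 ]; do
    printf 'http://gateway.ipipgo.com:%s\n' "$port"
    port=$((port + 1))
  done
}

first_working_proxy() {
  # $1: URL used as a connectivity probe.
  # Prints the first proxy URL through which the probe succeeds.
  for proxy in $(candidate_proxies); do
    # --fail makes curl exit non-zero on HTTP errors (status >= 400);
    # --max-time caps each attempt at 10 seconds.
    if curl --silent --fail --output /dev/null --max-time 10 \
         --proxy "$proxy" \
         --proxy-user "${PROXY_USER:-username}:${PROXY_PASS:-password}" \
         "$1"; then
      printf '%s\n' "$proxy"
      return 0
    fi
  done
  return 1
}
```

Calling first_working_proxy with a probe URL returns non-zero if none of the ten ports works, which usually points back to the account or balance rather than the port.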
Q: Do I need to change the Accept header frequently?
A: It depends on the target site's strategy; generally the same value works for all pages of the same type. If you are not sure, use your browser's developer tools to capture the headers of a real request and copy them.
Why recommend ipipgo?
After using their proxy for the past two years, three things stand out:
- The IP pool refreshes automatically every hour, unlike some providers that keep the same IPs for three days.
- Pay-per-use billing is supported, which is especially cost-effective for small-scale crawlers.
- Customer support responds quickly: the last time I ran into a CAPTCHA problem, the ticket was answered within seconds at 2:00 a.m.
New users currently also get a 5 GB traffic pack for signing up, enough for most of a month of testing.
One last thing: a proxy alone is not a cure-all. Only when details like the Accept header are handled well, together with a reliable proxy service, will the crawler run fast and stable. When you hit a strange problem, don't keep hammering away with the same setup; try a few more IPs. The ipipgo dashboard shows real-time connection status, which is really handy.

