
Let's talk about how to hook wget up to a proxy.
Lately, a lot of folks doing data collection have been asking how to use wget through a proxy that requires a username and password. It can be as simple or as fiddly as you make it. Let's break it into small pieces today, so you can pick it up in about three minutes.
Why proxy authentication is needed
For example, if you buy a proxy package from ipipgo, the address they give you looks like this: http://username:password@gateway.ipipgo.com:8080. The trick is getting the username and password into the request. If you just paste that address onto the command line as is, wget treats it as a URL to download rather than as a proxy setting, so you have to use specific parameters.
Setting the authentication parameters by hand
Here's the point! Remember these two parameters, which work as a pair:
wget --proxy-user=your_username \
     --proxy-password=your_password \
     http://target-url
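Note that these two parameters have to be used as a pair; neither can be left out. They only supply the credentials: the proxy address itself still comes from the http_proxy/https_proxy environment variables or from ~/.wgetrc. When using ipipgo's proxy, remember to replace the username and password with the authentication info they give you.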
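To see the address and the credentials working together in one command, here is a minimal sketch. The function name fetch_via_proxy and the variables PROXY_URL, PROXY_USER, and PROXY_PASS are made up for illustration; the -e option passes a .wgetrc-style setting on the command line.

```shell
# Sketch: download one URL through an authenticated proxy.
# PROXY_URL, PROXY_USER, and PROXY_PASS are hypothetical variable names
# that you would set yourself before calling the function.
fetch_via_proxy() {
    local url="$1"
    # -e passes a .wgetrc-style setting; the two --proxy-* flags carry the login
    wget -e use_proxy=yes \
         -e "http_proxy=${PROXY_URL}" \
         -e "https_proxy=${PROXY_URL}" \
         --proxy-user="${PROXY_USER}" \
         --proxy-password="${PROXY_PASS}" \
         "$url"
}
```

With the three variables set (for example PROXY_URL=http://gateway.ipipgo.com:8080), calling fetch_via_proxy http://httpbin.org/ip should print the proxy's IP rather than your own.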
Don't panic when errors show up: a veteran's guide to defusing them
Here's a common pitfall for newbies:
Error code 407: Proxy authentication required
This means that the proxy server did not receive valid authentication information. Check three things first:
1. Are the username and password swapped?
2. Are the two hyphens (--) missing in front of either parameter?
3. Is the port number of the proxy address correct?
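If you would rather let a script spot the 407 for you, something along these lines could work. It is only a sketch: diagnose_proxy_log is a made-up helper, and it assumes wget's English log wording ("awaiting response... 407"), so run wget with LC_ALL=C when feeding it.

```shell
# Sketch: read wget's log text on stdin and flag a 407 from the proxy.
# Assumes English log messages (run wget with LC_ALL=C).
# Example use:
#   LC_ALL=C wget -O /dev/null http://httpbin.org/ip 2>&1 | diagnose_proxy_log
diagnose_proxy_log() {
    local log
    log=$(cat)   # slurp the whole log so it can be searched
    if printf '%s\n' "$log" | grep -Eq 'awaiting response\.\.\. 407'; then
        echo "407: proxy did not accept the credentials"
        echo "check: username/password order, the leading -- on both flags, the proxy port"
        return 1
    fi
    echo "no 407 in log"
}
```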
The lazy way: a config file
If you use the proxy every day, it's too much of a hassle to type the parameters each time. Add these lines to the ~/.wgetrc file:
use_proxy = on
http_proxy = http://username:password@gateway.ipipgo.com:port
https_proxy = http://username:password@gateway.ipipgo.com:port
After doing this, every time you use wget, it automatically goes through the proxy. ipipgo users should note that the proxy address should be the dedicated one shown in their console.
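Since this file now holds a password, it is worth keeping it readable only by you. Below is a small sketch that generates such a file. The helper name write_wgetrc is made up, and instead of embedding the login in the URL it uses wget's proxy_user and proxy_password settings, which is a swap on my part, not something ipipgo prescribes.

```shell
# Sketch: write a wget config file that only the owner can read.
# Arguments: <file> <proxy-url> <username> <password>
write_wgetrc() {
    local rc_file="$1" proxy_url="$2" user="$3" pass="$4"
    : > "$rc_file"           # create or truncate the file
    chmod 600 "$rc_file"     # lock it down before any secret is written
    printf 'use_proxy = on\nhttp_proxy = %s\nhttps_proxy = %s\nproxy_user = %s\nproxy_password = %s\n' \
        "$proxy_url" "$proxy_url" "$user" "$pass" >> "$rc_file"
}
```

To try it without touching your real ~/.wgetrc, write to another path and point wget at it through the WGETRC environment variable, e.g. WGETRC=/tmp/test.wgetrc wget -O- http://httpbin.org/ip.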
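Q&A time: frequently asked questions in one place
Q: What should I do if there are special characters in my password?
A: Wrap it in quotes, e.g. --proxy-password="Abc123". Single quotes are the safer choice in the shell, because they also stop characters such as $ and ! from being interpreted.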
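Quoting takes care of the command line, but when the password sits inside a proxy URL (the username:password@host form), characters like @, : and / also need percent-encoding or the URL gets split in the wrong place. Here is a rough sketch of an encoder; urlencode is a made-up helper, and it assumes the password is plain ASCII.

```shell
# Sketch: percent-encode a string for use inside a URL.
# Assumes ASCII input; letters, digits and . _ ~ - pass through unchanged.
urlencode() {
    local s="$1" out="" c rest
    while [ -n "$s" ]; do
        rest="${s#?}"          # everything after the first character
        c="${s%"$rest"}"       # the first character itself
        case "$c" in
            [A-Za-z0-9._~-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;   # e.g. @ becomes %40
        esac
        s="$rest"
    done
    printf '%s\n' "$out"
}
```

For example, urlencode 'p@ss:w/rd#1' gives p%40ss%3Aw%2Frd%231, which can then go into the http_proxy address.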
Q: How do I switch between multiple proxies?
A: To skip the proxy for a single command, add the --no-proxy parameter on the command line; to switch to a different proxy, edit the .wgetrc file.
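Another way to switch per command, without editing any file, is to set the http_proxy and https_proxy environment variables just for that one run. A minimal sketch, where with_proxy is a made-up helper name and the gateway addresses in the comment are placeholders:

```shell
# Sketch: run one command with a specific proxy, leaving the shell's
# own environment untouched.
# Example use (placeholder addresses):
#   with_proxy http://gateway-a.example:8080 wget -O- http://httpbin.org/ip
#   with_proxy http://gateway-b.example:8080 wget -O- http://httpbin.org/ip
with_proxy() {
    local proxy_url="$1"
    shift
    # env sets the variables only for the command that follows
    env "http_proxy=$proxy_url" "https_proxy=$proxy_url" "$@"
}
```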
Q: How do I test whether the proxy is taking effect?
A: First run wget -O- http://httpbin.org/ip and see whether the returned IP is the proxy's IP.
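If you don't know the proxy's IP by heart, you can compare the answer you get with and without the proxy. This sketch assumes the proxy is already configured (environment or ~/.wgetrc); check_proxy_ip is a made-up helper name.

```shell
# Sketch: compare the IP seen by httpbin with and without the proxy.
# Returns 0 if they differ, 1 if they match, 2 if a request fails.
check_proxy_ip() {
    local direct proxied
    direct=$(wget -qO- --no-proxy http://httpbin.org/ip) || return 2
    proxied=$(wget -qO- http://httpbin.org/ip) || return 2
    if [ "$direct" != "$proxied" ]; then
        echo "proxy in use"
    else
        echo "proxy not in use"
        return 1
    fi
}
```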
Why I recommend ipipgo's proxy service
After trying seven or eight proxy services, I settled on ipipgo for good reasons:
1. Flexible authentication: it supports both username/password and IP whitelist modes
2. Proxy node availability of 99%, with automatic switching when a node goes offline
3. Optimized for data collection scenarios, with a generous concurrency allowance
Especially if you are running a long-term crawler, their long-lasting static proxies package is a real lifesaver: one proxy can last half a month without being changed.
The final reminder: safe practices to remember
A final rant:
- Don't write passwords in plaintext in scripts.
- Test the waters with a small file first.
- For a 403 error, first check the target site's anti-crawling measures.
- The ipipgo backend can see real-time usage, remember to check your bill regularly!
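On the first reminder, one way to keep the password out of the script is to read it from a separate file that only you can read. A sketch under assumptions: load_proxy_credentials, PROXY_USER, PROXY_PASS, and the file layout (username on line one, password on line two) are all made up for illustration.

```shell
# Sketch: load proxy credentials from a private file instead of
# hard-coding them. Expected layout: line 1 = username, line 2 = password.
load_proxy_credentials() {
    local cred_file="$1"
    if [ ! -r "$cred_file" ]; then
        echo "cannot read $cred_file" >&2
        return 1
    fi
    {
        IFS= read -r PROXY_USER &&
        { IFS= read -r PROXY_PASS || [ -n "$PROXY_PASS" ]; }   # tolerate a missing final newline
    } < "$cred_file"
}
```

After load_proxy_credentials ~/.proxy_credentials (a placeholder path, ideally chmod 600), the script can pass --proxy-user="$PROXY_USER" --proxy-password="$PROXY_PASS" to wget.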
If you've done all this, you're a wget proxy master. If you have any new questions, head over to the ipipgo official website and chat with their customer service; their technical answers can be more detailed than mine.

