
A hands-on guide to setting up proxies in Puppeteer
Anyone who has done web scraping knows that many sites now run anti-scraping mechanisms. You need a proxy IP to mask your real address, otherwise you will be blocked within minutes. Today we take Puppeteer, one of the most popular tools in the Node.js world, as an example and show how to configure a proxy correctly.
Why does Puppeteer need a proxy?
Think of it this way: you send a courier (Puppeteer) to make a delivery (visit a website). If the same courier shows up every time, the pickup station (the target website) will get suspicious. That is when you need ipipgo's courier vest: change into different clothes (a different IP address) for each delivery.
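const puppeteer = require('puppeteer');

async function run() {
  // Placeholder host and port: substitute your ipipgo proxy endpoint.
  // Chromium ignores credentials embedded in --proxy-server,
  // so pass the username and password with page.authenticate() instead.
  const browser = await puppeteer.launch({
    args: ['--proxy-server=http://ipipgo-proxy-server:port'],
  });
  // Normal operation later...
  await browser.close();
}

run();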
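A typo in the `--proxy-server` flag is easy to miss, so it can help to assemble the flag from separate parts. The following is a minimal sketch, not an official ipipgo or Puppeteer helper: `buildLaunchOptions` and `launchThroughProxy` are hypothetical names, and the host and port are placeholders.

```javascript
// Build Puppeteer launch options from a proxy described as separate parts.
// Hypothetical helper: the name and the { protocol, host, port } shape are
// assumptions made for this sketch.
function buildLaunchOptions({ protocol = 'http', host, port }) {
  if (!host || !Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid proxy endpoint: ${host}:${port}`);
  }
  return {
    // The flag takes scheme://host:port, with no username or password.
    args: [`--proxy-server=${protocol}://${host}:${port}`],
  };
}

async function launchThroughProxy(proxy) {
  // Required lazily so buildLaunchOptions can be used without a browser install.
  const puppeteer = require('puppeteer');
  return puppeteer.launch(buildLaunchOptions(proxy));
}
```

Calling `launchThroughProxy({ host: '203.0.113.10', port: 8080 })` would start a browser whose traffic goes through that (placeholder) endpoint.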
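The right way to handle proxy authentication
Many newcomers get stuck at the proxy authentication step. Here is a lesser-known trick: use the page.authenticate() method. Chromium does not read a username and password embedded in the --proxy-server URL, so this is the reliable way to pass credentials, and it keeps the password out of the launch arguments. It is especially recommended when using ipipgo's private proxies: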
const page = await browser.newPage();
// Call authenticate() before the first page.goto()
await page.authenticate({
  username: 'Account name given to you by ipipgo',
  password: 'Exclusive password',
});
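If your provider hands you a single URL of the form `http://user:pass@host:port`, one option is to split it into the part the launch flag accepts and the part that goes to `page.authenticate()`. This is a sketch under that assumption; `splitProxyUrl` and `openAuthenticatedPage` are hypothetical helper names, and the URL shown in the usage note is a placeholder.

```javascript
// Split "http://user:pass@host:port" into the server string for
// --proxy-server and the credentials for page.authenticate().
function splitProxyUrl(proxyUrl) {
  const url = new URL(proxyUrl);
  return {
    server: `${url.protocol}//${url.host}`,
    credentials: url.username
      ? {
          // URL keeps these percent-encoded, so decode them first.
          username: decodeURIComponent(url.username),
          password: decodeURIComponent(url.password),
        }
      : null,
  };
}

async function openAuthenticatedPage(proxyUrl) {
  const puppeteer = require('puppeteer'); // loaded lazily
  const { server, credentials } = splitProxyUrl(proxyUrl);
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${server}`],
  });
  const page = await browser.newPage();
  if (credentials) {
    // Must run before the first navigation on this page.
    await page.authenticate(credentials);
  }
  return { browser, page };
}
```

For example, `splitProxyUrl('http://user:p%40ss@proxy.example.com:8080')` yields the server `http://proxy.example.com:8080` and the decoded password `p@ss`.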
How do you work with dynamic IPs?
Static IPs get blocked easily, so ipipgo's dynamic residential IP pool is recommended. Their API returns a fresh IP in real time. Here is a sample template:
const { getProxy } = require('ipipgo-sdk'); // hypothetical SDK, for illustration only
const currentProxy = await getProxy({
  type: 'https',
  country: 'us',
});
// Fill currentProxy into the proxy configuration...
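Once a batch of proxies has been fetched, rotating through them can be as simple as a round-robin counter. The sketch below assumes each proxy is an object with `ip` and `port` fields; that shape, and the names `createProxyRotator` and `launchWithNextProxy`, are assumptions rather than anything defined by ipipgo.

```javascript
// Round-robin rotation over proxies that were already fetched.
// Assumes each entry looks like { ip, port }; adapt to the real API response.
function createProxyRotator(proxies) {
  if (!Array.isArray(proxies) || proxies.length === 0) {
    throw new Error('Need at least one proxy to rotate through');
  }
  let index = 0;
  return function nextProxyArg() {
    const { ip, port } = proxies[index];
    index = (index + 1) % proxies.length; // wrap around at the end
    return `--proxy-server=http://${ip}:${port}`;
  };
}

async function launchWithNextProxy(nextProxyArg) {
  const puppeteer = require('puppeteer'); // loaded lazily
  // The --proxy-server flag is read at launch, so rotating this way
  // means starting a new browser for each new IP.
  return puppeteer.launch({ args: [nextProxyArg()] });
}
```

Each call to the returned function hands out the next proxy in the list and starts over after the last one.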
| Problem | Fix |
|---|---|
| The proxy can't connect | Check that the proxy address is in ip:port format |
| Slow page loads | Switch to a different ipipgo data center node |
| CAPTCHA appears | Enable the dynamic residential IP service |
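The first row of the table (checking the ip:port format) is easy to automate before a proxy is ever handed to Puppeteer. Here is a minimal sketch that covers only IPv4 addresses; `isValidProxyAddress` is a hypothetical helper name.

```javascript
// Return true only for strings shaped like "a.b.c.d:port" with each
// octet in 0-255 and the port in 1-65535. IPv4 only, no scheme prefix.
function isValidProxyAddress(address) {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}):(\d{1,5})$/.exec(address);
  if (!match) return false;
  const octets = match.slice(1, 5).map(Number);
  const port = Number(match[5]);
  return octets.every((octet) => octet <= 255) && port >= 1 && port <= 65535;
}
```

A string such as `203.0.113.7:8080` passes, while a missing port, an octet above 255, or a leading `http://` is rejected.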
Common pitfalls Q&A
Q: The proxy is set up successfully but doesn't take effect?
A: Don't get angry just yet. Eight times out of ten, the authentication details were entered wrong. With ipipgo, note that their passwords are dynamically generated, so don't copy one straight from the email.
Q: How can I improve proxy stability?
A: When testing, turn off Puppeteer's headless mode so you can watch the actual requests as they happen. ipipgo's long-lived static IP package is recommended; their survival rate can reach 95% or more.
Q: Why do you recommend ipipgo?
A: Put it this way: with other proxy providers I often ran into IPs that suddenly died. After switching to ipipgo, their intelligent routing system automatically switches away from failed nodes, and you can also choose data center IPs or residential IPs depending on the type of business.
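For the debugging tip in the Q&A above (turning off headless mode), it can also help to log every failed request, since proxy problems usually surface as network errors. This is a sketch; `describeFailure` and `debugThroughProxy` are hypothetical names, while `headless`, the `requestfailed` event and `request.failure()` are part of Puppeteer's API.

```javascript
// Turn a failed request into a one-line log message.
function describeFailure(request) {
  const failure = request.failure(); // null when the request did not fail
  const reason = failure ? failure.errorText : 'unknown error';
  return `${request.method()} ${request.url()} failed: ${reason}`;
}

async function debugThroughProxy(proxyServer, targetUrl) {
  const puppeteer = require('puppeteer'); // loaded lazily
  const browser = await puppeteer.launch({
    headless: false, // show the browser window while debugging
    args: [`--proxy-server=${proxyServer}`],
  });
  const page = await browser.newPage();
  // Log each request that fails, for example when the proxy is unreachable.
  page.on('requestfailed', (request) => {
    console.error(describeFailure(request));
  });
  await page.goto(targetUrl);
  return browser;
}
```

An error text such as `net::ERR_PROXY_CONNECTION_FAILED` points at the proxy endpoint itself rather than at the target site.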
A few lesser-known tips
Finally, here is a neat trick: in Puppeteer you can use multiple proxy IPs at the same time. This needs to be paired with ipipgo's multi-channel service. The code is actually very simple:
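const proxies = await ipipgo.getBatch(5); // hypothetical call: take 5 IPs at once
// forEach does not wait for async callbacks, so use Promise.all with map
await Promise.all(proxies.map(async (proxy) => {
  // Puppeteer v22+ renamed this method to createBrowserContext
  const context = await browser.createIncognitoBrowserContext({
    proxyServer: `http://${proxy.ip}:${proxy.port}`,
  });
  // Separate IP for each incognito window...
}));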
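Because the method that creates a context was renamed between Puppeteer versions, a small wrapper can keep the per-context proxy code working on both. The sketch below assumes proxies shaped like `{ ip, port }`; `contextFactory` and `openContexts` are hypothetical names.

```javascript
// Pick whichever context-creation method this Puppeteer version offers:
// createBrowserContext (v22+) or createIncognitoBrowserContext (older).
function contextFactory(browser) {
  const method =
    typeof browser.createBrowserContext === 'function'
      ? 'createBrowserContext'
      : 'createIncognitoBrowserContext';
  return (options) => browser[method](options);
}

// Open one isolated context per proxy, all in parallel.
// Assumes each proxy looks like { ip, port }.
async function openContexts(browser, proxies) {
  const createContext = contextFactory(browser);
  return Promise.all(
    proxies.map(({ ip, port }) =>
      createContext({ proxyServer: `http://${ip}:${port}` })
    )
  );
}
```

Each returned context has its own cookies and its own proxy, so pages opened with `context.newPage()` leave through different IPs.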
Well, that is the experience gathered from real-world practice. Honestly, choosing the right proxy provider saves half the effort. A provider like ipipgo, with automatic IP rotation and a wide selection of regions, is far less hassle than a self-built proxy pool. Stability matters most of all when you are doing large-scale data collection.

