
Hands-on: web automation with Python and proxy IPs
Today let's talk about automating things with Python and proxy IPs. Many people who use Selenium for data collection run into a site's anti-scraping mechanisms, and that is when a proxy IP helps. We'll use ipipgo's proxy service as the example and walk through a few practical tricks.
Don't skimp on environment setup
Install these packages first:
pip install selenium webdriver-manager selenium-wire parsel
Chrome is recommended. Make sure the driver version matches your browser version. Don't cut corners with an outdated driver, or the errors will make you question your life choices.
The right way to set up a proxy IP
Here are two common approaches:
Method 1: pass the proxy straight to the browser
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

proxy = "112.85.131.62:9021"  # proxy address provided by ipipgo
options = webdriver.ChromeOptions()
options.add_argument(f"--proxy-server=http://{proxy}")
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(
    service=Service(ChromeDriverManager().install()),
    options=options,
)
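A typo in the proxy string is easy to miss, so it can help to validate the address before building the flag. The sketch below is one way to do that. `proxy_argument` is a hypothetical helper name, not part of Selenium, and it only handles the plain `host:port` form used in Method 1.

```python
def proxy_argument(proxy: str, scheme: str = "http") -> str:
    """Build Chrome's --proxy-server flag from a 'host:port' string."""
    host, sep, port = proxy.strip().rpartition(":")
    # rpartition returns an empty separator when no ':' is present
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {proxy!r}")
    if not 0 < int(port) < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"--proxy-server={scheme}://{host}:{port}"


print(proxy_argument("112.85.131.62:9021"))
```

The returned string can be passed to `options.add_argument(...)` in place of the hand-written f-string.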
Method 2: authenticate with a username and password
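from seleniumwire import webdriver

# Replace USERNAME, PASSWORD, and PORT with the values from your ipipgo account
proxy_url = 'http://USERNAME:PASSWORD@gateway.ipipgo.com:PORT'
proxy_options = {
    'proxy': {
        'http': proxy_url,
        'https': proxy_url,  # without this entry, HTTPS pages bypass the proxy
    },
    'verify_ssl': False,  # top-level Selenium Wire option, not a proxy key
}
driver = webdriver.Chrome(seleniumwire_options=proxy_options)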
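If the username or password contains characters such as `@` or `:`, pasting them straight into the proxy URL makes the URL ambiguous. A minimal sketch of percent-encoding the credentials first is shown below. `build_proxy_url` is a hypothetical helper, the credentials and port `8000` are placeholders, and it is an assumption worth verifying that your proxy client decodes percent-encoded credentials.

```python
from urllib.parse import quote


def build_proxy_url(username: str, password: str, host: str, port: int) -> str:
    """Return an http:// proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")  # safe="" also encodes '/' and ':'
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


# Placeholder credentials and port, for illustration only
url = build_proxy_url("demo_user", "p@ss:word", "gateway.ipipgo.com", 8000)
proxy_options = {"proxy": {"http": url, "https": url}}
print(url)
```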
Real-world example: an e-commerce price monitoring bot
Suppose we want to monitor the price of a product on an e-commerce platform. Here is one way to do it:
import time
from parsel import Selector

def price_monitor(url):
    # `driver` is the proxied browser created in Method 1 or Method 2
    driver.get(url)
    while True:
        time.sleep(3)  # wait for the page to load
        selector = Selector(text=driver.page_source)
        # Extract price information; '.price' depends on the page's markup
        price = selector.css('.price::text').get()
        if price is not None:
            print(f"Current price: {price.strip()}")
        else:
            print("Price element not found")
        # Check every hour
        time.sleep(3600)
        driver.refresh()
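The text pulled out by `.price::text` is still a string, so comparing one check against the next needs a number. The sketch below shows one possible way to parse it. `parse_price` is a hypothetical helper, and it assumes prices use a comma as the thousands separator and a dot as the decimal point, which will not hold for every site.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Pull the first number out of a price string, or return None."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))


# Sample strings standing in for two consecutive checks
last = parse_price(" ¥1,299.00 ")
current = parse_price("¥1,199.00")
if last is not None and current is not None and current < last:
    print(f"Price dropped by {last - current}")
```

`Decimal` is used instead of `float` so that currency amounts compare exactly.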
Common pitfalls and how to avoid them
Here are a few pitfalls that beginners often fall into:
| Symptom | Fix |
|---|---|
| Browser stuck on login page | Check whether the proxy configuration includes the authentication credentials |
| Frequent CAPTCHAs | Switch to a different ipipgo exit IP |
| Incomplete page load | Extend the wait time to 5-8 seconds as appropriate |
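Rather than guessing a fixed wait, one option is to poll until the content appears and give up after a deadline. The sketch below is a generic polling helper under that assumption. `wait_for` is a hypothetical name, and the 8-second default simply mirrors the upper end of the range in the table. Selenium's own `WebDriverWait` covers the same need if you prefer a built-in.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(check: Callable[[], Optional[T]], timeout: float = 8.0,
             interval: float = 0.5) -> Optional[T]:
    """Call check() until it returns a non-None value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Simulated page that only has the price on the third look
attempts = iter([None, None, "¥99.00"])
print(wait_for(lambda: next(attempts), timeout=2.0, interval=0.01))

# With a real driver, the check might look like:
# wait_for(lambda: Selector(text=driver.page_source).css('.price::text').get())
```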
Q&A time: you ask, I answer
Q: What should I do if my proxy IP suddenly fails?
A: Use ipipgo's automatic IP rotation feature. Their API supports switching IPs on demand and is very stable.
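On the client side, a simple fallback is to keep a small list of proxies and move to the next one when a request fails. The sketch below illustrates that idea only. It does not call any ipipgo API, the proxy addresses are made up, and `fake_fetch` stands in for whatever starts a browser and loads the page.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_failover(proxies: Iterable[str], fetch: Callable[[str], T]) -> T:
    """Try fetch(proxy) with each proxy in turn; return the first success."""
    last_error: Optional[Exception] = None
    for proxy in proxies:
        try:
            return fetch(proxy)
        except Exception as exc:  # a real script would catch narrower errors
            last_error = exc
            print(f"{proxy} failed ({exc}); trying the next one")
    raise RuntimeError("all proxies failed") from last_error


def fake_fetch(proxy: str) -> str:
    # Stand-in for "start a browser with this proxy and load the page"
    if proxy.startswith("10.0.0.1:"):
        raise ConnectionError("proxy unreachable")
    return f"page fetched via {proxy}"


print(fetch_with_failover(["10.0.0.1:8000", "10.0.0.2:8000"], fake_fetch))
```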
Q: How can I improve my collection efficiency?
A: Use multiple threads and give each thread a different proxy IP. ipipgo's concurrency package supports 50+ IP channels open at the same time.
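The one-proxy-per-thread idea can be sketched with the standard library's thread pool. In the sketch below, `collect` is a placeholder for the real work, the URLs and proxy addresses are made up, and proxies are assigned round-robin. In a real script each task would create and quit its own driver, since sharing one WebDriver instance across threads is generally not safe.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def collect(url: str, proxy: str) -> str:
    # Stand-in for real work: create a driver configured with `proxy`,
    # load `url`, extract data, then quit the driver.
    return f"{url} via {proxy}"


def run_all(urls, proxies, max_workers=4):
    """Pair each URL with a proxy (round-robin) and process them in threads."""
    pairs = list(zip(urls, cycle(proxies)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(lambda pair: collect(*pair), pairs))


urls = [f"https://example.com/item/{i}" for i in range(5)]
proxies = ["10.0.0.1:8000", "10.0.0.2:8000"]
for line in run_all(urls, proxies):
    print(line)
```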
Q: Are proxy IPs legal?
A: Choosing an established provider such as ipipgo avoids problems on the proxy side. Their IPs go through a strict compliance review, unlike those of some unvetted proxy sellers.
Lastly, a tip: do not use free proxies for automation. They are slow, and they may also leak your data. ipipgo's dedicated IP packages are both safe and stable, and new users can get a free 3-day trial.

