
Why do I have to use a proxy IP for ad price monitoring?
Recently, quite a few friends who run e-commerce businesses have complained to me that their crawlers get blocked whenever they try to collect competitors' advertising data. A real case: Mr. Wang, who runs a clothing company in Hangzhou, used an ordinary IP to scrape ad data from a major platform; after only about 200 requests he triggered the platform's risk control, and his account was banned outright for 15 days. This is exactly the pain point that proxy IPs solve.
Using an ordinary IP is like showing the same ID card at the bank every day to withdraw money: you will be flagged very quickly. A proxy IP is like sending a different person each time, spreading the requests across different IP addresses. ipipgo's dynamic residential IPs are best suited to this scenario: each request comes from a real user's home network, so advertising platforms simply cannot tell whether it is a real person or a machine.
Build an ad monitoring system in three steps
Step 1: Data collection
Write a crawler script in Python and pay attention to randomizing the request headers. Here is the key trick: fetch a new IP from ipipgo's API before every request. See this code sample:
```python
import random
import requests
from ipipgo_api import get_proxy  # assumed SDK wrapper for the ipipgo API

# small pool of User-Agent strings to rotate through
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]

def get_ad_data(url):
    proxy = get_proxy(type='dynamic')  # request a fresh dynamic residential IP
    headers = {'User-Agent': random.choice(user_agents)}
    response = requests.get(url,
                            proxies={"http": proxy, "https": proxy},
                            headers=headers,
                            timeout=10)
    return response.json()
```
Step 2: Frequency Control
Never use a fixed time interval! Set up a randomized wait plus an automatic IP-switching mechanism: for example, rotate to a new IP every 5 requests, and let the wait time float randomly between 1 and 3 seconds, as in the sketch below.
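Here is a minimal sketch of that pattern, reusing the assumed `get_proxy` helper from Step 1 (the rotation threshold and wait range are the ones suggested above):

```python
import random
import time
import requests
from ipipgo_api import get_proxy  # assumed SDK helper, as in Step 1

def crawl_batch(urls):
    proxy = get_proxy(type='dynamic')
    for i, url in enumerate(urls, start=1):
        resp = requests.get(url,
                            proxies={"http": proxy, "https": proxy},
                            timeout=10)
        # ... hand resp.json() to the Step 3 cleansing code ...
        time.sleep(random.uniform(1, 3))   # random 1-3 s wait, never a fixed interval
        if i % 5 == 0:                     # switch to a fresh IP every 5 requests
            proxy = get_proxy(type='dynamic')
```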
Step 3: Data cleansing
Focus on monitoring three types of data changes (a code sketch of these checks follows the table):
| Data type | Monitoring points |
|---|---|
| Price information | Record every change, kept to two decimal places |
| Ad placement | Alert when a listing moves up or down more than 3 places in the rankings |
| Promotional labels | Changes in keywords such as "discount" and "flash sale" |
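As a rough sketch, here is what those three checks might look like, assuming each scrape snapshot is a dict with `price`, `rank`, and `labels` fields (the field names are illustrative, not a fixed schema):

```python
def detect_changes(old, new):
    """Compare two scrape snapshots and return a list of alerts."""
    alerts = []
    # price: record every change, kept to two decimal places
    if round(old["price"], 2) != round(new["price"], 2):
        alerts.append(f"price {old['price']:.2f} -> {new['price']:.2f}")
    # ad placement: alert when the ranking moves more than 3 places
    if abs(old["rank"] - new["rank"]) > 3:
        alerts.append(f"rank moved {old['rank']} -> {new['rank']}")
    # promo labels: flag keywords like "discount" or "flash sale" coming or going
    if set(old["labels"]) != set(new["labels"]):
        alerts.append(f"labels changed: {set(new['labels']) ^ set(old['labels'])}")
    return alerts
```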
ipipgo real-world configuration plan
Based on our experience serving clients, here is the recommended mix:
- For daily monitoring, the Dynamic Residential (Standard) package at $7.67/GB is enough for roughly 100,000 requests
- During big promotions, upgrade to the Dynamic Residential (Business) package for higher concurrency
- For special needs such as a fixed IP, use the $35/month Static Residential IP package
There is one pit that is easy to fall into: many people hardcode proxy IP addresses in their scripts. The correct approach is to fetch the latest IP via the API before each request, like this:
```python
# Wrong: a hard-coded proxy address dies as soon as the IP rotates out
BAD_PROXY = "123.123.123.123:8888"

# Right: ask the API for a fresh IP before every request
import requests

def get_fresh_proxy():
    return requests.get('https://api.ipipgo.com/get_proxy').json()['ip']
```
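Plugged into the collector, that helper might be used like this (assuming the API returns the IP together with its port; adjust to whatever your endpoint actually returns):

```python
def fetch(url):
    ip = get_fresh_proxy()   # new IP on every call
    proxy = f"http://{ip}"
    return requests.get(url,
                        proxies={"http": proxy, "https": proxy},
                        timeout=10)
```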
Frequently Asked Questions
Q: Will the proxy IP affect the data collection speed?
A: With ipipgo's TK line you can actually speed things up by about 40%; measured latency stays within 200 ms. Just don't use free proxies, those really will slow you down!
Q: How does monitoring ad prices in different regions work?
A: Just specify the location parameters in the code. For example, to grab Walmart's ads as shown to US users:

```python
proxy = get_proxy(country='US', region='California')
```
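To compare the same ad across several regions, you can loop over locations with that same assumed `get_proxy` signature (the `price` field in the response is illustrative):

```python
def compare_regions(url, regions):
    prices = {}
    for country, region in regions:
        proxy = get_proxy(country=country, region=region)
        resp = requests.get(url,
                            proxies={"http": proxy, "https": proxy},
                            timeout=10)
        prices[f"{country}/{region}"] = resp.json().get("price")
    return prices

# e.g. compare_regions(ad_url, [("US", "California"), ("US", "Texas")])
```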
Q: Is it illegal to collect data?
A: Using proxy IPs is not illegal in itself, but take care to follow the site's robots.txt protocol. Collect only public data, and don't touch sensitive information that requires a login to see!
Guide to avoiding the pits
Recently I've seen a lot of users trip up in these areas (a sketch that addresses all three follows the list):
- Not setting a timeout parameter, which causes the program to hang
- Sending more than 50 consecutive requests from the same IP
- Forgetting to handle SSL certificate validation (quick fix: pass verify=False to requests)
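One way to bake all three fixes into a single helper, sketched under the same assumptions as the earlier snippets (note that `verify=False` silences certificate errors at the cost of skipping SSL verification):

```python
from collections import Counter
import requests

MAX_PER_IP = 50          # hard cap: never 50+ consecutive requests from one IP
request_counts = Counter()

def safe_get(url, proxy):
    if request_counts[proxy] >= MAX_PER_IP:
        raise RuntimeError("this IP is used up, rotate before continuing")
    request_counts[proxy] += 1
    return requests.get(url,
                        proxies={"http": proxy, "https": proxy},
                        timeout=10,    # always set a timeout so the program can't hang
                        verify=False)  # skips SSL verification, per the tip above
```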
Finally, a bit of little-known trivia: advertising platforms' anti-scraping systems update their strategies at around 3:00 a.m., which is when data collection success rates are highest. Use ipipgo's scheduled-task feature to set up automatic collection in the early hours, and you can save yourself a lot of headaches.
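If you would rather schedule locally than use ipipgo's built-in timer, here is a bare-bones standard-library sketch (`crawl_batch` is the Step 2 helper above; `ad_urls` is your own watch list):

```python
import time
from datetime import datetime, timedelta

def sleep_until(hour=3):
    """Block until the next occurrence of the given hour, e.g. 3:00 a.m."""
    now = datetime.now()
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)
    time.sleep((target - now).total_seconds())

while True:
    sleep_until(3)
    crawl_batch(ad_urls)  # ad_urls: your own list of pages to monitor
```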

