
What's so hard about capturing Adidas product data?
Anyone who has crawled e-commerce data knows that the anti-scraping mechanism on the official Adidas website is ruthless. You have barely scraped 200 records when your IP gets banned. Ordinary users may think that switching to another IP is enough, but the system may have blocked the entire IP range. This is when you need a dynamic proxy IP pool to fight a guerrilla war.
How to capture data with a proxy IP
I recommend ipipgo's dynamic residential proxies here. Their IP pool refreshes 200,000+ IPs per day, which makes it especially suitable for sites with strict anti-scraping measures like Adidas. Write a simple script in Python and remember to change the IP for each request:
import requests
from random import choice
# List of proxies from the ipipgo backend
proxies = [
    "http://user:pass@gateway.ipipgo.com:30001",
    "http://user:pass@gateway.ipipgo.com:30002",
    # ... other proxy nodes
]
url = "https://www.adidas.com/api/products"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36..."
}
try:
    # Pick one random proxy per request; the URL is HTTPS, so map both schemes to it
    proxy = choice(proxies)
    response = requests.get(
        url,
        proxies={"http": proxy, "https": proxy},
        headers=headers,
        timeout=10,
    )
    print(response.json())
except Exception as e:
    print(f"Crawl failed ({e}), change IP and retry")
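The script above only prints a message when a request fails. Here is a minimal sketch of what "change IP and retry" could look like. The names `fetch_with_retry`, `get` and `max_attempts` are made up for illustration and are not part of ipipgo or requests; `get` stands for any callable that accepts the same keyword arguments as `requests.get`.

```python
import random


def fetch_with_retry(url, proxy_pool, get, headers=None, max_attempts=3):
    """Try up to max_attempts different proxies, switching after each failure.

    `get` is a callable with the same keyword interface as requests.get.
    It is passed in so the sketch can be tried without network access.
    """
    # Sample without replacement so a failed proxy is not reused in this call.
    candidates = random.sample(proxy_pool, k=min(max_attempts, len(proxy_pool)))
    last_error = None
    for proxy in candidates:
        try:
            response = get(
                url,
                proxies={"http": proxy, "https": proxy},
                headers=headers,
                timeout=10,
            )
            response.raise_for_status()  # treat 4xx/5xx answers as failures too
            return response
        except Exception as exc:  # broad on purpose: any failure means "try another IP"
            last_error = exc
    raise RuntimeError(f"all {len(candidates)} proxies failed") from last_error
```

In a real run you would call it as `fetch_with_retry(url, proxies, requests.get, headers=headers)` with the list and headers from the script above.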
Top 3 Tips for Avoiding Anti-Scraping Blocks
1. IP rotation frequency: change your IP every 50 requests instead of waiting until you are blocked.
2. Request header disguise: remember to randomize the User-Agent, and don't use the default headers of requests.
3. Request interval: add a random wait between requests, for example random.uniform(1, 3) seconds.
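The three tips can be bundled into one small helper. This is only a sketch under assumptions: `RequestPlanner`, `plan_next` and `fetch` are hypothetical names, the User-Agent strings are sample values, and `get` is again any callable with the keyword interface of `requests.get`.

```python
import random
import time

# Sample User-Agent strings for illustration only; swap in current browser values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


class RequestPlanner:
    """Decides proxy, headers and wait time for each request."""

    def __init__(self, proxy_pool, rotate_every=50, delay_range=(1, 3)):
        self.proxy_pool = list(proxy_pool)
        self.rotate_every = rotate_every
        self.delay_range = delay_range
        self.sent = 0
        self.current_proxy = random.choice(self.proxy_pool)

    def plan_next(self):
        """Return (proxies, headers, delay_seconds) for the next request."""
        # Tip 1: move to a different proxy after every `rotate_every` requests.
        if self.sent > 0 and self.sent % self.rotate_every == 0:
            others = [p for p in self.proxy_pool if p != self.current_proxy]
            self.current_proxy = random.choice(others or self.proxy_pool)
        self.sent += 1
        # Tip 2: pick a random User-Agent instead of the library default.
        headers = {"User-Agent": random.choice(USER_AGENTS)}
        # Tip 3: wait a random amount of time before sending.
        delay = random.uniform(*self.delay_range)
        proxies = {"http": self.current_proxy, "https": self.current_proxy}
        return proxies, headers, delay

    def fetch(self, url, get, sleep=time.sleep):
        """Apply the plan: wait, then send one request through `get`."""
        proxies, headers, delay = self.plan_next()
        sleep(delay)
        return get(url, proxies=proxies, headers=headers, timeout=10)
```

Usage would look like `planner = RequestPlanner(proxies)` followed by `planner.fetch(url, requests.get)` inside your crawl loop.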
| Proxy Type | Applicable Scenarios | Recommended Option |
|---|---|---|
| Datacenter proxies | Short-term, small-scale scraping | Not recommended |
| Residential proxies | Long-term stable collection | ipipgo dynamic residential proxies |
Common pitfalls Q&A
Q: Why am I still blocked after using a proxy?
A: Maybe the session is not being reset. Remember to clear cookies after each request, or just use stateless requests.
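A quick sketch of the cookie-clearing idea, assuming you are reusing a `requests.Session` (a bare `requests.get(...)` call builds a throwaway session each time, so it does not carry cookies between calls). The helper name `fetch_and_forget` is hypothetical.

```python
def fetch_and_forget(session, url, **kwargs):
    """Send one request through `session`, then drop any cookies the server set.

    `session` is expected to behave like requests.Session: it has a get()
    method and a `cookies` jar with a clear() method.
    """
    try:
        return session.get(url, **kwargs)
    finally:
        # Runs on success and on failure, so no tracking cookie survives
        # into the next request.
        session.cookies.clear()
```

With requests this would be used as `fetch_and_forget(requests.Session(), url, proxies=..., headers=..., timeout=10)`.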
Q: What if ipipgo's proxy is not fast enough?
A: You can pick low-latency nodes in their dashboard. In my tests, US East nodes brought latency down to under 200 ms.
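If you would rather check node speed from your own machine, something along these lines could work. It is a rough sketch: `time_call` and `fastest_proxy` are hypothetical helpers, `probe` is whatever test request you choose to send through a proxy, and the 0.2 second default simply mirrors the 200 ms figure above.

```python
import time


def time_call(probe, proxy, timer=time.perf_counter):
    """Return how many seconds probe(proxy) took, or None if it raised."""
    start = timer()
    try:
        probe(proxy)
    except Exception:
        return None  # an unreachable node has no usable latency
    return timer() - start


def fastest_proxy(proxy_pool, measure, max_latency=0.2):
    """Return the proxy with the lowest latency within max_latency seconds.

    `measure(proxy)` must return a latency in seconds, or None on failure.
    Returns None when no proxy is fast enough.
    """
    timings = {}
    for proxy in proxy_pool:
        latency = measure(proxy)
        if latency is not None and latency <= max_latency:
            timings[proxy] = latency
    return min(timings, key=timings.get) if timings else None
```

A typical call would be `fastest_proxy(proxies, lambda p: time_call(my_probe, p))`, where `my_probe` sends one small request through the given proxy.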
A special reminder for veterans
Don't try to save money with free proxies; those IPs have long been blacklisted by Adidas. I suggest going straight to ipipgo's exclusive IP package. For about 200 bucks a month, the data capture success rate can climb from 30% to 85% or more. Once you have used it you will know: professional jobs call for professional tools.
One last thing: remember to update your IP pool every day! ipipgo has an IP freshness function. If you run into a CAPTCHA bombardment, you can work with their API to automatically change the exit IP; the specific steps are written in their documentation.
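The ipipgo API itself is not shown in this post, so the sketch below does not call it. Instead it stands in with a local pool: it guesses whether a response is a CAPTCHA page and moves to the next proxy when it is. The detection rule (status 403 or 429, or the word "captcha" in the body) is an assumption, and `looks_like_captcha` and `fetch_until_clear` are hypothetical names.

```python
import itertools

# Assumption: these status codes usually mean "blocked" or "slow down".
BLOCK_STATUS_CODES = {403, 429}


def looks_like_captcha(status_code, body_text):
    """Heuristic guess that the server answered with a challenge page."""
    return status_code in BLOCK_STATUS_CODES or "captcha" in body_text.lower()


def fetch_until_clear(url, proxy_pool, get, max_switches=5):
    """Walk through the pool until a response does not look like a CAPTCHA.

    `get` has the keyword interface of requests.get. Returns the first clean
    response, or None after max_switches attempts.
    """
    for proxy in itertools.islice(itertools.cycle(proxy_pool), max_switches):
        response = get(url, proxies={"http": proxy, "https": proxy}, timeout=10)
        if not looks_like_captcha(response.status_code, response.text):
            return response
    return None
```

Once you have their documentation at hand, the "move to the next proxy" step is where a call to their IP replacement API would go.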

