
Why do I need a proxy IP for Zillow data crawling?
Many people doing overseas real estate analysis treat Zillow as their staple data source, but calling its API directly is an easy way to hit a wall. For example, Old Wang sent 200 requests in a row from his own server last week, and the IP was blacklisted the next day. This is where dynamic proxy IPs come in: rotating addresses lets you fight a guerrilla war, and a service like ipipgo that changes IPs automatically is a lifesaver.
Step by step: calling the API through an ipipgo proxy
Don't rush into the code. First go to the ipipgo website and open a trial package. New users get 5GB of traffic, which is enough for testing. After signing up, find the API access point and the authentication information; you will need both when writing the code.
import requests

# Proxy tunnel address provided by ipipgo (fill in your own username, password, and port)
proxy = "http://USERNAME:PASSWORD@gateway.ipipgo.com:PORT"

headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your Zillow key'
}
params = {
    'zws-id': 'Your Zillow ID',
    'address': '1600 Pennsylvania Ave NW',
    'citystatezip': 'Washington DC'
}
# Send the request through the proxy for both HTTP and HTTPS traffic
response = requests.get(
    'https://api.zillow.com/webservice/GetSearchResults.htm',
    proxies={"http": proxy, "https": proxy},
    headers=headers,
    params=params,
    timeout=10  # fail fast instead of hanging on a slow proxy node
)
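If the proxy password contains characters such as `@` or `:`, pasting it straight into the proxy URL can break the URL. The sketch below percent-encodes the credentials before building the `proxies` dict. The helper name `build_proxies`, the port `8000`, and the sample password are illustrative assumptions, not values from ipipgo.

```python
from urllib.parse import quote


def build_proxies(username: str, password: str, host: str, port: int) -> dict:
    """Return a requests-style proxies dict for an authenticated HTTP proxy."""
    # Percent-encode credentials so '@', ':' and similar characters
    # cannot be mistaken for URL delimiters.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    proxy_url = f"http://{user}:{pwd}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values (assumptions); use the ones from your own ipipgo account.
proxies = build_proxies("USERNAME", "p@ss:word", "gateway.ipipgo.com", 8000)
print(proxies["https"])
```

The resulting dict can be passed as `proxies=proxies` to `requests.get`.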
Pitfalls to avoid
Real-world testing turned up three pitfalls:
| Problem | Fix |
|---|---|
| Request timeout | Switch the ipipgo node to the US West Coast |
| Returns empty data | Check that the address format fully complies with the Zillow specification |
| Account blocked | Keep the request rate below 10 per minute |
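One way to stay under the 10-requests-per-minute limit from the table is a small sliding-window throttle, sketched below. The class name `RateLimiter` and the default of 9 calls per 60 seconds are assumptions chosen to leave a margin; the clock and sleep functions are injectable only so the logic can be checked without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any sliding window of `period` seconds."""

    def __init__(self, max_calls=9, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def _prune(self, now):
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        self._prune(now)
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self._prune(now)
        self.calls.append(now)
```

Create one `RateLimiter()` for the whole crawl and call its `wait()` method immediately before each `requests.get(...)`.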
Q&A: questions beginners often ask
Q: Is it okay to use a free proxy?
A: Don't. Nine out of ten public proxies are already flagged by Zillow; ipipgo's residential proxy pool is the more reliable choice.
Q: Why did you choose ipipgo?
A: It has three strengths: 1) dynamic IPs that rotate automatically every 5 minutes; 2) a US residential IP pool of more than half a million addresses; 3) a built-in automatic retry mechanism.
Q: What should I do if the API returns an error code?
A: First check whether it is one of these cases: 1) the parameters contain Chinese (full-width) punctuation; 2) the IP has been restricted (switch to a new ipipgo IP right away); 3) there is a certificate problem (remember to install the latest root certificates).
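The first cause is easy to miss by eye, because a full-width comma looks almost identical to an ASCII one. A minimal sketch of a pre-flight check follows; it simply reports every non-ASCII character in the request parameters, on the assumption that US addresses sent to the API should be plain ASCII. The function name `find_non_ascii` is hypothetical.

```python
import unicodedata


def find_non_ascii(params: dict) -> dict:
    """Map each parameter name to the non-ASCII characters found in its value."""
    problems = {}
    for key, value in params.items():
        bad = [
            # Pair each character with its Unicode name for readable reports.
            (ch, unicodedata.name(ch, "UNKNOWN"))
            for ch in str(value)
            if ord(ch) > 127
        ]
        if bad:
            problems[key] = bad
    return problems


params = {
    "address": "1600 Pennsylvania Ave NW",
    "citystatezip": "Washington，DC",  # contains a full-width comma
}
print(find_non_ascii(params))
```

An empty result means the parameters are plain ASCII, so the other two causes are the next things to check.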
Practical experience in the field
Recently, while helping a client run batch valuations, I found a neat trick: use ipipgo's session hold function to keep the same IP for up to half an hour, which makes it less likely to trigger risk controls when pulling deeper data. Be careful not to exceed 30 minutes, though. Once the time is up you must switch to a new IP, because this window sits just below Zillow's monitoring threshold.
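The rotation schedule can be sketched as a small helper that hands out one session ID and replaces it once a maximum age is reached. Everything provider-specific here is an assumption: the 25-minute default is just a margin under the 30-minute limit, and the `USERNAME-session-<id>` username format in `proxy_for` is hypothetical, so check ipipgo's documentation for the real session syntax.

```python
import time
import uuid


class StickySession:
    """Hand out one session ID for at most max_age seconds, then rotate it."""

    def __init__(self, max_age=25 * 60, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock
        self._started = None
        self._session_id = None

    def current_id(self):
        now = self.clock()
        expired = self._started is None or now - self._started >= self.max_age
        if expired:
            # Start a fresh session, which should map to a new exit IP.
            self._session_id = uuid.uuid4().hex[:8]
            self._started = now
        return self._session_id


def proxy_for(session_id, username="USERNAME", password="PASSWORD",
              host="gateway.ipipgo.com", port=8000):
    # HYPOTHETICAL username format and port; not taken from ipipgo's docs.
    return f"http://{username}-session-{session_id}:{password}@{host}:{port}"
```

Calling `proxy_for(session.current_id())` before each request keeps the same proxy URL within the window and switches to a new one afterwards.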
Another lesser-known fact: Zillow's image API has looser IP restrictions. If you are mainly pulling image data, you can lower ipipgo's IP switching frequency, which saves traffic and keeps the connection stable.

