
A hands-on guide to web scraping in Excel
Lately a lot of data analysis folks have been asking how to scrape web pages with VBA without getting their IP blocked. It is not hard, as long as you use the right tool. Today we will walk through how to do it in Excel, focusing on the lifesaver that is the proxy IP.
Why does your VBA code keep getting IP-blocked?
Many beginners spend half a day writing code, only to be greeted by a "429 error" the moment they run it. Put bluntly, the website has noticed that you are sending requests like crazy and has blocked your IP. This is when you need a proxy IP to disguise your true identity. It is like guerrilla warfare: you have to change positions often.
' Plain request (risky: exposes your real IP)
Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")
objHTTP.Open "GET", "http://target-website", False
objHTTP.Send
' Proxied request (safer)
Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")
objHTTP.SetProxy 2, "Proxy IP:Port" ' 2 = route through the given proxy; ipipgo's residential proxies are recommended here
objHTTP.Open "GET", "http://target-website", False
objHTTP.Send
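To tell a rate-limit block apart from other failures, the status code can be checked explicitly. The following is a minimal sketch, not a definitive implementation: the function name `IsRateLimited` is made up for illustration, and it assumes the site signals throttling with HTTP 429 as described above.

```vba
' Hypothetical helper: returns True when the server answers with HTTP 429.
Function IsRateLimited(ByVal url As String) As Boolean
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", url, False   ' synchronous request
    http.Send
    ' 429 = Too Many Requests
    IsRateLimited = (http.Status = 429)
End Function
```

If this returns True for a URL that loads fine in a browser, that is a sign the current IP is being throttled and it is time to switch to a proxy.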
How to choose a reliable proxy IP?
There are all kinds of proxy services on the market. In our hands-on testing, ipipgo's dedicated residential proxies turned out to be the best fit for web scraping. Their IPs stay alive for a long time, response times can be 200 ms or less, and most importantly there is a dedicated API that rotates IPs automatically, so you never have to switch them by hand.
| Proxy type | Speed | Stability | Best suited for |
|---|---|---|---|
| Datacenter proxies | Fast | Easily detected | Short-term, small-volume jobs |
| Residential proxies (ipipgo) | Moderate | Extremely high | Long-term, large-scale jobs |
Four Steps to Real-World Configuration
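1. Go to the ipipgo official website, sign up for an account, and get a free trial pack.
2. In your VBA project, add a reference to the Microsoft XML library (the code below creates its object with CreateObject, so it also runs without the reference).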
3. Paste the following code in:
Sub SmartCapture()
    Dim ProxyPool As New Collection
    Dim CurrentProxy As Variant
    Dim http As Object
    ProxyPool.Add "ip1.ipipgo.pro:8000" ' We recommend buying a package to get more IPs.
    ProxyPool.Add "ip2.ipipgo.pro:8000"
    On Error Resume Next ' A dead proxy raises a run-time error; move on to the next one
    For Each CurrentProxy In ProxyPool
        Err.Clear
        Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
        http.SetProxy 2, CurrentProxy
        http.Open "GET", "Target URL", False
        http.Send
        If Err.Number = 0 Then ' Only read the status if the request went through
            If http.Status = 200 Then
                ' Process the response data here
                Exit For
            End If
        End If
    Next CurrentProxy
    On Error GoTo 0
End Sub
4. Remember to add a random delay between requests. Something like Application.Wait Now + TimeValue("00:00:03") makes the traffic look more like a real user.
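The delay in step 4 is a fixed three seconds; a randomized pause can be sketched as follows. This is only an illustration: the name `RandomPause` and the 2 to 5 second range in the usage note are assumptions, not values from the original.

```vba
' Hypothetical helper: pause for a random whole number of seconds
' between minSec and maxSec (inclusive).
Sub RandomPause(ByVal minSec As Long, ByVal maxSec As Long)
    Dim secs As Long
    Randomize                                         ' seed Rnd from the system timer
    secs = Int((maxSec - minSec + 1) * Rnd) + minSec  ' uniform integer in [minSec, maxSec]
    Application.Wait Now + TimeSerial(0, 0, secs)     ' block Excel until that time
End Sub
```

Calling `RandomPause 2, 5` after each `http.Send` would space requests 2 to 5 seconds apart. Application.Wait works at whole-second resolution, which is why the sketch draws an integer number of seconds.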
Common pitfalls: Q&A
Q: The code is fine but it always prompts a timeout?
A: Eight times out of ten the proxy IP quality is to blame. Try switching to ipipgo's high-anonymity package, and remember to check your firewall settings!
Q: How to solve the problem of incomplete data capture?
A: Add pagination handling, and use ipipgo's automatic rotation so that each page is fetched through a different IP.
Q: What if I need to process a CAPTCHA?
A: In that case ipipgo's dynamic residential proxies are recommended. Some of their IPs come with browser fingerprint masking.
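The timeout and pagination answers above can be combined into one rough sketch: fetch each page through the next proxy in a list, with explicit timeouts so a slow proxy fails fast. Everything specific here is an assumption for illustration: the `?page=` URL pattern, the page count, the timeout values, and the two proxy addresses (reused from the earlier example). Rotation is done locally by cycling through the list rather than through any provider API.

```vba
' Hypothetical sketch: URL pattern, page count, and proxies are placeholders.
Sub FetchPages()
    Const BASE_URL As String = "http://example.com/list?page="  ' assumed URL pattern
    Const LAST_PAGE As Long = 3                                 ' assumed page count
    Dim proxies As Variant
    Dim http As Object
    Dim page As Long, idx As Long

    proxies = Array("ip1.ipipgo.pro:8000", "ip2.ipipgo.pro:8000")

    For page = 1 To LAST_PAGE
        ' Cycle through the proxy list so consecutive pages use different IPs
        idx = LBound(proxies) + (page - 1) Mod (UBound(proxies) - LBound(proxies) + 1)

        Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
        http.SetProxy 2, proxies(idx)
        ' Resolve, connect, send, receive timeouts in milliseconds
        http.SetTimeouts 5000, 5000, 10000, 10000
        http.Open "GET", BASE_URL & page, False

        On Error Resume Next   ' a timeout or dead proxy raises a run-time error
        Err.Clear
        http.Send
        If Err.Number <> 0 Then
            Debug.Print "Page " & page & " failed via " & proxies(idx) & ": " & Err.Description
        ElseIf http.Status = 200 Then
            ' Placeholder processing: store the first 100 characters in column A
            ActiveSheet.Cells(page, 1).Value = Left$(http.ResponseText, 100)
        Else
            Debug.Print "Page " & page & " returned HTTP " & http.Status
        End If
        On Error GoTo 0
    Next page
End Sub
```

Failures are only logged to the Immediate window here; a fuller version might retry the same page through the next proxy before moving on.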
Advanced tips
If you have the budget, integrate ipipgo's API directly into your VBA code. Their API responds very quickly and also lets you specify a country or region. For example, if you want to scrape a website in a particular country, you can lock onto the proxy pool for that region, and the success rate can double.
Finally, don't reach for free proxies just to save money: you risk data leaks or a malware infection. Leave the specialist work to a reliable provider such as ipipgo, which saves time and effort and is safer too. If anything is unclear, you are welcome to contact customer service on their official website; they reply much faster than some e-commerce platforms.

