Go Proxy IP HTML Parser: Go Proxy IP Parser Library

Hands-on: scraping proxy IPs with Go

Anyone who has done data collection for a while knows that working without proxy IPs is like driving without a steering wheel. Today's post is all practical: we'll write a proxy IP parser in Go, focusing on how to extract proxy IP addresses from a web page.


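import (
    "fmt"
    "regexp"
)

// Example: parse IP and port pairs out of an HTML table.
func parseIPTable(html string) []string {
    // (?s) lets . match newlines; .*? skips whatever sits between the IP cell and the port cell.
    re := regexp.MustCompile(`(?s)<td>(\d+\.\d+\.\d+\.\d+)</td>.*?<td>(\d+)</td>`)
    matches := re.FindAllStringSubmatch(html, -1)

    var proxies []string
    for _, match := range matches {
        // match[1] is the IP, match[2] is the port.
        proxies = append(proxies, fmt.Sprintf("%s:%s", match[1], match[2]))
    }
    return proxies
}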

This regular expression looks simple, but there are several pitfalls to watch for: page structure changes often, some sites deliberately plant fake IPs, and the table may have ad content mixed in. At that point, using ipipgo's ready-made proxy pool saves a lot of trouble compared with scraping pages yourself.
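One way to guard against planted or malformed entries is to sanity-check every match before keeping it. The following is a minimal sketch, not the article's original code: the name extractProxies and the decision to drop private and loopback addresses are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
	"strconv"
)

// rowPattern matches an IP cell followed (possibly after other markup) by a
// numeric port cell. (?s) lets . span newlines between the two cells.
var rowPattern = regexp.MustCompile(
	`(?s)<td>\s*(\d{1,3}(?:\.\d{1,3}){3})\s*</td>.*?<td>\s*(\d{1,5})\s*</td>`)

// extractProxies returns "ip:port" strings for rows that pass basic checks.
// Hypothetical helper: the filtering rules are an assumption, tune as needed.
func extractProxies(html string) []string {
	var proxies []string
	for _, m := range rowPattern.FindAllStringSubmatch(html, -1) {
		ip := net.ParseIP(m[1])
		if ip == nil || ip.To4() == nil {
			continue // not a valid IPv4 address, e.g. 999.1.1.1
		}
		if ip.IsPrivate() || ip.IsLoopback() || ip.IsUnspecified() {
			continue // cannot be a public proxy
		}
		port, err := strconv.Atoi(m[2])
		if err != nil || port < 1 || port > 65535 {
			continue // port out of range
		}
		proxies = append(proxies, net.JoinHostPort(ip.String(), strconv.Itoa(port)))
	}
	return proxies
}

func main() {
	page := `<tr><td>8.8.8.8</td><td>8080</td></tr>
<tr><td>999.1.1.1</td><td>80</td></tr>
<tr><td>10.0.0.1</td><td>3128</td></tr>`
	fmt.Println(extractProxies(page))
}
```

This only removes entries that can never work. An address that looks plausible still has to be tested against a live connection.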

Proxy IP Validation

You went to the trouble of scraping the IPs, and eight out of ten turn out to be unusable. What now? Here's a trick:

Verification step                  Time taken    Success rate
TCP connection only                2 seconds     40%
Test against the target site       5 seconds     80%
Multi-node concurrent detection    3 seconds     95%

If that's too much hassle, just use ipipgo's pre-verified IP pool, which has already been through three rounds of screening. The IPs their API returns are basically ready to use, so you can skip the validation work.

Practical case: collecting data from an enterprise information website

Recently a friend asked me for help: his company needed to collect enterprise data, but the site's anti-scraping measures were too aggressive. Here's how we got it done:


func main() {
    // Get 10 HTTP proxies from ipipgo.
    // This assumes GetProxies returns []*url.URL, which is what http.ProxyURL needs.
    proxies := ipipgo.GetProxies(10, "http")

    for _, proxy := range proxies {
        client := &http.Client{
            Transport: &http.Transport{Proxy: http.ProxyURL(proxy)},
            Timeout:   8 * time.Second,
        }

        resp, err := client.Get("target site")
        if err != nil {
            continue // this proxy failed, move on to the next one
        }
        // Parse the data...
        resp.Body.Close()
    }
}

With this approach we got past the anti-scraping mechanism. The key point is to use a different proxy for each request, and ipipgo's IP pool is large enough for us to rotate through.

Veteran Q&A

Q: Why can't I use the proxy IPs I got?
A: There are two common causes: either the proxy has died (IPs you scrape yourself are short-lived), or the target site has blocked that proxy range. A professional provider such as ipipgo is recommended; their IPs refresh quickly and come with a 24-hour survival guarantee.

Q: How can I increase the collection speed?
A: Three tricks: 1. issue requests concurrently through a worker pool 2. set a reasonable timeout 3. don't hammer a single site, spread the requests across proxy IPs instead

Q: What should I pay attention to when choosing a proxy service provider?
A: Focus on these points: IP pool size (ipipgo's pool of millions is recommended), protocol support (HTTP/HTTPS/SOCKS5), response speed (ipipgo measured at 200ms on average), and whether a trial is offered (they have a 3 yuan trial package).

This article was originally published or organized by ipipgo. https://www.ipipgo.com/en-us/ipdaili/37412.html
