Python Image Grabber: Batch Downloader

If your IP keeps getting blocked while crawling images, try this trick. It works well.

Anyone who writes web crawlers knows that the biggest headache in batch image downloading is getting your IP blocked. The script runs fine in the morning, and by the afternoon the site is answering with 403 Forbidden. That is when a proxy IP becomes your life preserver. Today we will build a Python image downloader with a shield, escorted by ipipgo's proxy service.

Why do you get blocked without a proxy IP?

Websites look at three main things when fighting crawlers: request frequency, IP traces, and user characteristics. An ordinary crawler that fires requests non-stop from one fixed IP is like the same person pounding on the door 100 times a minute. Who else would security block? Using proxy IPs is like knocking with a different disguise each time, so security has a much harder time recognizing you.
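
Two of those three signals, request frequency and user characteristics, can be softened in code even before a proxy enters the picture. Here is a minimal sketch, assuming a small hand-picked list of User-Agent strings and a one-second base delay. `USER_AGENTS` and the helper names are made up for illustration.

```python
import random
import time

# Hypothetical pool of browser User-Agent strings; replace with ones you trust.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
]


def random_headers(rng=random):
    """Pick a User-Agent at random so consecutive requests look less uniform."""
    return {"User-Agent": rng.choice(USER_AGENTS)}


def jittered_delay(base=1.0, jitter=0.5, rng=random):
    """Return a wait in seconds drawn uniformly from [base - jitter, base + jitter]."""
    return max(0.0, rng.uniform(base - jitter, base + jitter))


def polite_pause(base=1.0, jitter=0.5):
    """Sleep for a jittered interval between two requests."""
    time.sleep(jittered_delay(base, jitter))
```

Call `polite_pause()` between downloads and pass `random_headers()` as the `headers` argument of each request. The IP trace itself is what the proxy takes care of.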


# Example of core configuration for proxy IPs
proxies = {
    'http': 'http://username:password@gateway.ipipgo.com:9020',
    'https': 'http://username:password@gateway.ipipgo.com:9020'
}
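
If the username or password contains characters such as `@` or `:`, the proxy URL breaks unless they are percent-encoded. Below is a small sketch of a helper that builds the same dict. `build_proxies` is a hypothetical name, and the default host and port are simply the ones from the example above.

```python
from urllib.parse import quote


def build_proxies(username, password, host="gateway.ipipgo.com", port=9020):
    """Build a requests-style proxies dict, percent-encoding the credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    proxy_url = f"http://{user}:{pwd}@{host}:{port}"
    # The same HTTP proxy endpoint serves both http:// and https:// targets.
    return {"http": proxy_url, "https": proxy_url}
```

The result can be passed straight to requests, for example `requests.get(url, proxies=build_proxies("username", "password"), timeout=15)`.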

Setting up the environment, step by step

Install these essential libraries first (installing from the Tsinghua mirror is faster):


pip install requests pillow retrying -i https://pypi.tuna.tsinghua.edu.cn/simple

Now for the ipipgo configuration: log in to their dashboard to get your API extraction link. The suggested choice is the long-lasting static IP package. These IPs stay alive for a long time, which suits crawling tasks that need to run continuously.

Writing code that resists blocking

Straight to the practical part. Here is code with triple protection:


from retrying import retry
import requests
from urllib.parse import urlparse

def download_img(url, save_path):
    # Protection 1: send a browser-like User-Agent
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'}

    # Protection 2: get the proxy IP dynamically from the ipipgo interface
    proxy = requests.get("https://ipipgo.com/fetchproxy?type=json").json()

    # Protection 3: retry up to 3 times before giving up
    @retry(stop_max_attempt_number=3)
    def _download():
        # Map both schemes to the proxy; with only "http" set,
        # https:// image URLs would bypass the proxy entirely.
        resp = requests.get(url, headers=headers,
                            proxies={"http": proxy['proxy'],
                                     "https": proxy['proxy']},
                            timeout=15)
        resp.raise_for_status()
        with open(save_path, 'wb') as f:
            f.write(resp.content)

    try:
        _download()
    except Exception as e:
        print(f"Download failed: {str(e)}, changing ipipgo's IP...")
        return False
    return True
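
The function above handles one image. For a batch you still need a loop and a file name for each URL. One possible sketch follows. The helper names and the `.jpg` fallback are assumptions for illustration, and the download function is passed in as a parameter so that `download_img` can be plugged in.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url, index):
    """Derive a local file name from the URL path, else fall back to a numbered name."""
    name = os.path.basename(urlparse(url).path)
    # Assumption: images without a usable name are saved as .jpg
    return name if name else f"image_{index}.jpg"


def download_batch(urls, save_dir, download):
    """Call download(url, save_path) for every URL and return the URLs that failed.

    `download` must return True on success and False on failure.
    """
    os.makedirs(save_dir, exist_ok=True)
    failed = []
    for index, url in enumerate(urls):
        save_path = os.path.join(save_dir, filename_from_url(url, index))
        if not download(url, save_path):
            failed.append(url)
    return failed
```

A call might look like `failed = download_batch(url_list, "images", download_img)`. The returned list can be fed back in for a second pass once the IP has changed.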

Veteran's Q&A

Q: What should I do if the proxy IP suddenly doesn't work?
A: ipipgo's IP pool has a 5-second auto-switching mechanism, so just add a retry loop in your code. If you hit a dead IP, you can also refresh the node manually from their dashboard.
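
One way to write that retry loop is to ask for a fresh proxy on every attempt, so a dead IP is never reused. This is a sketch under assumptions: `fetch` and `get_proxy` are hypothetical callables you supply, for example thin wrappers around `requests.get` and the ipipgo extraction link.

```python
def fetch_with_rotation(fetch, get_proxy, max_attempts=3):
    """Try fetch(proxy) with a newly obtained proxy on every attempt.

    fetch: callable taking a proxy URL and returning the response body;
           it should raise an exception on failure.
    get_proxy: callable returning a proxy URL each time it is called.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        proxy = get_proxy()
        try:
            return fetch(proxy)
        except Exception as exc:  # broad on purpose: any failure triggers a switch
            last_error = exc
            print(f"Attempt {attempt} via {proxy} failed: {exc}")
    raise RuntimeError(f"All {max_attempts} attempts failed") from last_error
```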

Q: How do I know if the proxy is in effect?
A: Add a check to your code: before downloading, visit http://ip.ipipgo.com/checkip and confirm that the returned IP is the proxy's IP.
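
A rough sketch of that check: request the endpoint once directly and once through the proxy, then compare. It assumes the endpoint returns the caller's IP in the response body (the exact format is not verified here). The HTTP getter is a parameter, so in practice you would pass `requests.get`.

```python
CHECK_URL = "http://ip.ipipgo.com/checkip"  # endpoint mentioned in the answer above


def visible_ip(get, proxies=None, timeout=15):
    """Return what the check endpoint reports for this connection.

    get: a requests.get-compatible callable.
    Assumption: the body identifies the caller's IP and nothing that
    changes between two calls from the same address.
    """
    resp = get(CHECK_URL, proxies=proxies, timeout=timeout)
    resp.raise_for_status()
    return resp.text.strip()


def proxy_in_effect(get, proxies, timeout=15):
    """True when the endpoint sees a different address through the proxy than directly."""
    direct = visible_ip(get, None, timeout)
    proxied = visible_ip(get, proxies, timeout)
    return direct != proxied
```

For example, `proxy_in_effect(requests.get, proxies)` returning `False` suggests the traffic is not going through the proxy.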

Q: What if I want to open a multi-threaded download?
A: ipipgo's Enterprise Package supports 500 concurrent IPs. Give each thread its own proxy, and remember to set the timeout to more than 30 seconds.
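
A minimal multi-threaded sketch using the standard library's thread pool. `download_one` and `get_proxy` are hypothetical callables you provide. Each task asks for its own proxy, and the default timeout of 35 seconds follows the "more than 30 seconds" advice. The worker count of 8 is an arbitrary starting point, not an ipipgo recommendation.

```python
from concurrent.futures import ThreadPoolExecutor


def download_concurrently(urls, download_one, get_proxy, max_workers=8, timeout=35):
    """Download all URLs in a thread pool, giving every task its own proxy.

    download_one(url, proxy, timeout) should return True on success.
    get_proxy() is called from worker threads, so it must be thread-safe.
    Returns a dict mapping each URL to True or False.
    """
    def task(url):
        try:
            return bool(download_one(url, get_proxy(), timeout))
        except Exception:
            # A failed download is recorded instead of stopping the whole batch.
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(task, urls))
    return dict(zip(urls, results))
```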

Pitfall Avoidance Table

Pitfall                                   Solution
IP gets blocked too quickly               Increase the IP rotation frequency in the ipipgo dashboard
Images do not load completely             Render the page with Selenium first, then download
Site triggers human verification          Enable ipipgo's datacenter IP filtering
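
For the second pitfall, it helps to notice a broken download before reaching for Selenium. The sketch below is a heuristic only, not something from ipipgo: it checks the standard JPEG and PNG start and end markers, and `looks_complete` is a hypothetical helper name. A blocked request that returns an HTML error page also fails this check.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_TRAILER = b"IEND\xaeB`\x82"  # type and CRC of the final IEND chunk
JPEG_START = b"\xff\xd8"         # start-of-image marker
JPEG_END = b"\xff\xd9"           # end-of-image marker


def looks_complete(data):
    """Heuristic: True if the bytes look like a whole JPEG or PNG, not a truncated one."""
    if data.startswith(PNG_SIGNATURE):
        return data.endswith(PNG_TRAILER)
    if data.startswith(JPEG_START):
        # Tolerate zero padding that some encoders append after the end marker.
        return data.rstrip(b"\x00").endswith(JPEG_END)
    return False
```

Running it on `resp.content` before writing the file lets you retry instead of saving a half image. Other formats such as GIF or WebP would need their own checks.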

Some honest advice

Don't trust free proxies: besides being slow, they may carry Trojans. I have used ipipgo for more than half a year, and the biggest benefit is that you can choose the IP's region. To grab images from a particular region, just pick a node there. They are currently running a promotion: new users get 10G of traffic, and filling in the promo code IMG2024 when you sign up gets you 5G more, enough to download tens of thousands of images.

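One last nag: don't set the timeout too low! Some websites deliberately slow down their responses, so a timeout under 10 seconds easily produces false failures. With ipipgo, set the timeout to 15-20 seconds; the success rate can go up by 30%.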
