
Why do Ruby crawlers need proxy IPs?
Anyone who has done data collection knows that sites' anti-crawling mechanisms keep getting more ruthless. Take a certain major e-commerce site: hit its pages 10 times in a row from the same IP and you immediately get a CAPTCHA pop-up. If you use ipipgo's dynamic residential IPs instead, each request automatically switches to a different exit address, so the server has a hard time telling whether you are a real person or a machine.
Here's a real scenario: we want to monitor price fluctuations across 50 e-commerce platforms. Without a proxy, the IP gets blocked in under half an hour. With a Ruby crawler calling ipipgo's API, each request draws a random IP from pools in different countries, and the collection success rate is pretty much maxed out.
```ruby
require 'net/http'
require 'json'

# Fetch a dynamic proxy from ipipgo (code example)
def fetch_proxy
  api_url = "https://api.ipipgo.com/dynamic?key=YOUR_KEY"
  response = Net::HTTP.get(URI(api_url))
  JSON.parse(response)['proxy']
end

# Use the proxy to access the target website
proxy = fetch_proxy
uri = URI.parse("http://example.com")  # placeholder: replace with the target website
http = Net::HTTP.new(uri.host, uri.port, proxy['ip'], proxy['port'])
http.open_timeout = 10
http.read_timeout = 20

begin
  # request_uri returns "/" when the URL has no path, where uri.path would be empty
  response = http.get(uri.request_uri)
  puts response.body
rescue => e
  puts "Request failed: #{e.message}"
end
```
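The price-monitoring scenario implies pulling a fresh proxy for every request and moving on to another one when a request fails. The following is a minimal sketch of that idea, not ipipgo's own method: `get_with_rotation`, `proxy_source`, and `requester` are hypothetical names, and the proxy is assumed to be a hash with `'ip'` and `'port'` keys, as in the example above.

```ruby
require 'net/http'
require 'uri'

# Fetch `url` through a fresh proxy on each attempt.
# `proxy_source` is any callable returning {'ip' => ..., 'port' => ...}
# (assumed shape). `requester` is injectable so the retry logic can be
# exercised without network access.
def get_with_rotation(url, proxy_source, max_attempts: 3, requester: nil)
  raise ArgumentError, 'max_attempts must be at least 1' if max_attempts < 1

  requester ||= lambda do |uri, proxy|
    http = Net::HTTP.new(uri.host, uri.port, proxy['ip'], proxy['port'].to_i)
    http.use_ssl = (uri.scheme == 'https')
    http.open_timeout = 10
    http.read_timeout = 30
    http.get(uri.request_uri).body
  end

  uri = URI.parse(url)
  last_error = nil
  max_attempts.times do
    proxy = proxy_source.call            # new exit address for this attempt
    begin
      return requester.call(uri, proxy)
    rescue StandardError => e
      last_error = e                     # remember the failure, try the next proxy
    end
  end
  raise last_error
end
```

A call might look like `get_with_rotation("http://example.com/", method(:fetch_proxy))`, which raises the last error only after every attempt has failed.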
How to choose between dynamic and static IPs?
ipipgo has three main packages. Here is how an old hand would choose between the two covered here:
Dynamic Residential (Standard): suited to scenarios that need frequent IP switching, such as bulk registration testing and ad performance testing. It is affordable: at a little over 7 yuan per GB of traffic, it is enough to run a small project.
Static Residential IP: a must for long-term account maintenance, since each IP can be used for a full 30 days. Anyone in cross-border e-commerce knows a store's IP has to stay fixed to avoid triggering risk controls.
| Package type | Applicable scenarios | Price |
|---|---|---|
| Dynamic Residential (Standard) | Short-term data collection | 7.67 yuan/GB |
| Static Residential | Long-term account maintenance | 35 yuan/month/IP |
Proxy IPs in practice: pitfalls to avoid
Three common mistakes newbies make:
1. Timeout set too short: overseas servers can be slow to respond, so set read_timeout to at least 30 seconds.
2. IP reuse: use each dynamic IP no more than 5 times.
3. Forgetting authentication: some proxies require username/password authentication, so remember to pass the credentials in the code.
```ruby
# Proxy setup with authentication
http = Net::HTTP.new(uri.host, uri.port, proxy['ip'], proxy['port'], 'account', 'password')
```
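The "no more than 5 times" rule from pitfall 2 can be enforced with a small counter instead of being tracked by hand. This is one possible sketch: the `RotatingProxy` class and its method names are made up for illustration, and `fetcher` stands for any callable that returns a new proxy, such as a wrapper around `fetch_proxy`.

```ruby
# Hands out a proxy and replaces it after `max_uses` requests.
# `fetcher` is any callable that returns a new proxy (hypothetical
# interface, e.g. method(:fetch_proxy) from the earlier example).
class RotatingProxy
  def initialize(fetcher, max_uses: 5)
    @fetcher = fetcher
    @max_uses = max_uses
    @current = nil
    @uses = 0
  end

  # Returns the proxy to use for the next request, fetching a new one
  # when there is none yet or the current one has reached its limit.
  def next_proxy
    if @current.nil? || @uses >= @max_uses
      @current = @fetcher.call
      @uses = 0
    end
    @uses += 1
    @current
  end

  # Call this after a failed request so the next call fetches a new proxy.
  def discard!
    @current = nil
  end
end
```

Each request would then call `next_proxy` before building its `Net::HTTP` object, and `discard!` in the `rescue` branch.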
Common questions in practice
Q: What should I do if all the proxy IPs suddenly fail?
A: Check whether your API extraction frequency is over the limit. ipipgo's standard package supports 3 queries per second. For high-volume needs, we recommend upgrading to the enterprise package.
Q: The crawler has slowed down. Is the proxy the problem?
A: Use this code to measure proxy latency:
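One way to stay under a per-second query limit is to space out the extraction calls on the client side. The sketch below is an assumption about how that could be done, not part of ipipgo's tooling: the `Throttle` class is hypothetical, and it simply enforces a minimum interval of `1 / per_second` between calls.

```ruby
# Spaces out calls so that at most `per_second` of them start each second.
# `clock` and `sleeper` are injectable so the class can be tested
# without actually waiting.
class Throttle
  def initialize(per_second,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = 1.0 / per_second
    @clock = clock
    @sleeper = sleeper
    @last_call = nil
  end

  # Waits if the previous call was too recent, then runs the block
  # and returns its value.
  def call
    if @last_call
      wait = @min_interval - (@clock.call - @last_call)
      @sleeper.call(wait) if wait > 0
    end
    @last_call = @clock.call
    yield
  end
end
```

With the limit quoted above, usage would be along the lines of `throttle = Throttle.new(3)` followed by `throttle.call { fetch_proxy }` wherever a new proxy is needed.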
```ruby
start_time = Time.now
http.get('/')
puts "Response time: #{Time.now - start_time} seconds"
```
If the latency is over 2 seconds, consider switching to ipipgo's TK line, which is specially optimized for Asian node speeds.
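For repeated measurements, the same timing check can be wrapped in a helper. The sketch below uses Ruby's monotonic clock, which is not affected by system clock adjustments the way `Time.now` is; `timed` and `slow_proxy?` are hypothetical helper names, and the 2-second default mirrors the threshold mentioned above.

```ruby
# Runs the block and returns [block result, elapsed seconds],
# measured with the monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# True when a measured latency exceeds the threshold (in seconds).
def slow_proxy?(seconds, threshold: 2.0)
  seconds > threshold
end
```

For example, `_, elapsed = timed { http.get('/') }` followed by `slow_proxy?(elapsed)` tells you whether the current line is worth replacing.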
Why ipipgo?
Three advantages I have tested firsthand:
1. Full protocol support: the SOCKS5 protocol can carry UDP traffic, which suits scenarios that need to transmit video data.
2. Hassle-free client: their Windows client changes IPs automatically, so a Ruby crawler can simply call the local proxy port.
3. Life-saving service: we had a project that needed a Cambodian IP, and customer service sorted out the customization the same day!
Recently I discovered a hidden feature: adding the API parameter ?format=text returns the proxy directly in ip:port format, which removes the JSON parsing step. This little design detail is really developer-friendly; try it and you will see.
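If the text format is used, the response still has to be split into a host and a port before it can be passed to `Net::HTTP.new`. A minimal sketch, assuming the body is a single `ip:port` line (the exact response shape is not confirmed here, and `parse_proxy_text` is a hypothetical helper name):

```ruby
# Splits an "ip:port" line into the hash shape used in the earlier examples.
# Raises ArgumentError when the text does not look like host:port.
def parse_proxy_text(body)
  host, port = body.strip.split(':', 2)
  if host.to_s.empty? || port.to_s !~ /\A\d+\z/
    raise ArgumentError, "unexpected proxy format: #{body.inspect}"
  end
  { 'ip' => host, 'port' => port.to_i }
end
```

Returning the same `'ip'` and `'port'` keys means the rest of the crawler code does not need to change when switching formats.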

