
What's the difference between these two asynchronous request libraries?
Anyone who writes web crawlers has run into this: the code is clearly fine, yet the site rate-limits you into the ground. That is when an asynchronous request library becomes a lifesaver, and aiohttp and httpx are the two that get compared most often. In plain terms: aiohttp is like a sprinter that specializes in async only; httpx is more of an all-rounder that handles both synchronous and asynchronous requests.
Take a realistic example: suppose you want to batch-check 100 web pages through proxy IPs. With aiohttp you create a ClientSession yourself and reuse it so that its connection pool is shared across requests, while httpx's client API mirrors requests and reuses pooled connections with little setup, which is friendlier to newcomers. That said, aiohttp is noticeably faster and lighter in a purely asynchronous environment, and it saves a lot of memory when handling long-lived connections.
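The batch check described above can be sketched with bounded concurrency. This is only a minimal sketch, not a tuned crawler: `check_all` and `fake_fetch` are hypothetical names, the example.com URLs are placeholders, and `fetch` stands in for whichever real call you use (an aiohttp `session.get` or an httpx `client.get` routed through your proxy).

```python
import asyncio


async def check_all(urls, fetch, limit=5):
    """Run fetch(url) for every URL with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def check_one(url):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception as exc:  # record the failure instead of aborting the batch
                return url, exc

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(check_one(url) for url in urls))


async def fake_fetch(url):
    """Hypothetical stand-in for a real HTTP call; always reports status 200."""
    await asyncio.sleep(0)
    return 200


if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(100)]
    results = asyncio.run(check_all(pages, fake_fetch))
    print(sum(1 for _, status in results if status == 200), "pages OK")
```

Passing the fetch function in keeps the concurrency logic identical for both libraries, so you can swap aiohttp for httpx without touching the batching code.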
Which one is easier to configure with a proxy IP?
Here's the key part. Proxy settings are the biggest headache in data collection. In our testing, the aiohttp proxy configuration needs to be written like this:
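import asyncio
import aiohttp
async def main():
    async with aiohttp.ClientSession() as session:
        # aiohttp takes the proxy per request; target-site and PORT are placeholders
        async with session.get('https://target-site', proxy="http://user:pass@ipipgo-proxy.com:PORT") as resp:
            print(await resp.text())
asyncio.run(main())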
The way httpx is written is closer to the style of requests:
import asyncio
import httpx
async def main():  # httpx releases before 0.26 spell the argument proxies= instead of proxy=
    async with httpx.AsyncClient(proxy="http://user:pass@ipipgo-proxy.com:PORT") as client:
        resp = await client.get("https://target-site")
asyncio.run(main())
One pitfall to be aware of: aiohttp's proxy parameter must include the protocol prefix (http:// or https://), and it is safest to write the full URL for httpx as well. We recommend ipipgo's proxy service here; they provide ready-made authentication templates that you can copy and paste directly, which saves you from splicing strings by hand.
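If you do end up building the proxy URL yourself, the main thing that goes wrong is special characters in the username or password. Here is a small sketch using only the standard library; `build_proxy_url` is a hypothetical helper, and port 8080 is a made-up placeholder rather than a real ipipgo port.

```python
from urllib.parse import quote


def build_proxy_url(user, password, host, port, scheme="http"):
    """Return scheme://user:password@host:port with credentials percent-encoded."""
    # quote(..., safe="") also escapes characters such as "@", ":" and "/"
    # that would otherwise break the URL structure.
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


if __name__ == "__main__":
    # Host taken from the examples above; port 8080 is a made-up placeholder.
    print(build_proxy_url("user", "p@ss:word", "ipipgo-proxy.com", 8080))
```

The scheme is always written out explicitly, so the result can be passed as the proxy argument in either of the snippets above.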
Real-world performance comparison
Let's test with a real scenario (test environment: 100 requests at a concurrency of 5):
| Metric | aiohttp | httpx |
|---|---|---|
| Average response time | 1.2 seconds | 1.5 seconds |
| Memory footprint | 78 MB | 105 MB |
| Exception handling | Manual retry required | Built-in retry mechanism |
See? aiohttp does have the edge in speed, but httpx's built-in auto-retry function is really nice to have. Especially with a high-availability proxy like ipipgo, the retry mechanism can push the success rate above 99%. Memory consumption depends on the situation: for short-lived tasks the gap can be ignored.
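For the "manual retry required" side of the table, a retry wrapper does not need to be long. The sketch below is one possible shape, assuming exponential backoff is acceptable; `with_retries` and `flaky` are hypothetical names. On the httpx side, the retry support we are aware of is the `retries` argument of `httpx.AsyncHTTPTransport`, which covers failed connection attempts rather than bad responses, so a wrapper like this can still be useful there.

```python
import asyncio


async def with_retries(make_request, attempts=3, base_delay=0.5):
    """Await make_request() up to `attempts` times, backing off between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_request()
        except Exception:  # in real code, narrow this to your client's error types
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    calls = {"count": 0}

    async def flaky():
        """Hypothetical request that fails twice before succeeding."""
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("proxy dropped the connection")
        return "ok"

    print(asyncio.run(with_retries(flaky, base_delay=0.01)))
```

`make_request` is a zero-argument callable that returns a fresh coroutine each time, because a coroutine object cannot be awaited twice.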
Which one should I choose?
Some solid suggestions:
- Need ultimate performance: choose aiohttp
- Want to get started quickly: use httpx
- Need to handle synchronous and asynchronous requests together: choose httpx
A real case: I once helped a friend build e-commerce price monitoring that had to crawl both domestic platforms and overseas sites (with ipipgo's global nodes, of course). In the end we used httpx, because it can switch automatically between HTTP/1.1 and HTTP/2. Some sites can only be accessed over HTTP/2, which aiohttp does not yet support.
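To see which protocol a site actually negotiates, httpx exposes it on the response. A minimal sketch, assuming the optional HTTP/2 extra is installed; `fetch_http_version` is a hypothetical helper name and example.com is only a placeholder target.

```python
async def fetch_http_version(url):
    """Return the protocol the server negotiated, e.g. "HTTP/2" or "HTTP/1.1"."""
    # Imported lazily because HTTP/2 support is an optional extra:
    #   pip install "httpx[http2]"
    import httpx

    # http2=True lets the client offer HTTP/2 and fall back to HTTP/1.1.
    async with httpx.AsyncClient(http2=True) as client:
        response = await client.get(url)
        return response.http_version


# Usage (needs network access):
#   import asyncio
#   print(asyncio.run(fetch_http_version("https://example.com")))
```

HTTP/2 is off by default in httpx, so without `http2=True` the client will only speak HTTP/1.1 even to servers that support both.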
Frequently Asked Questions
Q: What should I do if the proxy connection keeps timing out?
A: First use the test interface provided by ipipgo to check whether the proxy is available, then check the timeout parameter settings. Setting the timeout to 15 seconds or more is recommended, especially for high-latency nodes.
Q: How do I configure an HTTPS proxy?
A: Just replace http with https in the proxy address, for example: "https://user:pass@ipipgo-ssl-proxy.com:PORT". Note that some old library versions may not support this, so using the latest version of httpx is recommended.
Q: What if I need to change proxies frequently?
A: Use ipipgo's dynamic proxy service. Their usage-based billing package supports automatic IP switching, so you just set the endpoint in your code and don't have to maintain your own IP pool.
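For the timeout advice in the first answer, both libraries have their own settings (`aiohttp.ClientTimeout(total=15)` passed to the session, or `httpx.Timeout(15.0)` passed to the client). If you want one overall deadline that works the same way for either, a library-agnostic sketch with asyncio looks like this; `fetch_with_timeout`, `slow_proxy`, and `fast_proxy` are hypothetical names.

```python
import asyncio


async def fetch_with_timeout(make_request, seconds=15.0):
    """Await make_request(), returning None if it takes longer than `seconds`."""
    try:
        return await asyncio.wait_for(make_request(), timeout=seconds)
    except asyncio.TimeoutError:
        return None  # treat a slow proxy as a failed check


if __name__ == "__main__":
    async def slow_proxy():
        """Hypothetical request through a high-latency node."""
        await asyncio.sleep(1)
        return "late"

    async def fast_proxy():
        """Hypothetical request that answers immediately."""
        return "ok"

    print(asyncio.run(fetch_with_timeout(slow_proxy, seconds=0.1)))
    print(asyncio.run(fetch_with_timeout(fast_proxy, seconds=0.1)))
```

`asyncio.wait_for` cancels the pending request when the deadline passes, so a stuck proxy does not keep a connection slot occupied.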
Personal advice from the pitfalls I've hit
A few final rants:
- Don't use time.sleep() in asynchronous functions; use asyncio.sleep()
- When proxy authentication fails, check the account balance first (don't laugh, plenty of people forget to renew)
- SSL errors can be worked around by adding the verify=False parameter in httpx (aiohttp uses ssl=False), but in production remember to configure the certificate properly.
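The first point in the list is easy to demonstrate. In this small sketch (all function names are hypothetical), three workers wait concurrently: with asyncio.sleep() the waits overlap, while time.sleep() blocks the event loop so they run one after another.

```python
import asyncio
import time


async def polite_worker(delay):
    await asyncio.sleep(delay)  # yields to the event loop while waiting


async def blocking_worker(delay):
    time.sleep(delay)  # freezes the whole event loop: don't do this


async def timed(worker, count=3, delay=0.1):
    """Run `count` workers concurrently and return the elapsed wall time."""
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"asyncio.sleep: {asyncio.run(timed(polite_worker)):.2f}s")
    print(f"time.sleep:    {asyncio.run(timed(blocking_worker)):.2f}s")
```

The blocking version takes roughly the sum of all the delays, which is exactly the slowdown you see when a stray time.sleep() sneaks into a crawler.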
In short, choose tools according to what the project needs; don't follow the crowd. For small projects aiohttp keeps things lightweight, and for complex business logic httpx is less hassle. For proxy services we highly recommend ipipgo: their Beijing, Shanghai, and Shenzhen nodes all have latency under 50 ms, which is solid for domestic business.

