Priorities below are based on a library analysis considering GitHub stars, activity level, technical feasibility, and user impact.
File: python_proxy_headers/pycurl_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 1,146 |
| Last Active | 2026-01-30 |
| Feasibility | ✅ HIGH |
| Impact | HIGH - Direct libcurl access |
Why High Priority:
- libcurl already supports `CURLOPT_PROXYHEADER` for sending custom headers
- Can capture CONNECT response headers via `CURLOPT_HEADERFUNCTION`
- Foundation for curl_cffi work
Implementation Plan:
```python
# pycurl_proxy.py - Proposed API
class ProxyCurl:
    """PycURL wrapper with proxy header support."""

    def __init__(self, proxy_headers=None):
        self.proxy_headers = proxy_headers or {}
        self._response_proxy_headers = {}

    def get(self, url, proxy=None) -> ProxyResponse:
        """Make GET request with proxy header support."""
        pass

    @property
    def received_proxy_headers(self) -> dict:
        """Headers received from proxy during CONNECT."""
        return self._response_proxy_headers


def request(method, url, proxy=None, proxy_headers=None) -> ProxyResponse:
    """Convenience function for one-off requests."""
    pass
```

Technical Approach:
- Use the `pycurl.PROXYHEADER` option to send custom headers
- Use a `HEADERFUNCTION` callback to capture CONNECT response headers
- Parse headers to separate proxy headers from origin headers
- Expose via a clean API matching the existing python-proxy-headers style
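Since libcurl delivers the CONNECT response headers and the origin response headers through the same `HEADERFUNCTION` callback (unless `CURLOPT_SUPPRESS_CONNECT_HEADERS` is set), the parsing step above amounts to splitting the raw header stream into blank-line-delimited blocks. A minimal sketch of that step, independent of pycurl itself (the function names here are illustrative, not part of any existing API):

```python
def split_header_blocks(raw_lines):
    """Split the raw lines seen by a HEADERFUNCTION callback into
    header blocks. Each block (status line plus headers) ends with an
    empty line; for an HTTPS request through a proxy, the first block
    is the CONNECT response and the last is the origin response."""
    blocks, current = [], []
    for line in raw_lines:
        line = line.strip("\r\n")
        if line:
            current.append(line)
        elif current:  # a blank line terminates the current block
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def headers_to_dict(block):
    """Convert 'Name: value' lines (skipping the status line) to a dict."""
    headers = {}
    for line in block[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers
```

Feeding every callback line through `split_header_blocks` and taking `blocks[0]` would populate `received_proxy_headers`, with the final block holding the origin response headers.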
File: python_proxy_headers/curl_cffi_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 4,873 |
| Last Active | 2026-01-30 |
| Feasibility | |
| Impact | VERY HIGH - Popular anti-bot library |
Why High Priority:
- Very popular for bypassing bot detection
- Uses libcurl which has proxy header capabilities
- Active development means potential upstream contributions
Implementation Plan:
```python
# curl_cffi_proxy.py - Proposed API
from curl_cffi import Session

class ProxySession(Session):
    """curl_cffi Session with proxy header support."""

    def __init__(self, proxy_headers=None, **kwargs):
        super().__init__(**kwargs)
        self._proxy_headers = proxy_headers or {}
        self._last_proxy_response_headers = {}

    def request(self, method, url, **kwargs) -> ProxyResponse:
        """Make request capturing proxy headers."""
        pass

    @property
    def proxy_response_headers(self) -> dict:
        """Headers from last proxy CONNECT response."""
        return self._last_proxy_response_headers


# Convenience functions
def get(url, proxy=None, proxy_headers=None, impersonate=None, **kwargs):
    pass
```

Technical Approach:
- Investigate whether curl_cffi exposes low-level curl options
- If yes: use `CURLOPT_PROXYHEADER` directly
- If no: create a PR to curl_cffi to expose these options
- May need to work with curl_cffi maintainers
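Whichever layer ends up setting the option, `CURLOPT_PROXYHEADER` expects a list of preformatted `"Name: value"` strings rather than a mapping, so both the pycurl and curl_cffi wrappers will need the same small conversion. A sketch of that helper (the name `format_proxy_headers` is illustrative):

```python
def format_proxy_headers(headers):
    """Render a {name: value} mapping as the list of 'Name: value'
    strings that libcurl's CURLOPT_PROXYHEADER option expects."""
    return [f"{name}: {value}" for name, value in (headers or {}).items()]
```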
Upstream Contribution Opportunity:
- File an issue requesting a `proxy_headers` parameter
- Contribute a PR if welcomed
File: python_proxy_headers/cloudscraper_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 6,060 |
| Last Active | 2025-06-10 |
| Feasibility | ✅ HIGH |
| Impact | HIGH - Popular for Cloudflare bypass |
Why High Priority:
- Built on requests - can use our existing adapter
- Popular for accessing protected sites
- Easy integration
Implementation Plan:
```python
# cloudscraper_proxy.py - Proposed API
import cloudscraper
from .requests_adapter import HTTPProxyHeaderAdapter

class ProxyCloudScraper(cloudscraper.CloudScraper):
    """CloudScraper with proxy header support."""

    def __init__(self, proxy_headers=None, **kwargs):
        super().__init__(**kwargs)
        adapter = HTTPProxyHeaderAdapter(proxy_headers=proxy_headers)
        self.mount('https://', adapter)
        self.mount('http://', adapter)


def create_scraper(proxy_headers=None, **kwargs):
    """Create a CloudScraper with proxy header support."""
    return ProxyCloudScraper(proxy_headers=proxy_headers, **kwargs)
```

Technical Approach:
- Subclass `cloudscraper.CloudScraper`
- Mount our `HTTPProxyHeaderAdapter`
- Preserve all cloudscraper functionality
- Simple integration - likely <50 lines of code
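Because cloudscraper sessions are regular `requests.Session` objects underneath, mounting an adapter is the whole integration. For reference, the relevant hook the adapter relies on is the stock `HTTPAdapter.proxy_headers()` method, which requests calls to build the headers sent on the CONNECT request. A minimal stand-in sketch (assumes `requests` is installed; the class name is illustrative, not the actual `HTTPProxyHeaderAdapter` implementation):

```python
from requests.adapters import HTTPAdapter

class MinimalProxyHeaderAdapter(HTTPAdapter):
    """HTTPAdapter that injects extra headers into the proxy CONNECT."""

    def __init__(self, proxy_headers=None, **kwargs):
        self._extra_proxy_headers = dict(proxy_headers or {})
        super().__init__(**kwargs)

    def proxy_headers(self, proxy):
        # Start from the defaults (e.g. Proxy-Authorization derived from
        # credentials embedded in the proxy URL), then merge ours in.
        headers = super().proxy_headers(proxy)
        headers.update(self._extra_proxy_headers)
        return headers
```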
File: python_proxy_headers/autoscraper_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 7,082 |
| Last Active | 2025-06-09 |
| Feasibility | ✅ HIGH |
| Impact | MEDIUM - Niche use case |
Implementation Plan:
```python
# autoscraper_proxy.py
from autoscraper import AutoScraper
from .requests_adapter import ProxySession

class ProxyAutoScraper(AutoScraper):
    """AutoScraper with proxy header support."""

    def __init__(self, proxy_headers=None):
        super().__init__()
        self._proxy_session = ProxySession(proxy_headers=proxy_headers)

    def build(self, url, wanted_list, proxy_headers=None, **kwargs):
        """Build scraper with proxy header support."""
        # Use our ProxySession for requests
        pass
```

File: python_proxy_headers/treq_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 606 |
| Last Active | 2026-01-03 |
| Feasibility | |
| Impact | MEDIUM - Twisted ecosystem |
Implementation Plan:
```python
# treq_proxy.py
from twisted.web.client import Agent, ProxyAgent
from twisted.internet import reactor

class ProxyHeaderAgent(ProxyAgent):
    """Twisted Agent with proxy header support."""

    def __init__(self, proxy_headers=None, **kwargs):
        super().__init__(**kwargs)
        self._proxy_headers = proxy_headers or {}
        # Override connection methods to inject headers
```

Technical Approach:
- Subclass `ProxyAgent`
- Override the `_connect` method to add custom headers
- Capture CONNECT response headers
- More complex due to Twisted's async nature
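At the wire level, whatever Twisted machinery ends up sending the tunnel request, the bytes it must emit are a standard CONNECT request with the extra proxy headers appended before the terminating blank line. A transport-agnostic sketch of that step (the function name is illustrative):

```python
def build_connect_request(host, port, proxy_headers=None):
    """Build the raw CONNECT request bytes for establishing an HTTPS
    tunnel through a proxy, with optional extra proxy headers."""
    lines = [
        f"CONNECT {host}:{port} HTTP/1.1",
        f"Host: {host}:{port}",
    ]
    for name, value in (proxy_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header block ends with an empty line (CRLF CRLF).
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The Twisted-specific work is then wiring these bytes into the agent's connection setup and parsing the proxy's response before TLS negotiation starts.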
File: python_proxy_headers/crawlee_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 7,968 |
| Last Active | 2026-01-30 |
| Feasibility | |
| Impact | MEDIUM - Only HTTP crawler portion |
Implementation Plan:
```python
# crawlee_proxy.py
from crawlee.crawlers import BeautifulSoupCrawler
from .httpx_proxy import HTTPProxyTransport

class ProxyBeautifulSoupCrawler(BeautifulSoupCrawler):
    """Crawler with proxy header support for HTTP requests."""

    def __init__(self, proxy_headers=None, **kwargs):
        # Configure httpx client with our transport
        pass
```

Note: Only applies to BeautifulSoupCrawler, not PlaywrightCrawler
File: python_proxy_headers/requestium_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 1,838 |
| Last Active | 2026-01-26 |
| Feasibility | |
| Impact | LOW - Requests portion only |
Implementation Plan:
```python
# requestium_proxy.py
from requestium import Session
from .requests_adapter import HTTPProxyHeaderAdapter

class ProxySession(Session):
    """Requestium Session with proxy header support."""

    def __init__(self, proxy_headers=None, **kwargs):
        super().__init__(**kwargs)
        adapter = HTTPProxyHeaderAdapter(proxy_headers=proxy_headers)
        self.mount('https://', adapter)
        self.mount('http://', adapter)
```

File: python_proxy_headers/botasaurus_proxy.py
| Metric | Value |
|---|---|
| GitHub Stars | 3,808 |
| Last Active | 2026-01-10 |
| Feasibility | |
| Impact | LOW - Request decorator only |
Implementation Plan:
- Investigate botasaurus's request module internals
- May require monkey-patching or upstream PR
These libraries use browser automation where proxy handling is delegated to the browser engine. Custom proxy header support is not feasible without browser extensions or significant browser-level modifications.
| Library | Stars | Reason for Low Priority |
|---|---|---|
| crawl4ai | 59,235 | Browser-based (Playwright) |
| Scrapegraph-ai | 22,434 | Browser-based (Playwright) |
| playwright-python | 14,209 | Browser handles proxy |
| SeleniumBase | 12,139 | Browser handles proxy |
| Selenium | N/A | Browser handles proxy |
| splash | 4,198 | Qt WebKit-based |
Recommendation: Do not implement extensions for these libraries. Instead, document that proxy header support is not possible due to browser architecture limitations.
- ✅ pycurl extension
- ✅ cloudscraper extension (quick win)
- curl_cffi extension (may require upstream work)
- autoscraper extension
- treq extension
- crawlee-python extension
- requestium extension
- botasaurus extension (if feasible)
```text
python_proxy_headers/
├── __init__.py
├── urllib3_proxy_manager.py   # Existing
├── requests_adapter.py        # Existing
├── httpx_proxy.py             # Existing
├── aiohttp_proxy.py           # Existing
├── pycurl_proxy.py            # NEW - Priority 1
├── curl_cffi_proxy.py         # NEW - Priority 1
├── cloudscraper_proxy.py      # NEW - Priority 1
├── autoscraper_proxy.py       # NEW - Priority 2
├── treq_proxy.py              # NEW - Priority 2
├── crawlee_proxy.py           # NEW - Priority 2
├── requestium_proxy.py        # NEW - Priority 2
└── botasaurus_proxy.py        # NEW - Priority 2
```
For each new extension, add:
- RST doc file in `docs/`
- Entry in `docs/index.rst`
- Usage example in README.md
- Example code in proxy-examples repo
Created: January 30, 2026