-
Notifications
You must be signed in to change notification settings - Fork 427
net-tokio: add fn socks5_connect_outbound
#4305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
👋 Thanks for assigning @tnull as a reviewer! |
a7f498c to
42cbfd4
Compare
| SocketAddress::OnionV2 { .. } => unreachable!(), | ||
| }); | ||
|
|
||
| match addr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Might be a bit cleaner if we do one match addr and then push both the type and the following fields accordingly? If we do that, we could also avoid the unreachable!s below, the early-return if above, and simply return an error for OnionV2?
lightning-net-tokio/src/lib.rs
Outdated
|
|
||
| tcp_stream.write_all(&socks5_request).await.map_err(|_| ())?; | ||
|
|
||
| let mut buffer: Vec<u8> = Vec::with_capacity(SOCKS5_REPLY_MIN_LEN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably immediately re-allocate since the reply should be larger than min reply len in the common case if we read_to_end, no? Should we rather pick a reasonable bufsize for this? Might also make sense to only read the 3 first bytes we look at in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably immediately re-allocate since the reply should be larger than min reply len in the common case if we read_to_end, no?
If we take the common case to be Tor, this reply will always be 10 bytes.
Should we rather pick a reasonable bufsize for this?
I have nonetheless set this to the max possible reply given my reading of the RFC spec.
Might also make sense to only read the 3 first bytes we look at in the first place?
Want to make sure any bytes after the first 3 get discarded. Didn't find a better solution other than just reading them into the buffer and ignoring them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have nonetheless set this to the max possible reply given my reading of the RFC spec.
I'm confused, you're saying SOCKS5_REPLY_MIN_LEN is the maximum reply length? Edit: nevermind only now saw that you actually changed the code to include SOCKS5_REQUEST_REPLY_MAX_LEN.
Want to make sure any bytes after the first 3 get discarded. Didn't find a better solution other than just reading them into the buffer and ignoring them.
Hmm, if we're positive this ~always 10 bytes, why not actually discard 10 bytes (or read the fields and keep discarding until we've seen all expected fields)? FWIW, seems we could def. avoid the allocations at all if we make buffer a small fixed-size byte array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, you're saying SOCKS5_REPLY_MIN_LEN is the maximum reply length?
I changed the length of the buffer we preallocate to the max length of the reply given my reading of the RFC spec.
Hmm, if we're positive this ~always 10 bytes, why not actually discard 10 bytes (or read the fields and keep discarding until we've seen all expected fields)? FWIW, seems we could def. avoid the allocations at all if we make buffer a small fixed-size byte array?
I want to avoid being too tor-specific here. Another socks5 setup could definitely return more bytes in this response, up to 262 SOCKS5_REQUEST_REPLY_MAX_LEN.
Would you rather we use a [0u8; SOCKS5_REQUEST_REPLY_MAX_LEN] to avoid any heap allocations ? We could make it work yes, but seems to me Vec<u8> is good enough.
lightning-net-tokio/src/lib.rs
Outdated
| tcp_stream.write_all(&socks5_request).await.map_err(|_| ())?; | ||
|
|
||
| let mut buffer: Vec<u8> = Vec::with_capacity(SOCKS5_REPLY_MIN_LEN); | ||
| let n_read = tcp_stream.read_to_end(&mut buffer).await.map_err(|_| ())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we read what's available from the stream and then discard the buffer. Are we positive that this would never hold data we actually want to use and bubble up to handlers? As mentioned above, maybe we should just read 3 bytes rather than read_to_end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we positive that this would never hold data we actually want to use and bubble up to handlers?
In the common case where we talk to Tor, any bytes after this will always be set to ATYP IPV4 and then ADDR 0.0.0.0 and PORT 0. None of this information is useful to callers, so I ignore it. Happy to update this in the future should this information be needed by callers in other SOCKS5 setups.
As mentioned above, maybe we should just read 3 bytes rather than read_to_end?
See above, I want to make sure any bytes after the first 3 get properly discarded, and not read by future calls to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the common case where we talk to Tor, any bytes after this will always be set to ATYP IPV4 and then ADDR
0.0.0.0and PORT0. None of this information is useful to callers, so I ignore it. Happy to update this in the future should this information be needed by callers in other SOCKS5 setups.
Hmm, if we expect these fields exactly, why not then read the expected number of bytes only, and error out if there's anything unexpected? I'm still wondering if it would be ensured we'd never be able to read actual payload on this first read after connection establishment. FWIW, that kinda violates TCP's streaming model, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not then read the expected number of bytes only, and error out if there's anything unexpected?
As mentioned above, I want to match the socks5 spec, and not constrain the use case to the tor proxy only.
I'm still wondering if it would be ensured we'd never be able to read actual payload on this first read after connection establishment. FWIW, that kinda violates TCP's streaming model, no?
I don't follow this one, can you clarify ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow this one, can you clarify ?
TCP is a stream-based protocol. If you 'read to end' that only means you read what's currently there to your buffer. That might be the full response, only a part of it, or even the response followed by the some bytes of following packets. See for example Tor's 'optimistic data' mode: given that they allow to pipeline sending requests even when the connection attempt is still inflight could potentially mean that the first response bytes are already queued when we get to read_to_end, so we might end up discarding some valid goodput? It might actually be unlikely, but such races happen, so it would be safer to read only what we need to the buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point thank you ! See the fixup below, I now read the socks5 reply to the exact byte in case the payload follows immediately in the stream.
|
👋 The first review has been submitted! Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4305 +/- ##
==========================================
- Coverage 86.59% 86.51% -0.08%
==========================================
Files 158 158
Lines 102408 102511 +103
Branches 102408 102511 +103
==========================================
+ Hits 88678 88688 +10
- Misses 11309 11406 +97
+ Partials 2421 2417 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
42cbfd4 to
e5f6359
Compare
lightning-net-tokio/src/lib.rs
Outdated
| let mut buffer: Vec<u8> = Vec::with_capacity(SOCKS5_REQUEST_REPLY_MAX_LEN); | ||
| let n_read = tcp_stream.read_to_end(&mut buffer).await.map_err(|_| ())?; | ||
| if n_read < SOCKS5_REQUEST_REPLY_MIN_LEN | ||
| || n_read > SOCKS5_REQUEST_REPLY_MAX_LEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is too strict ? Anything longer than this deviates from the spec, so I lean towards returning Err in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO fine to error if it doesn't adhere to the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also aware we have no tests for this code now, would it be ok to add tests that try connections to actual endpoints on the internet ? ie torproject.org, some famous lightning node onion V3 address...
i have tested socks5_connect in a toy binary and everything works.
Yeah, tests towards some known endpoints on the internet are notoriously flaky (especially in Github CI), however, they are better than no tests! |
lightning-net-tokio/src/lib.rs
Outdated
|
|
||
| tcp_stream.write_all(&socks5_request).await.map_err(|_| ())?; | ||
|
|
||
| let mut buffer: Vec<u8> = Vec::with_capacity(SOCKS5_REQUEST_REPLY_MAX_LEN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in the other comment: now that we have SOCKS5_REQUEST_REPLY_MAX_LEN, coudl we make this a [u8; SOCKS5_REQUEST_REPLY_MAX_LEN]?
Let me see that would require installing C tor in CI right ? Do we want to do this still ? |
Hmm, it would likely not be too bad, though indeed it might have some overhead if run every time. It would be a good candidate for a nightly CI job, IMO. For now it might be fine to add some tests, if they all reuse the same setup? |
| // Constants defined in RFC 1928 | ||
| const VERSION: u8 = 5; | ||
| const NMETHODS: u8 = 1; | ||
| const NO_AUTH: u8 = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While NO_AUTH is fine, we should still support setting USERNAME/PASSWORD fields as Tor uses that to let users establish new circuits. FWIW, we might even want to allow users to manually change username/password during runtime for following connections for that reason.
(see https://spec.torproject.org/socks-extensions.html#extent-of-support)
e9a864d to
8d22deb
Compare
8d22deb to
7fdaeb2
Compare
| let method_selection_request = [VERSION, NMETHODS, selected_auth]; | ||
| let mut tcp_stream = TcpStream::connect(&socks5_proxy_addr).await.map_err(|_| ())?; | ||
| tcp_stream.write_all(&method_selection_request).await.map_err(|_| ())?; | ||
|
|
||
| let mut method_selection_reply = [0u8; METHOD_SELECT_REPLY_LEN]; | ||
| let n_read = tcp_stream.read_exact(&mut method_selection_reply).await.map_err(|_| ())?; | ||
| if n_read != METHOD_SELECT_REPLY_LEN || method_selection_reply != [VERSION, selected_auth] { | ||
| return Err(()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wat? No, just send the CONNECT command and be done with it. We don't need the vast majority of this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Send something like this - https://github.com/rust-bitcoin/corepc/blob/master/bitreq/src/proxy.rs#L104 and then just check the status code response https://github.com/rust-bitcoin/corepc/blob/master/bitreq/src/proxy.rs#L113
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, HTTP CONNECT is gated behind the HTTPTunnelPort setting, itself not enabled by default in C Tor, so we go with SOCKS5.
No description provided.