Add smoltcp TCP transport #2213

Conversation

Contributor

@conectado conectado commented Nov 16, 2021

Implementing feature discussed in #1804

Description

This PR includes:

  • An implementation of the tcp transport protocol using smoltcp.
  • Abstract common behavior between the smoltcp and std implementation into ockam_transport_core.

This PR does NOT include:

  • A portal and inlet/outlet implementation for smoltcp.
  • A no_std implementation of the ockam_transport_smoltcp traits needed to run an example in a no_std device. (Although it does compile in no_std).

Implementation notes

  • ockam_transport_core::tcp includes the extracted behavior from ockam_transport_tcp.
    • worker contains the same workers as the old ockam_transport_tcp crate, but they use the new traits so that they work in no_std with smoltcp.
    • router includes the old definition of router and handle from ockam_transport_tcp, save for the methods used by portals and inlets/outlets; those are added through a newtype pattern in the ockam_transport_tcp crate.
    • traits includes the definition of the behavior needed from tokio's socket to implement the workers (and endpoint_resolver to resolve endpoints with the router). Additionally:
      • io includes a definition of the async io traits, copied from futures, that is compatible with no_std (plus the necessary extensions). This is needed because neither tokio nor futures has a no_std-compatible AsyncRead and AsyncWrite, due to their use of std::io::Error.
  • ockam_transport_smoltcp includes the implementation of the protocol using smoltcp. The surface structure is basically the same as ockam_transport_tcp using a SmolTcpTransport instead of TcpTransport with some differences needed by additional configuration for smoltcp and the new method get_stack to obtain the stack for polling manually.
    • port_provider is needed to abstract the behavior of obtaining a port, which can differ considerably between implementations. (For now I decided to make the PortProvider trait static so that it can be hidden inside smoltcp's EndpointResolver; an implementation can use a static global if it needs to keep track of some state. This way the PortProvider doesn't exist for the normal tcp implementation.)
    • net contains most of the interesting parts of this crate, providing wrappers for smoltcp-related things. (A lot of the implementation was based on embassy.)
      • stack provides a tcp stack.
        • StackFacade provides type safety to make sure the stack is initialized before being used. (Note: I only managed to prevent double initialization by panicking; any idea for preventing it through the type system is welcome!)
      • tcp provides an interface similar to tcp's sockets.
      • timer provides a monotonic clock trait that is needed by the stack (more discussion about this in this PR's comments!).
      • device provides a Device wrapper around smoltcp's Device to explicitly handle the wakers for async. (This module also includes the tuntap implementation.)
  • I tried not to use unsafe unless absolutely needed(only for the tuntap interface), even if it meant having mutexes wrapping other mutexes, so that those can be added in a separate PR.
  • I tried to prevent allocations to make it easier to stop depending on alloc in the future except when the implementation would be much harder or would require unsafe.(I figured it'd be okay requiring alloc since the whole project requires alloc anyways) I tried to leave a todo comment over each allocation.
  • I didn't include an integration test like the one in the tcp implementation because that would require either programmatically creating a tuntap interface or mocking the Device. Leaving this for a future PR.
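
The port_provider note above (a static trait whose implementation keeps its state in a static global) could look roughly like the following. This is only a sketch: the `next_port` method name, the `GlobalPortProvider` type, and the ephemeral port range are assumptions for illustration, not the actual trait from this PR.

```rust
use core::sync::atomic::{AtomicU16, Ordering};

/// Hypothetical shape of a "static" port provider: no `self`, so the
/// transport can name the type without holding an instance of it.
trait PortProvider {
    fn next_port() -> u16;
}

/// First port of the dynamic/ephemeral range (49152..=65535).
const FIRST_PORT: u16 = 49152;
/// Number of ports in that range.
const PORT_COUNT: u16 = 16384;

/// State lives in a static global, as the note above suggests.
static NEXT_OFFSET: AtomicU16 = AtomicU16::new(0);

/// Hypothetical provider that hands out ports round-robin.
struct GlobalPortProvider;

impl PortProvider for GlobalPortProvider {
    fn next_port() -> u16 {
        // fetch_add wraps on overflow; 65536 is a multiple of PORT_COUNT,
        // so the sequence stays consistent across the wrap.
        let offset = NEXT_OFFSET.fetch_add(1, Ordering::Relaxed) % PORT_COUNT;
        FIRST_PORT + offset
    }
}
```

A real implementation would also need to avoid handing out a port that is still in use; this sketch does not track that.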

Test the PR

  • To test that ockam_transport_smoltcp compiles in no_std, even if there is no example doing this:
    cd implementations/rust/ockam/ockam_transport_smoltcp && cargo +nightly check --target thumbv7em-none-eabihf --no-default-features --features="no_std, alloc, pool-32"
  • To test the PR functioning, I added an example with a README on how to run it.

Next steps

  • Add at least one no_std implementation of:
  • Use embedded_nal as suggested here; this would probably let us replace some of the traits in ockam_transport_core.
  • Extract portal and inlet/outlet behavior into ockam_transport_core and add an implementation for smoltcp.
  • Add the integration test mentioned before.
  • Add unsafe sections to prevent wrapping mutexes within mutexes and allocations.

Checks

@thomcc
Contributor

thomcc commented Nov 17, 2021

This looks great so far, please don't hesitate to ask if you have any questions!

@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch from 757e886 to a717755 Compare November 19, 2021 21:13
@mrinalwadhwa
Member

@conectado thank you for continuing to work on this.
Let us know if you need help or have any questions along the way.

@conectado
Contributor Author

@conectado thank you for continuing to work on this. Let us know if you need help or have any questions along the way.

Thanks 😄 For now things are going quite smoothly, I will start working on making it work with no-std. Will let you know if I've any difficulties.

@conectado
Contributor Author

@mrinalwadhwa @thomcc I'm in the middle of changing the implementation to support no-std (starting with a version that needs alloc). When polling the ethernet interface for new packets I need to pass the current Instant, which is the number of milliseconds that have passed since startup. However, the Instant::now method provided by smoltcp only works in std, so I was wondering whether there's already something in the repo for dealing with time in a no-std environment.

If there's not I think there are a couple of options on how to advance with this:

For now I'll try going with the third option(at least to understand the problem better and decide later) but any kind of tip or insight is appreciated as I don't have any experience with this topic 🙏

@thomcc
Contributor

thomcc commented Dec 13, 2021

@antoinevg might have some insight onto the 2nd option, but in general the third option sounds best for now, if possible, since I believe we'd like not to only be able to use this in embedded (even if that is the primary use case).

@conectado
Contributor Author

@antoinevg might have some insight onto the 2nd option, but in general the third option sounds best for now, if possible, since I believe we'd like not to only be able to use this in embedded (even if that is the primary use case).

I think we could use smoltcp's Instant::now when std is on and use one of the other options when that's not the case.

@thomcc
Contributor

thomcc commented Dec 13, 2021

Ah, that would work.

@antoinevg
Contributor

when polling the ethernet interface for new packages I need to pass the current Instant which is the number of milliseconds that passed since startup. However, the method Instant::now provided by smoltcp only works in std

Everyone will tend to have their own way of handling system time for their particular project so I'd probably try to steer away from making that decision for them.

For myself, I usually handle this by defining an atomic integer and then setting a SysTick exception or similar to occur every 1ms to increment it.

See, for example:

https://github.com/antoinevg/nucleo-h7xx/blob/1e3aa9bae284da845c5f7a76c2ce192cf91d91ef/examples/ethernet_hal.rs#L195-L222
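
The atomic-counter approach described above can be sketched as follows. The names `on_tick` and `now_millis` are made up for illustration; on a real target `on_tick` would be called from the SysTick exception handler (or a similar 1 ms timer interrupt), which is not shown here.

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Milliseconds since startup. A 32-bit atomic is used because many
/// embedded targets lack 64-bit atomics; it wraps after roughly 49.7 days.
static MILLIS: AtomicU32 = AtomicU32::new(0);

/// Hypothetical tick hook: call this once per millisecond from the
/// timer interrupt.
fn on_tick() {
    MILLIS.fetch_add(1, Ordering::Relaxed);
}

/// Read the current time in milliseconds since startup, e.g. to build
/// the timestamp that is passed to the network stack when polling.
fn now_millis() -> u32 {
    MILLIS.load(Ordering::Relaxed)
}
```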

I'd be perfectly happy if you just passed smoltcp's timestamp requirement on to the developer.

i.e. requiring developers to pass it in via your Stack poll method as poll_iface(self: &Arc<Self>, cx: &mut Context, timestamp: Instant) or somesuch.

@conectado
Contributor Author

when polling the ethernet interface for new packages I need to pass the current Instant which is the number of milliseconds that passed since startup. However, the method Instant::now provided by smoltcp only works in std

I'd be perfectly happy if you just passed smoltcp's timestamp requirement on to the developer.

i.e. requiring developers to pass it in via your Stack poll method as poll_iface(self: &Arc<Self>, cx: &mut Context, timestamp: Instant) or somesuch.

Since I spawn the task to poll the interface myself I could ask for the developer a struct when starting the Stack (Stack::run(&self, clock: impl Clock)) implementing a simple trait:

type Milliseconds = i64;

trait Clock {
  type Instant: Into<Milliseconds>;
  fn now(&self) -> Self::Instant;
}

Or do you think it is better if I let starting the task for polling to the developer too?
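
To make the proposal concrete, here is a minimal sketch of two implementations of such a trait: one backed by std's monotonic clock and one fixed clock for tests. The trait is repeated so the block is self-contained; `FixedClock`, `millis_since_start`, and the body of `StdClock` are assumptions for illustration and not necessarily what the PR ships.

```rust
use std::time::Instant as StdInstant;

type Milliseconds = i64;

trait Clock {
    type Instant: Into<Milliseconds>;
    fn now(&self) -> Self::Instant;
}

/// std-backed clock: milliseconds elapsed since the clock was created.
struct StdClock {
    start: StdInstant,
}

impl StdClock {
    fn new() -> Self {
        Self { start: StdInstant::now() }
    }
}

impl Clock for StdClock {
    // i64 trivially satisfies `Into<Milliseconds>`.
    type Instant = Milliseconds;

    fn now(&self) -> Milliseconds {
        self.start.elapsed().as_millis() as Milliseconds
    }
}

/// Hypothetical clock that always reports the same time; handy in tests.
struct FixedClock(Milliseconds);

impl Clock for FixedClock {
    type Instant = Milliseconds;

    fn now(&self) -> Milliseconds {
        self.0
    }
}

/// What the polling task would do with the clock before each poll.
fn millis_since_start(clock: &impl Clock) -> Milliseconds {
    clock.now().into()
}
```

On a no_std target the developer would supply their own type, for example one that reads a hardware timer or an interrupt-driven counter.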

@antoinevg
Contributor

Or do you think it is better if I let starting the task for polling to the developer too?

It may also be useful to give the developer the option to do even the polling themselves.

e.g. On embedded it's best to poll smoltcp in the [ethernet] interrupt handler and that can be set up very differently depending on how the app is structured.

In the longer run maybe we should even be looking at something like:

https://crates.io/crates/embedded-nal

All that said, there are so many open questions around async/embedded that I think almost any approach that results in running code is valuable right now because it helps us all understand the space better!

@conectado
Contributor Author

Or do you think it is better if I let starting the task for polling to the developer too?

All that said, there are so many open questions around async/embedded that I think almost any approach that results in running code is valuable right now because it helps us all understand the space better!

Alright! Then, I will advance with my current approach, I think adding the possibility for the developer to poll the interface themselves is easy so I will add that too for now and later on revisit to use embedded-nal if possible.

@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch 2 times, most recently from 80d574a to 123f0a9 Compare December 14, 2021 08:38
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch 2 times, most recently from d41504c to a7afe28 Compare December 17, 2021 08:11
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch from d35610c to b449ea2 Compare December 29, 2021 08:58
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch 2 times, most recently from c6b1cd9 to f2d28c6 Compare January 16, 2022 05:16
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch from 1942f29 to 6e99517 Compare January 23, 2022 21:17
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch 2 times, most recently from b36a3ec to a7804f1 Compare February 12, 2022 18:42
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch 3 times, most recently from f36d0f0 to 6d17559 Compare February 20, 2022 01:56
@conectado conectado marked this pull request as ready for review February 20, 2022 03:02
@conectado conectado changed the title (WIP) Add smoltcp TCP transport Add smoltcp TCP transport Feb 20, 2022
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch from 7948d14 to 8517611 Compare February 20, 2022 23:42
@conectado conectado force-pushed the conectado/tcp-transport-using-smoltcp branch from 8517611 to 4d51a34 Compare February 20, 2022 23:50
* Changes to support using smoltcp as transport protocol in crate `ockam_transport_smoltcp`.
  * Includes async interface.
  * Ability to manually poll.
  * Necessary implementation of interfaces to run in std using tun/tap interface.

* Extract common behavior from smoltcp and std tcp into `ockam_transport_core` in `tcp` module.
  * Necessary unix socket behavior extracted into traits.
  * Worker and Route behavior extracted into common modules.
  * No support for portals and inlet yet.
@mrinalwadhwa
Member

@conectado awesome work! Thank you ❤️
Looking forward to merging this after we get some reviews.

Cargo.lock Outdated
@@ -204,9 +244,9 @@ dependencies = [

[[package]]
name = "autocfg"
-version = "1.1.0"
+version = "1.0.1"
Member

We may need to update the Cargo.lock before merging since this diff bumps several dependency versions down from what is in develop.

Contributor Author

Fixed here!

Contributor

@spacekookie spacekookie left a comment

Hey!

Thank you for all the effort you already put into this PR. I left you some comments with minor things I'd like to see changed in the API.

I haven't read all the code yet (it's quite a lot 😅) but this looks very good so far 🙂

Comment on lines +59 to +60
let tcp = SmolTcpTransport::<ThreadLocalPortProvider>::create(&ctx, configuration, Some(StdClock))
.await?;
Contributor

Is there any way we can hide that generic parameter here? It's good to have the option to use a different port provider mechanism but most users are not going to care and be confused by what exactly this is.

Contributor Author

I think we could either create a type alias:

type TcpTransport = SmolTcpTransport<ThreadLocalPortProvider>;

So most users would use TcpTransport and those who care about the port provider use SmolTcpTransport<P>.

Or we could add a default generic parameter:

struct SmolTcpTransport<P = ThreadLocalPortProvider> { ... }

But that has the problem that users that wanted to use the default would need to do <SmolTcpTransport>::create(...).

However, I think that most users that will use this crate would do it in a no_std context, meaning they will need to provide the port provider. I think that using this in a std context would be mostly done for testing and examples.
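
The difference between the two options can be seen in a small stand-alone sketch. Everything here is a stand-in: the trait method, the `first_port` helper, the argument-free `create`, and the alias name `DefaultSmolTcpTransport` are hypothetical and only mirror the names used in this thread.

```rust
use core::marker::PhantomData;

/// Stand-in for the crate's port provider trait (signature assumed).
trait PortProvider {
    fn next_port() -> u16;
}

struct ThreadLocalPortProvider;

impl PortProvider for ThreadLocalPortProvider {
    fn next_port() -> u16 {
        49152
    }
}

/// Option 2: a default generic parameter on the struct itself.
struct SmolTcpTransport<P = ThreadLocalPortProvider> {
    _provider: PhantomData<P>,
}

impl<P: PortProvider> SmolTcpTransport<P> {
    fn create() -> Self {
        Self { _provider: PhantomData }
    }

    fn first_port(&self) -> u16 {
        P::next_port()
    }
}

/// Option 1: a type alias that fixes the parameter for most users.
type DefaultSmolTcpTransport = SmolTcpTransport<ThreadLocalPortProvider>;

fn demo() -> (u16, u16) {
    // With only the default parameter, `SmolTcpTransport::create()` does not
    // compile ("type annotations needed"): defaults are not used for
    // inference in expression position. The angle-bracket form is required.
    let via_default = <SmolTcpTransport>::create();
    // The alias reads like an ordinary non-generic type.
    let via_alias = DefaultSmolTcpTransport::create();
    (via_default.first_port(), via_alias.first_port())
}
```

Both spellings name the same type; the alias just avoids the unusual `<SmolTcpTransport>::create()` syntax at the call site.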

/// TCP address type constant
pub const TCP: u8 = 1;

pub(crate) const CLUSTER_NAME: &str = "_internals.transport.tcp";
Contributor

Minor note: maybe we want to rename this cluster to smoltcp so that shutdowns between the two tcp stacks (if running in parallel) are independent.

Contributor Author

Done here!

peer_addr: Address,
tcp_stream: R,
cluster_name: &'static str,
Contributor

Is there a reason to store the cluster_name again here?

Contributor Author

It was a vestigial change when I took the cluster_name as a parameter from outside the crate and I needed to store it to use it in the worker's initialize. But since I changed it back to taking the cluster_name from outside I need to keep it.

@thomcc
Copy link
Contributor

thomcc commented Feb 22, 2022

I'll finish reviewing later today, but the length of the lines that have comments is a bit much. I think we try to keep that to around 100 characters.

If you use vscode, I use https://marketplace.visualstudio.com/items?itemName=stkb.rewrap to help with this.

Contributor

@antoinevg antoinevg left a comment

Thank you. This is a stunning PR, really good work.

I can't wait to try this out on embedded!

[license-image]: https://img.shields.io/badge/License-Apache%202.0-green.svg
[license-link]: https://github.com/ockam-network/ockam/blob/HEAD/LICENSE

[discuss-image]: https://img.shields.io/badge/Discuss-Github%20Discussions-ff70b4.svg
[discuss-link]: https://github.com/ockam-network/ockam/discussions
Contributor

Nit: Please restore the formatting here to keep it consistent with the other crates.

Comment on lines +24 to +30
std = [
"ockam_core/std",
"ockam_node/std",
"tokio",
"tokio/io-util",
"tokio/net",
]
Contributor

I agree 100% with you that we should try to reduce code duplication as much as possible but, in this particular case, I'd argue against moving the TCP router/workers etc. into ockam_transport_core.

My reasoning is basically this:

  1. We've now created a dependency in ockam_transport_core on ockam_node and tokio. [1] This could be problematic as we already have an alternate async executor implementation for no_std targets (ockam_executor) and we are also likely to have alternate ockam_node implementations in the near future.
  2. There's likely to be other weirdnesses that come up when we start using ockam_transport_smoltcp in anger on bare-metal embedded devices that would require some changes to router/worker that are exclusive to ockam_transport_smoltcp
  3. It's not a lot of code duplication and it's nice to have it living close to the code that will be using it. We now have multiple transports and I don't think we've yet had a case where we had to make big changes to one of their router/worker implementations that needed to be propagated to the others.

[1] Aside: Check out the (admittedly hackish) way we currently wrap tokio in ockam_node for no_std:

https://github.com/ockam-network/ockam/blob/b50149e4b23a5247922d6622b8ba9815a47c970e/implementations/rust/ockam/ockam_node/src/lib.rs#L23-L27

And:

https://github.com/ockam-network/ockam/blob/b50149e4b23a5247922d6622b8ba9815a47c970e/implementations/rust/ockam/ockam_executor/src/lib.rs#L34-L47

Contributor Author

  1. I think this is not a problem since tokio is an optional dependency that's only included with std enabled. I only use tokio here for the implementations of the AsyncRead and AsyncWrite traits on std. I could completely remove the dependency by using the newtype pattern to do the implementation in ockam_transport_tcp if that's preferable.
  2. I figured that we can start extracting the behavior as the weirdness appears, that way we maximize the shared behavior.
  3. Although it was not a particularly big change, the heartbeat was added while I was working on this, and after resolving the merge conflicts the smoltcp crate picked it up automatically.

All in all I thought it was a good idea, as the behavior they share doesn't seem coincidental, and maybe in the future more protocols could use this worker/router implementation.

Anyway, I understand the points, and the generic implementation along with all the trait bounds does make the code much harder to parse, so if you still think it is better for each crate to have its own worker/router implementation I will go back to that!

Contributor

Valid points all but I do feel like the end result is a bigger increase in overall code complexity than what we're gaining through de-duplication here.

Thoughts @thomcc @SanjoDeundiak @spacekookie ?

// We need alloc here for `async_trait` it's very easy to prevent this by making this clone
// directly inside of `bind` but since `processors` already needs to be an async_trait and
// ockam_core still needs alloc I'll leave this here.
use ockam_core::compat::boxed::Box;
Contributor

👍

@@ -0,0 +1,222 @@
//! Traits based in the `futures::io` crate to support `no_std` compilation.
Contributor

Thank you for these!

I don't think we're likely to get official ones any time soon: rust-lang/wg-async#23

I'm wondering if these shouldn't live in ockam_core::compat? @SanjoDeundiak @thomcc @spacekookie ?

Contributor

See nrc/portable-interoperable#5 (and https://www.ncameron.org/blog/async-read-and-write-traits/) for discussion on "official" equivalents.

But in the mean time, these should be in ockam_core::compat, yeah.

Comment on lines +36 to +44
[main-ockam-crate-link]: https://crates.io/crates/ockam
[crate-image]: https://img.shields.io/crates/v/ockam_transport_tcp.svg
[crate-link]: https://crates.io/crates/ockam_transport_tcp
[docs-image]: https://docs.rs/ockam_transport_tcp/badge.svg
[docs-link]: https://docs.rs/ockam_transport_tcp
[license-image]: https://img.shields.io/badge/License-Apache%202.0-green.svg
[license-link]: https://github.com/ockam-network/ockam/blob/HEAD/LICENSE
[discuss-image]: https://img.shields.io/badge/Discuss-Github%20Discussions-ff70b4.svg
[discuss-link]: https://github.com/ockam-network/ockam/discussions
Contributor

Nit: See previous nit on consistency between crates :-)

@@ -0,0 +1,77 @@
// This is copied verbatim(Some stuff we don't need removed) from [embassy](https://github.com/embassy-rs/embassy/blob/7561fa19348530ce85e2645e0be8801b9b2bbe13/embassy-net/src/packet_pool.rs)
// One advantage of using atomic_pool instead of heapless it doesn't seem to have the same kind of ABA problem.
Contributor

Do you think we should consider using atomic_pool to replace heapless in ockam_core::compat where we can?

Contributor Author

The main reason to use atomic_pool was because embassy was doing that.

I'm not sure what the advantages of heapless's Treiber stack are over atomic_pool's method of tracking the available pool slots with a bitset.

The advantage of atomic_pool's method is clear: it is Sync on all architectures. What I'm not sure about is whether that's because atomic_pool didn't take the same considerations and edge cases into account, or because the implementation is actually sound.

Also, atomic_pool doesn't mention the ABA soundness problem on x64. From a superficial review I concluded that it doesn't seem to have that problem, since it seems to arise from the Treiber stack using a linked list, but I can't confirm that 100%.

I will read more on this and get back to you! (Atomics are really complicated 😵‍💫)

@etorreborre
Member

Hi @conectado, do you plan to resume your work on this PR at some stage? Otherwise, since it has diverged quite a bit from the current develop, I propose to close it, and hopefully someone else will be able to use it to restart the work at some stage.

@etorreborre
Member

Since this branch has diverged we are going to close this PR for now. If anyone wants to resume the work on the smoltcp transport, the branch is available in the ockam repository at conectado/tcp-transport-using-smoltcp.

@mrinalwadhwa
Member

@conectado Thank you for all the time you put into this 🙏
Hopefully someone else will come along and take your great work further from this branch
https://github.com/build-trust/ockam/tree/conectado/tcp-transport-using-smoltcp


6 participants