-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable use of Cyclone DDS security features #123
Conversation
Add utility function to insert security settings to the cyclone QOS object used to create nodes. Include a utility to find security files and properly format their location to use with DDS. Signed-off-by: Sid Faber <[email protected]>
This does not pay attention to the @eboasson, is this something we want to add? |
@rotu because rmw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonderful @SidFaber, thank you! I have only seen one trivial detail in addition to the rather important question concerning the ENABLE_SECURITY
option. Here I see two distinct issues:
- successfully building the code regardless of the version of Cyclone that is used (the RMW layer covers all versions of Cyclone and of ROS2 since May 2019, no need to throw that away); and
- certainty that security settings will never be silently ignored.
The first is an issue because the dds_qset_prop
function does not yet exist in the master
branch of Cyclone DDS. It got added in the security branch and will only show up in master when the support for security gets merged into it. Simply merging this PR now will therefore cause all ROS2 CI builds to fail. My preference is to define a macro DDS_HAS_QOS_PROPERTY_LIST
in Cyclone's public header files if the property list QoS is implemented, and test for that in the RMW layer.
The second has two parts, in my view. One is that rmw_create_node
must fail if enforce_security
is set but DDS_HAS_QOS_PROPERTY_LIST
is not defined.
The other is that a build of Cyclone without support for security needs to refuse the creation of a participant for which some of the DDS security properties have been set. That's straightforward addition to the code in the security branch.
With these changes, I think all bases will be covered. Do you (and @dennis-adlink and @rotu) agree?
README.md
Outdated
@@ -30,6 +30,5 @@ DDS directly instead of via the ROS2 abstraction. | |||
|
|||
## Known limitations | |||
|
|||
Cyclone DDS doesn't yet implement the DDS Security standard, nor does it fully implement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh! The lifespan, deadline and liveliness QoS are fully supported nowadays, there aren't really any known limitations anymore! To be fixed in another PR, obviously.
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
qos, "dds.sec.access.permissions", "permissions.p7s", | ||
security_options); | ||
|
||
dds_qset_prop(qos, "dds.sec.auth.library.path", "libdds_security_auth.so"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "lib" and ".so" affixes will cause trouble on macOS and Windows. If the specified file does not exist, Cyclone also tries the names extended in the platform's native way (so lib/.dylib for macOS and .dll for Windows). So just "dds_security_auth" seems better (and analogously for the other libraries).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in bca0852
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fantastic!
I could get this to run out-of-the-box on bionic! however I had issues on Focal.
As I also had issues with other rmw implementations on that platform I opened a meta ticket at ros2/sros2#180
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
#include "rcutils/get_env.h" | ||
#include "rcutils/logging_macros.h" | ||
#include "rcutils/strdup.h" | ||
#include "rcutils/format_string.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: alphabetical order for includes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in bca0852
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
@@ -666,7 +742,13 @@ extern "C" rmw_node_t * rmw_create_node( | |||
static_cast<void>(domain_id); | |||
const dds_domainid_t did = DDS_DOMAIN_DEFAULT; | |||
#endif | |||
(void) security_options; | |||
|
|||
if (security_options == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative: Use RCUTILS_CHECK_ARGUMENT_FOR_NULL
macro instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with 1ca2269
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
@@ -688,8 +770,18 @@ extern "C" rmw_node_t * rmw_create_node( | |||
#endif | |||
|
|||
dds_qos_t * qos = dds_create_qos(); | |||
if (qos == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same nit here, could use RCUTILS_CHECK_FOR_NULL_WITH_MSG
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikaelarguedas, found that this needs to do a bit of cleanup before so this is still here
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
qos, "dds.sec.access.permissions", "permissions.p7s", | ||
security_options); | ||
|
||
dds_qset_prop(qos, "dds.sec.auth.library.path", "libdds_security_auth.so"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks platfrom specific, maybe there is support for the library basename and cyclone then modifies and look for specific names based on the platform (ala class_loader https://github.com/ros/class_loader/blob/11982ab3efc390608f10f2db1e97edf4037afec5/src/class_loader.cpp#L51-L73)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed with bca0852
Signed-off-by: Sid Faber <[email protected]>
@eboasson do you plan to add this to the security branch of Cyclone? If so I'll update this with some code that should handle the items you pointed out. Nice catch on preventing a silent fail-open scenario! |
Yep, working on it: eclipse-cyclonedds/cyclonedds#448 |
@eboasson, I have updates here ready to land once this is ready, thanks for the improvements |
Add in conditional compile based on ENABLE_SECURITY make flag and Cyclone DDS feature availability. Also addressed review comments. Signed-off-by: Sid Faber <[email protected]>
@eboasson, @mikaelarguedas, thanks for your comments, they should be all addressed in bca0852. Important to note that now node creation fails when security is requested (ROS_SECURITY_ENABLE=true); this change the current fail-open behavior to fail-closed which should be desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @SidFaber! It works fine — also on Dashing, which I checked just to be sure. I'd have merged it if it weren't for 4 details, and I am not even sure I didn't overlook one or two before.
I'd have done GitHub suggestions if it would allow me (sometimes it does, sometimes it doesn't), so I have little choice but to ask you to go over them.
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
@@ -59,6 +61,7 @@ | |||
#include "namespace_prefix.hpp" | |||
|
|||
#include "dds/dds.h" | |||
#include "dds/ddsc/dds_public_qos.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is included automatically by dds/dds.h, and I'm not so sure it will won't change name or location in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in 1ca2269
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
rmw_ret_t configure_qos_for_security( | ||
dds_qos_t * qos, const rmw_node_security_options_t * security_options) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This empty line triggers a cpplint warning:
/Users/erik/ros2_ws/src/ros2/rmw_cyclonedds/rmw_cyclonedds_cpp/src/rmw_node.cpp:698: Redundant blank line at the start of a code block should be deleted. [whitespace/blank_line] [2]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in 1ca2269
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
@@ -688,8 +789,17 @@ extern "C" rmw_node_t * rmw_create_node( | |||
#endif | |||
|
|||
dds_qos_t * qos = dds_create_qos(); | |||
RCUTILS_CHECK_FOR_NULL_WITH_MSG( | |||
security_options, "rmw_create_node: Unable to create qos", return nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this was meant to check qos
rather than security_options
; and if bailing out, a domain will have been instantiated/refcounted already by check_create_domain_locked
and, as noted, its work must be undone using node_gone_from_domain_locked
(see a few lines below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed with 1ca2269, thanks for catching!
|
||
if (security_options->enforce_security) { | ||
if (configure_qos_for_security(qos, security_options) != RMW_RET_OK) { | ||
return nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This, too, can leak the domain reference, and moreover leaks qos
. Isn't C just a lovely language, always being on the lookout for ways to get you? 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't C just a lovely language, always being on the lookout for ways to get you?
Regarding that: such issues can be significantly reduced by making more substantial use of C++ features here. Can you explain why this is essentially C?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, the RMW layer is intended to be in C++. Part of the reason that it is so C-like is that I am much more experienced with C than with C++ and so whenever I try to write C++ it still ends up looking like C. There is also the fact that Cyclone is in C, that this is therefore relying on a C API, and that some of the niceties of C++ (like destructors) are simply not available on these objects.
There is an implementation of a standard C++11 API of DDS waiting for approval by the Eclipse Foundation's IP review process. That interface covers all the standard features but not all the additional tricks. At some point it might be worth rewriting the RMW implementation to use that API, it's only about a handful of lines of code anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the reason that it is so C-like is that I am much more experienced with C than with C++ and so whenever I try to write C++ it still ends up looking like C.
Fair enough, that's a perfectly valid reason.
There is also the fact that Cyclone is in C, that this is therefore relying on a C API, and that some of the niceties of C++ (like destructors) are simply not available on these objects.
In case you're unaware, shared/unique ptrs can be handed custom deleters that might help with that particular issue.
Anyway, I don't want to digress from the PR too much, just found myself wondering about this. Thanks for the info!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you're unaware, shared/unique ptrs can be handed custom deleters that might help with that particular issue.
That sounds like something worth reading up on, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eboasson better yet, define a destructors for your classes. Then you don't need to pass a custom deleter to unique_ptr/shared_ptr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rotu that's not always possible:
[...] this is therefore relying on a C API, and that some of the niceties of C++ (like destructors) are simply not available on these objects.
My suggestion related particularly to that. Another common option is to wrap such C objects in a class responsible for managing it, but the trade off is more code to maintain. It also assumes you're comfortable with those C++ constructs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it's easier to maintain a wrapper class than a deleter, but YMMV. Where there's a matched initializer + deleter pair, I find it easiest to make them into a constructor + destructor. With a factory method + deleter pair, i prefer to use a class method + protected constructor + destructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rmw_cyclonedds_cpp/src/rmw_node.cpp
Outdated
std::string user_data = get_node_user_data(name, namespace_); | ||
dds_qset_userdata(qos, user_data.c_str(), user_data.size()); | ||
|
||
if (security_options->enforce_security) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is incomplete, it sets the security QoS only if enforce_security
is set.
The expected behavior is to:
If the security files can be found: setup the plugins
If they cannot be found:
- if enforce_security is true -> exit with error message
- else -> create a participant without the security plugins instanciated
This is most likely the cause of the failing tests at ros2/system_tests#408
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikaelarguedas this is how the other middleware behaves, but help me understand if this is the behavior we truly want.
The rcl sets security_options->enforce_security to either RMW_SECURITY_ENFORCEMENT_PERMISSIVE or RMW_SECURITY_ENFORCEMENT_ENFORCE. It's always set to RMW_SECURITY_ENFORCEMENT_PERMISSIVE when ROS_SECURITY_ENABLE is anything other than "true"; that is why it's used to completely bypass security qos here.
With what you propose security will be enabled anytime ROS_SECURITY_ROOT_DIRECTORY exists regardless of ROS_SECURITY_ENABLE and ROS_SECURITY_STRATEGY. The rcl only checks for the directory, it (rightfully) does not ensure all six security files exist and are properly formatted. So node creation would fail any time the middleware cannot start with files in the security directory, even when ROS_SECURITY_ENABLE is set to false.
Similarly ROS_SECURITY_ENABLE=false would not disable security for testing. Instead security would have to be disabled by changing ROS_SECURITY_ROOT_DIRECTORY or moving the security directory on disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If don't think the described behavior matches how the other middlewares behave, or the desired behavior.
Overall the expected behavior is the one described at: https://design.ros2.org/articles/ros2_dds_security.html
Putting it in a table for ease of read:
ROS_SECURITY_ENABLE | ROS_SECURITY_STRATEGY | valid set of security files located under ROS_SECURE_ROOT_DIRECTORY | Expected outcome |
---|---|---|---|
false | * | * | No security plugins instanciated, creating non-secure participant |
true | Enforce | Yes | creates secure participant |
true | Enforce | No | fails to create participant |
true | Permissive | Yes | creates secure participant |
true | Permissive | No | creates unsecure participant |
Anything not matching that behavior is a bug IMO.
I agree that the line you are referring to is misleading https://github.com/ros2/rcl/blame/73948da4c50b11d5d133e068b804ce7c2b4f9cb6/rcl/src/rcl/node.c#L316-L317 and in practice serve no purpose and at first glance could be removed completely.
The one that matters in our case is the else branch of that if: https://github.com/ros2/rcl/blame/73948da4c50b11d5d133e068b804ce7c2b4f9cb6/rcl/src/rcl/node.c#L318-L323
If ROS_SECURITY_ENABLE is false: node_security_options.security_root_path is not set and this is the variable that tells you if you should try to instantiate the security plugins or not.
Mimicking the logic of the other RMWs could be a good hint of how to implement it here: https://github.com/ros2/rmw_fastrtps/blob/bd5507b61c7285d6083e514ccd49f00b2945baeb/rmw_fastrtps_shared_cpp/src/rmw_node.cpp#L292
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikaelarguedas, thanks for the discussion. It was simple to switch the logic here, but you'll also see changes to configure_qos_for_security
: all files are checked before any properties are added to qos to enable cyclone dds security.
This is now passing all everything in the test_security
, that's been a great help and I'd like to see those test routines continue to mature!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's awesome.
all files are checked before any properties are added to qos to enable cyclone dds security.
Great improvement, we should consider modifying the other rmw implementations to make sure they do the same
This is now passing all everything in the test_security, that's been a great help and I'd like to see those test routines continue to mature!
🎉
I didnt test the new version as it's conflicting since #106
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SidFaber I gave it a go (with Dashing to verify it supports that, too) and I noticed that it warns
[INFO] [rmw_cyclonedds_cpp]: rmw_create_node: Unable to configure security
whenever it starts when security either isn't supported or couldn't be configured. That is, when security is:
- disabled, regardless of whether security support is included in the build;
- enabled in permissive mode, when security support not included in the build;
- enabled in permissive mode, when security support is included in the build but something is awry with the configuration.
Case (1) seems wrong (there is no expectation of security so also no expectation of getting warnings about it) and case (3) seems like a good thing to me. I am not quite sure what to make of case (2), I'd say it should be a hard fail but perhaps that's not the intent behind permissive mode (I don't quite get that mode anyway). @mikaelarguedas could you comment on that?
Just for completeness: if security is enabled in enforced mode, it only starts when the security configuration is valid and the implementation supports it.
Apart from the warning not being appropriate in all cases, it looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Case (1) seems wrong
👍
case (3) seems like a good thing to me
👍
I am not quite sure what to make of case (2), I'd say it should be a hard fail but perhaps that's not the intent behind permissive mode (I don't quite get that mode anyway)
Originally the "Permissive" mode is to allow someone to launch in a single environment a system with some parts using security and some parts without.
I don't think it's been used much in practice.
In the case of 2, the rmw implementation itself doesnt support security (because not included in the build) so there's no chance of having any part of the system using security. So IMO it should hard fail and provide a clearer error message as of why.
An example is such error message in rmw_fastrtps_cpp https://github.com/ros2/rmw_fastrtps/blob/48b403a211b269ab383e6adfcf89d1b4333b2c39/rmw_fastrtps_shared_cpp/src/participant.cpp#L238-L242
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about I just give an info message "Security is enabled" when the cyclone QOS object gets set up properly? It will normally follow the "found security directory" message that comes from rcl. I feel some messaging is important since the rcl message implies that security is enabled which may not be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I find the rcl message by itself pretty noisy when launching systems with many nodes, I'd rather not add an info message always. I would prefer a warning when users need to know something is not what they requested (case (3)), and nothing (or debug message) otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikaelarguedas, I just removed the info message: there are some tests where you'll get an rcl "found security" message and create a plain text node which is similar to how other middleware behaves. Although not ideal, this is something that would be better solved with logging.
Also remove superfluous include and blank line. Signed-off-by: Sid Faber <[email protected]>
Properly handle downstream effects of ROS_SECURITY_STRATEGY and ROS_SECURITY_ENABLE environment variables through security_options. Improve memory management and make sure to only set security qos properties when all files are sure to exist.
Fix residue from merge conflict in README.md
This has been updated for the recent context changes. @eboasson, I followed the existing style to conditionally include based on rmw versions. fastrtps has diverged from this code now as their latest update does more with contexts, but everything here seems to be working fine. @mikaelarguedas, this passed the current Thanks for your help with this one! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SidFaber thank you very much for these updates. I double-checked the builds on Foxy, as well as the full run of all combinations on Dashing and it all works as expected.
I'm good with the updated warning policy as well, especially now that I have discovered that the rcl
security code is also as lax as it can possibly be. It appears setting ROS_SECURITY_ENABLE to anything other than "true" (exact match) is silently interpreted as "false", and if enabled, "permissive" mode is the result of setting ROS_SECURITY_STRATEGY to anything other than "Enforce". Given the number of people that have been involved in it, I can only presume this to be the intent. With that background, the current warning policy fits nicely.
I'm taking @mikaelarguedas thumbs-up as an approval as well, so I'm merging it. I'm working on getting the changes in contexts/nodes into Cyclone's RMW and that'll allow me to also include your work in that process.
RMW_SET_ERROR_MSG( | ||
"Security was requested but the Cyclone DDS being used does not have security " | ||
"support enabled. Recompile Cyclone DDS with the '-DENABLE_SECURITY=ON' " | ||
"CMake option"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm always getting this error message if I build a ros2
workspace without passing -DENABLE_SECURITY=ON
and run any demo (using cyclone)
I think rmw_cyclonedds_cpp
should be able to automatically detect if security is available or not at build time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this error message should probably only be set if security was specifically requested by the user, but that's only checked in the above branch. @SidFaber what do you think, add an if (security_options->security_root_path != nullptr)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #172 for a possible fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only happen when ROS_SECURITY_STRATEGY=Enforce and ROS_SECURITY_ENABLE=True. The full interplay of environment variables, RCL and RMW_CYCLONE is outlined here, the message should only show in lines 19, 20 and 22.
@ivanpauno, can you confirm that is or is not what you're seeing? This should work with both the current and the security branches of rmw_cyclonedds_cpp. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivanpauno, can you confirm that is or is not what you're seeing? This should work with both the current and the security branches of rmw_cyclonedds_cpp. Thanks!
I'm seeing this error without setting any security related environment variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#175 should fix this up, thanks for pointing this out.
Add utility function to insert security settings to the cyclone QOS object used to create nodes. Include a utility to find security files and properly format their location to use with DDS.
Code testing included running the ROS2 tutorials with encryption enabled, error checking for missing certificates, testing bad signatures on governance and permissions files, invalid or mismatched domain IDs, and invalid encryption keys.