Add doc for roboRIO safety controls #2265

Closed
wants to merge 7 commits into from
1 change: 1 addition & 0 deletions source/docs/software/roborio-info/index.rst
@@ -12,4 +12,5 @@ roboRIO
roborio-ssh
roborio-brownouts
recovering-a-roborio-using-safe-mode
roborio-safety-controls
Additional Help <https://www.ni.com/en-us/innovations/white-papers/15/imaging-the-roborio-and-common-troubleshooting-techniques.html>
19 changes: 19 additions & 0 deletions source/docs/software/roborio-info/roborio-safety-controls.rst
@@ -0,0 +1,19 @@
roboRIO Safety Control System
=============================
There are multiple hardware and software components involved with safety on the roboRIO. Outputs of the roboRIO (e.g. PWMs) are controlled by the FPGA hardware. NetComm is a software daemon that talks to the Driver Station (DS), the FPGA, and the user program. Inside the user process (the team's robot program), a NetComm DLL component talks to the FPGA, CAN, and the NetComm daemon. Finally, there are CAN motor controllers on the CAN bus.

* The FPGA has a system watchdog. This watchdog times out and forces a "disable" of PWM motor outputs if NetComm has not told it within the last 125 ms that an enable packet was received.
* The PWM disable works by sending a single idle pulse to the motor controller at the start of the next 20 ms PWM cycle after the disable condition is set, and following that, stopping output on the PWM signal line.
* The NetComm DLL in the user process will send a disable broadcast message on the CAN network and then stop sending keep-alive CAN messages after the disabled system watchdog signal is read back from the FPGA (this is checked on a 20 ms loop). REV pneumatics and motor controllers will stop immediately upon receipt of the disable broadcast. They also stop if no keep-alive is received for 100 ms (pneumatics) or 220 ms (motor controllers).
* CTRE uses a custom approach that reads the disable indicator on NetComm and stops motors within 100 ms of a disable.
* When NetComm receives a control packet from the DS with enable set to true, it will immediately enable motors (and restart the FPGA watchdog timer).
* A count of watchdog expiration events is sent by NetComm to the DS, so this data is in the DS log.
* NetComm (and the control protocol) does not currently have a mechanism to detect delayed control packets. If it gets a control packet with enable set to true, it will enable the robot and feed the watchdog, even if that packet was sent seconds ago and delayed that long by the network.
* The DS sends a control packet to the robot every 20 ms. This is on a high-priority timed loop. Other loops in the DS, including the joystick and GUI loops, run at lower priority. This means that under poor CPU conditions or rendering delays (e.g. large amounts of console output), the DS can have an internal delay between an input (disable being clicked, a key being hit, or joystick inputs being read) and that change being reflected in the control packets sent to the robot.
* Control (DS->Robot) and status (Robot->DS) packets have an embedded sequence number. The DS uses these to compute round-trip time (RTT) and packet loss. A status packet that's returned is marked as "lost" if the RTT is greater than ~250 ms. This does not mean it was actually missing (no response received). The DS keeps a separate count of truly missing (i.e. no response) packets and disables (starts sending control packets with enable=false) after ~10 drops occur (this works out to ~450 ms, assuming it's 250+10*20).
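
The ~450 ms figure in the last bullet can be checked with a short calculation. This is only an illustrative sketch: the constant and function names are invented for this example and are not part of the DS or NetComm. It uses the same assumption as the bullet, namely that the delay is roughly the 250 ms loss threshold plus 10 drops at the 20 ms send period.

```python
# Approximate values taken from the bullets above; names are hypothetical.
CONTROL_PERIOD_MS = 20        # DS sends a control packet every 20 ms
LOST_RTT_THRESHOLD_MS = 250   # RTT beyond which a status packet counts as "lost"
DROPS_BEFORE_DISABLE = 10     # missing responses before the DS sends enable=false


def ds_disable_delay_ms(rtt_threshold_ms=LOST_RTT_THRESHOLD_MS,
                        drops=DROPS_BEFORE_DISABLE,
                        period_ms=CONTROL_PERIOD_MS):
    """Estimate how long after responses stop the DS starts sending disable."""
    return rtt_threshold_ms + drops * period_ms


print(ds_disable_delay_ms())  # 450
```

Changing any of the three inputs shows how sensitive the estimate is to the approximate thresholds quoted above.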

There are several potential ways robots could continue moving even after the spacebar or the disable GUI button is pressed on the DS:

* High CPU / GUI delays result in the DS continuing to send packets with enable=true for a period of time, until the control packet loop is notified that a disable has occurred.
* There is no upper limit to control lag. As long as packets keep arriving, they may be several seconds delayed from the DS, so a disable command from the DS would take the same amount of time to be reflected in robot operation. Once packets are delayed, all controls, including disable/estop, are delayed. We've all seen delays increase either slowly or quickly: the robot is controllable until it's suddenly much more laggy, or it has been laggy from the start.
* Packet buffering / Wi-Fi retransmits of control packets result in sporadic enable packets making it to the robot after some delay. The watchdog would disable after 125 ms, but each enable packet that arrives re-enables the motors for another 125 ms.
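
The last failure mode can be illustrated with a simplified model of the 125 ms watchdog. This is a sketch under stated assumptions, not the FPGA or NetComm implementation: the function name and the packet arrival times are hypothetical, and the model ignores the 20 ms PWM cycle and the idle pulse sent on disable. It only shows how each late enable packet opens a fresh 125 ms window.

```python
WATCHDOG_TIMEOUT_MS = 125  # FPGA watchdog timeout described earlier


def enabled_intervals(enable_arrivals_ms, timeout_ms=WATCHDOG_TIMEOUT_MS):
    """Return merged (start, end) windows during which outputs stay enabled.

    Each arriving enable packet feeds the watchdog, which keeps outputs
    enabled for timeout_ms after that packet's arrival time.
    """
    intervals = []
    for t in sorted(enable_arrivals_ms):
        end = t + timeout_ms
        if intervals and t <= intervals[-1][1]:
            # Packet arrived before the watchdog expired: extend the window.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            # Watchdog had already expired: this packet re-enables outputs.
            intervals.append((t, end))
    return intervals


# Hypothetical scenario: disable is pressed at t=0, but three buffered
# enable packets still reach the robot at 300, 400, and 900 ms.
late_packets = [300, 400, 900]
for start, end in enabled_intervals(late_packets):
    print(f"outputs enabled from {start} ms to {end} ms")
```

In this made-up scenario the outputs are enabled from 300 ms to 525 ms and again from 900 ms to 1025 ms, even though the operator disabled at t=0.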