procstat: Add a 'compartments' command to list c18n compartments #2276

Merged
merged 3 commits into from
Jan 25, 2025

dpgao
Contributor

@dpgao dpgao commented Dec 22, 2024

This rebases #2272 to dev and adds more features (incl. a generation counter for the compartment array to deal with races).

@dpgao dpgao requested review from rwatson and jrtc27 December 22, 2024 04:00
sys/cheri/c18n.h Outdated
struct cheri_c18n_compart {
ssize_t ccc_id;
char ccc_name[CHERI_C18N_COMPART_MAXNAME];
char _ccc_pad[64]; /* Shrink as new fields added above. */
Member

I wonder if we should preemptively define a 'ccc_flags' field to capture concepts like "this is a non-default sub-library compartment", "this compartment can perform system calls", and similar?

Contributor Author

I think we can always add them in the future when they become needed. Right now RTLD doesn't track such information about compartments, so it is risky to add the flags prematurely.
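
As an illustration of why deferring the field is cheap, here is a minimal sketch of how the reserved padding lets a field be added later without changing the record size. The `ccc_flags` field, its type, and the struct names `compart_v1` and `compart_v2` are hypothetical; only `ccc_id`, `ccc_name`, `_ccc_pad`, and the value 56 for `CHERI_C18N_COMPART_MAXNAME` are taken from the diff under review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>	/* ssize_t */

#define CHERI_C18N_COMPART_MAXNAME 56	/* value quoted in sys/cheri/c18n.h */

/* Layout as quoted in the diff. */
struct compart_v1 {
	ssize_t	ccc_id;
	char	ccc_name[CHERI_C18N_COMPART_MAXNAME];
	char	_ccc_pad[64];	/* Shrink as new fields added above. */
};

/* Hypothetical later revision: ccc_flags is NOT part of this PR. */
struct compart_v2 {
	ssize_t		ccc_id;
	char		ccc_name[CHERI_C18N_COMPART_MAXNAME];
	uint32_t	ccc_flags;
	char		_ccc_pad[64 - sizeof(uint32_t)];
};

/* The record size and the offsets of existing fields must not change. */
_Static_assert(sizeof(struct compart_v1) == sizeof(struct compart_v2),
    "adding a field must not change the record size");
_Static_assert(offsetof(struct compart_v1, ccc_name) ==
    offsetof(struct compart_v2, ccc_name),
    "existing fields must keep their offsets");
```

Because the new field is carved out of the padding, an older consumer that only knows the first layout would still read `ccc_id` and `ccc_name` at the same offsets.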

*/
if (len != sizeof(info) ||
info.version != CHERI_C18N_INFO_VERSION ||
info.comparts_gen % 2 != 0 ||
Member

I wondered whether we want some sort of memory-barrier arrangement to ensure that we get a clean(ish) snapshot. That is, if we see a given generation number, then every store we read from the compartment and string tables should have happened before that generation number was stored, and by the end of the sysctl function we should not have seen any store that post-dates it?

Contributor Author

Every iteration below we also re-read the generation number and check that it hasn’t changed. Presumably this achieves the desired effect?
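
The scheme discussed in this thread (reject an odd generation, then re-check the generation after every read) can be sketched in user space with C11 atomics. This is only an analogue under stated assumptions: the PR's kernel code reads another process's memory, whereas here both sides share one `struct table`, and the names `table`, `table_init`, `table_update`, and `table_snapshot` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NCOMPARTS 4

struct table {
	atomic_uint	gen;		/* odd while an update is in progress */
	atomic_int	ids[NCOMPARTS];	/* stand-in for the compartment array */
};

static void
table_init(struct table *t)
{
	atomic_init(&t->gen, 0);
	for (size_t i = 0; i < NCOMPARTS; i++)
		atomic_init(&t->ids[i], 0);
}

/* Writer: make the generation odd, update, then make it even again. */
static void
table_update(struct table *t, int base)
{
	atomic_fetch_add_explicit(&t->gen, 1, memory_order_relaxed);
	/* Order the odd generation before the data stores below. */
	atomic_thread_fence(memory_order_release);
	for (size_t i = 0; i < NCOMPARTS; i++)
		atomic_store_explicit(&t->ids[i], base + (int)i,
		    memory_order_relaxed);
	/* Release: data stores are visible before the even generation. */
	atomic_fetch_add_explicit(&t->gen, 1, memory_order_release);
}

/* Reader: returns false if the snapshot may be torn; the caller retries. */
static bool
table_snapshot(struct table *t, int out[NCOMPARTS])
{
	unsigned int start;

	start = atomic_load_explicit(&t->gen, memory_order_acquire);
	if (start % 2 != 0)
		return (false);		/* update in progress */
	for (size_t i = 0; i < NCOMPARTS; i++) {
		out[i] = atomic_load_explicit(&t->ids[i],
		    memory_order_relaxed);
		/* Order the data load before the generation re-check. */
		atomic_thread_fence(memory_order_acquire);
		if (atomic_load_explicit(&t->gen, memory_order_relaxed) !=
		    start)
			return (false);	/* table changed under us */
	}
	return (true);
}
```

A reader that gets `false` simply tries again; the re-check after each element is what bounds how stale or torn an accepted snapshot can be.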

return;
}
if ((procstat_opts & PS_OPT_NOHEADER) == 0)
xo_emit("{T:/%5s %-19s %4s %-40s}\n", "PID", "COMM", "CID",
Member

I always feel that 19 characters is quite a wide slot for COMM, which most of the time uses less. Not sure if other procstat/ps modes might give less by default?

Contributor Author

I think 19 is fine (in fact it is too short for the cheribsdtest variants). And the existing c18n and cheri commands both use 19.

Member

I widened to 19 (MAXCOMLEN) for auxv in 64e9f6a. It can be shorter, but it should probably be the rightmost column so it can safely spill if it's not going to be MAXCOMLEN.

Collaborator

The compartment name needs to be last and is likely to be the very long thing.
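
A minimal sketch of the layout point made in this thread: fixed-width columns come first and the unbounded column goes last with no width, so a long value spills to the right without disturbing alignment. It uses `snprintf` rather than libxo's `xo_emit`, and the function name `format_row` and the column order are assumptions for illustration, not the layout the PR ended up with.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one row with fixed-width PID, COMM and CID columns. The
 * compartment name is last and has no width, so a long name extends
 * the line instead of pushing other columns out of alignment.
 */
static int
format_row(char *buf, size_t len, int pid, const char *comm, int cid,
    const char *name)
{
	return (snprintf(buf, len, "%5d %-19s %4d %s", pid, comm, cid,
	    name));
}
```

With this format the name always starts at the same column (offset 31), whatever its length.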

Member

@brooksdavis brooksdavis left a comment

Generally speaking, it would be nice if this were split into kernel, libprocstat, and procstat commits.

@@ -47,6 +53,8 @@
.Nm procstat_getargv ,
.Nm procstat_getauxv ,
.Nm procstat_getenvv ,
.Nm procstat_getc18n ,
Member

Documenting this is good, but it should be a separate commit.

@dpgao dpgao force-pushed the c18n-procstat-comparts branch 2 times, most recently from acd2d25 to dd8c97b Compare January 9, 2025 15:53
sys/cheri/c18n.h Outdated
* The interface provided by the kernel via sysctl for compartmentalization
* monitoring tools such as procstat.
*/
#define CHERI_C18N_COMPART_MAXNAME 56
Collaborator

I worry this should be more like PATH_MAX + NAME_MAX + 2 (separator and terminator).

sys/cheri/c18n.h (resolved)
sys/kern/kern_proc.c (resolved)
if (!cheri_can_access(sptr, CHERI_PERM_LOAD,
(__cheri_addr ptraddr_t)&sptr[n], 1))
return (-1);
readlen = proc_readmem(td, p,
Collaborator

You can likely do a bit better than reading one byte at a time. That is, you can fetch the remaining bytes for the current page up to the limit of len or the remaining length of sptr. You could always read it into a temporary PAGE_SIZE'd buffer and only copy out to buf up to the first \0. That will be significantly more efficient.

Member

Given that even copyinstr() doesn't do this on arm64, and this is used only for debugging/monitoring, I feel like the complexity from that change probably isn't worth it... especially as we'll then probably get the capability bounds checks wrong :-).

Member

But, for an amusing time, you can look at proc_read_string(), which basically does the optimisation but doesn't actually know about strings 😄 .

Collaborator

Unlike copyinstr this is having to go lock the VM map, wire the page, etc. for each byte. copyinstr is just setting pcb_onfault around each byte fetch, so it does seem like there is quite a bit more overhead in this case compared to copyinstr. proc_read_string() doesn't seem to do any bounds checking at all, probably in part because the freebsd64 caller of it doesn't generate proper USER_CAPs but instead uses cheri_fromint. But proc_read_string does seem to be a bit broken.

Given that proc_readmem() returns the bytes read and that we have cheri_bytes_remaining you could do:

     ssize_t readlen;
     size_t n, valid;

     if (len < 1)
        return (EFAULT);
     if (!cheri_can_access(sptr, CHERI_PERM_LOAD, (ptraddr_t)sptr, 1))
        return (EPROT);
     valid = MIN(len - 1, cheri_bytes_remaining(sptr));
     readlen = proc_readmem(td, p, (ptraddr_t)sptr, buf, valid);
     if (readlen <= 0)
        return (EFAULT);
     n = strnlen(buf, valid);
     if (n == valid && valid != len - 1)
        return (EPROT);
     buf[len - 1] = '\0';
     return (0);

(I would also return an error: you don't need the length, and this distinguishes EFAULT from EPROT.)

Contributor Author

Should it be MIN(len, ...) instead of MIN(len - 1, ...)? Otherwise a non-null-terminated string of exactly len - 1 bytes would fail to trigger the last EPROT error because valid == len - 1.

Contributor Author

And the last EPROT check should be if (cheri_bytes_remaining(sptr) <= len && valid == n), I think? Otherwise the above counterexample with a bad string of length len also escapes the check.
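
The bulk-read idea and the corner cases raised in this thread can be exercised in a user-space sketch. Everything here is a stand-in: `memcpy` replaces the single `proc_readmem()` call, the `remaining` parameter replaces `cheri_bytes_remaining(sptr)`, and the return codes `RS_FAULT` and `RS_BOUNDS` stand in for EFAULT and EPROT. The function name `read_bounded_string` is hypothetical, and short reads, which the real call can return, are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

enum read_status {
	RS_OK,		/* buf holds a NUL-terminated (possibly truncated) string */
	RS_FAULT,	/* unusable destination buffer (EFAULT in the thread) */
	RS_BOUNDS	/* no terminator within the readable region (EPROT) */
};

/*
 * Copy a string from a source region of 'remaining' readable bytes into
 * a buffer of 'len' bytes using one bulk read instead of one per byte.
 */
static enum read_status
read_bounded_string(const char *src, size_t remaining, char *buf, size_t len)
{
	size_t n, valid;

	if (len < 1)
		return (RS_FAULT);
	if (remaining < 1)
		return (RS_BOUNDS);
	valid = MIN(len, remaining);
	memcpy(buf, src, valid);	/* stand-in for proc_readmem() */
	n = strnlen(buf, valid);
	/* No NUL found and the readable region is exhausted. */
	if (n == valid && remaining <= len)
		return (RS_BOUNDS);
	/* Truncates only when the buffer, not the region, was too small. */
	buf[len - 1] = '\0';
	return (RS_OK);
}
```

Using `MIN(len, ...)` rather than `MIN(len - 1, ...)` is what lets the final check tell an unterminated source apart from a destination buffer that is merely too small.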

sys/sys/sysctl.h Outdated
@@ -1065,6 +1065,7 @@ TAILQ_HEAD(sysctl_ctx_list, sysctl_ctx_entry);
#define KERN_PROC_REVOKER_STATE 47 /* revoker state */
#define KERN_PROC_REVOKER_EPOCH 48 /* revoker epoch */
#define KERN_PROC_C18N 49 /* compartmentalisation statistics */
Collaborator

Seems like it may be worth renaming this to KERN_PROC_C18N_STATS

Member

Yes, that would be good.

sys/kern/kern_proc.c (resolved)
@@ -399,6 +405,53 @@ procstat_getc18n(struct procstat *procstat, struct kinfo_proc *kp,
return (-1);
}

int
Collaborator

This should not require the caller to allocate an array and hope for the best. :) It should return an allocated array of objects that the caller can free. You can then call the sysctl twice, once to get the size estimate and a second time to populate it. This requires fixing the sysctl to add the optimized path for querying the size I mentioned.
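
The two-call pattern requested here (probe for the size, allocate, then fetch) might look roughly like the sketch below. It does not call the real sysctl: `fake_query` is a hypothetical stand-in that mimics the convention of reporting the required length when the output pointer is NULL and failing with ENOMEM when the buffer is too small. The `struct compart` layout, the table contents, and `get_comparts` are likewise invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct compart {
	int	id;
	char	name[24];
};

/* Hypothetical data standing in for what the kernel would report. */
static const struct compart fake_table[] = {
	{ 0, "compart0" }, { 1, "compart1" }, { 2, "compart2" },
};

/* Stand-in for the sysctl: size probe when oldp is NULL, else copy out. */
static int
fake_query(void *oldp, size_t *oldlenp)
{
	size_t need = sizeof(fake_table);

	if (oldp == NULL) {
		*oldlenp = need;
		return (0);
	}
	if (*oldlenp < need) {
		errno = ENOMEM;
		return (-1);
	}
	memcpy(oldp, fake_table, need);
	*oldlenp = need;
	return (0);
}

/* Returns a malloc'd array the caller frees; *countp gets its length. */
static struct compart *
get_comparts(size_t *countp)
{
	struct compart *buf;
	size_t len;

	for (;;) {
		if (fake_query(NULL, &len) != 0)	/* 1st call: size */
			return (NULL);
		buf = calloc(1, len > 0 ? len : 1);
		if (buf == NULL)
			return (NULL);
		if (fake_query(buf, &len) == 0)		/* 2nd call: data */
			break;
		free(buf);
		if (errno != ENOMEM)			/* grew: retry */
			return (NULL);
	}
	*countp = len / sizeof(*buf);
	return (buf);
}
```

The retry loop covers the case where the table grows between the size probe and the fetch, so the caller never has to guess a maximum count.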


#include "procstat.h"

#define C18N_MAX_COMPARTS 1024 /* Horrible but functional, for now. */
Collaborator

Then this hack can go away as it should.

@dpgao
Contributor Author

dpgao commented Jan 9, 2025

@bsdjhb Would be curious to hear your thoughts about the generation counter as a fix for the race condition. Does this seem sound to you?

@bsdjhb
Collaborator

bsdjhb commented Jan 9, 2025

> @bsdjhb Would be curious to hear your thoughts about the generation counter as a fix for the race condition. Does this seem sound to you?

Yes, that pattern is used elsewhere for the same trick.

@dpgao dpgao force-pushed the c18n-procstat-comparts branch from 9a24af1 to 335aa7e Compare January 22, 2025 16:35
@dpgao
Contributor Author

dpgao commented Jan 22, 2025

@bsdjhb I've just pushed a fix for the build failure caused by a missing __cheri_addr.

sys/kern/kern_proc.c Outdated (resolved)
lib/libprocstat/libprocstat.c Outdated (resolved)
@dpgao dpgao force-pushed the c18n-procstat-comparts branch from 335aa7e to 5b0a59f Compare January 23, 2025 16:20
@dpgao dpgao requested a review from bsdjhb January 23, 2025 17:03
}

/* Unpack elements of the input buffer into the output buffer. */
outbuf = malloc(n * sizeof(*outbuf));
Collaborator

My one suggestion here would be to use calloc() so that the extra padding between structures is zeroed. However, that can be fixed up as a later commit after merging.

@bsdjhb bsdjhb merged commit cf628ae into dev Jan 25, 2025
29 checks passed
@bsdjhb bsdjhb deleted the c18n-procstat-comparts branch January 25, 2025 15:53