Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scan for incorrect permissions #29

Merged
merged 5 commits into from
Aug 10, 2022
Merged

Scan for incorrect permissions #29

merged 5 commits into from
Aug 10, 2022

Conversation

micaeljtoliveira
Copy link
Contributor

@micaeljtoliveira micaeljtoliveira commented Jul 14, 2022

Adds script to find all the files and directories with incorrect permissions. This runs alongside the update of the database and marks the build as unstable if any incorrect permissions are found. It will also send an email to the owners of the affected files.

There is a currently a Jenkins pipeline to test this: https://accessdev.nci.org.au/jenkins/job/COSIMA/job/Test/

Closes #27

@micaeljtoliveira micaeljtoliveira marked this pull request as draft July 14, 2022 05:08
@micaeljtoliveira
Copy link
Contributor Author

I'm still unsure about what to do when we find incorrect permissions. Mark the pipeline as unstable or failed? Send an email to some users? Something else?

@micaeljtoliveira
Copy link
Contributor Author

The pipeline could also be combined with the one that updates the database if we decide that the stage executed here marks the build as unstable if incorrect permissions are found.

@aidanheerdegen
Copy link
Contributor

Sending emails to users sounds like a decent option for an action. [email protected] will be forwarded to their contact email. It has to be done from Jenkins as the mailer on NCI is disabled.

Does marking unstable mean the database update continues, just has a different symbol and issues a warning? In which case that would be ok. I don't think there is any point stopping updating the database if there are just some files with incorrect permissions/group.

@micaeljtoliveira
Copy link
Contributor Author

Does marking unstable mean the database update continues, just has a different symbol and issues a warning?

Basically yes.

@aidanheerdegen
Copy link
Contributor

I like the experiment_dirs thing. Are you planning in incorporating/sharing this across both the DB update task and this one as well?

I wonder if now is the right time to think about something like this

#18 (comment)

using the presence of a metadata.yaml file to search for experiment root directories? It would make the indexing logic a lot simpler, but I have no idea how many current experiments have no metadata.yaml file.

@micaeljtoliveira
Copy link
Contributor Author

I like the experiment_dirs thing. Are you planning in incorporating/sharing this across both the DB update task and this one as well?

Yes, that's the idea. As long as we are hard-coding the paths, it is safer if the information about the directories is in a single place.

I also like the suggestion in #18. At which level should that be implemented: the bash script or in the cookbook?

@aidanheerdegen
Copy link
Contributor

It depends on what you want to use it for. If the presence of a metadata.yaml file is a signifier that this is the root of an experiment directory then you'd want to make use of that in the bash script to find experiments, and so make the logic of find experiments a bit less weird with all those byzantine find commands (that I wrote in case anyone thinks I am being mean to @micaeljtoliveira).

The downside is you can't look for a missing metadata.yaml file to warn that it is missing, the experiment just doesn't show up in the index. Not a massive deal, but worth pointing out.

The other sticking point: the backup script for the payu (git) control directory means there are a heap of metadata.yaml files in git-runlog subdirectories, e.g.

/g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle4_jra55v150_extension/git-runlog/metadata.yaml

So it isn't sufficient to simply do

find /g/data/ik11 -name "metadata.yaml" -exec dirname {} \;

that list would have to be filtered for any subdirectories of other experiment directories.

This works:

find /g/data/ik11 -name "metadata.yaml" -not -path "*git-runlog*" -exec dirname {} \;

but is not completely general.

Walking the directory tree works better conceptually, stop as soon as you hit a metadata.yaml file, but the find command is good enough to begin with I guess.

@aidanheerdegen
Copy link
Contributor

Just to be clear, you would continue to want to explicitly do this find in each of the subdirs in ik11 because of the existence of things like /g/data/ik11/tmp

$ find /g/data/ik11/{inputs,observations,outputs} -name "metadata.yaml" -not -path "*git-runlog*" -exec dirname {} \; 2>/dev/null
/g/data/ik11/inputs/JRA-55/RYF/indexing/JRA55-RYF-1-4                                            
/g/data/ik11/inputs/JRA-55/MRI-JRA55-do/MRI-JRA55-do-1-4-0                                       
/g/data/ik11/observations/woa18                                                                  
/g/data/ik11/outputs/mom6-om4-025/OM4_025.JRA_RYF                                                
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip_straits_topo5_cycle1                   
/g/data/ik11/outputs/access-om2-025/025deg_jra55_ryf9091_gadi                                    
/g/data/ik11/outputs/access-om2-025/025deg_jra55_ryf9091_bgc                                     
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip_straits_topo2_cycle1
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle4 
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip_straits_topo4_cycle1
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle6 
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle2 
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle3 
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip_straits_topo3_cycle1
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip_straits_topo_cycle1
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle1
/g/data/ik11/outputs/access-om2-025/025deg_jra55_ryf9091_gadi_norediGM
/g/data/ik11/outputs/access-om2-025/025deg_jra55_ryf9091_gadi_noGM
/g/data/ik11/outputs/access-om2-025/025deg_jra55_iaf_omip2_cycle5 
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle41
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_v2.0.0rc3          
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle32
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle11
/g/data/ik11/outputs/access-om2/1deg_iamip2_EC-Earth3ssp126       
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_v2.0.0rc3_nonuniform_albedo
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle34
/g/data/ik11/outputs/access-om2/1deg_jra55_ryf_iamip2_sens_mp
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle2
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle31
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2spunup_cycle10
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2_cycle2       
/g/data/ik11/outputs/access-om2/1deg_jra55_iaf_omip2_cycle3      
.
.
.

@aidanheerdegen
Copy link
Contributor

aidanheerdegen commented Jul 14, 2022

I am happy to approve this if you want to leave as-is, or can wait if you think there is value in making any of the changes above.

Just noticed it is still a draft, I am jumping the gun a bit!

@micaeljtoliveira
Copy link
Contributor Author

@aidanheerdegen Thanks for the ideas and comments. Probably the best is to get this PR and #25 merged first, as that's more important in the short term. I plan on finishing both PR's tomorrow, so it would be good if you could have a look and approve them tomorrow. I just need to make a few changes beforehand.

@aidanheerdegen
Copy link
Contributor

Let me know when you're ready for the review

@micaeljtoliveira
Copy link
Contributor Author

@aidanheerdegen Thanks, but this PR still needs some more changes (working on it right now). In the meantime, #25 is actually ready, in case you want to look at it.

@micaeljtoliveira micaeljtoliveira marked this pull request as ready for review July 26, 2022 23:18
@micaeljtoliveira
Copy link
Contributor Author

@aidanheerdegen @aekiss This PR is now ready to be reviewed. I've tested this quite extensively, but it would be good to have a second and third pair of eyes going through it.

Currently there are a few files with incorrect permissions. They all belong to the user pas561. Do you know who that is? Maybe we should contact him/her before merging this?

@aidanheerdegen
Copy link
Contributor

Hi @micaeljtoliveira. pas561 is Paul Spence. You can look up users in the grafana dashboard

https://accessdev.nci.org.au/grafana/org/users

but that might only show users who have logged into grafana. I made a user lookup dashboard, but you might not be able to access it.

https://accessdev.nci.org.au/grafana/d/kZSiZsmnk/user-lookup?orgId=1

Let me know if you can't.

Jenkinsfile Outdated Show resolved Hide resolved
@micaeljtoliveira
Copy link
Contributor Author

Let me know if you can't.

Sorry, I can't access the dashboard.

@aidanheerdegen
Copy link
Contributor

Let me know if you can't.

Sorry, I can't access the dashboard.

I've changed something, check if you can see it now.

@micaeljtoliveira
Copy link
Contributor Author

Works now. Thanks!

@aidanheerdegen
Copy link
Contributor

It's a bit weird to use. Click on the funnel (filter) icon on any of the columns to begin filtering by name or user id, and then you can select the matching ones and they'll show a full entry.

experiment_dirs Outdated Show resolved Hide resolved
aekiss
aekiss previously approved these changes Aug 9, 2022
Copy link
Contributor

@aekiss aekiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. My only concern is spamming people if emails are sent out every day. Maybe skip sending the email if date +%A isn't Monday?

@micaeljtoliveira
Copy link
Contributor Author

@aekiss I've now changed the script to only send emails on Mondays. I'm now testing the change and will need a new approval on this PR once I confirm that everything is working as expected.

@micaeljtoliveira
Copy link
Contributor Author

@aekiss Looks like this is working as expected and can now be merged.

Copy link
Contributor

@aekiss aekiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@aekiss aekiss merged commit ea40d30 into COSIMA:master Aug 10, 2022
@micaeljtoliveira micaeljtoliveira deleted the permissions_check branch November 16, 2022 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scan for incorrect permissions
3 participants