-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(format): respect max line width #242
Conversation
…/wdl into feat/line-width
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review before the interview starts
|
||
/// An indentation level. | ||
#[derive(Clone, Copy, Debug)] | ||
pub enum Indent { | ||
/// Tabs. | ||
Tabs(NonZeroUsize), | ||
|
||
Tabs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if a user indents with tab
, they can only use one tab. I'm fine with that, but just flagging that to make sure we're OK with that choice.
/// The default maximum line length. | ||
pub const DEFAULT_MAX_LINE_LENGTH: usize = 90; | ||
/// The minimum maximum line length. | ||
pub const MIN_MAX_LINE_LENGTH: usize = 60; | ||
/// The maximum maximum line length. | ||
pub const MAX_MAX_LINE_LENGTH: usize = 240; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested any non-default line lengths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check out these 3 commits on workflows
- stjudecloud/workflows@main...tmp/format-examples
/// A temporary indent. | ||
/// | ||
/// This is added after a [`PostToken::Indent`] during the formatting of | ||
/// command sections. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me from these comments why this is needed. Maybe there is additional documentation below where the command block is formatted, but it may be useful to expand here.
msg = "Cannot use automatic control subsampling (\"chip.ctl_depth_limit\">0 and \"chip.exp_ctl_depth_limit\">0) for " + "multiple controls with mixed endedness (e.g. SE ctl-rep1 and PE ctl-rep2). " + "Automatic control subsampling is enabled by default. " + "Disable automatic control subsampling by explicitly defining the above two parameters as 0 in your input JSON file. " + "You can still use manual control subsamping (\"chip.ctl_subsample_reads\">0) since it is done " + "for individual control's TAG-ALIGN output according to each control's endedness. ", | ||
msg = "Cannot use automatic control subsampling (\"chip.ctl_depth_limit\">0 and \"chip.exp_ctl_depth_limit\">0) for " | ||
+ "multiple controls with mixed endedness (e.g. SE ctl-rep1 and PE ctl-rep2). " | ||
+ "Automatic control subsampling is enabled by default. " + "Disable automatic control subsampling by explicitly defining the above two parameters as 0 in your input JSON file. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it should bring the second +
down to the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would require backtracking, which is certainly possible but I'd say is OOS for this PR. I can open an issue for tracking that feature request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I missed the +
in the middle of the line initially when trying to interpret Clay's comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good and I like the formatting changes, especially for if
expressions.
Just a few comments.
]) | ||
) | ||
Boolean paired_end_ = if !defined(paired_end) && i < length(paired_ends) then paired_ends[ | ||
i] else select_first([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the fence whether I like this idea that (
and [
can split things down to the next line. For instance, here, I think it looks quite strange. I'm thinking that, for these lines that are being broken based on long line length, perhaps that should not be permitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning for a follow up PR that will line break "matching" tokens. So this example might become:
Boolean paired_end_ = if !defined(paired_end) && i < length(paired_ends) then paired_ends[
i
] else select_first([
or something similar. I haven't spent time thinking about the specifics yet, but I agree that the current implementation leaves something to be desired. I can push up a change the removes the open/close tokens from can_be_line_broken
so we can see how it looks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there should be some sort of priority list. Then if a line break is needed, you'd backtrack and find the highest "priority" line break location and break there. Here that's probably before then
.
@@ -1627,7 +1472,8 @@ workflow chip { | |||
} | |||
} | |||
|
|||
Boolean has_input_of_count_signal_track = has_output_of_bam2ta || defined(bam2ta.ta) | |||
Boolean has_input_of_count_signal_track = has_output_of_bam2ta || defined(bam2ta.ta | |||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, this is quite strange from a formatting perspective.
]) | ||
|
||
Boolean has_input_of_align_ctl = i < length(ctl_fastqs_R1) && length(ctl_fastqs_R1[ | ||
i]) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
|
||
Boolean has_input_of_filter_ctl = has_output_of_align_ctl || defined(align_ctl.bam) | ||
Boolean has_input_of_filter_ctl = has_output_of_align_ctl || defined(align_ctl.bam | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here. Sorry for so many comments, just want to call out places that require further examination.
File? blacklist_ = if length(blacklists) > 1 then pool_blacklist.ta_pooled else if length( | ||
blacklists) > 0 then blacklists[0] else blacklist2_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is one of the outstanding TODO items, but if not, we should revisit this case.
else "p.value" | ||
) | ||
String idr_rank_ = if peak_caller_ == "spp" then "signal.value" else if peak_caller_ | ||
== "macs2" then "p.value" else "p.value" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This too
Int num_rep_nodup_bam = if length(nodup_bams) < num_rep_bam then num_rep_bam else length( | ||
nodup_bams) | ||
Int num_rep_ta = if length(tas) < num_rep_nodup_bam then num_rep_nodup_bam else length( | ||
tas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are also cases to revisit.
Int num_ctl_bam = if length(ctl_bams) < num_ctl_fastq then num_ctl_fastq else length( | ||
ctl_bams) | ||
Int num_ctl_nodup_bam = if length(ctl_nodup_bams) < num_ctl_bam then num_ctl_bam else length( | ||
ctl_nodup_bams) | ||
Int num_ctl_ta = if length(ctl_tas) < num_ctl_nodup_bam then num_ctl_nodup_bam else length( | ||
ctl_tas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
Boolean has_input_of_bam2ta_no_dedup = (has_output_of_align || defined(align.bam)) | ||
&& !defined(bam2ta_no_dedup_R1.ta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example (in my opinion) of a good break. It broke on the &&
between clauses which would be my preference.
if (has_input_of_count_signal_track_pooled && enable_count_signal_track_ && num_rep > 1 | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another case where it should probably backtrack and break on &&
.
Int chosen_ctl_ta_id = if has_all_input_of_choose_ctl && !align_only_ then select_first( | ||
[ | ||
choose_ctl.chosen_ctl_ta_ids, | ||
])[i] else -2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, this is a really awkward break.
Array[File?] chosen_ctl_ta_pooled = if !has_all_input_of_choose_ctl || align_only_ | ||
then [] else if chosen_ctl_ta_pooled_subsample > 0 then [ | ||
subsample_ctl_pooled.ta_subsampled, | ||
] else if num_ctl < 2 then [ | ||
ctl_ta_[0], | ||
] else [ | ||
pool_ta_ctl.ta_pooled, | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this indentation correct? It looks like an indent is dropped for the last 5 lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the unformatted source:
Array[File?] chosen_ctl_ta_pooled = if !has_all_input_of_choose_ctl || align_only_ then []
else if chosen_ctl_ta_pooled_subsample > 0 then [ subsample_ctl_pooled.ta_subsampled ]
else if num_ctl < 2 then [ ctl_ta_[0] ]
else [ pool_ta_ctl.ta_pooled ]
@@ -3344,7 +2954,7 @@ task call_peak { | |||
~{"--chrsz " + chrsz} \ | |||
~{"--fraglen " + fraglen} \ | |||
~{"--peak-type " + peak_type} \ | |||
~{"--blacklist " + blacklist} | |||
~{"--blacklist " + blacklist} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have trailing whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's present in the unformatted file. I'm not sure why the prior implementation was stripping it 🤔 but I think it's supposed to be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, but we should probably create a tracking issue for any follow-up tweaking needed to the line breaking output.
Describe the problem or feature in addition to a link to the issues.
Took me a while to crack this problem, but with some help from @adthrasher I think we've finally got the feature working!
Still to do in a follow up PR: we want certain line breaks to be "matched", like if we break on one half of a parenthesis pair we should break on the other half as well. Until that gets implemented, some of the line breaks will look a little awkward.
Before submitting this PR, please make sure:
changes (when appropriate).
CHANGELOG.md
(see"keep a changelog" for more information).