Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify "unignored" result of "test" method #116

Open
mlmcauliffe opened this issue May 1, 2024 · 9 comments
Open

Clarify "unignored" result of "test" method #116

mlmcauliffe opened this issue May 1, 2024 · 9 comments

Comments

@mlmcauliffe
Copy link

Hello,

I'm trying to understand the circumstances under which a path is said to be unignored by the test method. My intuition is that test should only return two falses ({ignored: false, unignored: false}) if the given path does not match any rules in the ignore object. If the path matches at least one rule then exactly one of ignored or unignored should be true, and the one it should be depends on the sense of the last matching rule.

For example, if I have this pair of rules:

foo*
!foobar

then I would expect the following:

1. test("blah")   -> { ignored: false, unignored: false } (0 rules match, so 2 falses)
2. test("fooey")  -> { ignored: true,  unignored: false } (>0 rules match, so exactly 1 true)
3. test("foobar") -> { ignored: false, unignored: true  } (>0 rules match, so exactly 1 true)

So far, my intuition seems correct, as I do in fact receive the expected results for the three test cases above. But when a rule matches one of the parent directories of the path, I don't always get what I expect.

These two examples work as expected:

4. test("blah/x.js")  -> { ignored: false, unignored: false } (0 rules match, so 2 falses)
5. test("fooey/x.js") -> { ignored: true,  unignored: false } (>0 rules match, so exactly 1 true)

But this one doesn't:

6. test("foobar/x.js") -> { ignored: false, unignored: false }  Why is unignored false here?

The unignored=false in case 6 breaks my intuition. The overall result is correct in the sense that foobar/x.js should not be ignored (hence it applied the rules correctly), but since it matches at least one rule, it seems that it shouldn't return two falses. { ignored: false, unignored: true } feels like the correct answer.

I initially thought this was just a bug, and I was going to submit a fix for it (diff below, if it's of interest), but my change caused one of the tests to fail (test/others.js, line 202):

  [
    'test: dir ignored then unignored -> not matched',
    ['foo', '!foo'],
    'foo/bar',
    [false, false]
  ],

This test case contradicts my intuition, because it says that a path that matches at least one rule should actually return two falses. If this test case is correct then I don't know how to understand when to expect unignored to be true. Can you help me understand what the correct behavior is?

In case it is of interest, here is my proposed change:

diff --git a/index.js b/index.js
index c559806..9a9af44 100644
--- a/index.js
+++ b/index.js
@@ -560,12 +560,14 @@ class Ignore {
       slices
     )
 
-    // If the path contains a parent directory, check the parent first
-    return cache[path] = parent.ignored
+    if (parent.ignored) {
       // > It is not possible to re-include a file if a parent directory of
       // >   that file is excluded.
-      ? parent
-      : this._testOne(path, checkUnignored)
+      return cache[path] = parent
+    }
+
+    const curr = this._testOne(path, checkUnignored)
+    return cache[path] = {ignored: curr.ignored, unignored: parent.unignored || curr.unignored}
   }
 
   ignores (path) {
@kaelzhang
Copy link
Owner

kaelzhang commented May 28, 2024

Frankly speaking, you got a good point.

The case 6 encounters the situation that the path is indirectly unignored. However, changing the behavior of the TestResult object will bring a breaking change to the library.

The .test method is often used for some low-level libraries to introduce better caching mechanism, so to treat the TestResult::unignored for direct negative matching is more useful than those indirect cases.

@mlmcauliffe
Copy link
Author

Thank you for your reply. I can certainly see that the change I proposed would be a breaking change to the library, so I'm not sure exactly what I am wishing for here. My use case involves ignore files in multiple directories of a source tree, where a child directory's rules override rules in parent directories. In order to implement this behavior, I need to know whether a given ignore file has a rule that matches a given path. I haven't found a way to implement this with node-ignore given the issue I described. Do you know of a way?

@kaelzhang
Copy link
Owner

kaelzhang commented May 28, 2024

Since your use case involves file structure traversing, and if a directory is ignored, everything inside it is definitely ignored. And in fact, the child directory could not unignore its parent directory.

So actually, to achieve better performance, you'd better not just list all resources under those directories and Ignore::test() them one by one but traverse down from the root directory.

Which means, actually how the property TestResult::unignored behaves does not matter in your case.

@mlmcauliffe
Copy link
Author

My implementation already works as you describe, but doing so doesn't completely avoid the problem. For example, if I have the following ignore files:

  1. src/.gitignore contains foo
  2. src/subdir/.gitignore contains !foo

then I would expect src/subdir/foo/xyz to be included (not ignored). This is difficult (impossible?) to implement, because the test method of file 2's ignore object returns { ignored: false, unignored: false }, so there is no way to know that file 2 contains a rule that supersedes file 1's rule.

@kaelzhang
Copy link
Owner

kaelzhang commented May 28, 2024

TestResult::unignored is only used for helping implementing caching mechanism in some certain situation, and in your case, checking TestResult::ignored is enough, or use the .ignores() method directly.

To test a path, the path should always be path.relative()d to the dirname of the .gitignore file in advance.

ignore.ignores('subdir/foo/xyz')

// and then
ignore_subdir.ignores('foo/xyz')

And then to solve the problem you met, you might as well implement some similar mechanism like here according to the .gitignore rules

@kaelzhang
Copy link
Owner

kaelzhang commented May 29, 2024

To be honest, similar cases have been asked several times here, I am considering to implement the feature directly in this package.

@mlmcauliffe
Copy link
Author

TestResult::unignored is only used for helping implementing caching mechanism in some certain situation

It would be valuable to document the situations where unignored is set. The existing behavior is not easy to discern.

in your case, checking TestResult::ignored is enough, or use the .ignores() method directly.

I don't think this is true. Consider these two cases:

Case A (the same as in my previous comment):

  1. src/.gitignore contains foo
  2. src/subdir/.gitignore contains !foo

Case B:

  1. src/.gitignore contains foo (same as Case A)
  2. src/subdir/.gitignore is empty

In both cases, I am testing src/subdir/foo/xyz. For Case 1, the final answer should be "not ignored", and for Case 2, the final answer should be "ignored". I make the following pair of calls, but they return the same results for both cases:

  1. ignore1.test('subdir/foo/xyz') returns { ignored: true, unignored: false }
  2. ignore2.test('foo/xyz') returns { ignored: false, unignored: false }

How can I arrive at "not ignored" for Case 1 and "ignored" for Case 2 when the results of the two calls are identical in both cases?

Under the change I described, ignore2.test('foo/xyz') would return unignored: true for Case 1, making it clear that there is a matching rule in file 2 that overrides anything in file 1.

@kaelzhang
Copy link
Owner

kaelzhang commented May 31, 2024

As I said, for your case, some similar mechanism like here needs to be implemented, according to one of the .gitignore rules:

It is not possible to re-include a file if a parent directory of that file is excluded.

@mlmcauliffe
Copy link
Author

It is not possible to re-include a file if a parent directory of that file is excluded.

That rule does not apply to my example, as none of the parent directories of src/subdir/foo/xyz are excluded. src/.gitignore tried to exclude src/subdir/foo, but it was overridden by src/subdir/.gitignore, hence it is included.

If you create a repo with the set of files in my example, you will see that git includes src/subdir/foo/xyz. This is the behavior I am trying to replicate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants