Consistent Tags/BTags with query #1572

Open
wants to merge 1 commit into
base: master
@fab4100 fab4100 commented Sep 28, 2024

The `BTags` command is declared with an optional `[QUERY]`, while the `Tags`
command is declared with an optional `[PREFIX]`.  The query behavior of the
`BTags` source generator is less restrictive than that of `Tags`, whose
source is generated by the external `readtags` command, where tags
are only listed if `PREFIX` is an exact match.  This commit relaxes the
exact prefix match when `readtags` is used and allows a fuzzy prefix,
similar to the source generator of `BTags`.  The `fzf` fuzzy score
ranking in `s:tags_sink` will still rank the best fuzzy match on
top, just as with the exact match in the current implementation
(although this relaxed `Tags [QUERY]` implementation may produce
additional second-order matches that the current `Tags [PREFIX]`
implementation excludes).

The changes in this commit are useful for tags that contain some scope
prefixes like the functions `s:tags_sink` or `fzf#vim#tags` in
vim-script code, in combination with a key mapping of the form

```vim
nnoremap <silent> <leader>l :execute "Tags '" . expand('<cword>')<CR>
nnoremap <silent> <leader>bl :execute "BTags '" . expand('<cword>')<CR>
```

If the cursor is placed on the word "tags" in this example
`fzf#vim#tags`, the relaxed `Tags` query call will list the tags entry
if `<cword>` expands to "tags", while the current implementation will
not.  Inconsistently, the `BTags` call will list the tags entry with its
current implementation.
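
To make the difference concrete, here is a minimal shell sketch. It assumes the relaxed match behaves like "any scope prefix, then the query at the end of the tag name", which is the shape of the `readtags -Q` regex benchmarked later in this thread. It uses plain `awk` on a three-line sample instead of `readtags`. The sample entries, their line-number addresses, the `tags_helper` name, and the function names are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical three-entry tags sample: name<TAB>file<TAB>address.
sample_tags() {
  printf '%s\t%s\t%s\n' \
    'fzf#vim#tags' 'autoload/fzf/vim.vim' '1;"' \
    's:tags_sink'  'autoload/fzf/vim.vim' '2;"' \
    'tags_helper'  'plugin/example.vim'   '3;"'
}

# Tag names that START with the query (exact-prefix behaviour).
prefix_match() {
  awk -F '\t' -v q="$1" 'index($1, q) == 1 { print $1 }'
}

# Tag names that END with the query, whatever scope prefix precedes it
# (assumed shape of the relaxed match).
suffix_match() {
  awk -F '\t' -v q="$1" '{
    start = length($1) - length(q) + 1
    if (start >= 1 && substr($1, start) == q) print $1
  }'
}

sample_tags | prefix_match tags   # lists tags_helper only
sample_tags | suffix_match tags   # lists fzf#vim#tags only
```

With `<cword>` expanding to "tags", only the second function reaches the scoped name `fzf#vim#tags`.
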
@junegunn
Owner

The reason we only allow a prefix is so that readtags can perform a fast binary search over huge tags files. I've tested your patch, but the performance difference is quite noticeable for a tags file of hundreds of MBs.

See #1524

@fab4100
Author

fab4100 commented Sep 29, 2024

Thanks for the referenced issue, I missed that somehow. It would be nice to have a homogeneous interface between Tags and BTags while still getting the benefit of readtags. My proposal above is actually not homogeneous, since the query would first need to be sanitized into a POSIX-compatible extended regex that is passed to readtags as a pre-filter, and the actual user query should be passed as the initial prompt to the fzf call as a post-filter. I will do some more benchmarking to see whether something like that is feasible with readtags as a pre-filter.
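
The sanitizing step could look roughly like the sketch below. It is an assumption-laden illustration, not the patch: `ere_escape` and `prefilter` are hypothetical names, the set of escaped characters is my own choice of POSIX ERE metacharacters, and `grep -E` stands in for the `readtags` pre-filter.

```shell
#!/bin/sh
# Backslash-escape characters that are special in a POSIX extended regex,
# so that the user's query is matched literally.
ere_escape() {
  printf '%s\n' "$1" | sed 's/[][\.|$(){}?+*^]/\\&/g'
}

# Pre-filter: keep tags-file lines (read from stdin) whose tag name starts
# with the literal query. A fuzzy finder would post-filter what is left.
prefilter() {
  grep -E -- "^$(ere_escape "$1")"
}

ere_escape 'a.b*c'   # the dot and the star come back backslash-escaped
```

Plain identifiers such as `fzf#vim#tags` pass through unchanged, so escaping only matters for queries that happen to contain regex syntax.
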

@junegunn
Owner

> It would be nice to have a homogeneous interface between Tags and BTags while still get the benefit of readtags.

I don't think it's possible. You can't avoid scanning through the whole list for non-prefix queries.
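
A rough way to see why: in a sorted list, every name sharing a prefix sits in one contiguous run that can be found by bisection, whereas a substring can occur in any name. The sketch below counts how many names each kind of query has to inspect. It is only an illustration in `awk` (it loads the whole list into memory, so it shows the lookup logic, not the I/O savings of `readtags`), and the function names and generated `tagNNNN` names are hypothetical.

```shell
#!/bin/sh
# Both functions read sorted tag names (one per line) on stdin and print
# how many names they inspected. LC_ALL=C keeps comparisons bytewise.

# Prefix query: bisect to the first name >= query, then walk the matches.
prefix_probes() {
  LC_ALL=C awk -v q="$1" '
    { name[NR] = $0 }
    END {
      lo = 1; hi = NR + 1; probes = 0
      while (lo < hi) {
        mid = int((lo + hi) / 2); probes++
        # Appending "" forces a string (not numeric) comparison.
        if ((name[mid] "") < (q "")) lo = mid + 1; else hi = mid
      }
      # All names sharing the prefix sit next to each other from here on.
      while (lo <= NR && index(name[lo], q) == 1) { probes++; lo++ }
      print probes
    }'
}

# Substring query: a match can sit anywhere, so every name is inspected.
substring_probes() {
  awk -v q="$1" '{ probes++; if (index($0, q) > 0) hits++ }
                 END { print probes + 0 }'
}

# 1024 sorted, made-up names: tag0001 .. tag1024.
gen_names() {
  awk 'BEGIN { for (i = 1; i <= 1024; i++) printf "tag%04d\n", i }'
}

gen_names | prefix_probes tag0512   # roughly log2(1024) probes plus matches
gen_names | substring_probes 0512   # all 1024 names
```
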

If performance is not a concern for you, you can redefine the Tags command like so in your configuration file.

command! -bang -nargs=* Tags call fzf#vim#tags('', fzf#vim#with_preview({ "options": ['--query', <q-args>], "placeholder": "--tag {2}:{-1}:{3..}" }), <bang>0)

Original definition:

\'command! -bang -nargs=* Tags call fzf#vim#tags(<q-args>, fzf#vim#with_preview({ "placeholder": "--tag {2}:{-1}:{3..}" }), <bang>0)',

@fab4100
Author

fab4100 commented Sep 29, 2024

I believe you are correct. Using readtags with a regex as a pre-filter is not viable from a performance point of view. For people who already use rg and are willing to sacrifice some performance, it may be an option to use rg as the source filter in fzf#vim#tags(query, ...).

Here are a few test results (the first test result corresponds to the implementation I use in the commit above):

Linux kernel tags file:

du tags = 1.2G  tags

query = 'munmap'

GOLD: readtags -t tags -e -p - "${query}"
real    0m0.001s
user    0m0.000s
sys     0m0.001s
Matches:  6

TEST: readtags -t tags -e -Q "(#/^[^[:space:]]*${query}\$/ \$name)" -l
real    0m9.632s
user    0m9.554s
sys     0m0.070s
Matches:  44

TEST: readtags -t tags -e -Q "(#/^${query}/ \$name)" -l
real    0m4.300s
user    0m4.238s
sys     0m0.060s
Matches:  6

TEST: rg ^${query} tags
real    0m0.218s
user    0m0.162s
sys     0m0.056s
Matches:  7

TEST: rg ${query} tags
real    0m0.215s
user    0m0.158s
sys     0m0.056s
Matches:  85

TEST: grep ^${query} tags
real    0m0.440s
user    0m0.360s
sys     0m0.080s
Matches:  6

Some other large project (tags file one order of magnitude smaller than the Linux kernel's):

du tags = 82M   tags

query = 'BulkData'

GOLD: readtags -t tags -e -p - "${query}"
real    0m0.001s
user    0m0.001s
sys     0m0.000s
Matches:  31

TEST: readtags -t tags -e -Q "(#/^[^[:space:]]*${query}\$/ \$name)" -l
real    0m0.417s
user    0m0.414s
sys     0m0.003s
Matches:  49

TEST: readtags -t tags -e -Q "(#/^${query}/ \$name)" -l
real    0m0.289s
user    0m0.289s
sys     0m0.000s
Matches:  31

TEST: rg ^${query} tags
real    0m0.012s
user    0m0.006s
sys     0m0.006s
Matches:  31

TEST: rg ${query} tags
real    0m0.012s
user    0m0.000s
sys     0m0.012s
Matches:  4112

TEST: grep ^${query} tags
real    0m0.046s
user    0m0.046s
sys     0m0.000s
Matches:  31

If I understand your suggestion about redefining the Tags command correctly, it would result in calling tags.pl with an empty query. For a tags file the size of the Linux kernel's, this would result in:

TEST: perl ../bin/tags.pl "" tags | rg ^${query}
real    0m8.072s
user    0m7.957s
sys     0m0.659s
Matches:  7

whereas using rg as a source filter in fzf#vim#tags(query, ...) would result in something on the order of

TEST: rg ^${query} tags
real    0m0.218s
user    0m0.162s
sys     0m0.056s
Matches:  7

which I could accept as a performance trade-off in exchange for a homogeneous Tags/BTags interface (unlike the perl solution above).

From your perspective, would it be possible to support a custom 'source' filter in fzf#vim#tags(query, ...), so that I don't have to copy fzf#vim#tags(query, ...) into my personal config just to change that one line?
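
For illustration, such a source filter could be as small as the shell function below. This is a sketch under assumptions: `prefix_source` is a hypothetical name, the query is used as a regex as-is (so it assumes plain identifiers without regex metacharacters), and how the output would be wired into fzf#vim#tags is exactly the open question.

```shell
#!/bin/sh
# Print tags-file lines whose tag name starts with the query, using rg when
# it is installed and falling back to grep otherwise.
prefix_source() {
  query=$1
  tags_file=$2
  if command -v rg >/dev/null 2>&1; then
    # Suppress line numbers and file names so that the output has the
    # same shape as grep's: the raw tags-file lines.
    rg --no-line-number --no-filename -- "^${query}" "$tags_file"
  else
    grep -- "^${query}" "$tags_file"
  fi
}
```

A call would look like `prefix_source munmap tags`, mirroring the `rg ^${query} tags` benchmark above.
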
