Consistent Tags/BTags with query #1572

fab4100 · 2024-09-28T15:55:11Z

The `BTags` command is declared with optional `[QUERY]` while the `Tags` command is declared with optional `[PREFIX]`. The query behavior in `BTags` source generator is less restrictive compared to `Tags` whose source is generated using the `readtags` external command, where tags are only listed if `PREFIX` is an exact match. This commit relaxes the exact prefix match when `readtags` is used to allow for a fuzzy prefix similar to the source generator of `BTags`. The `fzf` fuzzy score ranking in the `s:tags_sink` will still rank the best fuzzy match on top, same as if the match was exact like in the current implementation (although additional second order matches may be produced by this relaxed `Tags [QUERY]` implementation which are excluded by the current `Tags [PREFIX]` implementation).

The changes in this commit are useful for tags that contain some scope prefixes like the functions `s:tags_sink` or `fzf#vim#tags` in vim-script code, in combination with a key mapping of the form

```vim
nnoremap <silent> <leader>l :execute "Tags '" . expand('<cword>')<CR>
nnoremap <silent> <leader>bl :execute "BTags '" . expand('<cword>')<CR>
```

If the cursor is placed on the word "tags" in this example `fzf#vim#tags`, the relaxed `Tags` query call will list the tags entry if `<cword>` expands to "tags", while the current implementation will not. Inconsistently, the `BTags` call will list the tags entry with its current implementation.
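To make the difference concrete, here is a small shell sketch that mimics the two matching modes on a made-up list of tag names. It does not call `readtags` or `fzf`; the helpers `prefix_matches` and `substring_matches` and the sample names are hypothetical, chosen only to illustrate why a scoped name such as `fzf#vim#tags` is missed by a strict prefix lookup.

```shell
#!/bin/sh
# Hypothetical sample of tag names (the first field of a tags file).
names='fzf#vim#buffer_tags
fzf#vim#tags
s:tags_sink
tags_cache'

# Strict prefix match: keep names that START with the query.
# This only imitates a prefix lookup; it is not readtags itself.
prefix_matches() {
  printf '%s\n' "$names" | awk -v q="$1" 'index($0, q) == 1'
}

# Relaxed match: keep names that CONTAIN the query anywhere.
substring_matches() {
  printf '%s\n' "$names" | awk -v q="$1" 'index($0, q) > 0'
}

echo "prefix:";    prefix_matches tags
echo "substring:"; substring_matches tags
```

With the query `tags`, the prefix variant keeps only the unscoped name, while the relaxed variant also keeps the names that carry an `s:` or `fzf#vim#` scope in front.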

junegunn · 2024-09-29T10:49:47Z

The reason we only allow a prefix is to make readtags perform fast binary search over huge tags files. I've tested your patch, but the performance difference is quite noticeable for a tags file of hundreds of MBs.

See #1524

fab4100 · 2024-09-29T11:14:14Z

Thanks for the referenced issue, I missed that somehow. It would be nice to have a homogeneous interface between Tags and BTags while still getting the benefit of readtags. My proposal above is actually not homogeneous: the query would first need to be sanitized into a POSIX-compatible extended regex that is passed to readtags as a pre-filter, and the actual user query would be passed as the initial prompt to the fzf call as a post-filter. I will do some more benchmarking to see whether something like that is possible using readtags as a pre-filter.
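The sanitizing step mentioned above could look roughly like the following shell sketch. The helper name `ere_escape` is hypothetical and not part of fzf.vim or readtags; it only escapes POSIX ERE metacharacters so that the user query is matched literally by the pre-filter. Here `grep -E` stands in for the pre-filter, and the sample input lines are made up.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape ERE metacharacters in a query.
ere_escape() {
  printf '%s' "$1" | sed 's/[][\.^$*+?(){}|]/\\&/g'
}

query='a.b'
# Pre-filter pattern: optional non-space scope prefix, then the literal query.
pattern="^[^[:space:]]*$(ere_escape "$query")"

# Without escaping, "a.b" would also match "axb"; with escaping it does not.
printf 'a.b\naxb\nns::a.b\n' | grep -E "$pattern"

# The raw, unescaped query would then go to fzf as the initial prompt,
# for example: ... | fzf --query "$query"   (not executed in this sketch)
```

Note that this escapes only regex metacharacters. A query containing `/` would need extra care if the pattern were embedded in a `#/.../` regex literal as in the readtags `-Q` expressions.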

junegunn · 2024-09-29T12:31:08Z

> It would be nice to have a homogeneous interface between Tags and BTags while still getting the benefit of readtags.

I don't think it's possible. You can't avoid scanning through the whole list for non-prefix queries.
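A toy illustration of why this is the case, using a hypothetical sorted list of names rather than a real tags file: in sorted order, all names sharing a prefix sit next to each other, so a binary search can jump to the start of that block, whereas names that merely contain the query are scattered and can only be found by checking every line. The `positions` helper is an assumption made for this sketch.

```shell
#!/bin/sh
# Hypothetical tag names, sorted bytewise like a sorted tags file.
sorted=$(printf '%s\n' mmap munmap munmap_page do_munmap sys_munmap vm_munmap zap \
  | LC_ALL=C sort)

# Print the 1-based line positions of names matching an ERE.
positions() {
  printf '%s\n' "$sorted" | grep -nE "$1" | cut -d: -f1 | tr '\n' ' '
}

echo "prefix    ^munmap: $(positions '^munmap')"
echo "substring  munmap: $(positions 'munmap')"
```

The prefix query yields one contiguous run of positions, while the substring query also hits lines before and after that run, with a non-matching line in between.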

If performance is not a concern for you, you can redefine Tags command like so in your configuration file.

command! -bang -nargs=* Tags call fzf#vim#tags('', fzf#vim#with_preview({ "options": ['--query', <q-args>], "placeholder": "--tag {2}:{-1}:{3..}" }), <bang>0)

Original definition (fzf.vim/plugin/fzf.vim, line 65 in c5ce790):

    \'command!      -bang -nargs=* Tags                             call fzf#vim#tags(<q-args>, fzf#vim#with_preview({ "placeholder": "--tag {2}:{-1}:{3..}" }), <bang>0)',

fab4100 · 2024-09-29T13:35:35Z

I believe you are correct. Using readtags with a regex as a pre-filter is not viable from a performance point of view. For people who already use rg and are willing to sacrifice some performance, using rg as the source filter in fzf#vim#tags(query, ...) may be an option.

Here are a few test results (the first test result corresponds to the implementation I use in the commit above):

Linux kernel tags file:

du tags = 1.2G  tags

query = 'munmap'

GOLD: readtags -t tags -e -p - "${query}"
real    0m0.001s
user    0m0.000s
sys     0m0.001s
Matches:  6

TEST: readtags -t tags -e -Q "(#/^[^[:space:]]*${query}\$/ \$name)" -l
real    0m9.632s
user    0m9.554s
sys     0m0.070s
Matches:  44

TEST: readtags -t tags -e -Q "(#/^${query}/ \$name)" -l
real    0m4.300s
user    0m4.238s
sys     0m0.060s
Matches:  6

TEST: rg ^${query} tags
real    0m0.218s
user    0m0.162s
sys     0m0.056s
Matches:  7

TEST: rg ${query} tags
real    0m0.215s
user    0m0.158s
sys     0m0.056s
Matches:  85

TEST: grep ^${query} tags
real    0m0.440s
user    0m0.360s
sys     0m0.080s
Matches:  6

Some other large project (tags file one order of magnitude smaller than Linux kernel):

du tags = 82M   tags

query = 'BulkData'

GOLD: readtags -t tags -e -p - "${query}"
real    0m0.001s
user    0m0.001s
sys     0m0.000s
Matches:  31

TEST: readtags -t tags -e -Q "(#/^[^[:space:]]*${query}\$/ \$name)" -l
real    0m0.417s
user    0m0.414s
sys     0m0.003s
Matches:  49

TEST: readtags -t tags -e -Q "(#/^${query}/ \$name)" -l
real    0m0.289s
user    0m0.289s
sys     0m0.000s
Matches:  31

TEST: rg ^${query} tags
real    0m0.012s
user    0m0.006s
sys     0m0.006s
Matches:  31

TEST: rg ${query} tags
real    0m0.012s
user    0m0.000s
sys     0m0.012s
Matches:  4112

TEST: grep ^${query} tags
real    0m0.046s
user    0m0.046s
sys     0m0.000s
Matches:  31

If I understand your suggestion about redefining the Tags command correctly, it would result in calling tags.pl with an empty query. For a tags file the size of the Linux kernel's, this would result in:

TEST: perl ../bin/tags.pl "" tags | rg ^${query}
real    0m8.072s
user    0m7.957s
sys     0m0.659s
Matches:  7

whereas using rg as a source filter in fzf#vim#tags(query, ...) would result in something on the order of

TEST: rg ^${query} tags
real    0m0.218s
user    0m0.162s
sys     0m0.056s
Matches:  7

which I could accept as a performance trade-off in exchange for a homogeneous Tags/BTags interface (unlike the perl solution above).

From your perspective, is it possible to support a custom 'source' filter in fzf#vim#tags(query, ...) to avoid copy-pasting fzf#vim#tags(query, ...) to my personal config just to change that line?

fab4100 force-pushed the tags-query branch from 84c9597 to ceefc4f on September 28, 2024 16:17
