Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent syntax highlighting for tagged templates in JavaScript #11952

Closed
kj opened this issue Oct 28, 2024 · 5 comments · Fixed by #12217
Closed

Inconsistent syntax highlighting for tagged templates in JavaScript #11952

kj opened this issue Oct 28, 2024 · 5 comments · Fixed by #12217
Labels
C-bug Category: This is a bug

Comments

@kj
Copy link

kj commented Oct 28, 2024

Summary

Tagged template strings in JavaScript have strange, inconsistent highlighting. It appears that with certain characters in the function name, the entire template string content is highlighted as regular JavaScript.

screenshot

The characters 'r', 'd', and 'c' seem to change the highlighting. Strangely, the function names starting with 'd' and 'c' also change the colour of the single quoted string 'Baz', but function names starting with 'r' do not (even if they're not .

Edit: Just to be clear, the qwetyuiopasfghjklzxvbnm tagged templates in the screenshot have the correct highlighting (so named because they don't include any of the characters which cause the highlighting to change).

Reproduction Steps

The source code above:

qwetyuiopasfghjklzxvbnm`
  Foo 123 Bar -456 'Baz';
`;

r`
  Foo 123 Bar -456 'Baz';
`;

d`
  Foo 123 Bar -456 'Baz';
`;

c`
  Foo 123 Bar -456 'Baz';
`;

qwetyuiopasfghjklzxvbnm`
  Foo 123 Bar -456 'Baz';
`;

r`
  Foo 123 Bar -456 'Baz';
`;

rdc`
  Foo 123 Bar -456 'Baz';
`;

Helix log

No response

Platform

Linux

Terminal Emulator

alacritty 0.14.0 (22a44757)

Installation Method

source

Helix Version

helix 24.7 (101a74b)

@kj kj added the C-bug Category: This is a bug label Oct 28, 2024
@rwakulszowa
Copy link

The 3 magic characters happen to correspond to 3 popular single-character programming languages:

Helix supports tree sitter language injections, using a different parser for a part of a file (here - a template literal). There's no reliable way to tell what language a string represents, so it uses a simple heuristic, matching the template tag against a list of known injection regexes.

For example, the d comes from:

injection-regex = "d"

So, long story short, that's just a negative side effect of supporting injections in the first place.

Disclaimer

I'm not an expert. I'm not even a contributor. I was just hoping to pick up a simple task to contribute, but there's nothing to do here :(

@kj
Copy link
Author

kj commented Nov 3, 2024

Ah! Thanks for pointing that out. Perhaps these patterns could be a bit more strict, as a single letter which matches in any string including them seems far from ideal.

[[language]]
name = "c"
injection-regex = "^c$"

[[language]]
name = "c"
injection-regex = "^d$"

[[language]]
name = "c"
injection-regex = "^r$"

This is how I'm working around it now anyway. I'm also not really convinced that template strings should be language injection sites. Unlike for example Markdown code blocks where the intention is clear, in JavaScript it's just a function name which could mean anything. Either way though, I think the injection-regex patterns could be improved

Edit: and this doesn't just apply to single letter patterns. Should the injection-regex just imply ^ and $? This is probably a separate issue at this point. Should I close this and open another one?

@rwakulszowa
Copy link

Should I close this and open another one?

I'd leave it open until a real maintainer shares their thoughts.

I agree it's a bit confusing, though.

@uncenter
Copy link
Contributor

I'm also not really convinced that template strings should be language injection sites. Unlike for example Markdown code blocks where the intention is clear, in JavaScript it's just a function name which could mean anything

Fwiw this is a very popular feature request on Zed editor's issue board atm - people definitely want this, at least for applications such as with html or css tagged templates. Maybe the list of matched tagged templates could be restricted to a hard coded list including html, css, and maybe a few others if requested?

This is where this injection is performed:

; Parse the contents of tagged template literals using
; a language inferred from the tag.
(call_expression
function: [
(identifier) @injection.language
(member_expression
property: (property_identifier) @injection.language)
]
arguments: (template_string) @injection.content)

And my suggestion above can easily be implemented with something like:

; Parse the contents of tagged template literals using
; a language inferred from the tag.

(call_expression
  function: [
    (identifier) @injection.language
    (member_expression
      property: (property_identifier) @injection.language)
  ]
  arguments: (template_string) @injection.content
+ (#match? @injection.language "^(html|css)$"))

You can see this working here:

CleanShot 2024-11-28 at 16 47 26

`foo.js` tagged template sample code
/* Should highlight: */
html`<div>bar</div>`;

css`
	:root {
		color: red;
	}
`;

/* Should not highlight: */

htmlll`<div>bar</div>`;

cssss`
	:root {
		color: red;
	}
`;

qwetyuiopasfghjklzxvbnm`
  Foo 123 Bar -456 'Baz';
`;

r`
  Foo 123 Bar -456 'Baz';
`;

d`
  Foo 123 Bar -456 'Baz';
`;

c`
  Foo 123 Bar -456 'Baz';
`;

rdc`
  Foo 123 Bar -456 'Baz';
`;

Happy to open a pull request adding this line if y'all agree :)

@nik-rev
Copy link
Contributor

nik-rev commented Dec 8, 2024

I'm also not really convinced that template strings should be language injection sites. Unlike for example Markdown code blocks where the intention is clear, in JavaScript it's just a function name which could mean anything

Fwiw this is a very popular feature request on Zed editor's issue board atm - people definitely want this, at least for applications such as with html or css tagged templates. Maybe the list of matched tagged templates could be restricted to a hard coded list including html, css, and maybe a few others if requested?

This is where this injection is performed:

; Parse the contents of tagged template literals using
; a language inferred from the tag.
(call_expression
function: [
(identifier) @injection.language
(member_expression
property: (property_identifier) @injection.language)
]
arguments: (template_string) @injection.content)

And my suggestion above can easily be implemented with something like:

; Parse the contents of tagged template literals using
; a language inferred from the tag.

(call_expression
  function: [
    (identifier) @injection.language
    (member_expression
      property: (property_identifier) @injection.language)
  ]
  arguments: (template_string) @injection.content
+ (#match? @injection.language "^(html|css)$"))

You can see this working here:

CleanShot 2024-11-28 at 16 47 26
foo.js tagged template sample code

Happy to open a pull request adding this line if y'all agree :)

I think that's a great idea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants