Add syntax for references #200

Open
wants to merge 1 commit into
base: master

oysteins
Contributor

I think it would be nice to have an XML-free syntax for writing references.

I suggest that the name of a function prefixed with a hash symbol should give a reference, so that the AutoDoc code

#SomeFunction

is translated to the GAPDoc code

<Ref Func="SomeFunction" />

For operations, attributes and properties, we need to include the argument filters as well. I suggest the syntax

#SomeOperation[IsInt,IsGroup]

which is translated to

<Ref Func="SomeOperation" Label="for IsInt, IsGroup" />

This pull request is my attempt to implement this, including documentation and tests.
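The translation described above can be sketched in a few lines of Python. This is only an illustration, not the PR's GAP implementation: the regular expression and the function name `translate_refs` are assumptions made for this example.

```python
import re

# Hypothetical pattern: '#', an identifier, then an optional [Filter1,Filter2] list.
REF_PATTERN = re.compile(r"#([A-Za-z_][A-Za-z0-9_]*)(?:\[([^\]]*)\])?")


def translate_refs(text: str) -> str:
    """Replace #Name and #Name[F1,F2] with GAPDoc <Ref Func=... /> elements."""

    def to_ref(match: re.Match) -> str:
        name, filters = match.group(1), match.group(2)
        if filters is None:
            return f'<Ref Func="{name}" />'
        # Build the label in the "for F1, F2" form shown in the description.
        label = ", ".join(part.strip() for part in filters.split(","))
        return f'<Ref Func="{name}" Label="for {label}" />'

    return REF_PATTERN.sub(to_ref, text)


print(translate_refs("See #SomeFunction and #SomeOperation[IsInt,IsGroup]."))
```

The sketch always emits the Func attribute, which matches the simplification raised in discussion point 2 below.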

Some points for discussion:

  1. Is the syntax acceptable? For me, this seems like a reasonable way to write references (for instance, it is similar to the syntax for referring to issues on GitHub). What do you think?
  2. Is it safe to always use the Func attribute to Ref, instead of using Oper, Attr and so on? It is easier to generate the GAPDoc code this way (we don't have to find out what kind of thing we refer to), and I'm not able to see that it makes any difference in the final result.

New syntax for referring to documented declarations without writing
XML.  The AutoDoc code

  #SomeFunction
  #SomeOperation[IsInt,IsGroup]

is now translated to

  <Ref Func="SomeFunction" />
  <Ref Func="SomeOperation" Label="for IsInt, IsGroup" />
@codecov

codecov bot commented Mar 26, 2019

Codecov Report

Merging #200 into master will increase coverage by 1.61%.
The diff coverage is 90.47%.

@@            Coverage Diff            @@
##           master    #200      +/-   ##
=========================================
+ Coverage   68.49%   70.1%   +1.61%     
=========================================
  Files          13      13              
  Lines        2206    2248      +42     
=========================================
+ Hits         1511    1576      +65     
+ Misses        695     672      -23
| Impacted Files | Coverage Δ | |
|---|---|---|
| gap/Markdown.gi | 95.8% <90.47%> (-1.8%) | ⬇️ |
| gap/ToolFunctions.gi | 88.75% <0%> (+0.4%) | ⬆️ |
| gap/Parser.gi | 50.47% <0%> (+4.1%) | ⬆️ |

@fingolfin
Member

First off, thanks for the PR, much appreciated!

As to the syntax: Why are the square brackets needed? They seem to be for labels. But then they are needlessly restricted: one can specify arbitrary label texts, yet the #link references you suggest here do not support that. I think they should, or at least there should be an "obvious" way to extend the syntax to support that in the future...

In any case: personally, I'd prefer something that is closer to Markdown, if possible. That would suggest basing it on the [text](URL) syntax. For example, for Julia, they are using

[`DenseArray`](@ref)  for a basic ref to a function, and
[manual section about modules](@ref modules)    for a reference using a label

We could just borrow that.

That said, the # syntax proposed in this PR of course is terser...

As to <Ref Func=... vs. <Ref Oper=.. etc. -- I am not sure off the top of my head if it is safe right now to use (in the sense that it'll produce a working link), but a simple experiment should reveal that. Of course even if it works, it may not always work in the future -- but then, personally, I wish GAPDoc did not have this "feature" at all, and just allowed me to always write <Ref Label="BLA">.... So, what I'd do is: first conduct some experiments to find out if always using Func= suffices right now; if it does, do so. If it does not suffice (or if it results in many annoying warnings), we could also add code that figures out what a given name references. But before doing that, let's run the experiment.

@oysteins
Contributor Author

Some elaboration:

1. Hash syntax versus Markdown link syntax

I also thought about the Markdown style [text](URL) syntax, but I don't think that's a very good match for this situation. For these references, there aren't two separate things that are "link text" and "address", but just a single thing (the name of a function/operation/...) that we want to refer to the definition of.

The syntax you mention from Julia --

[`DenseArray`](@ref)

-- seems very clumsy to me. The @ref part doesn't seem to add any meaning at all, since the brackets already say that this is a reference. It looks like it's there just because there needs to be something between the parentheses.

On the other hand, I think it would be a very good idea to have the normal Markdown syntax

[text](URL)

for creating an external link. Combined with the # syntax I suggested, this would actually be very similar to GitHub Flavored Markdown: [text](URL) for arbitrary links, and #something for special links.

2. What is the stuff in square brackets?

Ideally, I would like to write just #FooBar to get a reference to FooBar, whatever that is (a function, operation, attribute, ...). But this doesn't work in general, since operations (and attributes and properties) can be defined multiple times with different filter lists. So I intended

#FooBar[IsInt,IsGroup]

to be read as "reference to the version of FooBar with filter list [ IsInt, IsGroup ]". The square brackets are meant to mimic the way the filter list is written in GAP code. The actual GAPDoc code I generate for this -- Label="for IsInt, IsGroup" -- is just a consequence of the way AutoDoc automatically generates labels from filter lists.

In many cases, of course, there is only one declaration for a given operation name. However, when the <Oper> element has a label, GAPDoc forces us to include that label in any reference to it as well. This could be solved in AutoDoc: Keep a list of all documented things, and either put labels only where needed, or supply missing labels on references when the name alone is unique. This might be quite a bit of work (I don't know enough about how AutoDoc works to see how difficult this would be).
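A minimal sketch of the "supply missing labels when the name alone is unique" idea, under the assumption that AutoDoc could keep a table of documented names and their labels. The table `documented` and the function `resolve_ref` are hypothetical and exist only for this illustration.

```python
# Hypothetical registry: declaration name -> labels of its documented versions.
documented = {
    "Foo": ["for IsInt, IsGroup"],
    "Bar": ["for IsInt", "for IsGroup"],
}


def resolve_ref(name, label=None):
    """Build a <Ref> element, filling in the label when the name is unique."""
    labels = documented.get(name, [])
    if label is None:
        if len(labels) == 1:
            # Only one documented version: the label can be supplied automatically.
            label = labels[0]
        elif len(labels) > 1:
            raise ValueError(f"ambiguous reference to {name}: a label is needed")
    if label is None:
        # Unknown name: emit a plain reference and let GAPDoc resolve it.
        return f'<Ref Func="{name}" />'
    return f'<Ref Func="{name}" Label="{label}" />'


print(resolve_ref("Foo"))
print(resolve_ref("Bar", "for IsInt"))
```

With such a table, a bare name would be enough whenever only one version is documented, and an explicit label would only be required for names like the hypothetical `Bar`.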

3. References to other things: chapters, sections etc

I think the #something syntax could be used for referring to chapters and sections as well. For example, we could write

Read about the functions #Buff and #Gruff in section #stuff.

and get the GAPDoc code

Read about the functions <Ref Func="Buff" /> and <Ref Func="Gruff" />
in section <Ref Sect="stuff" />.

-- given that AutoDoc is able to keep track of all documented names and labels, so that it knows just from a name if that name is a chapter label, or a section label, or a function name, and so on.
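As a rough sketch of that idea, assuming a lookup table from documented names to the kind of thing they name (the table `kinds` and the function `translate` are made up for this example, not part of AutoDoc):

```python
import re

# Hypothetical table of documented names and what each one is.
kinds = {"Buff": "Func", "Gruff": "Func", "stuff": "Sect"}


def translate(text: str) -> str:
    """Turn #name into <Ref KIND="name" />, using the table to pick KIND."""

    def to_ref(match: re.Match) -> str:
        name = match.group(1)
        kind = kinds.get(name)
        if kind is None:
            # Unknown name: leave the text untouched rather than guess.
            return match.group(0)
        return f'<Ref {kind}="{name}" />'

    return re.sub(r"#(\w+)", to_ref, text)


print(translate("Read about the functions #Buff and #Gruff in section #stuff."))
```

The sketch copies each name into the attribute exactly as written, so the result depends entirely on the table being complete.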

@fingolfin
Member

OK, I think you have a valid point regarding the #bla vs. [desc](URL) syntax!

As to labels: Actually, I don't see how AutoDoc could possibly resolve labels automatically:
First off, AutoDoc in general is not able to keep track of all documented names. It could perhaps do this if the whole manual is written in pure AutoDoc format, but as soon as parts of it are written in XML, this fails; and for many packages that already have a GAPDoc manual, this is a hard reality. Also, due to limitations of the .autodoc format, one may still wish to write parts of even a completely new package manual in GAPDoc XML.

Also, GAPDoc labels are typically used to disambiguate multiple documentation entries for the same identifier, and the label texts can be anything. How should we guess them? Again, your proposal only seems to make sense if we assume that 100% of the manual is written in AutoDoc. But that locks out a lot of users.

@fingolfin
Member

So, here is another concern I have with using #identifier: It is very close to the Markdown syntax for sections, # title, ## title. So perhaps we can come up with an alternative syntax that does not clash with Markdown (and common extensions to it, e.g. GitHub Flavored Markdown)? E.g. &link and &link[label]? Or %link, or !link, or ...? Somewhat more radical: <link>, which at first seems to clash with <URL>, but if we are willing to restrict <URL> to URLs (or rather, URIs) with a protocol such as https://..., these are trivial to disambiguate.
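To illustrate the last point, here is a small sketch of telling <link> apart from <URL> by checking for a protocol prefix. The patterns and function names are assumptions for this example, and the sketch makes no attempt to tell such references apart from XML tags.

```python
import re

# Anything between angle brackets that contains no whitespace or brackets.
ANGLE = re.compile(r"<([^<>\s]+)>")
# A URI scheme followed by '://', e.g. 'https://'.
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def classify(token: str) -> str:
    """Return 'url' if the token starts with a protocol, else 'reference'."""
    return "url" if SCHEME.match(token) else "reference"


def find_angle_tokens(text: str):
    """List every <token> in the text together with its classification."""
    return [(token, classify(token)) for token in ANGLE.findall(text)]


print(find_angle_tokens("See <SomeFunction> and <https://www.gap-system.org>."))
```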

Thoughts?

@oysteins
Contributor Author

Regarding whether AutoDoc can have complete control over all labels: It is true that I was only thinking about the case where everything is written in AutoDoc. However, even if there is a mix of AutoDoc and GAPDoc code, AutoDoc can still have control over all labels in those parts of the manual that are written in AutoDoc.

This means that for references pointing to something that was written in GAPDoc, the user would need to write a quite long and complicated thing containing all the information needed to generate the proper GAPDoc <Ref> element. (Or the user could just write the <Ref> element directly.) But for references pointing to something that was written in AutoDoc, it is possible for AutoDoc to automatically supply much of this information, so that the user only has to put in enough information to make AutoDoc able to determine what they want to refer to.

For example, if the operation Foo is documented in GAPDoc, and has label "for IsInt and IsGroup", then we could refer to it in AutoDoc text with something like this hypothetical syntax:

#Oper:Foo[for IsInt and IsGroup]

If, on the other hand, it is documented in AutoDoc, and there is no other operation with the same name, then we could refer to it by writing only

#Foo

In this way, the simple (and common) cases are kept simple, and the complicated cases are a bit complicated, but not impossible.

@oysteins
Contributor Author

About syntax: The syntax #thing does look a bit similar to # title, but they don't conflict directly since the latter contains a space. And GitHub Flavored Markdown seems to have no problems with having both the #issue and the # title syntax.
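The distinction can be sketched with two regular expressions. This is an illustration under simplifying assumptions (headings start at the beginning of a line, identifiers are plain word characters), and the names below are made up for the example.

```python
import re

# A Markdown-style heading: one to six '#' characters, then whitespace, then text.
HEADING = re.compile(r"^#{1,6}\s+\S")
# A reference: '#' immediately followed by an identifier, with no space between.
REFERENCE = re.compile(r"#[A-Za-z_]\w*")


def classify_line(line: str) -> str:
    """Classify a line as a heading, a line containing a reference, or plain text."""
    if HEADING.match(line):
        return "heading"
    if REFERENCE.search(line):
        return "has reference"
    return "plain"


for line in ["# title", "## title", "#SomeFunction is documented", "plain text"]:
    print(repr(line), "->", classify_line(line))
```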

The alternatives &link and <link> conflict with XML syntax, so they imply that GAPDoc code can no longer be written directly inside AutoDoc code (which I think is a great improvement, but it would probably require a lot of rewriting in the packages using AutoDoc).

I don't have strong opinions on the exact choice of syntax, but I really want the ability to make a reference by writing only an identifier (if it is unique) or an identifier and a label, combined with some very few extra characters to say that it is a reference.

@fingolfin
Member

> I really want the ability to make a reference by writing only an identifier

I think we violently agree on this ;-)

Also, I find your remark that GitHub already uses #issue quite convincing, and you are of course right about XML compatibility! So (at least from my POV) we could indeed use #.

> the simple (and common) cases are kept simple, and the complicated cases are a bit complicated, but not impossible

Agreed, that should indeed be the goal. But precisely that is what worries me (at least about your original proposal), as it seems to make short syntax possible only for things documented in AutoDoc using default labels, and impossible for anything with custom labels, including things documented in AutoDoc using custom labels.

To clarify: It's not #foo that worries me (ignoring the question about whether to use # or not for the moment); it is the syntax #foo[IsInt,IsString] which then automagically is translated to a reference to "foo" with the label for IsInt and IsString. But how do I reference something with a label that does not fit into that pattern?

Perhaps we could say "If it looks like #ident[WORD1,WORD2,WORD3,...] (possibly with spaces allowed after the commas), then assume this is shorthand for the label "for WORD1 and WORD2 and WORD3 and ...", otherwise assume it is the actual label". And that might work, but I am a bit wary about such shenanigans, as they can lead to very confusing issues if somebody happens to use a custom label WORD1,WORD2 somewhere. Indeed, I think it is very likely that somebody is using custom labels consisting of a single word, and that would already confuse this trick.
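A short sketch of that heuristic shows where it goes wrong. The function name and pattern are assumptions for this illustration, and the generated label follows the "for WORD1 and WORD2" wording used in this comment.

```python
import re

# A comma-separated list of single words, optionally with spaces after commas.
WORD_LIST = re.compile(r"\w+(?:,\s*\w+)*")


def label_from_brackets(content: str) -> str:
    """Guess the label meant by the text between the square brackets."""
    if WORD_LIST.fullmatch(content):
        # Looks like a filter list: expand it to a generated label.
        words = [word.strip() for word in content.split(",")]
        return "for " + " and ".join(words)
    # Otherwise take the text as the literal label.
    return content


print(label_from_brackets("IsInt,IsString"))
print(label_from_brackets("for two groups"))
print(label_from_brackets("deprecated"))
```

The last call is the problem case: a custom one-word label such as the hypothetical `deprecated` is indistinguishable from a one-element filter list, so it is silently rewritten.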

tl;dr: I'll be sold if we can find a syntax we all think is OK and that also supports arbitrary custom labels, both those set in GAPDoc and in AutoDoc, and without forcing me to add a prefix like oper: for stuff that I documented in AutoDoc (and for which thus I want AutoDoc to deduce this for me).

@oysteins
Contributor Author

It is not easy to imagine any reference syntax that would handle absolutely every legal GAPDoc label, except GAPDoc's own syntax. For example, even if we say that

#identifier[label]

means the version of identifier with label set to exactly the text label, it would still be possible for someone to make this not work by writing a label like

<Oper Name="Foo" Label="silly]label" />

in GAPDoc.

@oysteins
Contributor Author

A suggestion for syntax:

A reference is written as

type#identifier(label)

where:

  • type is one of the strings Oper, Func, Chap, Sect and so on, or the empty string. (If empty, the reference must be to something defined in AutoDoc code, so that AutoDoc can figure out the correct type.)
  • identifier is the identifier we want to refer to, or the empty string. (It is empty if we are referring to a chapter, section or so on.)
  • label is a label or the empty string. (If it is empty, the parentheses can also be left out.)

When referring to something defined in GAPDoc code, we need to include the type part; when referring to something defined in AutoDoc code, we can leave it out.

So we could for example write something like this when referring to GAPDoc-defined things:

See the descriptions of Func#Foobar and Oper#Quux(for IsInt) in Section Sect#(foo).

Or something like this when referring to AutoDoc-defined things:

See the descriptions of #Foobar and #Quux(for IsInt) in Section #(foo).

This could also be combined with the syntax I suggested initially, so that

#identifier[filters]

means a reference to the version of identifier with filter list filters.
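A parser for the type#identifier(label) form might look roughly like the following sketch. It covers only that form, not the [filters] variant, and every name in it (`Reference`, `parse_reference`, `to_gapdoc`) is hypothetical. Rendering is only attempted when the type is given explicitly, since an omitted type would need a lookup in AutoDoc's own records.

```python
import re
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    kind: Optional[str]        # "Oper", "Func", "Sect", ... or None if omitted
    identifier: Optional[str]  # None when referring to a chapter or section
    label: Optional[str]       # None when no (label) part is given


REF = re.compile(r"([A-Za-z]*)#(\w*)(?:\(([^()]*)\))?")


def parse_reference(token: str) -> Reference:
    """Split a token of the form type#identifier(label) into its three parts."""
    match = REF.fullmatch(token)
    if match is None:
        raise ValueError(f"not a reference: {token!r}")
    kind, identifier, label = (part or None for part in match.groups())
    if identifier is None and label is None:
        raise ValueError(f"reference needs an identifier or a label: {token!r}")
    return Reference(kind, identifier, label)


def to_gapdoc(ref: Reference) -> str:
    """Render a reference with an explicit type as a GAPDoc <Ref> element."""
    if ref.kind is None:
        raise ValueError("type omitted: resolving it needs AutoDoc's records")
    if ref.identifier is None:
        return f'<Ref {ref.kind}="{ref.label}" />'
    if ref.label is None:
        return f'<Ref {ref.kind}="{ref.identifier}" />'
    return f'<Ref {ref.kind}="{ref.identifier}" Label="{ref.label}" />'


for token in ["Func#Foobar", "Oper#Quux(for IsInt)", "Sect#(foo)"]:
    print(token, "->", to_gapdoc(parse_reference(token)))
```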

@fingolfin
Member

@oysteins that sounds very good to me! Thank you for putting in the time and effort to work on this. Would be good if @sebasguts could also comment to say if he approves as well?

@sebasguts
Contributor

@oysteins Sorry for being late to the show, and thank you, that all looks good.

I agree with the syntax specified above, with just one small comment (which is probably not important):

AutoDoc allows you to give, e.g., chapters and sections the same name. How would you distinguish them using your syntax? The automatically set labels (Chapter_Foo_Section_Foo) might be tricky for users to guess, so would it be beneficial to also allow the type specifiers in the AutoDoc case you describe above? Then something like Sect#(Foo) would refer to the AutoDoc section with name Foo, whereas Sect#Foo would refer to the section with label Foo. Or would this be too confusing?

@fingolfin
Member

@oysteins: if you were willing to update this PR, that would be great! If not (e.g. you lack time or are annoyed by us being slow in accepting you are right ;-), I totally understand, then we'd try to adapt it (but that may take some more time, esp. since @sebasguts will sadly soon leave us)
