Be Stricter About Lack of Null/None/Nil #975
Comments
How about:
Not entirely sure where to put that though; maybe just a new section before "Filename Extension"? |
I would have put it directly into the key/value pairs section. It already has an example on invalid values which is somewhat related. |
It's difficult to specify something that shouldn't exist, which is why I've held my tongue. We also seem to discourage adopting into the standard explicit recommendations for configuration designers, though there are exceptions. We also don't mention layering or merging configurations anywhere, but we could note that TOML provides no means to "unset" key/value pairs. Let me take a stab at this, starting with your suggestion, @arp242. The proposal below follows the guidelines previously mentioned, and would appear in the Key/Value Pair section, per @mitsuhiko's suggestion.
What do you think? |
That sounds good to me. |
I concur, and this is also why I didn't reply initially, hoping someone else had something insightful to say, but it seems not 😅 I think the issue comes up frequently enough that it's worth at least explicitly mentioning that TOML doesn't support Null values. Your phrasing is better than mine, although I would replace I considered proposing a new document for this ("FAQ", "Effective TOML", or something like that) but I don't really know what else we could put there. |
This is advice, not spec, and as such I think it should not go anywhere into the spec. It might be good to write something in that regard into a TOML FAQ. It's true that there is no such FAQ yet, but maybe something like "Why is there no None type?" is indeed a good first question to start it. Besides that, I'm very skeptical about the discussed contents of the answer. As I see it, merging multiple TOML files or somehow modifying TOML data that already exists in memory (by means of a different TOML file) are exotic special cases, not your typical use case. In the typical case, treating None by omitting the whole key/value pair when serializing and treating a missing key/value pair as None when deserializing is exactly the right way of dealing with such data (implicit rather than explicit None). I also believe that the fact that TOML has such an implicit None type is very much the reason that it doesn't have (or need) an explicit one. Hence I'm opposed to mentioning anything about "sentinel values" except possibly for very specialized use cases. Should we ever arrive at the conclusion that such sentinel values are indeed more frequently needed, that would IMHO provide a good rationale for finally adding an explicit None as the paramount sentinel value. |
Add-on regarding
That's indeed the very reason given by @mojombo for closing #30
|
In the absence of |
If you have If you want to indicate some sort of semantic absence of information intended for those slots, say for some sort of serialization scenario? Personally I'd leave that down to whatever application is responsible for validating the input, rather than depend on it being a part of the TOML itself. Or, if you're suggesting that the schema itself would assume |
I am talking about optional value types (like Option in Rust). A concrete value can be, let's say, an integer, or None if it is missing or undefined for some reason. But it is essential that it occupies the right slot in the array. Just shrinking the array would not work. EDIT: Providing more context. I have a library that uses JSON for configuration. It extensively uses JSON arrays with |
Well then, you can come up with some scheme for that in your application, like using an empty inline table |
I'd prefer it if this was kept on-topic, rather than be a generic "how do I do X without nulls?" issue. If any of the previous issues don't answer your question, it's better to comment on one of those (if on-topic) or open a new one. |
May I ask, is this library publicly available? Using nulls in arrays as a means of specifying configuration defaults seems problematic on the surface if that's the only way to get the defaults. Can you link to the library's documentation? Or is that not possible? |
This is what I consider an "awkward" hack. Essentially, there is no clean way to build a JSON <=> TOML bridge if JSON objects contain nulls. |
VS Code is using JSON for its configs. |
I'm thinking they should. Nulls and undefineds are the bane of their existence on so many occasions. They are partly fighting against them in casual use with strict null checking. Not something any configuration standard should ever have to deal with, if they did things right. Getting these well-informed opinions codified is a struggle. |
Personal arrogance aside, I feel that I should step back a moment. Nothing really should be said about NULLs as far as parsing is concerned. As it stands, parsers will produce table-like objects, and the absence of key names (including table names) when expected by applications will always be subject to interpretation by those programs. Saying anything else at the TOML level just mucks up this dynamic. NULLs just are not simple or obvious. They may be so in programming languages, but they are not so in configurations. So programming languages must carry that weight when they do their parsing. So let's shift our focus, where we can be much more clear-cut. Maybe we should write something to address TOML emitters. We should make it clear that the presence of a NULL value in an object to be written to TOML must raise an error. It's the responsibility of the emitter to check for this, and although they may provide ways to address NULL values before emission, they certainly cannot handle NULLs without preprocessing. This approach keeps the complications of handling null values out of TOML. And it's not a complex ask; we already strongly imply that output to a file leaves behind a valid TOML document. This may leave users who want NULLs in arrays searching for ways to represent them. But arrays can hold any legal type of TOML value. Sentinels can be used. Empty arrays and empty tables can be used. Special Error object tables, describing the errors in the same way that web APIs return them, could be emitted as long as subsequent reads by human readers let them know that their attention is needed. There's no elegant catch-all solution to representing deliberately missing values, which is why data validation is such a chore sometimes. We can't assume anything at a fundamental level. That's up to users to solve, not the TOML language. So what do you say? Shall we address proper emitter behavior? |
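A rough sketch of the strict emitter behavior proposed here, in Python: before anything is written, the emitter walks the object and raises if it finds `None` anywhere. The function name `reject_none` and the error wording are assumptions for illustration and are not taken from any existing TOML library.

```python
def reject_none(value, path="root"):
    """Raise ValueError if None appears anywhere in a nested dict/list structure."""
    if value is None:
        raise ValueError(f"cannot emit None at {path}: TOML has no null value")
    if isinstance(value, dict):
        for key, item in value.items():
            reject_none(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            reject_none(item, f"{path}[{index}]")
```

An emitter could call such a check first and only then serialize, which leaves any preprocessing of null values (sentinels, omission, error tables) to the caller.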
This would be standardizing existing practice - all the python TOML libraries work this way at the very least. There isn't really any other sensible approach. Putting it in writing makes sense to me. |
I'm using pyserde to serialize my dataclass as below. It is usual in Python to use None, and TOML cannot be used in these cases. I'm designing the metadata format for Mojo Package.
from dataclasses import dataclass
from serde import serde
# get_user_email_from_git() is defined elsewhere and is not shown in this comment.
@serde
@dataclass
class RingInfo:
    """Conforms to Mojo core-metadata"""
    name: str
    version: str
    metadata_version: str = "0.1"
    # Below are optional
    dynamic: list[str] | None = None
    platforms: list[str] | None = None
    supported_platforms: list[str] | None = None
    summary: str = ""
    description: str = ""
    description_content_type: str = "text/markdown"
    keywords: list[str] | None = None
    home_page: str = ""
    download_url: str = ""
    author: str = get_user_email_from_git()[0]
    author_email: str = get_user_email_from_git()[1]
    maintainer: str = get_user_email_from_git()[0]
    maintainer_email: str = get_user_email_from_git()[1]
    license: str = ""
    classifiers: list[str] | None = None
    requires_dist: list[str] | None = None
    requires_mojo: str = ""
    requires_external: list[str] | None = None
    project_urls: dict[str, str] | None = None
    provides_extra: list[str] | None = None
    provides_dist: list[str] | None = None
    obsoletes_dist: list[str] | None = None
    file_name: str = ""
|
@drunkwcodes: I'd say an object like that should simply be serialized by omitting the corresponding key/value pairs when the value is None. That's the general practice (I suppose) and also leads to the shortest and most readable TOML. That brings me to the earlier question about whether to "address proper emitter behavior". I'd say we sure can, but we must be careful regarding the wording to use. As I see it, None in a list should (by default) be rejected as an error, but for None in a table-like object (one that's serialized to a table) the above approach is certainly reasonable and we must not forbid it. |
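The policy suggested in this comment (omit `None` in table-like objects, reject `None` in lists) could look roughly like the following Python sketch. The helper name `prepare_for_toml` is hypothetical; serializer libraries may expose this behavior differently or not at all.

```python
def prepare_for_toml(value):
    """Return a copy with None-valued table keys dropped; reject None in arrays."""
    if isinstance(value, dict):
        # Tables: a None value is expressed by leaving the key out entirely.
        return {
            key: prepare_for_toml(item)
            for key, item in value.items()
            if item is not None
        }
    if isinstance(value, list):
        # Arrays: dropping an element would shift positions, so raise instead.
        if any(item is None for item in value):
            raise ValueError("None inside an array has no TOML representation")
        return [prepare_for_toml(item) for item in value]
    return value


cleaned = prepare_for_toml(
    {"name": "ring", "keywords": None, "project_urls": {"home": None, "docs": "x"}}
)
```

Here `cleaned` keeps only `name` and the `docs` entry of `project_urls`, which any TOML writer can then emit.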
Thank you for your response. I can see it as a better solution. I have filed an issue with |
+1 to this, mashumaro does it for TOML since version 3.2 (release notes). |
@Fatal1ty Marvelous! I will try it and adopt |
I do think it'd be neat if the website provided recommendations for good configuration API design, and better alternatives to null. But this very thread makes me doubt we'd ever agree on them. Here's what I would say, which directly contradicts some of the alternatives people have proposed before:
|
@DominoPivot wrote:
That's actually a pretty good idea! The website is a separate project, of course related to this one. And that would be a suitable place to maintain and collaborate on such a guide. @pradyunsg Remember I was talking about another document for FAQs and advice on using TOML? This is an ideal candidate for a document on the website, and well-suited to it. If it's found to be useful, it could be translated into all the languages that the website supports. This is broader than the original purpose of this issue; an API recommendations page would cover more topics than simply how to find alternatives to null. But it would be the best place to explain why we oppose null values in the TOML spec in a productive way. And composing such a document may allow the differences in approach that we've expressed here to melt away. I don't know much about the website, though. But it could use a few more eyeballs. |
Yup! toml-lang/toml.io#70 filed for this. It might not need to be on the website (another
Honestly, that's more library API design than anything else. If an encoder wants to do this and it makes sense in the context of their library, I don't see why the spec needs to outlaw that. There's little reason to dictate the library APIs here. I would be really surprised if this isn't obvious in a similar manner that a user-defined-object or complex numbers or whatever-cool-type a language has shouldn't be serialised as-is. A weaker but similar argument applies for arrays too -- I think it's sufficiently obvious, and doesn't need to be called out explicitly. I don't think it's productive to have API design guidance in the specification. Policing everyone's use of TOML is not a feasible thing, and particularly, I don't think it's a productive thing to be doing.
I'm not convinced this is a format-specific thing, and needs to be coupled with TOML in some way (or is made better by coupling with TOML). There's many resources for API design guidance in the world and there is limited value in having one-more-place to get that advice -- I'm certainly not interested in having one associated with TOML. All the examples provided are generic pieces of advice for API design and, as much as I agree with them, they don't belong in a TOML-specific location.
Not in the specification, no.
That's fair, and I agree that we should have this -- albeit separately. I wouldn't mind having a place to put "why certain language design choices were made" information, which can cover this specifically as well as a few other things that have come up. "Why does TOML not have NULL/None/Nil?" is a good thing to have in an FAQ, where we can discuss how to handle the use cases where people reach for null-ish values. And, one of the options is omitting keys, as has come up in the discussion already and unlike what OP is claiming. I'm inclined to close this out, on the basis that the specification isn't going to change to gain new language around this. However, some of the things I've stated here might warrant more discussion so I'm not gonna click the button just yet. :) |
Any advice on porting to TOML an existing JSON schema that extensively uses |
@slonik-az There's been a number of suggestions in this discussion thread already, as well as in other related issues (links above). All broadly boil down to one of:
If the application already uses JSON, and depends on JSON-only features, it's not clear what value you'd be gaining out of porting to TOML. |
OK, it looks like there isn't more discussion to be had here so closing this out. toml-lang/toml.io#70 covers adding an FAQ to the website. |
There are quite a few people who want null/none/nil/unset or anything like this in the language. It's one of the oldest issues (#30) that was opened and closed. The most recent incarnation also did not go very far in terms of getting towards support for it: #921
I personally try my best to avoid the use of null in my own uses of TOML, but I keep running into other people's TOML-based configurations where the absence of a key is used as an alternative to null. I believe this to be a mistake in designing TOML schemas because such a value cannot be represented in TOML. This is particularly a problem when TOML files are "merged" or layered. Since null/none support is unlikely to happen, I wonder if an explicit section could be added to the TOML specification describing why null does not exist, and that the absence of a key should not be used to indicate that a value is null.
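The layering problem mentioned above can be illustrated with a small Python sketch. The `merge_layers` function, the `proxy`/`retries` keys, and the values are all hypothetical; the point is only that a key omitted from the upper layer cannot unset a value inherited from the lower one.

```python
def merge_layers(base: dict, override: dict) -> dict:
    """Recursively overlay `override` on `base`; keys absent from `override` are kept."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_layers(merged[key], value)
        else:
            merged[key] = value
    return merged


# Parsed contents of two layered configuration files (illustrative values).
system = {"proxy": "http://proxy.example", "retries": 3}
user = {"retries": 5}  # the user wants no proxy, but can only omit the key

effective = merge_layers(system, user)
```

After merging, `effective` still contains the system-level `proxy`, because leaving the key out of the user layer reads as "inherit" rather than "unset".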