Draft of CLI/API design #1

Open
ribose-jeffreylau opened this issue Jan 22, 2025 · 5 comments
Labels
question Further information is requested

ribose-jeffreylau commented Jan 22, 2025

This issue gathers input from potential users of this tool.

Assuming the CLI is called doctl:

Initialize a documents register

> doctl init doc-reg-name
> ls
doc-reg-name/

Set register metadata

> cd doc-reg-name/
> doctl set name1 value1
> doctl set-add name1.ary1[] ary_value1
> doctl set-add name1.obj1.ary[] val1

Initialize dataset

A dataset groups together a set of documents.
It is up to the user of the documents register to decide which documents are grouped together.
A document can belong to only one dataset.

> cd doc-reg-name/
> doctl init-dataset my-documents
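
The one-dataset-per-document rule could be enforced with a check along these lines. This is only a sketch under assumptions: the `Register` class and its method names are hypothetical, and it assumes document IDs are unique across the whole register, which the draft does not state explicitly.

```python
class Register:
    """Hypothetical in-memory register; not part of any existing tool."""

    def __init__(self) -> None:
        self.datasets: dict[str, set[str]] = {}  # dataset name -> document IDs
        self._owner: dict[str, str] = {}         # document ID -> owning dataset

    def init_dataset(self, name: str) -> None:
        if name in self.datasets:
            raise ValueError(f"dataset already exists: {name}")
        self.datasets[name] = set()

    def add_document(self, dataset: str, document_id: str) -> None:
        if dataset not in self.datasets:
            raise KeyError(f"unknown dataset: {dataset}")
        # Record the first owner; any later claim by another dataset is rejected.
        owner = self._owner.setdefault(document_id, dataset)
        if owner != dataset:
            raise ValueError(f"{document_id} already belongs to dataset {owner}")
        self.datasets[dataset].add(document_id)


register = Register()
register.init_dataset("my-documents")
register.init_dataset("other-documents")
register.add_document("my-documents", "Doc-00001")
try:
    register.add_document("other-documents", "Doc-00001")
except ValueError as error:
    print(error)
```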

List datasets

> doctl ls
my-documents
...

Set dataset metadata

> cd doc-reg-name/
> doctl set datasets.my-documents.meta.name1 value1
> doctl set-add datasets.my-documents.meta.name1.ary1 ary_value1
> doctl set-add datasets.my-documents.meta.name1.ary1 ary_value2
> doctl get datasets.my-documents.meta.name1.ary1
ary_value1
ary_value2
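
One way the dotted-path `set` / `set-add` / `get` semantics might work internally is sketched below on a nested dictionary. The function names are hypothetical, and the sketch makes one assumption the draft leaves open: writing beneath a key that already holds a scalar is treated as an error. It needs Python 3.9 or later.

```python
from typing import Any


def _parent_of(store: dict, keys: list[str]) -> dict:
    """Walk to the object that holds the final key, creating objects as needed."""
    node = store
    for key in keys:
        node = node.setdefault(key, {})
        if not isinstance(node, dict):
            raise TypeError(f"{key!r} holds a scalar, not an object")
    return node


def set_value(store: dict, path: str, value: Any) -> None:
    """Set the value at a dotted path."""
    *parents, leaf = path.split(".")
    _parent_of(store, parents)[leaf] = value


def set_add(store: dict, path: str, value: Any) -> None:
    """Append to the array at a dotted path; a trailing "[]" is accepted."""
    *parents, leaf = path.removesuffix("[]").split(".")
    items = _parent_of(store, parents).setdefault(leaf, [])
    if not isinstance(items, list):
        raise TypeError(f"{leaf!r} is not an array")
    items.append(value)


def get_value(store: dict, path: str) -> Any:
    """Return the value at a dotted path; raises KeyError if it is absent."""
    node = store
    for key in path.split("."):
        node = node[key]
    return node


register: dict = {}
set_add(register, "datasets.my-documents.meta.name1.ary1", "ary_value1")
set_add(register, "datasets.my-documents.meta.name1.ary1", "ary_value2")
print("\n".join(get_value(register, "datasets.my-documents.meta.name1.ary1")))
```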

Upload a new document

> doctl upload my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID FILES...
> doctl upload my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID [--status DOC_STATUS] FILES...

Under the hood, UNIQUE_EXTERNAL_DOCUMENT_ID will be modelled as an Item Class (ISO 19135).

Modify document metadata

> doctl set datasets.my-documents.UNIQUE_EXTERNAL_DOCUMENT_ID.status published
> doctl set datasets.my-documents.UNIQUE_EXTERNAL_DOCUMENT_ID.published_date "$(date)"

List document versions

> doctl ls-versions my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID
...1
...2
...

List document files of a particular version

> doctl ls my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID
...
main.pdf
...
> doctl ls my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID#version_string
...
main.pdf
...
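
The `DATASET/DOCUMENT_ID#VERSION` references used above could be parsed roughly as follows. This is a sketch, not a specification: the `Target` type and `parse_target` function are hypothetical, and it assumes that dataset names contain no `/` and that document IDs contain no `#`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    dataset: str
    document_id: str
    version: Optional[str] = None  # None means "no version given"


def parse_target(ref: str) -> Target:
    """Split DATASET/DOCUMENT_ID[#VERSION] into its parts."""
    path, hash_sign, version = ref.partition("#")
    dataset, slash, document_id = path.partition("/")
    if not slash or not dataset or not document_id:
        raise ValueError(f"expected DATASET/DOCUMENT_ID[#VERSION], got {ref!r}")
    if hash_sign and not version:
        raise ValueError(f"empty version string in {ref!r}")
    return Target(dataset, document_id, version or None)


print(parse_target("my-documents/Doc-00001"))
print(parse_target("my-documents/Doc-00001#draft-2"))
```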

Add a new document version

> doctl upload my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID FILES...

Overwrite a document version

> doctl upload --force my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID#VERSION_STRING FILES...

Remove a document version

> doctl rm my-documents/UNIQUE_EXTERNAL_DOCUMENT_ID#VERSION_STRING
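
The intended version semantics (upload appends, `--force` replaces, `rm` deletes) might be modelled as below. Several points are assumptions, because the draft does not settle them: version strings are generated from a counter, uploading to an existing version without `--force` is refused, uploading to an unknown version is an error, and `ls` without a version shows the most recently created one. The class name is hypothetical.

```python
from typing import Optional


class DocumentVersions:
    """In-memory version history for a single document (illustrative only)."""

    def __init__(self) -> None:
        self._files: dict[str, list[str]] = {}  # version string -> file names
        self._counter = 0

    def upload(self, files: list[str], version: Optional[str] = None,
               force: bool = False) -> str:
        if version is None:
            # No "#VERSION" given: always create a new version.
            self._counter += 1
            version = str(self._counter)
        elif version not in self._files:
            raise KeyError(f"unknown version: {version}")
        elif not force:
            raise PermissionError(f"version {version} exists; use force to replace it")
        self._files[version] = list(files)
        return version

    def ls_versions(self) -> list[str]:
        return list(self._files)

    def ls(self, version: Optional[str] = None) -> list[str]:
        if version is None:
            version = next(reversed(self._files))  # most recently created version
        return list(self._files[version])

    def rm(self, version: str) -> None:
        del self._files[version]


doc = DocumentVersions()
doc.upload(["main.pdf"])
doc.upload(["main.pdf", "figures.zip"])
doc.upload(["main-fixed.pdf"], version="1", force=True)
print(doc.ls_versions(), doc.ls("1"), doc.ls())
```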

Getting help

> doctl help
> doctl set help
> doctl ls help
> doctl upload help
> doctl rm help
> doctl init help
> doctl init-dataset help
@ronaldtse

What's the reason for a purpose-specific CLI for documents? So I can't manage other types of content? That can't be right.

@ronaldtse

The new ISO 19135 already provides the underlying model for concept class, concept, register item class and register item.

These are the concept classes to implement:

  • document (across stages but not across years)
  • document with stage
  • document with stage version

These are the register item classes to implement:

  • document
  • document with stage
  • document with stage version that includes the actual content

I.e., you need to create data definitions for the items listed above.

We need a generic way of handling these, not hard-coding their attributes in code.

Please refer to how data definitions are expressed and implemented in lutaml-model.
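
To illustrate what "generic, not hard-coded" could mean here, the sketch below declares item classes as data and validates items against them with one generic routine. It is plain Python and does not use or imitate the lutaml-model API. The attribute names (`identifier`, `stage`, `version`, `content`) and the parent/child relationship between the three classes are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeDef:
    name: str
    type: type
    required: bool = True


@dataclass(frozen=True)
class ItemClassDef:
    name: str
    attributes: tuple[AttributeDef, ...]
    parent: Optional["ItemClassDef"] = None

    def all_attributes(self) -> tuple[AttributeDef, ...]:
        """Inherited attributes first, then this class's own."""
        inherited = self.parent.all_attributes() if self.parent else ()
        return inherited + self.attributes


def validate(item_class: ItemClassDef, data: dict) -> list[str]:
    """Return a list of problems; an empty list means the item is valid."""
    errors = []
    known = set()
    for attr in item_class.all_attributes():
        known.add(attr.name)
        if attr.name not in data:
            if attr.required:
                errors.append(f"missing attribute: {attr.name}")
        elif not isinstance(data[attr.name], attr.type):
            errors.append(f"wrong type for attribute: {attr.name}")
    errors.extend(f"unknown attribute: {key}" for key in data if key not in known)
    return errors


# Hypothetical definitions for the three classes named above.
DOCUMENT = ItemClassDef("document", (AttributeDef("identifier", str),))
DOCUMENT_WITH_STAGE = ItemClassDef(
    "document with stage", (AttributeDef("stage", str),), parent=DOCUMENT)
DOCUMENT_WITH_STAGE_VERSION = ItemClassDef(
    "document with stage version",
    (AttributeDef("version", str), AttributeDef("content", bytes)),
    parent=DOCUMENT_WITH_STAGE)

print(validate(DOCUMENT_WITH_STAGE, {"identifier": "Doc-00001", "stage": "draft"}))
print(validate(DOCUMENT_WITH_STAGE_VERSION, {"identifier": "Doc-00001"}))
```

Because the definitions are ordinary data, they could in principle be loaded from an external model file instead of being written in code.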

@ribose-jeffreylau

In reply to @ronaldtse:

    What's the reason for a purpose-specific CLI for documents? So I can't manage other types of content? That can't be right.

Bad name, I guess. It's only a temporary name; let me change that.


ribose-jeffreylau commented Jan 23, 2025

In reply to @ronaldtse:

    What's the reason for a purpose-specific CLI for documents? So I can't manage other types of content? That can't be right.

    The new ISO 19135 already provides the underlying model for concept class, concept, register item class and register item.

    These are the concept classes to implement:

      • document (across stages but not across years)
      • document with stage
      • document with stage version

    These are the register item classes to implement:

      • document
      • document with stage
      • document with stage version that includes the actual content

    I.e., you need to create data definitions for the items listed above.

    We need a generic way of handling these, not hard-coding their attributes in code.

    Please refer to how data definitions are expressed and implemented in lutaml-model.

Ruby Paneron Register currently has only the concepts of register item class and register item, which, as I understand it, realize the "concept classes" and "concepts" of ISO 19135.

If we want to keep the generality of the tools, my thoughts go to something like this:

> ctl --concept-model-map ./document_with_stages_mapping_to_register_model.lutaml \
    init document_register
> cd document_register
> ctl --concept-model-map ../document_with_stages_mapping_to_register_model.lutaml \
    init-dataset my-dataset
> ctl --concept-model-map ../document_with_stages_mapping_to_register_model.lutaml \
    upload my-dataset/Doc-00001 files...
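
For discussion, the command-line shape above (one global option followed by a subcommand) can be prototyped with Python's standard `argparse`. This only parses arguments; it does not load or interpret the mapping file. Making `--concept-model-map` mandatory and the argument names `register`, `dataset`, `target` and `files` are assumptions of this sketch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ctl")
    # Global option: must appear before the subcommand, as in the examples above.
    parser.add_argument("--concept-model-map", metavar="FILE", required=True,
                        help="mapping from the domain model to the register model")
    commands = parser.add_subparsers(dest="command", required=True)

    init = commands.add_parser("init", help="create a new register")
    init.add_argument("register")

    init_dataset = commands.add_parser("init-dataset", help="create a dataset")
    init_dataset.add_argument("dataset")

    upload = commands.add_parser("upload", help="upload files for an item")
    upload.add_argument("target")
    upload.add_argument("files", nargs="+")
    return parser


args = build_parser().parse_args([
    "--concept-model-map",
    "../document_with_stages_mapping_to_register_model.lutaml",
    "upload", "my-dataset/Doc-00001", "main.pdf",
])
print(args.command, args.target, args.files, args.concept_model_map)
```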

Does this sound like the right direction?

EDIT: I am getting a similar vibe from lutaml/lutaml-model#165.


ribose-jeffreylau commented Jan 24, 2025

Further ideas & goals

  • The tool should be able to fetch external data models (from local disk, GitHub, etc.) that are specified on invocation (implementation detail: cached, of course).
  • The data models should provide enough domain-specific information for the tool to:
    • perform data input validation,
    • present generic register items as domain-specific objects,
    • provide domain-specific interactions with register items, item classes, metadata, etc.
      • That implies a need to define a mapping of such domain-specific actions to primitive register actions.
  • These data models should ideally be defined in terms of LutaML models.
  • The tool should aim to eliminate the need to perform any further Git-related tasks (e.g., git tag) for purposes such as publishing documents.
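
As a rough illustration of mapping domain-specific actions to primitive register actions, the sketch below expands a hypothetical `publish` action into two primitive `set` operations, mirroring the `status` and `published_date` example earlier in this thread. The `ACTION_MAP` structure and the `expand` function are invented for the example; in the real tool such a mapping would presumably come from the data model rather than from code.

```python
from datetime import date

# Each domain-specific action expands to primitive register operations,
# written as (operation, path template, value template).
ACTION_MAP = {
    "publish": [
        ("set", "datasets.{dataset}.{doc}.status", "published"),
        ("set", "datasets.{dataset}.{doc}.published_date", "{today}"),
    ],
}


def expand(action: str, **params: str) -> list[tuple[str, str, str]]:
    """Translate one domain action into a list of primitive operations."""
    try:
        steps = ACTION_MAP[action]
    except KeyError:
        raise ValueError(f"unknown action: {action}") from None
    return [(op, path.format(**params), value.format(**params))
            for op, path, value in steps]


for step in expand("publish", dataset="my-documents", doc="Doc-00001",
                   today=date.today().isoformat()):
    print(*step)
```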
