Merge branch 'master' into fix_chemical_formula
atomprobe-tc committed Jul 5, 2024
2 parents dd7e962 + aec8ab9 commit 329fa12
Showing 22 changed files with 689 additions and 134 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/plugin_test.yaml
@@ -61,9 +61,12 @@ jobs:
repository: FAIRmat-NFDI/${{ matrix.plugin }}
path: ${{ matrix.plugin }}
ref: ${{ matrix.branch }}
- name: Install nomad
run: |
uv pip install --system nomad-lab@git+https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR.git@pluginise-nexus-code
- name: Install ${{ matrix.plugin }}
run: |
cd ${{ matrix.plugin }}
cd ${{ matrix.plugin }}
uv pip install --system .
- name: Run ${{ matrix.plugin }} tests
run: |
8 changes: 7 additions & 1 deletion .github/workflows/pytest.yml
@@ -30,9 +30,15 @@ jobs:
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install --system coverage coveralls
- name: Install package
- name: Install pynx without nomad plugin
if: "${{ matrix.python_version == '3.8' }}"
run: |
uv pip install --system ".[dev]"
- name: Install pynx with nomad plugin
if: "${{ matrix.python_version != '3.8' }}"
run: |
uv pip install --system nomad-lab@git+https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR.git@pluginise-nexus-code
uv pip install --system ".[dev]"
- name: Test with pytest
run: |
coverage run -m pytest -sv --show-capture=no tests
11 changes: 9 additions & 2 deletions README.md
@@ -32,7 +32,7 @@ pip install git+https://github.com/FAIRmat-NFDI/pynxtools.git

`pynxtools` (previously called `nexusutils`) is intended as a parser for combining various instrument output formats and electronic lab notebook (ELN) formats to an hdf5 file according to NeXus application definitions.

Additionally, the software is used in the research data management system NOMAD for
Additionally, the software can be used as a plugin in the research data management system NOMAD for
making experimental data searchable and publishable.
NOMAD is developed by the FAIRMAT consortium, as a part of the German National Research Data Infrastructure
(NFDI).
@@ -49,6 +49,13 @@ data into the NeXus standard and visualising the files content.
- [**read_nexus**](https://github.com/FAIRmat-NFDI/pynxtools/blob/master/src/pynxtools/nexus/README.md): Outputs a debug log for a given NeXus file.
- [**generate_eln**](https://github.com/FAIRmat-NFDI/pynxtools/blob/master/src/pynxtools/eln_mapper/README.md): Outputs ELN files that can be used to add metadata to the dataconverter routine.

# NOMAD integration

To use pynxtools with NOMAD, simply install it in the same environment as the `nomad-lab` package.
NOMAD will recognize pynxtools as a plugin automatically and offer automatic parsing of `.nxs` files
and a schema for NeXus application definitions.
pynxtools is already included in the NOMAD main deployment and NOMAD NeXus distribution images.
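
To check which plugin entry points are visible in an environment, one could list the `nomad.plugin` entry-point group that this commit registers in `pyproject.toml`. The following is a minimal sketch using only the standard library; `list_plugin_entry_points` is a hypothetical helper name, and this is not necessarily how NOMAD itself discovers plugins.

```python
from importlib import metadata
from typing import Dict


def list_plugin_entry_points(group: str = "nomad.plugin") -> Dict[str, str]:
    """Return {entry point name: "module:attribute"} for one entry-point group.

    Hypothetical helper for illustration; not part of pynxtools or NOMAD.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with .select()
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a dict keyed by group name
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for name, target in sorted(list_plugin_entry_points().items()):
        print(f"{name} -> {target}")
```

With pynxtools installed, the output would be expected to include the `nexus_parser`, `nexus_schema`, and `nexus_data_converter` names declared in `pyproject.toml`.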

# Documentation
Documentation for the different tools can be found [here](https://fairmat-nfdi.github.io/pynxtools/).

@@ -111,7 +118,7 @@ on how to build on this work, or to get your parser included into NOMAD, you can

### Does this software require NOMAD or NOMAD OASIS ?

No. The data files produced here can be uploaded to Nomad. Therefore, this acts like the framework to design schemas and instances of data within the NeXus universe.
No. The data files produced here can be uploaded to NOMAD. Therefore, this acts like the framework to design schemas and instances of data within the NeXus universe. It can, however, be used as a NOMAD plugin to parse NeXus files; please see the section above for details.

# Troubleshooting

4 changes: 3 additions & 1 deletion src/pynxtools/_build_wrapper.py → _build_wrapper.py
@@ -28,7 +28,9 @@ def get_vcs_version(tag_match="*[0-9]*") -> Optional[str]:
"--match",
tag_match,
],
cwd=os.path.join(os.path.dirname(__file__), "../pynxtools/definitions"),
cwd=os.path.join(
os.path.dirname(__file__), "src/pynxtools/definitions"
),
check=True,
capture_output=True,
)
20 changes: 8 additions & 12 deletions dev-requirements.txt
@@ -21,11 +21,7 @@ click==8.1.7
click-default-group==1.2.4
# via pynxtools (pyproject.toml)
colorama==0.4.6
# via
# click
# mkdocs
# mkdocs-material
# pytest
# via mkdocs-material
contourpy==1.2.0
# via matplotlib
coverage==7.4.4
@@ -59,7 +55,7 @@ jinja2==3.1.3
# mkdocs-material
kiwisolver==1.4.5
# via matplotlib
lxml==5.1.0
lxml==5.2.2
# via pynxtools (pyproject.toml)
markdown==3.6
# via
@@ -98,7 +94,7 @@ mypy-extensions==1.0.0
# via mypy
nodeenv==1.8.0
# via pre-commit
numpy==1.26.4
numpy==1.22.4
# via
# pynxtools (pyproject.toml)
# ase
@@ -116,7 +112,7 @@ packaging==24.0
# xarray
paginate==0.5.6
# via mkdocs-material
pandas==2.2.1
pandas==1.5.3
# via
# pynxtools (pyproject.toml)
# xarray
@@ -174,7 +170,9 @@ ruff==0.3.4
scipy==1.12.0
# via ase
setuptools==70.0.0
# via nodeenv
# via
# pynxtools (pyproject.toml)
# nodeenv
six==1.16.0
# via
# anytree
@@ -196,8 +194,6 @@ types-requests==2.31.0.20240311
# via pynxtools (pyproject.toml)
typing-extensions==4.10.0
# via mypy
tzdata==2024.1
# via pandas
urllib3==2.2.1
# via
# requests
@@ -208,7 +204,7 @@ virtualenv==20.25.1
# via pre-commit
watchdog==4.0.0
# via mkdocs
xarray==2024.2.0
xarray==2023.12.0
# via pynxtools (pyproject.toml)
zipp==3.18.1
# via importlib-metadata
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[build-system]
requires = ["setuptools>=64.0.1", "setuptools-scm[toml]>=6.2"]
backend-path = ["src/pynxtools"]
backend-path = ["."]
build-backend = "_build_wrapper"

[project]
@@ -35,7 +35,6 @@ dependencies = [
"importlib-metadata",
"lxml>=4.9.1",
"anytree",
"setuptools>=64.0.1"
]

[project.urls]
@@ -95,6 +94,7 @@ ellips = [
[project.entry-points.'nomad.plugin']
nexus_parser = "pynxtools.nomad.entrypoints:nexus_parser"
nexus_schema = "pynxtools.nomad.entrypoints:nexus_schema"
nexus_data_converter = "pynxtools.nomad.entrypoints:nexus_data_converter"

[project.scripts]
read_nexus = "pynxtools.nexus.nexus:main"
Expand Down
31 changes: 30 additions & 1 deletion src/pynxtools/__init__.py
@@ -19,13 +19,42 @@
import os
import re
from datetime import datetime
from subprocess import CalledProcessError, run
from typing import Optional

from pynxtools._build_wrapper import get_vcs_version
from pynxtools.definitions.dev_tools.globals.nxdl import get_nxdl_version

MAIN_BRANCH_NAME = "fairmat"


def get_vcs_version(tag_match="*[0-9]*") -> Optional[str]:
"""
The version of the Nexus standard and the NeXus Definition language
based on git tags and commits
"""
try:
return (
run(
[
"git",
"describe",
"--dirty",
"--tags",
"--long",
"--match",
tag_match,
],
cwd=os.path.join(os.path.dirname(__file__), "../pynxtools/definitions"),
check=True,
capture_output=True,
)
.stdout.decode("utf-8")
.strip()
)
except (FileNotFoundError, CalledProcessError):
return None
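
The string returned by `git describe --dirty --tags --long` has the shape `<tag>-<distance>-g<hash>[-dirty]`. As a rough sketch of how such output could be split into the four values that `_build_version` takes, the regular expression below is one option. `parse_describe` and `DescribeInfo` are hypothetical names, and the actual parsing in pynxtools is not shown in this diff and may differ (for example, in whether the `g` prefix is kept on the node).

```python
import re
from typing import NamedTuple, Optional

# Greedy tag match, so tags that contain dashes themselves are still handled.
_DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<node>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


class DescribeInfo(NamedTuple):
    tag: str
    distance: int  # number of commits since the tag
    node: str  # abbreviated commit hash, without the leading "g"
    dirty: bool  # True if the working tree had uncommitted changes


def parse_describe(output: Optional[str]) -> Optional[DescribeInfo]:
    """Parse `git describe --long --dirty` output; return None if it does not fit."""
    if output is None:
        return None
    match = _DESCRIBE_RE.match(output.strip())
    if match is None:
        return None
    return DescribeInfo(
        tag=match["tag"],
        distance=int(match["distance"]),
        node=match["node"],
        dirty=match["dirty"] is not None,
    )
```

Returning `None` for unparsable input mirrors how `get_vcs_version` returns `None` when git is unavailable or the command fails, so a caller can fall back to a static version in both cases.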


def _build_version(tag: str, distance: int, node: str, dirty: bool) -> str:
"""
Builds the version string for a given set of git states.
6 changes: 6 additions & 0 deletions src/pynxtools/dataconverter/helpers.py
@@ -892,6 +892,12 @@ def update_and_warn(key: str, value: str, overwrite=False):
get_nexus_version(),
overwrite=False,
)
update_and_warn(
f"/ENTRY[{entry_name}]/definition/@URL",
"https://github.com/FAIRmat-NFDI/nexus_definitions/"
f"blob/{get_nexus_version_hash()}",
overwrite=False,
)


def extract_atom_types(formula, mode="hill"):
101 changes: 72 additions & 29 deletions src/pynxtools/dataconverter/nexus_tree.py
@@ -37,7 +37,6 @@
from pynxtools.dataconverter.helpers import (
contains_uppercase,
get_all_parents_for,
get_nxdl_name_for,
get_nxdl_root_and_path,
is_appdef,
remove_namespace_from_tag,
@@ -134,6 +133,14 @@ class NexusNode(NodeMixin):
for a tree, i.e., setting the parent of a node is enough to add it to the tree
and to its parent's children.
For the root this is None.
is_a: List["NexusNode"]:
A list of NexusNodes the current node represents.
This is used for attaching siblings to the current node, e.g.,
if the parent appdef has a field `DATA(NXdata)` and the current appdef
has a field `my_data(NXdata)` the relation `my_data` `is_a` `DATA` is set.
parent_of: List["NexusNode"]:
The inverse of the above `is_a`. In the example case
`DATA` `parent_of` `my_data`.
"""

name: str
@@ -217,37 +224,68 @@ def get_path(self) -> str:
current_node = current_node.parent
return "/" + "/".join(names)

def search_child_with_name(
self, names: Union[Tuple[str, ...], str]
def search_add_child_for_multiple(
self, names: Tuple[str, ...]
) -> Optional["NexusNode"]:
"""
This searches a child or children with `names` in the current node.
Searches for and adds a child with one of the names in `names` to the current node.
This calls `search_add_child_for` repeatedly until a child is found.
The found child is then returned.
Args:
names (Tuple[str, ...]):
A tuple of names of the child to search for.
Returns:
Optional["NexusNode"]:
The first matching NexusNode for the child name.
If no child is found at all None is returned.
"""
for name in names:
child = self.search_add_child_for(name)
if child is not None:
return child
return None

def search_add_child_for(self, name: str) -> Optional["NexusNode"]:
"""
This searches a child with name `name` in the current node.
If the child is not found as a direct child,
it will search in the inheritance chain and add the child to the tree.
Args:
names (Union[Tuple[str, ...], str]):
Either a single string or a tuple of string.
In case this is a string the child with the specific name is searched.
If it is a tuple, the first match is used.
name (str):
Name of the child to search for.
Returns:
Optional[NexusNode]:
The node of the child which was added. None if no child was found.
"""
if isinstance(names, str):
names = (names,)
for name in names:
direct_child = next((x for x in self.children if x.name == name), None)
if direct_child is not None:
return direct_child
if name in self.get_all_direct_children_names():
return self.add_inherited_node(name)
tags = (
"*[self::nx:field or self::nx:group "
"or self::nx:attribute or self::nx:choice]"
)
for elem in self.inheritance:
xml_elem = elem.xpath(
f"{tags}[@name='{name}']",
namespaces=namespaces,
)
if not xml_elem and name.isupper():
xml_elem = elem.xpath(
f"{tags}[@type='NX{name.lower()}' and not(@name)]",
namespaces=namespaces,
)
if not xml_elem:
continue
existing_child = self.get_child_for(xml_elem[0])
if existing_child is None:
return self.add_node_from(xml_elem[0])
return existing_child
return None
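
The lookup order used above (first an element with an explicit `name`, then, for all-uppercase names only, an unnamed element whose `type` is `NX<name.lower()>`) can be illustrated on a single XML element. This is a simplified sketch with the standard library's `xml.etree.ElementTree` rather than the lxml XPath calls in the diff. `find_child_elem` is a hypothetical name, the namespace URI is assumed for illustration, and the real method additionally walks the whole inheritance chain and adds the result to the tree.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace assumed for illustration only.
NXDL_NS = "http://definition.nexusformat.org/nxdl/3.1"
CHILD_TAGS = {
    f"{{{NXDL_NS}}}{tag}" for tag in ("field", "group", "attribute", "choice")
}


def find_child_elem(parent: ET.Element, name: str) -> Optional[ET.Element]:
    """Find a direct child by name, falling back to an unnamed typed child."""
    candidates = [elem for elem in parent if elem.tag in CHILD_TAGS]

    # First pass: a child that carries the requested name explicitly.
    for elem in candidates:
        if elem.get("name") == name:
            return elem

    # Second pass, only for all-uppercase names such as "DATA":
    # an unnamed child whose type is "NX" + lowercased name, e.g. "NXdata".
    if name.isupper():
        for elem in candidates:
            if elem.get("name") is None and elem.get("type") == f"NX{name.lower()}":
                return elem
    return None


if __name__ == "__main__":
    root = ET.fromstring(
        f'<definition xmlns="{NXDL_NS}">'
        '<group type="NXdata"/>'
        '<group name="my_data" type="NXdata"/>'
        "</definition>"
    )
    print(find_child_elem(root, "DATA").attrib)
    print(find_child_elem(root, "my_data").attrib)
```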

def get_children_for(self, xml_elem: ET._Element) -> Optional["NexusNode"]:
def get_child_for(self, xml_elem: ET._Element) -> Optional["NexusNode"]:
"""
Get the children of the current node which matches xml_elem.
Get the child of the current node, which matches xml_elem.
Args:
xml_elem (ET._Element): The xml element to search in the children.
@@ -257,7 +295,10 @@ def get_children_for(self, xml_elem: ET._Element) -> Optional["NexusNode"]:
The NexusNode containing the children.
None if there is no initialised children for the xml_node.
"""
return next((x for x in self.children if x.inheritance[0] == xml_elem), None)
for child in self.children:
if child.inheritance and child.inheritance[0] == xml_elem:
return child
return None
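
The explicit loop differs from the earlier one-line `next(...)` expression in one respect: it checks that `inheritance` is non-empty before indexing it. The sketch below uses a stripped-down stand-in class (`Node` is hypothetical, not the real `NexusNode`) to show the difference, assuming that a child can have an empty `inheritance` list.

```python
from typing import List, Optional


class Node:
    """Minimal stand-in for NexusNode, for illustration only."""

    def __init__(self, name: str, inheritance: Optional[List[object]] = None):
        self.name = name
        self.inheritance: List[object] = inheritance or []
        self.children: List["Node"] = []

    def get_child_for(self, xml_elem: object) -> Optional["Node"]:
        for child in self.children:
            # Guard against an empty list before indexing into it.
            if child.inheritance and child.inheritance[0] == xml_elem:
                return child
        return None


def unguarded_lookup(node: Node, xml_elem: object) -> Optional[Node]:
    """The pre-change form: raises IndexError on a child without inheritance."""
    return next((x for x in node.children if x.inheritance[0] == xml_elem), None)


if __name__ == "__main__":
    elem = object()
    parent = Node("root")
    parent.children = [Node("no_inheritance"), Node("match", [elem])]

    print(parent.get_child_for(elem).name)
    try:
        unguarded_lookup(parent, elem)
    except IndexError as exc:
        print(f"unguarded lookup failed: {exc}")
```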

def get_all_direct_children_names(
self,
@@ -618,31 +659,33 @@ def _check_sibling_namefit(self):
if get_nx_namefit(self.name, sibling_name) < 0:
continue

sibling_node = self.parent.get_children_for(sibling)
sibling_node = self.parent.get_child_for(sibling)
if sibling_node is None:
sibling_node = self.parent.add_node_from(sibling)
self.is_a.append(sibling_node)
sibling_node.parent_of.append(self)

min_occurs = (
(1 if self.optionality == "required" else 0)
(1 if sibling_node.optionality == "required" else 0)
if sibling_node.occurrence_limits[0] is None
else sibling_node.occurrence_limits[0]
)
min_occurs = (
1
if self.optionality == "required" and min_occurs < 1
else min_occurs
)

required_children = reduce(
lambda x, y: x + (1 if y.optionality == "required" else 0),
sibling_node.parent_of,
0,
)

if required_children >= min_occurs:
self.optionality = "optional"
if (
sibling_node.optionality == "required"
and required_children >= min_occurs
):
sibling_node.optionality = "optional"
break
else:
continue
break
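
Because this page does not mark which lines were added and which were removed, the following is only one plausible reading of the resulting logic: a concrete node is linked to the generic sibling it fits via `is_a` / `parent_of`, and the generic sibling is relaxed to optional once enough required concrete nodes cover its minimum occurrence. `Node` and `link_and_relax` are hypothetical names, and the sketch leaves out the name-fit check and tree handling.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity comparison; the is_a/parent_of links are cyclic
class Node:
    name: str
    optionality: str = "optional"
    occurrence_limits: Tuple[Optional[int], Optional[int]] = (None, None)
    is_a: List["Node"] = field(default_factory=list)
    parent_of: List["Node"] = field(default_factory=list)


def link_and_relax(concrete: Node, sibling: Node) -> None:
    """Record `concrete is_a sibling` and relax the sibling if it is covered."""
    concrete.is_a.append(sibling)
    sibling.parent_of.append(concrete)

    # Minimum occurrence: explicit lower limit if given, else derived
    # from the sibling's own optionality.
    if sibling.occurrence_limits[0] is None:
        min_occurs = 1 if sibling.optionality == "required" else 0
    else:
        min_occurs = sibling.occurrence_limits[0]

    # Count the required concrete nodes that stand in for the sibling.
    required_children = sum(
        1 for child in sibling.parent_of if child.optionality == "required"
    )

    if sibling.optionality == "required" and required_children >= min_occurs:
        sibling.optionality = "optional"


if __name__ == "__main__":
    data = Node("DATA", optionality="required")
    my_data = Node("my_data", optionality="required")
    link_and_relax(my_data, data)
    print(data.optionality)
```

In the `DATA(NXdata)` / `my_data(NXdata)` example from the class docstring, this means the generic `DATA` group no longer has to be supplied separately once the required `my_data` group exists.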

def _set_occurence_limits(self):
"""
@@ -817,7 +860,7 @@ def populate_tree_from_parents(node: NexusNode):
The current node from which to populate the tree.
"""
for child in node.get_all_direct_children_names(only_appdef=True):
child_node = node.search_child_with_name(child)
child_node = node.search_add_child_for(child)
populate_tree_from_parents(child_node)

