Commit
Merge pull request #2 from mundialis/training_preparation
Training preparation
Showing 9 changed files with 676 additions and 5 deletions.
@@ -0,0 +1,7 @@
MODULE_TOPDIR = ../..

PGM = m.neural_network.preparetraining.worker

include $(MODULE_TOPDIR)/include/Make/Script.make

default: script
14 changes: 14 additions & 0 deletions
m.neural_network.preparetraining.worker/m.neural_network.preparetraining.worker.html
@@ -0,0 +1,14 @@
<h2>DESCRIPTION</h2>

<em>m.neural_network.preparetraining.worker</em> is used within <em>m.neural_network.preparetraining</em> to rasterize label data in parallel.
<p>
<h2>SEE ALSO</h2>

<em>
<a href="g.region.html">g.region</a>,
<a href="r.mapcalc.html">r.mapcalc</a>,
<a href="v.to.rast.html">v.to.rast</a>
</em>

<h2>AUTHORS</h2>
<p>Guido Riembauer, <a href="https://www.mundialis.de/">mundialis GmbH & Co. KG</a><br>
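The manual page above states that the worker is called by the parent module to rasterize label data in parallel. The commit shown here does not include the dispatching code, so the following is only a minimal sketch of how per-tile worker calls could be assembled and run concurrently. The option keys match the worker's interface; the function names, the tile tuple layout, the mapset naming scheme, and the use of a thread pool are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def build_worker_kwargs(tiles, class_column="class_number",
                        class_values=(2,), no_class_value=1):
    """Return one keyword-argument dict per tile, keyed by the worker's options.

    tiles: iterable of (label_path, img_path, output_path) tuples
    (a hypothetical layout chosen for this sketch).
    """
    jobs = []
    for idx, (label_path, img_path, out_path) in enumerate(tiles):
        jobs.append({
            "input": label_path,
            "img_path": img_path,
            "class_column": class_column,
            "class_values": ",".join(str(v) for v in class_values),
            "no_class_value": no_class_value,
            "output": out_path,
            # Each worker runs in its own mapset; this naming is hypothetical.
            "new_mapset": f"tmp_mapset_rasterize_{idx}",
        })
    return jobs


def run_jobs(jobs, run_one, nprocs=2):
    """Call run_one(**job) for every job, at most nprocs at a time.

    Results are returned in the same order as the jobs.
    """
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        return list(pool.map(lambda job: run_one(**job), jobs))
```

Inside a GRASS session, `run_one` could be a thin wrapper around `grass.run_command("m.neural_network.preparetraining.worker", **kwargs)`; the actual parent module may well use a different mechanism.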
221 changes: 221 additions & 0 deletions
m.neural_network.preparetraining.worker/m.neural_network.preparetraining.worker.py
@@ -0,0 +1,221 @@
#!/usr/bin/env python3
"""############################################################################
#
# MODULE: m.neural_network.preparetraining.worker
# AUTHOR(S): Guido Riembauer
# PURPOSE: Worker module for m.neural_network.preparetraining to check
# and rasterize label data
# COPYRIGHT: (C) 2024 by mundialis GmbH & Co. KG and the GRASS Development
# Team.
#
# This program is free software under the GNU General Public
# License (v3). Read the file COPYING that comes with GRASS
# for details.
#
##############################################################################
"""

# %Module
# % description: Worker module for m.neural_network.preparetraining to check and rasterize label data
# % keyword: raster
# % keyword: statistics
# %end

# %option G_OPT_F_INPUT
# % required: yes
# % multiple: no
# % label: Path to the label vector file
# % guisection: Input
# %end

# %option G_OPT_F_INPUT
# % key: img_path
# % required: yes
# % multiple: no
# % label: Path to the corresponding imagery raster file
# % guisection: Input
# %end

# %option
# % key: class_column
# % type: string
# % required: yes
# % multiple: no
# % answer: class_number
# % label: Column of the label vector that holds the class number
# % guisection: Parameters
# %end

# %option
# % key: class_values
# % type: integer
# % required: yes
# % multiple: yes
# % answer: 2
# % label: Expected and output values for the class/es of interest
# % guisection: Parameters
# %end

# %option
# % key: no_class_value
# % type: integer
# % required: yes
# % multiple: no
# % answer: 1
# % label: Expected and output value for the non class of interest areas
# % description: Can be understood as a "rest" class for a multiclass system and a "no-class" for a binary classification
# % guisection: Parameters
# %end

# %option G_OPT_F_OUTPUT
# % required: yes
# % multiple: no
# % label: Path to the output label raster file
# % guisection: Output
# %end

# %option
# % key: new_mapset
# % type: string
# % required: yes
# % multiple: no
# % label: Name of the new mapset to work in
# % guisection: Parameters
# %end

import atexit
import os
import shutil

import grass.script as grass
from grass_gis_helpers.mapset import switch_to_new_mapset
from osgeo import gdal

NEWGISRC = None
GISRC = None
ID = grass.tempname(8)
NEW_MAPSET = None


def cleanup():
    """Switch back to the original mapset and delete the new one."""
    # switch back to original mapset (skipped if the switch never happened)
    if NEWGISRC:
        grass.utils.try_remove(NEWGISRC)
    if GISRC:
        os.environ["GISRC"] = GISRC
    # delete the new mapset (better safe than sorry)
    if NEW_MAPSET:
        gisenv = grass.gisenv()
        gisdbase = gisenv["GISDBASE"]
        location = gisenv["LOCATION_NAME"]
        mapset_dir = os.path.join(gisdbase, location, NEW_MAPSET)
        if os.path.isdir(mapset_dir):
            shutil.rmtree(mapset_dir)


def main():
    """Run label rasterization."""
    global NEWGISRC, GISRC, NEW_MAPSET
    input = options["input"]
    img_file = options["img_path"]
    NEW_MAPSET = options["new_mapset"]
    class_values = options["class_values"].split(",")
    no_class_value = options["no_class_value"]
    class_col = options["class_column"]
    output = options["output"]

    # switch to the new mapset
    GISRC, NEWGISRC, old_mapset = switch_to_new_mapset(NEW_MAPSET)
    # get extent from reference img file
    info = gdal.Info(img_file, format="json")
    south = info["cornerCoordinates"]["lowerLeft"][1]
    west = info["cornerCoordinates"]["lowerLeft"][0]
    north = info["cornerCoordinates"]["upperRight"][1]
    east = info["cornerCoordinates"]["upperRight"][0]
    cols, rows = info["size"]
    # set the region
    grass.run_command(
        "g.region",
        n=north,
        s=south,
        e=east,
        w=west,
        rows=rows,
        cols=cols,
        quiet=True,
    )

    # import the label dataset
    labelvect = f"labelvect_{ID}"
    labelrast = f"labelrast_{ID}"
    grass.run_command("v.import", input=input, output=labelvect, quiet=True)

    # check the values of the vector
    dbselect = list(grass.parse_command("v.db.select", map=labelvect).keys())
    colnames = dbselect[0].split("|")
    rows = [item.split("|") for item in dbselect[1:]]
    try:
        idx = colnames.index(class_col)
    except ValueError:
        grass.fatal(_(f"File {input} has no column {class_col}"))
    class_numbers = [item[idx] for item in rows]
    class_num_set_ref = set([*class_values, no_class_value])
    difference = set(class_numbers).difference(class_num_set_ref)
    if len(difference) > 0:
        grass.fatal(
            _(
                f"Label file {input} has features with unexpected values"
                f" in column {class_col}: {difference}. Allowed values "
                f"are [{','.join(class_values)}, {no_class_value}].",
            ),
        )

    tile_empty = False
    if len(class_numbers) == 0 or set(class_numbers) == set([no_class_value]):
        grass.warning(
            _(
                f"Label file {input} contains no features with the "
                f"expected class values {class_values} in "
                f"column {class_col}. It is assumed that the classes "
                "do not occur in this tile.",
            ),
        )
        tile_empty = True

    # rasterize
    if tile_empty is True:
        grass.run_command(
            "r.mapcalc",
            expression=f"{labelrast}={no_class_value}",
            quiet=True,
        )
    else:
        labelrast_tmp = f"{labelrast}_tmp"
        grass.run_command(
            "v.to.rast",
            input=labelvect,
            output=labelrast_tmp,
            type="area",
            use="attr",
            attribute_column=class_col,
            quiet=True,
        )
        # if there is any nodata left in the label, this will be assigned
        # to the no-class class
        exp = f"{labelrast}=if(isnull({labelrast_tmp}),{no_class_value},{labelrast_tmp})"
        grass.run_command("r.mapcalc", expression=exp, quiet=True)

    grass.run_command(
        "r.out.gdal",
        input=labelrast,
        output=output,
        type="Byte",
        createopt="COMPRESS=LZW",  # no tiles or overviews required for the small tiles (?)
        flags="c",
        quiet=True,
    )


if __name__ == "__main__":
    options, flags = grass.parser()
    atexit.register(cleanup)
    main()
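The value check in the worker can be hard to follow because it is interleaved with GRASS calls. The sketch below isolates the same logic as a plain function that can be tried outside a GRASS session. It assumes the attribute table arrives as pipe-separated lines with a header row, as `v.db.select` prints by default; the function name and return shape are hypothetical and not part of the commit. Values are compared as strings, as in the worker.

```python
def check_label_values(table_lines, class_col, class_values, no_class_value):
    """Validate class numbers taken from pipe-separated attribute rows.

    table_lines: header line followed by one line per feature.
    Returns (unexpected_values, tile_empty), where tile_empty is True
    when no feature carries one of the classes of interest.
    """
    colnames = table_lines[0].split("|")
    if class_col not in colnames:
        raise ValueError(f"no column {class_col}")
    idx = colnames.index(class_col)
    class_numbers = {line.split("|")[idx] for line in table_lines[1:]}
    allowed = {*class_values, no_class_value}
    # Values outside the allowed set correspond to the fatal-error case.
    unexpected = class_numbers - allowed
    # An empty table or a table with only the no-class value counts as empty.
    tile_empty = class_numbers <= {no_class_value}
    return unexpected, tile_empty


lines = ["cat|class_number", "1|2", "2|1", "3|7"]
unexpected, tile_empty = check_label_values(lines, "class_number", ["2"], "1")
print(unexpected, tile_empty)
```

In the worker, a non-empty set of unexpected values leads to `grass.fatal`, while an empty tile only triggers a warning and is filled with `no_class_value`.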
@@ -0,0 +1,7 @@
MODULE_TOPDIR = ../..

PGM = m.neural_network.preparetraining

include $(MODULE_TOPDIR)/include/Make/Script.make

default: script
63 changes: 63 additions & 0 deletions
m.neural_network.preparetraining/m.neural_network.preparetraining.html
@@ -0,0 +1,63 @@
<h2>DESCRIPTION</h2>

<em>m.neural_network.preparetraining</em> prepares imagery and labelled data for training and application of a neural network.
<p>While <a href="m.neural_network.preparedata">m.neural_network.preparedata</a> initially provides a setup for labelling tiles of imagery,
<em>m.neural_network.preparetraining</em> rasterizes the vector labels and restructures the imagery data.

<h2>NOTES</h2>
All data are expected to follow the directory structure and naming format created by <a href="m.neural_network.preparedata">m.neural_network.preparedata</a>.
The data are passed to <em>m.neural_network.preparetraining</em> via the <em>input_traindir</em> and <em>input_applydir</em> parameters.
<em>m.neural_network.preparetraining</em> creates a new directory with the two directories <em>train</em> and <em>apply</em>. Each of these contains
the following directories/data:

<ul>
<li><em>train_images</em>: contains tilewise multiband .vrt files with all imagery bands and an ndsm band to be used for training. This directory is empty in the <em>apply</em> dir.</li>
<li><em>train_masks</em>: contains tilewise rasterized .tif label files to be used for training. This directory is empty in the <em>apply</em> dir.</li>
<li><em>val_images</em>: contains tilewise multiband .vrt files with all imagery bands and an ndsm band to be used for validation. This directory holds data in both the <em>train</em> and <em>apply</em> dirs. In the <em>train</em> dir, the data is used for validation during training, while in the <em>apply</em> dir, it holds all imagery used for prediction.</li>
<li><em>val_masks</em>: contains tilewise rasterized .tif label files to be used for validation. This directory is empty in the <em>apply</em> dir.</li>
<li><em>singleband_vrts</em>: contains singleband .vrt files for each imagery band of each tile. They are stored here as the basis for the tilewise multiband .vrt files.</li>
<li><em>tile_XX_YY.vrt</em> (only in the <em>train</em> dir): one multiband tile .vrt is stored here so that the NN model can read the number of bands.</li>
</ul>
<p>
To save disk space, all imagery is stored as .vrt files, so the original datasets (created by <a href="m.neural_network.preparedata">m.neural_network.preparedata</a>) should
not be moved (or <em>m.neural_network.preparetraining</em> should be run again afterwards).
</p>
<p>
The <em>val_percentage</em> parameter sets the percentage of training tiles that are used for validation during training.
</p>
<p>
<em>m.neural_network.preparetraining</em> cannot be run repeatedly with the same <em>output</em> directory, because the training/validation split is made at runtime.
Hence, <em>m.neural_network.preparetraining</em> expects that the <em>output</em> directory does not exist.
</p>
<p>
With the <em>class_values</em> and <em>no_class_value</em> parameters, the user defines the allowed values in the <em>class_column</em> of the labelled data. If
an unexpected value is found, an error is raised that names the affected tile.
</p>
<p>
If a tile is not completely covered by features with <em>class_values</em> or <em>no_class_value</em>, the uncovered areas are filled with <em>no_class_value</em> in the rasterized output.
</p>

<h2>EXAMPLES</h2>

<div class="code"><pre>
m.neural_network.preparetraining input_traindir=nn_data_with_labels/train input_applydir=nn_data_with_labels/apply nprocs=6 class_column=class_number class_values=2 no_class_value=1 output=nn_data_structured
</pre></div>


<h2>SEE ALSO</h2>

<em>
<a href="https://grass.osgeo.org/grass-stable/manuals/v.import.html">v.import</a>,
<a href="https://grass.osgeo.org/grass-stable/manuals/g.region.html">g.region</a>,
<a href="https://grass.osgeo.org/grass-stable/manuals/r.mapcalc.html">r.mapcalc</a>,
<a href="https://grass.osgeo.org/grass-stable/manuals/v.to.rast.html">v.to.rast</a>
</em>

<h2>REQUIREMENTS</h2>
<ul>
<li>GDAL and OGR Python bindings</li>
<li><a href="https://pypi.org/project/grass-gis-helpers/">grass-gis-helpers</a> Python library >= 2.2.0</li>
</ul>

<h2>AUTHORS</h2>
Guido Riembauer, <a href="https://www.mundialis.de/">mundialis GmbH & Co. KG</a><br>
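The manual page above says that <em>val_percentage</em> controls how many training tiles are set aside for validation and that the split is made at runtime. The part of the commit visible here does not show how the split is implemented, so the following is only a sketch of one plausible approach: a random shuffle followed by a cut. The function name, the optional seed, and the rounding rule are assumptions.

```python
import random


def split_tiles(tile_names, val_percentage, seed=None):
    """Randomly split tile names into (train, val) lists.

    val_percentage is the share of tiles (0-100) reserved for validation.
    The seed argument is hypothetical and only there for reproducibility.
    """
    if not 0 <= val_percentage <= 100:
        raise ValueError("val_percentage must be between 0 and 100")
    tiles = sorted(tile_names)
    rng = random.Random(seed)
    rng.shuffle(tiles)
    # Rounding to the nearest whole tile is an assumption of this sketch.
    n_val = round(len(tiles) * val_percentage / 100)
    return sorted(tiles[n_val:]), sorted(tiles[:n_val])


tiles = [f"tile_{i:02d}_00" for i in range(10)]
train, val = split_tiles(tiles, val_percentage=20, seed=42)
print(len(train), len(val))
```

Without a fixed seed, two runs will generally produce different splits, which is consistent with the module requiring a fresh <em>output</em> directory on every run.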