Moved convert utilities to OpenVisus official repo
scrgiorgio committed Sep 22, 2022
1 parent fefee3b commit 0f5b2d1
Showing 14 changed files with 512,219 additions and 27 deletions.
3 changes: 1 addition & 2 deletions ansible/roles/install_conda/tasks/main.yaml
@@ -139,8 +139,7 @@
# vault
git clone git@github.com:nsdf-fabric/vault.git ~/.nsdf/vault
mkdir -p ~/.aws/
cp ~/.nsdf/vault/aws_config ~/.aws/config
cp ~/.nsdf/vault/aws_credentials ~/.aws/credentials
cp ~/.nsdf/vault/aws/* ~/.aws/
# software stack
git clone git@github.com:nsdf-fabric/nsdf-software-stack.git ~/nsdf-software-stack
3 changes: 1 addition & 2 deletions ansible/roles/install_cpython/tasks/main.yaml
@@ -147,8 +147,7 @@
rm -Rf ~/.nsdf/vault
git clone git@github.com:nsdf-fabric/vault.git ~/.nsdf/vault
mkdir -p ~/.aws/
cp ~/.nsdf/vault/aws_config ~/.aws/config
cp ~/.nsdf/vault/aws_credentials ~/.aws/credentials
cp ~/.nsdf/vault/aws/* ~/.aws/
# software stack
rm -Rf ~/nsdf-software-stack
4 changes: 3 additions & 1 deletion nsdf/convert/__main__.py
@@ -1,8 +1,10 @@
import os,sys



# ////////////////////////////////////////////////////////////////////////////
if __name__=="__main__":
from nsdf.kernel import SetupLogger
from nsdf.kernel import SetupLogger,logger

import logging
os.makedirs("/tmp/nsdf",exist_ok=True)
120 changes: 107 additions & 13 deletions nsdf/ipfs/ReadMe.md
@@ -40,8 +40,7 @@ pushd ~/.nsdf/vault/ && git checkout -f && git pull && popd
# create AWS credentials
mkdir ~/.aws
cp ~/.nsdf/vault/aws_config ~/.aws/config
cp ~/.nsdf/vault/aws_credentials ~/.aws/credentials
cp ~/.nsdf/vault/aws/* ~/.aws/
more ~/.aws/config
# open firewall ports
@@ -75,24 +74,119 @@ git clone git@github.com:nsdf-fabric/nsdf-software-stack.git
cd nsdf-software-stack
```

# Copy objects
# OpenVisus (modvisus)


See https://docs.google.com/spreadsheets/d/11dw63aS7NJA4px0yp_IZB1Yq79JvXsl7/edit#gid=1779202315

```
ssh nsdf1.scu.utah.edu
sudo apt install nfs-common
sudo mkdir -p /usr/sci/cedmav
sudo mount -t nfs 155.98.58.116:/cedmav /usr/sci/cedmav
sudo mkdir -p /usr/sci/brain
sudo mount -t nfs 155.98.58.116:/sci/brain /usr/sci/brain
rm datasets.txt
nohup find /usr/sci/cedmav -iname "*.idx" 2>/dev/null >> datasets.txt
nohup find /usr/sci/brain -iname "*.idx" 2>/dev/null >> datasets.txt
```
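
The same inventory can be built from Python, which is handy when the list needs post-processing. This is only a minimal sketch, not part of the repository: the helper names `find_idx_files` and `write_dataset_list` are hypothetical, it matches regular files only (unlike `find`, which also matches directories), and it sorts the result.

```python
import os


def find_idx_files(roots):
    """Yield paths of files whose name ends in .idx (case-insensitive)."""
    for root in roots:
        # os.walk skips unreadable directories by default, which roughly
        # corresponds to discarding find's stderr with 2>/dev/null
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name.lower().endswith(".idx"):
                    yield os.path.join(dirpath, name)


def write_dataset_list(roots, out_path="datasets.txt"):
    """Write one dataset path per line to out_path and return the list."""
    paths = sorted(find_idx_files(roots))
    with open(out_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return paths


# Example call (paths taken from the mount commands above):
# write_dataset_list(["/usr/sci/cedmav", "/usr/sci/brain"])
```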

# Material Science Data

See `material-science-upload.py`


# OpenVisus (arco)

Example of how to create an ARCO version of an existing dataset:

```
#!/bin/bash
# convert locally to ARCO format
python3 -m nsdf.convert copy-dataset --arco 1mb /mnt/c/data/visus-dataset/2kbit1/modvisus/visus.idx /mnt/c/data/visus-dataset/2kbit1/1mb/visus.idx
python3 -m nsdf.convert compress-dataset --compression zip --num-threads 32 /mnt/c/data/visus-dataset/2kbit1/1mb/visus.idx
# copy ARCO to the cloud
# aws s3 --no-verify-ssl --endpoint-url <endpoint> --profile <profile> cp --if-size-differ <src> s3://<dst-path>
export NUM_WORKERS=6
export NUM_CONNECTIONS=128
export SRC="s3://Pania_2021Q3_in_situ_data?profile=wasabi&num-connections=$NUM_CONNECTIONS"
export DST="s3://utah/buckets/Pania_2021Q3_in_situ_data?profile=sealstorage&num-connections=$NUM_CONNECTIONS&no-verify-ssl"
AWS_PROFILE=chpc s5cmd --no-verify-ssl -log=debug "$@" --numworkers 64 --endpoint-url https://pando-rgw01.chpc.utah.edu cp --if-size-differ \
"/mnt/c/data/visus-datasets/2kbit1/1mb/*" "s3://nsdf/visus-datasets/2kbit1/1mb/"
# repeat the copy forever; quote the URLs so that `?` is not treated as a glob
while true ; do
python3 -m nsdf.s3 copy-objects "${SRC}" "${DST}"
done
```
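
The `while` loop above repeats the copy forever. If a bounded retry is preferred, something along these lines could wrap the same command. It is a sketch under assumptions: `run_until_success` is a hypothetical helper, and it assumes that the copy command exits with a non-zero status on failure, which the source does not state.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=5, delay_seconds=1.0):
    """Run cmd until it exits with status 0; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # wait a little before the next attempt
            time.sleep(delay_seconds)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Example call, with SRC and DST holding the URLs exported above:
# run_until_success(["python3", "-m", "nsdf.s3", "copy-objects", SRC, DST])
```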

# New upload

See `upload.py`
# NASA Satellite data (?)

Links:

- https://www.matecdev.com/posts/download-remote-sensing-data-python.html
- https://www.geosage.com/products/spectral_transformer/Sentinel2/SpectralDiscoveryForSentinel2.pdf
- https://ladsweb.modaps.eosdis.nasa.gov/tools-and-services/data-download-scripts/


Repository with NASA data
- https://e4ftl01.cr.usgs.gov/


All AWS public data with `RequesterPays`:

```
git clone https://github.com/awslabs/open-data-registry.git
cd open-data-registry/
git grep --files-with-matches "RequesterPays: True"
# https://github.com/awslabs/open-data-registry/blob/main/datasets/cbers.yaml
# Imagery acquired by the China-Brazil Earth Resources Satellite (CBERS), 4 and 4A.
https://github.com/awslabs/open-data-registry/blob/main/datasets/cbers.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/hsip-lidar-us-cities.yaml
# The U.S. Cities elevation data collection program supported the US Department of Homeland Security Homeland Security and Infrastructure Program (HSIP).
https://github.com/awslabs/open-data-registry/blob/main/datasets/hsip-lidar-us-cities.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/modis-astraea.yaml
# data from the Moderate Resolution Imaging Spectroradiometer (MODIS), managed by the U.S. Geological Survey and NASA.
https://github.com/awslabs/open-data-registry/blob/main/datasets/modis-astraea.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/naip.yaml
# The National Agriculture Imagery Program (NAIP) acquires aerial imagery during the agricultural growing seasons in the continental U.S.
https://github.com/awslabs/open-data-registry/blob/main/datasets/naip.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/sentinel-1.yaml
# "[Sentinel-1](https://sentinel.esa.int/web/sentinel/missions/sentinel-1) is a pair of European radar imaging (SAR) satellites launched in 2014 and 2016.
https://github.com/awslabs/open-data-registry/blob/main/datasets/sentinel-1.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/sentinel-2.yaml
# The [Sentinel-2 mission](https://sentinel.esa.int/web/sentinel/missions/sentinel-2) is a land monitoring
# constellation of two satellites that provide high resolution optical imagery
https://github.com/awslabs/open-data-registry/blob/main/datasets/sentinel-2.yaml
# (NASA) https://github.com/awslabs/open-data-registry/blob/main/datasets/usgs-landsat.yaml
# This joint NASA/USGS program provides the longest continuous space-based record of
# Earth’s land in existence. Every day, Landsat satellites provide essential information
# to help land managers and policy makers make wise decisions about our resources and our environment.
# Data is provided for Landsats 1, 2, 3, 4, 5, 7, and 8.
https://github.com/awslabs/open-data-registry/blob/main/datasets/usgs-landsat.yaml
# https://github.com/awslabs/open-data-registry/blob/main/datasets/usgs-lidar.yaml
# The goal of the [USGS 3D Elevation Program ](https://www.usgs.gov/core-science-systems/ngp/3dep) (3DEP) is to collect elevation data in the form of light detection and ranging (LiDAR) data
https://github.com/awslabs/open-data-registry/blob/main/datasets/usgs-lidar.yaml
```
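
The `git grep` step can also be reproduced in Python, for example to feed the matches into another script. The sketch below is an illustration only: `requester_pays_datasets` is a hypothetical helper, and it assumes the dataset descriptions live in a `datasets/` folder of the cloned registry (as in the URLs above), whereas `git grep` searches the whole repository.

```python
from pathlib import Path


def requester_pays_datasets(registry_dir, needle="RequesterPays: True"):
    """Return the names of dataset YAML files that contain the needle."""
    datasets_dir = Path(registry_dir) / "datasets"
    matches = []
    for path in sorted(datasets_dir.glob("*.yaml")):
        # plain substring search, like git grep with a fixed string
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            matches.append(path.name)
    return matches


# Example call, run from the directory that contains the clone:
# print(requester_pays_datasets("open-data-registry"))
```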

Command to collect statistics:

```
# with AWS cli tools
aws s3 ls --request-payer requester s3://usgs-landsat/collection02/
aws s3 ls --request-payer requester --summarize --recursive --human-readable s3://usgs-landsat/collection02/
# with s5cmd
export AWS_REGION=us-west-2
s5cmd --endpoint-url=https://s3.us-west-2.amazonaws.com --request-payer=requester --log trace --numworkers=256 ls --recursive "s3://usgs-landsat/collection02/"
s5cmd --endpoint-url=https://s3.us-west-2.amazonaws.com --request-payer=requester --log trace --numworkers=256 du "s3://usgs-landsat/collection02/*"
```
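
When only the totals matter, the recursive listing can be summed in a few lines of Python. This is a rough sketch with a hypothetical helper, `summarize_listing`; it assumes the plain `aws s3 ls --recursive` line layout of `date time size key` and does not handle `--human-readable` output, where sizes carry a unit.

```python
def summarize_listing(lines):
    """Return (object_count, total_bytes) for `aws s3 ls --recursive` lines."""
    count = 0
    total = 0
    for line in lines:
        # split into date, time, size, key; the key may contain spaces
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # blank lines and the "Total ..." summary lines
        size = parts[2]
        if not size.isdigit():
            continue  # not a plain byte count
        count += 1
        total += int(size)
    return count, total


# Example call on a listing saved to a file:
# with open("listing.txt") as f:
#     print(summarize_listing(f))
```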
File renamed without changes.