diff --git a/README.md b/README.md
index 42f1eb95..265962ff 100644
--- a/README.md
+++ b/README.md
@@ -48,10 +48,10 @@ For interactive use, the following commands (without ending semicolon) display t
 
 ```julia
 using NCDatasets
-ds = Dataset("file.nc")
+ds = NCDataset("file.nc")
 ```
 
-This creates the central structure of NCDatasets.jl, `Dataset`, which represents the contents of the netCDF file (without immediatelly loading everything in memory). `NCDataset` is an alias for `Dataset`.
+This creates the central structure of NCDatasets.jl, `NCDataset`, which represents the contents of the netCDF file (without immediatelly loading everything in memory). `NCDataset` is an alias for `Dataset`.
 
 The following displays the information just for the variable `varname`:
 
@@ -89,7 +89,7 @@ Loading a variable with known structure can be achieved by accessing the variabl
 
 ```julia
 # The mode "r" stands for read-only. The mode "r" is the default mode and the parameter can be omitted.
-ds = Dataset("/tmp/test.nc","r")
+ds = NCDataset("/tmp/test.nc","r")
 v = ds["temperature"]
 
 # load a subset
@@ -110,7 +110,7 @@ close(ds)
 In the example above, the subset can also be loaded with:
 
 ```julia
-subdata = Dataset("/tmp/test.nc")["temperature"][10:30,30:5:end]
+subdata = NCDataset("/tmp/test.nc")["temperature"][10:30,30:5:end]
 ```
 
 This might be useful in an interactive session. However, the file `test.nc` is not directly closed (closing the file will be triggered by Julia's garbage collector), which can be a problem if you open many files. On Linux the number of opened files is often limited to 1024 (soft limit). If you write to a file, you should also always close the file to make sure that the data is properly written to the disk.
@@ -118,7 +118,7 @@ This might be useful in an interactive session. However, the file `test.nc` is n
 An alternative way to ensure the file has been closed is to use a `do` block: the file will be closed automatically when leaving the block.
 
 ```julia
-data = Dataset(filename,"r") do ds
+data = NCDataset(filename,"r") do ds
     ds["temperature"][:,:]
 end # ds is closed
 ```
@@ -132,7 +132,7 @@ using NCDatasets
 using DataStructures
 # This creates a new NetCDF file /tmp/test.nc.
 # The mode "c" stands for creating a new file (clobber)
-ds = Dataset("/tmp/test.nc","c")
+ds = NCDataset("/tmp/test.nc","c")
 
 # Define the dimension "lon" and "lat" with the size 100 and 110 resp.
 defDim(ds,"lon",100)
@@ -164,7 +164,7 @@ It is also possible to create the dimensions, the define the variable and set it
 
 ```julia
 using NCDatasets
-ds = Dataset("/tmp/test2.nc","c")
+ds = NCDataset("/tmp/test2.nc","c")
 data = [Float32(i+j) for i = 1:100, j = 1:110]
 v = defVar(ds,"temperature",data,("lon","lat"))
 close(ds)
@@ -178,7 +178,7 @@ to open it with the `"a"` option. Here, for example, we add a global attribute *
 file created in the previous step.
 
 ```julia
-ds = Dataset("/tmp/test.nc","a")
+ds = NCDataset("/tmp/test.nc","a")
 ds.attrib["creator"] = "your name"
 close(ds);
 ```
diff --git a/docs/src/dataset.md b/docs/src/dataset.md
index 87c66c4b..49d8b46a 100644
--- a/docs/src/dataset.md
+++ b/docs/src/dataset.md
@@ -30,6 +30,22 @@ Otherwise, we attempt to use standard structures from the Julia standard library
 
 ## Groups
 
+A NetCDF group is a dataset (with variables, attributes, dimensions and sub-groups) and
+can be arbitrarily nested.
+A group is created with `defGroup` and accessed via the `group` property of
+a `NCDataset`.
+
+```julia
+# create the variable "temperature" inside the group "forecast"
+ds = NCDataset("results.nc", "c");
+ds_forecast = defGroup(ds,"forecast")
+defVar(ds_forecast,"temperature",randn(10,11,12),("lon","lat","time"))
+
+# load the variable "temperature" inside the group "forecast"
+forecast_temp = ds.group["forecast"]["temperature"][:,:,:]
+close(ds)
+```
+
 ```@docs
 defGroup
 getindex(g::NCDatasets.Groups,groupname::AbstractString)
diff --git a/docs/src/index.md b/docs/src/index.md
index cd2fe4f7..026d4abe 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,6 +1,12 @@
 # NCDatasets.jl
 
-Documentation for NCDatasets.jl, a Julia package for loading and writing NetCDF ([Network Common Data Form](https://www.unidata.ucar.edu/software/netcdf/)) files.
+Documentation for [NCDatasets.jl](https://github.com/Alexander-Barth/NCDatasets.jl), a Julia package for loading and writing NetCDF ([Network Common Data Form](https://www.unidata.ucar.edu/software/netcdf/)) files.
+NCDatasets.jl implements the for the NetCDF format the interface defined
+in [CommonDataModel.jl](https://github.com/JuliaGeo/CommonDataModel.jl).
+All functions defined by CommonDataModel.jl are also available for NetCDF data, including:
+* virtually concatenating multiple files along a given dimension
+* create a virtual subset (`view`) by indices or by values of coordinate variables (`CommonDataModel.select`, `CommonDataModel.@select`)
+* group, map and reduce (with `mean`, standard deviation `std`, ...) a variable (`CommonDataModel.groupby`, `CommonDataModel.@groupby`) and rolling reductions like running means `CommonDataModel.rolling`).
 
 ## Installation
 
@@ -11,6 +17,8 @@ using Pkg
 Pkg.add("NCDatasets")
 ```
 
+Or by typing `]add NCDatasets` using the package manager mode.
+
 ### Latest development version
 
 If you want to try the latest development version, again go into package manager mode and simply type
@@ -30,9 +38,9 @@ To get started quickly see the [Quickstart](@ref) section. Otherwise see the fol
 * [Attributes](@ref) : accessing/creating NetCDF attributes
 * See [Performance tips](@ref performance_tips), [Known issues](@ref), [Experimental features](@ref) for more information.
 
-## Quickstart
+## Quick start
 
-This is a quickstart guide that outlines basic loading, reading, etc. usage.
+This is a quick start guide that outlines basic loading, reading, etc. usage.
 For more details please see the individual pages of the documentation.
 
 
@@ -183,9 +191,9 @@ using NCDatasets
 using DataStructures
 data = [Float32(i+j) for i = 1:100, j = 1:110]
 
-Dataset("/tmp/test2.nc","c",attrib = OrderedDict("title" => "this is a test file")) do ds
+NCDataset("/tmp/test2.nc","c",attrib = OrderedDict("title" => "this is a test file")) do ds
     # Define the variable temperature. The dimension "lon" and "lat" with the
-    # size 100 and 110 resp are implicetly created
+    # size 100 and 110 resp are implicitly created
     defVar(ds,"temperature",data,("lon","lat"), attrib = OrderedDict(
         "units" => "degree Celsius",
         "comments" => "this is a string attribute with Unicode Ω ∈ ∑ ∫ f(x) dx"
@@ -209,10 +217,12 @@ close(ds);
 
 The utility function [`ncgen`](https://alexander-barth.github.io/NCDatasets.jl/stable/#NCDatasets.ncgen)
 generates the Julia code that would produce a netCDF file with the same metadata as a template netCDF file.
-It is thus similar to the [command line tool `ncgen`](https://www.unidata.ucar.edu/software/netcdf/netcdf/ncgen.html).
+It is thus similar to the [command line tool `ncgen`](https://www.unidata.ucar.edu/software/netcdf/netcdf/ncgen.html)
+which can generate C or Fortran code from the output of [`ncdump`](https://www.unidata.ucar.edu/software/netcdf/netcdf/ncdump.html).
 
 ```julia
-# download example file
+using Downloads: download
+# download an example file
 ncfile = download("https://www.unidata.ucar.edu/software/netcdf/examples/sresa1b_ncar_ccsm3-example.nc")
 # generate Julia code
 ncgen(ncfile)
@@ -240,8 +250,8 @@ ncarea.attrib["units"] = "meter2";
 
 ### Get one or several variables by specifying the value of an attribute
 
-The variable names are not always standardized. For example, for the longitude we can
-find: `lon`, `LON`, `longitude`, ...
+The variable names are not always standardized. For example, the longitude can
+be named: `lon`, `LON`, `longitude`, `łøñgitüdè`, ...
 
 The solution implemented in the function `varbyattrib` consists in searching for the
 variables that have specified value for a given attribute.
@@ -257,6 +267,24 @@ attribute `standard_name` equal to `"longitude"` one can do the following:
 data = varbyattrib(ds, standard_name = "longitude")[1][:]
 ```
 
+As looking-up a variable by standard name is quite common, one can also use the
+`@CF_str` macro and index the dataset using a string prefixed by `CF`.
+
+```julia
+using NCDatasets: @CF_str
+ds[CF"longitude"]
+```
+
+If multiple variables share the same standard name, such statements `ds[CF"longitude"]` are ambiguous and an error is returned.
+This is typically the case for e.g. ocean models like ROMS where different variables (u, v and w velocity) are defined on different staggered grids (i.e. shifted by a half grid-cell from each other).
+To disambiguate, one can first index the dataset `ds` with main data variable (e.g. vertical velocity) and then again extract the longitude associated to the data variable.
+
+```julia
+ds[CF"upward_sea_water_velocity"][CF"longitude"]
+```
+
+Such statement is no longer ambiguous as from the dimension names it is clear which longitude has to be accessed.
+
 ### Load a file with unknown structure
 
 If the structure of the netCDF file is not known before-hand, the program must check if a variable or attribute exists (with the `haskey` function) before loading it or alternatively place the loading in a `try`-`catch` block.
diff --git a/docs/src/variables.md b/docs/src/variables.md
index 3a18f3bc..e8f6fe9e 100644
--- a/docs/src/variables.md
+++ b/docs/src/variables.md
@@ -35,6 +35,7 @@ A scalar variable can be loaded using `[]`, for example:
 ```julia
 using NCDatasets
 NCDataset("test_scalar.nc","c") do ds
+    # the list of dimension names is simple `()` as a scalar does not have dimensions
     defVar(ds,"scalar",42,())
 end
 