From db84465f4a2fcb8e42e5692ed927a4761995181c Mon Sep 17 00:00:00 2001 From: Allen Byrne <50328838+byrnHDF@users.noreply.github.com> Date: Fri, 22 Nov 2024 14:40:29 -0600 Subject: [PATCH] Convert vfl html file to doxygen (#5140) --- doxygen/dox/TechnicalNotes.dox | 1021 +++++++++++++++++++- doxygen/examples/VFL.html | 1598 -------------------------------- 2 files changed, 1020 insertions(+), 1599 deletions(-) delete mode 100644 doxygen/examples/VFL.html diff --git a/doxygen/dox/TechnicalNotes.dox b/doxygen/dox/TechnicalNotes.dox index 61ddfbd3398..199d17de129 100644 --- a/doxygen/dox/TechnicalNotes.dox +++ b/doxygen/dox/TechnicalNotes.dox @@ -32,7 +32,1026 @@ /** \page VFL HDF5 Virtual File Layer -\htmlinclude VFL.html +\section sec_vfl_intro Introduction +The HDF5 file format describes how HDF5 data structures and dataset raw data are mapped +to a linear format address space and the HDF5 library implements that bidirectional mapping +in terms of an API. However, the HDF5 format specifications do not indicate how the format +address space is mapped onto storage and HDF (version 5 and earlier) simply mapped the format +address space directly onto a single file by convention. + +Since early versions of HDF5 it became apparent that users want the ability to map the +format address space onto different types of storage (a single file, multiple files, local +memory, global memory, network distributed global memory, a network protocol, etc.) with +various types of maps. For instance, some users want to be able to handle very large format +address spaces on operating systems that support only 2GB files by partitioning the format +address space into equal-sized parts each served by a separate file. Other users want the +same multi-file storage capability but want to partition the address space according to +purpose (raw data in one file, object headers in another, global heap in a third, etc.) +in order to improve I/O speeds. 
+
+In fact, the number of storage variations is probably larger than the number of methods
+that the HDF5 team is capable of implementing and supporting. Therefore, a Virtual File
+Layer API is being implemented which will allow application teams or departments to design
+and implement their own mapping between the HDF5 format address space and storage, with each
+mapping being a separate file driver (possibly written in terms of other file drivers). The
+HDF5 team will provide a small set of useful file drivers which will also serve as examples
+for those who wish to write their own:
+
+#H5FD_SEC2: This is the default driver which uses POSIX file-system functions
+like read and write to perform I/O to a single file. All I/O requests are unbuffered
+although the driver does optimize file seeking operations to some extent.
+
+#H5FD_STDIO: This driver uses functions from 'stdio.h' to perform buffered I/O to a single file.
+
+#H5FD_CORE: This driver performs I/O directly to memory and can be
+used to create small temporary files that never exist on permanent storage. This
+type of storage is generally very fast since the I/O consists only of memory-to-memory copy operations.
+
+#H5FD_MPIO: This is the driver of choice for accessing files in parallel
+using MPI and MPI-IO. It is only predefined if the library is compiled with parallel I/O support.
+
+#H5FD_FAMILY: Large format address spaces are partitioned into more
+manageable pieces and sent to separate storage locations using an underlying driver
+of the user's choice. \ref H5TOOL_RT_UG can be used to change the sizes of the family
+members when stored as files or to convert a family of files to a single file or vice versa.
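The equal-sized partitioning used by a family-style driver can be pictured with a little arithmetic. The following is only a sketch: `member_loc_t` and `locate_in_family` are hypothetical names invented for illustration and are not part of the HDF5 API. It assumes every member holds exactly `memb_size` bytes of the format address space.

```c
#include <stdint.h>

/* Hypothetical result type: which member holds an address, and where. */
typedef struct member_loc_t {
    uint64_t index;  /* zero-based family member number      */
    uint64_t offset; /* byte offset inside that member file  */
} member_loc_t;

/* Map a format address onto (member, offset), assuming equal-sized members.
 * memb_size must be non-zero. */
static member_loc_t locate_in_family(uint64_t addr, uint64_t memb_size)
{
    member_loc_t loc = { addr / memb_size, addr % memb_size };
    return loc;
}
```

With 100-byte members, for instance, format address 250 falls in member 2 at offset 50. Because the mapping is a division and a remainder, the file address can be computed in constant time.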
+
+\section sec_vfl_use Using a File Driver
+Most application writers will use a driver defined by the HDF5 library or contributed by another
+programming team. This section describes how existing drivers are used.
+
+\subsection subsec_vfl_use_hdr Driver Header Files
+Each file driver is defined in its own public header file which should be included by any
+application which plans to use that driver. The predefined drivers are in header files whose
+names begin with 'H5FD' followed by the driver name and '.h'. The 'hdf5.h' header file includes
+all the predefined driver header files.
+
+Once the appropriate header file is included, a symbol of the form 'H5FD_' followed by the
+upper-case driver name will be the driver identification number. (The driver name is by convention
+and might not apply to drivers which are not distributed with HDF5.) However, the value may
+change if the library is closed (e.g., by calling #H5close) and the symbol is referenced again.
+
+\subsection subsec_vfl_use_create Creating and Opening Files
+In order to create or open a file one must define the method by which the storage is
+accessed. (The access method also indicates how to translate the storage name to a storage server
+such as a file, network protocol, or memory.) This is done by creating a file access property
+list (the term "file access property list" is a misnomer since storage isn't required to be a file)
+which is passed to the #H5Fcreate or #H5Fopen function.
A default file access property list is created
+by calling #H5Pcreate and then the file driver information is inserted by calling a driver initialization
+function such as #H5Pset_fapl_family:
+\code
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+hsize_t member_size = 100*1024*1024; /*100MB*/
+H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+\endcode
+
+Each file driver will have its own initialization function whose name is H5Pset_fapl_ followed by
+the driver name and which takes a file access property list as the first argument followed by additional
+driver-dependent arguments.
+
+An alternative to using the driver initialization function is to set the driver directly using the
+#H5Pset_driver function. (This function is overloaded to operate on data transfer property lists also, as described below.)
+Its second argument is the file driver identifier, which may have a different numeric value from run to run
+depending on the order in which the file drivers are registered with the library. The third argument encapsulates
+the additional arguments of the driver initialization function. This method only works if the file driver
+writer has made the driver-specific property list structure a public datatype, which is often not the case.
+\code
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
+H5Pset_driver(fapl, H5FD_FAMILY, &fa);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+\endcode
+
+It is also possible to query the file driver information from a file access property list by
+calling #H5Pget_driver to determine the driver and then calling a driver-defined query function
+to obtain the driver information:
+\code
+hid_t driver = H5Pget_driver(fapl);
+if (H5FD_SEC2==driver) {
+    /*nothing further to get*/
+} else if (H5FD_FAMILY==driver) {
+    hid_t member_fapl;
+    hsize_t member_size;
+    H5Pget_fapl_family(fapl, &member_size, &member_fapl);
+} else if (....) {
+    ....
+}
+\endcode
+
+\subsection subsec_vfl_use_per Performing I/O
+The #H5Dread and #H5Dwrite functions transfer data between application memory and the file. They both take
+an optional data transfer property list which has some general driver-independent properties and optional
+driver-defined properties. An application will typically perform I/O in one of three styles via the
+#H5Dread or #H5Dwrite function:
+
+Like file access properties in the previous section, data transfer properties can be set using a driver
+initialization function or a general purpose function.
For example, to set the MPI-IO driver to use
+independent access for I/O operations one would say:
+\code
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
+H5Dread(dataset, type, mspace, fspace, buffer, dxpl);
+H5Pclose(dxpl);
+\endcode
+
+The alternative is to initialize a driver-defined C struct and pass it to the #H5Pset_driver function:
+\code
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
+H5Pset_driver(dxpl, H5FD_MPIO, &dx);
+H5Dread(dataset, type, mspace, fspace, buffer, dxpl);
+H5Pclose(dxpl);
+\endcode
+
+The transfer property list can be queried in a manner similar to the file access property list: the driver
+provides a function (or functions) to return various information about the transfer property list:
+\code
+hid_t driver = H5Pget_driver(dxpl);
+if (H5FD_MPIO==driver) {
+    H5FD_mpio_xfer_t xfer_mode;
+    H5Pget_dxpl_mpio(dxpl, &xfer_mode);
+} else {
+    ....
+}
+\endcode
+
+\subsection subsec_vfl_use_inter File Driver Interchangeability
+The HDF5 specifications describe two things: the mapping of data onto a linear format address
+space and the C API which performs the mapping. However, the mapping of the format address space
+onto storage intentionally falls outside the scope of the HDF5 specs. This is a direct result of the
+fact that it is not generally possible to store information about how to access storage inside the
+storage itself. For instance, given only the file name '/arborea/1225/work/f%03d' the HDF5 library
+is unable to tell whether the name refers to a file on the local file system, a family of files on
+the local file system, a file on host 'arborea' port 1225, a family of files on a remote system, etc.
+
+There are two ways in which the library could determine where the storage is located: storage access
+information can be provided by the user, or the library can try all known file access methods. This
+implementation uses the former method.
+ +In general, if a file was created with one driver then it isn't possible to open it with another driver. +There are of course exceptions: a file created with MPIO could probably be opened with the sec2 driver, +any file created by the sec2 driver could be opened as a family of files with one member, etc. In fact, +sometimes a file must not only be opened with the same driver but also with the same driver properties. +The predefined drivers are written in such a way that specifying the correct driver is sufficient for +opening a file. + +\section sec_vfl_imp Implementation of a Driver +A driver is simply a collection of functions and data structures which are registered with the HDF5 +library at runtime. The functions fall into these categories: +\li Functions which operate on modes +\li Functions which operate on files +\li Functions which operate on the address space +\li Functions which operate on data +\li Functions for driver initialization +\li Optimization functions + +\subsection subsec_vfl_imp_mode Mode Functions +Some drivers need information about file access and data transfers which are very specific to the driver. +The information is usually implemented as a pair of pointers to C structs which are allocated and +initialized as part of an HDF5 property list and passed down to various driver functions. There are two +classes of settings: file access modes that describe how to access the file through the driver, and +data transfer modes which are settings that control I/O operations. Each file opened by a particular +driver may have a different access mode; each dataset I/O request for a particular file may have a +different data transfer mode. + +Since each driver has its own particular requirements for various settings, each driver is responsible +for defining the mode structures that it needs. Higher layers of the library treat the structures as +opaque but must be able to copy and free them. 
Thus, the driver provides either the size of the
+structure or a pair of function pointers for each of the mode types.
+
+Example: The family driver needs to know how the format address space is partitioned and the file
+access property list to use for the family members.
+\code
+// Driver-specific file access properties
+typedef struct H5FD_family_fapl_t {
+    hsize_t memb_size;    // size of each family member
+    hid_t   memb_fapl_id; // file access property list for each family member
+} H5FD_family_fapl_t;
+
+// Driver-specific data transfer properties
+typedef struct H5FD_family_dxpl_t {
+    hid_t memb_dxpl_id; // data xfer property list of each member
+} H5FD_family_dxpl_t;
+\endcode
+In order to copy or free one of these structures the member file access or data transfer properties must
+also be copied or freed. This is done by providing a copy and close function for each structure:
+
+Example: The file access property list copy and close functions for the family driver:
+\code
+static void *
+H5FD_family_fapl_copy(const void *_old_fa)
+{
+    const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
+    H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
+    assert(new_fa);
+
+    memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
+    new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
+    return new_fa;
+}
+
+static herr_t
+H5FD_family_fapl_free(void *_fa)
+{
+    H5FD_family_fapl_t *fa = (H5FD_family_fapl_t*)_fa;
+    H5Pclose(fa->memb_fapl_id);
+    free(fa);
+    return 0;
+}
+\endcode
+
+Generally when a file is created or opened the file access properties for the driver are copied into the
+file pointer which is returned and they may be modified from their original value (for instance, the file
+family driver modifies the member size property when opening an existing family). In order to support the
+#H5Fget_access_plist function the driver must provide a fapl_get callback which creates a copy of the
+driver-specific properties based on a particular file.
+
+Example: The file family driver copies the member size and the member file access property list into the return value:
+\code
+static void *
+H5FD_family_fapl_get(H5FD_t *_file)
+{
+    H5FD_family_t *file = (H5FD_family_t*)_file;
+    // Allocate the whole struct, not just a pointer to it
+    H5FD_family_fapl_t *fa = calloc(1, sizeof(H5FD_family_fapl_t));
+    if (!fa) return NULL;
+
+    fa->memb_size = file->memb_size;
+    fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
+    return fa;
+}
+\endcode
+
+\subsection subsec_vfl_imp_file File Functions
+The higher layers of the library expect files to have a name and allow the file to be accessed in various modes.
+The driver must be able to create a new file, replace an existing file, or open an existing file. Opening or
+creating a file should return a handle, a pointer to a specialization of the H5FD_t struct, which allows read-only
+or read-write access and which will be passed to the other driver functions as they are called. (Read-only access is
+only appropriate when opening an existing file.)
+\code
+typedef struct {
+    // Public fields
+    H5FD_class_t *cls; // class data defined below
+
+    // Private fields -- driver-defined
+
+} H5FD_t;
+\endcode
+
+Example: The family driver requires handles to the underlying storage, the size of the members for this
+particular file (which might be different than the member size specified in the file access property list
+if an existing file family is being opened), the name used to open the file in case additional members
+must be created, and the flags to use for creating those additional members. The eoa member caches the
+size of the format address space so the family members don't have to be queried in order to find it.
+\code
+// The description of a file belonging to this driver.
+typedef struct H5FD_family_t {
+    H5FD_t   pub;          // public stuff, must be first
+    hid_t    memb_fapl_id; // file access property list for members
+    hsize_t  memb_size;    // maximum size of each member file
+    int      nmembs;       // number of family members
+    int      amembs;       // number of member slots allocated
+    H5FD_t **memb;         // dynamic array of member pointers
+    haddr_t  eoa;          // end of allocated addresses
+    char    *name;         // name generator printf format
+    unsigned flags;        // flags for opening additional members
+} H5FD_family_t;
+\endcode
+
+Example: The sec2 driver needs to keep track of the underlying Unix file descriptor and also the
+end of format address space and current Unix file size. It also keeps track of the current file
+position and last operation (read, write, or unknown) in order to optimize calls to lseek. The
+device and inode fields are defined on Unix in order to uniquely identify the file and will be
+discussed below.
+\code
+typedef struct H5FD_sec2_t {
+    H5FD_t  pub;    // public stuff, must be first
+    int     fd;     // the unix file
+    haddr_t eoa;    // end of allocated region
+    haddr_t eof;    // end of file; current file size
+    haddr_t pos;    // current file I/O position
+    int     op;     // last operation
+    dev_t   device; // file device number
+    ino_t   inode;  // file i-node number
+} H5FD_sec2_t;
+\endcode
+
+\subsection subsec_vfl_imp_open Open Files
+All drivers must define a function for opening/creating a file. This function should have a prototype which is:
+
+static H5FD_t * open (const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr): The file name name and file access property list fapl are the same as were specified in the #H5Fcreate
+or #H5Fopen call. The flags are also the same as in those calls, except that the flag #H5F_ACC_CREAT is also
+set if the call was to #H5Fcreate; the flags are documented in the 'H5Fpublic.h' file. The maxaddr
+argument is the maximum format address that the driver should be prepared to handle (the minimum address is always zero).
+
+Example: The sec2 driver opens a Unix file with the requested name and saves information which
+uniquely identifies the file (the Unix device number and inode).
+\code
+static H5FD_t *
+H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/,
+               haddr_t maxaddr)
+{
+    unsigned     o_flags;
+    int          fd;
+    struct stat  sb;
+    H5FD_sec2_t *file=NULL;
+
+    // Check arguments
+    if (!name || !*name) return NULL;
+    if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL;
+    if (ADDR_OVERFLOW(maxaddr)) return NULL;
+
+    // Build the open flags
+    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
+    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
+    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
+    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;
+
+    // Open the file
+    if ((fd=open(name, o_flags, 0666))<0) return NULL;
+    if (fstat(fd, &sb)<0) {
+        close(fd);
+        return NULL;
+    }
+
+    // Create the new file struct
+    file = calloc(1, sizeof(H5FD_sec2_t));
+    if (!file) {
+        close(fd);
+        return NULL;
+    }
+    file->fd = fd;
+    file->eof = sb.st_size;
+    file->pos = HADDR_UNDEF;
+    file->op = OP_UNKNOWN;
+    file->device = sb.st_dev;
+    file->inode = sb.st_ino;
+
+    return (H5FD_t*)file;
+}
+\endcode
+
+\subsection subsec_vfl_imp_close Closing Files
+Closing a file simply means that all cached data should be flushed to the next lower layer, the
+file should be closed at the next lower layer, and all file-related data structures should be
+freed. All information needed by the close function is already present in the file handle.
+
+static herr_t close (H5FD_t *file): The file argument is the handle which was returned by the open function, and the close should
+free only memory associated with the driver-specific part of the handle (the public parts will
+have already been released by HDF5's virtual file layer).
+
+Example: The sec2 driver just closes the underlying Unix file, making sure that the actual
+file size is the same as that known to the library by writing a zero to the last file position
+if it hasn't been written by some previous operation (which happens in the same code which flushes
+the file contents and is shown below).
+\code
+static herr_t
+H5FD_sec2_close(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+
+    if (H5FD_sec2_flush(_file)<0) return -1;
+    if (close(file->fd)<0) return -1;
+    free(file);
+    return 0;
+}
+\endcode
+
+\subsection subsec_vfl_imp_key File Keys
+Occasionally an application will attempt to open a single file more than one time in order
+to obtain multiple handles to the file. HDF5 allows the handles to share information (for instance,
+writing data to one handle will cause the data to be immediately visible on the other handle),
+but in order to accomplish this HDF5 must be able to tell when two names refer to the same file.
+It does this by associating a driver-defined key with each file opened by a driver and comparing
+the key for an open request with the keys for all other files currently open by the same driver.
+
+static int cmp (const H5FD_t *f1, const H5FD_t *f2): The driver may provide a function which compares two files f1 and f2 belonging to the same
+driver and returns a negative, positive, or zero value in the manner of the strcmp function. (The ordering
+is arbitrary as long as it's consistent within a particular file driver.) If this function is
+not provided then HDF5 assumes that all calls to the open callback return unique files regardless
+of the arguments, and it is up to the application to avoid opening the same file twice if that assumption is incorrect.
+
+Each time a file is opened the library calls the cmp function to compare that file with all other files
+currently open by the same driver and if one of them matches (at most one can match) then the file
+which was just opened is closed and the previously opened file is used instead.
+
+Opening a file twice with incompatible flags will result in failure. For instance, opening a file with
+the truncate flag is a two step process which first opens the file without truncation so keys can be
+compared, and if no matching file is found already open then the file is closed and immediately reopened
+with the truncation flag set (if a matching file is already open then the truncating open will fail).
+
+Example: The sec2 driver uses the Unix device and i-node as the key. They were initialized when
+the file was opened.
+\code
+static int
+H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
+{
+    const H5FD_sec2_t *f1 = (const H5FD_sec2_t*)_f1;
+    const H5FD_sec2_t *f2 = (const H5FD_sec2_t*)_f2;
+
+    if (f1->device < f2->device) return -1;
+    if (f1->device > f2->device) return 1;
+
+    if (f1->inode < f2->inode) return -1;
+    if (f1->inode > f2->inode) return 1;
+
+    return 0;
+}
+\endcode
+
+\subsection subsec_vfl_imp_save Saving Modes Across Opens
+Some drivers may also need to store certain information in the file superblock in order
+to be able to reliably open the file at a later date. This is done by three functions:
+one to determine how much space will be necessary to store the information in the superblock,
+one to encode the information,
+and one to decode the information. These functions are optional, but if any one is defined
+then the other two must also be defined.
+
+Function: Description
+
+static hsize_t sb_size (H5FD_t *file): The sb_size function returns the number of bytes necessary to encode
+information needed later if the file is reopened.
+
+static herr_t sb_encode (H5FD_t *file, char *name, unsigned char *buf): The sb_encode function encodes information from the file into buffer buf
+allocated by the caller. It also writes an 8-character string (plus null terminator) into
+the name argument, which should be a unique identification for the driver.
+
+static herr_t sb_decode (H5FD_t *file, const char *name, const unsigned char *buf): The sb_decode function looks at the name, decodes data from the buffer buf, and
+updates the file argument with the new information.
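The three superblock callbacks described above can be sketched as a round trip. This is an illustrative sketch only: the `demo_` names, the `"DEMOfami"` driver name, and the little-endian byte order are assumptions made for the example, and the functions take a plain member size rather than a real H5FD_t handle so that the block stays self-contained.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-character driver identification (assumption for this sketch). */
#define DEMO_DRIVER_NAME "DEMOfami"

/* Number of bytes the driver needs in the superblock: one 64-bit member size. */
static size_t demo_sb_size(void)
{
    return 8;
}

/* Write the driver name (8 chars plus NUL) and the member size into buf.
 * buf must hold at least demo_sb_size() bytes; little-endian is assumed. */
static void demo_sb_encode(uint64_t memb_size, char name[9], unsigned char *buf)
{
    memcpy(name, DEMO_DRIVER_NAME, 9);
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(memb_size >> (8 * i));
}

/* Check the driver name, then rebuild the member size from buf.
 * Returns 0 on success and -1 if the name belongs to another driver. */
static int demo_sb_decode(const char *name, const unsigned char *buf,
                          uint64_t *memb_size)
{
    uint64_t value = 0;

    if (strcmp(name, DEMO_DRIVER_NAME) != 0)
        return -1;
    for (int i = 0; i < 8; i++)
        value |= (uint64_t)buf[i] << (8 * i);
    *memb_size = value;
    return 0;
}
```

Encoding with a fixed byte order, rather than copying the in-memory integer, keeps the stored value readable on machines with a different endianness.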
+The part of this which is somewhat tricky is that the file must be readable before the
+superblock information is decoded. File access modes fall outside the scope of the HDF5
+file format, but they are placed inside the superblock for convenience. (File access modes
+do not describe data, but rather describe how the HDF5 format address space is mapped to
+the underlying file(s). Thus, in general the mapping must be known before the file
+superblock can be read. However, the user usually knows enough about the mapping for
+the superblock to be readable and once the superblock is read the library can fill
+in the missing parts of the mapping.)
+
+\section sec_vfl_address Address Space Functions
+HDF5 does not assume that a file is a linear address space of bytes. Instead, the library
+will call functions to allocate and free portions of the HDF5 format address space, which
+in turn map onto functions in the file driver to allocate and free portions of file address
+space. The library tells the file driver how much format address space it wants to allocate
+and the driver decides what format address to use and how that format address is mapped
+onto the file address space. Usually the format address is chosen so that the file address
+can be calculated in constant time for data I/O operations (which are always specified by format addresses).
+
+\subsection subsec_vfl_address_blk Userblock and Superblock
+The HDF5 format allows an optional userblock to appear before the actual HDF5 data in such
+a way that if the userblock is removed from the file and everything remaining is
+shifted downward in the file address space, then the file is still a valid HDF5 file.
+The userblock size can be zero or any power of two greater than or equal to 512, and
+the file superblock begins immediately after the userblock.
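The userblock size rule can be checked with a few lines of C. This is a minimal sketch, assuming the rule enforced by #H5Pset_userblock (zero, or a power of two of at least 512); `userblock_size_is_valid` is a hypothetical helper and not part of the HDF5 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when size is an acceptable userblock size,
 * i.e. zero or a power of two that is at least 512. */
static bool userblock_size_is_valid(uint64_t size)
{
    if (size == 0)
        return true;
    /* A power of two has exactly one bit set, so clearing the lowest
     * set bit with (size & (size - 1)) leaves zero. */
    return size >= 512 && (size & (size - 1)) == 0;
}
```

Under this rule 512 and 1024 are accepted, while 256 (too small) and 768 (not a power of two) are rejected.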
+
+HDF5 allocates space for the userblock and superblock by calling an allocation function
+defined below, which must return a chunk of memory at format address zero on the first call.
+
+\subsection subsec_vfl_address_alloc Allocation of Format Regions
+The library makes many types of allocation requests:
+
+#H5FD_MEM_SUPER: An allocation request for the userblock and/or superblock.
+
+#H5FD_MEM_BTREE: An allocation request for a node of a B-tree.
+
+#H5FD_MEM_DRAW: An allocation request for the raw data of a dataset.
+
+#H5FD_MEM_GHEAP: An allocation request for a global heap collection. Global
+heaps are used to store certain types of references such as dataset region references.
+The set of all global heap collections can become quite large.
+
+#H5FD_MEM_LHEAP: An allocation request for a local heap. Local heaps are used
+to store the names which are members of a group. The combined size of all local heaps is
+a function of the number of object names in the file.
+
+#H5FD_MEM_OHDR: An allocation request for (part of) an object header. Object
+headers are relatively small and include meta information about objects (like the data
+space and type of a dataset) and attributes.
+ +When a chunk of memory is freed the library adds it to a free list and allocation requests +are satisfied from the free list before requesting memory from the file driver. Each type of +allocation request enumerated above has its own free list, but the file driver can specify that +certain object types can share a free list. It does so by providing an array which maps a +request type to a free list. If any value of the map is H5MF_DEFAULT (zero) then the object's +own free list is used. The special value H5MF_NOLIST indicates that the library should not +attempt to maintain a free list for that particular object type, instead calling the file driver +each time an object of that type is freed. + +Mappings predefined in the 'H5FDpublic.h' file are: + + + + + + + + + + +
+#H5FD_FLMAP_SINGLE: All memory usage types are mapped to a single free list.
+
+#H5FD_FLMAP_DICHOTOMY: Memory usage is segregated into meta data and raw data
+for the purposes of memory management.
+
+#H5FD_FLMAP_DEFAULT: Each memory usage type has its own free list.
+
+Example: To make a map that manages object headers on one free list and everything else on
+another free list one might initialize the map with the following code (the use of #H5FD_MEM_SUPER is arbitrary):
+\code
+H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
+
+for (mt = 0; mt < H5FD_MEM_NTYPES; mt++) {
+    map[mt] = (H5FD_MEM_OHDR == mt) ? mt : H5FD_MEM_SUPER;
+}
+\endcode
+
+If an allocation request cannot be satisfied from the free list then one of two things happens.
+If the driver defines an allocation callback then it is used to allocate space; otherwise new
+memory is allocated from the end of the format address space by incrementing the end-of-address marker.
+
+static haddr_t alloc (H5FD_t *file, H5MF_type_t type, hsize_t size): The file argument is the file from which space is to be allocated, type is the type of
+memory being requested (from the list above) without being mapped according to the freelist
+map and size is the number of bytes being requested. The library is allowed to allocate large
+chunks of storage and manage them in a layer above the file driver (although the current library
+doesn't do that). The allocation function should return a format address for the first byte
+allocated. The allocated region extends from that address for size bytes. If the request cannot
+be honored then the undefined address value is returned (#HADDR_UNDEF). The first call to this
+function for a file which has never had memory allocated must return a format address of zero
+or #HADDR_UNDEF since this is how the library allocates space for the userblock and/or superblock.
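The simplest allocation policy consistent with the description above hands out space from the end of the format address space. The sketch below is not the library's implementation: `bump_addr_t`, `BUMP_ADDR_UNDEF`, `bump_file_t`, and `bump_alloc` are hypothetical stand-ins for haddr_t, #HADDR_UNDEF, a driver file struct, and the alloc callback, and treating `maxaddr` as an upper bound on the end-of-address marker is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the library's address type and undefined value. */
typedef uint64_t bump_addr_t;
#define BUMP_ADDR_UNDEF UINT64_MAX

/* Hypothetical driver file struct holding only what this sketch needs. */
typedef struct bump_file_t {
    bump_addr_t eoa;     /* first address after the last one allocated   */
    bump_addr_t maxaddr; /* assumed upper bound for the end-of-address   */
} bump_file_t;

/* Allocate size bytes at the current end of the format address space.
 * Returns the first address of the region, or BUMP_ADDR_UNDEF on failure. */
static bump_addr_t bump_alloc(bump_file_t *file, uint64_t size)
{
    bump_addr_t addr = file->eoa;

    /* Reject requests that would pass maxaddr (written to avoid overflow). */
    if (addr > file->maxaddr || size > file->maxaddr - addr)
        return BUMP_ADDR_UNDEF;

    file->eoa = addr + size; /* advance the end-of-address marker */
    return addr;
}
```

On a fresh file the marker starts at zero, so the first request is placed at format address zero, which is what the library expects for the userblock and/or superblock.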
+ +\subsection subsec_vfl_address_free Freeing Format Regions +When the library is finished using a certain region of the format address space it will return the +space to the free list according to the type of memory being freed and the free list map described above. +If the free list has been disabled for a particular memory usage type (according to the free list map) +and the driver defines a free callback then it will be invoked. The free callback is also invoked for +all entries on the free list when the file is closed. + + + + + + +
+static herr_t free (H5FD_t *file, H5MF_type_t type, haddr_t addr, hsize_t size): The file argument is the file for which space is being freed; type is the type of object being
+freed (from the list above) without being mapped according to the freelist map; addr is the first
+format address to free; and size is the size in bytes of the region being freed. The region being
+freed may refer to just part of the region originally allocated and/or may cross allocation boundaries
+provided all regions being freed have the same usage type. However, the library will never attempt
+to free regions which have already been freed or which have never been allocated.
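One modest policy for a free callback is to reclaim only regions that sit at the very end of the allocated range and to ignore everything else. The following is a sketch under that assumption; `tail_addr_t`, `tail_file_t`, and `tail_free` are hypothetical names, and the memory-type argument of the real callback is omitted for brevity.

```c
#include <stdint.h>

/* Hypothetical stand-in for the library's address type. */
typedef uint64_t tail_addr_t;

/* Hypothetical driver file struct: only the end-of-address marker. */
typedef struct tail_file_t {
    tail_addr_t eoa; /* first address after the last one allocated */
} tail_file_t;

/* Free the region [addr, addr + size). Returns 0 on success, or -1 if the
 * region extends past the end-of-address marker. */
static int tail_free(tail_file_t *file, tail_addr_t addr, uint64_t size)
{
    /* Written so that addr + size cannot overflow. */
    if (size > file->eoa || addr > file->eoa - size)
        return -1;

    /* A region ending exactly at the marker can be handed back by
     * lowering the marker; any other region is simply left unused. */
    if (addr + size == file->eoa)
        file->eoa = addr;
    return 0;
}
```

Regions freed from the middle of the address space are leaked by this policy, which trades some wasted address space for a callback with no bookkeeping.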
+A driver may choose to not define the free function, in which case format addresses will be leaked. +This isn't normally a huge problem since the library contains a simple free list of its own and freeing +parts of the format address space is not a common occurrence. + +\subsection subsec_vfl_address_query Querying the Address Range +Each file driver must have some mechanism for setting and querying the end of address, or +EOA, marker. The EOA marker is the first format address after the last format address ever allocated. +If the last part of the allocated address range is freed then the driver may optionally decrease the eoa marker. + + + + + +
+static haddr_t get_eoa (H5FD_t *file): This function returns the current value of the EOA marker for the specified file.
+
+Example: The sec2 driver just returns the current eoa marker value which is cached in the file structure:
+\code
+static haddr_t
+H5FD_sec2_get_eoa(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    return file->eoa;
+}
+\endcode
+
+The eoa marker is initially zero when a file is opened and the library may set it to some other value
+shortly after the file is opened (after the superblock is read and the saved eoa marker is determined)
+or when allocating additional memory in the absence of an alloc callback (described above).
+
+Example: The sec2 driver simply caches the eoa marker in the file structure and does not extend the
+underlying Unix file. When the file is flushed or closed then the Unix file size is extended to match
+the eoa marker.
+\code
+static herr_t
+H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    file->eoa = addr;
+    return 0;
+}
+\endcode
+
+\section sec_vfl_data Data Functions
+These functions operate on data, transferring a region of the format address space between memory and files.
+
+\subsection subsec_vfl_data_cont Contiguous I/O Functions
+A driver must specify two functions to transfer data from the library to the file and vice versa.
+
static herr_t read (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buf)The read function reads data from file file beginning at address addr and continuing +for size bytes into the buffer buf supplied by the caller.
static herr_t write (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buf)The write function transfers data +in the opposite direction.
+\li Both functions take a data transfer property list dxpl which
+indicates the fine points of how the data is to be transferred and which comes directly
+from the #H5Dread or #H5Dwrite function.
+\li Both functions receive the type of data being transferred,
+which may allow a driver to tune its behavior for different kinds of data.
+\li Both functions should return
+a negative value if they fail to transfer the requested data, or non-negative if they
+succeed. The library will never attempt to read from unallocated regions of the format address space.
+
+Example: The sec2 driver just makes system calls. It tries not to call lseek if the current operation
+is the same as the previous operation and the file position is correct. It also fills the output buffer
+with zeros when reading between the current EOF and EOA markers and restarts system calls which were interrupted.
+\code
+static herr_t
+H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/,
+               haddr_t addr, hsize_t size, void *buf/*out*/)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    ssize_t nbytes;
+
+    assert(file && file->pub.cls);
+    assert(buf);
+
+    /* Check for overflow conditions */
+    if (REGION_OVERFLOW(addr, size)) return -1;
+    if (addr+size>file->eoa) return -1;
+
+    /* Seek to the correct location */
+    if ((addr!=file->pos || OP_READ!=file->op) &&
+            file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
+        file->pos = HADDR_UNDEF;
+        file->op = OP_UNKNOWN;
+        return -1;
+    }
+
+    /*
+     * Read data, being careful of interrupted system calls, partial results,
+     * and the end of the file.
+     */
+    while (size>0) {
+        do nbytes = read(file->fd, buf, size);
+        while (-1==nbytes && EINTR==errno);
+        if (-1==nbytes) {
+            /* error */
+            file->pos = HADDR_UNDEF;
+            file->op = OP_UNKNOWN;
+            return -1;
+        }
+        if (0==nbytes) {
+            /* end of file but not end of format address space */
+            memset(buf, 0, size);
+            size = 0;
+        }
+        assert(nbytes>=0);
+        assert((hsize_t)nbytes<=size);
+        size -= (hsize_t)nbytes;
+        addr += (haddr_t)nbytes;
+        buf = (char*)buf + nbytes;
+    }
+
+    /* Update current position */
+    file->pos = addr;
+    file->op = OP_READ;
+    return 0;
+}
+\endcode
+Example: The sec2 write callback is similar except it updates the file EOF marker when extending the file.
+
+\subsection subsec_vfl_data_flush Flushing Cached Data
+Some drivers may want to cache data in memory in order to make larger I/O requests to the
+underlying file and thus improve bandwidth. Such drivers should register a cache flushing
+function so that the library can ensure that data has been flushed out of the drivers in
+response to the application calling #H5Fflush.
+
+
+
+
+
static herr_t flush (H5FD_t *file)Flush all data for file file to storage.
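As a rough illustration of why a caching driver needs a flush callback, the following is a minimal sketch of a write-behind buffer. It does not use the HDF5 headers: `demo_file_t`, `demo_write`, `demo_flush` and the two size constants are hypothetical names invented for this example, and an in-memory array stands in for the underlying file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STORE_SIZE 64   /* size of the pretend underlying file */
#define DEMO_CACHE_SIZE 16   /* size of the driver's write cache    */

/* Hypothetical driver file struct: "store" stands in for the underlying
 * file and "cache" holds one pending run of contiguous bytes. */
typedef struct demo_file_t {
    unsigned char store[DEMO_STORE_SIZE];
    unsigned char cache[DEMO_CACHE_SIZE];
    size_t        cache_addr;   /* format address of cache[0]         */
    size_t        cache_len;    /* number of pending bytes in "cache" */
} demo_file_t;

/* Flush callback: push any pending bytes to storage. */
static int
demo_flush(demo_file_t *file)
{
    if (file->cache_len > 0) {
        if (file->cache_addr + file->cache_len > DEMO_STORE_SIZE)
            return -1;
        memcpy(file->store + file->cache_addr, file->cache, file->cache_len);
        file->cache_len = 0;
    }
    return 0;
}

/* Write callback: append to the cache when the request continues the
 * pending run and still fits; otherwise flush first and start a new run. */
static int
demo_write(demo_file_t *file, size_t addr, size_t size, const void *buf)
{
    /* Keep the sketch simple: reject requests larger than the cache. */
    if (size > DEMO_CACHE_SIZE || addr > DEMO_STORE_SIZE - size)
        return -1;

    if (file->cache_len > 0 &&
        (addr != file->cache_addr + file->cache_len ||
         file->cache_len + size > DEMO_CACHE_SIZE)) {
        if (demo_flush(file) < 0)
            return -1;
    }
    if (file->cache_len == 0)
        file->cache_addr = addr;
    memcpy(file->cache + file->cache_len, buf, size);
    file->cache_len += size;
    return 0;
}
```

Until `demo_flush` runs, the bytes exist only in the cache, which is the situation the library resolves by calling the driver's flush callback.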
+ +Example: The sec2 driver doesn't cache any data but it also doesn't extend the Unix file as +aggressively as it should. Therefore, when finalizing a file it should write a zero to the last +byte of the allocated region so that when reopening the file later the EOF marker will be at +least as large as the EOA marker saved in the superblock (otherwise HDF5 will refuse to open +the file, claiming that the data appears to be truncated). +\code +static herr_t +H5FD_sec2_flush(H5FD_t *_file) +{ + H5FD_sec2_t *file = (H5FD_sec2_t*)_file; + + if (file->eoa>file->eof) { + if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1; + if (write(file->fd, "", 1)!=1) return -1; + file->eof = file->eoa; + file->pos = file->eoa; + file->op = OP_WRITE; + } + + return 0; +} +\endcode + +\section sec_vfl_opt Optimization Functions +The library is capable of performing several generic optimizations on I/O, but these types of +optimizations may not be appropriate for a given VFL driver. + +Each driver may provide a query function to allow the library to query whether to enable these +optimizations. If a driver lacks a query function, the library will disable all types of +optimizations which can be queried. + + + + + + +
static herr_t query (const H5FD_t *file, unsigned long *flags)This function is called by the library to query which optimizations to enable for I/O to this driver.
+ +These are the flags which are currently defined: + + + + + + + + + + + + + +
H5FD_FEAT_AGGREGATE_METADATA (0x00000001)Defining the H5FD_FEAT_AGGREGATE_METADATA for a VFL driver means that the library will attempt to allocate +a larger block for metadata and then sub-allocate each metadata request from that larger block.
H5FD_FEAT_ACCUMULATE_METADATA (0x00000002)Defining the H5FD_FEAT_ACCUMULATE_METADATA for a VFL driver means that the library will attempt to cache +metadata as it is written to the file and build up a larger block of metadata to eventually pass to the +VFL 'write' routine.
H5FD_FEAT_DATA_SIEVE (0x00000004)Defining the H5FD_FEAT_DATA_SIEVE for a VFL driver means that the library will attempt to cache raw data + as it is read from/written to a file in a "data sieve" buffer.
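A query callback built on these flags might look like the sketch below. This is an assumption-laden illustration, not code from the library: the `FEAT_*` macros simply copy the numeric values listed in the table, and `demo_file_t` and `demo_query` are hypothetical stand-ins for `H5FD_t` and a driver's query function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in flag values copied from the table above; a real driver would
 * take them from the HDF5 headers instead of defining them itself. */
#define FEAT_AGGREGATE_METADATA  0x00000001UL
#define FEAT_ACCUMULATE_METADATA 0x00000002UL
#define FEAT_DATA_SIEVE          0x00000004UL

/* Hypothetical stand-in for H5FD_t; this query callback ignores it. */
typedef struct demo_file_t { int unused; } demo_file_t;

/* Sketch of a query callback: report which generic optimizations the
 * library may apply. Returns 0 on success, negative on failure. */
static int
demo_query(const demo_file_t *file, unsigned long *flags)
{
    (void)file;   /* same answer for every file opened by this driver */

    if (flags == NULL)
        return -1;

    *flags  = 0;
    *flags |= FEAT_AGGREGATE_METADATA;   /* allow metadata block aggregation */
    *flags |= FEAT_DATA_SIEVE;           /* allow raw-data sieve buffering   */
    /* FEAT_ACCUMULATE_METADATA is deliberately left off in this sketch. */
    return 0;
}
```

A driver that does its own metadata caching could leave a flag unset, as done here for metadata accumulation, so that the library does not layer a second cache on top.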
+ +See Rajeev Thakur's papers: +http://www.mcs.anl.gov/~thakur/papers/romio-coll.ps.gz +http://www.mcs.anl.gov/~thakur/papers/mpio-high-perf.ps.gz + +\section sec_vfl_reg Registration of a Driver +Before a driver can be used the HDF5 library needs to be told of its existence. This is done by +registering the driver, which results in a driver identification number. Instead of passing many +arguments to the registration function, the driver information is entered into a structure and the +address of the structure is passed to the registration function where it is copied. This allows +the HDF5 API to be extended while providing backward compatibility at the source level. + + + + + + +
hid_t H5FDregister (H5FD_class_t *cls)The driver described by struct cls is registered with the library and an ID number for the driver is returned.
+ +The H5FD_class_t type is a struct with the following fields: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
const char *nameA pointer to a constant, null-terminated driver name to be used for debugging purposes.
size_t fapl_sizeThe size in bytes of the file access mode structure or zero if the driver supplies a copy function +or doesn't define the structure.
void *(*fapl_copy)(const void *fapl)An optional function which copies a driver-defined file access mode structure. This field takes +precedence over fapl_size when both are defined.
void (*fapl_free)(void *fapl)An optional function to free the driver-defined file access mode structure. If null, then the +library calls the C free function to free the structure.
size_t dxpl_sizeThe size in bytes of the data transfer mode structure or zero if the driver supplies a copy +function or doesn't define the structure.
void *(*dxpl_copy)(const void *dxpl)An optional function which copies a driver-defined data transfer mode structure. This field +takes precedence over dxpl_size when both are defined.
void (*dxpl_free)(void *dxpl)An optional function to free the driver-defined data transfer mode structure. If null, then +the library calls the C free function to free the structure.
H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)The function which opens or creates a new file.
herr_t (*close)(H5FD_t *file)The function which ends access to a file.
int (*cmp)(const H5FD_t *f1, const H5FD_t *f2)An optional function to determine whether two open files have the same key. If this function +is not present then the library assumes that two files will never be the same.
int (*query)(const H5FD_t *f, unsigned long *flags)An optional function to determine which library optimizations a driver can support.
haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size)An optional function to allocate space in the file.
herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size)An optional function to free space in the file.
haddr_t (*get_eoa)(H5FD_t *file)A function to query how much of the format address space has been allocated.
herr_t (*set_eoa)(H5FD_t *file, haddr_t)A function to set the end of address space.
haddr_t (*get_eof)(H5FD_t *file)A function to return the current end-of-file marker value.
herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer)A function to read data from a file.
herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer)A function to write data to a file.
herr_t (*flush)(H5FD_t *file)A function which flushes cached data to the file.
H5FD_mem_t fl_map[H5FD_MEM_NTYPES]An array which maps a file allocation request type to a free list.
+ +Example: The sec2 driver would be registered as: +\code +static const H5FD_class_t H5FD_sec2_g = { + "sec2", /*name */ + MAXADDR, /*maxaddr */ + NULL, /*sb_size */ + NULL, /*sb_encode */ + NULL, /*sb_decode */ + 0, /*fapl_size */ + NULL, /*fapl_get */ + NULL, /*fapl_copy */ + NULL, /*fapl_free */ + 0, /*dxpl_size */ + NULL, /*dxpl_copy */ + NULL, /*dxpl_free */ + H5FD_sec2_open, /*open */ + H5FD_sec2_close, /*close */ + H5FD_sec2_cmp, /*cmp */ + H5FD_sec2_query, /*query */ + NULL, /*alloc */ + NULL, /*free */ + H5FD_sec2_get_eoa, /*get_eoa */ + H5FD_sec2_set_eoa, /*set_eoa */ + H5FD_sec2_get_eof, /*get_eof */ + H5FD_sec2_read, /*read */ + H5FD_sec2_write, /*write */ + H5FD_sec2_flush, /*flush */ + H5FD_FLMAP_SINGLE, /*fl_map */ +}; + +hid_t +H5FD_sec2_init(void) +{ + if (!H5FD_SEC2_g) { + H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g); + } + return H5FD_SEC2_g; +} +\endcode + +A driver can be removed from the library by unregistering it + + + + + +
herr_t H5FDunregister (hid_t driver)Where driver is the ID number returned when the driver was registered.
+Unregistering a driver makes it unusable for creating new file access or data transfer property +lists but doesn't affect any property lists or files that already use that driver. + +\subsection subsec_vfl_reg_prog Programming Note for C++ Developers Using C Functions +If a C routine that takes a function pointer as an argument is called from within C++ code, +the C routine should be returned from normally. + +Examples of this kind of routine include callbacks such as #H5Pset_elink_cb +and #H5Pset_type_conv_cb and functions such as #H5Tconvert and #H5Ewalk2. + +Exiting the routine in its normal fashion allows the HDF5 C Library to clean up +its work properly. In other words, if the C++ application jumps out of the routine +back to the C++ “catch” statement, the library is not given the opportunity to close +any temporary data structures that were set up when the routine was called. The C++ +application should save some state as the routine is started so that any problem that +occurs might be diagnosed. + +\section sec_vfl_query Querying Driver Information + + + + + +
void * H5Pget_driver_data (hid_t fapl)
void * H5Pget_driver_data (hid_t fxpl)
This function is intended to be used by driver functions, not applications. It returns a pointer +directly into the file access property list fapl which is a copy of the driver's file access mode +originally provided to the H5Pset_driver function. If its argument is a data transfer property list +fxpl then it returns a pointer to the driver-specific data transfer information instead. +
+
+\section sec_vfl_misc Miscellaneous
+The various private H5F_low_* functions will be replaced by public H5FD* functions so they
+can be called from drivers.
+
+All private functions H5F_addr_* which operate on addresses will be renamed as public functions
+by removing the first underscore so they can be called by drivers.
+
+The haddr_t address data type will be passed by value throughout the library. The original
+intent was that this type would eventually be a union of file address types for the various
+drivers and may become quite large, but that was back when drivers were part of HDF5. It will
+become an alias for an unsigned integer type (32 or 64 bits depending on how the library was configured).
+
+The various H5F*.c driver files will be renamed H5FD*.c and each will have a corresponding header
+file. All driver functions except the initializer and API will be declared static.
+
+This documentation didn't cover optimization functions which would be useful to drivers like MPI-IO.
+Some drivers may be able to perform data pipeline operations more efficiently than HDF5 and need to
+be given a chance to override those parts of the pipeline. The pipeline would be designed to call
+various H5FD optimization functions at various points which return one of three values: the operation
+is not implemented by the driver, the operation is implemented but failed in a non-recoverable manner,
+or the operation is implemented and succeeded.
+
+Various parts of HDF5 check only the top-level file driver and do something special if it is
+the MPI-IO driver. However, we might want to be able to put the MPI-IO driver under other drivers
+such as the raw part of a split driver or under a debug driver whose sole purpose is to accumulate
+statistics as it passes all requests through to the MPI-IO driver.
Therefore we will probably need +a function which takes a format address and or object type and returns the driver which would have +been used at the lowest level to process the request. */ diff --git a/doxygen/examples/VFL.html b/doxygen/examples/VFL.html deleted file mode 100644 index e0942fc077d..00000000000 --- a/doxygen/examples/VFL.html +++ /dev/null @@ -1,1598 +0,0 @@ - - - - -HDF5 Virtual File Layer - - - - - - - - -Revision History -

Initial document, 18 November 1999.

- -

Updated on 10/24/00, Quincey Koziol

- -

Added the section “Programming Note for C++ Developers Using C -Functions,” 08/23/2012, Mark Evans - - - -

-


-

Table of Contents

- -


- - -

Introduction

- -

-The HDF5 file format describes how HDF5 data structures and dataset raw -data are mapped to a linear format address space and the HDF5 -library implements that bidirectional mapping in terms of an -API. However, the HDF5 format specifications do not indicate how -the format address space is mapped onto storage and HDF (version 5 and -earlier) simply mapped the format address space directly onto a single -file by convention. - -

-

-Since early versions of HDF5 it became apparent that users want the ability to -map the format address space onto different types of storage (a single file, -multiple files, local memory, global memory, network distributed global -memory, a network protocol, etc.) with various types of maps. For -instance, some users want to be able to handle very large format address -spaces on operating systems that support only 2GB files by partitioning the -format address space into equal-sized parts each served by a separate -file. Other users want the same multi-file storage capability but want to -partition the address space according to purpose (raw data in one file, object -headers in another, global heap in a third, etc.) in order to improve I/O -speeds. - -

-

-In fact, the number of storage variations is probably larger than the -number of methods that the HDF5 team is capable of implementing and -supporting. Therefore, a Virtual File Layer API is being -implemented which will allow application teams or departments to design -and implement their own mapping between the HDF5 format address space -and storage, with each mapping being a separate file driver -(possibly written in terms of other file drivers). The HDF5 team will -provide a small set of useful file drivers which will also serve as -examples for those who which to write their own: - -

-
- -
H5FD_SEC2 -
-This is the default driver which uses Posix file-system functions like -read and write to perform I/O to a single file. All I/O -requests are unbuffered although the driver does optimize file seeking -operations to some extent. - -
H5FD_STDIO -
-This driver uses functions from `stdio.h' to perform buffered I/O -to a single file. - -
H5FD_CORE -
-This driver performs I/O directly to memory and can be used to create small -temporary files that never exist on permanent storage. This type of storage is -generally very fast since the I/O consists only of memory-to-memory copy -operations. - -
H5FD_MPIIO -
-This is the driver of choice for accessing files in parallel using MPI and -MPI-IO. It is only predefined if the library is compiled with parallel I/O -support. - -
H5FD_FAMILY -
-Large format address spaces are partitioned into more manageable pieces and -sent to separate storage locations using an underlying driver of the user's -choice. The h5repart tool can be used to change the sizes of the -family members when stored as files or to convert a family of files to a -single file or vice versa. - -
H5FD_SPLIT -
-The format address space is split into meta data and raw data and each is -mapped onto separate storage using underlying drivers of the user's -choice. The meta data storage can be read by itself (for limited -functionality) or both files can be accessed together. -
- - - -
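The equal-sized partitioning used by a family-style driver comes down to integer division. The sketch below shows one way a format address could be mapped to a member file and an offset within it; `demo_family_loc_t` and `demo_family_locate` are hypothetical names for this illustration and are not part of the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where one piece of an I/O request lands. */
typedef struct demo_family_loc_t {
    uint64_t member;   /* index of the member file                      */
    uint64_t offset;   /* byte offset inside that member                */
    uint64_t nbytes;   /* bytes of the request that fit in this member  */
} demo_family_loc_t;

/* Map a format address to a member file, assuming every member holds
 * exactly memb_size bytes of the format address space. */
static demo_family_loc_t
demo_family_locate(uint64_t addr, uint64_t size, uint64_t memb_size)
{
    demo_family_loc_t loc;

    assert(memb_size > 0);
    loc.member = addr / memb_size;
    loc.offset = addr % memb_size;

    /* A request may cross a member boundary; report only the first piece. */
    loc.nbytes = memb_size - loc.offset;
    if (loc.nbytes > size)
        loc.nbytes = size;
    return loc;
}
```

A request that crosses a member boundary would be handled by calling the helper again with the address advanced by `nbytes` and the size reduced by the same amount.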

Using a File Driver

- -

-Most application writers will use a driver defined by the HDF5 library or -contributed by another programming team. This chapter describes how existing -drivers are used. - -

- - - -

Driver Header Files

- -

-Each file driver is defined in its own public header file which should -be included by any application which plans to use that driver. The -predefined drivers are in header files whose names begin with -`H5FD' followed by the driver name and `.h'. The `hdf5.h' -header file includes all the predefined driver header files. - -

-

-Once the appropriate header file is included a symbol of the form -`H5FD_' followed by the upper-case driver name will be the driver -identification number.(1) However, the -value may change if the library is closed (e.g., by calling -H5close) and the symbol is referenced again. - -

- - -

Creating and Opening Files

- -

-To create or open a file, one must define the method by which the
-storage is accessed(2). This is done by creating a file access property
-list(3), which is passed to the H5Fcreate or
-H5Fopen function. A default file access property list is created by
-calling H5Pcreate, and then the file driver information is inserted by
-calling a driver initialization function such as H5Pset_fapl_family:
-

- -
-hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
-hsize_t member_size = 100*1024*1024; /*100MB*/
-H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
-hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
-H5Pclose(fapl);
-
- -

-Each file driver will have its own initialization function -whose name is H5Pset_fapl_ followed by the driver name and which -takes a file access property list as the first argument followed by -additional driver-dependent arguments. - -

-

-An alternative to using the driver initialization function is to set the -driver directly using the H5Pset_driver function.(4) Its second argument is the file driver identifier, which may -have a different numeric value from run to run depending on the order in which -the file drivers are registered with the library. The third argument -encapsulates the additional arguments of the driver initialization -function. This method only works if the file driver writer has made the -driver-specific property list structure a public datatype, which is -often not the case. - -

- -
-hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
-static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
-H5Pset_driver(fapl, H5FD_FAMILY, &fa);
-hid_t file = H5Fcreate("foo.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
-H5Pclose(fapl);
-
- -

-It is also possible to query the file driver information from a file access -property list by calling H5Pget_driver to determine the driver and then -calling a driver-defined query function to obtain the driver information: - -

- -
-hid_t driver = H5Pget_driver(fapl);
-if (H5FD_SEC2==driver) {
-    /*nothing further to get*/
-} else if (H5FD_FAMILY==driver) {
-    hid_t member_fapl;
-    hsize_t member_size;
-    H5Pget_fapl_family(fapl, &member_size, &member_fapl);
-} else if (....) {
-    ....
-}
-
- - - -

Performing I/O

- -

-The H5Dread and H5Dwrite functions transfer data between -application memory and the file. They both take an optional data transfer -property list which has some general driver-independent properties and -optional driver-defined properties. An application will typically perform I/O -in one of three styles via the H5Dread or H5Dwrite function: - -

-

-Like file access properties in the previous section, data transfer properties -can be set using a driver initialization function or a general purpose -function. For example, to set the MPI-IO driver to use independent access for -I/O operations one would say: - -

- -
-hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
-H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
-H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
-H5Pclose(dxpl);
-
- -

-The alternative is to initialize a driver defined C struct and pass it -to the H5Pset_driver function: - -

- -
-hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
-static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
-H5Pset_driver(dxpl, H5FD_MPIO, &dx);
-H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
-
- -

-The transfer property list can be queried in a manner similar to the file -access property list: the driver provides a function (or functions) to return -various information about the transfer property list: - -

- -
-hid_t driver = H5Pget_driver(dxpl);
-if (H5FD_MPIO==driver) {
-    H5FD_mpio_xfer_t xfer_mode;
-    H5Pget_dxpl_mpio(dxpl, &xfer_mode);
-} else {
-    ....
-}
-
- - - -

File Driver Interchangeability

- -

-The HDF5 specifications describe two things: the mapping of data onto a linear -format address space and the C API which performs the mapping. -However, the mapping of the format address space onto storage intentionally -falls outside the scope of the HDF5 specs. This is a direct result of the fact -that it is not generally possible to store information about how to access -storage inside the storage itself. For instance, given only the file name -`/arborea/1225/work/f%03d' the HDF5 library is unable to tell whether the -name refers to a file on the local file system, a family of files on the local -file system, a file on host `arborea' port 1225, a family of files on a -remote system, etc. - -

-

-There are two ways in which the library could figure out where the storage is located:
-storage access information can be provided by the user, or the library can try
-all known file access methods. This implementation uses the former method.
-

-

-In general, if a file was created with one driver then it isn't possible to -open it with another driver. There are of course exceptions: a file created -with MPIO could probably be opened with the sec2 driver, any file created -by the sec2 driver could be opened as a family of files with one member, -etc. In fact, sometimes a file must not only be opened with the same -driver but also with the same driver properties. The predefined drivers are -written in such a way that specifying the correct driver is sufficient for -opening a file. - -

- - -

Implementation of a Driver

- -

-A driver is simply a collection of functions and data structures which are -registered with the HDF5 library at runtime. The functions fall into these -categories: - -

- - - - - -

Mode Functions

- -

-Some drivers need information about file access and data transfers which are -very specific to the driver. The information is usually implemented as a pair -of pointers to C structs which are allocated and initialized as part of an -HDF5 property list and passed down to various driver functions. There are two -classes of settings: file access modes that describe how to access the file -through the driver, and data transfer modes which are settings that control -I/O operations. Each file opened by a particular driver may have a different -access mode; each dataset I/O request for a particular file may have a -different data transfer mode. - -

-

-Since each driver has its own particular requirements for various settings, -each driver is responsible for defining the mode structures that it -needs. Higher layers of the library treat the structures as opaque but must be -able to copy and free them. Thus, the driver provides either the size of the -structure or a pair of function pointers for each of the mode types. - -
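The "size or function pointers" choice can be sketched as follows. This is a simplified illustration of how a higher layer might copy and free an opaque mode structure, not the library's actual code: `demo_class_t`, `demo_fapl_copy` and `demo_fapl_free` are hypothetical names, and only the three fields needed for the example are modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of a driver class: only the fields used for copying. */
typedef struct demo_class_t {
    size_t  fapl_size;                      /* struct size, or 0 if unused   */
    void *(*fapl_copy)(const void *fapl);   /* deep-copy callback, or NULL   */
    void  (*fapl_free)(void *fapl);         /* free callback, or NULL        */
} demo_class_t;

/* Copy an opaque mode struct: the driver's copy callback wins when it is
 * set; otherwise fall back to a flat copy of fapl_size bytes. */
static void *
demo_fapl_copy(const demo_class_t *cls, const void *fapl)
{
    void *copy;

    if (fapl == NULL)
        return NULL;
    if (cls->fapl_copy != NULL)
        return cls->fapl_copy(fapl);
    if (cls->fapl_size == 0)
        return NULL;            /* driver defines no mode structure */

    copy = malloc(cls->fapl_size);
    if (copy != NULL)
        memcpy(copy, fapl, cls->fapl_size);
    return copy;
}

/* Free an opaque mode struct: use the driver's callback when present,
 * otherwise the C library's free(). */
static void
demo_fapl_free(const demo_class_t *cls, void *fapl)
{
    if (fapl == NULL)
        return;
    if (cls->fapl_free != NULL)
        cls->fapl_free(fapl);
    else
        free(fapl);
}
```

A flat copy is enough for structures holding only plain values; a structure that owns other resources, such as a nested property list, needs the callback pair so those resources are duplicated and released as well.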

-

-Example: The family driver needs to know how the format address -space is partitioned and the file access property list to use for the -family members. - -

- -
-/* Driver-specific file access properties */
-typedef struct H5FD_family_fapl_t {
-    hsize_t     memb_size;      /*size of each member                   */
-    hid_t       memb_fapl_id;   /*file access property list of each memb*/
-} H5FD_family_fapl_t;
-
-/* Driver specific data transfer properties */
-typedef struct H5FD_family_dxpl_t {
-    hid_t       memb_dxpl_id;   /*data xfer property list of each memb  */
-} H5FD_family_dxpl_t;
-
- -

-In order to copy or free one of these structures the member file access -or data transfer properties must also be copied or freed. This is done -by providing a copy and close function for each structure: - -

-

-Example: The file access property list copy and close functions -for the family driver: - -

- -
-static void *
-H5FD_family_fapl_copy(const void *_old_fa)
-{
-    const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
-    H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
-    assert(new_fa);
-
-    memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
-    new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
-    return new_fa;
-}
-
-static herr_t
-H5FD_family_fapl_free(void *_fa)
-{
-    H5FD_family_fapl_t  *fa = (H5FD_family_fapl_t*)_fa;
-    H5Pclose(fa->memb_fapl_id);
-    free(fa);
-    return 0;
-}
-
- -

-Generally when a file is created or opened the file access properties -for the driver are copied into the file pointer which is returned and -they may be modified from their original value (for instance, the file -family driver modifies the member size property when opening an existing -family). In order to support the H5Fget_access_plist function the -driver must provide a fapl_get callback which creates a copy of -the driver-specific properties based on a particular file. - -

-

-Example: The file family driver copies the member size file -access property list into the return value: - -

- -
-static void *
-H5FD_family_fapl_get(H5FD_t *_file)
-{
-    H5FD_family_t	*file = (H5FD_family_t*)_file;
-    H5FD_family_fapl_t	*fa = calloc(1, sizeof(H5FD_family_fapl_t));
-
-    fa->memb_size = file->memb_size;
-    fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
-    return fa;
-}
-
- - - -

File Functions

- -

-The higher layers of the library expect files to have a name and allow the -file to be accessed in various modes. The driver must be able to create a new -file, replace an existing file, or open an existing file. Opening or creating -a file should return a handle, a pointer to a specialization of the -H5FD_t struct, which allows read-only or read-write access and which -will be passed to the other driver functions as they are -called.(5) - -

- -
-typedef struct {
-    /* Public fields */
-    H5FD_class_t *cls; /*class data defined below*/
-
-    /* Private fields -- driver-defined */
-
-} H5FD_t;
-
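The "specialization" of the handle relies on a common C idiom: the generic struct is the first member of the driver's own struct, so a pointer to one can be converted to a pointer to the other. The sketch below illustrates the idiom with hypothetical `demo_*` types in place of `H5FD_t` and `H5FD_class_t`. As a simplification, the driver sets the class pointer itself here, whereas the sec2 example in this document leaves the public part to the library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for H5FD_class_t. */
typedef struct demo_class_t { const char *name; } demo_class_t;

/* Generic handle seen by the higher layers (stand-in for H5FD_t). */
typedef struct demo_file_t {
    const demo_class_t *cls;
} demo_file_t;

/* Driver-specific handle: the generic part must be the first member so
 * that a demo_file_t* can be cast back to a demo_mem_file_t*. */
typedef struct demo_mem_file_t {
    demo_file_t pub;   /* public part, must be first    */
    size_t      eoa;   /* driver-private: end of address */
} demo_mem_file_t;

static const demo_class_t demo_mem_class = { "demo_mem" };

/* Open: allocate the specialized struct, hand back the generic pointer. */
static demo_file_t *
demo_mem_open(void)
{
    demo_mem_file_t *file = calloc(1, sizeof *file);

    if (file == NULL)
        return NULL;
    file->pub.cls = &demo_mem_class;
    return &file->pub;
}

/* Callbacks receive the generic pointer and cast it back. */
static size_t
demo_mem_get_eoa(demo_file_t *_file)
{
    demo_mem_file_t *file = (demo_mem_file_t *)_file;
    return file->eoa;
}

static int
demo_mem_set_eoa(demo_file_t *_file, size_t addr)
{
    demo_mem_file_t *file = (demo_mem_file_t *)_file;
    file->eoa = addr;
    return 0;
}

static int
demo_mem_close(demo_file_t *_file)
{
    free(_file);   /* same address as the enclosing demo_mem_file_t */
    return 0;
}
```

The cast is valid because C guarantees that a pointer to a struct, suitably converted, points to its first member and vice versa.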
- -

-Example: The family driver requires handles to the underlying -storage, the size of the members for this particular file (which might be -different than the member size specified in the file access property list if -an existing file family is being opened), the name used to open the file in -case additional members must be created, and the flags to use for creating -those additional members. The eoa member caches the size of the format -address space so the family members don't have to be queried in order to find -it. - -

- -
-/* The description of a file belonging to this driver. */
-typedef struct H5FD_family_t {
-    H5FD_t      pub;            /*public stuff, must be first           */
-    hid_t       memb_fapl_id;   /*file access property list for members */
-    hsize_t     memb_size;      /*maximum size of each member file      */
-    int         nmembs;         /*number of family members              */
-    int         amembs;         /*number of member slots allocated      */
-    H5FD_t      **memb;         /*dynamic array of member pointers      */
-    haddr_t     eoa;            /*end of allocated addresses            */
-    char        *name;          /*name generator printf format          */
-    unsigned    flags;          /*flags for opening additional members  */
-} H5FD_family_t;
-
- -

-Example: The sec2 driver needs to keep track of the underlying Unix -file descriptor and also the end of format address space and current Unix file -size. It also keeps track of the current file position and last operation -(read, write, or unknown) in order to optimize calls to lseek. The -device and inode fields are defined on Unix in order to uniquely -identify the file and will be discussed below. - -

- -
-typedef struct H5FD_sec2_t {
-    H5FD_t      pub;                    /*public stuff, must be first   */
-    int         fd;                     /*the unix file                 */
-    haddr_t     eoa;                    /*end of allocated region       */
-    haddr_t     eof;                    /*end of file; current file size*/
-    haddr_t     pos;                    /*current file I/O position     */
-    int         op;                     /*last operation                */
-    dev_t       device;                 /*file device number            */
-    ino_t       inode;                  /*file i-node number            */
-} H5FD_sec2_t;
-
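The seek optimization mentioned above amounts to remembering where the last transfer ended and what kind of operation it was. The following sketch models only that bookkeeping: no real system calls are made, a counter stands in for `lseek`, and all `demo_*` and `DEMO_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_POS_UNDEF UINT64_MAX   /* "position unknown" marker */

enum demo_op { DEMO_OP_UNKNOWN, DEMO_OP_READ, DEMO_OP_WRITE };

/* Hypothetical subset of a sec2-like file struct. */
typedef struct demo_file_t {
    uint64_t     pos;      /* current file I/O position     */
    enum demo_op op;       /* last operation performed      */
    unsigned     nseeks;   /* counts simulated lseek calls  */
} demo_file_t;

/* A seek is needed unless the request starts exactly where the previous
 * transfer ended and is the same kind of operation. */
static int
demo_need_seek(const demo_file_t *file, uint64_t addr, enum demo_op op)
{
    return addr != file->pos || op != file->op;
}

/* Pretend transfer: only the position bookkeeping is modeled. */
static void
demo_transfer(demo_file_t *file, uint64_t addr, uint64_t size, enum demo_op op)
{
    if (demo_need_seek(file, addr, op))
        file->nseeks++;   /* a real driver would call lseek() here */

    /* ... the real read() or write() call would go here ... */

    file->pos = addr + size;
    file->op  = op;
}
```

For a run of sequential reads only the first transfer pays for a seek; switching between reading and writing, or jumping to another address, triggers a new one.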
- - - -

Opening Files

- -

-All drivers must define a function for opening/creating a file. This -function should have a prototype which is: - -

-

-

-
Function: static H5FD_t * open (const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr) -
- -

-

-The file name name and file access property list fapl are
-the same as were specified in the H5Fcreate or H5Fopen
-call. The flags are also the same as in those calls, except that the
-flag H5F_ACC_CREAT is also present if the call was to
-H5Fcreate; the flags are documented in the `H5Fpublic.h'
-file. The maxaddr argument is the maximum format address that the
-driver should be prepared to handle (the minimum address is always
-zero).
-

- -

-

-Example: The sec2 driver opens a Unix file with the requested name -and saves information which uniquely identifies the file (the Unix device -number and inode). - -

- -
-static H5FD_t *
-H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/,
-               haddr_t maxaddr)
-{
-    unsigned    o_flags;
-    int         fd;
-    struct stat sb;
-    H5FD_sec2_t *file=NULL;
-
-    /* Check arguments */
-    if (!name || !*name) return NULL;
-    if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL;
-    if (ADDR_OVERFLOW(maxaddr)) return NULL;
-
-    /* Build the open flags */
-    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
-    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
-    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
-    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;
-
-    /* Open the file */
-    if ((fd=open(name, o_flags, 0666))<0) return NULL;
-    if (fstat(fd, &sb)<0) {
-        close(fd);
-        return NULL;
-    }
-
-    /* Create the new file struct */
-    file = calloc(1, sizeof(H5FD_sec2_t));
-    if (!file) {
-        close(fd);
-        return NULL;
-    }
-    file->fd = fd;
-    file->eof = sb.st_size;
-    file->pos = HADDR_UNDEF;
-    file->op = OP_UNKNOWN;
-    file->device = sb.st_dev;
-    file->inode = sb.st_ino;
-
-    return (H5FD_t*)file;
-}
-
- - - -

Closing Files

- -

-Closing a file simply means that all cached data should be flushed to the next -lower layer, the file should be closed at the next lower layer, and all -file-related data structures should be freed. All information needed by the -close function is already present in the file handle. - -

-

-

-
Function: static herr_t close (H5FD_t *file) -
- -

-

-The file argument is the handle which was returned by the open -function, and the close should free only memory associated with the -driver-specific part of the handle (the public parts will have already been released by HDF5's virtual file layer). -

- -

-

-Example: The sec2 driver just closes the underlying Unix file, -making sure that the actual file size is the same as that known to the -library by writing a zero to the last file position if it hasn't been -written by some previous operation (which happens in the same code which -flushes the file contents and is shown below). - -

- -
-static herr_t
-H5FD_sec2_close(H5FD_t *_file)
-{
-    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
-
-    if (H5FD_sec2_flush(_file)<0) return -1;
-    if (close(file->fd)<0) return -1;
-    free(file);
-    return 0;
-}
-
- - - -

File Keys

- -

-Occasionally an application will attempt to open a single file more than one -time in order to obtain multiple handles to the file. HDF5 allows the files to -share information(6) but in order to -accomplish this HDF5 must be able to tell when two names refer to the same -file. It does this by associating a driver-defined key with each file opened -by a driver and comparing the key for an open request with the keys for all -other files currently open by the same driver. - -

-

-

-
Function: static int cmp (const H5FD_t *f1, const H5FD_t *f2) -
- -

-

-The driver may provide a function which compares two files f1 and -f2 belonging to the same driver and returns a negative, positive, or -zero value a la the strcmp function.(7) If this -function is not provided then HDF5 assumes that all calls to the open -callback return unique files regardless of the arguments and it is up to the -application to avoid doing this if that assumption is incorrect. -

- -

-

-Each time a file is opened the library calls the cmp function to -compare that file with all other files currently open by the same driver and -if one of them matches (at most one can match) then the file which was just -opened is closed and the previously opened file is used instead. - -

-

-Opening a file twice with incompatible flags will result in failure. For -instance, opening a file with the truncate flag is a two step process which -first opens the file without truncation so keys can be compared, and if no -matching file is found already open then the file is closed and immediately -reopened with the truncation flag set (if a matching file is already open then -the truncating open will fail). - -

-

-Example: The sec2 driver uses the Unix device and i-node as the -key. They were initialized when the file was opened. - -

- -
-static int
-H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
-{
-    const H5FD_sec2_t   *f1 = (const H5FD_sec2_t*)_f1;
-    const H5FD_sec2_t   *f2 = (const H5FD_sec2_t*)_f2;
-
-    if (f1->device < f2->device) return -1;
-    if (f1->device > f2->device) return 1;
-
-    if (f1->inode < f2->inode) return -1;
-    if (f1->inode > f2->inode) return 1;
-
-    return 0;
-}
-
- - - -

Saving Modes Across Opens

- -

-Some drivers may also need to store certain information in the file superblock -in order to be able to reliably open the file at a later date. This is done by -three functions: one to determine how much space will be necessary to store -the information in the superblock, one to encode the information, and one to -decode the information. These functions are optional, but if any one is -defined then the other two must also be defined. - -

-

-

-
Function: static hsize_t sb_size (H5FD_t *file) -
-
Function: static herr_t sb_encode (H5FD_t *file, char *name, unsigned char *buf) -
-
Function: static herr_t sb_decode (H5FD_t *file, const char *name, const unsigned char *buf) -
- -

-

-The sb_size function returns the number of bytes necessary to encode -information needed later if the file is reopened. The sb_encode -function encodes information from the file into buffer buf -allocated by the caller. It also writes an 8-character string (plus null -termination) into the name argument, which should be a unique -identification for the driver. The sb_decode function looks at -the name - -

-

- decodes -data from the buffer buf and updates the file argument with the new information, -advancing *p in the process. -

- -

-

-The part of this which is somewhat tricky is that the file must be readable -before the superblock information is decoded. File access modes fall outside -the scope of the HDF5 file format, but they are placed inside the superblock -for convenience.(8) - -

-

-Example: To be written later. - -
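As a rough sketch only, the three callbacks could look like the following for a hypothetical driver that needs to remember a single 64-bit "member size" across opens. Everything named sbdemo_* is invented for illustration, the file struct is a stand-in for a real driver struct (which would begin with H5FD_t), and uint64_t and int stand in for hsize_t and herr_t so the fragment compiles without the HDF5 headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical driver-specific file struct (a real one starts with H5FD_t). */
typedef struct sbdemo_file_t {
    uint64_t memb_size;                 /* value to persist in the superblock */
} sbdemo_file_t;

/* Number of bytes the driver needs in the superblock. */
static uint64_t
sbdemo_sb_size(const sbdemo_file_t *file)
{
    (void)file;                         /* size is fixed in this sketch */
    return 8;
}

/* Write the 8-character driver name (plus NUL) and the encoded data. */
static int
sbdemo_sb_encode(const sbdemo_file_t *file, char *name, unsigned char *buf)
{
    int i;

    assert(file && name && buf);
    memcpy(name, "SBDEMO01", 9);        /* 8 characters + terminating NUL */
    for (i = 0; i < 8; i++)             /* little-endian, byte by byte */
        buf[i] = (unsigned char)((file->memb_size >> (8 * i)) & 0xff);
    return 0;
}

/* Check the driver name, then restore the saved value into the file struct. */
static int
sbdemo_sb_decode(sbdemo_file_t *file, const char *name, const unsigned char *buf)
{
    uint64_t value = 0;
    int      i;

    assert(file && name && buf);
    if (strcmp(name, "SBDEMO01") != 0) return -1;   /* written by another driver */
    for (i = 0; i < 8; i++)
        value |= (uint64_t)buf[i] << (8 * i);
    file->memb_size = value;
    return 0;
}
```

Encoding byte by byte keeps the stored form independent of the host's byte order, which matters if the file is reopened on a different machine.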

- - -

Address Space Functions

- -

-HDF5 does not assume that a file is a linear address space of bytes. Instead, -the library will call functions to allocate and free portions of the HDF5 -format address space, which in turn map onto functions in the file driver to -allocate and free portions of file address space. The library tells the file -driver how much format address space it wants to allocate and the driver -decides what format address to use and how that format address is mapped onto -the file address space. Usually the format address is chosen so that the file -address can be calculated in constant time for data I/O operations (which are -always specified by format addresses). - -

- - - -

Userblock and Superblock

- -

-The HDF5 format allows an optional userblock to appear before the actual HDF5 -data in such a way that if the userblock is removed from the file and -everything remaining is shifted downward in the file address space, then the -file is still a valid HDF5 file. The userblock size can be zero or any -power of two greater than or equal to 512 and the file superblock begins -immediately after the userblock. - -
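The size rule can be checked in a few lines. This is a minimal sketch that assumes the rule documented for H5Pset_userblock (zero, or a power of two no smaller than 512); the helper name userblock_size_is_valid is hypothetical and is not part of the HDF5 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if size is an acceptable userblock size. */
static bool
userblock_size_is_valid(uint64_t size)
{
    if (size == 0) return true;         /* no userblock at all */
    if (size < 512) return false;       /* smallest nonzero userblock */
    return (size & (size - 1)) == 0;    /* exactly one bit set: power of two */
}
```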

-

-HDF5 allocates space for the userblock and superblock by calling an -allocation function defined below, which must return a chunk of memory at -format address zero on the first call. - -

- - -

Allocation of Format Regions

- -

-The library makes many types of allocation requests: - -

-
- -
H5FD_MEM_SUPER -
-An allocation request for the userblock and/or superblock. -
H5FD_MEM_BTREE -
-An allocation request for a node of a B-tree. -
H5FD_MEM_DRAW -
-An allocation request for the raw data of a dataset. -
H5FD_MEM_META -
-An allocation request for the raw data of a dataset which -the user has indicated will be relatively small. -
H5FD_MEM_GROUP -
-An allocation request for a group leaf node (internal nodes of the group tree -are allocated as H5FD_MEM_BTREE). -
H5FD_MEM_GHEAP -
-An allocation request for a global heap collection. Global heaps are used to -store certain types of references such as dataset region references. The set -of all global heap collections can become quite large. -
H5FD_MEM_LHEAP -
-An allocation request for a local heap. Local heaps are used to store the -names which are members of a group. The combined size of all local heaps is a -function of the number of object names in the file. -
H5FD_MEM_OHDR -
-An allocation request for (part of) an object header. Object headers are -relatively small and include meta information about objects (like the data -space and type of a dataset) and attributes. -
- -

-When a chunk of memory is freed the library adds it to a free list and -allocation requests are satisfied from the free list before requesting memory -from the file driver. Each type of allocation request enumerated above has its -own free list, but the file driver can specify that certain object types can -share a free list. It does so by providing an array which maps a request type -to a free list. If any value of the map is H5FD_MEM_DEFAULT (zero) then the -object's own free list is used. The special value H5FD_MEM_NOLIST indicates -that the library should not attempt to maintain a free list for that -particular object type, instead calling the file driver each time an object of -that type is freed. - -

-

-Mappings predefined in the `H5FDpublic.h' file are: -

- -
H5FD_FLMAP_SINGLE -
-All memory usage types are mapped to a single free list. -
H5FD_FLMAP_DICHOTOMY -
-Memory usage is segregated into meta data and raw data for the purposes of -memory management. -
H5FD_FLMAP_DEFAULT -
-Each memory usage type has its own free list. -
- -

-Example: To make a map that manages object headers on one free list -and everything else on another free list one might initialize the map with the -following code: (the use of H5FD_MEM_SUPER is arbitrary) - -

- -
-H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
-
-for (mt=0; mt<H5FD_MEM_NTYPES; mt++) {
-    map[mt] = (H5FD_MEM_OHDR==mt) ? mt : H5FD_MEM_SUPER;
-}
-
- -

-If an allocation request cannot be satisfied from the free list then one of -two things happens. If the driver defines an allocation callback then it is -used to allocate space; otherwise new memory is allocated from the end of the -format address space by incrementing the end-of-address marker. - -

-

-

-
Function: static haddr_t alloc (H5FD_t *file, H5FD_mem_t type, hsize_t size) -
- -

-

-The file argument is the file from which space is to be allocated, -type is the type of memory being requested (from the list above) without -being mapped according to the freelist map and size is the number of -bytes being requested. The library is allowed to allocate large chunks of -storage and manage them in a layer above the file driver (although the current -library doesn't do that). The allocation function should return a format -address for the first byte allocated. The allocated region extends from that -address for size bytes. If the request cannot be honored then the -undefined address value is returned (HADDR_UNDEF). The first call to -this function for a file which has never had memory allocated must -return a format address of zero or HADDR_UNDEF since this is how the -library allocates space for the userblock and/or superblock. -

- -

- -

-Example: To be written later. - -
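A minimal sketch of an allocation callback, assuming a driver that simply hands out space from the end of the format address space. The bump_* names are hypothetical, and uint64_t with BUMP_ADDR_UNDEF stands in for haddr_t and HADDR_UNDEF so that the fragment is self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define BUMP_ADDR_UNDEF ((uint64_t)(-1))    /* stand-in for HADDR_UNDEF */

/* Hypothetical driver file struct (a real one starts with H5FD_t). */
typedef struct bump_file_t {
    uint64_t eoa;                       /* first address never allocated */
    uint64_t maxaddr;                   /* largest address the driver handles */
} bump_file_t;

/* Return the format address of a new region of size bytes, or UNDEF. */
static uint64_t
bump_alloc(bump_file_t *file, int type, uint64_t size)
{
    uint64_t addr;

    assert(file && file->eoa <= file->maxaddr);
    (void)type;                         /* this sketch treats all types alike */

    if (size > file->maxaddr - file->eoa)
        return BUMP_ADDR_UNDEF;         /* request does not fit */

    addr = file->eoa;                   /* zero on the very first call */
    file->eoa += size;
    return addr;
}
```

Because the EOA marker starts at zero, the first successful call returns format address zero, which is what the library expects when it allocates the userblock and/or superblock.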

- - -

Freeing Format Regions

- -

-When the library is finished using a certain region of the format address -space it will return the space to the free list according to the type of -memory being freed and the free list map described above. If the free list has -been disabled for a particular memory usage type (according to the free list -map) and the driver defines a free callback then it will be -invoked. The free callback is also invoked for all entries on the free -list when the file is closed. - -

-

-

-
Function: static herr_t free (H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size) -
- -

-

-The file argument is the file for which space is being freed; type -is the type of object being freed (from the list above) without being mapped -according to the freelist map; addr is the first format address to free; -and size is the size in bytes of the region being freed. The region -being freed may refer to just part of the region originally allocated and/or -may cross allocation boundaries provided all regions being freed have the same -usage type. However, the library will never attempt to free regions which have -already been freed or which have never been allocated. -

- -

-

-A driver may choose to not define the free function, in which case -format addresses will be leaked. This isn't normally a huge problem since the -library contains a simple free list of its own and freeing parts of the format -address space is not a common occurrence. - -

-

-Example: To be written later. - -
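One simple policy, shown here only as a sketch, is to reclaim space when the freed region sits at the very end of the allocated range and to leak it otherwise. The tail_* names are hypothetical, and uint64_t and int stand in for haddr_t/hsize_t and herr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver file struct (a real one starts with H5FD_t). */
typedef struct tail_file_t {
    uint64_t eoa;                       /* first address never allocated */
} tail_file_t;

/* Free size bytes at addr; only a region ending at the EOA is reclaimed. */
static int
tail_free(tail_file_t *file, int type, uint64_t addr, uint64_t size)
{
    assert(file);
    (void)type;                         /* this sketch treats all types alike */

    /* Reject regions that extend past the allocated range. */
    if (size > file->eoa || addr > file->eoa - size) return -1;

    if (addr + size == file->eoa)
        file->eoa = addr;               /* region is the tail: lower the EOA */
    /* Otherwise the addresses are leaked, which the library tolerates. */
    return 0;
}
```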

- - -

Querying Address Range

- -

-Each file driver must have some mechanism for setting and querying the end of -address, or EOA, marker. The EOA marker is the first format address -after the last format address ever allocated. If the last part of the -allocated address range is freed then the driver may optionally decrease the -EOA marker. - -

-

-

-
Function: static haddr_t get_eoa (H5FD_t *file) -
- -

-

-This function returns the current value of the EOA marker for the specified -file. -

- -

-

-Example: The sec2 driver just returns the current eoa marker value -which is cached in the file structure: - -

- -
-static haddr_t
-H5FD_sec2_get_eoa(H5FD_t *_file)
-{
-    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
-    return file->eoa;
-}
-
- -

-The eoa marker is initially zero when a file is opened and the library may set -it to some other value shortly after the file is opened (after the superblock -is read and the saved eoa marker is determined) or when allocating additional -memory in the absence of an alloc callback (described above). - -

-

-Example: The sec2 driver simply caches the eoa marker in the file -structure and does not extend the underlying Unix file. When the file is -flushed or closed then the Unix file size is extended to match the eoa marker. - -

- -
-static herr_t
-H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
-{
-    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
-    file->eoa = addr;
-    return 0;
-}
-
- - - -

Data Functions

- -

-These functions operate on data, transferring a region of the format address -space between memory and files. - -

- - - -

Contiguous I/O Functions

- -

-A driver must specify two functions to transfer data from the library to the -file and vice versa. - -

-

-

-
Function: static herr_t read (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buf) -
-
Function: static herr_t write (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buf) -
- -

-

-The read function reads data from file file beginning at address -addr and continuing for size bytes into the buffer buf -supplied by the caller. The write function transfers data in the -opposite direction. Both functions take a data transfer property list -dxpl which indicates the fine points of how the data is to be -transferred and which comes directly from the H5Dread or -H5Dwrite function. Both functions receive the type of -data being transferred, which may allow a driver to tune its behavior for -different kinds of data. -

- -

-

-Both functions should return a negative value if they fail to transfer the -requested data, or non-negative if they succeed. The library will never -attempt to read from unallocated regions of the format address space. - -

-

-Example: The sec2 driver just makes system calls. It tries not to -call lseek if the current operation is the same as the previous -operation and the file position is correct. It also fills the output buffer -with zeros when reading between the current EOF and EOA markers and restarts -system calls which were interrupted. - -

- -
-static herr_t
-H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/,
-        haddr_t addr, hsize_t size, void *buf/*out*/)
-{
-    H5FD_sec2_t         *file = (H5FD_sec2_t*)_file;
-    ssize_t             nbytes;
-
-    assert(file && file->pub.cls);
-    assert(buf);
-
-    /* Check for overflow conditions */
-    if (REGION_OVERFLOW(addr, size)) return -1;
-    if (addr+size>file->eoa) return -1;
-
-    /* Seek to the correct location */
-    if ((addr!=file->pos || OP_READ!=file->op) &&
-        file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
-        file->pos = HADDR_UNDEF;
-        file->op = OP_UNKNOWN;
-        return -1;
-    }
-
-    /*
-     * Read data, being careful of interrupted system calls, partial results,
-     * and the end of the file.
-     */
-    while (size>0) {
-        do nbytes = read(file->fd, buf, size);
-        while (-1==nbytes && EINTR==errno);
-        if (-1==nbytes) {
-            /* error */
-            file->pos = HADDR_UNDEF;
-            file->op = OP_UNKNOWN;
-            return -1;
-        }
-        if (0==nbytes) {
-            /* end of file but not end of format address space */
-            memset(buf, 0, size);
-            size = 0;
-        }
-        assert(nbytes>=0);
-        assert((hsize_t)nbytes<=size);
-        size -= (hsize_t)nbytes;
-        addr += (haddr_t)nbytes;
-        buf = (char*)buf + nbytes;
-    }
-
-    /* Update current position */
-    file->pos = addr;
-    file->op = OP_READ;
-    return 0;
-}
-
- -

-Example: The sec2 write callback is similar except it updates -the file EOF marker when extending the file. - -
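The shape of such a write callback can be sketched with plain POSIX calls. This is not the sec2 source: the wr_* names are hypothetical, the struct keeps only the fields the sketch needs, and the seek-avoidance bookkeeping (pos and op) from the read example is left out for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, reduced file struct (a real one starts with H5FD_t). */
typedef struct wr_file_t {
    int      fd;                        /* Unix file descriptor */
    uint64_t eoa;                       /* end of allocated region */
    uint64_t eof;                       /* end of file */
} wr_file_t;

/* Write size bytes from buf at format address addr; 0 on success, -1 on error. */
static int
wr_write(wr_file_t *file, uint64_t addr, size_t size, const void *buf)
{
    const char *p = buf;
    ssize_t     nbytes;

    assert(file && buf);

    /* Refuse to write outside the allocated format address space. */
    if (size > file->eoa || addr > file->eoa - size) return -1;
    if (lseek(file->fd, (off_t)addr, SEEK_SET) < 0) return -1;

    /* Write data, restarting interrupted calls and handling partial writes. */
    while (size > 0) {
        do nbytes = write(file->fd, p, size);
        while (-1 == nbytes && EINTR == errno);
        if (nbytes <= 0) return -1;     /* error, or no progress was made */
        size -= (size_t)nbytes;
        addr += (uint64_t)nbytes;
        p += nbytes;
    }

    /* Extending the file moves the EOF marker forward. */
    if (addr > file->eof) file->eof = addr;
    return 0;
}
```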

- - -

Flushing Cached Data

- -

-Some drivers may desire to cache data in memory in order to make larger I/O -requests to the underlying file and thus improve bandwidth. Such drivers -should register a cache flushing function so that the library can ensure that -data has been flushed out of the drivers in response to the application -calling H5Fflush. - -

-

-

-
Function: static herr_t flush (H5FD_t *file) -
- -

-

-Flush all data for file file to storage. -

- -

-

-Example: The sec2 driver doesn't cache any data but it also doesn't -extend the Unix file as aggressively as it should. Therefore, when finalizing a -file it should write a zero to the last byte of the allocated region so that -when reopening the file later the EOF marker will be at least as large as the -EOA marker saved in the superblock (otherwise HDF5 will refuse to open the -file, claiming that the data appears to be truncated). - -

- -
-static herr_t
-H5FD_sec2_flush(H5FD_t *_file)
-{
-    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
-
-    if (file->eoa>file->eof) {
-        if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1;
-        if (write(file->fd, "", 1)!=1) return -1;
-        file->eof = file->eoa;
-        file->pos = file->eoa;
-        file->op = OP_WRITE;
-    }
-
-    return 0;
-}
-
- - - -

Optimization Functions

- -

-The library is capable of performing several generic optimizations on I/O, but -these types of optimizations may not be appropriate for a given VFL driver. -

- -

-Each driver may provide a query function to allow the library to query whether -to enable these optimizations. If a driver lacks a query function, the library -will disable all types of optimizations which can be queried. -

- -

-

-
Function: static herr_t query (const H5FD_t *file, unsigned long *flags) -
-

-

-This function is called by the library to query which optimizations to enable -for I/O to this driver. These are the flags which are currently defined: - -

-

- -
-

- -

Registration of a Driver

- -

-Before a driver can be used the HDF5 library needs to be told of its -existence. This is done by registering the driver, which results in a driver -identification number. Instead of passing many arguments to the registration -function, the driver information is entered into a structure and the address -of the structure is passed to the registration function where it is -copied. This allows the HDF5 API to be extended while providing backward -compatibility at the source level. - -

-

-

-
Function: hid_t H5FDregister (const H5FD_class_t *cls) -
- -

-

-The driver described by struct cls is registered with the library and an -ID number for the driver is returned. -

- -

-

-The H5FD_class_t type is a struct with the following fields: - -

-
- -
const char *name -
-A pointer to a constant, null-terminated driver name to be used for debugging -purposes. -
size_t fapl_size -
-The size in bytes of the file access mode structure or zero if the driver -supplies a copy function or doesn't define the structure. -
void *(*fapl_copy)(const void *fapl) -
-An optional function which copies a driver-defined file access mode structure. -This field takes precedence over fapl_size when both are defined. -
void (*fapl_free)(void *fapl) -
-An optional function to free the driver-defined file access mode structure. If -null, then the library calls the C free function to free the -structure. -
size_t dxpl_size -
-The size in bytes of the data transfer mode structure or zero if the driver -supplies a copy function or doesn't define the structure. -
void *(*dxpl_copy)(const void *dxpl) -
-An optional function which copies a driver-defined data transfer mode -structure. This field takes precedence over dxpl_size when both are -defined. -
void (*dxpl_free)(void *dxpl) -
-An optional function to free the driver-defined data transfer mode -structure. If null, then the library calls the C free function to -free the structure. -
H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr) -
-The function which opens or creates a new file. -
herr_t (*close)(H5FD_t *file) -
-The function which ends access to a file. -
int (*cmp)(const H5FD_t *f1, const H5FD_t *f2) -
-An optional function to determine whether two open files have the same key. If -this function is not present then the library assumes that two files will -never be the same. -
herr_t (*query)(const H5FD_t *f, unsigned long *flags) -
-An optional function to determine which library optimizations a driver can -support. -
haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size) -
-An optional function to allocate space in the file. -
herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size) -
-An optional function to free space in the file. -
haddr_t (*get_eoa)(H5FD_t *file) -
-A function to query how much of the format address space has been allocated. -
herr_t (*set_eoa)(H5FD_t *file, haddr_t) -
-A function to set the end of address space. -
haddr_t (*get_eof)(H5FD_t *file) -
-A function to return the current end-of-file marker value. -
herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer) -
-A function to read data from a file. -
herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer) -
-A function to write data to a file. -
herr_t (*flush)(H5FD_t *file) -
-A function which flushes cached data to the file. -
H5FD_mem_t fl_map[H5FD_MEM_NTYPES] -
-An array which maps a file allocation request type to a free list. -
- -

-Example: The sec2 driver would be registered as: - -

- -
-static const H5FD_class_t H5FD_sec2_g = {
-    "sec2",                                     /*name                  */
-    MAXADDR,                                    /*maxaddr               */
-    NULL,                                       /*sb_size               */
-    NULL,                                       /*sb_encode             */
-    NULL,                                       /*sb_decode             */
-    0,                                          /*fapl_size             */
-    NULL,                                       /*fapl_get              */
-    NULL,                                       /*fapl_copy             */
-    NULL,                                       /*fapl_free             */
-    0,                                          /*dxpl_size             */
-    NULL,                                       /*dxpl_copy             */
-    NULL,                                       /*dxpl_free             */
-    H5FD_sec2_open,                             /*open                  */
-    H5FD_sec2_close,                            /*close                 */
-    H5FD_sec2_cmp,                              /*cmp                   */
-    H5FD_sec2_query,                            /*query                 */
-    NULL,                                       /*alloc                 */
-    NULL,                                       /*free                  */
-    H5FD_sec2_get_eoa,                          /*get_eoa               */
-    H5FD_sec2_set_eoa,                          /*set_eoa               */
-    H5FD_sec2_get_eof,                          /*get_eof               */
-    H5FD_sec2_read,                             /*read                  */
-    H5FD_sec2_write,                            /*write                 */
-    H5FD_sec2_flush,                            /*flush                 */
-    H5FD_FLMAP_SINGLE,                          /*fl_map                */
-};
-
-static hid_t H5FD_SEC2_g = 0;                   /*driver ID, zero until registered*/
-
-hid_t
-H5FD_sec2_init(void)
-{
-    if (!H5FD_SEC2_g) {
-        H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g);
-    }
-    return H5FD_SEC2_g;
-}
-
- -

-A driver can be removed from the library by unregistering it: - -

-

-

-
Function: herr_t H5FDunregister (hid_t driver) -
-Where driver is the ID number returned when the driver was registered. -
- -

-

-Unregistering a driver makes it unusable for creating new file access or data -transfer property lists but doesn't affect any property lists or files that -already use that driver. - -

- - - - -

Programming Note -for C++ Developers Using C Functions

- -

If a C routine that takes a function pointer as an argument is -called from within C++ code, the C routine should be returned from -normally.

- -

Examples of this kind of routine include callbacks such as -H5Pset_elink_cb and H5Pset_type_conv_cb -and functions such as H5Tconvert and -H5Ewalk2.

- -

Exiting the routine in its normal fashion allows the HDF5 C -Library to clean up its work properly. In other words, if the C++ -application jumps out of the routine back to the C++ -“catch” statement, the library is not given the -opportunity to close any temporary data structures that were set -up when the routine was called. The C++ application should save -some state as the routine is started so that any problem that -occurs might be diagnosed.

- - - - - - - -

Querying Driver Information

- -

-

-
Function: void * H5Pget_driver_data (hid_t fapl) -
-
Function: void * H5Pget_driver_data (hid_t dxpl) -
- -

-

-This function is intended to be used by driver functions, not applications. -It returns a pointer directly into the file access property list -fapl which is a copy of the driver's file access mode originally -provided to the H5Pset_driver function. If its argument is a data -transfer property list dxpl then it returns a pointer to the -driver-specific data transfer information instead. -

- -

- - - -

Miscellaneous

- -

-The various private H5F_low_* functions will be replaced by public -H5FD* functions so they can be called from drivers. - -

-

-All private functions H5F_addr_* which operate on addresses will be -renamed as public functions by removing the first underscore so they can be -called by drivers. - -

-

-The haddr_t address data type will be passed by value throughout the -library. The original intent was that this type would eventually be a union of -file address types for the various drivers and may become quite large, but -that was back when drivers were part of HDF5. It will become an alias for an -unsigned integer type (32 or 64 bits depending on how the library was -configured). - -

-

-The various H5F*.c driver files will be renamed H5FD*.c and each -will have a corresponding header file. All driver functions except the -initializer and API will be declared static. - -

-

-This documentation didn't cover optimization functions which would be useful -to drivers like MPI-IO. Some drivers may be able to perform data pipeline -operations more efficiently than HDF5 and need to be given a chance to -override those parts of the pipeline. The pipeline would be designed to call -various H5FD optimization functions at various points which return one of -three values: the operation is not implemented by the driver, the operation is -implemented but failed in a non-recoverable manner, the operation is -implemented and succeeded. - -

-

-Various parts of HDF5 check only the top-level file driver and do -something special if it is the MPI-IO driver. However, we might want to be -able to put the MPI-IO driver under other drivers such as the raw part of a -split driver or under a debug driver whose sole purpose is to accumulate -statistics as it passes all requests through to the MPI-IO driver. Therefore -we will probably need a function which takes a format address and/or object -type and returns the driver which would have been used at the lowest level to -process the request. - -

- -


-

Footnotes

-

(1)

-

The driver name is by convention and might -not apply to drivers which are not distributed with HDF5. -

(2)

-

The access method also indicates how to translate -the storage name to a storage server such as a file, network protocol, or -memory. -

(3)

-

The term -"file access property list" is a misnomer since storage isn't -required to be a file. -

(4)

-

This -function is overloaded to operate on data transfer property lists also, as -described below. -

(5)

-

Read-only access is only appropriate when opening an existing -file. -

(6)

-

For instance, writing data to one handle will cause -the data to be immediately visible on the other handle. -

(7)

-

The ordering is -arbitrary as long as it's consistent within a particular file driver. -

(8)

-

File access modes do not describe data, but rather -describe how the HDF5 format address space is mapped to the underlying -file(s). Thus, in general the mapping must be known before the file superblock -can be read. However, the user usually knows enough about the mapping for the -superblock to be readable and once the superblock is read the library can fill -in the missing parts of the mapping. -


- - - - -