createVariable(self,
varname,
datatype,
dimensions=(),
zlib=False,
complevel=4,
shuffle=True,
fletcher32=False,
contiguous=False,
chunksizes=None,
endian='native',
least_significant_digit=None,
fill_value=None)
Creates a new variable with the given varname,
datatype, and dimensions. If dimensions are not
given, the variable is assumed to be a scalar.
The datatype can be a numpy datatype object, or a string
that describes a numpy dtype object (like the dtype.str
attribute of a numpy array). Supported specifiers include: 'S1' or
'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h'
or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4'
(NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT),
'f8' or 'd' (NC_DOUBLE). datatype can also be a CompoundType
instance (for a structured, or compound array), a VLType instance (for a
variable-length array), or the python str builtin (for a
variable-length string array).
Data from netCDF variables is presented to python as numpy arrays with
the corresponding data type.
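The datatype specifiers above are ordinary numpy dtype strings, so their correspondence to numpy types can be checked with numpy alone; a quick sketch (the specifier list is copied from the text above):

```python
import numpy as np

# Each supported specifier is a valid numpy dtype string; the
# netCDF external type it maps to is given in the list above.
specifiers = ['S1', 'i1', 'u1', 'i2', 'u2', 'i4', 'u4',
              'i8', 'u8', 'f4', 'f8']
for spec in specifiers:
    dt = np.dtype(spec)
    # dtype.str includes a byte-order character ('<', '>' or '|')
    print(spec, '->', dt.str)
```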
dimensions must be a tuple containing dimension names
(strings) that have been defined previously using
createDimension . The default value is an empty tuple, which
means the variable is a scalar.
If the optional keyword zlib is True , the
data will be compressed in the netCDF file using gzip compression
(default False ).
The optional keyword complevel is an integer between 1
and 9 describing the level of compression desired (default 4). Ignored if
zlib=False .
If the optional keyword shuffle is True, the
HDF5 shuffle filter will be applied before compressing the data, which
significantly improves compression (default True). Ignored if
zlib=False.
If the optional keyword fletcher32 is True ,
the Fletcher32 HDF5 checksum algorithm is activated to detect errors.
Default False .
If the optional keyword contiguous is True ,
the variable data is stored contiguously on disk. Default
False . Setting to True for a variable with an
unlimited dimension will trigger an error.
The optional keyword chunksizes can be used to manually
specify the HDF5 chunksizes for each dimension of the variable. A
detailed discussion of HDF chunking and I/O performance is available in the HDF5 documentation. Basically, you want the chunk size for each
dimension to match as closely as possible the size of the data block that
users will read from the file. chunksizes cannot be set if
contiguous=True .
The optional keyword endian can be used to control
whether the data is stored in little or big endian format on disk.
Possible values are 'little', 'big' or 'native'
(the default). The library will automatically handle endian conversions when
the data is read, but if the data will always be read on a
computer with the opposite byte order from the one used to create the file,
there may be some performance advantage to be gained by setting the
endian-ness.
The zlib, complevel, shuffle, fletcher32, contiguous,
chunksizes and endian keywords are silently ignored
for netCDF 3 files that do not use HDF5.
The optional keyword fill_value can be used to override
the default netCDF _FillValue (the value that the variable
gets filled with before any data is written to it, defaults given in
netCDF4.default_fillvals). If fill_value is set to False ,
then the variable is not pre-filled.
If the optional keyword parameter least_significant_digit
is specified, variable data will be truncated (quantized). In conjunction
with zlib=True this produces 'lossy', but significantly more
efficient compression. For example, if
least_significant_digit=1 , data will be quantized using
numpy.around(scale*data)/scale , where scale = 2**bits, and
bits is determined so that a precision of 0.1 is retained (in this case
bits=4). From http://www.cdc.noaa.gov/cdc/conventions/cdc_netcdf_standard.shtml:
"least_significant_digit -- power of ten of the smallest decimal
place in unpacked data that is a reliable value." Default is
None , or no quantization, or 'lossless' compression.
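The quantization formula can be reproduced with plain numpy. For least_significant_digit=1, bits=4 as stated above; one way to derive it is the smallest integer number of bits such that 2**bits provides steps finer than 10**-1 (the helper below is a sketch of the described behavior, not the library's internal routine):

```python
import numpy as np

def quantize(data, least_significant_digit):
    """Quantize as described above: round to a power-of-two scale
    chosen so that a precision of 10**-least_significant_digit
    is retained."""
    # Smallest bits with 2**bits >= 10**least_significant_digit,
    # so the step 1/scale is no coarser than the requested precision.
    bits = int(np.ceil(np.log2(10.0 ** least_significant_digit)))
    scale = 2.0 ** bits
    return np.around(scale * data) / scale

data = np.array([3.14159, 2.71828])
print(quantize(data, 1))   # quantized with scale = 2**4 = 16
```

The quantized values differ from the originals by at most 0.1, but their trailing bits are zeroed, which is what makes the subsequent zlib compression so much more effective.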
When creating variables in a NETCDF4 or
NETCDF4_CLASSIC formatted file, HDF5 creates something
called a 'chunk cache' for each variable. The default size of the chunk
cache may be large enough to completely fill available memory when
creating thousands of variables. The optional keyword
chunk_cache allows you to reduce (or increase) the size of
the default chunk cache when creating a variable. The setting only
persists as long as the Dataset is open - you can use the
set_var_chunk_cache method to change it the next time the Dataset is
opened. Warning - messing with this parameter can seriously degrade
performance.
The return value is the Variable class instance describing the new variable.
A list of names corresponding to netCDF variable attributes can be
obtained with the Variable method ncattrs() . A dictionary
containing all the netCDF attribute name/value pairs is provided by the
__dict__ attribute of a Variable
instance.
Variable
instances behave much like array objects. Data can be assigned to or
retrieved from a variable with indexing and slicing operations on the Variable instance. A
Variable instance
has five standard attributes: dimensions, dtype, shape,
ndim and least_significant_digit. Application
programs should never modify these attributes. The
dimensions attribute is a tuple containing the names of the
dimensions associated with this variable. The dtype
attribute is a string describing the variable's data type (i4, f8,
S1, etc). The shape attribute is a tuple describing
the current sizes of all the variable's dimensions. The
least_significant_digit attribute describes the power of
ten of the smallest decimal place in the data that contains a reliable
value, as assigned to the Variable instance. If None, the data is not
truncated. The ndim attribute is the number of variable
dimensions.