Spatial processing¶
ela.spatial module¶
-
class
ela.spatial.DepthsRounding(depth_from_col='Depth From (m)', depth_to_col='Depth To (m)')¶ Bases:
objectHelper class to round lithology record classes to the nearest metre of depth
-
depth_from_col¶ Column name storing “from depth” information
- Type
str
-
depth_to_col¶ Column name storing “to depth” information
- Type
str
-
assess_num_collapsed(df, func=<function round_>)¶ How many records would be removed from the lithology records if rounding from/to depths
- Parameters
df (pandas data frame) – bore lithology data
func (callable) – rounding function callable with a signature similar to np.round
- Returns
the number of entries that would be collapsed from/to depths
- Return type
(int)
-
round_to_metre_depths(df, func=<function round_>, remove_collapsed=False)¶ Round lithology record classes to the nearest metre of depth
- Parameters
df (pandas data frame) – bore lithology data
func (callable) – rounding function callable with a signature similar to np.round
remove_collapsed (bool) – should entries where depths from and to are equal be removed from the result
- Returns
a data frame of similar structure as the input but without entries less than a metre resolution.
- Return type
(pandas dataframe)
-
-
class
ela.spatial.GeospatialDataFrameColumnNames(easting_col='Easting', northing_col='Northing', depth_from_ahd_col='Depth From (AHD)', depth_to_ahd_col='Depth To (AHD)')¶ Bases:
objectOperations on data frames with 3D spatial information of lithology logs. The purpose of this class is to adapt ‘pyela’ operations to different data without requiring renaming columns.
-
easting_col¶ name of the data frame column for easting
- Type
str
-
northing_col¶ name of the data frame column for northing
- Type
str
-
depth_from_ahd_col¶ name of the data frame column for the height of the top of the soil column (ahd stands for for australian height datum, but not restricted)
- Type
str
-
depth_to_ahd_col¶ name of the data frame column for the height of the bottom of the soil column (ahd stands for for australian height datum, but not restricted)
- Type
str
-
class_probability_estimates_depth(df, column_name, slice_depth, n_neighbours, mesh_grid, func_training_set=None)¶ Subset data frame with entries at a specified AHD coordinate
- Parameters
df (pandas data frame) – bore lithology data
column_name (str) – name of the column with string information to use to strip entries with missing lithology information
slice_depth (float) – AHD coordinate at which to slice the data frame for lithology observations
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
func_training_set (callable) – a function to processing the training set (e.g. completing dummy with dummy classes, other not present in the trainining set)
- Returns
a list of numpy arrays, shaped like the meshgrid.
-
class_probability_estimates_depth_bbox(df, column_name, slice_depth, n_neighbours, geo_pd, grid_res=100, func_training_set=None)¶ Interpolate over a volume the probability of each lithology class
- Parameters
df (pandas data frame) – bore lithology data, spatially georeferenced
column_name (str) – name of the column with numeric codes for lithology classes
z_ahd_coords (iterable of int) – datum heights at which to interpolate. Must be equal to the length of the last dimension of the volume
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
extract_bore_class_num(bore_log_df, column_name)¶ Gets the columns easting, northing, primary lithology class number, AHD depth ‘from’ and ‘to’ from a bore data log
- Parameters
bore_log_df (pandas data frame) – bore lithology data
column_name (str) – name of the column of interest e.g. lithology descriptions or classes
-
get_knn_model(df, column_name, slice_depth, n_neighbours)¶ Train a K-nearest neighbours model for a given plane
- Parameters
df (data frame) –
column_name (str) –
slice_depth (numeric) –
n_neighbours (int) –
- Returns
trained classifier.
- Return type
KNeighborsClassifier
-
get_lithology_classes_probabilities(lithologies, shape, df, column_name, z_ahd_coords, n_neighbours, mesh_grid)¶ Interpolate over a volume the probability of each lithology class
- Parameters
shape (tuple of ints) – shape of the output volumes to interpolate over
df (pandas data frame) – bore lithology data, spatially georeferenced
column_name (str) – name of the column with numeric codes for lithology classes
z_ahd_coords (iterable of int) – datum heights at which to interpolate. Must be equal to the length of the last dimension of the volume
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
get_lithology_observations_for_depth(df, slice_depth, column_name)¶ Subset data frame with entries at a specified AHD coordinate, and with valid lithology information.
- Args:
df (pandas data frame): bore lithology data slice_depth (float): AHD coordinate at which to slice the data frame for lithology observations column_name (str): name of the column with string information to use to strip entries with missing lithology information
- Returns:
a (view of a) data frame; a subset of the input data frame, entries intersecting with the specified slice depth
-
interpolate_lithologydata_slice_depth(df, column_name, slice_depth, n_neighbours, mesh_grid)¶ Interpolate lithology data
- Parameters
df (pandas data frame) – bore lithology data
slice_depth (float) – AHD coordinate at which to slice the data frame for lithology observations
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
interpolate_lithologydata_slice_depth_bbox(df, column_name, slice_depth, n_neighbours, geo_pd, grid_res=100)¶ Interpolate lithology data
- Parameters
df (pandas data frame) – bore lithology data
slice_depth (float) – AHD coordinate at which to slice the data frame for lithology observations
n_neighbours (int) – number of nearest neighbours
geo_pd (geopandas df) – vector of spatial data from which to get the bounds of interest (bounding box)
grid_res (int) – grid resolution in m for x and y.
- Returns
numpy array, predicted values over the grid.
-
lithologydata_slice_depth(df, slice_depth)¶ Subset data frame with entries at a specified AHD coordinate
- Args:
df (pandas data frame): bore lithology data slice_depth (float): AHD coordinate at which to slice the data frame for lithology observations
- Returns:
a (view of a) data frame, a subset of the input data frame, entries intersecting with the specified slice depth
-
make_training_set(observations, column_name)¶ Create a training set suitable for machine learning by e.g. scikit-learn out of a georeferenced data frame.
- Parameters
observations (pandas data frame) – bore lithology data
column_name (str) – name of the column of interest e.g. lithology descriptions or classes
-
-
class
ela.spatial.GridInterpolation(easting_col='Easting', northing_col='Northing', depth_from_ahd_col='Depth From (AHD)', depth_to_ahd_col='Depth To (AHD)')¶ Bases:
objectOperations interpolating over a grid using a trained model The purpose of this class is to adapt ‘pyela’ operations to different data without requiring renaming columns.
-
dfcn¶
-
northing_col¶ name of the data frame column for northing
- Type
str
-
depth_from_ahd_col¶ name of the data frame column for the height of the top of the soil column (ahd stands for for australian height datum, but not restricted)
- Type
str
-
depth_to_ahd_col¶ name of the data frame column for the height of the bottom of the soil column (ahd stands for for australian height datum, but not restricted)
- Type
str
-
get_knn_model(df, column_name, slice_depth, n_neighbours)¶ Train a K-nearest neighbours model for a given plane
- Args:
df (pandas data frame): bore lithology data column_name (str): name of the column with string information to use to strip entries with missing lithology information slice_depth (float): AHD coordinate at which to slice the data frame for lithology observations n_neighbours (int): number of nearest neighbours
- Returns
trained classifier.
- Return type
KNeighborsClassifier
-
get_lithology_observations_for_depth(df, slice_depth, column_name)¶ Subset data frame with entries at a specified AHD coordinate, and with valid lithology information.
- Args:
df (pandas data frame): bore lithology data slice_depth (float): AHD coordinate at which to slice the data frame for lithology observations column_name (str): name of the column with string information to use to strip entries with missing lithology information
- Returns:
a (view of a) data frame; a subset of the input data frame, entries intersecting with the specified slice depth
-
interpolate_lithologydata_slice_depth(df, column_name, slice_depth, n_neighbours, mesh_grid)¶ Interpolate lithology data
- Parameters
df (pandas data frame) – bore lithology data
column_name (str) – name of the column with string information to use to strip entries with missing lithology information
slice_depth (float) – AHD coordinate at which to slice the data frame for lithology observations
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
interpolate_volume(volume, df, column_name, z_ahd_coords, n_neighbours, mesh_grid)¶ Interpolate lithology data over a volume
- Parameters
volume (ndarray) – 3d volume to fill with interpolated lithologies
df (pandas data frame) – bore lithology data
column_name (str) – name of the column with numeric codes for lithology classes
z_ahd_coords (iterable of int) – datum heights at which to interpolate. Must be equal to the length of the last dimension of the volume
n_neighbours (int) – number of nearest neighbours
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
make_training_set(observations, column_name)¶ Create a training set from a set of geolocated observations
- Parameters
observations (pandas data frame) – bore lithology data with geocoordinates
column_name (str) – name of the column with string information to use to strip entries with missing lithology information
- Returns
observations and predictors (geolocation).
- Return type
(tuple)
-
-
class
ela.spatial.HeightDatumConverter(dem_raster, easting_col='Easting', northing_col='Northing', depth_from_ahd_col='Depth From (AHD)', depth_to_ahd_col='Depth To (AHD)')¶ Bases:
object-
dfcn¶ Adapter to a dataframe column names for depth/height information.
-
dem_raster¶ DEM raster
- Type
rasterio dataset
-
data_grid¶ band 1 data in the raster
- Type
??
-
add_height(lithology_df, depth_from_col='Depth From (m)', depth_to_col='Depth To (m)', depth_from_ahd_col='Depth From (AHD)', depth_to_ahd_col='Depth To (AHD)', easting_col='Easting', northing_col='Northing', drop_na=False)¶ Helper class to round lithology record classes to the nearest metre of depth
- Parameters
lithology_df –
depth_from_col (str) – Column name storing “from depth” information
depth_to_col (str) – Column name storing “to depth” information
depth_from_ahd_col (str) – Column name datum height information
depth_to_ahd_col (str) – Column name datum height information
easting_col (str) – Column name storing easting information
northing_col (str) – Column name storing northing information
drop_na (bool) – If true, entries with nissing datum heights are dropped
- Returns
- Return type
(pandas.DataFrame)
-
raster_drill_df(df)¶ Gets the DEM values for the easting/northing in a data frame
- Parameters
df (pandas.DataFrame) – data frame of records with easting/northing information in columns compatible with this object
- Returns
the datum height values at the specified points
- Return type
(1D numpy array)
-
raster_value_at(easting, northing)¶ Raster value for a given easting/northing
- Parameters
easting (float) – easting value
northing (float) – northing value
- Returns
the value of the band at the specified point
- Return type
(float)
-
-
class
ela.spatial.SliceOperation(dem_array_zeroes_infill, z_index_for_ahd)¶ Bases:
objectHelper class to perform slicing operations on a 3D volume.
-
dem_array_zeroes_infill¶ An array, DEM for the grid at the same x/y resolution as the volume to be sliced.
- Type
2D array
-
z_index_for_ahd¶ bijection from a z index in the volume to its AHD height
- Type
callable
-
static
arithmetic_average(slices)¶
-
static
arithmetic_average_int(slices)¶
-
from_ahd_to_depth_below_ground_level(volume, from_depth, to_depth)¶ Slice a volume at every meter between two depths below ground level
- Parameters
volume (3D array) – volume of interest
from_depth (numeric) – top level depth below ground
to_depth (numeric) – bottom level depth below ground. from_depth is less than to_depth
- Returns
volume values between the two depths below ground.
- Return type
(3D Array)
-
reduce_slices_at_depths(volume, from_depth, to_depth, reduce_func)¶ Slice a volume at every meter between two depths below ground level, and reduce the resulting ‘vertical’ values to a single statistic.
- Parameters
volume (3D array) – volume of interest
from_depth (numeric) – top level depth below ground
to_depth (numeric) – bottom level depth below ground. from_depth is less than to_depth
reduce_func (callable) – A function that takes in a list of 2D grids and returns one 2D grid
- Returns
reduced values (e.g. averaged) for each grid geolocation between the two depths below ground.
- Return type
(2D Array)
-
-
ela.spatial.average_slices(slices)¶ Gets the average values over numeric slices
- Parameters
slices (list of 2D np arrays) – slices to average
-
ela.spatial.burn_volume(volume, surface_raster, height_to_z, below=False, ignore_nan=False, inclusive=False)¶ Burn out parts of a xyz volume given a surface, below or above the intersection of the volume with the surface
- Volume
volume to modify
- Type
3D numpy
- Surface_raster
AHD coordinate at which to slice the data frame for lithology observations
- Type
2D numpy
- Height_to_z
Number of neighbors to pass to KNeighborsClassifier
- Type
a function to convert the surface raster value (height for a DEM) to a corresponding z-index in the volume.
- Below
should the part below or above be burnt out
- Type
bool
- Ignore_nan
If the surface to burn from has NaNs, should it mask out the whole corresponding cells in the volume
- Type
bool
- Inclusive
is the cell in the volume cut by the surface included in the burning out (i.e. set to NaN) or its value kept?
- Type
bool
-
ela.spatial.burn_volume_func(func_below, func_above, volume, surface_raster, height_to_z, below=False, ignore_nan=False, inclusive=False)¶ Reusable function, not for end user. Process parts of a xyz volume given a surface, below or above the intersection of the volume with the surface
-
ela.spatial.create_meshgrid(geo_pd, grid_res)¶ Create a 2D meshgrid to be used with numpy for vectorized operations and Mayavi visualisation.
- Parameters
geo_pd (geopandas) – shape from which we can get the bounding box as a basis for the extend of the meshgrid
grid_res (numeric) – x and y resolution of the grid we create
- Returns
- 2-D coordinate arrays for vectorized evaluations of 2-D scalar/vector fields.
The arrays are ordered such that the first dimension relates to the X coordinate and the second the Y coordinate. This is done such that the 2D coordinate arrays work as-is with visualisation with Mayavi without unnecessary transpose operations.
- Return type
(list of 2 2dim numpy.ndarray)
-
ela.spatial.create_meshgrid_cartesian(x_min, x_max, y_min, y_max, grid_res)¶ Create a 2D meshgrid to be used with numpy for vectorized operations and Mayavi visualisation.
- Parameters
x_min (numeric) – lower x coordinate
x_max (numeric) – upper x coordinate
y_min (numeric) – lower y coordinate
y_max (numeric) – upper y coordinate
grid_res (numeric) – x and y resolution of the grid we create
- Returns
- 2-D coordinate arrays for vectorized evaluations of 2-D scalar/vector fields.
The arrays are ordered such that the first dimension relates to the X coordinate and the second the Y coordinate. This is done such that the 2D coordinate arrays work as-is with visualisation with Mayavi without unnecessary transpose operations.
- Return type
(list of 2 2dim numpy.ndarray)
-
ela.spatial.get_bbox(geo_pd)¶ Gets the bounding box of a geopandas dataframe
- Parameters
geo_pd (geopandas) – shape from which we can get the bounding box as a basis for the extend of the meshgrid
- Returns
bbox (x_min, y_min, x_max, y_max)
- Return type
(tuple of floats)
-
ela.spatial.get_coords_from_gpd_shape(shp, colname='geometry', out_colnames=None)¶ Gets a dataframe of x/y geolocations out of a geopandas object
- Parameters
shp (GeoDataFrame) – a geopandas GeoDataFrame
colname (str) – name of columns that has the geometry in the geopandas shapefile
out_colnames (list of str) – names of the added geolocation columns in the output data frame
- Returns
a two column data frames, coordinates of all the points in the geodataframe
- Return type
(DataFrame)
Example
>>> from ela.doc.sampledata import * >>> from ela.spatial import get_coords_from_gpd_shape >>> dem ,bore_loc, litho_logs = sample_data() >>> df = litho_logs[[DEPTH_FROM_COL, DEPTH_TO_COL,TOP_ELEV_COL,BOTTOM_ELEV_COL,LITHO_DESC_COL,HYDRO_CODE_COL]] >>> geoloc = get_coords_from_gpd_shape(bore_loc, colname='geometry', out_colnames=[EASTING_COL, NORTHING_COL]) >>> geoloc[HYDRO_CODE_COL] = bore_loc[HYDRO_CODE_COL] >>> geoloc.head() >>> # With this data frame we can perform two operations in one go: subsetting the lithology records to only the 640 bores of interest, and adding to the result the x/y geolocations to the data frame. >>> df = pd.merge(df, geoloc, how='inner', on=HYDRO_CODE_COL, sort=False, copy=True, indicator=False, validate=None)
-
ela.spatial.get_unique_coordinates(easting, northing)¶ Gets the unique set of geolocated points
- Parameters
easting (iterable of floats) – easting values
northing (iterable of floats) – northing values
- Returns
two dimensional nparray
- Return type
(numpy array)
-
ela.spatial.interpolate_over_meshgrid(predicting_algorithm, mesh_grid)¶ Interpolate lithology data
- Parameters
predicting_algorithm (algorithm with a predict method.) – trained algorithm such as the K Nearest Neighbours in scikit (KNN)
mesh_grid (tuple) – coordinate matrices to interpolate over (numpy.meshgrid)
- Returns
numpy array, predicted values over the grid.
-
ela.spatial.pad_training_set_functor(classes)¶ Create a function that pads a training set so that all possible classes are present
-
ela.spatial.read_raster_value(dem, band_1, easting, northing)¶ Read a value in a raster grid given easting/northing
- Parameters
dem (rasterio dataset) – dem
band_1 (numpy array) – the band 1 from the DEM
easting (float) – easting value
northing (float) – northing value
- Returns
the value of the band at the specified point
- Return type
(float)
-
ela.spatial.set_at_surface_boundary(volume, surface_raster, height_to_z, value=0.0, ignore_nan=False)¶ Burn out parts of a xyz volume given a surface, below or above the intersection of the volume with the surface
- Volume
volume to modify
- Type
3D numpy
- Surface_raster
AHD coordinate at which to slice the data frame for lithology observations
- Type
2D numpy
- Height_to_z
Number of neighbors to pass to KNeighborsClassifier
- Type
a function to convert the surface raster value (height for a DEM) to a corresponding z-index in the volume.
- Below
should the part below or above be burnt out
- Type
bool
- Ignore_nan
If the surface to burn from has NaNs, should it mask out the whole corresponding cells in the volume
- Type
bool
- Inclusive
is the cell in the volume cut by the surface included in the burning out (i.e. set to NaN) or its value kept?
- Type
bool
-
ela.spatial.slice_volume(volume, slice_surface, height_to_z)¶ read values in a volume along a slicing surface
- Parameters
volume (3D ndarray) – volume to drill
slice_surface (2D numpy array) – array of elevations
height_to_z (callable) – converter from datum height to z index
- Returns
the value in the volume at the specified point
- Return type
(float)
-
ela.spatial.surface_array(raster, x_min, y_min, x_max, y_max, grid_res)¶ Creates an isometric 2D grid from a raster, effectively resampling.
This resampling is done to facilitate downstream operations on gridded interpolated volumes.
Args: raster (rasterio dataset): raster to sample from, typically DEM elevations
x_min (numeric): lower x coordinate x_max (numeric): upper x coordinate y_min (numeric): lower y coordinate y_max (numeric): upper y coordinate grid_res (numeric): x and y resolution of the grid we create
- Returns
grid of raster values, typically DEM elevations
- Return type
(2dim numpy.ndarray)
-
ela.spatial.volume_value_at(volume, slice_surface, height_to_z, x, y)¶ read a volume value at an x/y/z location.
- Parameters
volume (3D ndarray) – volume to drill
slice_surface (2D numpy array) – array of elevations
height_to_z (callable) – converter from datum height to z index
x (float) – easting index
y (float) – northing index
- Returns
the value in the volume at the specified point
- Return type
(float)
-
ela.spatial.vstacked_points(xx, yy)¶
-
ela.spatial.z_index_for_ahd_functor(a=1, b=50)¶ Creates a linear univariate function for height to index conversions
- Parameters
a (float) – slope - defaults to one
b (float) – height offset
- Returns
univariate linear function
- Return type
(callable)