GeoData utilities
SparseGeoArray (SGA)
Within DIVACoast.jl
, Gridded geodata is stored and processed using the SparseGeoArray
data structure. This structure retains the necessary geospatial meta-information for accurate referencing and stores the data in a memory-efficient sparse array format. By storing only non-empty grid cells, it is well-suited for coastal datasets, where large areas—such as open ocean or inland regions—often contain no or non relevant data for the analysis.
Main.DIVACoast.SparseGeoArray
— TypeSparseGeoArray{DT,IT}(filename::String, band::Integer=1) where {DT <: Real, IT <: Integer}
A SparseGeoArray
combines spatial data with geospatial information for referencing. The data itself is stored in a efficient (sparse) format, so only the non-empty grid cells are stored. The geospatial information is stored in a CoordinateTransformations.AffineMap
object, which is used to transform between pixel coordinates and geospatial coordinates.
The SparseGeoArray
is a mutable struct that contains the following attributes:
data
: a dictionary that stores the data values for each pixel in the grid. The keys are tuples of pixel coordinates (x,y), and the values are the data values.nodatavalue
: a value that represents missing or invalid data in the grid. This value is used to identify empty cells in the grid.f
: aCoordinateTransformations.AffineMap
object that defines the transformation between pixel coordinates and geospatial coordinates.crs
: aGeoFormatTypes.WellKnownText
object that defines the coordinate reference system (CRS) for the data. This is used to interpret the geospatial coordinates correctly.metadata
: a dictionary that stores additional metadata about the data, such as the source of the data or the date it was collected.xsize
: the number of pixels in the x-direction (width) of the grid.ysize
: the number of pixels in the y-direction (height) of the grid.projref
: a string that represents the projection reference for the data.filename
: a string that represents the filename of the data file.
SparseGeoArrays - Construct
SparseGeoArrays
can be constructing by multiple ways:
From GeoTIFF
Main.DIVACoast.SparseGeoArray{Float32, Int32}("path/to/file.tif")
Construct (empty)
Note: If you init an empty SparseGeoArray
you need to specify the geospatial-metadata yourself to perform geo-Operations.
Main.DIVACoast.SparseGeoArray{Float32, Int32}()
sga[(x, y)] = value # Fill the SparseGeoArray
Construct (empty) with meta-information from another SparseGeoArray
Main.DIVACoast.emptySGAfromSGA
— Function emptySGAfromSGA(orgSGA::SparseGeoArray{DT,IT}, extentNew) where {DT<:Real,IT<:Integer}
Creates an empty SparseGeoArray from an existing SGA (orgSGA) with same metadta but with a new extent defined by extentNew
and empty data.
Arguments
orgSGA::SparseGeoArray{DT,IT}
: The original SparseGeoArray from which metadata is copied.extentNew
: The new extent for the SparseGeoArray, defines as a named tuple with keysuppL
,uppR
,lwrL
, andlwrR
representing the upper left, upper right, lower left, and lower right corners of the new extent.
Returns
- A new
SparseGeoArray{DT,IT}
Example
emptySGAfromSGA(orgSGA, (uppL=(0.0, 10.0), uppR=(10.0, 10.0), lwrL=(0.0, 0.0), lwrR=(10.0, 0.0)))
Manually construct a SparseGeoArray
If you have your data as a dictionary and know the georeferencing, you can construct it directly:
sga = Main.DIVACoast.SparseGeoArray(
data, # Dict{Tuple{Int,Int}, Float64}
nodatavalue, # e.g., -9999.0
affine_map, # CoordinateTransformations.AffineMap
crs, # GeoFormatTypes.WellKnownText
metadata, # Dict{String,Any}
xsize, ysize, # grid dimensions
projref, # projection string
circular, # Bool
filename # String
)
SparseGeoArray - Operations
A set of core spatial operations is provided for SparseGeoArray
objects, enabling manipulation and analysis of gridded exposure data. These functions allow you to combine, compare, and summarize spatial datasets efficiently:
Main.DIVACoast.sga_union
— Function sga_union(sgaArray::Array{SparseGeoArray{DT,IT}}) where {DT<:Real,IT<:Integer}
sga_union(sga1::SparseGeoArray{DT,IT}, sga2::SparseGeoArray{DT,IT}) where {DT<:Real,IT<:Integer}
Get the union of multiple SparseGeoArray
s. The union is a new SparseGeoArray
that contains all data from the input arrays and fills the union extent defined by the input arrays. If a grid cell is present in multiple arrays, the value from the first array is used.
Arguments
sgaArray::Array{SparseGeoArray{DT,IT}}
: An array ofSparseGeoArray
s to be united.
Returns
- A new
SparseGeoArray{DT,IT}
that represents the union of the inputSparseGeoArray
's.
Example
`$julia union = sga_union([sga1, sga2, sga3])$
Main.DIVACoast.sga_intersect
— Function sga_intersect(sgaArray::Array{SparseGeoArray{DT,IT}})::Array{SparseGeoArray{DT,IT}} where {DT<:Real,IT<:Integer}
Get the intersection of multiple SparseGeoArray
s. The intersection is a new SparseGeoArray
that contains only the data that is present in all input arrays, and fills the intersection extent defined by the input arrays. If a grid cell is present in multiple arrays, the value from the first array is used.
Arguments
sgaArray::Array{SparseGeoArray{DT,IT}}
: An array ofSparseGeoArray
s to be intersected.
Returns
- An array of
SparseGeoArray{DT,IT}
that represents the intersection of the inputSparseGeoArray
s.
Example
julia intersection = sga_intersect([sga1, sga2, sga3])
`
Main.DIVACoast.sga_summarize_within
— Function sga_summarize_within(sga::SparseGeoArray{DT,IT}, p::Tuple{Real,Real}, radius::Real, sumryFunction::Function, valueTransformation) where {DT<:Real,IT<:Integer}
Summarize pixel values withing a defined radius around a point p
in a SparseGeoArray
. The function applies a summary function to the values within the radius, optionally transforming the values using a provided transformation function.
Arguments
sga::SparseGeoArray{DT,IT}
: The SparseGeoArray containing the data to be summarized.p::Tuple{Real,Real}
: The point around which to summarize the data, given as a tuple of longitude and latitude.radius::Real
: The radius in kilometers within which to summarize the data.sumryFunction::Function
: The function to apply to the values within the radius. It should take an array of values and return a single summary value.valueTransformation
: A function that transforms the values before applying the summary function. It should take the SparseGeoArray, and the x and y indices of the value to be transformed.
Returns
- A single summary value calculated from the values within the specified radius around point
p
.
There are multiple options to access the data based on an index or on a coordinate.
Indexing SparseGeoArrays
Base.getindex
— Functiongetindex(sga::SparseGeoArray, i::AbstractRange, j::AbstractRange, k::Union{Colon,AbstractRange,Integer})
Index a SparseGeoArray with AbstractRange
s to get a cropped SparseGeoArray with the correct AffineMap
set.
Examples
julia> sga[2:3,2:3]
2x2x1 Array{Float64, 3} with AffineMap([1.0 0.0; 0.0 1.0], [1.0, 1.0]) and undefined CRS
Main.DIVACoast.coords
— Functioncoords(sga::SparseGeoArray, p::SVector{2,<:Integer}, strategy::AbstractStrategy=Center())
coords(sga::SparseGeoArray, p::Tuple{<:Integer,<:Integer}, strategy::AbstractStrategy=Center())
coords(sga::SparseGeoArray, p::CartesianIndex{2}, strategy::AbstractStrategy=Center())`
Retrieve geospatial coordinates of the cell represented by indices p
in a SparseGeoArray
. The coordinates are returned as a SVector{2}
. The strategy
parameter determines how the coordinates are calculated, e.g., whether they are centered or aligned with the upper left corner of the cell. See indices
for the inverse function.
Arguments
sga
: TheSparseGeoArray
to retrieve coordinates from.p
: The pixel indices as aSVector{2}
of integers, a tuple of integers, or aCartesianIndex{2}
.strategy
: An optional parameter that determines how the coordinates are calculated. Defaults toCenter()
, which returns the center of the cell. Other strategies includeUpperLeft()
,UpperRight()
,LowerLeft()
, andLowerRight()
.
Returns
- Returns a
SVector{2}
containing the geospatial coordinates of the cell represented by the indicesp
.
Example
Main.DIVACoast.indices
— Functionindices(sga::SparseGeoArray, p::SVector{2,<:Real})
Retrieve indices of the cell represented by geospatial coordinates p
. See coords
for the inverse function.
Arguments
sga
: TheSparseGeoArray
to retrieve indices from.p
: The geospatial coordinates as aSVector{2}
of real numbers, a tuple of real numbers, or anAbstractVector{<:Real}
.
Returns
- Returns a
Tuple{IT, IT}
containing the indices of the cell represented by the geospatial coordinatesp
.
Example
indices(sga, (12.34, 56.78))
Neighbourhood and Distance Functions
Main.DIVACoast.nh4
— Functionnh4(sga :: SparseGeoArray{DT, IT}, x :: Integer, y :: Integer) :: Array{Tuple{IT,IT}}
nh4(sga :: SparseGeoArray{DT, IT}, p :: Tuple{Integer,Integer}) :: Array{Tuple{IT,IT}}
Compute the 4-Neighbourhood of the grid cell x
,y
in the SparseGeoArray sga
and return as Array of pairs. Takes into account the boundaries of the SparseGeoArray.
Examples
julia> nh4(sga, 1, 1)
2-element Vector{Tuple{Int32, Int32}}:
(2, 1)
(1, 2)
julia> nh4(sga, 2, 4)
4-element Vector{Tuple{Int32, Int32}}:
(1, 4)
(3, 4)
(2, 3)
(2, 5)
Main.DIVACoast.nh8
— Functionsame as nh4 but accounting for al 8 neighbours of a pixel with index (x,y)
Main.DIVACoast.distance
— Functiondistance(lon1 :: R, lat1 :: R, lon2 :: R, lat2 :: R) :: R where {R <: Real}
distance(p1 :: SVector{2,R}, p2 :: SVector{2,R}) :: R where {R <: Real}
distance(p1 :: AbstractVector{R}, p2 :: AbstractVector{R}) :: R where {R <: Real}
distance(p1 :: Tuple{R,R}, p2 :: Tuple{R,R}) where {R <: Real} :: R where {R <: Real}
Compute the distance (in km) between two points given by lon1,lat1 and lon2,lat2 resp. p1 and p2. Uses the Haversine formula.
Examples
julia> ...
distance(hspf::HypsometricProfile, e::Real)
Compute the distance of elevation e (given in m) from the coastline in hspf. disatnce is returned in km.
Main.DIVACoast.go_direction
— Functiongo_direction(lon :: R, lat :: R, distance :: Real, direction :: AbstractDirection) :: Tuple{R,R} where {R <: Real}
Compute the geographical coordinates of the point reached if we go distance km from (lon,lat) in direction. Takes into account circularity, but does not cross poles. direction can be East(), North(), West(), South()
Examples
julia> go_direction(13.2240, 52.3057, 10, East())
(13.370916039175427, 52.3057)
julia> go_direction(13.2240, 52.3057, 10000, North())
(13.224, 90.0)
julia> go_direction(19.0045,0.0,40075,West())
(19.004500000000007, 0.0)
Spatial Extent
Main.DIVACoast.bounding_boxes
— Functionbounding_boxes(sga :: SparseGeoArray{DT, IT}, lon_east :: Real, lon_west :: Real, lat_south :: Real, lat_north :: Real) where {DT <: Real, IT <: Integer}
Compute the bounding box(es) for the sparse geoarray sga and an area from loneast to lonwest and latsouth and latnorth.
Examples
julia> ...
Main.DIVACoast.area
— Functionarea(sga::SparseGeoArray, i::I, j::I) where {I <: Integer}
area(sga :: SparseGeoArray, p::Tuple{<:Integer,<:Integer}) = area(sga, p[1], p[2])
Calculate the area of the cell at pixel indices (i, j)
in a SparseGeoArray
. The area is calculated using the Haversine formula, which accounts for the curvature of the Earth.
Arguments
sga
: TheSparseGeoArray
containing the geospatial data.i
: The pixel index in the x-direction.j
: The pixel index in the y-direction.
Returns
- Returns the area of the cell.
Example
area(sga, 10, 20)
Main.DIVACoast.get_extent
— Function get_extent(sga::SparseGeoArray{DT,IT}) where {DT<:Real,IT<:Integer}
Get the extent of a SparseGeoArray as coordinates of its corners.
Arguments
sga::SparseGeoArray{DT,IT}
: The SparseGeoArray for which the extent is to be calculated.
Returns
- A named tuple with keys
uppL
,uppR
,lwrL
, andlwrR
representing the upper left, upper right, lower left, and lower right corners of the extent.
Example
sga = SparseGeoArray(...) # Assume sga is defined
extent = get_extent(sga)
Coordinate Reference System (CRS)
Main.DIVACoast.epsg2wkt
— Function epsg2wkt(epsgcode::Int)
Convert an EPSG code to WKT (WellKnownText) format.
Arguments
epsgcode::Int
: The EPSG code to convert.
Returns
- A string containing the WKT representation of the EPSG code.
Example
wkt = epsg2wkt(4326)
println(wkt)
Main.DIVACoast.proj2wkt
— Function proj2wkt(projstring::AbstractString)
Convert a PROJ string to WKT (WellKnownText) format.
Arguments
projstring::AbstractString
: The PROJ string to convert.
Returns
- A string containing the WKT representation of the PROJ string.
Example
wkt = proj2wkt("+proj=longlat +datum=WGS84 +no_defs")
println(wkt)
Main.DIVACoast.str2wkt
— Function str2wkt(crs_string::AbstractString)
Convert a CRS string (PROJ, EPSG, or WKT) to WKT format.
Arguments
crs_string::AbstractString
: The CRS string to convert, which can be in PROJ, EPSG, or WKT format.
Returns
- A string containing the WKT representation of the CRS string.
Example
wkt = str2wkt("+proj=longlat +datum=WGS84 +no_defs")
println(wkt)
wkt = str2wkt("EPSG:4326")
println(wkt)
Main.DIVACoast.epsg!
— Function epsg!(sga::SparseGeoArray, epsgcode::Int)
epsg!(sga::SparseGeoArray, epsgstring::AbstractString)
Set the EPSG code for a SparseGeoArray.
Arguments
sga::SparseGeoArray
: The SparseGeoArray to set the EPSG code for.epsgcode::Int
: The EPSG code to set.epsgstring::AbstractString
: The EPSG code as a string to set.
Returns
- The updated SparseGeoArray with the new EPSG code set.
Example
sga = SparseGeoArray(...)
epsg!(sga, 4326) # Set EPSG code to 4326
epsg!(sga, "EPSG:4326") # Set EPSG code using string
Main.DIVACoast.is_rotated
— FunctionCheck wether the AffineMap of a GeoArray contains rotations.
Main.DIVACoast.bbox!
— FunctionSet geotransform of SparseGeoArray
by specifying a bounding box. Note that this only can result in a non-rotated or skewed GeoArray
.
GeoTIFF
As described above SparseGeoArray
is the structure used by DIVACoast.jl
to handle GeoData. In addition, DIVACoast.jl
provides several functions to read, modify and export files in the common GeoData Format .geotiff
.
Operations
The following operations process (multiple) GeoTIFF rasters using a user-defined function, enabling custom analysis or aggregation across several input files.
Function | Inputs | Output | Operation |
---|---|---|---|
geotiff_connect | 2 rasters, function | 1 raster | Combine two rasters pixel-wise using a function |
geotiff_transform | 1 raster, function | 1 raster | Transform one raster pixel-wise using a function |
geotiff_collect | mask, rasters, func | (custom) | Collect values from multiple rasters and mask, process with a function |
Main.DIVACoast.geotiff_connect
— Function geotiff_connect(infilename1::String, infilename2::String, outfilename::String, f::Function)
Takes two geofiffs and connects (combines) them pixelwise by a given function. The result is saved in a new geotiff file.
Arguments
infilename1::String
: The path to the first input geotiff file.infilename2::String
: The path to the second input geotiff file.outfilename::String
: The path to the output geotiff file.f::Function
: A function that will be applied pixelwise. Function definition must match the signature: <br>f(value1::Float32, value2::Float32, sga1::SparseGeoArray, sga2::SparseGeoArray)
wherevalue1
andvalue2
are the pixel values from the two input rasters, andsga1
andsga2
are the corresponding SparseGeoArray objects. <br>Note: The SparseGeoArrays will be created from the input files
Example
path1 = "path/to/input1.tif"
path2 = "path/to/input2.tif"
path_out = "path/to/output.tif"
takemaximum = (x, y, sga1, sga2) -> x > y ? x : y # Takes the maximum of both files
geotiff_connect(path1, path2, path_out, takemaximum)
Main.DIVACoast.geotiff_transform
— Function function geotiff_transform(infilename1::String, outfilename::String, f::Function)
Applies a function pixelwise to a geotiff file and saves the result in a new geortiff file.
Arguments
infilename1::String
: The path to the input geotiff file.outfilename::String
: The path to the output geotiff file.f::Function
: A function that will be applied pixelwise. Function definition must match the signature: <br>f(value1::Float32, sga::SparseGeoArray, r::Int, i::Int)
wherevalue1
is the pixel value from the input raster, andsga
is the corresponding SparseGeoArray object.r
represents the row number andi
represents the column number of the pixel in the raster. <br>Note: The SparseGeoArray will be created from the input file
Example
input_path = "path/to/input.tif"
output_path = "path/to/output.tif"
square = (x, sga, r, i) -> x * x # Squares the pixel value
geotiff_transform(input_path, output_path, square)
Main.DIVACoast.geotiff_collect
— Function function geotiff_collect(maskfilename::String, infilenames::Array{String}, f::Function)
Applies a function pixelwise to one ore multiple geotiff's (infilenames
) and saves the result in a new geotiff file. The function is applied only to the pixels that are not masked by the mask geotiff (maskfilename
). The result is saved in a new geotiff file.
Arguments
maskfilename::String
: The path to the mask geotiff file.infilenames::Array{String}
: An array of paths to the input geotiff files.f::Function
: A function that will be applied pixelwise. Function definition must match the signature: <br>f(mask_value::Float32, values::Array{Float32}, sga_mask::SparseGeoArray, sga_ins::Array{SparseGeoArray})
wheremask_value
is the pixel value from the mask raster,values
is an array of pixel values from the input rasters, andsga_mask
andsga_ins
are the corresponding SparseGeoArray objects. <br>Note: The SparseGeoArrays will be created from the input files
Point Data
GeoData is often provided by coordinates. In most cases you can transform the provided data into a DataFrame
object with a coordinate or langitude and latitude column. DIVACoast.jl
provides a structure called Neighbour
to perform Nearest Neighbour searches on DataFrames
in a convinient and efficient way.
Main.DIVACoast.Neighbour
— Typefunction Neighbour(df::DataFrame, dtype::Type; lonlatCols::Tuple{Union{String, Symbol}, Union{String, Symbol}} = (:lon, :lat), dropna::Bool = true)
The Neighbour structure is a efficient way to access a DataFrame
by coordinates. It therfore transforms the longitude and latitude columns lonlatCols
of the input DataFrame to a Matrix in wide-format and creates a BallTree object. The BallTree object is then used to search for nearest neighbours efficiently.
Arguments
- df: The input DataFrame containing the longitude and latitude column.
- dtype: The datatype the coordinates will be parsed in.
- lonlatCols (optional): The longitude and latitude columns (default = (:lon, :lat))
- dropna (optional): Whether na's should be kept or not (default = true).
Example
using DataFrames
using CSV
observations = CSV.File("observations.csv") |> DataFrame # read observations.csv as DataFrame
nObservations = Neighbour(observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true) # create Neighbour object
nearest(nObservations, (37.1100, -12.2877)) # search for nearest neighbour of coordinate (37.1100, -12.2877)) # search for nearest neighbour of coordinate (37.1100, -12.2877) in DataFrame
Nearest Neighbour search can be performed using:
Main.DIVACoast.nearest
— Functionfunction nearest(n::Neighbour, coordinate::Tuple)
The nearest function performs a nearest neighbour search on the Neighbours
Object
Arguments
- n: The Neighbours Object to search trough.
- coordinate: A coordinate the nearest neighbour relates to.
Return
Returns a tuple (index, distance, info) containing the of the nearest neighbour in the Neighbour.tree
object, the distance to the nearest neighbour and the information of the nearest neighbour in the Neighbour.dataframe
object.
Mathches two DataFrames
by coordinates. The function uses the nearest
function to find the nearest neighbour in a Neighbour
object for each coordinate in the input DataFrame.
Arguments
- n: The Neighbours Object to search trough.
- df: DataFrame holding the coordinates, the nearest neighbour should relate to.
- dtype: DataType the coordinates will be parsed in.
- lonlatCols: Columns names (string / symbol) of the columns in the input dataframe holding the coordinates.
- dropna: Whether na's should be kept or not (default = true).
Return
Returns a DataFrame containing the index, distance and information of the nearest neighbour for each coordinate in the input DataFrame.
Example
using DataFrames
using CSV
last_observations = CSV.File("last_observations.csv") |> DataFrame # read last_observations.csv as DataFrame
new_observations = CSV.File("new_observations.csv") |> DataFrame # read new_observations.csv as DataFrame
nlastObservations = Neighbour(last_observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true)
observations = nearest(nlastObservations, new_observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true)
Main.DIVACoast.nearest_coord
— Functionnearest_coord(n::Neighbour, coordinate::Tuple)
The nearest_coord function performs a nearest neighbour search on the Neighbours
Object and returns the coordinates of the nearest neighbour.
Arguments
- n: The Neighbours Object to search trough.
- coordinate: A coordinate the nearest neighbour relates Tuple{Float64, Float64} (lon, lat) to.
Return
Returns a tuple (lonN, latN) containing the coordinates of the nearest neighbour in the Neighbour.tree
object.
Example
using DataFrames
using CSV
observations = CSV.File("observations.csv") |> DataFrame # read observations.csv as DataFrame
nObservations = Neighbour(observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true) # create Neighbour object
nearest = nearest_coord(nObservations, (37.1100, -12.2877)) # search for nearest neighbour coordinate of coordinate (37.1100, -12.2877)