GeoData utilities

SparseGeoArray (SGA)

Within DIVACoast.jl, Gridded geodata is stored and processed using the SparseGeoArray data structure. This structure retains the necessary geospatial meta-information for accurate referencing and stores the data in a memory-efficient sparse array format. By storing only non-empty grid cells, it is well-suited for coastal datasets, where large areas—such as open ocean or inland regions—often contain no or non relevant data for the analysis.

Main.DIVACoast.SparseGeoArrayType
SparseGeoArray{DT,IT}(filename::String, band::Integer=1) where {DT <: Real, IT <: Integer}

A SparseGeoArray combines spatial data with geospatial information for referencing. The data itself is stored in a efficient (sparse) format, so only the non-empty grid cells are stored. The geospatial information is stored in a CoordinateTransformations.AffineMap object, which is used to transform between pixel coordinates and geospatial coordinates.

The SparseGeoArray is a mutable struct that contains the following attributes:

  • data: a dictionary that stores the data values for each pixel in the grid. The keys are tuples of pixel coordinates (x,y), and the values are the data values.

  • nodatavalue: a value that represents missing or invalid data in the grid. This value is used to identify empty cells in the grid.

  • f: a CoordinateTransformations.AffineMap object that defines the transformation between pixel coordinates and geospatial coordinates.

  • crs: a GeoFormatTypes.WellKnownText object that defines the coordinate reference system (CRS) for the data. This is used to interpret the geospatial coordinates correctly.

  • metadata: a dictionary that stores additional metadata about the data, such as the source of the data or the date it was collected.

  • xsize: the number of pixels in the x-direction (width) of the grid.

  • ysize: the number of pixels in the y-direction (height) of the grid.

  • projref: a string that represents the projection reference for the data.

  • filename: a string that represents the filename of the data file.

SparseGeoArrays - Construct

SparseGeoArrays can be constructing by multiple ways:

From GeoTIFF

Main.DIVACoast.SparseGeoArray{Float32, Int32}("path/to/file.tif")

Construct (empty)
Note: If you init an empty SparseGeoArray you need to specify the geospatial-metadata yourself to perform geo-Operations.

Main.DIVACoast.SparseGeoArray{Float32, Int32}()
sga[(x, y)] = value # Fill the SparseGeoArray

Construct (empty) with meta-information from another SparseGeoArray

Main.DIVACoast.emptySGAfromSGAFunction
    emptySGAfromSGA(orgSGA::SparseGeoArray{DT,IT}, extentNew) where {DT<:Real,IT<:Integer}

Creates an empty SparseGeoArray from an existing SGA (orgSGA) with same metadta but with a new extent defined by extentNew and empty data.

Arguments

  • orgSGA::SparseGeoArray{DT,IT}: The original SparseGeoArray from which metadata is copied.
  • extentNew: The new extent for the SparseGeoArray, defines as a named tuple with keys uppL, uppR, lwrL, and lwrR representing the upper left, upper right, lower left, and lower right corners of the new extent.

Returns

  • A new SparseGeoArray{DT,IT}

Example

emptySGAfromSGA(orgSGA, (uppL=(0.0, 10.0), uppR=(10.0, 10.0), lwrL=(0.0, 0.0), lwrR=(10.0, 0.0)))

Manually construct a SparseGeoArray
If you have your data as a dictionary and know the georeferencing, you can construct it directly:

sga = Main.DIVACoast.SparseGeoArray(
    data,          # Dict{Tuple{Int,Int}, Float64}
    nodatavalue,   # e.g., -9999.0
    affine_map,    # CoordinateTransformations.AffineMap
    crs,           # GeoFormatTypes.WellKnownText
    metadata,      # Dict{String,Any}
    xsize, ysize,  # grid dimensions
    projref,       # projection string
    circular,      # Bool
    filename       # String
)

SparseGeoArray - Operations

A set of core spatial operations is provided for SparseGeoArray objects, enabling manipulation and analysis of gridded exposure data. These functions allow you to combine, compare, and summarize spatial datasets efficiently:

Main.DIVACoast.sga_unionFunction
    sga_union(sgaArray::Array{SparseGeoArray{DT,IT}}) where {DT<:Real,IT<:Integer}
    sga_union(sga1::SparseGeoArray{DT,IT}, sga2::SparseGeoArray{DT,IT}) where {DT<:Real,IT<:Integer}

Get the union of multiple SparseGeoArrays. The union is a new SparseGeoArray that contains all data from the input arrays and fills the union extent defined by the input arrays. If a grid cell is present in multiple arrays, the value from the first array is used.

Arguments

  • sgaArray::Array{SparseGeoArray{DT,IT}}: An array of SparseGeoArrays to be united.

Returns

  • A new SparseGeoArray{DT,IT} that represents the union of the input SparseGeoArray's.

Example

`$julia union = sga_union([sga1, sga2, sga3])$

Main.DIVACoast.sga_intersectFunction
    sga_intersect(sgaArray::Array{SparseGeoArray{DT,IT}})::Array{SparseGeoArray{DT,IT}} where {DT<:Real,IT<:Integer}

Get the intersection of multiple SparseGeoArrays. The intersection is a new SparseGeoArray that contains only the data that is present in all input arrays, and fills the intersection extent defined by the input arrays. If a grid cell is present in multiple arrays, the value from the first array is used.

Arguments

  • sgaArray::Array{SparseGeoArray{DT,IT}}: An array of SparseGeoArrays to be intersected.

Returns

  • An array of SparseGeoArray{DT,IT} that represents the intersection of the input SparseGeoArrays.

Example

julia intersection = sga_intersect([sga1, sga2, sga3])`

Main.DIVACoast.sga_summarize_withinFunction
    sga_summarize_within(sga::SparseGeoArray{DT,IT}, p::Tuple{Real,Real}, radius::Real, sumryFunction::Function, valueTransformation) where {DT<:Real,IT<:Integer}

Summarize pixel values withing a defined radius around a point p in a SparseGeoArray. The function applies a summary function to the values within the radius, optionally transforming the values using a provided transformation function.

Arguments

  • sga::SparseGeoArray{DT,IT}: The SparseGeoArray containing the data to be summarized.
  • p::Tuple{Real,Real}: The point around which to summarize the data, given as a tuple of longitude and latitude.
  • radius::Real: The radius in kilometers within which to summarize the data.
  • sumryFunction::Function: The function to apply to the values within the radius. It should take an array of values and return a single summary value.
  • valueTransformation: A function that transforms the values before applying the summary function. It should take the SparseGeoArray, and the x and y indices of the value to be transformed.

Returns

  • A single summary value calculated from the values within the specified radius around point p.

There are multiple options to access the data based on an index or on a coordinate.

Indexing SparseGeoArrays

Base.getindexFunction
getindex(sga::SparseGeoArray, i::AbstractRange, j::AbstractRange, k::Union{Colon,AbstractRange,Integer})

Index a SparseGeoArray with AbstractRanges to get a cropped SparseGeoArray with the correct AffineMap set.

Examples

julia> sga[2:3,2:3]
2x2x1 Array{Float64, 3} with AffineMap([1.0 0.0; 0.0 1.0], [1.0, 1.0]) and undefined CRS
Main.DIVACoast.coordsFunction
coords(sga::SparseGeoArray, p::SVector{2,<:Integer}, strategy::AbstractStrategy=Center())
coords(sga::SparseGeoArray, p::Tuple{<:Integer,<:Integer}, strategy::AbstractStrategy=Center())
coords(sga::SparseGeoArray, p::CartesianIndex{2}, strategy::AbstractStrategy=Center())`

Retrieve geospatial coordinates of the cell represented by indices p in a SparseGeoArray. The coordinates are returned as a SVector{2}. The strategy parameter determines how the coordinates are calculated, e.g., whether they are centered or aligned with the upper left corner of the cell. See indices for the inverse function.

Arguments

  • sga: The SparseGeoArray to retrieve coordinates from.
  • p: The pixel indices as a SVector{2} of integers, a tuple of integers, or a CartesianIndex{2}.
  • strategy: An optional parameter that determines how the coordinates are calculated. Defaults to Center(), which returns the center of the cell. Other strategies include UpperLeft(), UpperRight(), LowerLeft(), and LowerRight().

Returns

  • Returns a SVector{2} containing the geospatial coordinates of the cell represented by the indices p.

Example

Main.DIVACoast.indicesFunction
indices(sga::SparseGeoArray, p::SVector{2,<:Real})

Retrieve indices of the cell represented by geospatial coordinates p. See coords for the inverse function.

Arguments

  • sga: The SparseGeoArray to retrieve indices from.
  • p: The geospatial coordinates as a SVector{2} of real numbers, a tuple of real numbers, or an AbstractVector{<:Real}.

Returns

  • Returns a Tuple{IT, IT} containing the indices of the cell represented by the geospatial coordinates p.

Example

indices(sga, (12.34, 56.78))

Neighbourhood and Distance Functions

Main.DIVACoast.nh4Function
nh4(sga :: SparseGeoArray{DT, IT}, x :: Integer, y :: Integer) :: Array{Tuple{IT,IT}}
nh4(sga :: SparseGeoArray{DT, IT}, p :: Tuple{Integer,Integer}) :: Array{Tuple{IT,IT}}

Compute the 4-Neighbourhood of the grid cell x,y in the SparseGeoArray sga and return as Array of pairs. Takes into account the boundaries of the SparseGeoArray.

Examples

julia> nh4(sga, 1, 1)
2-element Vector{Tuple{Int32, Int32}}:
 (2, 1)
 (1, 2)

julia> nh4(sga, 2, 4)
4-element Vector{Tuple{Int32, Int32}}:
 (1, 4)
 (3, 4)
 (2, 3)
 (2, 5)
Main.DIVACoast.nh8Function

same as nh4 but accounting for al 8 neighbours of a pixel with index (x,y)

Main.DIVACoast.distanceFunction
distance(lon1 :: R, lat1 :: R, lon2 :: R, lat2 :: R) :: R where {R <: Real}
distance(p1 :: SVector{2,R}, p2 :: SVector{2,R}) :: R where {R <: Real}
distance(p1 :: AbstractVector{R}, p2 :: AbstractVector{R}) :: R where {R <: Real} 
distance(p1 :: Tuple{R,R}, p2 :: Tuple{R,R}) where {R <: Real} :: R where {R <: Real}

Compute the distance (in km) between two points given by lon1,lat1 and lon2,lat2 resp. p1 and p2. Uses the Haversine formula.

Examples

julia> ...

distance(hspf::HypsometricProfile, e::Real)

Compute the distance of elevation e (given in m) from the coastline in hspf. disatnce is returned in km.

Main.DIVACoast.go_directionFunction
go_direction(lon :: R, lat :: R, distance :: Real, direction :: AbstractDirection) :: Tuple{R,R} where {R <: Real}

Compute the geographical coordinates of the point reached if we go distance km from (lon,lat) in direction. Takes into account circularity, but does not cross poles. direction can be East(), North(), West(), South()

Examples

julia> go_direction(13.2240, 52.3057, 10, East())
(13.370916039175427, 52.3057)
julia> go_direction(13.2240, 52.3057, 10000, North())
(13.224, 90.0)
julia> go_direction(19.0045,0.0,40075,West())
(19.004500000000007, 0.0)

Spatial Extent

Main.DIVACoast.bounding_boxesFunction
bounding_boxes(sga :: SparseGeoArray{DT, IT}, lon_east :: Real, lon_west :: Real, lat_south :: Real, lat_north :: Real) where {DT <: Real, IT <: Integer}

Compute the bounding box(es) for the sparse geoarray sga and an area from loneast to lonwest and latsouth and latnorth.

Examples

julia> ...
Main.DIVACoast.areaFunction
area(sga::SparseGeoArray, i::I, j::I) where {I <: Integer}
area(sga :: SparseGeoArray, p::Tuple{<:Integer,<:Integer}) = area(sga, p[1], p[2])

Calculate the area of the cell at pixel indices (i, j) in a SparseGeoArray. The area is calculated using the Haversine formula, which accounts for the curvature of the Earth.

Arguments

  • sga: The SparseGeoArray containing the geospatial data.
  • i: The pixel index in the x-direction.
  • j: The pixel index in the y-direction.

Returns

  • Returns the area of the cell.

Example

area(sga, 10, 20)
Main.DIVACoast.get_extentFunction
    get_extent(sga::SparseGeoArray{DT,IT}) where {DT<:Real,IT<:Integer}

Get the extent of a SparseGeoArray as coordinates of its corners.

Arguments

  • sga::SparseGeoArray{DT,IT}: The SparseGeoArray for which the extent is to be calculated.

Returns

  • A named tuple with keys uppL, uppR, lwrL, and lwrR representing the upper left, upper right, lower left, and lower right corners of the extent.

Example

sga = SparseGeoArray(...)  # Assume sga is defined
extent = get_extent(sga)

Coordinate Reference System (CRS)

Main.DIVACoast.epsg2wktFunction
    epsg2wkt(epsgcode::Int)

Convert an EPSG code to WKT (WellKnownText) format.

Arguments

  • epsgcode::Int: The EPSG code to convert.

Returns

  • A string containing the WKT representation of the EPSG code.

Example

wkt = epsg2wkt(4326)
println(wkt)
Main.DIVACoast.proj2wktFunction
    proj2wkt(projstring::AbstractString)

Convert a PROJ string to WKT (WellKnownText) format.

Arguments

  • projstring::AbstractString: The PROJ string to convert.

Returns

  • A string containing the WKT representation of the PROJ string.

Example

wkt = proj2wkt("+proj=longlat +datum=WGS84 +no_defs")
println(wkt)        
Main.DIVACoast.str2wktFunction
    str2wkt(crs_string::AbstractString)

Convert a CRS string (PROJ, EPSG, or WKT) to WKT format.

Arguments

  • crs_string::AbstractString: The CRS string to convert, which can be in PROJ, EPSG, or WKT format.

Returns

  • A string containing the WKT representation of the CRS string.

Example

wkt = str2wkt("+proj=longlat +datum=WGS84 +no_defs")
println(wkt)
wkt = str2wkt("EPSG:4326")
println(wkt)
Main.DIVACoast.epsg!Function
    epsg!(sga::SparseGeoArray, epsgcode::Int)
    epsg!(sga::SparseGeoArray, epsgstring::AbstractString)

Set the EPSG code for a SparseGeoArray.

Arguments

  • sga::SparseGeoArray: The SparseGeoArray to set the EPSG code for.
  • epsgcode::Int: The EPSG code to set.
  • epsgstring::AbstractString: The EPSG code as a string to set.

Returns

  • The updated SparseGeoArray with the new EPSG code set.

Example

sga = SparseGeoArray(...)
epsg!(sga, 4326)  # Set EPSG code to 4326
epsg!(sga, "EPSG:4326")  # Set EPSG code using string
Main.DIVACoast.bbox!Function

Set geotransform of SparseGeoArray by specifying a bounding box. Note that this only can result in a non-rotated or skewed GeoArray.

GeoTIFF

As described above SparseGeoArray is the structure used by DIVACoast.jl to handle GeoData. In addition, DIVACoast.jl provides several functions to read, modify and export files in the common GeoData Format .geotiff.

Operations

The following operations process (multiple) GeoTIFF rasters using a user-defined function, enabling custom analysis or aggregation across several input files.

FunctionInputsOutputOperation
geotiff_connect2 rasters, function1 rasterCombine two rasters pixel-wise using a function
geotiff_transform1 raster, function1 rasterTransform one raster pixel-wise using a function
geotiff_collectmask, rasters, func(custom)Collect values from multiple rasters and mask, process with a function
Main.DIVACoast.geotiff_connectFunction
    geotiff_connect(infilename1::String, infilename2::String, outfilename::String, f::Function)

Takes two geofiffs and connects (combines) them pixelwise by a given function. The result is saved in a new geotiff file.

Arguments

  • infilename1::String : The path to the first input geotiff file.
  • infilename2::String : The path to the second input geotiff file.
  • outfilename::String : The path to the output geotiff file.
  • f::Function : A function that will be applied pixelwise. Function definition must match the signature: <br> f(value1::Float32, value2::Float32, sga1::SparseGeoArray, sga2::SparseGeoArray) where value1 and value2 are the pixel values from the two input rasters, and sga1 and sga2 are the corresponding SparseGeoArray objects. <br>Note: The SparseGeoArrays will be created from the input files

Example

path1 = "path/to/input1.tif"
path2 = "path/to/input2.tif"
path_out = "path/to/output.tif"

takemaximum = (x, y, sga1, sga2) -> x > y ? x : y  # Takes the maximum of both files
geotiff_connect(path1, path2, path_out, takemaximum) 
Main.DIVACoast.geotiff_transformFunction
    function geotiff_transform(infilename1::String, outfilename::String, f::Function)

Applies a function pixelwise to a geotiff file and saves the result in a new geortiff file.

Arguments

  • infilename1::String : The path to the input geotiff file.
  • outfilename::String : The path to the output geotiff file.
  • f::Function : A function that will be applied pixelwise. Function definition must match the signature: <br> f(value1::Float32, sga::SparseGeoArray, r::Int, i::Int) where value1 is the pixel value from the input raster, and sga is the corresponding SparseGeoArray object. r represents the row number and i represents the column number of the pixel in the raster. <br>Note: The SparseGeoArray will be created from the input file

Example

input_path = "path/to/input.tif"
output_path = "path/to/output.tif"
square = (x, sga, r, i) -> x * x # Squares the pixel value
geotiff_transform(input_path, output_path, square)
Main.DIVACoast.geotiff_collectFunction
    function geotiff_collect(maskfilename::String, infilenames::Array{String}, f::Function)

Applies a function pixelwise to one ore multiple geotiff's (infilenames) and saves the result in a new geotiff file. The function is applied only to the pixels that are not masked by the mask geotiff (maskfilename). The result is saved in a new geotiff file.

Arguments

  • maskfilename::String : The path to the mask geotiff file.
  • infilenames::Array{String} : An array of paths to the input geotiff files.
  • f::Function : A function that will be applied pixelwise. Function definition must match the signature: <br> f(mask_value::Float32, values::Array{Float32}, sga_mask::SparseGeoArray, sga_ins::Array{SparseGeoArray}) where mask_value is the pixel value from the mask raster, values is an array of pixel values from the input rasters, and sga_mask and sga_ins are the corresponding SparseGeoArray objects. <br>Note: The SparseGeoArrays will be created from the input files

Point Data

GeoData is often provided by coordinates. In most cases you can transform the provided data into a DataFrame object with a coordinate or langitude and latitude column. DIVACoast.jl provides a structure called Neighbour to perform Nearest Neighbour searches on DataFrames in a convinient and efficient way.

Main.DIVACoast.NeighbourType
function Neighbour(df::DataFrame, dtype::Type; lonlatCols::Tuple{Union{String, Symbol}, Union{String, Symbol}} = (:lon, :lat), dropna::Bool = true)

The Neighbour structure is a efficient way to access a DataFrame by coordinates. It therfore transforms the longitude and latitude columns lonlatCols of the input DataFrame to a Matrix in wide-format and creates a BallTree object. The BallTree object is then used to search for nearest neighbours efficiently.

Arguments

  • df: The input DataFrame containing the longitude and latitude column.
  • dtype: The datatype the coordinates will be parsed in.
  • lonlatCols (optional): The longitude and latitude columns (default = (:lon, :lat))
  • dropna (optional): Whether na's should be kept or not (default = true).

Example

using DataFrames
using CSV

observations = CSV.File("observations.csv") |> DataFrame # read observations.csv as DataFrame
nObservations = Neighbour(observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true) # create Neighbour object
nearest(nObservations, (37.1100, -12.2877)) # search for nearest neighbour of coordinate (37.1100, -12.2877)) # search for nearest neighbour of coordinate (37.1100, -12.2877) in DataFrame

Nearest Neighbour search can be performed using:

Main.DIVACoast.nearestFunction
function nearest(n::Neighbour, coordinate::Tuple)

The nearest function performs a nearest neighbour search on the Neighbours Object

Arguments

  • n: The Neighbours Object to search trough.
  • coordinate: A coordinate the nearest neighbour relates to.

Return

Returns a tuple (index, distance, info) containing the of the nearest neighbour in the Neighbour.tree object, the distance to the nearest neighbour and the information of the nearest neighbour in the Neighbour.dataframe object.

Mathches two DataFrames by coordinates. The function uses the nearest function to find the nearest neighbour in a Neighbour object for each coordinate in the input DataFrame.

Arguments

  • n: The Neighbours Object to search trough.
  • df: DataFrame holding the coordinates, the nearest neighbour should relate to.
  • dtype: DataType the coordinates will be parsed in.
  • lonlatCols: Columns names (string / symbol) of the columns in the input dataframe holding the coordinates.
  • dropna: Whether na's should be kept or not (default = true).

Return

Returns a DataFrame containing the index, distance and information of the nearest neighbour for each coordinate in the input DataFrame.

Example

using DataFrames
using CSV

last_observations = CSV.File("last_observations.csv") |> DataFrame # read last_observations.csv as DataFrame
new_observations = CSV.File("new_observations.csv") |> DataFrame # read new_observations.csv as DataFrame

nlastObservations = Neighbour(last_observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true)
observations = nearest(nlastObservations, new_observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true) 
Main.DIVACoast.nearest_coordFunction
nearest_coord(n::Neighbour, coordinate::Tuple)

The nearest_coord function performs a nearest neighbour search on the Neighbours Object and returns the coordinates of the nearest neighbour.

Arguments

  • n: The Neighbours Object to search trough.
  • coordinate: A coordinate the nearest neighbour relates Tuple{Float64, Float64} (lon, lat) to.

Return

Returns a tuple (lonN, latN) containing the coordinates of the nearest neighbour in the Neighbour.tree object.

Example

using DataFrames
using CSV

observations = CSV.File("observations.csv") |> DataFrame # read observations.csv as DataFrame
nObservations = Neighbour(observations, Float64, lonlatCols = (:longitude, :latitude), dropna = true) # create Neighbour object

nearest = nearest_coord(nObservations, (37.1100, -12.2877)) # search for nearest neighbour coordinate of coordinate (37.1100, -12.2877)