High-performance Spatial Data Management and Analysis with DuckDB

Join us for the Big Geodata Talk on DuckDB Spatial!

DuckDB is a novel in-process SQL database system designed for analytical workloads. It has been making waves in the data science and engineering community, not only for its impressive performance but also for its focus on ease of use and integration with the wider data ecosystem. Key to making this possible is DuckDB's flexible extension system, which enables DuckDB to be used across different domains while the core system itself remains small and focused. One such extension is the DuckDB Spatial Extension, which brings geospatial data processing capabilities to DuckDB and allows users to perform complex spatial queries and transformations. It incorporates the trifecta of foundational open source GIS libraries (GDAL, GEOS and PROJ) alongside natively implemented geospatial algorithms, all neatly packaged into a single binary with no runtime dependencies. As a result, the spatial extension provides hundreds of familiar spatial SQL functions, as well as import and export capabilities for dozens of different vector file formats.

Just like DuckDB tries to default to the behavior of PostgreSQL, the spatial extension is heavily inspired by PostGIS and similarly follows the Simple Features SQL standard. However, while the Simple Features geometry model undoubtedly provides a great deal of flexibility with its hierarchy of subtypes (such as points, linestrings and multipolygons) and optional Z and M dimensions, it is not always the most efficient representation for modern high-performance processing. The spatial extension therefore implements a number of geospatial algorithms natively to make the most of DuckDB's vectorized execution engine and memory model. It also complements the GEOMETRY type that we all know and love with a new set of strongly typed spatial types backed by a columnar storage model, similar to what is being proposed in the GeoArrow project. This makes DuckDB's spatial extension an exciting project, as it stands with one foot firmly in the traditional open source GIS world and the other in the modern data science and engineering movement.
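
To give a rough feel for why a columnar, strongly typed layout can suit bulk processing better than a generic geometry blob, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not DuckDB's or GeoArrow's actual implementation: the names `point_to_wkb`, `wkb_to_point`, `PointColumn` and `bounding_box_wkb` are hypothetical, and the sample coordinates are made up.

```python
import struct
from array import array


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB: byte-order flag, type 1, x, y."""
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(blob):
    """Decode a little-endian WKB point back into an (x, y) tuple."""
    _order, _gtype, x, y = struct.unpack("<BIdd", blob)
    return x, y


def bounding_box_wkb(blobs):
    """Row-wise approach: every blob has to be decoded before it can be used."""
    points = [wkb_to_point(b) for b in blobs]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


class PointColumn:
    """Hypothetical columnar point storage: one contiguous array per coordinate."""

    def __init__(self):
        self.xs = array("d")
        self.ys = array("d")

    def append(self, x, y):
        self.xs.append(x)
        self.ys.append(y)

    def __len__(self):
        return len(self.xs)

    def bounding_box(self):
        # No per-row decoding: each coordinate array is scanned directly.
        return (min(self.xs), min(self.ys), max(self.xs), max(self.ys))


# Made-up sample coordinates (longitude, latitude).
sample = [(6.89, 52.22), (4.90, 52.37), (5.12, 52.09)]

blobs = [point_to_wkb(x, y) for x, y in sample]

column = PointColumn()
for x, y in sample:
    column.append(x, y)

print(bounding_box_wkb(blobs))
print(column.bounding_box())
```

Both layouts yield the same bounding box, but the columnar version knows up front that every row is a 2D point and can scan tightly packed coordinate arrays, which is the kind of access pattern a vectorized engine is built around.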

In this talk, we will introduce DuckDB and the DuckDB Spatial Extension, walk through some of the internals that make DuckDB special as well as some of the challenges and design decisions encountered when adapting it for geospatial processing. We will also showcase some of the main features the spatial extension brings to the table today and share some insights into the future of the project.
