Geospatial distance

Contributor

Hello!

I've taken the city example from https://docs.exasol.com/sql_references/geospatialdata/import_geospatial_data_from_csv.htm

 

CREATE TABLE cities(name VARCHAR(200), geo GEOMETRY(4326));
INSERT INTO cities VALUES('Berlin', 'POINT (13.36963 52.52493)');
INSERT INTO cities VALUES('London', 'POINT (-0.1233 51.5309)');

-- this shows the distance in degrees:
SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b;
-- this shows the distance in meters:
SELECT a.name, b.name, st_distance(ST_Transform(a.geo, 2163),
     ST_Transform(b.geo, 2163)) FROM cities a, cities b;

 

The result, which shall be in meters is a bit surprising (30 km off😞

 

Berlin	London	969387.149096201

 

I've also tried similar calculation with my home address and my working place using different SRID (EPSG) and noticed strange behaviour if I try to convert using ST_TRANSFORM. Is this kind of bug or some kind of newbie mistake?

Another example - working place to downtown:

 

-- original encoding RLB -> U-Bahn Station Stephansplatz - should be around 700 m
SELECT ST_DISTANCE( ST_SETSRID('POINT(3261.81 341739.69)',31256), ST_SETSRID('POINT(2957.94 341097.68)',31256) ) AS loc_dist;

Result: 710.2913606401333

-- EPSG WGS 84 - Pseudo-Mercator
SELECT ST_DISTANCE( ST_TRANSFORM(ST_SETSRID('POINT(3261.81 341739.69)',31256), 3857), ST_TRANSFORM(ST_SETSRID('POINT(2957.94 341097.68)',31256), 3857) ) AS loc_dist;

Result: 1066.4943908612343

-- haversine distance
SELECT
    -- RLB
    16.3760247 AS LON1
    , 48.214047699999995 AS LAT1 
    -- Stephansplatz
    , 16.371931900000007 AS LON2
    , 48.2082753 AS LAT2
    , 6372800 AS earth_radius  --Earth radius in meters
    , RADIANS(LOCAL.LAT1) AS PHI1 -- LAT 1
    , RADIANS(LOCAL.LAT2) AS PHI2 -- LAT 2
    , RADIANS(LOCAL.LAT2 - LOCAL.LAT1) AS DPHI
    , RADIANS(ABS(LOCAL.LON2 - LOCAL.LON1)) AS DLAMBDA
    , POWER(SIN(LOCAL.DPHI/2),2) + COS(LOCAL.PHI1)*COS(LOCAL.PHI2)*POWER(SIN(LOCAL.DLAMBDA/2),2) AS a -- a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    , 2 * LOCAL.earth_radius * ATAN2(SQRT(LOCAL.a), SQRT(1-LOCAL.a)) AS loc_dist; -- return 2*R*math.atan2(math.sqrt(a), math.sqrt(1 - a))

 

Product:

 

SELECT * FROM EXA_METADATA
WHERE PARAM_NAME = 'databaseProductVersion';

PARAM_NAME	PARAM_VALUE	IS_STATIC
databaseProductVersion	6.2.5	true

 

KR, comments highly appreciated!

Michael

1 ACCEPTED SOLUTION

Team Exasol
Team Exasol

You are correct. Spatial functionality is all based on the OGC Simple Features Access Standard which can be found here: https://www.ogc.org/standards/sfs

Thus, for geometry types ST_DISTANCE returns the minimum 2D Cartesian (planar) distance between two geometries, in projected units (spatial ref units). We do not implement any extensions such as a GEOGRAPHY datatype, yet, that would take the spheroid into account.

The best approach would be to implement the Haversine distance as UDF in case you need it or to select a spatial reference system of choice that is most accurate for the area of investigation.

SRID: 26986 for example is a reference system that is most accurate for the area of Massachusetts, similar systems exist for Britain,... which provide high accuracy for local regions.

 

 

 

________________________
Senior Sales Engineer @ Exasol

View solution in original post

3 REPLIES 3

Team Exasol
Team Exasol

Hello,

great that you tried out the geospatial features of Exasol. Given your examples I understand that you are irritated if you perform distance calculations and get different results depending on the spatial reference system used, however, this is quite normal and explainable. In the first example, e.g. if you calculate the distance between London and Berlin using a transformation to SRID 3857 you will get a different result: 1512755.859775003

SELECT a.name, b.name, st_distance(ST_Transform(a.geo, 3857), ST_Transform(b.geo, 3857)) 
FROM cities a, cities b;

Most data is commonly stored in spherical coordinates, e.g. coming from GPS systems in WGS 84 reference system (SRID 4326). Distance calculated in this system is returned in degrees on a sphere. This spherical distance calculation is the most accurate. 

However, oftentimes distance is needed in metrical units, this is where other spatial reference systems come into play, such as EPSG:3857 which is a Spherical Mercator projection used by Google Maps or OpenStreetMaps. Here the sphere of the earth is projected out  into a plane, which always leads to some distortion. SRID 3857 is for example very exact when it comes to distance calculations close to the equator and adds more distortion the closer one gets to the poles.

Other projections in turn have other drawbacks and benefits. The conversion from one reference system to the other is commonly performed by a library called Proj, the parameters of those projections are stored in EXA_SPATIAL_REF_SYS.

You can have a look into this system table and compare it to other systems, e.g. SPATIAL_REF_SYS in Postgis and you should see similar entries for a given SRID.

If you nevertheless find some unusual behaviour feel free to continue this thread.

 

 

________________________
Senior Sales Engineer @ Exasol

Contributor

Hello again!

Thanks for your reply, I've played around and I believe that ST_DISTANCE is using Euclidean distance - is that right or is that just some coincidence which I came accross using Excel.

CREATE TABLE cities(name VARCHAR(200), geo GEOMETRY(4326));
INSERT INTO cities VALUES('Berlin', 'POINT (13.411400 52.523403)'); -- Berlin  
INSERT INTO cities VALUES('London', 'POINT (-0.126236 51.500153)'); -- Westminster 
-- Euclidean???
SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b;
NAME	NAME	ST_DISTANCE(A.GEO,B.GEO)
London	London	0.0
London	Berlin	13.57625239272591
Berlin	London	13.57625239272591
Berlin	Berlin	0.0

Excel table trying to figure out what's going on, for comparison - I've been using this site: https://www.distance.to/London/Berlin#

 LONLAT
Berlin13,4114052,52340
London (City of Westminster)-0,1262451,50015
(q_i - p_i)^2183,267591,04704
R…Earth radius (m) 6372800 
   
in Radians  
Berlin0,2340730870,916706317
London (City of Westminster)-0,0022032340,898847235
Deltas-0,23628-0,01786
   
Euclidean13,57625239<= same as SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b;
   
Haversine  
a = SIN((LAT2-LAT1)/2)^2 + COS(LAT1) * COS(LAT2) * SIN((LON2 - LON1)/2)^20,005341397 
c = 2 * ATAN2(SQRT(a);SQRT(1-a))0,146300165Excel: ATAN2 flipped x and y!
d = R * c932341,6908 
Haversine kilometers932,3416908 
   
   

So, being a newbie - I'm totally confused if ST_DIFFERENCE would use Euclidean distance for geospatial data, which I'm pretty sure that I'm wrong. As you can see, I've already calculated the haversine distance which would meet my expectations, but I'd be interested if there is a nicer and almost as good as haversine distance - builtin function.

Thanks in advance.

Michael

Team Exasol
Team Exasol

You are correct. Spatial functionality is all based on the OGC Simple Features Access Standard which can be found here: https://www.ogc.org/standards/sfs

Thus, for geometry types ST_DISTANCE returns the minimum 2D Cartesian (planar) distance between two geometries, in projected units (spatial ref units). We do not implement any extensions such as a GEOGRAPHY datatype, yet, that would take the spheroid into account.

The best approach would be to implement the Haversine distance as UDF in case you need it or to select a spatial reference system of choice that is most accurate for the area of investigation.

SRID: 26986 for example is a reference system that is most accurate for the area of Massachusetts, similar systems exist for Britain,... which provide high accuracy for local regions.

 

 

 

________________________
Senior Sales Engineer @ Exasol

View solution in original post