25-09-2020 01:48 PM - edited 25-09-2020 02:05 PM
Hello!
I've taken the city example from https://docs.exasol.com/sql_references/geospatialdata/import_geospatial_data_from_csv.htm
CREATE TABLE cities(name VARCHAR(200), geo GEOMETRY(4326));
INSERT INTO cities VALUES('Berlin', 'POINT (13.36963 52.52493)');
INSERT INTO cities VALUES('London', 'POINT (-0.1233 51.5309)');
-- this shows the distance in degrees:
SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b;
-- this shows the distance in meters:
SELECT a.name, b.name, st_distance(ST_Transform(a.geo, 2163),
ST_Transform(b.geo, 2163)) FROM cities a, cities b;
The result, which shall be in meters is a bit surprising (30 km off😞
Berlin London 969387.149096201
I've also tried similar calculation with my home address and my working place using different SRID (EPSG) and noticed strange behaviour if I try to convert using ST_TRANSFORM. Is this kind of bug or some kind of newbie mistake?
Another example - working place to downtown:
-- original encoding RLB -> U-Bahn Station Stephansplatz - should be around 700 m
SELECT ST_DISTANCE( ST_SETSRID('POINT(3261.81 341739.69)',31256), ST_SETSRID('POINT(2957.94 341097.68)',31256) ) AS loc_dist;
Result: 710.2913606401333
-- EPSG WGS 84 - Pseudo-Mercator
SELECT ST_DISTANCE( ST_TRANSFORM(ST_SETSRID('POINT(3261.81 341739.69)',31256), 3857), ST_TRANSFORM(ST_SETSRID('POINT(2957.94 341097.68)',31256), 3857) ) AS loc_dist;
Result: 1066.4943908612343
-- haversine distance
SELECT
-- RLB
16.3760247 AS LON1
, 48.214047699999995 AS LAT1
-- Stephansplatz
, 16.371931900000007 AS LON2
, 48.2082753 AS LAT2
, 6372800 AS earth_radius --Earth radius in meters
, RADIANS(LOCAL.LAT1) AS PHI1 -- LAT 1
, RADIANS(LOCAL.LAT2) AS PHI2 -- LAT 2
, RADIANS(LOCAL.LAT2 - LOCAL.LAT1) AS DPHI
, RADIANS(ABS(LOCAL.LON2 - LOCAL.LON1)) AS DLAMBDA
, POWER(SIN(LOCAL.DPHI/2),2) + COS(LOCAL.PHI1)*COS(LOCAL.PHI2)*POWER(SIN(LOCAL.DLAMBDA/2),2) AS a -- a = math.sin(dphi/2)**2 + math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
, 2 * LOCAL.earth_radius * ATAN2(SQRT(LOCAL.a), SQRT(1-LOCAL.a)) AS loc_dist; -- return 2*R*math.atan2(math.sqrt(a), math.sqrt(1 - a))
Product:
SELECT * FROM EXA_METADATA
WHERE PARAM_NAME = 'databaseProductVersion';
PARAM_NAME PARAM_VALUE IS_STATIC
databaseProductVersion 6.2.5 true
KR, comments highly appreciated!
Michael
Solved! Go to Solution.
28-09-2020 05:50 PM
You are correct. Spatial functionality is all based on the OGC Simple Features Access Standard which can be found here: https://www.ogc.org/standards/sfs
Thus, for geometry types ST_DISTANCE returns the minimum 2D Cartesian (planar) distance between two geometries, in projected units (spatial ref units). We do not implement any extensions such as a GEOGRAPHY datatype, yet, that would take the spheroid into account.
The best approach would be to implement the Haversine distance as UDF in case you need it or to select a spatial reference system of choice that is most accurate for the area of investigation.
SRID: 26986 for example is a reference system that is most accurate for the area of Massachusetts, similar systems exist for Britain,... which provide high accuracy for local regions.
25-09-2020 03:31 PM
Hello,
great that you tried out the geospatial features of Exasol. Given your examples I understand that you are irritated if you perform distance calculations and get different results depending on the spatial reference system used, however, this is quite normal and explainable. In the first example, e.g. if you calculate the distance between London and Berlin using a transformation to SRID 3857 you will get a different result: 1512755.859775003
SELECT a.name, b.name, st_distance(ST_Transform(a.geo, 3857), ST_Transform(b.geo, 3857))
FROM cities a, cities b;
Most data is commonly stored in spherical coordinates, e.g. coming from GPS systems in WGS 84 reference system (SRID 4326). Distance calculated in this system is returned in degrees on a sphere. This spherical distance calculation is the most accurate.
However, oftentimes distance is needed in metrical units, this is where other spatial reference systems come into play, such as EPSG:3857 which is a Spherical Mercator projection used by Google Maps or OpenStreetMaps. Here the sphere of the earth is projected out into a plane, which always leads to some distortion. SRID 3857 is for example very exact when it comes to distance calculations close to the equator and adds more distortion the closer one gets to the poles.
Other projections in turn have other drawbacks and benefits. The conversion from one reference system to the other is commonly performed by a library called Proj, the parameters of those projections are stored in EXA_SPATIAL_REF_SYS.
You can have a look into this system table and compare it to other systems, e.g. SPATIAL_REF_SYS in Postgis and you should see similar entries for a given SRID.
If you nevertheless find some unusual behaviour feel free to continue this thread.
28-09-2020 01:35 PM
Hello again!
Thanks for your reply, I've played around and I believe that ST_DISTANCE is using Euclidean distance - is that right or is that just some coincidence which I came accross using Excel.
CREATE TABLE cities(name VARCHAR(200), geo GEOMETRY(4326));
INSERT INTO cities VALUES('Berlin', 'POINT (13.411400 52.523403)'); -- Berlin
INSERT INTO cities VALUES('London', 'POINT (-0.126236 51.500153)'); -- Westminster
-- Euclidean???
SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b;
NAME NAME ST_DISTANCE(A.GEO,B.GEO)
London London 0.0
London Berlin 13.57625239272591
Berlin London 13.57625239272591
Berlin Berlin 0.0
Excel table trying to figure out what's going on, for comparison - I've been using this site: https://www.distance.to/London/Berlin#
LON | LAT | |
Berlin | 13,41140 | 52,52340 |
London (City of Westminster) | -0,12624 | 51,50015 |
(q_i - p_i)^2 | 183,26759 | 1,04704 |
R…Earth radius (m) | 6372800 | |
in Radians | ||
Berlin | 0,234073087 | 0,916706317 |
London (City of Westminster) | -0,002203234 | 0,898847235 |
Deltas | -0,23628 | -0,01786 |
Euclidean | 13,57625239 | <= same as SELECT a.name, b.name, st_distance(a.geo, b.geo) FROM cities a, cities b; |
Haversine | ||
a = SIN((LAT2-LAT1)/2)^2 + COS(LAT1) * COS(LAT2) * SIN((LON2 - LON1)/2)^2 | 0,005341397 | |
c = 2 * ATAN2(SQRT(a);SQRT(1-a)) | 0,146300165 | Excel: ATAN2 flipped x and y! |
d = R * c | 932341,6908 | |
Haversine kilometers | 932,3416908 | |
So, being a newbie - I'm totally confused if ST_DIFFERENCE would use Euclidean distance for geospatial data, which I'm pretty sure that I'm wrong. As you can see, I've already calculated the haversine distance which would meet my expectations, but I'd be interested if there is a nicer and almost as good as haversine distance - builtin function.
Thanks in advance.
Michael
28-09-2020 05:50 PM
You are correct. Spatial functionality is all based on the OGC Simple Features Access Standard which can be found here: https://www.ogc.org/standards/sfs
Thus, for geometry types ST_DISTANCE returns the minimum 2D Cartesian (planar) distance between two geometries, in projected units (spatial ref units). We do not implement any extensions such as a GEOGRAPHY datatype, yet, that would take the spheroid into account.
The best approach would be to implement the Haversine distance as UDF in case you need it or to select a spatial reference system of choice that is most accurate for the area of investigation.
SRID: 26986 for example is a reference system that is most accurate for the area of Massachusetts, similar systems exist for Britain,... which provide high accuracy for local regions.
Then it's time to become part of a unique family! Discover helpful tips and support other Community members with your knowledge.
Sign In