Very interesting research shows that statistical errors in GPS build up in a way that exaggerates the distance traveled. Counterintuitive!
Have you ever suspected that your GPS app is overestimating the distance you have traveled? If so, you are probably correct, but the reason isn't an algorithmic glitch. The answer lies in the statistics, and it is a strange story.
The mathematics of GPS position finding is complex enough to be left to a code library, but many apps simply take a set of co-ordinates, compute the distance between each adjacent pair of points on the path, and sum these to get the total distance traveled.
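The summation step might look something like this in Python (a sketch only; real apps typically use a geodesic library, and the haversine formula here assumes a spherical Earth):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon fixes (degrees)."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def path_length_m(points):
    """Sum the straight-line distances between adjacent fixes."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))
```

Note that each segment is treated independently, which is exactly why per-point errors accumulate in the total.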
This is so simple and so intuitive that it is hard to see how it could possibly go wrong, but it does.
If you start thinking about how this algorithm could give you the wrong answer, your first conclusion might be that interpolation error dominates. After all, the user is taking a curved path through the world and your GPS calculation is approximating this curve with a set of straight line segments. A moment's thought reveals that this should make the GPS distance smaller than the true path length. This is correct, but in practice GPS measurements are taken often enough to keep the interpolation error small.
The reason is all down to the errors in the positions and how these errors accumulate.
If you make a measurement that is subject to a random unbiased error, you are generally safe in assuming that the random component will make the quantity larger as often as it makes it smaller. This seems to be the case with GPS: there are positioning errors inherent in the system, but they don't show any particular bias. Given this observation, you would expect the distance between two points located with unbiased random error to be unbiased as well, i.e. on average it would be bigger as often as it was smaller than the true value.
However, you would be wrong.
Researchers at the University of Salzburg (UoS), Salzburg Forschungsgesellschaft (SFG), and the Delft University of Technology have done some fairly simple calculations that prove that this is not the case. Irrespective of the distribution of the errors, the expected measured length squared between two points is bigger than the true length squared unless the errors at both points are identical.
That is, if you have two points p1 and p2, each located with errors in x and y, the squared distance measured between them will on average come out bigger than the true squared distance unless the errors move both points by the same amount - which is highly unlikely in practice.
How can this be?
Consider the two points and the straight line between them. This straight line is the shortest distance between the two points. Now consider random displacements of the two points. The only displacements that reduce the distance are those that move the two points closer together, for example displacements along the line towards each other. The majority of random displacements end up increasing the distance.
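This is easy to demonstrate with a quick Monte Carlo simulation (a sketch with made-up numbers: a true distance of 10m and unbiased Gaussian noise with a 1m standard deviation on each coordinate):

```python
import math
import random

random.seed(42)
d, sigma, n = 10.0, 1.0, 100_000   # true distance, error std dev, trials

total = 0.0
for _ in range(n):
    # independent zero-mean noise on both coordinates of both points
    x1, y1 = random.gauss(0, sigma), random.gauss(0, sigma)
    x2, y2 = d + random.gauss(0, sigma), random.gauss(0, sigma)
    total += math.hypot(x2 - x1, y2 - y1)

mean_dist = total / n
print(mean_dist)   # noticeably larger than the true 10.0
```

Even though every individual error is unbiased, the average measured distance comes out above the true value.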
This is the reason that unbiased errors end up biasing the distance measurement.
So given that the GPS path is just a sum of distances computed between pairs of points, the total estimated distance is going to be bigger than the true distance because of random errors.
A little more work and the researchers derive a formula for how much of an Over Estimate of Distance (OED) is produced:
OED = √(d² + var - C) - d
where var is the variance of the GPS position errors and C is the autocovariance (correlation) between the errors at the two points. Notice that the more correlated the errors, the smaller the overestimate.
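The formula can be checked numerically. The sketch below uses my own reading of the terms (var as the combined error variance at the two endpoints and C as the combined covariance between them), and compares against the root-mean-square measured length, since the result is stated for squared lengths:

```python
import math
import random

random.seed(1)
d, sigma, rho, n = 10.0, 1.0, 0.5, 200_000  # true distance, error std, correlation

sq_total = 0.0
for _ in range(n):
    # zero-mean errors at the two endpoints, correlated with coefficient rho
    ex1, ey1 = random.gauss(0, sigma), random.gauss(0, sigma)
    ex2 = rho * ex1 + math.sqrt(1 - rho**2) * random.gauss(0, sigma)
    ey2 = rho * ey1 + math.sqrt(1 - rho**2) * random.gauss(0, sigma)
    sq_total += (d + ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

rms = math.sqrt(sq_total / n)     # root-mean-square measured length
var_total = 4 * sigma**2          # x and y error variance at both endpoints
C = 4 * rho * sigma**2            # covariance between the two error vectors
predicted = math.sqrt(d**2 + var_total - C) - d
print(rms - d, predicted)         # the two should be close
```

Setting rho closer to 1 shrinks both numbers towards zero, matching the observation that highly correlated errors produce a smaller overestimate.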
To test the theory some experiments were performed. A consumer quality GPS was walked around a 10m square with segment lengths of 1m and 5m. The average measured segments were 1.2m and 5.6m. That is, an overestimate of between 20% and 60%. Clearly a smaller segment length is a good idea.
Is there anything to be done?
The researchers point out that the measurement of speed is not subject to the same problem. GPS devices can measure speed using the Doppler shift, and this is accurate and not subject to the same measurement bias. It might well be that you can get unbiased distances by integrating velocity measurements over time.
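If a device does expose Doppler-derived speed, distance could in principle be recovered by integrating it over time; a minimal sketch (the function name and the trapezoidal scheme are my own choices, not from the paper):

```python
def distance_from_speed(speeds_mps, dt_s):
    """Trapezoidal integration of regularly sampled speed readings (m/s)."""
    return sum((a + b) / 2 * dt_s for a, b in zip(speeds_mps, speeds_mps[1:]))

# a constant 3 m/s held for 10 seconds, sampled at 1 Hz -> 30 m
print(distance_from_speed([3.0] * 11, 1.0))
```

Because no positions are differenced, the per-fix position error never enters this estimate at all.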
Runners and other athletes have long complained that GPS devices overestimate their performance and there have been lots of suggestions as to why.
This research does seem to have come with the answer - statistics.
More Information
Peter Ranacher, Richard Brunauer, Wolfgang Trutschnig, Stefan Van der Spek & Siegfried Reich (2015): Why GPS makes distances bigger than they are, International Journal of Geographical Information Science, DOI: 10.1080/13658816.2015.1086924
Via Robohub
Comments
The math in that paper may be horrible. The 60% figure is wrong: it should be (5.6 - 5.0)/5.0 = 12%. It would not surprise me if other numbers are wrong too. Their assumption that velocity does not suffer from this problem is also incorrect: if you estimate speed from velocity, then you end up with a similar bias in speed. (Remember that speed is a scalar and velocity is a vector.) This bias occurs whenever you estimate a scalar from a vector.
They refer to "GPS point speed" in their paper. If there is a way to measure speed directly from GPS signals, then it may not be subject to a bias. As far as I know, you can only get a velocity measurement from GPS signals.
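The commenter's scalar-from-a-vector point can be checked numerically: take a true velocity of 3 m/s due east, add unbiased noise to both components, and the mean of the resulting speeds comes out above 3 m/s (a sketch with invented numbers):

```python
import math
import random

random.seed(7)
vx, vy, sigma, n = 3.0, 0.0, 0.5, 100_000   # true velocity, noise std, trials

# speed = magnitude of the noisy velocity vector
mean_speed = sum(
    math.hypot(vx + random.gauss(0, sigma), vy + random.gauss(0, sigma))
    for _ in range(n)
) / n
print(mean_speed)   # exceeds the true speed of 3.0
```

The noise in the component perpendicular to the motion can only increase the magnitude, which is the same geometric effect as with distances.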
@Thomas Stone: There is some auto-correlation of GPS position estimates in time and space. According to the paper, it is advantageous to sample as fast as possible to maximize this auto-correlation in order to minimize your OED. I think that maximizing auto-correlation is good, but there must be some optimal sample rate that balances the trade-off between too many measurements and too little auto-correlation.
Interesting. A simple thought experiment: if you sit still, basic GPS odometry will accumulate a (positive) distance traveled.
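That thought experiment is easy to simulate (invented numbers: 3m of position noise, one fix per second for an hour, receiver never moves):

```python
import math
import random

random.seed(0)
sigma, n = 3.0, 3600   # position noise std dev (m), number of 1 Hz fixes

# the true position is fixed at the origin; only noise is observed
prev = (random.gauss(0, sigma), random.gauss(0, sigma))
total = 0.0
for _ in range(n):
    cur = (random.gauss(0, sigma), random.gauss(0, sigma))
    total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
    prev = cur
print(total)   # many kilometres of "distance" while standing still
```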
They should try experiments with the GPS moving faster (or sampling slower). At some point, the interpolation error should become significant. And for straight-line interpolation, all interpolation error should be <= 0 (i.e., under-estimation of actual distance traveled).... But perhaps, they have already determined that sort of speed is not representative of most applications.
I think that's different. Coastlines are fractal, which by definition have infinite length. GPS measurement is finite (10Hz) and thus should converge on a real distance were it not for statistical sampling error.
It's a well known fact that the more precisely a coastline is measured, the more length it yields. We have the same with GPS odometry.