[Image: two aerial photos stitched together using a homography matrix]

A number of people on this site are using their vehicles for aerial mapping. There are tools available for image stitching, which basically produces a rough idea of what the terrain looks like from above. Image stitching does not compensate for perspective, and the result usually contains lots of artifacts. Some older tools don't work that well because they rely on older image matching algorithms that are not as scale or rotation invariant.

For example, have a look at the image below:

[Image: matched SIFT feature pairs between two aerial photos]

Here we see two images that have been matched together using the "SIFT" algorithm. That algorithm is scale and rotation invariant, which means images that differ slightly in scale or are rotated in any direction can still be matched together reasonably well. In this example, it's easy to see it also deals with changes in luminosity very well. These algorithms look at local luminosity gradients (not actual values), so they detect places where you have very specific, irregular changes that are nevertheless consistent between images. This makes large, uninteresting areas invisible to this algorithm (as there are no local gradients there). The shadow edge is pretty regular and is never matched. Have a look at which features the algorithm detected and matched to the paired image instead, to understand what makes a good feature. For an in-depth read, this guy explains this really well: http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/
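
To make the "gradients, not actual values" point concrete, here is a minimal Python sketch. It is not SIFT itself, only a toy orientation histogram loosely modelled on the idea behind its descriptor. The patch values, the 8-bin histogram and the function names are all assumptions made up for illustration.

```python
import math

def gradients(patch):
    """(magnitude, orientation in degrees) for each interior pixel, via central differences."""
    rows, cols = len(patch), len(patch[0])
    result = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            result.append((math.hypot(dx, dy), angle))
    return result

def orientation_histogram(patch, bins=8):
    """Sum gradient magnitudes into orientation bins (toy stand-in for a descriptor)."""
    hist = [0.0] * bins
    for magnitude, angle in gradients(patch):
        hist[int(angle // (360.0 / bins)) % bins] += magnitude
    return hist

# Hypothetical 4x4 luminosity patch with an irregular edge running through it
patch = [
    [10, 12, 40, 90],
    [11, 30, 80, 95],
    [15, 60, 92, 97],
    [20, 85, 96, 99],
]
brighter = [[v + 50 for v in row] for row in patch]  # same scene, uniformly brighter
flat = [[128] * 4 for _ in range(4)]                 # large, uninteresting area

print(orientation_histogram(patch) == orientation_histogram(brighter))  # offset cancels out
print(sum(orientation_histogram(flat)))                                 # no gradients at all
```

Adding a constant brightness offset leaves every pixel difference unchanged, so both histograms are identical, while the flat patch produces nothing to match on.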

Now here's the reason why it's important: if you understand how this algorithm works, you also get a better understanding of how to shoot your images and what to avoid in order to get good matches. For algorithms like these, organic, flat areas are great. However, trees aren't that great because leaves occlude specific gradients when you change position over them. In other words, if you fly 10 meters further, the way a 16x16 pixel area looks has changed considerably due to wind and what is visible through the leaves: your gradients change completely! That's why such areas need photos taken closer together to get features in the biomass; otherwise they'll end up flat or blobby.

The second image shows the matching pairs of features after the fundamental matrix was established. The fundamental matrix establishes an epipolar relationship between two images. That means that a point in one image corresponds to a point in the other image that must lie somewhere along a line (the epipolar line). This makes finding the feature in the other image easier. When you have a camera model, it also becomes possible to triangulate these points to real world geometry.
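
As a rough sketch of that epipolar relationship: with the convention x2^T F x1 = 0, the line in the second image for a pixel x1 in the first is simply F times x1. The matrix below is an assumed toy case (camera moved purely sideways, identity intrinsics), not something estimated from the photos in this post, and the function names are made up.

```python
import math

def epipolar_line(F, point):
    """Line (a, b, c) in image 2, meaning a*u + b*v + c = 0, for a pixel in image 1."""
    u, v = point
    x = (u, v, 1.0)  # homogeneous pixel coordinate
    return tuple(sum(F[r][k] * x[k] for k in range(3)) for r in range(3))

def distance_to_line(line, point):
    """Perpendicular pixel distance from a candidate match to the epipolar line.
    Assumes a and b are not both zero."""
    a, b, c = line
    u, v = point
    return abs(a * u + b * v + c) / math.hypot(a, b)

# Assumed toy fundamental matrix: pure translation along x, identity intrinsics
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

line = epipolar_line(F, (120, 80))
print(distance_to_line(line, (95, 80)))  # 0.0, candidate lies on the line
print(distance_to_line(line, (95, 86)))  # 6.0, candidate is 6 pixels off the line
```

In this toy case the epipolar lines are horizontal, so a feature only has to be searched for along the same image row. A matcher can reject candidate pairs whose distance to the line is larger than a few pixels.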

The image at the top wasn't created with the fundamental matrix; it was created using a "homography matrix". This matrix defines how the two images, treated as planar surfaces, are related to one another. So it describes a 2D geometric transform that should be applied to minimize the error between these two images. You can use this to stitch images, but also for things like "augmented reality", where a 3D camera is matched to your real camera depending on how a 2D marker is matched in the view.
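
Applying a homography to a pixel is just a 3x3 matrix multiplication followed by a division by the third component. A minimal sketch, where the matrix values are invented for illustration (a shift plus a slight perspective term) rather than estimated from real matches:

```python
def apply_homography(H, point):
    """Map a pixel from image 1 into image 2's plane using a 3x3 homography."""
    u, v = point
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)  # back from homogeneous to pixel coordinates

# Hypothetical homography: shift by (200, -15) with a small perspective term
H = [[1.0, 0.0, 200.0],
     [0.0, 1.0, -15.0],
     [0.001, 0.0, 1.0]]

# Where the four corners of a 640x480 image land in the other image's frame
corners = [(0, 0), (640, 0), (640, 480), (0, 480)]
for corner in corners:
    print(corner, "->", apply_homography(H, corner))
```

A stitcher would warp every pixel this way (usually with the inverse mapping and interpolation) and then blend the overlapping region.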

Want to play around with this yourself? I found a very nice Java library, probably easier to use than OpenCV, with some really clear examples:



This code is already halfway towards a panorama stitcher. If you calibrated your camera, then you can use the parameters to work on the undistorted images in this case. I don't recommend undistorting images prior to 3D reconstruction, because undistorting also resamples the pixels and therefore affects how features are matched. In the 3D reconstruction pipeline there is a camera model with calibrated parameters, so features do get transformed correctly through that more accurate model.
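
To illustrate the difference, here is a sketch of correcting a feature's coordinates through a lens model instead of resampling the whole image. It assumes a simple two-coefficient radial distortion model on normalized image coordinates. The coefficients and function names are made up, and a real pipeline's camera model may well differ.

```python
def distort(xy, k1, k2):
    """Apply radial distortion to an ideal (undistorted) normalized coordinate."""
    x, y = xy
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * scale, y * scale)

def undistort(xy, k1, k2, iterations=20):
    """Invert the radial model for one feature point by fixed-point iteration."""
    xd, yd = xy
    x, y = xd, yd  # start from the distorted position
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return (x, y)

# Assumed calibration values, for illustration only
k1, k2 = -0.2, 0.05

ideal = (0.3, -0.2)               # where the feature would be with a perfect lens
observed = distort(ideal, k1, k2) # where the feature is detected in the raw photo
recovered = undistort(observed, k1, k2)
print(observed, recovered)
```

The feature is detected on the untouched pixels of the raw photo, and only its coordinates are corrected afterwards, so no image content gets interpolated.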

If you want to georeference your work, have a look here:




  • GDAL is great. I reckon if you take some photos with good overlap to reduce ambiguity, you should be able to make a giant orthophoto through BoofCV or something similar. BoofCV has a SURF detector inside it. You'd perhaps have to define which images get matched to reduce processing times, but if the set is small, like up to 40, I'd just sit through the pain. Then BoofCV can stitch your image quite easily. Then through GDAL you should be able to apply measured ground control points to get a GeoTiff out of this (or in QGIS I guess?). What's left is a nice algorithm to blend borders between images, but most of the stitched images have some kind of artifact. I think it cannot be avoided because of the difference in perspective, especially when you have buildings and things "going up" like trees.

  • Thanks @Gerard for the links and explanation. I've been playing around with processing my imagery, but have never really looked under the hood to see how this actually works.
    @Noli I've used the GDAL lib for tiled aerial imagery from a full scale aircraft, but haven't had too much luck with my own captured imagery.
  • Here's a link for 2D mosaicking of aerial photos (below).

    Image Mosaicking with GDAL


  • Moderator

    Thanks so much for adding this, Gerard. Useful stuff.
