Python Fmask

Introduction

A set of command line utilities and Python modules that implement the ‘fmask’ algorithm as published in:

Zhu, Z. and Woodcock, C.E. (2012). Object-based cloud and cloud shadow detection in Landsat imagery Remote Sensing of Environment 118 (2012) 83-94.

and

Zhu, Z., Wang, S. and Woodcock, C.E. (2015). Improvement and expansion of the Fmask algorithm: cloud, cloud shadow, and snow detection for Landsats 4-7, 8, and Sentinel 2 images Remote Sensing of Environment 159 (2015) 269-277.

Also includes optional extension for Sentinel-2 from Frantz, D., Hass, E., Uhl, A., Stoffels, J., & Hill, J. (2018). Improvement of the Fmask algorithm for Sentinel-2 images: Separating clouds from bright surfaces based on parallax effects. Remote Sensing of Environment 215, 471-481.

Installation requires Python, numpy, scipy, GDAL and RIOS and the ability to compile C extensions for Python. It is licensed under GPL 3.

Originally developed by Neil Flood at DSITI and additional work funded by Landcare Research.

Philosophy

This package implements the Fmask algorithm as a Python module. It is intended that this can be wrapped in a variety of main programs which can handle the local details of how the image files are named and organised, and is intended to provide maximum flexibility. It should not be tied to expecting the imagery to be layed out in a particular manner.

This modular design also simplifies the use of the same core algorithm on either Landsat and Sentinel imagery. The wrapper programs take care of differences in file organisation and metadata formats, while the core algorithm is the same for both.

However, we have also supplied some example wrapper scripts, based around the image organisation as supplied by the usual distributors of the imagery. In the case of Landsat, we have supplied main programs which can cope with the data as it comes from USGS, while in the case of Sentinel-2 we have supplied wrappers to deal with the data as supplied by ESA.

It is expected that some users will use these directly, while larger organisations will wish to create their own wrappers specific to their own file naming and layout conventions.

The output from the core algorithm module is a single thematic raster, with integer codes representing null, clear, cloud, shadow, snow, water respectively.

The examples shown below use the given example wrappers.

Command Line Examples

All the commandline programs given use argparse to handle commandline arguments, and hence will respond sensibly to the -h option by printing their own help. Some have options to modify their behaviour.

Please note that the output format used is defined by RIOS. This defaults to HFA (.img). See RIOS documentation for more information and how to change this using the environment variable $RIOS_DFLT_DRIVER.

Note: these examples are for use in a Unix/Linux shell so that the filename wildcards get expanded properly. Windows users should prefix these commands with “fmask_expandWildcards” so that the commands get run with the wildcards expanded as they do in Unix/Linux.

USGS Landsat

Update: Since the USGS released their Collection-1 data set (completed globally in 2017), they now distribute cloud, shadow and snow masks included in their QA layer. These are calculated using CFMask, which should, in principle, be equivalent to the results of this code. Therefore, when processing USGS Collection-1 data, users may prefer the USGS-supplied masks. See USGS QA Layer.

The command line scripts supplied can process an untarred USGS Landsat scene. Firstly, the reflective and thermal bands must be stacked separately. This needs to be done in a different manner depending on the sensor. (Note: The wild cards shown here are for the more modern form of the USGS naming convention. Older forms would require slightly different wild card patterns. )

Landsat 4&5:

gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o ref.img L*_B[1,2,3,4,5,7].TIF
gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o thermal.img L*_B6.TIF

Landsat 7:

gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o ref.img L*_B[1,2,3,4,5,7].TIF
gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o thermal.img L*_B6_VCID_?.TIF

Landsat 8:

gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o ref.img LC8*_B[1-7,9].TIF
gdal_merge.py -separate -of HFA -co COMPRESSED=YES -o thermal.img LC8*_B1[0,1].TIF

The next step is to create an image of the relevant angles, per-pixel, for use in subsequent steps. For Landsat, this can be done using:

fmask_usgsLandsatMakeAnglesImage.py -m *_MTL.txt -t ref.img -o angles.img

Then mask and Top of Atmosphere reflectance must be calculated and finally the cloud mask itself:

fmask_usgsLandsatSaturationMask.py -i ref.img -m *_MTL.txt -o saturationmask.img
fmask_usgsLandsatTOA.py -i ref.img -m *_MTL.txt -z angles.img -o toa.img
fmask_usgsLandsatStacked.py -t thermal.img -a toa.img -m *_MTL.txt -z angles.img -s saturationmask.img -o cloud.img

If the thermal band is empty (for Landsat-8 with the SSM anomaly, after 2015-11-01) then it is ignored gracefully.

Sentinel2

The command line scripts supplied can process a Sentinel2 Level C granule from the image directory. Here is an example of how to do this. This example works at 20m resolution, but the recipe can be varied as required. Be warned, processing at 10m resolution would be considerably slower, and is unlikely to be any more accurate.

This command will take a given .SAFE directory, find the right images, and create an output file called cloud.tif:

fmask_sentinel2Stacked.py -o cloud.tif --safedir S2B_MSIL1C_20180918T235239_N0206_R130_T56JNQ_20180919T011001.SAFE

When working with the old ESA zipfile format, which packed multiple tiles into a single SAFE-format zipfile, this approach will not work, as it won’t know which tile to process. So, instead, use the option to specify the granule directory, as follows:

fmask_sentinel2Stacked.py -o cloud.tif --granuledir S2A_OPER_PRD_MSIL1C_PDMC_20160111T072442_R030_V20160111T000425_20160111T000425.SAFE/GRANULE/S2A_OPER_MSI_L1C_TL_SGS__20160111T051031_A002887_T56JNQ_N02.01

This would also work on a new-format directory, but specifying the top .SAFE directory is easier.

There are command line options to modify many aspects of the algorithm’s behaviour.

There are two options which are now obsolete, for manually specifying a pre-stacked file of reflectance bands, and the image of angles. These should be considered obsolete, and are replaced with the –safedir or –granuledir option, which take care of all internally.

Re-wrapping and Re-configuring

To build a different set of wrappers, and configure things differently, the default wrappers are a good place to start. The configuration is mainly handled by the fmask.config.FmaskConfig class. For example, one would call fmask.config.FmaskConfig.setReflectiveBand() to change which layer of the stack corresponds to which wavelength band.

Downloads

Get the source as a bundle from BitBucket. Release notes for each version can be read in Release Notes. To install from source, read the INSTALL.txt file included inside the source bundle.

Pre-built binary Conda packages are available under the ‘conda-forge’ channel. Once you have installed Conda, run the following commands on the command line to install python-fmask:

conda config --add channels conda-forge
conda create -n myenv python-fmask
source activate myenv # omit 'source' on Windows

Applications that use python-fmask

  • Cloud Masking: It is a Qgis plugin for cloud masking the Landsat (4, 5, 7 and 8) products using different process and filters such as Fmask, Blue Band, Cloud QA, Aerosol and Pixel QA.

Issues

Please log bugs encountered with the Issue Tracker.