Problem 1
In this first problem, you will work with GeoTIFF and GeoJSON files, and use GDAL to manipulate geospatial data. You will also use the Python scientific stack to implement simple image processing algorithms, composite (e.g. temporal) operations and remote sensing indices from band data.
For this problem, you will be analyzing and processing imagery from Sentinel-2 (L1C, Top of Atmosphere) taken over the greater Santa Fe metro area from 2019 to 2020. Each GeoTIFF file contains seven bands [red, green, blue, nir, swir1, swir2, alpha]. Upon examination of this dataset, you will notice that the resolution and coordinate reference system are not consistent across files.
The dataset is provided as a zip file called s2_santafe.zip.
Task 1 - Align the dataset
For the first part of this problem, you are asked to create a spatially aligned dataset from the provided dataset. Specifically, every file in your output dataset should have the same resolution, coordinate reference system and spatial extent. I would recommend projecting all your images to UTM. Your code should take the provided input dataset and write out the output dataset. If you use GDAL on the command line rather than through the Python bindings, your code can be a bash script or an equivalent Python script.
The spatial extent that each image in your output dataset should match is provided as a GeoJSON file called santafe_crop.geojson.
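One way to approach the alignment is to run gdalwarp once per file with a fixed target CRS, pixel size and cutline. The sketch below is only an illustration under stated assumptions: EPSG:32613 (UTM zone 13N, which covers Santa Fe), a 10 m pixel size, bilinear resampling, input files ending in .tif, and the helper names build_warp_command and align_dataset are all choices made here, not requirements of the problem.

```python
import subprocess
from pathlib import Path

# Assumptions (not given in the problem statement): UTM zone 13N covers
# Santa Fe, and 10 m is the chosen output pixel size.
TARGET_SRS = "EPSG:32613"
PIXEL_SIZE_M = 10.0


def build_warp_command(src, dst, cutline="santafe_crop.geojson"):
    """Return the gdalwarp argument list that aligns one scene."""
    return [
        "gdalwarp",
        "-t_srs", TARGET_SRS,                           # common CRS
        "-tr", str(PIXEL_SIZE_M), str(PIXEL_SIZE_M),    # common resolution
        "-tap",                                         # snap extent to the pixel grid
        "-r", "bilinear",                               # resampling (an assumption)
        "-cutline", str(cutline),
        "-crop_to_cutline",                             # common extent from the GeoJSON
        "-overwrite",
        str(src), str(dst),
    ]


def align_dataset(in_dir, out_dir):
    """Warp every *.tif in in_dir into out_dir on the shared grid."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.tif")):
        subprocess.run(build_warp_command(src, out_dir / src.name), check=True)
```

Calling align_dataset("s2_santafe", "aligned") would then warp each input in turn, where both directory names are placeholders. It is worth checking afterwards (for example with gdalinfo) that every output reports the same size, geotransform and projection, and whether bilinear resampling is acceptable for the alpha band.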
Task 2 - Analyze the dataset
Once you have created an aligned dataset, you will perform some analysis on this dataset.
First, you are asked to compute a histogram of values across the entire temporal stack for each of the six bands (excluding the alpha band).
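A minimal sketch of the per-band histogram, assuming the aligned scenes have already been read into a single NumPy array of shape (time, band, rows, cols) with alpha as the last band, and that pixels with alpha equal to 0 should be ignored. The function name band_histograms, the bin count and the 0 to 1 value range are assumptions; the range should be adjusted to however the data is actually scaled.

```python
import numpy as np

BAND_NAMES = ["red", "green", "blue", "nir", "swir1", "swir2"]


def band_histograms(stack, bins=256, value_range=(0.0, 1.0)):
    """Histogram each data band over all scenes and all valid pixels.

    stack: array of shape (time, 7, rows, cols); band 6 is alpha.
    Returns (counts_by_band_name, shared_bin_edges).
    """
    stack = np.asarray(stack, dtype=np.float64)
    valid = stack[:, -1] > 0                      # (time, rows, cols)
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    counts = {}
    for i, name in enumerate(BAND_NAMES):
        values = stack[:, i][valid]               # 1-D array of valid samples
        counts[name], _ = np.histogram(values, bins=edges)
    return counts, edges
```

Using shared bin edges for all six bands keeps the histograms directly comparable when plotted side by side.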
Next, across the temporal stack, you are asked to:
- Find the greenest scene (i.e. the most vegetated scene - max(NDVI))
- Find the snowiest scene (NDSI)
- Find the cloudiest scene
- Find the brightest scene
Note that your outputs should be the result of your technique / code / algorithm running on the stack of imagery. You are NOT allowed to produce your answers simply through visual inspection of the data, although you will certainly want to inspect the data closely to figure out what approach to take. For each answer, provide the scene ID and the corresponding image.
Finally, you are asked to create composite images (i.e. reduce the stack of imagery to a single image) of varying kinds:
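As a rough illustration of ranking scenes by an index, the sketch below scores each scene by its mean NDVI, mean NDSI and mean visible brightness over valid pixels, using the standard definitions NDVI = (nir - red) / (nir + red) and NDSI = (green - swir1) / (green + swir1). Using the scene mean as the score is an assumption (a fraction of pixels above a threshold is another reasonable choice), the helper names are hypothetical, and the cloudiest scene would additionally need a cloud mask.

```python
import numpy as np

# Band positions in [red, green, blue, nir, swir1, swir2, alpha]
RED, GREEN, BLUE, NIR, SWIR1, SWIR2, ALPHA = range(7)


def normalized_difference(a, b):
    """Compute (a - b) / (a + b), returning NaN where the sum is zero."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    total = a + b
    out = np.full(total.shape, np.nan)
    np.divide(a - b, total, out=out, where=total != 0)
    return out


def scene_scores(scene):
    """Score one scene of shape (7, rows, cols) over its valid pixels."""
    scene = np.asarray(scene)
    valid = scene[ALPHA] > 0
    ndvi = normalized_difference(scene[NIR], scene[RED])
    ndsi = normalized_difference(scene[GREEN], scene[SWIR1])
    brightness = scene[[RED, GREEN, BLUE]].astype(np.float64).mean(axis=0)
    return {
        "ndvi": float(np.nanmean(ndvi[valid])),
        "ndsi": float(np.nanmean(ndsi[valid])),
        "brightness": float(brightness[valid].mean()),
    }


def best_scene(scenes, key):
    """Return the scene ID with the highest score for `key`.

    scenes: dict mapping scene ID -> array of shape (7, rows, cols).
    """
    return max(scenes, key=lambda sid: scene_scores(scenes[sid])[key])
```

For example, best_scene(scenes, "ndvi") would return the ID of the greenest scene under this scoring. Scenes with no valid pixels produce NaN scores and should be filtered out before ranking.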
- mean
- min
- max
- median
- greenest pixel (i.e. argmax NDVI)
- 85% greenest pixel
For each temporal operation, your output should be a GeoTIFF file that contains the georeferenced composite image.
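The composites can be sketched with NumPy on an aligned stack of shape (time, band, rows, cols). This is only one possible reading of the task: pixels with alpha equal to 0 are treated as missing, and "85% greenest pixel" is interpreted here as the observation at the 85th percentile of per-pixel NDVI, which is an assumption. The function names are hypothetical, and writing the result to a GeoTIFF (reusing the geotransform and projection of the aligned inputs) is left out.

```python
import numpy as np

RED, NIR, ALPHA = 0, 3, 6  # positions in [red, green, blue, nir, swir1, swir2, alpha]


def masked_stack(stack):
    """Return the six data bands as floats, with alpha == 0 pixels set to NaN."""
    stack = np.asarray(stack, dtype=np.float64)
    valid = stack[:, ALPHA] > 0                        # (time, rows, cols)
    return np.where(valid[:, None], stack[:, :ALPHA], np.nan)


def reduce_composite(stack, how):
    """Per-pixel, per-band mean / min / max / median over time, ignoring NaN."""
    reducers = {"mean": np.nanmean, "min": np.nanmin,
                "max": np.nanmax, "median": np.nanmedian}
    return reducers[how](masked_stack(stack), axis=0)


def ndvi_rank_composite(stack, quantile=1.0):
    """Pick, per pixel, the observation at the given NDVI quantile.

    quantile=1.0 gives the greenest-pixel (argmax NDVI) composite;
    quantile=0.85 is one interpretation of the "85% greenest pixel".
    """
    data = masked_stack(stack)                         # (time, 6, rows, cols)
    n_times = data.shape[0]
    red, nir = data[:, RED], data[:, NIR]
    total = nir + red
    usable = np.isfinite(total) & (total != 0)
    ndvi = np.full(total.shape, -np.inf)               # unusable sorts first
    np.divide(nir - red, total, out=ndvi, where=usable)
    order = np.argsort(ndvi, axis=0)                   # ascending NDVI per pixel
    n_valid = usable.sum(axis=0)                       # usable dates per pixel
    rank = np.rint(quantile * (n_valid - 1)).astype(int)
    position = np.clip(n_times - n_valid + rank, 0, n_times - 1)
    pick = np.take_along_axis(order, position[None], axis=0)[0]   # time index
    return np.take_along_axis(data, pick[None, None], axis=0)[0]  # (6, rows, cols)
```

Selecting a whole observation per pixel (rather than reducing each band independently) keeps the six band values of every output pixel from the same date, which is usually what a "greenest pixel" composite is meant to preserve. Pixels with no valid observation come out as NaN and need a nodata value when written.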
To measure cloud cover across a scene, you will want to implement a simple cloud masking algorithm. Don’t aim for something perfect; aim for something that works reasonably well. Note that this same cloud masking algorithm can be used when creating composite imagery, by masking each image with your derived cloud mask. Finally, note that you do not necessarily need to use every single image / pixel for your composite operation.
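A very simple threshold-based cloud mask could look like the sketch below. It flags pixels that are bright in blue and also bright in swir1, which tends to separate clouds from snow (snow is bright in the visible bands but dark in the shortwave infrared). The thresholds are illustrative guesses that assume reflectance scaled to the 0 to 1 range and would need tuning against the actual data; the function names are hypothetical.

```python
import numpy as np

BLUE, SWIR1, ALPHA = 2, 4, 6  # positions in [red, green, blue, nir, swir1, swir2, alpha]


def cloud_mask(scene, blue_min=0.25, swir1_min=0.2):
    """Boolean mask (rows, cols): True where a valid pixel looks cloudy.

    Thresholds are illustrative and assume 0..1 reflectance.
    """
    scene = np.asarray(scene, dtype=np.float64)
    valid = scene[ALPHA] > 0
    return valid & (scene[BLUE] > blue_min) & (scene[SWIR1] > swir1_min)


def cloud_fraction(scene):
    """Fraction of valid pixels flagged as cloud (0.0 if no valid pixels)."""
    valid = np.asarray(scene)[ALPHA] > 0
    n_valid = int(valid.sum())
    if n_valid == 0:
        return 0.0
    return float(cloud_mask(scene).sum()) / n_valid


def cloudiest_scene(scenes):
    """Return the scene ID with the highest cloud fraction.

    scenes: dict mapping scene ID -> array of shape (7, rows, cols).
    """
    return max(scenes, key=lambda sid: cloud_fraction(scenes[sid]))
```

The same mask can be applied before compositing, for instance by setting the alpha band to 0 wherever cloud_mask returns True so that cloudy pixels are skipped like any other invalid pixel.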