Tutorial: Machine learning classification of Sentinel-2 satellite imagery using R


In my earlier post, I wrote about the events leading up to my paper in GIScience & Remote Sensing. In this short post I would like to help you conduct your own machine learning classification of Sentinel-2 data using R, the open source statistical computing environment. The process is straightforward if you have experience in remote sensing and image classification, and even without extensive experience, basic knowledge of remote sensing terminology is sufficient. My article, cited below, describes the different machine learning algorithms in detail and explains the key concepts behind them. It is open access, so I strongly encourage you to read it and keep it as a reference going forward:

Abdi, A. M. (2019): Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data - GIScience & Remote Sensing.

This tutorial does assume that you are already well-grounded in R. I’ve prepared a small script that contains the main steps for performing a classification using some standard, well-known algorithms. Each line is commented so that you will know exactly what it does.
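
For orientation, the general shape of such a classification procedure can be sketched roughly as below. This is a minimal sketch and not the script from the Gist: the pixel values are synthetic, the band names and class labels are made up for illustration, and a single decision tree from the rpart package (assumed to be installed, as it ships with standard R distributions) stands in for the algorithms compared in the article.

```r
# Illustrative sketch only: synthetic "pixels" stand in for Sentinel-2
# reflectance values, and an rpart decision tree stands in for the
# algorithms used in the actual script.
library(rpart)

set.seed(42)  # make the random draws and the split reproducible

# Hypothetical training samples: 3 land cover classes, 4 band values each.
# The class centres below are invented numbers, not measured reflectance.
n_per_class <- 100
make_class <- function(label, centre) {
  data.frame(
    class = label,
    B02 = rnorm(n_per_class, mean = centre[1], sd = 0.02),
    B03 = rnorm(n_per_class, mean = centre[2], sd = 0.02),
    B04 = rnorm(n_per_class, mean = centre[3], sd = 0.02),
    B08 = rnorm(n_per_class, mean = centre[4], sd = 0.02)
  )
}
samples <- rbind(
  make_class("water",  c(0.05, 0.04, 0.03, 0.02)),
  make_class("forest", c(0.03, 0.05, 0.03, 0.30)),
  make_class("urban",  c(0.15, 0.16, 0.18, 0.22))
)
samples$class <- factor(samples$class)

# Step 1: split the samples into training (70%) and validation (30%) sets
train_idx  <- sample(nrow(samples), size = round(0.7 * nrow(samples)))
training   <- samples[train_idx, ]
validation <- samples[-train_idx, ]

# Step 2: fit a classifier on the training samples
model <- rpart(class ~ B02 + B03 + B04 + B08, data = training, method = "class")

# Step 3: predict the class of the held-out validation samples
predicted <- predict(model, newdata = validation, type = "class")

# Step 4: assess accuracy with a confusion matrix
confusion <- table(reference = validation$class, predicted = predicted)
overall_accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(overall_accuracy)
```

On real data, the synthetic data frame would be replaced by band values extracted from the Sentinel-2 image at the training sample locations, and the fitted model would then be applied to every pixel of the image to produce the classified map.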

The entire code is available in this Gist, and the full dataset needed to run it can be downloaded here (155 MB) as a compressed ZIP file. All you have to do is run the code from within the directory where the data is located. To get the most out of the code, it is essential to understand every step, and I hope the comments in the code help with that. If you are unclear about any particular concept, I strongly advise you to read my article cited above. Happy classifying!