Behind the Paper: Comparing machine learning algorithms using Sentinel-2 data


The field of machine learning is moving fast, and it seems that fancy new algorithms are coming out every week. It can be confusing to figure out which algorithms are best suited for which purpose. This is particularly the case for land-use and land-cover classification using multidimensional satellite imagery, because most of the new algorithms are tested with either binary or uni-dimensional data. I faced this dilemma in early 2018 when I was hired to develop a classification workflow for a mixed-use landscape in Sweden using Sentinel-2 data. I thought this was a straightforward enough task. My plan was to find a study that had already tested algorithms on Sentinel-2 data in boreal regions and use its best-performing algorithm in my workflow. However, upon searching the literature, I couldn’t find studies that had used machine learning on Sentinel-2 data in mixed-use boreal regions. So, I decided to conduct the study on my own.

The first task was to find a study area in Sweden (1) that was challenging to classify, i.e. had a diverse set of land cover and land use classes, and (2) for which cloud-free Sentinel-2 data was available. I found an area in the vicinity of the Norunda research station that satisfied both requirements. The added benefit of selecting this area was that the eddy covariance tower at Norunda measures greenhouse gas exchange between the land surface and the atmosphere. Interpreting those measurements requires regularly updated, accurate land cover data to understand which types of surface cover produce the greenhouse gases.

The second task was to select machine learning algorithms. The selection of random forest and support vector machines was obvious to me because they are so commonly used in remote sensing studies. I wasn’t so sure about extreme gradient boosting and deep learning, since they have only recently gained traction in remote sensing. So, I had to read up on them and understand how they work before using them.

The final task before embarking on the actual classification was to figure out how to obtain a large number of training samples without spending too much time on it, as I only had a few months. At first, I collected one hundred training samples for each of my eight land cover classes using high-resolution orthophotos. Then, I tested whether these eight hundred samples matched up with a new 10-meter land cover and land use map produced by Naturvårdsverket. They were an almost perfect match because the map was highly accurate, owing to the myriad of datasets that were combined to produce it. So, in order to get more training samples and compare my algorithms, I used the Naturvårdsverket map to increase my samples.
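To make the matching step concrete, here is a minimal sketch of how the agreement between hand-collected samples and a reference map could be computed, overall and per class. It is not the code from my tutorial (which is in R). The function name `agreement_by_class`, the "Forest" and "Water" class names, and the toy labels are all hypothetical, and the sketch assumes the map label at each sample location has already been extracted.

```python
from collections import defaultdict


def agreement_by_class(sample_labels, map_labels):
    """Return (overall, per_class) agreement between two label lists.

    sample_labels: class assigned to each point from the orthophotos.
    map_labels: class read from the reference map at the same points.
    Both inputs are hypothetical stand-ins for real extracted values.
    """
    if not sample_labels:
        raise ValueError("at least one sample is required")
    if len(sample_labels) != len(map_labels):
        raise ValueError("both label lists must have the same length")

    totals = defaultdict(int)   # samples per hand-assigned class
    matches = defaultdict(int)  # samples where the map agrees

    for mine, reference in zip(sample_labels, map_labels):
        totals[mine] += 1
        if mine == reference:
            matches[mine] += 1

    per_class = {cls: matches[cls] / totals[cls] for cls in totals}
    overall = sum(matches.values()) / len(sample_labels)
    return overall, per_class


# Toy data for illustration only; these are not the study's samples.
sample_labels = ["Forest", "Forest", "Open Land", "Clear Cuts", "Water", "Water"]
map_labels = ["Forest", "Forest", "Open Land", "Open Land", "Water", "Water"]

overall, per_class = agreement_by_class(sample_labels, map_labels)
print(f"overall agreement: {overall:.2f}")
for cls, share in sorted(per_class.items()):
    print(f"{cls}: {share:.2f}")
```

A per-class breakdown is useful here because a high overall agreement can hide one class that matches the map poorly.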


I had two land cover and land use classes that proved difficult to separate: “Open Land”, which denotes areas of open, treeless land with or without a grass or shrub cover, and “Clear Cuts”, which denotes areas that have been deforested, both within and outside wetlands. The two classes have similar spectral signatures because they are both essentially open areas of land. This extreme similarity brought down the overall classification accuracy: removing one of the two classes caused a 15% increase in overall accuracy. The lesson learned is to either combine similar land cover and land use categories into one class, or actively work on separating extremely similar classes by increasing their training sample size.
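The first option, combining similar categories, can be illustrated with a small confusion-matrix sketch. Note that in the study I removed one class rather than merging the two, so this only illustrates the idea. The matrix values, the "Forest" class, the merged name "Open Areas", and the helper names `overall_accuracy` and `merge_classes` are all made up, and none of the numbers come from the paper.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def merge_classes(matrix, labels, a, b, merged_name):
    """Collapse classes a and b into one class, summing rows and columns."""
    new_labels = [lab for lab in labels if lab not in (a, b)] + [merged_name]
    index = {lab: i for i, lab in enumerate(new_labels)}

    def target(label):
        # Map an original label to its position in the merged matrix.
        return index[merged_name if label in (a, b) else label]

    size = len(new_labels)
    merged = [[0] * size for _ in range(size)]
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            merged[target(row_label)][target(col_label)] += matrix[i][j]
    return merged, new_labels


# Hypothetical confusion matrix (rows: reference, columns: predicted).
labels = ["Forest", "Open Land", "Clear Cuts"]
matrix = [
    [90, 5, 5],
    [4, 60, 36],
    [6, 34, 60],
]

before = overall_accuracy(matrix)
merged, merged_labels = merge_classes(
    matrix, labels, "Open Land", "Clear Cuts", "Open Areas"
)
after = overall_accuracy(merged)
print(f"before merging: {before:.3f}, after merging: {after:.3f}")
```

In this toy case most of the errors are confusions between the two open classes, so merging them moves those samples onto the diagonal and the overall accuracy rises. Whether merging is acceptable depends on whether the distinction between the classes matters for the application.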

In an effort to help others who are interested in conducting machine learning classification, I prepared a small tutorial that details most of the necessary steps. The tutorial includes the R code to prepare the data and train the algorithms as well as all the data that you need. However, I encourage you to read my paper, linked below, before starting the exercise in order to get an idea of the key concepts. If you are already knowledgeable about this topic, then the paper will serve as a handy reference.

Citation: Abdi A. M. (2019) Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data, GIScience & Remote Sensing.