URBAN VEGETATION CLASSIFICATION WITH HIGH-RESOLUTION PLANETSCOPE AND SKYSAT MULTISPECTRAL IMAGERY

In this study, two high-resolution satellite image products, PlanetScope and SkySat, were compared in terms of their capability to classify urban vegetation. We applied Random Forest (RF) and Support Vector Machine (SVM) classifiers to a study area in the center of Rome, Italy. We first performed the classifications using the spectral bands alone and then added the Normalized Difference Vegetation Index (NDVI) as a further predictor. We evaluated the performance of the classifiers on the different sets of input data with ROC curves and AUC values. Additional statistical analyses were applied: a correlation analysis to reveal the correlation structure of the satellite bands and the NDVI, and General Linear Modeling to evaluate the AUC of the different models. Although the classification methods did not yield substantially different outcomes (AUC values between 0.96 and 0.99), SVM performed better. Including NDVI resulted in significantly higher AUC values. SkySat’s bands provided slightly better input data than PlanetScope’s, but the difference was minimal (~3%); accordingly, both satellites ensured excellent classification results.


Introduction
Remote sensing is one of the fastest-growing geospatial technologies, progressively influencing a wide range of areas such as commerce, science, and applied research as well as public policy (Estes et al. 2001). By definition, remote sensing is the science and technology of obtaining information about the earth's surface without any direct physical contact (Campbell and Wynne, 2011). The field of remote sensing has evolved considerably. For several decades, aerial photographs acquired from airborne platforms were the main source of information; since the early 1970s, satellite images have emerged as an alternative to aerial photography for observing the earth's surface (Szabó et al. 2018). One of the advantages of satellite images over aerial images lies in their spatio-temporal characteristics, which permit wide-area mapping at regular intervals. Satellite images have already been applied successfully in various areas, e.g., land use/land cover monitoring (Kraas, 2007; Singh et al. 2015; Szabó et al. 2016; Phinzi and Ngetar, 2019; Gyenizse et al. 2020), urban applications (Streutker, 2002; Li et al. 2009; Szabó Z. et al. 2019; Paramita and Matzarakis, 2019), agriculture (Atzberger, 2013), water quality monitoring (Chebud et al. 2012), drought monitoring (Gulácsi and Kovács, 2018), wetland mapping (Van Leeuwen et al. 2020), and erosion risk assessment (Bakacsi et al. 2019; Phinzi et al. 2020).
Recently, there has been increased interest in the use of remote sensing for urban vegetation mapping. This interest is a direct response to global climate change, a major burden for cities (Kraas, 2007). The increasing number of urban dwellers, traffic congestion, and the ongoing urban heat island effect all have a direct bearing on global climate change. In the face of such urban problems and their contributions to climate change, urban vegetation, trees in particular, plays a critical role as an ecosystem service, mitigating climate change impacts (Pickett et al. 2011). The availability of accurate information on the spatial distribution of urban vegetation is the first and an important step towards addressing the aforementioned urban problems and ultimately reducing the effects of climate change.
Remote sensing, due to its spatial and temporal characteristics, offers tremendous opportunities for mapping urban vegetation, providing reliable and reproducible information on urban vegetation patterns across large areas (Melesse et al. 2007). However, urban vegetation assessment using remote sensing-based approaches still faces challenges, largely related to the spectral and spatial complexity of urban environments. The presence of a variety of vegetation types, together with the pronounced 3D structure of urban environments with shadowing and obscured urban objects, as well as rapid temporal changes (Tigges et al. 2013), makes it even more difficult to assess urban vegetation using remote sensing. Freely available satellite images like Landsat and Sentinel, with 30 m and 10 m spatial resolutions, respectively, are probably the most widely used for vegetation mapping over large areas, but they are not suitable for detailed urban vegetation mapping because of their relatively low spatial resolutions.
Although Sentinel-2 has the best spatial resolution among the freely available multispectral datasets, Planet Labs Inc. (Planet Team, 2017) provides PlanetScope imagery free of charge for educational and research applications. PlanetScope is a constellation of more than 180 lunchbox-sized multispectral satellites with 3 m spatial resolution and daily revisit time (Shendryk et al. 2019). Planet Labs Inc. also operates an ultra-high spatial resolution (1 m multispectral and 0.8 m panchromatic) satellite constellation, the SkySat satellites. SkySat is a commercial product, but a few sample scenes are available for use.
The use of PlanetScope and SkySat images has not been widely reported in the literature, especially with regard to their capabilities in vegetation classification. The aim of this study is to compare the capabilities of these two high spatial resolution image products for classifying urban vegetation. Specifically, we aim to determine a) whether the images provide accurate maps of vegetation cover; b) which classifier provides the best classification accuracy; and c) whether the NDVI improves the classification results.

Study area
The study area is located in the center of Rome, Italy. The area covers 30 km² and contains mostly urban built-up areas, green parks, and the River Tevere. There is very little vegetation in the built-up areas, especially in the city center. Most of the vegetation is located in the green parks and along the main roads and the river.

Satellite imagery
In this study, we used two high-resolution satellite images, both from the Planet Labs database (Planet Team, 2017) and both captured on 28 August 2018. The first is a PlanetScope image. PlanetScope has four spectral bands: blue, green, red, and near-infrared. The spatial resolution is 3 m, and the constellation has a daily revisit time. The second is a SkySat scene. SkySat also has four multispectral bands (blue, green, red, and near-infrared) with 1 m spatial resolution and a panchromatic band with 0.8 m spatial resolution. In this study, we used the pan-sharpened multispectral dataset with 0.8 m spatial resolution.
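The red and near-infrared bands listed above are the inputs of the NDVI, which is used later as an additional predictor. The index is defined as NDVI = (NIR - Red) / (NIR + Red). The following is a minimal sketch of that formula, assuming the bands are available as NumPy arrays of reflectance; the function name, the toy pixel values, and the choice to return 0 where both bands are 0 are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red), computed per pixel."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    # Assumption: pixels where both bands are 0 (e.g. no-data) get NDVI 0.
    safe = np.where(denom == 0, 1.0, denom)
    return np.where(denom == 0, 0.0, (nir - red) / safe)

# Toy 2 x 2 scene (made-up reflectances, not data from the study).
red = np.array([[0.05, 0.30], [0.00, 0.10]])
nir = np.array([[0.45, 0.32], [0.00, 0.10]])
values = ndvi(red, nir)
print(values)
```

Vegetated pixels reflect strongly in the near-infrared and weakly in the red, so they have NDVI values close to 1, whereas built-up surfaces and water have values near or below 0.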

Reference data
We applied a binary approach; thus, reference data were collected for two classes, vegetation and non-vegetation. We gathered the data from the same locations in both images. In the vegetation class, we included pixels from both trees and herbaceous plants. The non-vegetation class incorporated more diverse land cover types, e.g. buildings, roads, and water surfaces.

Classification methods
Analyses were conducted with two supervised classification algorithms: Random Forest (RF) and Support Vector Machine (SVM). RF is an ensemble learning method that uses multiple (e.g. 100-500) decision trees to make predictions. Class labels are assigned based on the majority vote of the decision trees (Belgiu and Drăguţ, 2016; Breiman, 2001). The basic idea behind SVM is to find a line (in more than two dimensions, a hyperplane) that separates the classes. Because infinitely many such hyperplanes may exist, the algorithm seeks the optimal one by maximizing the margin between the hyperplane and the nearest training samples of each class, the support vectors (Chapelle et al., 1999).
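The two classifiers can be illustrated with a short sketch based on scikit-learn. The study does not state which software, kernel, or hyperparameters were used, so the library, the RBF kernel, the 100 trees, the 70/30 split, and the synthetic pixel values below are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400  # synthetic pixels per class (assumption)

# Made-up reflectances in band order blue, green, red, near-infrared.
veg = rng.normal([0.04, 0.07, 0.05, 0.40], 0.03, size=(n, 4))
non_veg = rng.normal([0.12, 0.13, 0.15, 0.18], 0.03, size=(n, 4))
X = np.clip(np.vstack([veg, non_veg]), 0.001, None)
y = np.r_[np.ones(n, dtype=int), np.zeros(n, dtype=int)]  # 1 = vegetation

# Optional fifth predictor: NDVI computed from the red and NIR columns.
ndvi = (X[:, 3] - X[:, 2]) / (X[:, 3] + X[:, 2])
X_ndvi = np.column_stack([X, ndvi])

X_train, X_test, y_train, y_test = train_test_split(
    X_ndvi, y, test_size=0.3, random_state=0, stratify=y
)

# Random Forest: majority vote over 100 decision trees.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# SVM: maximum-margin classifier; inputs are standardized first.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_train, y_train)

# Continuous scores per test pixel, suitable for a later ROC analysis.
rf_scores = rf.predict_proba(X_test)[:, 1]
svm_scores = svm.decision_function(X_test)

print("RF accuracy:", rf.score(X_test, y_test))
print("SVM accuracy:", svm.score(X_test, y_test))
```

Dropping the last column of `X_ndvi` gives the bands-only input set, so the same script can be run for both sets of predictors.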

Evaluation of the classification performance
In this study, we used ROC curves to evaluate the accuracy of our classifications. A ROC curve represents the tradeoff between the false-positive and true-positive rates across decision thresholds (McClish, 1989). The Area Under the Curve (AUC) summarizes the classification quality in a single value, where 0.5 corresponds to random assignment and 1 to perfect separation of the classes.
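To make the construction of the ROC curve and the AUC concrete, the sketch below computes both from scratch with NumPy. It is an illustrative implementation under simple assumptions (binary labels with 1 for vegetation, higher scores meaning more vegetation-like), not the procedure used in the study; the function name and the six example pixels are hypothetical.

```python
import numpy as np

def roc_auc(labels, scores):
    """Return (fpr, tpr, auc) for binary labels (1 = positive) and scores."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)

    # Sort pixels from the highest to the lowest score.
    order = np.argsort(-scores, kind="mergesort")
    labels, scores = labels[order], scores[order]

    # One ROC point per distinct score, so tied pixels move the curve together.
    last_of_tie = np.r_[np.flatnonzero(np.diff(scores)), labels.size - 1]
    tp = np.cumsum(labels)[last_of_tie]      # true positives at each threshold
    fp = np.cumsum(1 - labels)[last_of_tie]  # false positives at each threshold

    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    tpr = np.r_[0.0, tp / n_pos]
    fpr = np.r_[0.0, fp / n_neg]

    # Trapezoidal rule for the area under the (fpr, tpr) curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc

# Six hypothetical reference pixels: 1 = vegetation, 0 = non-vegetation.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr, auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of the 9 vegetation/non-vegetation pairs are ranked correctly
```

The AUC equals the probability that a randomly chosen vegetation pixel receives a higher score than a randomly chosen non-vegetation pixel, which is why the example yields 8/9.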

Statistical analysis
We determined the correlations among the satellite bands and NDVI with the Pearson correlation test; furthermore, as a quantified comparison tool, we used Cronbach's alpha, a measure of internal consistency (0 indicates a lack of correlation, and as the shared covariance increases, Cronbach's alpha approaches 1).
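As an illustration of these two measures, the sketch below computes a Pearson correlation matrix and Cronbach's alpha with NumPy. It uses the standard variance-based formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of the summed score); whether the study used this raw form or a standardized variant is not stated, and the synthetic variables are assumptions standing in for the four bands and NDVI.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_samples, k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1)        # variance of each variable
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the row sums
    return k / (k - 1) * (1.0 - item_var.sum() / total_var)

rng = np.random.default_rng(0)
n_pixels = 500

# Synthetic stand-in for five predictors sharing a common signal (assumption).
shared = rng.normal(size=(n_pixels, 1))
items = shared + 0.8 * rng.normal(size=(n_pixels, 5))

pearson = np.corrcoef(items, rowvar=False)  # 5 x 5 Pearson correlation matrix
alpha = cronbach_alpha(items)
print(np.round(pearson, 2))
print(round(alpha, 3))
```

When the variables share little covariance, the variance of the summed score is close to the sum of the individual variances and alpha is near 0; identical variables give exactly 1.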
We evaluated the AUC values (as the dependent variable) with General Linear Modelling (GLM) and determined the variance explained by the following factors (as independent variables): satellite type, classifier, and the use of NDVI as an additional variable in the classification. We also determined the statistical interactions among the factors (i.e. to reveal whether one factor influences the effect of another). Effect sizes (ω²) were calculated to quantify the contribution of the variables to the model (Field, 2013; Rotigliano et al., 2018).
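The effect size can be sketched for a balanced three-factor design as follows. The code uses the common definition ω² = (SS_effect - df_effect × MS_error) / (SS_total + MS_error) and computes only the main effects. The AUC values are simulated: the offsets of 0.018, 0.007, and 0.011 reuse the mean differences reported in this paper, while the baseline, the noise level, the 50 replicates per cell, and the balanced layout are assumptions, so the printed values are not the study's results.

```python
import numpy as np

def omega_squared(ss_effect, df_effect, ss_error, df_error, ss_total):
    """omega^2 = (SS_effect - df_effect * MS_error) / (SS_total + MS_error)."""
    ms_error = ss_error / df_error
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)

rng = np.random.default_rng(1)
reps = 50  # replicates per cell (assumption)

# Axes: satellite (Planet, SkySat), classifier (RF, SVM), NDVI (no, yes), replicate.
level = np.arange(2)
auc = (
    0.96
    + 0.018 * level[:, None, None, None]   # satellite offset
    + 0.007 * level[None, :, None, None]   # classifier offset
    + 0.011 * level[None, None, :, None]   # NDVI offset
    + rng.normal(0.0, 0.01, size=(2, 2, 2, reps))  # made-up noise
)

grand_mean = auc.mean()
ss_total = ((auc - grand_mean) ** 2).sum()

# Error term of the full factorial model: variation within the 8 cells.
cell_means = auc.mean(axis=3, keepdims=True)
ss_error = ((auc - cell_means) ** 2).sum()
df_error = auc.size - 8

def main_effect_ss(axis):
    """Sum of squares of one factor's main effect (valid for balanced designs)."""
    other_axes = tuple(a for a in range(4) if a != axis)
    level_means = auc.mean(axis=other_axes)
    n_per_level = auc.size / auc.shape[axis]
    return n_per_level * ((level_means - grand_mean) ** 2).sum()

results = {
    name: omega_squared(main_effect_ss(axis), 1, ss_error, df_error, ss_total)
    for axis, name in enumerate(["satellite", "classifier", "ndvi"])
}
for name, value in results.items():
    print(name, round(value, 3))
```

Unlike the p-value, ω² does not grow with the number of observations, which makes it a useful complement when the sample is large and even small differences become significant.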

Classification by spectral bands and by involving NDVI
Evaluating the classification performance on the Planet data with ROC curves (Fig. 1) showed that, when classifying with the spectral bands, SVM reached an AUC of 0.97, better than RF (0.96). When NDVI was included in the classification, RF reached an AUC of 0.98 (Fig. 3), better than SVM (0.97). In the case of SkySat, the RF classification with the spectral bands had an AUC of 0.98. The same classification using SVM, and the classifications including NDVI with both classifiers, reached an AUC of 0.99 (Figs. 2 and 4). Figure 5 shows samples of the classified maps.

Correlations among the bands and NDVI
NDVI correlated differently with the original bands of the two satellites (Fig. 6). While the differences in correlation were minimal (0.01) for bands B1-B3, they were ~0.1 for B4 and 0.2-0.3 between NDVIs. Cronbach's alpha was 0.819 for the Planet and 0.827 for the SkySat images. Accordingly, both values indicated high consistency, and the value for SkySat was slightly higher.

Effects of factorial variables on the AUC
GLM revealed that satellite type, classifier, and the inclusion or omission of NDVI explained 31.6% (adjusted R² = 0.316) of the variance. Although most of the factors and their interactions were significant (except the interaction of satellite type and classifier, and the interaction of satellite type and the inclusion of NDVI), effect sizes indicated a large effect only for satellite type, and all other factors had a small effect (Table 1). Comparison by factors (Fig. 7) showed that, on average, SkySat had 0.018 higher AUC values than Planet (t = -9.88, df = 392, p<0.001), the inclusion of NDVI improved AUC by 0.011 (t = -5.77, df = 392, p<0.001), and SVM provided 0.007 better AUC than RF (t = -4.01, df = 392, p<0.001).

Discussion
Classifying the Planet and SkySat data showed that SVM provided slightly better AUC values than RF when classifying with the spectral bands. When NDVI was included in the classification, RF provided slightly better results for Planet. For SkySat, all classifications except the RF classification of the spectral bands reached an AUC of 0.99, the best outcome in this evaluation. Including NDVI improved the classification performance in almost all cases. SkySat performed better than Planet, although the RF classification of the Planet data with NDVI reached the same AUC (0.98) as the RF classification of the SkySat spectral bands.
Several studies have shown that RF outperforms SVM, at least slightly (Schlosser et al. 2020; Liu et al. 2013); in this case, however, SVM provided better accuracy. The difference in AUC was small (0.007) but significant. This is a case in which statistical significance alone is not very informative: the difference is real, but its magnitude is rather small; thus, effect size can be a more useful metric than the p-value. The ω² indicated a small effect; accordingly, the two classifiers can be characterized as having nearly the same accuracy.
These satellites are relatively new; thus, the literature on them is limited, and there is little experience with their applicability. Zeng et al. (2018) applied Planet imagery in South Asia to reveal the expansion of croplands at the expense of forests; however, the Planet images were used as auxiliary data. Olthof and Svacina (2020) applied Planet images in flood mapping, using them to determine the maximum flood extent as a validation dataset for their flood simulations. Shendryk et al. (2019) also found Planet images efficient for filtering out clouds and shadows in land cover mapping. Planet images were successfully applied in oil spill detection (Park et al. 2019) and in vegetation mapping (Gašparović et al. 2018); in the latter case, Planet performed ~5% better than Sentinel-2. Terra Bella's SkySat (a previous generation of this type of satellite, with 2 m resolution) was used on smallholder (<0.3 ha) plots to predict crop yield and provided suitable data (Jain et al. 2016). In general, and in line with our findings, previous studies found microsatellites useful in environmental mapping.
The contribution of NDVI resulted in a significant difference in AUC values; nevertheless, this difference was only ~1%. In this case, the ω² was 0.05, which indicated a medium effect. NDVI had different correlations with the original bands, and in the case of SkySat the correlations were only moderate. However, the interaction between satellite type and the inclusion of NDVI in the set of predictors had only a very small effect (ω²=0.005). This means that, in spite of the differences in the correlation matrices (i.e. NDVI provided different predictors for the two satellites), the effect of NDVI on the AUC did not differ substantially between the satellites; the interaction was not significant (p=0.054).

Conclusions
In this study, PlanetScope and SkySat satellite images were compared in terms of their capability to classify urban vegetation. The study area was the city centre of Rome, Italy. We classified the images with the RF and SVM classification methods, using the original bands of the satellites and the NDVI. We found that the satellite bands resulted in better outcomes with SVM than with RF, and that adding NDVI provided higher AUC values. AUC values were slightly better (0.04) with the SkySat imagery than with Planet, for which the AUC was also high (AUC=0.96). While the different classification methods did not result in substantially differing outcomes (AUC values between 0.96 and 0.99), the contribution of NDVI caused significantly higher AUC.