Introduction
The core grain-producing area has become part of China's national strategy. Henan Province, its most important component, is in a critical period of promoting the "coordination of the four modernizations and scientific development". Agriculture is the basis for realizing modernization, so it is necessary to speed up the transformation of the agricultural development mode and to improve agricultural quality, efficiency and competitiveness, following a path to agricultural modernization that features high output efficiency, product safety, resource conservation and environmental friendliness. However, the sustainable development of agriculture in Henan is restricted by three important bottlenecks: a large population with little land, a serious shortage of reserve cultivated land resources, and the overall low quality of cultivated land. Therefore, strengthening the construction of well-facilitied capital farmland is of great significance for implementing the national strategy and promoting agricultural production.
Well-facilitied capital farmland refers to basic farmland that is formed through rural land consolidation and construction over a certain period of time, that is concentrated and contiguous, equipped with supporting facilities, high and stable in yield, ecologically sound and strongly resistant to disasters, and that is compatible with modern agricultural production and operation modes (TD/T 1033-2012). At present, most research on well-facilitied capital farmland construction addresses construction area demarcation (LI Yilong, et al., 2019; DONG Fei, et al., 2020; ZHANG Hebing, et al., 2018), potential evaluation (CAI Xiangwen, et al., 2019; LI C M, et al., 2018), suitability evaluation (TANG Feng, et al., 2019; CHEN Lin, et al., 2019; ZHANG Jing, et al., 2020), construction sequence and mode zoning (LI Long, et al., 2020; WANG Ke, et al., 2021; ZENG Ya, et al., 2020), and project implementation and effect evaluation (MA Xueying, et al., 2018; XIONG Yufei, et al., 2019; WANG Xiaoqing, et al., 2018). The Standard of Well-facilitied Capital Farmland Construction (TD/T 1033-2012) clearly requires that "the quality of cultivated land after completion reaches the higher level of the county". Therefore, while field engineering projects are being built, the soil quality of well-facilitied capital farmland should also reach the higher level of the region. In addition, "the amount of fertilizer should be determined according to soil nutrient status; soil nitrogen, phosphorus, potassium, medium and trace elements, organic matter content, soil acidification, salinity and other conditions should be regularly monitored; and the fertilization formula should be constantly adjusted according to the actual situation" (NY/T 2148-2012).
However, in the process of well-facilitied capital farmland construction, more emphasis has been placed on land leveling and on improving supporting facilities such as roads and ditches; the rapid acquisition and real-time monitoring of basic soil information should be strengthened at the same time. Most existing research methods perform quantitative inversion of single soil properties (BIAN Zijin, et al., 2021; WEI Lifei, et al., 2020; LI Zhiyuan, et al., 2021; YU Huan, et al., 2021; Schreiner Simon, et al., 2021; SHI Yuanyuan, et al., 2021; XU Xitong, et al., 2020). Inversion models relating each soil property to spectral reflectance have to be built one by one, and the calculation process is complicated and time-consuming. In contrast, a panel data model can build a three-dimensional data model that simultaneously contains multiple soil properties at multiple sampling points and the values of the hyperspectral characteristic bands. Multiple soil properties are then obtained from a single inversion modeling step, the modeling calculation is simpler, and the relationships among soil properties and the influence of the hyperspectral characteristic band values on each soil property can be analyzed from the individual models. Therefore, it is necessary to study rapid and nondestructive detection of soil attribute information in well-facilitied capital farmland construction areas, so as to provide technical support for the rapid acquisition and real-time monitoring of soil attributes and to support the optimization of well-facilitied capital farmland construction areas.
This research took the well-facilitied capital farmland construction area of Xinzheng City as the research object. Soil hyperspectral data were obtained in laboratory experiments with an ASD FieldSpec 3 ground-object spectrometer and combined with soil properties including soil pH, organic matter, nitrogen, phosphorus, potassium, Fe, Cr, Cd, Cu, Zn and Pb. Savitzky-Golay (SG) filtering and continuum removal (CR) spectral transformation were applied to the original spectral reflectance. Correlation analysis and the fuzzy clustering maximum tree method were then used to select the bands that are significant for all the different soil attributes as the best hyperspectral characteristic bands. This paper attempts to establish a comprehensive hyperspectral inversion model of cultivated land soil attributes using a panel data model, to estimate the influence of hyperspectral characteristic band values on each soil attribute, and to predict the content of each soil attribute, aiming to provide theoretical and technical support for the rapid acquisition and real-time monitoring of soil attributes in well-facilitied capital farmland construction areas.
1. Materials and Methods
1.1. Overview of the study area
Xinzheng City is located in the central part of Henan Province, China, on the North China Plain, in the transition zone from the western Henan mountains to the eastern Henan plain. It is a core area of the Central Plains Economic Zone and is administered by Zhengzhou City. It lies at 34°16′~34°39′ N, 113°30′~113°54′ E, and is bordered by the provincial capital Zhengzhou to the north, Zhongmu County and Weishi County to the east, Changge City and Yuzhou City to the south, and Xinmi City to the west. From Xinzheng it is 38 km north to Zhengzhou City; 45.6 km northeast to Zhongmu County and 120 km to downtown Kaifeng; 42.6 km east to Weishi County; 20.4 km south to Changge City and 40 km to Xuchang City; 36.5 km southwest to Yuzhou City and 84 km to Pingdingshan City; and 34.5 km west to the Xinmi urban area. The city is 42 km long from north to south and 36 km wide from east to west. It covers an area of 873 km² and has a total population of 653,000. In 2019, it had jurisdiction over 9 towns, 1 township, 3 subdistricts, 253 administrative villages, 921 natural villages and 24 residential communities.
According to the 2013 land use survey, the total land area of Xinzheng City is 884.5915 km², of which cultivated land is 521.7641 km², accounting for 58.59% of the total land area. The total annual grain output is 273,148 t. According to the Integrated Land-use Planning of Xinzheng City (2010-2020), the prime farmland protection target of Xinzheng City is 427.73 km². Xinzheng City has a warm temperate continental monsoon climate with moderate temperatures and four distinct seasons; the main disastrous weather events are drought, flood, wind and hail. The average annual temperature is 14.2 ℃, the average annual precipitation is 676.1 mm, the average annual evaporation is 1476.2 mm, the average annual sunshine duration is 2,114.2 h, the average annual number of frost days is 67, the average annual total water resources are 147.73 million m³, and the per capita water resources are 236 m³. There are various soil types, mainly cinnamon soil, tidal soil and aeolian sand soil. The terrain is high in the west and low in the east, with shallow hills in the west, plains in the east and hills in the northwest.
1.2. Field collection of soil samples
Sampling points were laid out on a 2 km × 2 km regular grid, according to the soil types, topographic characteristics and spatial variation characteristics of the study area and taking into account the integrity of administrative units (towns or villages). Each point in the spatial database included basic information such as its serial number, latitude and longitude coordinates, township and neighboring villages. Based on the sampling point map and the point attribute table, GPS was used to locate each field sampling site accurately. The sampling depth was 0-30 cm from the soil surface, and the coordinates of the actual sampling points and detailed characteristic information of each site were recorded. A total of 154 soil samples were collected, and foreign matter such as plant roots, stem residues, and brick and tile fragments was removed. After natural air drying, grinding and passing through a 1 mm sieve, each sample was divided into four parts by the quartering method and two portions were retained: one was used for laboratory determination of physical and chemical properties, and the other for soil spectral measurement. The main soil properties measured in this study were soil pH, SOM, AN, AP, AK, Fe, Cr, Cd, Zn, Cu and Pb. The determinations were carried out according to the Regional Geochemical Sample Analysis Method (DZ/T 0279-2016). To ensure analytical quality, national geochemical standard samples were used for quality control.
1.3. Laboratory measurement of sample spectra
Soil spectral reflectance of the treated soil samples was measured indoors with an ASD FieldSpec 3 spectrometer produced by ASD, USA. The spectral range is 350-2500 nm, with a sampling interval of 1.4 nm for 350-1000 nm and 2 nm for 1000-2500 nm, and a resampling interval of 1 nm. Before spectral measurement, the soil surface was scraped flat in one direction along the edge of the sample dish with a ruler, and the filled sample dish was placed on a black rubber mat with a reflectivity of approximately 0. A 50 W halogen lamp was used as the light source; the probe field-of-view angle was 25°, the light incidence angle was 45°, the distance from the light source was 15 cm, and the distance from the probe was 15 cm. To reduce the influence of soil anisotropy on the sample spectra, the sample dish was rotated 3 times during measurement, by about 90° each time, so that spectra were obtained in four directions. Reference panel calibration was performed before and after each target spectrum acquisition. The measurement was repeated 5 times in each direction, giving 20 spectra in total, and ViewSpec Pro software was used to average them into the original spectral reflectance. Because the two ends of the measured range (near 350 nm and 2500 nm) are unstable regions of the spectral data, the 350-399 nm and 2401-2500 nm data, which are strongly affected by external noise, were removed.
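The averaging and band-trimming step can be illustrated with a short sketch. It is not the ViewSpec Pro procedure itself; it assumes the 20 scans of one sample are already available as an array resampled at 1 nm from 350 to 2500 nm, and the function name `average_and_trim` and the synthetic data are hypothetical.

```python
import numpy as np

def average_and_trim(spectra, wl_start=350, keep=(400, 2400)):
    """Average repeated scans and keep wavelengths inside `keep` (inclusive, nm)."""
    spectra = np.asarray(spectra, dtype=float)           # shape: (scans, bands)
    wavelengths = np.arange(wl_start, wl_start + spectra.shape[1])
    mean_spectrum = spectra.mean(axis=0)                 # average over the scans
    mask = (wavelengths >= keep[0]) & (wavelengths <= keep[1])
    return wavelengths[mask], mean_spectrum[mask]

# Synthetic stand-in for 4 directions x 5 repeats, 350-2500 nm at 1 nm steps.
rng = np.random.default_rng(0)
scans = 0.3 + 0.01 * rng.standard_normal((20, 2151))
wl, refl = average_and_trim(scans)
print(wl[0], wl[-1], refl.shape)
```

Trimming to 400-2400 nm leaves 2001 bands per sample, which is the range used by the later processing steps.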
1.4. Model establishment and accuracy test
1.4.1. Fuzzy clustering maximum tree method
Fuzzy theory was developed on the mathematical basis of fuzzy set theory, established in 1965 by the American cybernetics expert Professor L. A. Zadeh, and has been widely used in mathematics and many other fields (LIU Qi, et al., 2004). Fuzzy clustering analysis is a multivariate technique that classifies objective things using fuzzy mathematics, establishing a fuzzy similarity relation according to the characteristics, similarity and affinity of the objects (LI Hongxing, et al., 1994; WANG Peizhuang, 1983). Because real-world classification is often accompanied by fuzziness, fuzzy clustering theory is more consistent with objective reality.
The basic steps of Fuzzy clustering analysis using the maximum tree method are as follows:
(1) Establish sample set matrix
Suppose the sample set is $X = \{x_1, x_2, \ldots, x_n\}$, where n is the number of samples. Each sample is represented by an m-dimensional vector, that is, each sample has m indicators: $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$, $i = 1, 2, \ldots, n$.
(2) Establish fuzzy similarity matrix
According to the given sample characteristic data, the correlation coefficient method is used to establish the fuzzy similarity matrix $R = (r_{ij})_{n \times n}$, where $r_{ij}$ is the similarity coefficient between samples $x_i$ and $x_j$.
(3) Maximum tree generation
Taking the classified objects $x_i$ as vertices and the coefficients $r_{ij}$ in the fuzzy similarity matrix R as edge weights, edges are added in descending order of weight, skipping any edge that would create a loop (i.e., a cycle), until all vertices are connected. The resulting special graph is the maximum tree (which may not be unique).
(4) Clustering
Select an appropriate threshold λ and cut off the branches whose weights are less than λ to obtain a disconnected graph. Each connected component constitutes one class at level λ, and the number of components is the number of classes.
In this paper, the fuzzy similarity coefficients between the correlation coefficient curves of soil attributes and spectral indexes were calculated by the systematic clustering method, and a fuzzy similarity matrix was constructed to determine how similar these correlation coefficient curves are for different soil attributes. On this basis, the common hyperspectral inversion bands of different soil attributes were determined by the maximum tree classification method.
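The four steps of the maximum tree method can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the similarity coefficient is the absolute Pearson correlation between sample vectors (the paper names the correlation coefficient method but its exact formula is not reproduced here). The function names and the small 4 × 4 matrix are hypothetical.

```python
import numpy as np

def fuzzy_similarity(samples):
    """Steps 1-2: r_ij as |Pearson correlation| between rows (an assumption)."""
    return np.abs(np.corrcoef(np.asarray(samples, dtype=float)))

def _find(parent, a):
    """Root of the component containing vertex a (with path compression)."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a

def maximum_tree(R):
    """Step 3: add edges in descending weight order, skipping any that close a loop."""
    n = len(R)
    parent = list(range(n))
    edges = sorted(((R[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    tree = []
    for w, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:                      # joining two components creates no loop
            parent[ri] = rj
            tree.append((w, i, j))
    return tree

def cut_tree(tree, n, lam):
    """Step 4: drop branches with weight < lam; components are the classes."""
    parent = list(range(n))
    for w, i, j in tree:
        if w >= lam:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for v in range(n):
        groups.setdefault(_find(parent, v), []).append(v)
    return sorted(groups.values())

# Hypothetical similarity matrix for four objects.
R = [[1.0, 0.9, 0.3, 0.2],
     [0.9, 1.0, 0.4, 0.1],
     [0.3, 0.4, 1.0, 0.8],
     [0.2, 0.1, 0.8, 1.0]]
tree = maximum_tree(R)
print(cut_tree(tree, 4, 0.76))
```

Raising λ splits the objects into more classes and lowering it merges them, so the threshold controls how fine the classification is.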
1.4.2. Panel data model
Panel data are also called parallel data, time series and cross-section data, or pooled data. They are sample data formed by taking multiple cross sections along a time series and simultaneously selecting sample observations on these cross sections (SUN Jingshui, 2010). Viewed along a cross section, panel data are the observations of several individuals at a certain moment; viewed along the longitudinal section, they form a time series. According to these characteristics, the values of the soil properties and their hyperspectral characteristic bands at one sampling point can be regarded as a cross section, and the sequence of sampling points plays the role of the time series. By constructing a panel data model, a comprehensive inversion model of all soil properties can be established at once, without inverting each index individually, which avoids the tedious process of multi-index inversion (ZHANG Qiuxia, et al., 2017).
Because the number of sample points T is large and the number of cross sections N is small, a fixed effects model was adopted, and ordinary least squares (OLS) estimation was selected to build the panel data model. The type of panel data model (invariant coefficient model, variable intercept model or variable coefficient model) was then determined by analysis of covariance. To reduce the impact of heteroscedasticity, natural logarithms of the variables were taken on both sides of the model equation, giving the panel data model:
$$\ln y_{it} = \alpha_i + \sum_{j=1}^{k} \beta_{ji} \ln x_{jit} + u_{it}, \quad i = 1, 2, \ldots, N; \; t = 1, 2, \ldots, T$$
Where:
$y_{it}$ – value of the explained variable on cross section i and sample t, namely soil heavy metal element content
$\alpha_i$ – constant term or intercept term, representing the influence of the individual of cross section i
$\beta_{ji}$ – model parameter of the j-th explanatory variable on cross section i
$x_{jit}$ – value of the j-th explanatory variable on cross section i and sample t, namely the reflectance of the hyperspectral characteristic band of soil heavy metals
$u_{it}$ – random error term on cross section i and sample t
k – number of explanatory variables
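A log-log panel model with a separate intercept and separate slopes for each cross section (the variable coefficient form) can be estimated by OLS as sketched below. This is an illustration under stated assumptions, not the software used in the study. It assumes all values are positive and that the explanatory variables (band reflectances) are shared by all cross sections at a given sample point. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_panel(Y, X):
    """OLS fit of ln y_it = alpha_i + sum_j beta_ji * ln x_jt + u_it.

    Y: (N, T) positive responses, one row per cross section (soil attribute).
    X: (T, k) positive regressors (band reflectances), shared by all rows of Y.
    Returns alpha with shape (N,) and beta with shape (N, k).
    """
    ln_y = np.log(np.asarray(Y, dtype=float))
    ln_x = np.log(np.asarray(X, dtype=float))
    design = np.column_stack([np.ones(ln_x.shape[0]), ln_x])   # (T, k + 1)
    coef, *_ = np.linalg.lstsq(design, ln_y.T, rcond=None)     # (k + 1, N)
    return coef[0], coef[1:].T

def predict_panel(alpha, beta, X):
    """Back-transform the fitted log model to the original content scale."""
    ln_x = np.log(np.asarray(X, dtype=float))
    return np.exp(alpha[:, None] + beta @ ln_x.T)              # (N, T)

# Noise-free synthetic example: N = 2 cross sections, T = 50 samples, k = 3 bands.
rng = np.random.default_rng(1)
X = rng.uniform(0.1, 0.9, size=(50, 3))
alpha_true = np.array([0.5, -1.0])
beta_true = np.array([[1.0, -2.0, 0.5], [0.3, 0.0, 2.0]])
Y = np.exp(alpha_true[:, None] + beta_true @ np.log(X).T)
alpha_hat, beta_hat = fit_panel(Y, X)
print(np.round(alpha_hat, 3), np.round(beta_hat, 3))
```

Because every cross section has its own coefficients here, the fit is equivalent to one regression per soil attribute solved in a single least-squares call; the variable intercept and invariant coefficient forms would constrain the slopes to be equal across cross sections.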
1.4.3. Accuracy test method of inversion model
The calibration set determination coefficient $R^2$ and root mean square error of calibration (RMSEC) are used to verify the modeling accuracy. The validation set is tested with the validation set determination coefficient $R_v^2$, the root mean square error of prediction (RMSEP) and the relative percent deviation (RPD), where RPD is the ratio of the standard deviation of the validation set to its RMSEP. When RPD > 2.5, the model has excellent predictive ability. When 2.0 < RPD ≤ 2.5, the model has good quantitative prediction ability. When 1.8 < RPD ≤ 2.0, the model has quantitative prediction ability. When 1.4 < RPD ≤ 1.8, the model has general quantitative prediction ability. When 1.0 < RPD ≤ 1.4, the model can only distinguish high values from low values. When RPD ≤ 1.0, the model has no predictive ability (Rossel RAV, et al., 2007). For the calibration set, the larger $R^2$ and the smaller RMSEC, the higher the modeling accuracy and the more stable the model. For the validation set, the larger $R_v^2$ and RPD and the smaller RMSEP, the higher the prediction accuracy.
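These accuracy measures are simple to compute. The sketch below is illustrative only: it assumes the sample standard deviation (n - 1 denominator) in RPD, the function names are hypothetical, and the observed and predicted values are made up.

```python
import math
import statistics

def rmse(obs, pred):
    """Root mean square error (RMSEC or RMSEP, depending on the data set)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def rpd(obs, pred):
    """Standard deviation of the observations divided by the RMSE.
    Assumption: sample standard deviation (n - 1 denominator)."""
    return statistics.stdev(obs) / rmse(obs, pred)

def rpd_class(value):
    """Map an RPD value to the prediction-ability classes listed in the text."""
    if value > 2.5:
        return "excellent"
    if value > 2.0:
        return "good"
    if value > 1.8:
        return "quantitative"
    if value > 1.4:
        return "general"
    if value > 1.0:
        return "high/low discrimination only"
    return "no predictive ability"

# Made-up validation values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(r_squared(observed, predicted), rmse(observed, predicted),
      rpd_class(rpd(observed, predicted)))
```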
2. Results and discussion
2.1. Spectral pretreatment
During the acquisition and transmission of spectral signals by the ASD spectrometer, the recorded signal contains not only the spectral information of the soil itself but also noise from the spectrometer and interference from external factors, so the spectral curves may contain many "burr" noises and the signal-to-noise ratio is reduced. To obtain a stable spectrum and improve the signal-to-noise ratio, the spectral data need to be smoothed. The Savitzky-Golay (SG) convolution smoothing method was proposed by Savitzky and Golay in 1964 (Savitzky A, et al., 1964). It is a weighted average method that obtains each smoothed point by a least-squares polynomial fit to the measured data within a moving window, and it is widely used at present. SG filtering requires the selection of an appropriate number of smoothing points and polynomial order. The more smoothing points are taken, the smoother the spectral curve becomes, but the more information is lost. In this study, SG smoothing based on a 9-point quadratic polynomial was adopted. Smoothing and denoising were carried out with the transform tool of Unscrambler 9.7, and the result is shown in Figure 1.
To highlight the smoothing effect, the 2000-2400 nm section of the curves was enlarged (Figure 1b). Comparing the details before and after SG smoothing shows that SG smoothing effectively removes noise while preserving the overall characteristics of the spectral curves.
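The 9-point quadratic SG filter can be reproduced outside Unscrambler with a few lines of NumPy. The sketch below is not the Unscrambler implementation. Its edge handling (fitting one polynomial to the first and last window) is an assumption, and the function names are hypothetical. SciPy's `scipy.signal.savgol_filter(y, 9, 2)` offers a ready-made alternative.

```python
import numpy as np

def sg_coefficients(window=9, order=2):
    """Least-squares weights that give the fitted value at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    V = np.vander(offsets, order + 1, increasing=True)   # columns: 1, z, z^2, ...
    return np.linalg.pinv(V)[0]                          # row for the constant term

def sg_smooth(y, window=9, order=2):
    """Savitzky-Golay smoothing of a 1-D spectrum sampled at equal intervals."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    weights = sg_coefficients(window, order)
    out = np.convolve(y, weights[::-1], mode="same")     # moving-window fit
    # Edge handling (assumption): evaluate a polynomial fitted to the end windows.
    z = np.arange(window)
    head = np.polyfit(z, y[:window], order)
    tail = np.polyfit(z, y[-window:], order)
    out[:half] = np.polyval(head, z[:half])
    out[-half:] = np.polyval(tail, z[-half:])
    return out

# Synthetic noisy reflectance segment (2000-2400 nm at 1 nm), for illustration.
rng = np.random.default_rng(2)
wl = np.arange(2000, 2401)
clean = 0.35 + 0.05 * np.sin((wl - 2000) / 60.0)
noisy = clean + 0.005 * rng.standard_normal(wl.size)
smoothed = sg_smooth(noisy)
print(np.std(noisy - clean), np.std(smoothed - clean))
```

A wider window or a lower polynomial order smooths more strongly but flattens narrow absorption features, which is the trade-off described above.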
2.2. Continuum removal
To find the sensitive relationship between soil heavy metal content and spectral reflectance, continuum removal (CR) spectral transformation is applied after SG smoothing. The envelope (continuum) removal method was first proposed as a spectral analysis method by Clark and Roush in 1984 (Clark R N, et al., 1984). The envelope is defined as the set of straight line segments that connect, point by point along the wavelength axis, the "peak" points where the reflectance or absorption curve protrudes, such that the exterior angle of the broken line at each "peak" is greater than 180° (TONG Qingxi, et al., 2006). Continuum removal takes the ratio of the actual spectral reflectance to the envelope reflectance at the corresponding band, which normalizes the spectral values to 0~1 (LI Shumin, et al., 2011), so the absorption and reflection characteristics of the spectral curve are effectively highlighted and the characteristic bands can be extracted. Through proper spectral transformation, the influence of various noises can be reduced or even eliminated, the spectral sensitivity can be improved, and the prediction ability and stability of the calibration model can be improved. In this study, the continuum-removed spectra were obtained by building a spectral library in ENVI 4.8, as shown in Figure 2.
The CR reflectance curve not only enhances the spectral absorption features of the original spectral curve at 1400 nm, 1900 nm and 2200 nm, but also highlights the weak absorption features at 410 nm, 500 nm and 700 nm. This shows that the continuum removal transformation enhances the weak absorption information of the original spectral curve and improves the signal-to-noise ratio, which helps the extraction of effective characteristic bands.
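Continuum removal can be sketched as an upper convex hull followed by a band-wise ratio. This is a common way to implement the definition and is offered as an illustration, not as the ENVI routine. It assumes strictly increasing wavelengths and positive reflectance; the function names and the five-point spectrum are hypothetical.

```python
import numpy as np

def upper_hull(x, y):
    """Indices of the upper convex hull of (x, y), with x strictly increasing."""
    hull = []
    for i in range(len(x)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross >= 0:        # b lies on or below the segment a -> i
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

def continuum_removed(wavelength, reflectance):
    """Reflectance divided by its envelope; values fall in (0, 1]."""
    x = np.asarray(wavelength, dtype=float)
    y = np.asarray(reflectance, dtype=float)
    idx = upper_hull(x, y)
    envelope = np.interp(x, x[idx], y[idx])   # straight lines between hull peaks
    return y / envelope

# Tiny hypothetical spectrum with two absorption dips.
wl = [0, 1, 2, 3, 4]
refl = [1.0, 0.5, 1.0, 0.4, 0.8]
cr = continuum_removed(wl, refl)
print(cr)
```

Points on the envelope map to 1 and absorption dips map to values below 1, so weak absorptions on a sloping background become easier to compare.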
2.3. Selection of common spectral characteristic bands for soil properties
Starting from the significant bands of the soil properties in the Xinzheng well-facilitied capital farmland construction area, and considering the needs of spectral inversion for different soil properties, the fuzzy clustering maximum tree method was combined with the similarity and inflection points of the correlation coefficient curves to determine the best common bands for hyperspectral inversion of soil properties in the area.
A comparative analysis of the correlation coefficient curves between the 11 soil attributes and the SG-CR transformed spectra of the Xinzheng City well-facilitied capital farmland area (Figure 3) shows that, under the same spectral transformation, the correlation coefficient curves have similar inflection points and are highly similar to one another.
A fuzzy similarity matrix was constructed from the correlation coefficient curves of the 11 soil attributes under the SG-CR spectral transformation. The row numbers of the matrix correspond, in order, to pH, SOM, AN, AK, AP, Fe, Cr, Cd, Zn, Cu and Pb.
Using the maximum tree classification method with λ = 0.76, the soil attributes were grouped into four classes: pH, AP, Cr, Cd and Pb; SOM and AN; AK, Fe and Cu; and Zn. According to the similarity and inflection points of the correlation coefficient curves, the following bands of the SG-CR spectral transformation were selected: 405 nm, 418 nm, 781 nm, 784 nm, 794 nm, 805 nm, 807 nm, 830 nm, 831 nm, 1079 nm, 1085 nm, 1251 nm, 1267 nm, 1308 nm, 1309 nm, 1410 nm, 1836 nm, 1860 nm, 1897 nm, 1898 nm, 2080 nm, 2137 nm, 2149 nm, 2156 nm, 2184 nm, 2382 nm and 2395 nm. At the same time, from the significant bands between the soil attributes and the SG-CR spectra, the bands at which the 11 soil attributes passed the significance test at P = 0.01 were selected, i.e., 406~419 nm, 421~423 nm, 427~431 nm, 1044 nm, 1062 nm, 1087 nm, 1887~1890 nm, 2118 nm, 2119 nm, 2185~2187 nm, 2198~2201 nm, 2324 nm and 2325 nm.
The significant bands and the inflection points of the correlation coefficient curves were combined, and the bands 405~419 nm, 421~423 nm, 427~431 nm, 781 nm, 784 nm, 794 nm, 805 nm, 807 nm, 830 nm, 831 nm, 1044 nm, 1062 nm, 1079 nm, 1085 nm, 1087 nm, 1251 nm, 1267 nm, 1308 nm, 1309 nm, 1410 nm, 1836 nm, 1860 nm, 1887~1890 nm, 1897 nm, 1898 nm, 2080 nm, 2118 nm, 2119 nm, 2137 nm, 2149 nm, 2156 nm, 2184~2187 nm, 2198~2201 nm, 2324 nm, 2325 nm, 2382 nm and 2395 nm were determined as the common spectral characteristic bands for spectral inversion of soil attributes in the well-facilitied capital farmland construction area of Xinzheng City.
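Combining the two band lists is a set union. The sketch below reproduces it from the band values reported in this section; the helper names `expand` and `compress` are hypothetical, and bands are treated as integer wavelengths at 1 nm spacing.

```python
def expand(spec):
    """Turn entries such as 405 or (406, 419) into a set of integer wavelengths (nm)."""
    bands = set()
    for item in spec:
        if isinstance(item, tuple):
            low, high = item
            bands.update(range(low, high + 1))
        else:
            bands.add(item)
    return bands

def compress(bands):
    """Collapse wavelengths into (start, end) runs of consecutive values."""
    runs = []
    for b in sorted(bands):
        if runs and b == runs[-1][1] + 1:
            runs[-1][1] = b
        else:
            runs.append([b, b])
    return [tuple(r) for r in runs]

# Bands selected from curve similarity and inflection points.
inflection = [405, 418, 781, 784, 794, 805, 807, 830, 831, 1079, 1085, 1251,
              1267, 1308, 1309, 1410, 1836, 1860, 1897, 1898, 2080, 2137,
              2149, 2156, 2184, 2382, 2395]
# Bands passing the P = 0.01 significance test.
significant = [(406, 419), (421, 423), (427, 431), 1044, 1062, 1087,
               (1887, 1890), 2118, 2119, (2185, 2187), (2198, 2201), 2324, 2325]

common = expand(inflection) | expand(significant)
print(len(common))
print(compress(common))
```

With these inputs the union contains 66 individual 1 nm bands, which become the explanatory variables of the inversion model.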
2.4. Construction of panel data model
The 154 samples in the study area were divided, on the basis of soil type, into a calibration set and a validation set using the Rank-KS method (content gradient method combined with Kennard-Stone) (LIU Wei, et al., 2014). The calibration set included 116 samples for constructing the soil attribute inversion model, and the validation set included 38 samples for testing the prediction accuracy of the model.
Using the common spectral characteristic bands selected from the SG-CR spectral transformation as the independent variables of the soil attribute inversion model, a panel data model of the soil attribute contents of the 116 calibration soil samples from Xinzheng City was constructed by ordinary least squares (OLS) estimation, as shown in Table 1.
The results show that the regression coefficients are significantly different from 0 and that the adjusted determination coefficient $R^2$ is 0.9991, indicating a high goodness of fit. The large F statistic indicates that the regression coefficients are jointly significant and that the regression model is significant as a whole.