Wood density estimation using dendrometric and edaphoclimatic data in artificial neural networks

Forest measurement usually targets the volumetric production of wood; for the pulp industry, however, the main interest is wood biomass productivity, and determining this variable first requires the basic wood density (BWD). Artificial neural networks (ANNs) have been used quite successfully in the forestry sector to describe forest characteristics, for example to estimate wood volume. In this context, the objective of this study was to assess the accuracy of basic wood density estimates obtained by ANNs with Continuous Forest Inventory (CFI) and edaphoclimatic input variables. The database consisted of 3,797 records from permanent CFI plots in Eucalyptus sp. stands, together with edaphoclimatic data from the planting sites. The five best ANNs were selected, and the estimates were analyzed through the correlation between estimated and observed BWD, the relative root mean square error (RMSE%) and graphical analysis. CFI information, edaphoclimatic information and the combination of both all showed potential for estimating basic wood density and gave similar results, with estimation errors between 3.5% and 3.9%. The ANNs based only on CFI information presented the highest RMSE. The use of ANNs is feasible for estimating BWD and yields excellent accuracy statistics.


INTRODUCTION
The largest areas of planted forests with fast-growing species in the world are found in Brazil (IBÁ 2020), and approximately 77% of these plantations consist of species of the genus Eucalyptus. Although these forests are mainly intended to supply the domestic and foreign markets with products such as charcoal and cellulose, they also perform important ecosystem functions, such as carbon sequestration and the reduction of pressure on native forests.
Among the factors that contributed to the choice of Eucalyptus spp. as one of the main raw materials for the industrial segments are its high growth rates, adaptability and productivity. Thus, precise and continuous monitoring of the wood volume (BURKHART & TOMÉ 2012), the biomass production and, consequently, the wood density of these plantations is necessary to maintain forest planning.
Basic wood density (BWD) has become a universal index for evaluating wood quality, since it is related to the physical, mechanical and anatomical properties of the wood (TSOUMIS 1991, KNAPIC et al. 2007). Although it is a property of high genetic heritability (ASSIS 2014, TAN et al. 2018), it is influenced by other factors, such as age, productivity and edaphoclimatic variations of the planting environment (GALLO et al. 2018, BARBOSA et al. 2019, ROCHA et al. 2020, BRITO et al. 2020).
Most forestry companies use the forest inventory to obtain the volume of the forest stand in m³. However, when the wood is destined for the production of cellulose, panels or energy, forest production should be expressed in mass of wood, which is estimated from the BWD (CAMPOS & LEITE 2013). Quantifying wood density before harvesting is important for knowing in advance the quality of the wood that will supply the industry.
There is a gap in forest science regarding techniques that can provide efficient estimates of the basic wood density of Eucalyptus species and so favor the inclusion of this property in routine forest inventories. Efforts to implement such techniques are hindered by a limited understanding of how BWD interacts with the factors that modify it.
In the forestry sector, computational and mathematical modeling tools have been used with great success in many situations to describe forest dynamics. Among these tools, Artificial Neural Networks (ANNs) stand out. An ANN is a computational system made up of simple, highly interconnected processing elements that together perform a given task. Several studies have sought to adapt and parameterize ANN techniques for different applications, such as the estimation of tree volume (GORGENS et al. 2009, SILVA et al. 2009, BINOTI et al. 2014, CORDEIRO et al. 2015), growth and production (BINOTI 2010), taper (LEITE et al. 2011, SCHIKOWSKI et al. 2015, MARTINS et al. 2017) and wood density (BOA 2018, LOPES 2018, RIBEIRO 2018).
In this context, the objectives of this study were to determine the influence of dendrometric and edaphoclimatic variables on wood density and to model this property as a function of quantitative and qualitative (cadastral) variables obtained from the Continuous Forest Inventory (CFI) and climatic information of the study area, using Artificial Neural Networks.

Description of data and study area
The data used in this study came from permanent plots of continuous forest inventories (CFI) conducted in Eucalyptus stands located predominantly in the central-east region of Minas Gerais, Brazil. The CFI data, with 7 to 8 measurements of dendrometric data per plot and totaling 3,797 records, and the edaphoclimatic data were considered for the plots of the study population.
CFI data were processed at field level and include the quantitative and qualitative variables, described in Table 1, with their respective descriptive statistics and descriptions.
It is worth noting that qualitative variables were included in all scenarios of this study. The climatic data were obtained from climatic stations distributed across the study region and comprise annual averages for the period from 2006 to 2013. The station data, linked by their geographical coordinates, were processed in the ArcGIS® software for the entire cadastral database in order to extrapolate the information to each plot using the Thiessen polygon methodology, as described by ALCÂNTARA (2015).
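The Thiessen polygon step can be illustrated without any GIS software, because a point lies inside a station's Thiessen polygon exactly when that station is the nearest one. The sketch below is only an illustration under stated assumptions: the station and plot identifiers, the coordinates and the function name are hypothetical, and plain Euclidean distance on projected coordinates is assumed; it is not the ArcGIS workflow used in the study.

```python
import math

def assign_nearest_station(plots, stations):
    """Map each plot id to the id of the closest station.

    A point lies inside a station's Thiessen polygon exactly when that
    station is its nearest one, so no polygon geometry is needed here.
    """
    assignment = {}
    for plot_id, (px, py) in plots.items():
        assignment[plot_id] = min(
            stations,
            key=lambda sid: math.hypot(px - stations[sid][0], py - stations[sid][1]),
        )
    return assignment

# Hypothetical projected coordinates in metres (not taken from the study).
stations = {"S1": (0.0, 0.0), "S2": (10_000.0, 0.0)}
plots = {"P1": (1_000.0, 500.0), "P2": (8_000.0, -200.0), "P3": (5_500.0, 0.0)}

station_of_plot = assign_nearest_station(plots, stations)
print(station_of_plot)
```

Each plot would then inherit the annual climatic averages of the station it is assigned to.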
The edaphoclimatic variables used in this study are described in Table 2.

Training and generalization of artificial neural networks
Based on the available data, three scenarios were evaluated to estimate wood density at the cutting age for supplying a pulp mill: Scenario 1, estimation of wood density as a function of CFI variables; Scenario 2, estimation of wood density as a function of edaphoclimatic variables; and Scenario 3, estimation of wood density as a function of CFI and edaphoclimatic variables, as detailed in Table 3. In all scenarios, the database was randomly divided into two sets, with 80% of the data for training (parameter adjustment) and 20% for generalization (training validation) of the artificial neural networks (ANNs). Twenty ANNs were trained for each scenario in the NeuroForest software (BINOTI 2013), and the best five were selected based on the evaluation criteria described in the next section. The stopping criteria were a total of 3,000 cycles or a mean square error of 0.0001; training ended when either criterion was reached. The training algorithm used was resilient propagation (RPROP+).
The same ANN configuration was used for all scenarios: a multilayer perceptron (MLP) with 12 neurons in the hidden layer and one neuron in the output layer. The logistic activation function was used in both the hidden and output layers. The database consisted of quantitative and qualitative variables, and for each scenario the number of neurons in the input layer followed the rule of one neuron for each quantitative variable, one neuron for each class of each qualitative variable and one additional neuron to represent the bias (HAYKIN 2001). In this study, the number of neurons in the input layer was 34, 40 and 73 for scenarios 1, 2 and 3, respectively.
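The random 80/20 division and the two stopping criteria can be sketched in a few lines. This is a minimal illustration, not the NeuroForest implementation: the function names, the fixed seed and the use of integer stand-ins for the inventory records are assumptions made for the example.

```python
import random

def split_train_generalization(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and split it into two disjoint sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def should_stop(cycle, mse, max_cycles=3000, target_mse=0.0001):
    """Training ends as soon as either criterion is met."""
    return cycle >= max_cycles or mse <= target_mse

records = list(range(3797))  # stand-ins for the 3,797 inventory records
train, generalization = split_train_generalization(records)
print(len(train), len(generalization))
```

Fixing the seed only makes the illustrative split reproducible; the source does not state how the random division was seeded.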
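The input-layer rule and the network configuration described above can be made concrete with a small forward pass. The sketch below is illustrative only: the variable counts, the random weights, the bias on the output neuron and all function names are assumptions, and because the logistic output lies between 0 and 1 it further assumes that density was scaled to that interval, a step the text does not describe.

```python
import math
import random

def count_input_neurons(n_quantitative, classes_per_qualitative):
    """One neuron per quantitative variable, one per class, plus one bias."""
    return n_quantitative + sum(classes_per_qualitative) + 1

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(inputs, hidden_weights, output_weights):
    """Forward pass of a one-hidden-layer MLP with logistic activations.

    `inputs` already ends with the constant bias input 1.0; each row of
    `hidden_weights` holds the weights of one hidden neuron, and
    `output_weights` has one weight per hidden neuron plus an output bias.
    """
    hidden = [logistic(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    hidden.append(1.0)  # bias input for the output neuron (assumed)
    return logistic(sum(w * h for w, h in zip(output_weights, hidden)))

# Hypothetical variable counts (the real ones are listed in Table 3).
n_inputs = count_input_neurons(n_quantitative=3, classes_per_qualitative=[2, 4])

rng = random.Random(0)
hidden_weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(12)]
output_weights = [rng.uniform(-0.5, 0.5) for _ in range(12 + 1)]

# Three quantitative values, one-hot codes for two qualitative variables, bias.
x = [0.2, 0.7, 0.5] + [1, 0] + [0, 0, 1, 0] + [1.0]
scaled_density = mlp_forward(x, hidden_weights, output_weights)
print(n_inputs, scaled_density)
```

With untrained random weights the output has no physical meaning; the example only shows how qualitative classes enlarge the input layer and how the 12 hidden neurons feed the single output neuron.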

Evaluation of wood density estimates
The estimates of the artificial neural networks in the training and validation stages were evaluated by statistics and graphical analysis. The statistics used were the correlation between the estimated and observed densities and the relative root mean square error (RMSE%), which expresses the average error of all estimates as a percentage of the mean of the observed values (MEHTÄTALO et al. 2006), as given in Equation 1.
$$\mathrm{RMSE}\% = \frac{100}{\bar{Y}} \sqrt{\frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{n}} \qquad (1)$$
Where $\bar{Y}$ is the observed mean density; $Y_i$ is the observed density of the i-th field; $\hat{Y}_i$ is the density estimated by the ANN for the i-th field; and $n$ is the total number of fields.
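The two statistics can be computed directly from paired observed and estimated densities. The sketch below assumes that the correlation is Pearson's coefficient (the text says only "correlation") and that RMSE% is the root mean square error divided by the observed mean; the function names and the sample densities are hypothetical.

```python
import math

def rmse_percent(observed, estimated):
    """Root mean square error as a percentage of the observed mean."""
    n = len(observed)
    mean_observed = sum(observed) / n
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, estimated)) / n
    return 100.0 * math.sqrt(mse) / mean_observed

def pearson_correlation(observed, estimated):
    """Pearson correlation between observed and estimated values."""
    n = len(observed)
    mean_y = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((y - mean_y) * (e - mean_e) for y, e in zip(observed, estimated))
    var_y = sum((y - mean_y) ** 2 for y in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_y * var_e)

# Hypothetical basic densities, not values from the study.
observed = [0.48, 0.50, 0.52, 0.47, 0.53]
estimated = [0.49, 0.50, 0.51, 0.48, 0.52]
print(rmse_percent(observed, estimated), pearson_correlation(observed, estimated))
```

Because RMSE% is scaled by the observed mean, it can be compared across data sets with different average densities.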
As complementary analyses, scatter plots of percentage errors and histograms of the percentage frequency of errors were generated. The percentage error was calculated according to Equation 2.
$$E_i\% = 100 \, \frac{\hat{Y}_i - Y_i}{Y_i} \qquad (2)$$
Of all the MLP-type ANNs generated for each scenario (1, 2 and 3), the five best were selected according to the evaluation criteria. The percentage errors were plotted against the estimated values of basic wood density for the generalization sets in all scenarios; they were also grouped into classes with an amplitude of 5%, and the percentage frequency in each class was calculated for each scenario.
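The percentage errors and their grouping into 5% classes can be sketched as follows, taking the error as estimated minus observed relative to the observed value (the sign convention assumed in this sketch). The function names, the sample densities and the choice of classes closed on the left are also assumptions.

```python
import math
from collections import Counter

def percentage_errors(observed, estimated):
    """Percentage error of each estimate relative to its observed value."""
    return [100.0 * (y_hat - y) / y for y, y_hat in zip(observed, estimated)]

def class_frequencies(errors, width=5.0):
    """Percentage of errors falling in each class [lower, lower + width)."""
    counts = Counter(math.floor(e / width) * width for e in errors)
    total = len(errors)
    return {lower: 100.0 * count / total for lower, count in sorted(counts.items())}

# Hypothetical basic densities, not values from the study.
observed = [0.50, 0.50, 0.50, 0.50]
estimated = [0.51, 0.49, 0.53, 0.50]

errors = percentage_errors(observed, estimated)
frequencies = class_frequencies(errors)
print(frequencies)
```

Each key of the result is the lower limit of a class, so the share of errors within -5% to 5% is the sum of the frequencies stored under -5.0 and 0.0.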

RESULTS
The three scenarios showed strong and significant correlations between the observed and estimated basic densities, as well as low estimation errors, so all three can be considered feasible for estimating basic wood density. However, scenario 2, which estimates basic wood density from edaphoclimatic characteristics, provided the best results, both in ANN training and in generalization. Its correlations between estimated and observed densities were on average 3% greater than those of scenario 1 (CFI) and 0.20% greater than those of scenario 3 (CFI + edaphoclimatic) for both training and generalization (Table 4). Figure 1 shows that, for the three scenarios and the five ANNs, there were few discrepancies between the observed and estimated basic densities, which followed a linear trend, indicating that the fit of the estimates was satisfactory.
Figure 1. Graphs of the relationship between the observed and estimated values of basic wood density for each scenario.
The errors associated with the estimates averaged 0. However, there is a slight tendency towards greater dispersion of errors at the higher basic wood densities, in the range of 0.500 g.cm⁻³ to 0.550 g.cm⁻³, in the training of the five ANNs for the three scenarios (Figure 2). For the five ANNs in the three scenarios, 92.54% to 94.91% of the errors fell in the range of -5% to 5%, with a high concentration around 0%, and the amplitude of errors was similar across the three scenarios for both training and validation (Figure 3). The statistics evaluated (Table 4) showed the viability of the technique, with high correlations (above 0.80, or 80%) and low RMSE% values; the closer RMSE% is to zero, the better the quality of the estimate. Based on these values alone, it was not possible to infer which scenario would be the most viable for estimating density. However, it was possible to estimate basic wood density with good precision as a function of the CFI data, the edaphoclimatic characteristics or the combination of both.
In estimating basic density, the use of CFI data resulted in great accuracy and consistency, as can be observed in the estimates of scenario 1. When edaphoclimatic information was added to the CFI data in the training base (scenario 3), however, there was no significant difference between the error classes. Therefore, in practice, ANN 2 (scenario 2) can be used with edaphic and climatic information to project basic density in locations without a CFI database or without plantations, and ANN 3 (scenario 1) in situations where CFI data exist.

DISCUSSION
The desire to include basic wood density in forest inventory routines has stimulated studies aimed at estimating it. Several techniques have been used, such as linear regression models (ROCHA et al. 2020), mixed models (SILVA et al. 2019) and, more recently, artificial neural networks (LEITE et al. 2016, SILVA et al. 2018). However, developing a good estimation method for basic wood density requires knowing which factors modify this property.
For a long time, it was believed that the basic density of wood was modified primarily by genetic heritability, age and the growth rate of the stand (ASSIS 2014, TAN et al. 2018, VIDAURRE et al. 2020). Recently, many studies have demonstrated the influence of other variables, such as climatic and dendrometric variables, on the basic wood density of Eucalyptus species.
According to ALMEIDA et al. (2020), there is evidence of changes in basic wood density according to environmental conditions, due to growth variations. In a humid tropical environment, a moderate decrease in the wood density of E. grandis was observed, associated with accelerated growth resulting from increased precipitation (SETTE et al. 2016).
Eucalyptus trees have strategies for adapting their growth rate to the environment to which they are subjected. For example, under water stress conditions, their stomatal conductance is reduced, minimizing water loss; while the stress persists, the structure of the secondary xylem is modified, with a decrease in the diameter of the conducting vessels that minimizes the risk of vessel implosion. These changes modify the basic density of the wood in addition to growth (FERNÁNDEZ et al. 2019, ELLI et al. 2019).
The influence of climatic variation on the basic wood density of commercial Eucalyptus species at the age of four was studied at 11 locations in Brazil, and the meteorological variables were found to modify each genetic material differently. E. grandis x E. camaldulensis presented wood up to 9% denser in dry environments, E. saligna maintained a constant density in both dry and humid environments, and E. urophylla showed wood up to 5% denser in humid environments (ROCHA et al. 2020).
The growth and basic wood density of tropical E. urophylla clones at the age of four, planted at 10 locations in Brazil, were studied, and it was observed that the average annual wood increment of the stands was altered more by climatic variation than the basic wood density was, and that trees that grow more do not always have less dense wood (COSTA et al. 2020).
When the influence of climatic variables on the wood properties of two E. urophylla clones was examined in a seven-year-old Eucalyptus plantation in the same region as this study, low variation in basic wood density was identified across environments. The climatic variable with the greatest influence on wood properties was the water deficit, because its calculation combines climatic and soil parameters, making it more representative of the characteristics of the environment (BARBOSA et al. 2019).
In this study, the best scenario for estimating BWD was scenario 2 (edaphoclimatic variables). This agrees with ELLI et al. (2020), according to whom climatic variables are strongly correlated with radial growth and wood density, indicating real potential for their use as estimators of these characteristics.
The role of environmental variables and growth in modifying the basic density of wood in Eucalyptus species is undeniable, which makes them potential predictors of this property. However, the peculiarities of each genetic material and the relationship between climate, growth and wood quality must be taken into account. Especially when developing methods for estimating basic wood density, it is indispensable to obtain a diverse database that represents as many situations as possible for the sites and genetic materials under study.
Different methods for estimating basic wood density have also been tested in order to select the best input variables. In Eucalyptus plantations located in the same region as the present study, among the ANN brute-force, Garson and Random Forest methods, the ANNs presented estimates with a higher degree of accuracy, and the input variables clone, age, total volume with bark, average temperature and water deficit contributed positively to this result (LOPES 2018). Clone type and the environmental and dendrometric variables stand out in the estimation of basic wood density because they capture the genotype-environment relations that govern wood production and quality.
When methods for estimating the basic wood density of Cerrado species were investigated, the ANNs presented R² = 0.72, similar to the regression models. However, the error associated with the ANNs was on average 38% lower, and characteristics such as species and BWD of trees contributed positively to the precision of the ANN estimates of basic wood density (SILVA et al. 2018).
When ANNs were tested for estimating basic wood density in Eucalyptus stands at the cutting age of six years using dendrometric input variables, it was possible to identify optimal configurations, the best having four hidden-layer neurons, a hyperbolic tangent activation function in the hidden layer and a sigmoid function in the output layer. The errors associated with the estimation were concentrated in the range of -2 to 2 for this configuration, and the greatest dispersions of errors were found in the wood density range between 0.400 g.cm⁻³ and 0.500 g.cm⁻³ (LEITE et al. 2016). These results corroborate those of the present study, mainly the greater dispersion of errors in the same wood density ranges.
ANNs proved viable for estimating basic wood density in Eucalyptus plantations with satisfactory accuracy when using dendrometric and edaphoclimatic variables. The choice of input variables is nevertheless a decisive factor for using the method; in the context of this study, the two groups of variables did not differ significantly. In other studies available in the literature, both groups are highlighted as the most suitable, since basic wood density is considered a product of the interaction between genetic material, environment and growth.

CONCLUSION
The ANNs proved to be viable for estimating the basic density of Eucalyptus wood. Regardless of the input variables used for prediction (CFI variables, edaphoclimatic variables or edaphoclimatic variables + CFI), training ANNs for this variable makes it possible to include basic wood density in continuous forest inventory programs, minimizing expenses related to labor, field travel and laboratory processing.

Figure 2. Dispersion graphs of percentage errors as a function of the estimated values for each scenario.

Figure 3. Histograms of the percentage errors for each scenario, grouped into amplitude classes with their respective frequencies.

Table 1. Quantitative and qualitative information representing the continuous forest inventory variables in this research.

Table 2. Quantitative information representing the edaphoclimatic variables.

Table 3. Input variables (qualitative and quantitative) and output variable for Scenarios 1 to 3.

Table 4. Correlation and relative root mean square error (RMSE%) statistics for training and generalization of the five best networks in each scenario to estimate wood density.