Intra- and Inter-Rater Reliability in Log Volume Estimation Based on LiDAR Data and Shape Reconstruction Algorithms: A Case Study on Poplar Logs
Reliable log volume data are crucial in the wood supply chain. LiDAR sensing has recently emerged as a promising technology for sourcing log volume data. Several algorithms have been developed to extract volume data from LiDAR point clouds, such as Poisson interpolation and Random Sample Consensus (RANSAC), and several comparative studies have tested the accuracy of LiDAR-based estimates. As a common methodological approach, point clouds require several post-processing tasks before useful data can be extracted: segmentation, noise removal, and normal estimation, followed by shape reconstruction with dedicated algorithms. Although this procedure has been used quite frequently, and several papers have pointed out the accuracy limits of single comparative experiments, no paper has addressed the intra- and inter-rater reliability of these procedures. This raises two important questions, namely: (i) would the same person, processing the same data with the same procedures, arrive at the same results? and (ii) how much would the results deviate when different persons process the same data using the same procedures? A set of 432 poplar logs, arranged on the ground about 1 m apart, was scanned in groups by a professional mobile LiDAR scanner (Z); then the first 418 logs of the initial data set were scanned individually with an iPhone-compatible scanning app (3D). The data were then assigned for processing to two researchers (R1 and R2), who followed a protocol to process the point clouds twice (A1 and A2), repeating the same steps each time, which included calculation of volume by Poisson interpolation (P) and reconstruction of cylinders (Ci) and cones (Co) by the RANSAC algorithm. Intra- and inter-rater agreement was evaluated by several metrics, of which the mean absolute error (MAE) was retained here for comparison. For the inter-rater experiment, when working with the Z data, four comparisons were carried out for each volume extraction algorithm. MAE ranged from 0.051 to 0.060 m³ (P), from 0.123 to 0.130 m³ (Ci), and from 0.123 to 0.221 m³ (Co); for the same algorithms, but when working with the 3D data, MAE ranged from 0.029 to 0.045, from 0.093 to 0.109, and from 0.081 to 0.095 m³, respectively. For the intra-rater experiment with the Z data, MAE returned values of 0.024 and 0.064 m³ for P, 0.090 and 0.142 m³ for Ci, and 0.091 and 0.141 m³ for Co. In the same experiment, but for the 3D data, the MAE values were 0.016 and 0.026 m³ for P, 0.088 and 0.101 m³ for Ci, and 0.067 and 0.139 m³ for Co. All of these results suggest that there are important differences in volume estimates, both when the same rater repeats the processing and when different raters process the same data. This has implications for the use of such data and argues for automating the whole process to improve data consistency.
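For readers who want to experiment with the described workflow, the sketch below outlines one possible implementation of the Poisson pathway (noise removal, normal estimation, surface reconstruction, volume read-out). The abstract does not name the software used in the study; the open-source Open3D library, the depth and neighborhood parameters, and the file name log_001.ply are assumptions for illustration only.

```python
# Minimal sketch of the Poisson (P) pathway, assuming the Open3D library;
# all parameters and file names are illustrative, not those of the study.
import open3d as o3d

# Load a manually segmented single-log point cloud.
pcd = o3d.io.read_point_cloud("log_001.ply")

# Noise removal: drop points whose neighbor distances are anomalous.
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Normal estimation and consistent orientation, required by Poisson
# surface reconstruction.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
pcd.orient_normals_consistent_tangent_plane(30)

# Poisson surface reconstruction.
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)

# Volume can be read off the mesh only if it is watertight.
if mesh.is_watertight():
    print(f"Poisson volume: {mesh.get_volume():.4f} m^3")
```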
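The cylinder pathway (Ci) can be sketched in a similarly hedged way. The study only states that a RANSAC algorithm was used; the pyransac3d package, its thresh parameter, and the length estimation from inlier projections below are assumptions, not the authors' method.

```python
# Hedged sketch of RANSAC cylinder fitting (Ci pathway), assuming the
# pyransac3d package; threshold and file name are illustrative.
import numpy as np
import open3d as o3d
import pyransac3d as pyrsc

# Load a single-log point cloud as an (n, 3) array.
points = np.asarray(o3d.io.read_point_cloud("log_001.ply").points)

# Fit a cylinder by RANSAC: returns a point on the axis, the axis
# direction, the radius, and the indices of the inlier points.
center, axis, radius, inliers = pyrsc.Cylinder().fit(points, thresh=0.03)

# Log length: extent of the inliers projected onto the cylinder axis.
axis = np.asarray(axis, dtype=float)
axis /= np.linalg.norm(axis)
proj = (points[inliers] - np.asarray(center)) @ axis
length = proj.max() - proj.min()

# Cylinder volume V = pi * r^2 * L.
print(f"RANSAC cylinder volume: {np.pi * radius**2 * length:.4f} m^3")
```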
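Finally, the retained agreement metric is straightforward: MAE = (1/n) Σ|v1,i − v2,i| over the n per-log volume estimates of two processing runs (two raters, or two attempts by the same rater). A minimal sketch, with illustrative values rather than data from the study:

```python
# MAE between two series of per-log volume estimates, e.g. rater R1 vs.
# rater R2; the array values are illustrative only.
import numpy as np

vol_r1 = np.array([0.412, 0.387, 0.455])  # volumes (m^3) estimated by R1
vol_r2 = np.array([0.398, 0.401, 0.430])  # volumes (m^3) estimated by R2

mae = np.mean(np.abs(vol_r1 - vol_r2))
print(f"MAE = {mae:.3f} m^3")
```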