Comparing Predictive Models: Is Simpler Always Better for Tree Age Estimation?

Initially, I assumed that including factors with no predictive power would add unnecessary uncertainty to the predictions. That extra uncertainty, I thought, would lower the elpd (expected log predictive density) of the left-out observations in both exact and PSIS-LOO (Pareto smoothed importance sampling leave-one-out) cross-validation, penalizing the model for being over-parameterized. But if removing these seemingly redundant factors doesn’t improve the elpd, wouldn’t the model that includes every possible factor always be the best predictive model? This is where my confusion arises, and I would appreciate clarification.
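To make the comparison concrete, here is a minimal sketch of exact leave-one-out scoring for a model with and without factors that carry no signal. It rests on several assumptions that are not from my analysis: the data are simulated, the model is an ordinary least-squares fit with a plug-in Gaussian predictive density rather than a full Bayesian posterior, and the names `loo_elpd_plugin`, `log_diameter` and `noise_factors` are made up for illustration. It approximates the idea behind elpd and is not a substitute for PSIS-LOO on the fitted model.

```python
import numpy as np

def loo_elpd_plugin(X, y):
    """Exact leave-one-out sum of log predictive densities.

    Uses a plug-in Gaussian density from a least-squares fit
    (an assumption; a Bayesian elpd would average over the posterior).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i                      # drop observation i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        resid = y[keep] - X[keep] @ coef
        sigma2 = resid @ resid / (n - 1 - k)          # residual variance on the training fold
        err = y[i] - X[i] @ coef                      # error on the held-out point
        total += -0.5 * (np.log(2 * np.pi * sigma2) + err ** 2 / sigma2)
    return float(total)

# Simulated data (hypothetical): one real predictor plus five pure-noise factors.
rng = np.random.default_rng(0)
n = 60
log_diameter = rng.normal(size=n)
noise_factors = rng.normal(size=(n, 5))
log_age = 1.0 + 2.0 * log_diameter + rng.normal(scale=0.3, size=n)

ones = np.ones(n)
elpd_small = loo_elpd_plugin(np.column_stack([ones, log_diameter]), log_age)
elpd_big = loo_elpd_plugin(np.column_stack([ones, log_diameter, noise_factors]), log_age)
print("diameter only:", elpd_small)
print("diameter + noise factors:", elpd_big)
```

The size of the gap between the two numbers depends on the sample size, the number of extra factors and, in a Bayesian model, the priors on their coefficients, so a single run like this shows the mechanics rather than a general result.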

My current model aims to predict tree age without needing to drill into the core, relying on more easily obtainable observations. I’m exploring the relationship between tree age and various explanatory factors. It’s not clear that the logarithm of age increases linearly with these factors. Instead of fixing a transformation in advance, I wanted to estimate the power transformation (the exponent) for each factor from the data to achieve the best model fit.
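The idea of estimating an exponent instead of fixing the transformation can be sketched as follows. This is a simplified stand-in under stated assumptions: a single factor `x`, the form logAge = a + b * x^p, and a grid search with least squares in place of Bayesian estimation of the exponent. The function name `fit_power_exponent` and the candidate grid are hypothetical.

```python
import numpy as np

def fit_power_exponent(x, y, exponents):
    """Pick the exponent p (from a candidate grid) that minimises the
    residual sum of squares of y = a + b * x**p.

    Returns (best_p, rss, coef) where coef = [a, b].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for p in exponents:
        X = np.column_stack([np.ones_like(x), x ** p])   # intercept + transformed factor
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[1]:
            best = (float(p), rss, coef)
    return best

# Synthetic check (not real tree data): the true exponent is 0.5.
x = np.linspace(5, 80, 40)
y = 1.0 + 2.0 * x ** 0.5
p, rss, coef = fit_power_exponent(x, y, [0.25, 0.5, 1.0, 2.0])
print(p, coef)
```

In a full Bayesian version the exponent would be a parameter with its own prior, so its uncertainty would carry through to the predictions instead of being fixed at the best grid value.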

However, the analysis reveals that only tree diameter has a clear effect. The 95% credible intervals for the coefficients of all other factors include zero. This suggests that estimating exponents for these other factors is likely unproductive. Consequently, I might simplify the model by omitting the exponents and simply log-transforming the diameter data. Nevertheless, it remains interesting to compare whether a power transformation with estimated exponents outperforms a model with an a priori specified transformation.

To assess the model’s predictive performance, I’ve generated violin plots based on a model that includes only diameter as a predictor. For each tree, I generated predictions of logAge from the predictive distribution and then grouped these predictions by diameter class: narrow (<30 cm), medium (30-50 cm), and wide (>50 cm).

The violin plots show that both the predictive distributions and the observed data are more spread out in the narrowest diameter class (a). This matches biological intuition: only older trees can attain a wide diameter, but trees can grow slowly because of various environmental factors, so narrow trees can be either young or old.
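The grouping by diameter class, and a simple numeric check of the spread in each class, could look like the sketch below. The class boundaries come from the text above; treating exactly 30 cm and 50 cm as "medium" is an assumption, as are the function names and the choice of a 5% to 95% interval width as the measure of spread. `pred_draws` stands for an array of predicted logAge values with one row per draw and one column per tree.

```python
import numpy as np

def diameter_class(diameter_cm):
    """Label each diameter (cm) as narrow (<30), medium (30-50) or wide (>50)."""
    d = np.asarray(diameter_cm, dtype=float)
    # Assumption: the boundary values 30 and 50 cm are counted as "medium".
    return np.where(d < 30, "narrow", np.where(d <= 50, "medium", "wide"))

def spread_by_class(diameter_cm, pred_draws):
    """Width of the pooled 5%-95% interval of predicted logAge per class.

    pred_draws has shape (n_draws, n_trees).
    """
    pred_draws = np.asarray(pred_draws, dtype=float)
    classes = diameter_class(diameter_cm)
    widths = {}
    for name in ("narrow", "medium", "wide"):
        mask = classes == name
        if mask.any():
            pooled = pred_draws[:, mask].ravel()      # pool draws across trees in the class
            lo, hi = np.percentile(pooled, [5, 95])
            widths[name] = float(hi - lo)
    return widths
```

Comparing these widths with the same quantity computed from the observed logAge values gives a quick numeric counterpart to what the violin plots show visually.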

I welcome any insights or suggestions regarding model evaluation and potential improvements.
