spatialRF Modernization Roadmap
This document outlines the planned modernization of
spatialRF, organized into logical phases to minimize
disruption and maximize coherence.
Phase 1: Backend Infrastructure (Non-Breaking)
Rationale: Modernize core dependencies before API changes.
1.1 Dependency Modernization
- Replace `foreach` + `doParallel` with `future` + `future.apply`
  - More flexible execution strategies
  - Better control over parallel backends
- Add `progressr` for progress reporting
  - Works seamlessly with `future`
  - User-configurable progress handlers
- Remove `dplyr` and `tidyr` dependencies
  - Use base R equivalents to reduce dependency footprint
  - Improve installation reliability
  - Faster package load times
- Replace `magrittr` pipe (`%>%`) with base pipe (`|>`)
  - Native R syntax
- Remove `huxtable` from print functions to reduce dependency footprint.
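As an illustration of the `dplyr` and `magrittr` items above, the following sketch rewrites a small filter-and-summarise chain with base R and the native pipe. The data frame and its column names are made up for the example and are not taken from spatialRF; the native pipe and the `\(x)` lambda shorthand assume R >= 4.1.

```r
# Toy results table (hypothetical columns, for illustration only).
results <- data.frame(
  model = c("non_spatial", "spatial", "spatial", "non_spatial"),
  r_squared = c(0.52, 0.61, 0.65, 0.48),
  stringsAsFactors = FALSE
)

# dplyr/magrittr style, shown for reference only (not run):
# results %>%
#   filter(r_squared > 0.5) %>%
#   group_by(model) %>%
#   summarise(r_squared = mean(r_squared))

# Base R equivalent using the native pipe.
summary_df <- results |>
  subset(r_squared > 0.5) |>
  (\(d) aggregate(r_squared ~ model, data = d, FUN = mean))()

print(summary_df)
```

`summary_df` holds one row per model with the mean R-squared of the rows that passed the filter, and the chain loads no packages beyond those attached by default.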
1.2 Function Delegation
- Replace `auto_cor()` and `auto_vif()` with `collinear` package calls
  - Leverage actively maintained specialized package
Phase 2: Testing & Performance (Non-Breaking)
Rationale: Establish robust testing before API changes, and optimize performance with current architecture.
2.1 Test Suite Enhancement
- Expand unit test coverage to >80%
  - Test edge cases and error conditions
  - Add tests for all exported functions
  - Test parallel execution paths
- Add integration tests
  - Test complete workflows
  - Test piped operations
  - Test model comparison scenarios
- Add benchmarking suite
  - Track performance across versions
  - Identify performance regressions
  - Document expected execution times
2.2 Performance Optimization
- Profile and optimize bottleneck functions
  - `rf_spatial()` spatial predictor generation
  - `rf_evaluate()` cross-validation loops
  - `rf_tuning()` grid search
  - Distance matrix operations
- Add intelligent caching where appropriate
  - Cache expensive computations
  - Define clear cache invalidation rules
- Optimize memory usage
  - Reduce object copying
  - Better memory management in parallel operations
  - Streaming where possible for large operations
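One possible shape for the caching item above is a single-slot cache with an explicit invalidation rule: recompute whenever the input differs from the cached one. This is a minimal base R sketch, not the planned implementation; `.cache`, `cached_distance_matrix()`, and `clear_cache()` are hypothetical names.

```r
# Private environment acting as the cache (hypothetical, not part of spatialRF).
.cache <- new.env(parent = emptyenv())
.cache$computations <- 0L  # counter kept only to make cache hits observable

# Hypothetical helper: Euclidean distance matrix from a coordinate matrix,
# reusing the previous result when the input is unchanged.
cached_distance_matrix <- function(xy) {
  hit <- !is.null(.cache$value) && identical(.cache$xy, xy)
  if (!hit) {
    # Invalidation rule: any change in the input triggers recomputation.
    .cache$xy <- xy
    .cache$value <- as.matrix(stats::dist(xy))
    .cache$computations <- .cache$computations + 1L
  }
  .cache$value
}

# Explicit invalidation, e.g. to release memory held by a large matrix.
clear_cache <- function() {
  .cache$xy <- NULL
  .cache$value <- NULL
  invisible(NULL)
}

xy <- cbind(x = c(0, 3), y = c(0, 4))
d_first <- cached_distance_matrix(xy)   # computed
d_second <- cached_distance_matrix(xy)  # served from the cache
```

Comparing inputs with `identical()` avoids a hashing dependency, but the cache keeps a copy of the input, so a real implementation would need to weigh that against the memory goals listed above.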
Phase 3: Documentation Overhaul (Non-Breaking)
Rationale: Improve documentation while maintaining current API, preparing for future changes.
3.1 Roxygen Improvements
- Improve explanations in roxygen docs.
- Define clear input types for arguments and output types for return values.
- Use `@inheritParams` to reduce redundancy
  - Document common parameters once
  - Easier maintenance
  - Consistent parameter descriptions
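The `@inheritParams` pattern could look like the sketch below, where shared parameters are documented once and reused. Both functions (`fit_model()` and `evaluate_model()`) are hypothetical stand-ins written for this example and do not reflect the spatialRF API.

```r
#' Fit a linear model (hypothetical example function)
#'
#' @param data Data frame containing the response and the predictors.
#' @param response Character string, name of the response column in `data`.
#' @return A fitted `lm` object.
fit_model <- function(data, response) {
  stats::lm(stats::reformulate(".", response = response), data = data)
}

#' Evaluate a model with k-fold cross-validation (hypothetical example function)
#'
#' @inheritParams fit_model
#' @param n_folds Integer, number of cross-validation folds.
#' @return Numeric vector of length `n_folds` with the RMSE of each fold.
evaluate_model <- function(data, response, n_folds = 2L) {
  # Assign rows to folds in a fixed, interleaved order.
  folds <- rep_len(seq_len(n_folds), nrow(data))
  vapply(seq_len(n_folds), function(k) {
    fit <- fit_model(data[folds != k, , drop = FALSE], response)
    held_out <- data[folds == k, , drop = FALSE]
    sqrt(mean((held_out[[response]] - stats::predict(fit, held_out))^2))
  }, numeric(1))
}

rmse <- evaluate_model(mtcars[, c("mpg", "wt", "hp")], response = "mpg")
```

Here `data` and `response` are described only in `fit_model()`; roxygen2 copies those descriptions into the help page of `evaluate_model()`, which documents just its own `n_folds` argument.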
3.2 Vignettes & Articles
- Add technical vignette: “Spatial Predictors Theory”
  - Conceptual background on MEMs
  - When to use which method
  - Interpretation guidelines
- Add technical vignette: “Model Evaluation & Cross-Validation”
  - Spatial cross-validation theory
  - Interpretation of metrics
  - Comparison with standard CV
- Add practical vignette: “Working with Large Datasets”
  - Memory management strategies
  - Parallel processing best practices
  - Sampling strategies
- Add practical vignette: “Species Distribution Modeling”
  - Binary response workflows
  - Class imbalance handling
  - Spatial projection considerations
- Add practical vignette: “Hyperparameter Tuning Guide”
  - When tuning is necessary
  - Computational considerations
  - Interpretation of results
Phase 4: User Experience Improvements (Non-Breaking)
Rationale: Improve UX without breaking changes.
4.1 Messaging & Feedback
- Replace message/warning/stop with `cli` package
  - Prettier, more informative messages
  - Better formatting
  - Consistent style
- Improve error messages
  - Suggest fixes for common errors
  - Include relevant context
  - Point to documentation
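An error raised with `cli` along the lines described above might look like the following sketch. The validator `check_response()` is hypothetical and the wording of the message is only an example; `cli::cli_abort()` and the inline markup (`{.val}`, `{.arg}`, `{.fn}`) are part of the `cli` package.

```r
# Hypothetical input check showing the intended message style.
check_response <- function(df, response) {
  if (!response %in% names(df)) {
    cli::cli_abort(c(
      # Header: what went wrong.
      "Column {.val {response}} not found in {.arg df}.",
      # Context: what is actually available.
      "i" = "Available columns: {.val {names(df)}}.",
      # Pointer to the documentation.
      "i" = "See the documentation of {.fn rf_spatial} for the expected inputs."
    ))
  }
  invisible(TRUE)
}

# Capture the error instead of stopping, to inspect its message.
err <- tryCatch(
  check_response(data.frame(a = 1, b = 2), response = "y"),
  error = function(e) e
)
```

The named elements of the message vector become bullets, so the header, the context, and the pointer to the documentation stay visually separate in the console.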
4.2 Object Consistency
- Formalize S3 classes for model objects
  - Consistent structure across model types
  - Better print methods
  - Better summary methods
- Improve print methods
  - Show most relevant information first
  - Cleaner formatting
- Improve plot methods
  - Consistent theming across all plots
  - Better defaults
  - Return ggplot objects for user modification
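A formal S3 class with a compact print method could be sketched as follows. The class name `rf_model_sketch`, the constructor `new_rf_model()`, and the fields are illustrative assumptions, not the structure of actual spatialRF model objects.

```r
# Hypothetical constructor: validates inputs and fixes the object structure.
new_rf_model <- function(r_squared, n_predictors, type = "non-spatial") {
  stopifnot(
    is.numeric(r_squared), length(r_squared) == 1,
    is.numeric(n_predictors), length(n_predictors) == 1,
    is.character(type), length(type) == 1
  )
  structure(
    list(r_squared = r_squared, n_predictors = n_predictors, type = type),
    class = "rf_model_sketch"
  )
}

# Print method: most relevant information first, one item per line.
print.rf_model_sketch <- function(x, ...) {
  cat("Model type: ", x$type, "\n", sep = "")
  cat("R-squared:  ", format(round(x$r_squared, 3), nsmall = 3), "\n", sep = "")
  cat("Predictors: ", x$n_predictors, "\n", sep = "")
  invisible(x)
}

model <- new_rf_model(r_squared = 0.61234, n_predictors = 5L, type = "spatial")
printed <- capture.output(print(model))
```

Routing all object creation through one constructor is what makes the structure consistent across model types, since every method can then rely on the same fields being present.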
Phase 5: API Modernization (BREAKING CHANGES - Major Version)
Rationale: Breaking changes grouped together in a major version release.
5.1 Argument Name Standardization
Old → New:
- `data` → `df`
- `dependent.variable.name` → `response`
- `predictor.variable.names` → `predictors`
- `distance.matrix` → `distance_matrix`
- `distance.thresholds` → `distance_thresholds`
- Use ellipsis for `ranger` arguments and deprecate argument `ranger.arguments`
- `num.trees` → `n_trees`
- `min.node.size` → `min_node_size` …
Implementation:
- Create wrapper functions maintaining old API with `.Deprecated()`
- Provide clear migration guide
- Add lifecycle badges to all functions
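A deprecation wrapper of the kind listed under Implementation might be sketched as below. The functions `rf_new()` and `rf_old()` are hypothetical stand-ins with simplified signatures; only the argument renames and the use of base R's `.Deprecated()` come from the plan above.

```r
# New-style function (hypothetical stand-in, not the real spatialRF signature).
# Extra arguments intended for the modelling engine travel through `...`.
rf_new <- function(df, response, predictors, ...) {
  list(
    n_rows = nrow(df),
    response = response,
    predictors = predictors,
    engine_args = list(...)
  )
}

# Old-style wrapper: keeps the dotted argument names working, warns once per
# call, and forwards everything to the new function.
rf_old <- function(data, dependent.variable.name, predictor.variable.names,
                   ranger.arguments = list()) {
  .Deprecated(new = "rf_new", old = "rf_old")
  do.call(
    rf_new,
    c(
      list(
        df = data,
        response = dependent.variable.name,
        predictors = predictor.variable.names
      ),
      ranger.arguments  # unpacked into `...` of the new function
    )
  )
}

# Calling the old API still works but emits a deprecation warning.
result <- suppressWarnings(
  rf_old(
    data = mtcars,
    dependent.variable.name = "mpg",
    predictor.variable.names = c("wt", "hp"),
    ranger.arguments = list(num.trees = 500)
  )
)
```

Unpacking `ranger.arguments` with `do.call()` lets existing user code keep its list-based calls while the new function only ever sees arguments passed through the ellipsis.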
5.3 Simplified Function Interfaces
- Reduce number of required arguments where possible
  - Better defaults based on data
  - Auto-detection of data characteristics
  - Smart parameter inference
- Group related parameters into list arguments
  - `spatial_config` for spatial-related parameters
  - `cv_config` for cross-validation parameters
  - `plot_config` for plotting parameters
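One way a list argument such as `cv_config` could be handled is to merge user-supplied entries over a set of defaults and reject unknown names. The function `resolve_cv_config()`, the entry names, and the default values below are assumptions made for this sketch.

```r
# Hypothetical helper: merge a user-supplied `cv_config` list with defaults.
resolve_cv_config <- function(cv_config = list()) {
  # Illustrative defaults; the real entries are not specified in this roadmap.
  defaults <- list(repetitions = 30L, training_fraction = 0.75)

  # Fail early on misspelled or unsupported entries.
  unknown <- setdiff(names(cv_config), names(defaults))
  if (length(unknown) > 0) {
    stop("Unknown cv_config entries: ", paste(unknown, collapse = ", "))
  }

  # User values override defaults; untouched defaults are kept.
  utils::modifyList(defaults, cv_config)
}

config <- resolve_cv_config(list(repetitions = 10L))
```

With this pattern a function signature carries a single `cv_config` argument, and callers only spell out the entries they want to change.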
Phase 6: New Features & Extensions
Rationale: Add new functionality after core modernization is complete.
6.1 Core Features
- Temporal autocorrelation support
  - Extend spatial methods to space-time
  - Temporal cross-validation
  - Spatiotemporal predictors
- Additional modelling engines: lm, gam, xgboost
- Additional spatial predictor methods
  - Spatial wavelets
  - Additional kernel methods
  - User-defined spatial basis functions
Phase 7: Advanced Optimizations
Rationale: Long-term performance improvements for specialized use cases.
7.1 Computational Efficiency
- Rcpp integration for bottlenecks
  - Distance matrix operations
  - Spatial predictor generation
  - Moran’s I calculations
- Sparse matrix support
  - Memory efficiency for large distance matrices
  - Faster operations on sparse structures
- Approximate methods for large datasets
  - Sampling-based spatial predictors
  - Approximate cross-validation
  - Scalability beyond current limits
Implementation Principles
Notes
This is a living document and will be updated based on community feedback and emerging best practices. Some phases can overlap and be developed in parallel. Priorities may shift based on user needs and bug reports. Emergency releases for critical bugs take precedence over the roadmap.
Last updated: 2025-12-20