DeepCanola: Phenotyping brassica pods using semi-synthetic data and active learning
Phenotyping, the measurement of attributes or traits, is crucial in selecting superior cultivars for specific environmental situations. This is a time-consuming process when applied to large populations, but it can be accelerated through the use of deep learning, resulting in an algorithm that can phenotype images of specimens in a negligible amount of time. The primary issue with deep learning is the large quantity of high-quality training data required to make a viable phenotyping pipeline. To address this, we present a semi-synthetic training data generation system which significantly reduces the amount of human effort spent on data collection. We use active learning alongside this system to create DeepCanola, an instance segmentation model that successfully segments and measures the valves from Brassica napus pods. We demonstrate that the model estimates the effect of different winter cold treatments on a range of cultivars and crop types as effectively as manually curated measurements. Furthermore, the resulting model is effective on data from various experimental settings and on different, but related, species such as Arabidopsis thaliana, Alliaria petiolata (garlic mustard) and Raphanus raphanistrum subsp. sativus (radish). This robust tool could be easily scaled, thereby accelerating breeding or fundamental research programs. Code and model weights: https://github.com/kieranatkins/deepcanola.
| Item Type | Article |
|---|---|
| Open Access | Gold |
| Keywords | Deep learning, Plant phenotyping, Semi-synthetic data, Active learning, Human-in-the-loop, Pod length |
| Project | Tailoring Plant Metabolism (TPM) - Work package 1 (WP1) - High value lipids for health and industry, Brassica Rapeseed And Vegetable Optimisation (BRAVO) |
| Date Deposited | 05 Dec 2025 10:46 |
| Last Modified | 19 Dec 2025 14:58 |


