Introduction
Automation technologies are increasingly influencing aquaculture by enhancing fish monitoring, biomass estimation, and feeding management. Traditional manual sampling methods remain labor-intensive, invasive, and subject to bias, underscoring the need for non-intrusive and scalable alternatives (Fernandes et al., 2020). Stereo vision systems combined with deep learning (DL) provide potential solutions; however, existing applications are often expensive, limited in depth-sensing range, or complex to deploy. In this study, we present a compact, portable, and cost-effective stereo vision system integrated with a DL model for automated fish weight estimation, developed for offshore aquaculture environments.
Materials and methods
The proposed system consists of a Luxonis OAK-D PoE stereo camera and a portable 48 V PoE power supply, both housed in a waterproof enclosure (Figure 1, left panel). The camera captures synchronized RGB and depth images at 20 frames per second (FPS) and operates within a depth-sensing range of 0.7 to 12 meters. Following an underwater calibration procedure to optimize the camera parameters, the system was deployed in a commercial offshore aquaculture cage managed by Kimagro Fishfarming Ltd. (Figure 1, middle panel) and stocked with Mediterranean gilthead seabream (Sparus aurata). During the field study, the system was suspended 2 meters below the sea surface and recorded continuously for 30 minutes, resulting in 291 RGB-depth image pairs. Simultaneously, 220 fish were manually weighed (mean weight = 122.91 g, standard error of the mean (SEM) = 1.69 g) to provide ground truth (GT) data; the weighed fish were isolated within the cage using a splitting net to ensure accurate correspondence between estimated and measured weights. Three anatomical keypoints (the snout tip, body midpoint, and middle caudal rays) were detected on the fish body in RGB images using a DL keypoint-detection model (YOLOv11n-Pose) and subsequently aligned with the corresponding depth data to enable 3D reconstruction of the fish body. Fish length was computed as the sum of the Euclidean distances between consecutive keypoints (Figure 1, right panel) and converted to weight using a species-specific empirical length–weight relationship, W = 0.0123 × L^3.0284 (Mehanna, 2007). An overview of the proposed methodological framework is presented in Figure 2.
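To make the 3D reconstruction and weight conversion concrete, the following minimal Python sketch back-projects the three detected keypoints to camera coordinates and applies the length–weight relationship. The pinhole intrinsics (fx, fy, cx, cy), the metric units (length in cm, weight in g), and the summed-segment length definition are assumptions for illustration and do not reproduce the study's actual implementation.

import numpy as np

def backproject(u, v, z_m, fx, fy, cx, cy):
    # Pinhole back-projection: pixel (u, v) with metric depth z_m -> 3D point in camera coordinates
    return np.array([(u - cx) * z_m / fx, (v - cy) * z_m / fy, z_m])

def estimate_weight_g(keypoints_px, depth_map_m, fx, fy, cx, cy):
    # keypoints_px: (u, v) pixel coordinates of the snout tip, body midpoint, and middle caudal rays
    pts = [backproject(u, v, depth_map_m[v, u], fx, fy, cx, cy) for (u, v) in keypoints_px]
    # Length as the sum of the two segments along the snout-midpoint-caudal chain, converted to cm
    length_cm = 100.0 * (np.linalg.norm(pts[1] - pts[0]) + np.linalg.norm(pts[2] - pts[1]))
    # Species-specific length-weight relationship (Mehanna, 2007); assumes L in cm, W in g
    return 0.0123 * length_cm ** 3.0284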
Results and discussion
The system achieved a mean estimated weight of 124.29 g (SEM = 6.00 g) after the removal of outliers using the interquartile range (IQR) method, corresponding to an absolute error (AE) of 1.39 g and a percentage error (PE) of 1.12% relative to the GT mean weight. Further analysis indicated that training the DL model on approximately 250 annotated fish instances was sufficient for accurate keypoint localization, while stable weight estimates could be obtained from 60 inference images, corresponding to approximately 9 minutes of camera recording. Results also demonstrated that including the fish body midpoint as a third keypoint in the length and weight estimation improved the system's accuracy and precision by better accounting for body curvature. Outlier analysis further revealed that deviations in the weight estimates were primarily associated with artifacts in the depth maps rather than with inaccuracies in the DL-based keypoint detections or morphological abnormalities of the fish body, highlighting the importance of reliable depth acquisition. The system's compact dimensions (17.8 cm (L) × 10.2 cm (W) × 25.4 cm (H)), low production cost (~€620), minimal operational requirements (approximately 9 minutes of recording per estimate), and straightforward installation further support its viability for automated fish biomass monitoring in offshore aquaculture.
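For illustration, the following minimal Python sketch shows one way to implement the IQR outlier filtering and the AE/PE metrics reported above; the 1.5 × IQR fence factor and the per-fish estimate array are assumptions, not the study's actual analysis script.

import numpy as np

def iqr_filter(weights_g):
    # Remove estimates outside the conventional 1.5 * IQR fences (assumed fence factor)
    q1, q3 = np.percentile(weights_g, [25, 75])
    iqr = q3 - q1
    keep = (weights_g >= q1 - 1.5 * iqr) & (weights_g <= q3 + 1.5 * iqr)
    return weights_g[keep]

def error_metrics(estimates_g, gt_mean_g):
    # Absolute error (AE) and percentage error (PE) of the filtered mean against the GT mean weight
    est_mean = float(np.mean(iqr_filter(np.asarray(estimates_g, dtype=float))))
    ae = abs(est_mean - gt_mean_g)
    pe = 100.0 * ae / gt_mean_g
    return est_mean, ae, pe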
Conclusion
This study demonstrates that integrating a portable stereo vision system with DL enables accurate, efficient, and non-invasive fish weight estimation in offshore aquaculture settings. By reducing sampling time, operational costs, and labor demands, the proposed system offers a robust and scalable solution for advancing precision aquaculture practices.
Acknowledgments
This work was supported by the EuroCC2 project under Grant Agreement No. 101101903, co-funded by the EU's Digital Europe Programme and by the Republic of Cyprus through the "THALIA 2021–2027" Programme.
References
Fernandes, A. F. A., E. M. Turra, É. R. de Alvarenga, T. L. Passafaro, F. B. Lopes, G. F. O. Alves, V. Singh, and G. J. M. Rosa. "Deep learning image segmentation for extraction of fish body measurements and prediction of body weight and carcass traits in Nile tilapia." Computers and Electronics in Agriculture 170 (2020): 105274.
Mehanna, S. F. "A Preliminary Assessment and Management of Gilthead Bream Sparus aurata in the Port Said Fishery, the Southeastern Mediterranean, Egypt." Turkish Journal of Fisheries and Aquatic Sciences 7 (2007): 123-130.