Introduction
In recent years, computer vision has been applied increasingly in agriculture and aquaculture. Applications include individual tracking, behavior monitoring, species identification, phenotyping, and prediction of complex traits from images or video (Vo et al., 2021). Image data have the benefits of being high-throughput and non-invasive. One of the most popular methods for image data analysis is the neural network, especially the convolutional neural network (CNN). However, CNNs have been criticized for their 'black box' nature, and the lack of interpretability of image-based neural networks can make their results questionable for real-life application. Some methods attempt to explain how CNNs learn by revealing the image features they use to make a prediction. One of the most intuitive of these methods is the class activation map (CAM). The core function of CAM is to reveal the regions in an image that are most relevant for the prediction. Field experts can therefore use CAM-revealed features as a reference to judge the reliability of a CNN prediction. Ideally, class activation maps make CNNs more transparent and promote data-driven decisions.
This study aims to investigate the validity of the predictive features of CNNs revealed by CAM from a genetic perspective, using a case study. The case study concerns the physical characteristics of fish that contribute to their critical swimming speed ($U_{crit}$) (Brett, 1964). We used individual 3D images and a CNN to predict $U_{crit}$ in rainbow trout. Together with fish physiologists, we then defined two new traits based on the CAM features derived from the prediction. Finally, we calculated the genetic properties of these new traits in relation to swimming speed and body weight.
Materials & Methods
We conducted a swimming test and acquired individual records of $U_{crit}$ for 1037 rainbow trout. Each fish was weighed, measured manually for length, and allowed time to recover. Afterwards, all fish were imaged. Each fish was placed facing the same direction, and its lateral side was captured simultaneously in two images: an RGB image and a depth image in which each pixel contains the distance between the camera and the lateral surface of the fish (Figure 1). By combining these two images, we obtained a colored 3D hologram of each fish with its lateral side up (Figure 2, left).
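The paper does not describe how the RGB and depth images were combined. One common approach, shown here only as a minimal sketch, is to back-project each depth pixel through a pinhole camera model and attach the color of the matching RGB pixel. The function name and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and would come from the calibration of the actual imaging equipment.

```python
import numpy as np

def depth_to_colored_points(rgb, depth, fx, fy, cx, cy):
    """Back-project a depth image to 3D points and attach RGB colors.

    Assumes a pinhole camera and pixel-aligned RGB and depth images
    (both are assumptions, not details taken from the paper).
    Returns an (N, 6) array with columns x, y, z, r, g, b.
    """
    h, w = depth.shape
    v, u = np.indices((h, w))          # v: row index, u: column index
    z = depth.astype(float)
    x = (u - cx) * z / fx              # horizontal coordinate
    y = (v - cy) * z / fy              # vertical coordinate
    valid = z > 0                      # drop pixels without a depth reading
    xyz = np.stack([x[valid], y[valid], z[valid]], axis=1)
    colors = rgb[valid].astype(float)
    return np.hstack([xyz, colors])

# Toy 2 x 3 image pair; one pixel has no depth reading.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[..., 0] = 255                      # every pixel is pure red
depth = np.array([[1.0, 0.0, 2.0],
                  [1.0, 1.0, 1.0]])
cloud = depth_to_colored_points(rgb, depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
```

Each row of `cloud` is one colored surface point. Pixels with zero depth are discarded, so the number of rows equals the number of valid depth readings.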
We also corrected the recorded $U_{crit}$ of each fish for body length. To predict corrected $U_{crit}$, we used part of the analytical framework for image data proposed by Xue et al. (2023). Predictive features were visualized using gradient-weighted class activation mapping (Grad-CAM) (Selvaraju et al., 2017). We incorporated the interpretation of fish physiologists and refined the visualized predictive features into two swimming traits. We then annotated these traits on the original RGB and depth images of each individual (Figure 3). We estimated the heritability of these traits and their genetic correlation with corrected $U_{crit}$ using an animal model: $y = \mu + a + e$, where $y$ are the measurements of the traits, $\mu$ is the overall mean of each trait, $a$ are the additive genetic effects, and $e$ are the residuals.
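The core Grad-CAM computation from Selvaraju et al. can be sketched in a few lines. This sketch assumes that the feature maps of the last convolutional layer and the gradients of the predicted value with respect to those maps have already been extracted from a trained CNN. It uses 2D maps for brevity and is not the pipeline used in this study; the function name and the toy arrays are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations: (K, H, W) feature maps of the chosen layer.
    gradients:   (K, H, W) gradients of the model output (here assumed to be
                 a scalar prediction) with respect to those feature maps.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    # Channel weights: global average of the gradients over space.
    weights = gradients.mean(axis=(1, 2))                 # shape (K,)
    # Weighted sum of the feature maps over channels.
    cam = np.tensordot(weights, activations, axes=1)      # shape (H, W)
    # ReLU keeps only regions that increase the prediction.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy example with two channels on a 2 x 2 grid.
activations = np.array([[[1.0, 2.0], [3.0, 4.0]],
                        [[1.0, 1.0], [1.0, 1.0]]])
gradients = np.array([[[1.0, 1.0], [1.0, 1.0]],
                      [[-2.0, -2.0], [-2.0, -2.0]]])
heatmap = grad_cam(activations, gradients)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the image, so that the highlighted regions can be compared with anatomical structures.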
Results
The correlation between the CNN prediction and the observed corrected $U_{crit}$ was 0.23. The right side of Figure 2 shows the 3D predictive features revealed by Grad-CAM. The regions highlighted in red correspond to the contour of the fish, the volume of the head, the caudal fin, and the volume of a narrow region along the dorsal side. Based on the location of these features and the biological function of the corresponding regions of the fish, two swimming traits were defined in consultation with fish physiologists. The first is the volume of the head. The second is the volume of the epaxial muscle corrected for body weight, hereafter referred to as the ratio of epaxial muscle. The heritabilities of head volume and ratio of epaxial muscle were 0 and 0.23, respectively. No significant genetic correlation was found between head volume and corrected $U_{crit}$. However, the ratio of epaxial muscle had a genetic correlation of 0.35 with corrected $U_{crit}$.
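For reference, the reported genetic parameters follow the usual quantitative-genetic definitions under an animal model with additive genetic and residual effects only. The variance-component symbols below are notation assumed for this sketch and are not defined in the paper: $\sigma_a^2$ and $\sigma_e^2$ are the additive genetic and residual variances of a trait, and $\sigma_{a_1 a_2}$ is the additive genetic covariance between two traits.

```latex
% Heritability: share of phenotypic variance that is additive genetic
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}

% Genetic correlation between trait 1 and trait 2
r_g = \frac{\sigma_{a_1 a_2}}{\sqrt{\sigma_{a_1}^2 \, \sigma_{a_2}^2}}
```

Under these definitions, a heritability estimate of 0 means that no additive genetic variance was detected for the trait. The denominator of $r_g$ is then zero, so a genetic correlation involving that trait cannot be estimated reliably.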
Discussion
In this study, we built a CNN using 3D images to investigate the relationship between the physical characteristics and the swimming speed of rainbow trout. The predictive features from CAM were narrowed down by annotation to two traits: head volume and ratio of epaxial muscle. The prediction of the image-based CNN had only a modest correlation (0.23) with corrected $U_{crit}$, but this is enough for the activation map of the trained CNN to provide valuable information for pinpointing novel predictor traits. The results of the genetic analysis validate the intrinsic relationship between $U_{crit}$ and the ratio of epaxial muscle, a trait derived from the activation map features of the trained CNN. The positive genetic correlation makes intuitive sense: if two fish have the same weight, the one with the larger volume of epaxial muscle swims better. Head volume, however, shows no significant genetic relationship with $U_{crit}$. The highlighting of the head region in the CAM might be due to variation in other properties of the head, such as its shape or its distance relative to other highlighted parts. Future studies will continue the interpretation of predictive features, also with different animal models.
References
Brett, J.R., 1964. The respiratory metabolism and swimming performance of young sockeye salmon. Journal of the Fisheries Board of Canada, 21(5), pp.1183-1226.
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D. and Batra, D., 2017. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 618-626).
Vo, T.T.E., Ko, H., Huh, J.H. and Kim, Y., 2021. Overview of smart aquaculture system: Focusing on applications of machine learning and computer vision. Electronics, 10(22), p.2882.
Xue, Y., Bastiaansen, J.W., Khan, H.A. and Komen, H., 2023. An analytical framework to predict slaughter traits from images in fish. Aquaculture, 566, p.739175.