Fish counting makes it possible to estimate the appropriate feeding amount and the total volume of product at harvest time for a fish cage. Currently, the number of fish is estimated when the fish farmer buys the juveniles, by dividing the total weight of the juvenile fish by the average weight of a sample of them. With this method, the gap between the estimated and actual fish count can be as large as 50%.
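The weight-based estimate described above can be written out as a short calculation. The following is a minimal sketch only: the function names and the example weights are hypothetical and are not taken from any farm data.

```python
def estimate_fish_count(total_weight_g: float, sample_weights_g: list[float]) -> float:
    """Estimate the number of fish as total weight / mean sampled weight."""
    if not sample_weights_g:
        raise ValueError("at least one sampled fish weight is required")
    mean_weight_g = sum(sample_weights_g) / len(sample_weights_g)
    if mean_weight_g <= 0:
        raise ValueError("mean sampled weight must be positive")
    return total_weight_g / mean_weight_g


def relative_gap(estimated: float, actual: float) -> float:
    """Gap between estimated and actual count, as a fraction of the actual count."""
    return abs(estimated - actual) / actual


# Hypothetical numbers: 500 kg of juveniles, three sampled fish averaging 100 g.
estimate = estimate_fish_count(500_000, [90, 100, 110])
print(estimate)                      # estimated number of fish
print(relative_gap(estimate, 4000))  # gap if the cage actually holds 4000 fish
```

Because the estimate is inversely proportional to the sampled mean weight, a sample that is not representative of the whole batch shifts the count directly, which is one way a large gap can arise.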
Counting fish in a fish cage first requires detecting them. Deep learning methods have proven to work well for such detection tasks. The challenge in training such a model for our objective is collecting a sufficiently large and varied dataset of underwater video. Because water clarity changes with the season, weather, and other environmental factors, the training data must cover enough of this variation for a vision-based counting method to work robustly under any conditions.
Methods and Results
Our proposed method uses synthetic data generated by a realistic bio-inspired fish simulation combined with a physically based underwater simulation, called Foids [1,2]. The simulation accounts for the density of chlorophyll and sediment; the scattering, intensity, and direction of light, which vary with the date and time of day; and the shape and size of the net pen. The Foids algorithm enables us to obtain training videos of fish schools for an arbitrarily chosen day, time, and weather (see Fig. A). Furthermore, the number of fish can be set freely, and the 3D configurations of the fish can be obtained at will. Since these quantities are known exactly, the synthetic data serve as a better training dataset than videos taken in real situations. This paper introduces a fish counting application based on Foids, in which YOLOv4 [3] (Fig. B) is used for fish detection (see Fig. C). Figure D shows, for each species, pictures obtained from the field work (upper) and pictures produced by the simulation (lower). In the simulation, cameras are placed at the same points as in the field work carried out in the actual net pen. The result of fish counting with our method is shown in Fig. E. Our deep learning method, trained on the synthetic data created with Foids, reduces the counting time by 98% compared with manual counting, and the two counts differ by approximately 3%. We will further develop the method for providing synthetic datasets and apply it to estimating fish size from videos taken in the fish cage.
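The abstract does not specify how per-frame detections are aggregated into a count, so the following is only an illustrative sketch under stated assumptions: each frame is represented by a list of detection confidence scores (as a detector such as YOLOv4 would produce), detections below a hypothetical threshold are discarded, and the median of the per-frame counts of visible fish is taken. The two helper functions show how the reported comparison figures (time reduction and count difference relative to manual counting) can be computed. All names and numbers are hypothetical.

```python
from statistics import median


def count_in_frame(confidences: list[float], threshold: float = 0.5) -> int:
    """Count detections in one frame whose confidence reaches the threshold."""
    return sum(1 for c in confidences if c >= threshold)


def estimate_visible_count(frames: list[list[float]], threshold: float = 0.5) -> float:
    """Assumed aggregation: median of per-frame counts of visible fish."""
    if not frames:
        raise ValueError("at least one frame is required")
    return median(count_in_frame(f, threshold) for f in frames)


def relative_difference(automatic: float, manual: float) -> float:
    """Difference between automatic and manual counts, relative to the manual count."""
    return abs(automatic - manual) / manual


def time_reduction(automatic_seconds: float, manual_seconds: float) -> float:
    """Fraction of the manual counting time saved by the automatic method."""
    return 1.0 - automatic_seconds / manual_seconds


# Hypothetical confidence scores for three frames.
frames = [[0.9, 0.8, 0.3], [0.95, 0.6, 0.7], [0.9, 0.4, 0.2]]
print(estimate_visible_count(frames))  # median of the per-frame counts 2, 3, 1
print(relative_difference(97, 100))    # hypothetical automatic vs. manual count
print(time_reduction(25, 100))         # hypothetical timings in seconds
```

The median is used here only because it is robust to frames in which fish are briefly occluded or double-detected; other aggregations, such as tracking individual fish across frames, are equally plausible.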
[1] Y. Ishiwaka et al., “Foids: Bio-Inspired Fish Simulation for Generating Synthetic Datasets,” ACM Transactions on Graphics 40, 207, 1–15 (2021).
[2] Y. Ishiwaka et al., “DeepFoids: Adaptive Bio-Inspired Fish Simulation with Deep Reinforcement Learning,” Advances in Neural Information Processing Systems 35 (NeurIPS 2022).
[3] A. Bochkovskiy et al., “YOLOv4: Optimal Speed and Accuracy of Object Detection” (2020).
A: Physically based environment simulation. B: Network architecture based on YOLOv4. C: Fish detection. The leftmost panel shows the scene; the remaining panels, from left to right, show detection using 2D bounding boxes, 3D bounding boxes, and silhouettes, respectively. D: The upper panels show pictures from videos taken in actual fish cages. The lower panels show pictures taken from the synthetic data. In each row, the pictures show yellowtail, red seabream, and coho salmon, respectively. E: Comparison of the time and results of fish counting between our proposed method and manual counting.