Introduction
The aquaculture industry produces enormous amounts of data every day. The data comes from sensors in the fish cages, cameras, boats, and feeding barges, but also from e-mails, notes, and conversations between the people working at the production sites. Some tools exist for gathering and displaying this data, and there are several initiatives for standardizing it, but today’s solutions still struggle to find and visualize the interconnections between the gathered data. As the quantity of data grows, these interconnections become more important, yet it is difficult for humans to navigate millions of data points in a table and find them. A graph database [1] stores the relationships and connections between data, as well as the data itself, and is therefore well suited for this purpose. The Aquagraph project (RCN 321422) aims to implement aquaculture data in a graph database and to develop methods and algorithms that will aid fish farmers in their daily operations as well as in their long-term planning. The project group consists of the fish farming company Eide Fjordbruk [2]; SINTEF Ocean [3], one of Europe’s largest ocean research organizations; and leading data scientists from the innovative company and project owner Searis AS [4].
Materials and methods
The project uses the Neo4j graph database software [5] to store and structure the data. Neo4j offers extensive libraries for graph representation, as well as algorithms for search, graph traversal, and clustering, among others.
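To illustrate what graph traversal means in this setting, the following is a minimal sketch in plain Python rather than in Neo4j. It assumes a small, hypothetical asset graph (the site, cage, batch, sensor, and event names, and the relationship names, are invented for illustration and are not taken from the project) and finds every asset within a given number of hops of a starting asset using a breadth-first search.

```python
from collections import deque

# Hypothetical asset graph: node -> list of (relationship, neighbour).
# All node and relationship names are illustrative assumptions.
GRAPH = {
    "Site:A": [("HAS_CAGE", "Cage:1"), ("HAS_CAGE", "Cage:2")],
    "Cage:1": [("HOLDS", "Batch:2022-01"), ("HAS_SENSOR", "Sensor:O2-1")],
    "Cage:2": [("HOLDS", "Batch:2022-02")],
    "Batch:2022-01": [("UNDERWENT", "Event:delousing-1")],
    "Batch:2022-02": [],
    "Sensor:O2-1": [],
    "Event:delousing-1": [],
}


def reachable(graph, start, max_hops):
    """Breadth-first traversal returning {node: hop count} within max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relationship, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen


if __name__ == "__main__":
    # Everything connected to the site within two relationships.
    print(reachable(GRAPH, "Site:A", 2))
```

In a graph database the same question is answered by the database engine itself; the sketch only shows the idea that related assets are found by following relationships rather than by joining tables.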
Before the data is brought into a graph database, the assets that belong to an aquaculture production process, and the relationships between them, need to be given formal names and definitions in the form of an ontology data model [6]. The data model defines the general concepts in a domain context, here aquaculture fish farming operations. Once the ontology is in place, assets can easily be added to and removed from the graph, and the graph database will automatically ensure that each asset has the correct properties and relationships. As such, scaling and updating a graph database is fast and effective compared to traditional databases. Figure 1 shows how some assets and the life cycle of the fish are described in the ontology, while Figure 2 shows how these are implemented in the graph database and connected to their dependencies, with the correct relationships and properties. Other assets included in the ontology are sensor data, batches of fish, observations, actors, and events.
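The role of the ontology as a gatekeeper for the graph can be sketched as follows. This is an in-memory Python illustration, not the project's Neo4j implementation: the asset types, property names, and relationship names in `ONTOLOGY`, and the `OntologyGraph` class itself, are hypothetical. The sketch rejects assets that lack the properties their type requires and relationships that the ontology does not define.

```python
# Hypothetical ontology: required properties per asset type, and the
# relationship types allowed between asset types. Illustrative only.
ONTOLOGY = {
    "node_types": {
        "Site": {"name"},
        "Cage": {"cage_id"},
        "Batch": {"batch_id", "stage"},
    },
    "relationships": {
        ("Site", "HAS_CAGE", "Cage"),
        ("Cage", "HOLDS", "Batch"),
    },
}


class OntologyGraph:
    """Tiny property graph that validates every change against an ontology."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.nodes = {}     # key -> (asset type, properties)
        self.edges = set()  # (source key, relationship, target key)

    def add_node(self, key, node_type, **properties):
        required = self.ontology["node_types"].get(node_type)
        if required is None:
            raise ValueError(f"unknown asset type: {node_type}")
        missing = required - set(properties)
        if missing:
            raise ValueError(f"{node_type} is missing {sorted(missing)}")
        self.nodes[key] = (node_type, properties)

    def add_edge(self, source, relationship, target):
        triple = (self.nodes[source][0], relationship, self.nodes[target][0])
        if triple not in self.ontology["relationships"]:
            raise ValueError(f"relationship not in ontology: {triple}")
        self.edges.add((source, relationship, target))

    def remove_node(self, key):
        # Removing an asset also removes the relationships attached to it.
        del self.nodes[key]
        self.edges = {e for e in self.edges if key not in (e[0], e[2])}


if __name__ == "__main__":
    graph = OntologyGraph(ONTOLOGY)
    graph.add_node("site-a", "Site", name="A")
    graph.add_node("cage-1", "Cage", cage_id=1)
    graph.add_edge("site-a", "HAS_CAGE", "cage-1")
    print(graph.edges)
```

Keeping the rules in one data structure means that a new asset type is introduced by extending the ontology, not by changing the code that adds assets.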
The software Clarify [4], developed by Searis, is used in the project to represent time-series data from sensors and as a platform for operators to communicate. Clarify supplies the graph database with data in the form of sensor measurements, time stamps of important events (e.g., delousing operations and feeding times), inputs from the operators, and, perhaps most importantly, anomaly detections that, when represented in a graph, make it simple for the operators to see whether something needs attention.
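How time-stamped events and anomaly detections might be turned into graph elements can be sketched as below. The `Record` layout, the `HAS_EVENT` relationship, and both helper functions are assumptions made for illustration; they do not describe Clarify's actual export format or API. Each record becomes an event node linked to the cage it concerns, and cages with at least one anomaly are listed as needing attention.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Record:
    """Hypothetical time-stamped record; not Clarify's real data format."""
    cage: str
    kind: str           # e.g. "feeding", "delousing", "anomaly"
    timestamp: datetime


def to_graph(records):
    """Turn records into a set of nodes and a list of (cage, rel, event) edges."""
    nodes, edges = set(), []
    for index, record in enumerate(records):
        event = f"Event:{record.kind}:{index}"
        nodes.update({record.cage, event})
        edges.append((record.cage, "HAS_EVENT", event))
    return nodes, edges


def cages_needing_attention(records):
    """Return the cages that have at least one anomaly record, sorted by name."""
    return sorted({r.cage for r in records if r.kind == "anomaly"})


if __name__ == "__main__":
    # Illustrative records only; the values are invented.
    records = [
        Record("Cage:1", "feeding", datetime(2022, 4, 25, 8, 0)),
        Record("Cage:2", "anomaly", datetime(2022, 4, 25, 9, 30)),
    ]
    print(to_graph(records))
    print(cages_needing_attention(records))
```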
Results
Initial results from the project include a complete ontology for structuring aquaculture data. Furthermore, a graph database representation of all aquaculture assets at one of Eide Fjordbruk’s fish farms has been implemented in Neo4j. The graph database and graph visualization have been applied to new data, and historical data has also been included. This makes it possible to compare historical productions, e.g., ones that performed very well, with new, ongoing productions, to determine whether there are differences between them, and to see where effort should be focused, e.g., leading up to a delousing operation or when splitting and merging groups of fish.
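One simple way to compare an ongoing production with a historical reference is to rank the metrics by how far they deviate. The sketch below assumes that each production has already been summarized as a dictionary of metrics; the metric names and values are invented for illustration and are not project results, and `compare_productions` is a hypothetical helper rather than the project's method.

```python
def compare_productions(reference, ongoing):
    """Return (metric, relative difference) pairs, largest deviation first.

    Only metrics present in both productions are compared; metrics whose
    reference value is zero are skipped to avoid division by zero.
    """
    differences = []
    for metric in reference.keys() & ongoing.keys():
        ref_value = reference[metric]
        if ref_value == 0:
            continue
        relative = (ongoing[metric] - ref_value) / abs(ref_value)
        differences.append((metric, relative))
    return sorted(differences, key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Invented example values, for illustration only.
    historical = {"mean_oxygen_mg_l": 8.0, "feed_kg_per_day": 1000.0,
                  "lice_per_fish": 0.2}
    current = {"mean_oxygen_mg_l": 7.6, "feed_kg_per_day": 1100.0,
               "lice_per_fish": 0.3}
    for metric, relative in compare_productions(historical, current):
        print(f"{metric}: {relative:+.0%}")
```

The metrics at the top of the ranking indicate where the ongoing production differs most from the reference and may therefore deserve attention first.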
Conclusion and future work
The results from the Aquagraph project, i.e., the new way of visualizing aquaculture data and the optimization algorithms, may enable fish farmers to learn more about the cause and effect of their actions and the significance of various environmental parameters, and may make it easier to track a group of fish throughout a production cycle, from egg to slaughter. The graph database representation can lead to the discovery of relationships that were previously underestimated or unknown to the fish farmers. It will help put the important data in front of the right people, which may secure a better foundation for decision making. The project lays the foundation for implementing new tools on top of the existing Clarify platform. Furthermore, machine learning can be applied to predict production outcomes and relationships that were previously unknown. The project group believes that augmenting the framework with machine learning capabilities will benefit the project, since it allows the use of not only machine learning plus data, which is good, but machine learning plus data plus the context of the data, which is better.
References
[1] I. Robinson, J. Webber, and E. Eifrem, Graph databases: new opportunities for connected data. O’Reilly Media, Inc., 2015.
[2] Eide Fjordbruk. https://www.efb.no/
[3] SINTEF Ocean. https://www.sintef.no/
[4] Searis AS, Clarify. https://www.clarify.io/
[5] Neo4j, Inc., Neo4j. Accessed: Apr. 25, 2022. [Online]. Available: https://neo4j.com/product/neo4j-graph-database/
[6] C. M. Keet, An Introduction to Ontology Engineering. University of Cape Town, 2018.