THE PROJECT
Objectives & general description of the action
The main purpose of the project is to develop an integrated fisheries information system with the aim to enable reliable scientific advice and to support the work of the RCGs, facilitating a better performance towards efficient management, fast response times of data processing and increase of data robustness delivered to end-users. In addition, the RDBs will facilitate the work of the EU Member States by reducing the burden of multiple data submissions (for data calls) under different formats. They will allow end users to calculate statistical estimates of data tailored to their needs, and help to streamline and ease the reporting of Member States on the EU data collection. For the construction of the database and for the estimation subsystem, the hierarchies, structure and estimation methods of existing development initiatives (similar to RDBES) will be considered. The tools incorporated in the Med&BS RDBFIS will facilitate the fulfilment by EU MS of data collection submission and reporting obligations at the EU and international levels. Therefore, the RDBFIS should become the source of EU data to support data requirements under the DCF for the specified marine region and should contain detailed biological data of demersal and small pelagic species and aggregated transversal data (i.e. landings and effort). There are also a number of other areas that the RDBFIS could service, such as: bycatch and PETS data, large pelagic data, recreational data, alien species. As an added value, a series of advanced functionalities will be incorporated in the RDBFIS allowing for: (i) estimation and mapping of effort and landings by rectangle for the small scale fisheries, (ii) analysis of VMS data and (iii) estimation of ecological indicators. The tools and processes that will comprise the parts of the RDB will ensure the ‘value’ and quality of data: data has to be validated, quality-checked, compliant, inter-exchangeable and well defined.
Building on the results of MARE/2014/19, MARE/2016/22 and other relevant existing work (RDBES, IMAS-Fish, FishFrame), this project aims to further enable RCGs and MS to strengthen regional or EU-wide cooperation on data collection. The final product will be a state-of-the-art, web-based integrated fisheries information system driven by a user-friendly graphical user interface to support the management, analysis and elaboration of alphanumeric and spatial data from the fisheries sector. Open source packages will be used for the development of the Med&BS RDBFIS. The final hosting location of the Med & BS RDB will be decided by the Commission, the RCG and the Member States.
Workpackages Details
Work Package 0 will form the basis of the management of the project. It covers the organization of the first project meeting (kick-off) and of the regular coordination meetings, during which the coordinator will monitor progress, resolve any problems that arise in the course of the project and organize future activities. In addition, the objectives of coordination are: (i) to monitor all activities so that the project outputs are delivered according to the project time schedule and the decisions taken, (ii) to ensure communication among stakeholders: the Commission, the Member States of the RCG Med & BS, and the main end users of the region (STECF, GFCM, ICCAT). In addition to the regular meetings, the coordinator will be able to organize smaller meetings with the scientists responsible for individual Work Packages whenever a need arises in a given implementation phase. The activities to be carried out in WP0 will be: (i) Preparation of the kick-off, 1st progress, mid-term, 2nd progress and final meetings; (ii) Coordination of the WP activities in collaboration with WP and task leaders, ensuring connectivity among them; (iii) Preparation and submission of the required reports (1st Progress, Interim, 2nd Progress, Draft Final, and Final Reports). One of the main objectives of WP0 is to organise the preparation of the deliverables and to establish a procedure to monitor their progress and their submission within the deadlines set in the contract. The Coordinator will be solely responsible for monitoring the production of the reports, and for collecting and checking the material that the responsible scientists of the Work Packages must provide for their compilation, in accordance with the provisions of the contract. Building on the experience of the previous grants, close collaboration will be established between this proposal and those of the other RCGs, should those grants be awarded.
Furthermore, a proactive cooperation will be built with the grant working on the Annex 1 of the Call MARE/2020/08 (Establishing Regional Work Plans (RWP) for the following regions covered by the work of RCGs: Baltic / North Atlantic, North Sea, Eastern Arctic / Mediterranean & Black Seas / long distance fisheries / large pelagics) and Annex 2 (Actions in support of the work of RCGs).
The Work Package 1 will aim at providing a mapping and analysis of the current situation, ongoing studies, tools and developments that are relevant for the RDB in the Mediterranean and Black Seas. This mapping will include the stakeholders to be involved: RCG Med & BS, including the Steering Committee on the RDB, JRC, STECF. The overview of the current situation, ongoing studies, developments and suggestions developed in the deliverable 3.2 of the STREAM project (https://datacollection.jrc.ec.europa.eu/docs/regional-grants) will be taken into account and updated on the basis of the new mapping, in order to give a picture of the elements as a basis for the evaluation of options regarding the hosting of a RDB. The recommendations from STREAM Deliverable 3.2 regarding the main characteristics of the RDB will be taken into account, considering the assessment of data quality and the emerging configuration of the Regional Database and Estimation System (RDBES). ICES will be thus consulted on the development of the RDBES, to allow for synergies. The chairs of the RCG North Atlantic, North Sea & Eastern Arctic and Baltic will be consulted on the Regional Database Fishframe, which is currently operational. This thorough consultation will be important for this WP, taking into consideration the need of a further evolution of standard and data formats adopted at Mediterranean and Black Sea level, a crucial point on which task 3, 4 and 5 of WP4 will be deeply involved in this project. A list of expectations of the various stakeholders (requirement specifications), the potential users (actors) and the list of actions defining the interactions between a user and the system to achieve a goal (use cases) will be described during the course of this project. This task will be accomplished building upon the results of the STREAM deliverable 3.2, in which the main stakeholders, potential users and their expectations, were identified based on different information sources. 
Furthermore, this WP will finalise an inventory and analysis of the main existing tools in support of the DCF. It will draw on the work carried out in MARE 2014/19 and 2016/22 Med & BS (https://datacollection.jrc.ec.europa.eu/docs/regional-grants), on the availability of licensed and open-source/free tools that can be used out of the box with little configuration and no development, and on information coming from consultation of the relevant end-users and stakeholders.
To help MS meet their data submission and reporting obligations, this WP will use and update the inventory work carried out in the STREAM project concerning the main RDB stakeholders, potential users and their expectations, as well as the collation of information about the data calls relevant for the Mediterranean and Black Seas. For the aggregated data, the formats required by the different data calls will be considered: DG MARE MED&BS, FDI, GFCM DCRF, RCG Med&BS (Data Sharing Agreement, STREAM project), as well as the tables of the Annual Report of the DCF. The development of tools able to communicate with the new RDB and to automatically produce the aggregated data according to consolidated algorithms, in any of the requested formats, is thus pivotal to guarantee punctual and accurate data submission and reporting by a MS. For these reasons, the R tools implemented in the previous grants (MARE 2014/19 and STREAM) will need further development to ensure communication with the new RDB and compliance with the new data call specifications. Moreover, these updates will facilitate synergies with a new project, if one is awarded under the grant MARE/2020/08. A specific module for data validation and quality checks is foreseen in order to ensure an acceptable level of quality for the detailed data stored in the RDB and, consequently, for the aggregated data submitted to end-users. Consistency among the aggregated data submitted through the different data calls is also of paramount importance; the issue was raised by the STECF EWG in 2019 (STECF-19-14 report), where discrepancies between DGMARE MED&BS and FDI data were observed. The Med & BS RDBFIS is to be managed by a decision-making panel, hereafter named the Mediterranean and Black Sea Regional Data Base Steering Committee.
It will comprise a representative from each of the involved parties: (i) European Commission, (ii) EU Member States, (iii) Host and (iv) Developer. Data access policy will be dictated by the relevant provisions of EU law, and more specifically by the Regulation on the fisheries data collection framework (COM 1004/2017)[1]. Access to data will be provided based on the specific ‘end-users’. The Med & BS RDBFIS will be hosted on a secure server with the following features: access control, authentication, encryption, integrity controls and backups. The data to be submitted will follow the formats, guidelines and vocabularies specific to the type of data associated with the multiannual Union Programme for the collection and management of data in the fisheries and aquaculture sectors (COM 1004/2017 and COM 909/2019[6], COM 910/2019[7]). The timetable of periodic submissions will be defined by the Steering Committee based on the needs of the relevant groups (RCGs, STECF EWGs, GFCM SAC, ICCAT WGs).
The aim of this WP is to define the specifications and final requirements for the RDB. The different dimensions identified in the previous Work Packages will be taken into account: technology, people, processes, data, institutions, and existing databases. The database should contain detailed biological data of demersal and small pelagic species and aggregated transversal data (i.e. landings and effort). The aggregation level depends on the type of data and should be a combination of gear, fishing technique, mesh type and size coding, fishery, vessel length category, species, port, GSA, ERS rectangle, etc. The DCF codification system and the aggregation levels defined in the annexes of the data calls will be incorporated and fully supported in the system. Reference lists will be constructed to increase performance and to support the consistency and integrity of the database. The lists will refer, among others, to species (according to FAO, ITIS etc.), fishing ports, FDI fishing rectangles and fishing gears. One of the main goals of the project will be the construction of a data validation and quality checking module. This will build on the experience gained within the STREAM project concerning data quality procedures to be applied to both detailed and aggregated data. The RCG and end users will be contacted to indicate specific needs. Additional tools will be incorporated in the system to support: user-oriented upload and download of data, an exchange format, data processing tools for statistical analysis, reporting specific to data calls, and automatic reporting linked to DCF processes. A web-based user interface will be constructed to ensure a friendly interaction between users, application and database. The system functionalities will support, among others, processes to upload, validate and report data, and to search, compare, compile, aggregate, plot, map and visualize data.
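The aggregation levels described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the project itself foresees R tools), and the field names, the choice of stratification keys and the sample records are assumptions, not the agreed data call specification.

```python
from collections import defaultdict

# Hypothetical stratification keys; the real ones would come from the
# annexes of the relevant data call.
STRATA = ("gsa", "gear", "vessel_length", "species", "quarter")


def aggregate_landings(records, strata=STRATA):
    """Sum landed weight (kg) over every combination of the given strata."""
    totals = defaultdict(float)
    for rec in records:
        key = tuple(rec[name] for name in strata)
        totals[key] += rec["landings_kg"]
    return dict(totals)


# Illustrative records only (not real data).
records = [
    {"gsa": "GSA18", "gear": "OTB", "vessel_length": "VL1218",
     "species": "HKE", "quarter": 1, "landings_kg": 120.0},
    {"gsa": "GSA18", "gear": "OTB", "vessel_length": "VL1218",
     "species": "HKE", "quarter": 1, "landings_kg": 80.0},
    {"gsa": "GSA18", "gear": "GNS", "vessel_length": "VL0612",
     "species": "MUT", "quarter": 1, "landings_kg": 15.5},
]

by_stratum = aggregate_landings(records)
by_gsa = aggregate_landings(records, strata=("gsa",))
```

Passing a shorter tuple of strata yields a coarser aggregation from the same records, which is one way a single detailed dataset could serve data calls with different aggregation levels.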
Task 4.1 – Data base construction
For the development of the RDB, a data model and documentation will be produced to specify the structure of the data, the hierarchies, and the referential integrity between tables. The data model will take into account the RDBES one, in accordance with the proposed functionalities for the Med&BS RDB reported in the latest report of the ICES Steering Committee of the Regional Fisheries Database (SCRDB) (ICES, 2020). In addition, the RDB will be extended incorporating additional data to support the advanced functionalities proposed to be implemented in the Med & BS RDB (fishing pressure and fishing effort from small scale fisheries using a multi-criteria decision analysis, VMS data, environmental data, mapping fishing effort and landings by rectangle). There are also a number of other areas that the RDB could be used for, including: bycatch and PETS data, large pelagic data, recreational data, alien species, survey data.
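As a rough illustration of hierarchies and referential integrity between tables, the sketch below uses Python's built-in sqlite3 module as a stand-in for the database server. The table and column names are hypothetical and far simpler than the RDBES hierarchies the data model will actually consider.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE species (
    code TEXT PRIMARY KEY,            -- reference list entry
    scientific_name TEXT NOT NULL
);
CREATE TABLE trip (
    trip_id INTEGER PRIMARY KEY,
    country TEXT NOT NULL,
    gsa TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    trip_id INTEGER NOT NULL REFERENCES trip(trip_id),
    species_code TEXT NOT NULL REFERENCES species(code),
    length_mm INTEGER NOT NULL CHECK (length_mm > 0)
);
""")

conn.execute("INSERT INTO species VALUES ('HKE', 'Merluccius merluccius')")
conn.execute("INSERT INTO trip VALUES (1, 'ITA', 'GSA18')")


def insert_sample(conn, sample_id, trip_id, species_code, length_mm):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO sample VALUES (?, ?, ?, ?)",
            (sample_id, trip_id, species_code, length_mm),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, a sample that points to a non-existent trip or to a species code missing from the reference list is rejected by the database itself rather than by application code.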
Task 4.2 – Design and develop a graphical user interface (GUI), access and security subsystem
Users interact with the RDB platform through a rich client running in the browser, which will be implemented using a state-of-the-art JavaScript framework (Angular / React / Vue.js / etc.). The User Interface (UI) components of the front end are the most direct point of contact between the end user and the platform. The entities of the database are modeled as JavaScript Object Notation (JSON) objects. Data are exchanged over HTTPS, i.e. encrypted with the TLS protocol, in order to ensure confidentiality. All requests are filtered in order to check for authentication and authorization. Authorization ensures the enforcement of access rights according to the user roles defined in the data policy document. The Data Validation / Processing Layer includes software modules for input data validation according to common quality checks (Data Validation), as well as software modules for simple and advanced statistical data analysis (Data Processing). The results of data processing are provided to the front end of the platform. The Data Access Layer provides read and write access to the persistent storage, which will be deployed on a PostgreSQL database server. Information pertaining to the users, their roles and the relevant access rights is also stored in persistent storage.
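The request filtering described above can be sketched as a simple role-based check. This is an illustrative Python sketch only: the role names, permissions and the Request type are hypothetical, since the actual roles are to be defined in the data policy document.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical roles and permissions; the real ones would be taken from
# the data policy document.
ROLE_PERMISSIONS = {
    "member_state_submitter": {"upload", "read_own"},
    "end_user": {"read_aggregated"},
    "administrator": {"upload", "read_own", "read_aggregated", "manage_users"},
}


@dataclass(frozen=True)
class Request:
    user: Optional[str]  # None means the request carries no valid login
    role: Optional[str]
    action: str


def authorize(request: Request) -> int:
    """Return an HTTP-style status: 401 unauthenticated, 403 forbidden, 200 allowed."""
    if request.user is None:
        return 401
    allowed = ROLE_PERMISSIONS.get(request.role, set())
    return 200 if request.action in allowed else 403
```

Checking authentication before authorization keeps the two failure modes distinct, so the front end can tell a missing login apart from insufficient access rights.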
Task 4.3 – Developing data validation and quality checking tools
This task will build on the experience gained within the STREAM project concerning data quality procedures to be applied to both detailed and aggregated data. During this task, the RCG and end users will be contacted to identify specific needs. The ad hoc R data validation and quality check tools are based on a two-step process to verify the consistency of the biological data: (i) a priori quality checks (QC), to detect possible inconsistencies and inaccuracies already present in the detailed data, (ii) a posteriori QC, designed to verify the temporal and spatial coverage and to confirm that data consistency is maintained in the aggregated dataset. A new input RDB data format will be conceived, building on the progress made within the ICES Steering Committee of the Regional Fisheries Database (https://www.ices.dk/community/groups/Pages/SCRDB.aspx), in line with the structure of the new RDB designed and developed in Task 4.1. This new format will have the advantage of tracking additional information on the sampling scheme, in terms of sampling frame, strata and hierarchy, in line with the RDBES format (https://github.com/ices-tools-dev/RDBES/tree/master/Documents). If survey data are also included in the RDB, the functions of the new RoME package will be included in this task for data validation and quality checks.
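The two-step logic can be sketched as follows. The project foresees R tools for this purpose; the Python below is only an illustration, and the length ranges, field names and specific checks are assumptions rather than the STREAM procedures themselves.

```python
# Hypothetical plausible length ranges (mm) per species code.
LENGTH_RANGE_MM = {"HKE": (30, 1100), "MUT": (30, 400)}


def a_priori_checks(detailed):
    """Step 1: flag inconsistencies already present in the detailed records."""
    errors = []
    for i, rec in enumerate(detailed):
        bounds = LENGTH_RANGE_MM.get(rec["species"])
        if bounds is None:
            errors.append(f"row {i}: unknown species code {rec['species']!r}")
        elif not bounds[0] <= rec["length_mm"] <= bounds[1]:
            errors.append(f"row {i}: length {rec['length_mm']} mm outside {bounds}")
    return errors


def a_posteriori_checks(detailed, aggregated, expected_quarters=(1, 2, 3, 4)):
    """Step 2: check temporal coverage and that totals survive aggregation."""
    errors = []
    covered = {row["quarter"] for row in aggregated}
    missing = [q for q in expected_quarters if q not in covered]
    if missing:
        errors.append(f"missing quarters: {missing}")
    n_detailed = len(detailed)
    n_aggregated = sum(row["n_individuals"] for row in aggregated)
    if n_detailed != n_aggregated:
        errors.append(
            f"individual count mismatch: detailed={n_detailed}, aggregated={n_aggregated}"
        )
    return errors


# Illustrative inputs only (not real data).
detailed = [
    {"species": "HKE", "quarter": 1, "length_mm": 245},
    {"species": "HKE", "quarter": 2, "length_mm": 5000},
    {"species": "ABC", "quarter": 2, "length_mm": 100},
]
aggregated = [
    {"quarter": 1, "n_individuals": 1},
    {"quarter": 2, "n_individuals": 1},
]

a_priori_errors = a_priori_checks(detailed)
a_posteriori_errors = a_posteriori_checks(detailed, aggregated)
```

Each function returns a list of messages rather than stopping at the first problem, so that all issues found in a dataset can be reported together.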
Task 4.4 – Developing data processing tools
The aim of Task 4.4 will be to update the relevant scripts according to the latest specifications of the DGMARE MED&BS and FDI data calls. New functionalities will be developed as R scripts to help users check and prepare input data for upload to the database, focusing on the correctness of the data formats. Specific checks will assess the presence of the expected values for each field and type of table. The checks will use data validation rules based on available dictionaries or expected ranges of values. These functions will support users in correcting formal errors by producing log files that list the errors detected during the checks. The end-user will also be supported in the data export process: precompiled queries will be submitted to the database and the selected output will be provided to the user in comma-separated values (CSV) format. The export functionalities will be available for all the data included in the database, subject to the reference data policy on data confidentiality and data ownership, and taking into account the suggestions in the latest report of the ICES Steering Committee of the Regional Fisheries Database (ICES, 2020). The precompiled queries will allow specific filters (e.g. country, area, time period) to be applied to select the exact set of data required.
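A precompiled query with optional filters and CSV output might look like the following sketch. It uses Python with an in-memory SQLite database purely for illustration; the table, column names and sample rows are hypothetical, and the data policy checks on confidentiality and ownership are deliberately left out.

```python
import csv
import io
import sqlite3

# Hypothetical precompiled query: a filter left as None means "no restriction".
LANDINGS_QUERY = """
SELECT country, gsa, year, species, landings_kg
FROM landings
WHERE (:country IS NULL OR country = :country)
  AND (:gsa IS NULL OR gsa = :gsa)
  AND (:year_from IS NULL OR year >= :year_from)
  AND (:year_to IS NULL OR year <= :year_to)
ORDER BY country, gsa, year, species
"""


def export_csv(conn, country=None, gsa=None, year_from=None, year_to=None):
    """Run the precompiled query with the given filters and return CSV text."""
    params = {"country": country, "gsa": gsa,
              "year_from": year_from, "year_to": year_to}
    cursor = conn.execute(LANDINGS_QUERY, params)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)
    return buffer.getvalue()


# Illustrative data only (not real landings).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE landings (country TEXT, gsa TEXT, year INTEGER, "
    "species TEXT, landings_kg REAL)"
)
conn.executemany(
    "INSERT INTO landings VALUES (?, ?, ?, ?, ?)",
    [
        ("ITA", "GSA18", 2019, "HKE", 1200.5),
        ("ITA", "GSA18", 2020, "HKE", 1100.0),
        ("GRC", "GSA22", 2020, "MUT", 310.0),
    ],
)

italy_2020 = export_csv(conn, country="ITA", year_from=2020)
```

Because the filter values are bound as parameters rather than pasted into the SQL text, the query itself stays fixed and user input cannot alter its structure.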
This task will take advantage of the scripts developed in the STREAM project to produce the fishery biological aggregated data in the formats required by the three relevant data calls of the Mediterranean and Black Sea: DGMARE MED&BS, FDI and GFCM DCRF. Specifically, the STREAM project developed a set of auxiliary R scripts to convert datasets into the formats required for data transmission: (i) from the RCG Med&BS Data Call format to the SDEF (COST) format, (ii) from the SDEF (COST) format to the DG MARE MED&BS Data Call format, (iii) from the SDEF (COST) format to the GFCM/DCRF Data Call format and (iv) from the SDEF (COST) format to the DG MARE FDI Data Call format (using the DG MARE Med&BS Data Call format).
Task 4.5 – Input – output facilities, automatic reporting
With a view to ensuring acceptable data quality in the RDB, and in accordance with the proposed functionalities for the Med&BS RDB reported in the latest report of the ICES Steering Committee of the Regional Fisheries Database (ICES, 2020), the present task will focus on the production of R tools to support data validation, quality checks (QC) and format conversion, following a structured flow. These tools will guide users in the production of automatic reports, which will collect all the results of the quality checks carried out on the data during both the input and output phases of RDB use. The R tools will be collected in a freely available GitHub repository, in order to make these resources easily accessible. The automatic reporting facilities will be provided by R scripts embedded in R Markdown documents (Rmd files to be run in the RStudio environment), enabling the production of automatically updated reports in different final formats, such as HTML, MS Word and Adobe PDF, among others. The Rmd files will integrate all the functions developed in Task 4.3 and included in the ad hoc R library produced as a deliverable of that task. Specific automatic reports will be produced to perform both a priori and a posteriori checks, in order to carry out data validation and quality checks on the detailed data and on the output data required by end-users, respectively. Taking into account the different formats required by the end-users, separate Rmd report files will be provided for each of the DGMARE Med&BS, FDI and GFCM DCRF Data Call formats.
Based on what will be achieved under the previous WPs, WP5 will be in charge of testing the population of the new Med & BS RDB with real data. Specifically, different features will be tested using input from relevant stakeholders and Member States. Member States will incrementally populate the RDB and provide feedback for potential improvements (process described in detail in the next paragraph). The rest of the stakeholders deemed relevant will be asked to provide feedback towards completion of the RDB and following feedback from the Member States. This is an essential process to check the feasibility of the outputs in the implementation of the RDB.
A dedicated workshop will be organized at month 20 of the project to test the use of the RDB by populating it with data coming from a range of case study areas, which will be identified as suitable tests for the RDB. The workshop is currently planned to be held remotely, given the still uncertain evolution of the Covid-19 outbreak and the consequent restrictive measures. During the workshop, the experts in charge of the testing activities will populate the RDB and note any issue that may emerge (e.g. data format issues, acceptable values, etc.), providing prompt feedback to the experts in charge of the RDB development. The continuous flow of information and feedback will allow the RDB functionalities to be updated almost in real time. In addition to the Member States, and following the workshop, additional feedback will be sought from relevant stakeholders such as the Commission and other regional end users such as STECF, GFCM and ICCAT. In parallel, a User Manual will be produced.
The aim of WP6 is to put each incremental version of the RDB into production with the corresponding documentation, including an SLA (Service Level Agreement). This phase should also include training of stakeholders, including Member State experts and administrations.
Deliverables: (i) a Minimum Viable Product, tested and rolled out in production, together with its user manual, (ii) an as yet undefined number of intermediate versions of increasing scope, each individually tested and rolled out in production, together with its user manual, (iii) the full-scope product, tested and rolled out in production, together with its user manual.