Capstone


Enable interactive data visualization for MBARI AUV data

CST489-Capstone Project Planning

Jake Horne, Kevin Yoshimoto, Luis Navarro, Michael Watson & Myles Lopez

Professor Cassandra Eccles & Professor Brian Robertson

Summer 2023

Executive Summary

The Monterey Bay Aquarium Research Institute (MBARI) has spent the past few decades collecting data on the oceans, from changing salinity levels and temperatures to underwater topography. In total, millions of raw data points have been collected, cleaned, and extrapolated for use in a range of current projects. These projects examine overall trends in our oceans, as well as relationships that are not immediately apparent to the human eye but can be found through machine learning models and data extrapolation. Our goal as a team from CSUMB is to create widgets that make data visualization more accessible to a human user, whether that means selecting a certain time period or looking at a particular range of salinity concentrations. A built-in feature of this kind makes switching between views far more user friendly: a click of a button creates the desired visualization, instead of the user coding the desired output explicitly each time. In terms of outcome, our team is expected to create a pull request containing a Jupyter notebook that uses sample public data and demonstrates the capabilities and flexibility of the newly developed widgets in visualizing data that has already been cleaned. This should give users an easier way to examine the data, to see trends over time, and to view anticipated projections based on the data previously collected.



Table of Contents

Introduction/Background
Project Name and Description
Problem and Issue With Technology
Solution to the problem and/or issue in technology
Environmental Scan/Literature Review
Stakeholders
Ethical Considerations
Legal Considerations
Project Goals and Objectives
Goals
Objectives
Final Deliverables
Approach/Methodology
Timeline/Resources
Milestones
Resources Needed
Platform
Risks and Dependencies
Risks
Dependencies
Testing Plan
Team Members
References

Introduction/Background

Project Name and Description

The organization that we are working in tandem with is MBARI, a non-profit oceanographic research center that uses data to make inferences about oceanic changes and the impact that these changes will have in the future. There is no specific project name associated with our work; however, it will be used by the internal team that we are working with directly. The goal is for the work to support the Dorado-class sensor data processing project and eventually to serve as a general function for other teams. This work is important beyond a purely oceanographic perspective. Because a large portion of the world’s population lives in direct contact with the oceans, the information provided by MBARI can be used to assess the living conditions of these populations, for example the effects of rising sea levels and of decreasing pH caused by increased atmospheric carbon dioxide.

Problem and Issue With Technology

MBARI uses robots, both autonomous and remotely controlled, to collect information on salinity and chlorophyll monthly, to take a few examples from the many projects currently being conducted. The issue is that the number of data points collected by these robots is far too large for scientists to process and analyze without software to help visualize the data.

Solution to the problem and/or issue in technology

Our goal as students at CSUMB is to make it easier for scientists on the MBARI team to visualize the data in a format that is more appealing to a human viewer, where the intention of the visualization is apparent from the start. We will use JavaScript-based widgets in a Jupyter notebook to provide readily available masks for quickly toggling between different views of the data. The technology used to gather the data is constantly being refined to provide increasing levels of precision; however, our scope does not include the data collection technology itself, only the user interface for viewing data that has already been preprocessed, cleaned, and extrapolated. In their current state, the masks in use take too long to apply given the volume of data, so new masks need to be created to relieve the strain on the preexisting system.
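As a rough illustration of what a mask means here, the sketch below builds one boolean mask per criterion and combines them with a logical AND. It is a minimal sketch using plain Python lists, not the project's actual data structures: the sample records, the field values, and the helper names `mask_by`, `combine`, and `apply_mask` are all hypothetical. In a real notebook, pandas or NumPy boolean arrays could play the same role.

```python
from typing import Callable

# Hypothetical sample records standing in for cleaned AUV measurements.
RECORDS = [
    {"activity_name": "survey_a", "depth": 5.0, "salinity": 33.4},
    {"activity_name": "survey_a", "depth": 40.0, "salinity": 33.9},
    {"activity_name": "survey_b", "depth": 12.0, "salinity": 33.6},
    {"activity_name": "survey_b", "depth": 80.0, "salinity": 34.1},
]


def mask_by(records: list[dict], predicate: Callable[[dict], bool]) -> list[bool]:
    """Return one True/False flag per record: True where the predicate holds."""
    return [predicate(r) for r in records]


def combine(*masks: list[bool]) -> list[bool]:
    """Logical AND of several equal-length masks."""
    return [all(flags) for flags in zip(*masks)]


def apply_mask(records: list[dict], mask: list[bool]) -> list[dict]:
    """Keep only the records whose mask flag is True."""
    return [r for r, keep in zip(records, mask) if keep]


# Each mask is computed once and can be reused across different views.
shallow = mask_by(RECORDS, lambda r: r["depth"] <= 50.0)
survey_b = mask_by(RECORDS, lambda r: r["activity_name"] == "survey_b")

view = apply_mask(RECORDS, combine(shallow, survey_b))
print(view)
```

Keeping each mask separate means that toggling one criterion only requires recombining existing masks, not rescanning the data for every criterion.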

Environmental Scan/Literature Review

Mote Marine Laboratory in Florida deployed an AUV similar to MBARI’s, named “Genie”. This AUV’s main goal is to gather data useful for ocean observing and research. Genie carries instruments that can monitor water temperature, depth, salinity, colored dissolved organic matter, and turbidity. The vessel can also be used to monitor microscopic plant-like organisms, including the toxic algae that cause Florida red tides, which can be harmful to marine life and people. Where Genie differs from the MBARI AUV is in the acoustic receiver it carries to detect fish tagged by researchers, which provides data on fish migration patterns. Scientists at Mote Marine Laboratory believe that collecting this data can help reveal patterns in the movement of the toxic algae and mitigate the problems it causes for Florida’s population. Once researchers have data on the location of a red tide, they send the information to the public and to resource managers as soon as it is available. This is one of the many benefits that AUV data collection can provide to the public: access to helpful information before going to the beach or into the ocean. All the data collected by Genie is sent to the GCOOS (Gulf of Mexico Coastal Ocean Observing System) website, where it populates an animated dashboard of ocean currents showing their direction and speed. The dashboard also shows the locations of all the AUVs that are providing data points to the website. This information reveals patterns in ocean currents, which can help engineers decide where to design and build structures such as bridges, dams, and offshore platforms. Shipping companies can also use it to optimize shipping routes and avoid areas with strong currents. Marine ecosystems are also affected by currents, and this data can help researchers understand how those ecosystems function and how different species interact with their environment.

Stakeholders

The stakeholders for this project are the team we are working with directly at MBARI, with the long-term goal of providing a resource that is versatile and flexible enough to be adapted for other teams’ use. The risk to stakeholders is minimal; the main one is that the functionality does not work to the full extent expected or has limitations. If the functionality is tested rigorously, however, the team can expect to gain widgets that lessen the load on the current infrastructure, thereby decreasing processing time and increasing the efficiency of data visualization within the team. This can help create easy-to-read visualizations that let end users understand trends occurring in the ocean, using the data collected by the robots over the past decades.

Ethical Considerations

There are no ethical considerations in question; if anything, the research done by MBARI will benefit people.



Legal Considerations

The only legal considerations are ensuring that countries acknowledge and accept the rationale for robots gathering data in international waters, should the robots need to travel that far from shore, and that the robots minimally affect the environment they are gathering data from. Otherwise, there are minimal legal regulations to consider.

Project Goals and Objectives

Our goals go beyond the actual deliverable due at the end of capstone planning and execution; they also encompass the experience of working in a team with a project outline and deadline. Instead of a prefabricated outline and rubric describing what is expected, the only information provided is the final deliverable, without the step-by-step planning typically used in teaching and classroom environments.

Goals

- To learn how to solve a problem from inception

- Be able to complete a deliverable by a predetermined date, even with unaccounted for problems

- Coordinate fluidly within a team environment and cater to team members’ strengths and expertise

Objectives

- Gathering of resources that can be utilized for overall project design, instead of jumping straight into the software aspect

- Breaking down the problem into manageable and tangible chunks that can be evaluated succinctly as complete or needs further development, used during sprint planning

- Navigate problems by meticulous planning and troubleshooting

Final Deliverables

Our project has a tangible goal of creating widgets that enable interactive data visualization for MBARI AUV data. The widgets currently in use take seconds to fully process, whereas our goal is to give end users a more responsive interface that shows visualizations for the specified parameters much more quickly, giving the user a seamless experience. This is similar to an existing project, STOQS, whose data visualization is still too slow for general users and as such is only used by the engineers working directly with the data, not by the casual bystander. We hope to bridge this gap by providing widgets that allow selection by activity_name, time, and depth specifically, and by creating a pull request consisting of a notebook with public example data that demonstrates this functionality. Mr. McCann will accept and merge this pull request directly if he sees it as a valuable feature to include in the GitHub repo that he created.
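One possible shape for the three controls described above is sketched here. This is a sketch under stated assumptions, not the client's implementation: it assumes the `ipywidgets` library is used for the notebook controls (the plan does not name a widget library), and the sample rows and the functions `select_rows` and `build_controls` are hypothetical. The import is guarded so that the filtering logic can still be run where `ipywidgets` is not installed.

```python
from datetime import date

try:
    import ipywidgets as widgets  # assumption: available in the notebook environment
except ImportError:
    widgets = None  # select_rows below still works without it

# Hypothetical rows standing in for public example data.
ROWS = [
    {"activity_name": "mission_1", "time": date(2023, 6, 1), "depth": 10.0},
    {"activity_name": "mission_1", "time": date(2023, 6, 2), "depth": 60.0},
    {"activity_name": "mission_2", "time": date(2023, 6, 3), "depth": 25.0},
]


def select_rows(rows, activity_name, time_range, depth_range):
    """Return rows matching the activity and the inclusive time/depth ranges."""
    (t0, t1), (d0, d1) = time_range, depth_range
    return [
        r for r in rows
        if r["activity_name"] == activity_name
        and t0 <= r["time"] <= t1
        and d0 <= r["depth"] <= d1
    ]


def build_controls(rows):
    """Create one widget per parameter and wire them to select_rows."""
    names = sorted({r["activity_name"] for r in rows})
    times = sorted({r["time"] for r in rows})
    depths = [r["depth"] for r in rows]

    activity = widgets.Dropdown(options=names, description="activity")
    time_sel = widgets.SelectionRangeSlider(
        options=times, index=(0, len(times) - 1), description="time"
    )
    depth_sel = widgets.FloatRangeSlider(
        value=[min(depths), max(depths)], min=min(depths), max=max(depths),
        step=1.0, description="depth",
    )

    def show(activity_name, time_range, depth_range):
        # In a real notebook this is where the plot would be redrawn.
        print(select_rows(rows, activity_name, time_range, depth_range))

    out = widgets.interactive_output(
        show,
        {"activity_name": activity, "time_range": time_sel, "depth_range": depth_sel},
    )
    return widgets.VBox([activity, time_sel, depth_sel, out])
```

In a notebook cell, evaluating `build_controls(ROWS)` as the last expression would display the drop-down, the two range sliders, and the filtered output together. Keeping `select_rows` free of any widget code lets it be tested without a running notebook.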

Approach/Methodology

As a team, we plan to hold weekly meetings rather than the daily standups typical of an Agile workflow (mainly because of differing work schedules), and to assign points and stories based on the bandwidth of each team member. If information overlaps in a way that benefits multiple stories, the team members working on those stories will work together to troubleshoot and develop, either in real time or through messaging. The preliminary stage of the project will involve planning and getting familiar with the resources required for the project, including making sure all team members have their software configured identically for seamless integration. All information relating to data that needs to be extrapolated and cleaned will be provided by Professor McCann. Any additional research needed on creating widgets can draw on the sample he provided, which is based on other configurations, and on anything applicable that we find by searching the web. Once all research has been conducted, the stories will be split by functionality (widgets for activity_name, time, and depth) and further divided into substories covering creation of the associated software, testing of its functionality, and eventually integration into a main development branch once all necessary checks have passed and have been approved by the team as a whole.

Timeline/Resources


Detailed Timeline

Week 1 and 2: Setup of Brew, Anaconda, and Poetry, become familiarized with the software and make sure compatibility of software is seamless among members

Week 3 and 4: Begin design of widgets based off SME suggestion and start documentation of preplanning and initial stages

Week 5 and 6: Implement changes based on design and design review at the end of Week 4, meeting twice a week for status updates regarding timeline, do testing for each widget separately throughout. Document any results as well as testing results.

Week 7: Integrate all widgets into a main dev branch and check for compatibility and any residual issues before submitting for a pull request for Professor McCann.

Milestones

Weeks 1-2: Setup of environments and background research complete

Weeks 3-4: Design and review

Weeks 5-6: Design implementation and continuous testing

Week 7: Integration testing and deliverable completed

Resources Needed

No main resources needed besides publicly provided data and personal computers.

Platform

The software used to create the AUV data processing application consists only of Python and tools built around it. Python is the main driver of the application, whose purpose is processing large amounts of data. Python has a large collection of libraries and frameworks that are useful for data manipulation, transformation, and analysis. For our capstone we are writing a Jupyter Notebook file that processes large amounts of data collected by an AUV and filters it based on the user’s preferences, which makes Python a natural choice of language for the project. We are using a Jupyter Notebook because it combines code with rich text, allowing us to write our script and explain its use to the user in one place. Anaconda is the platform in which we will write the notebook file. It consists of a package manager, some precompiled packages, and other tools that make it easier to work with Python, and as a team we found that it provides a robust foundation. Overall, Anaconda lets us write a Jupyter Notebook file that changes the data shown to users based on what they would like displayed. Because our capstone is to create a script that works with a project already created by our client, we are also using Poetry to manage the dependencies our project requires. With Poetry, our team can install all the dependencies that the existing project declares and can begin working on the script without spending much time setting up our environment.
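Poetry's lock file is what pins dependency versions, but a quick check that each member's active environment actually matches can still be useful. The following is a minimal sketch of such a check using only the standard library; the package list and the helper names `environment_report` and `differences` are hypothetical, and the real list of packages would come from the client's project.

```python
import sys
from importlib import metadata

# Hypothetical package list; the real one would come from the client's project.
PACKAGES = ["pip", "ipywidgets", "pandas"]


def environment_report(packages):
    """Map each package name to its installed version, or None if it is missing."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


def differences(mine, theirs):
    """Return the entries whose versions differ between two reports."""
    keys = sorted(set(mine) | set(theirs))
    return {k: (mine.get(k), theirs.get(k)) for k in keys if mine.get(k) != theirs.get(k)}


report = environment_report(PACKAGES)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Each team member could share the printed report, and `differences` applied to two reports would list any package whose version does not match.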

Risks and Dependencies

Not all of the risks and dependencies of the project are known, because we do not yet have access to the Jupyter notebook that the client will provide. Based on our meetings with the client, that notebook will determine the project’s schedule and division of labor. The main risks, however, are known. They are few, mainly because the project builds on a previously constructed notebook and cannot corrupt the data it uses.

Risks

The risks to the success of this project are few due to the nature of the project’s scope. The most concerning risk is that the dataset as a whole cannot be adequately tested. Because the project involves a vast number of data points, the code will have to be tested consistently on small sections of the dataset. This limits our ability to establish adequate bounds on the code’s behavior before we are able to test on the entire dataset. Another risk is difficulty merging the project together near the end of development. Because individual team members will be focusing on specific features and parts of the project, bringing these parts together may prove problematic. We intend for every section of the project to work together in a single Jupyter notebook, so there is a substantial risk that the sections prove incompatible and require significant tuning to work correctly. One risk we avoid is corrupting the dataset, since we will not be altering any data directly in the database.
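Testing on small sections of a dataset, as described above, can be sketched as follows. This is an illustrative sketch only: the synthetic dataset, the `chunks` helper, and the `depth_filter` function under test are hypothetical stand-ins for the project's real data and code.

```python
import random
from itertools import islice


def chunks(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def depth_filter(rows, lo, hi):
    """Hypothetical filter under test: keep rows with lo <= depth <= hi."""
    return [r for r in rows if lo <= r["depth"] <= hi]


# Synthetic stand-in for the real dataset; a fixed seed keeps runs repeatable.
rng = random.Random(0)
dataset = [{"id": i, "depth": rng.uniform(0.0, 200.0)} for i in range(10_000)]

# Test section by section, checking an invariant that must hold on any subset.
kept = []
for block in chunks(dataset, 1_000):
    result = depth_filter(block, 20.0, 80.0)
    assert all(20.0 <= r["depth"] <= 80.0 for r in result)
    kept.extend(result)

# Because the filter treats rows independently, the per-section results
# concatenated in order must equal the result on the whole dataset.
assert kept == depth_filter(dataset, 20.0, 80.0)
print(f"{len(kept)} of {len(dataset)} rows kept")
```

The final equality holds only for operations that treat each row independently. Anything that aggregates across rows, such as an average or a plot axis range, would still need a check against the full dataset.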

Dependencies

When it comes to dependencies, there are only two points at which the project depends on prior work. The first is the existing Jupyter notebook, to which we will gain access at the beginning of the project. This notebook will be our main dependency because it is the foundation on which the rest of the project will be built. Once we are able to view this notebook, the team will split up to work on specific features. The second dependency is joining all of the features towards the end of the project: the project depends on each team member’s work combining into a single Jupyter notebook. Another dependency of note is access to the dataset provided by MBARI, which is publicly available. This project will follow the Agile methodology, and as such the dependencies of the project are not known beyond these until we have all the information provided with the existing Jupyter notebook mentioned previously.

Testing Plan

To deliver a reliable, user-friendly, and performant solution for exploring and visualizing the extensive AUV data archive, our testing plan takes a comprehensive approach to ensuring the quality and effectiveness of the Jupyter Notebook and the newly developed data selection widgets. We will begin with unit testing, verifying the correctness of each function and module individually. Integration testing will follow, focusing on the interaction between components to identify any conflicts or issues. The next phase is functional testing, in which we will validate the data selection widgets: the drop-down selector for activity__name, the range selectors for depth and time, and their integration with the existing biplot() function. We will test various combinations of selections to ensure that data is filtered and visualized accurately and that the generated plots match the selected criteria. In addition to functional testing, we recognize the importance of usability testing. We will engage users and domain experts who are familiar with AUV data and its use cases, and gather their feedback on the user interface, the intuitiveness of the data selection widgets, and the overall user experience. This feedback will guide us in refining the interaction and interface design. Further, we will consider performance testing to evaluate handling of large datasets, along with rigorous testing of error handling and edge cases. Documentation review and peer review will be conducted to ensure clarity, accuracy, and usability.
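The unit-testing and edge-case steps of this plan could look something like the sketch below, written with Python's built-in `unittest` module. The functions `in_range` and `filter_rows` are hypothetical stand-ins for whatever selection logic the widgets end up calling; they are not taken from the client's notebook.

```python
import unittest


def in_range(value, bounds):
    """True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    if low > high:
        raise ValueError("range lower bound exceeds upper bound")
    return low <= value <= high


def filter_rows(rows, activity_name, depth_range):
    """Hypothetical selection function that the widgets would call."""
    return [
        r for r in rows
        if r["activity_name"] == activity_name and in_range(r["depth"], depth_range)
    ]


class FilterRowsTest(unittest.TestCase):
    def setUp(self):
        self.rows = [
            {"activity_name": "a", "depth": 0.0},
            {"activity_name": "a", "depth": 50.0},
            {"activity_name": "b", "depth": 50.0},
        ]

    def test_bounds_are_inclusive(self):
        self.assertEqual(len(filter_rows(self.rows, "a", (0.0, 50.0))), 2)

    def test_unknown_activity_gives_empty_result(self):
        self.assertEqual(filter_rows(self.rows, "missing", (0.0, 50.0)), [])

    def test_inverted_range_is_rejected(self):
        with self.assertRaises(ValueError):
            filter_rows(self.rows, "a", (50.0, 0.0))


if __name__ == "__main__":
    # exit=False keeps a notebook kernel alive after the run.
    unittest.main(argv=["ignored"], exit=False)
```

Each test targets one behavior named in the plan: correct filtering at the range boundaries, an empty selection, and an invalid input. Tests of the plots themselves would sit on top of these, since a plot can only match the selected criteria if the underlying filter does.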

Team Members

Kevin Yoshimoto - Project manager (team leader). Creation of project plan and delegation of

project tasks to team members.

Jake Horne - Widget creation. Creation of widget drop down selector for activity_name, range

selector for depth and time data features.

Michael Watson - Widget creation. Creation of widget drop down selector for activity_name,

range selector for depth and time data features.

Luis Navarro - Stress testing. Comprehensive testing to ensure the accurate filtering and visualization of data, as well as the alignment of generated plots with the selected criteria. This testing will involve the drop-down selector for activity names, range selectors for depth and time, and their seamless integration with the existing biplot() function.

Myles Lopez - Stress testing. Comprehensive testing to ensure the accurate filtering and

visualization of data, as well as the alignment of generated plots with the selected criteria. This

testing will involve the utilization of the drop-down selector for activity names, range selectors

for depth and time, and their seamless integration with the existing biplot() function.

References

Monterey Bay Aquarium Research Institute. (n.d.) Seafloor mapping AUV.

https://www.mbari.org/technology/seafloor-mapping-auv/

Rutger, H. (2015, November 10). New underwater robot “Genie” deployed to monitor harmful algae and more. Mote.

https://mote.org/news/article/new-underwater-robot-genie-deployed-to-monitor-harmful-algae-and-more
