Real-time underwater fish identification and biomonitoring via machine learning-based compression of video to text

  • $5,020 pledged
  • 51% funded
  • $10,000 goal
  • 11 days left

About This Project

Underwater monitoring of marine life has traditionally followed a “set it and retrieve it” approach because transmitting high-bandwidth data such as video from subsurface sensors to land networks has been impractical. As a result, data must be physically collected, causing delays or outright data loss. On-device machine learning with affordable, off-the-shelf hardware and open-source models now enables real-time conversion of video into text that can be transmitted reliably, removing that limitation.
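As a rough illustration of the approach, the sketch below captures frames from a camera and turns each sighting into a compact text record. The identify_fish function is a placeholder for whichever open-source fish-recognition model is ultimately used, and the record format and confidence threshold are illustrative assumptions, not a finalized design.

import csv
import sys
import time

import cv2  # OpenCV, used here only for frame capture


def identify_fish(frame):
    """Placeholder for an open-source fish-recognition model.

    Expected to return a list of (species_name, confidence) tuples."""
    raise NotImplementedError("swap in the actual on-device model here")


def stream_detections(camera_index=0, min_confidence=0.6):
    """Convert a video stream into small text records suitable for transmission."""
    cap = cv2.VideoCapture(camera_index)
    writer = csv.writer(sys.stdout)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        for species, confidence in identify_fish(frame):
            if confidence >= min_confidence:
                # A few dozen bytes per sighting instead of megabits of video.
                writer.writerow([
                    time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
                    species,
                    round(confidence, 2),
                ])
    cap.release()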


What is the context of this research?

As an avid scuba diver, I’ve often wished for a better way to identify fish underwater than flipping through a laminated booklet. As an engineer, I realized that recent advances in augmented reality (AR), machine learning, and battery technology could solve this problem, yet few AR devices reflect these capabilities. The constraints that limit commercial AR (size, battery life, aesthetics) are largely irrelevant underwater: divers already wear bulky gear (a device need only be neutrally buoyant and reasonably wearable), prioritize function over form, and stay submerged for limited times, minimizing battery concerns. That insight led to this project: a wearable underwater device that combines existing hardware with open-source fish-recognition models should make real-time fish identification practical without any fundamental technology advances, and the potential for such a device to improve marine research is well within reach.

What is the significance of this project?

What began as a diver’s idle wish became a broader realization: this device isn’t just a fish ID tool. It is a low-bandwidth, edge-computing system capable of transforming underwater monitoring by providing real-time data about the underwater world. Today, researchers either tether sensors (expensive and fragile) or “set and retrieve” them (risking total data loss). By using on-device ML to convert high-bandwidth video into low-bandwidth text, the proposed system enables real-time subsurface monitoring, with the video stored locally for validation. Data can be transmitted to surface buoys and then relayed to shore, removing the need for tethers and enabling distributed sensor networks. A marine biologist I spoke with lamented losing two years of hydrophone data; this device could prevent such losses. The result would be a paradigm shift in oceanographic research: more frequent, reliable, and affordable data collection.

What are the goals of the project?

The goal is to build a proof-of-concept system for real-time, underwater fish identification that works for fish populations near Monterey, CA (the initial geography). First, we'll develop a diver-wearable device that isn't power-constrained and can run fish-recognition models, trained on the fish of a limited geography, in real time, using water-based cooling to prevent overheating. Second, we'll integrate a subsea pulse modem to transmit the identification data to a surface buoy, testing the feasibility of real-time underwater-to-surface communication (a sketch of one possible payload format follows below). Later stages, beyond this grant, will explore low-power deployment for fixed sensors, strategies for long-term power (e.g., current-powered turbines), network design questions like sensor density and range, and expansion to other geographies (e.g., small models trained for regions such as Hawaii). We'll begin prototyping the first stage once grant money is received.
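To give a concrete sense of how little data needs to cross the acoustic link, here is a minimal sketch of packing a single sighting into a fixed-size binary record. The field layout, species codebook, and eight-byte record size are assumptions for illustration, not a finalized protocol.

import struct
import time

# Hypothetical codebook mapping local species to small integer IDs.
SPECIES_CODES = {"garibaldi": 1, "kelp rockfish": 2, "lingcod": 3}

# Assumed layout: uint32 epoch seconds, uint16 species code,
# uint8 confidence (0-100), uint8 count observed = 8 bytes total.
RECORD_FORMAT = ">IHBB"


def pack_sighting(species, confidence, count, timestamp=None):
    """Encode one sighting as an 8-byte payload for the acoustic modem."""
    ts = int(timestamp if timestamp is not None else time.time())
    return struct.pack(RECORD_FORMAT, ts, SPECIES_CODES[species],
                       round(confidence * 100), count)


def unpack_sighting(payload):
    """Decode an 8-byte payload back into a readable sighting."""
    ts, code, conf, count = struct.unpack(RECORD_FORMAT, payload)
    species = {v: k for k, v in SPECIES_CODES.items()}[code]
    return ts, species, conf / 100.0, count


# One sighting is 8 bytes, so even a few-kbps acoustic link can
# relay hundreds of sightings per second.
payload = pack_sighting("garibaldi", 0.91, 2)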

Budget


The above is the estimated starting budget for a prototype device to be used as a support and testing tool while scuba diving. It will perform real-time fish identification with an AR display and a camera, letting a diver verify whether fish are being identified correctly, and will help determine what camera and other equipment (e.g., power budget) would be needed for long-term placement at different depths. For testing purposes, it will support onboard processing and broadcasting to a surface buoy, either short range (500-900 m) or via a long-range hydrophone (3.7 km+; device cost is an estimate). Ultimately, the goal is a refined version of the device that can perform low-cost, real-time monitoring of fish (e.g., on a reef). The budget is broken into fabrication, Stage 1 (S1: ID device and UI), Stage 2 (S2: transmission), overhead, and miscellaneous expenses. Specific devices may be changed.

Endorsed by

Justin and I worked closely for over 5 years at the forefront of Data and AI at Databricks. With his experience, Justin is both passionate and uniquely qualified to combine frontier open-source models with edge computing. Rapid advancements in software and hardware make this project not only newly viable, but practical given the right domain expertise.

Project Timeline

The timeline is broken into two stages, each with two basic sets of tasks: hardware and software. Stage 1 is development of the underwater fish ID device; Stage 2 is enabling data transmission. The timeline assumes that off-the-shelf devices and open-source software can be combined (with custom enclosures and some custom code) into a proof-of-concept device without overcoming any fundamental technical hurdles.

Apr 07, 2025

Project Launched

Apr 15, 2025

Procurement of fabrication hardware (3D printer)

Apr 30, 2025

Procurement of Stage 1 hardware

May 31, 2025

Creation of hardware schematic for Stage 1

May 31, 2025

Modification of existing OSS to track and identify which particular fish is in frame (note that the current OSS model is limited in the number of fish species it identifies); a rough sketch of the tracking approach appears below
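As a rough sketch of what that modification might look like, the snippet below assigns per-frame detections to persistent track IDs using simple intersection-over-union (IoU) matching, so the same individual fish is not reported twice. The detection format and the IoU threshold are assumptions; the actual approach will depend on the open-source model chosen.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


class SimpleTracker:
    """Greedy IoU tracker: reuse a track ID when a new detection overlaps it."""

    def __init__(self, iou_threshold=0.4):
        self.iou_threshold = iou_threshold
        self.tracks = {}  # track_id -> last seen box
        self.next_id = 0

    def update(self, detections):
        """detections: list of (box, species, confidence); returns tracked sightings."""
        results = []
        for box, species, conf in detections:
            best_id, best_iou = None, self.iou_threshold
            for track_id, last_box in self.tracks.items():
                overlap = iou(box, last_box)
                if overlap > best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = self.next_id
                self.next_id += 1
            self.tracks[best_id] = box
            results.append((best_id, box, species, conf))
        return results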

Meet the Team

Justin Olsson
Thomas Allen
Madelyn Knipp

Team Bio

Our team combines diving experience with both informal and formal technical expertise, including computer vision, software development, and general hardware fabrication, hacking, and tinkering.

Justin Olsson

Justin Olsson is an avid scuba diver and technologist with a lifelong love of all things marine. His passion for building and problem-solving began early, programming and learning electrical engineering and machine fabrication while captaining a FIRST Robotics team in high school. He went on to earn a degree in chemical engineering from Johns Hopkins.

Though his career path veered into law, Justin has remained deeply immersed in the world of AI and machine learning. As a product lawyer, he spent over eight years at the heart of foundational AI infrastructure companies — Databricks (creators of Apache Spark) and Anyscale (creators of Ray, the distributed computing technology behind OpenAI’s training of ChatGPT). These roles kept him embedded with engineers, working closely with the people building the tools that power modern machine learning.

Outside of work, Justin is a relentless tinkerer. He codes and builds hardware in his spare time, crafting custom IoT devices, hacking his car to deliver a near-OEM Android Auto experience, and stitching together open source tools while reverse-engineering vehicle CAN bus systems to create custom UX solutions.

A recent dive trip to the Great Barrier Reef, preceded by watching Chasing Coral with his son to learn a bit before they went, became a turning point. When he explained that not everyone was doing everything they could to save the reef, his son looked up and asked, “Well Dad, are you doing everything you can?” The question landed hard.

While Justin knows he can’t save the reef alone, he’s now choosing to redirect his skills — his love of diving, his technical fluency with AI, and his maker mindset — toward something that matters deeply: the protection and understanding of our oceans.

Thomas Allen

Tom Allen is a computer vision and robotics expert with a strong track record in AR/VR (Google, Meta) and startups, including OpenSpace, where he applied machine learning and classical computer vision algorithms to construction imagery. He holds a PhD in robotics from Caltech and a BS in mechanical engineering from UC Berkeley. Tom's expertise lies in bridging practical mechanical systems with sophisticated software and algorithmic challenges.

Madelyn Knipp

Madelyn is a senior engineering leader in big tech and a startup founder with a background in large-scale systems, applied machine learning, and geospatial data. She holds a degree in Computer Science from Colorado State University and has led engineering teams delivering production systems across developer platforms, GIS infrastructure, and AI-powered applications.

Her technical expertise spans edge inference, system optimization, and real-time data processing—especially in environments where power, bandwidth, and latency are tightly constrained. Earlier in her career, she worked on environmental water monitoring systems, gaining firsthand experience with field-deployed, low-power edge devices for data collection in rugged conditions.

At home, Madelyn is also a relentless tinkerer. She runs her own AI infrastructure, builds custom home automation and energy-monitoring tools, and experiments with local inference across a fleet of compute nodes. Her work blends practical engineering with a curiosity-driven mindset.

She brings deep systems thinking, architectural rigor, and founder-level velocity to the team. Madelyn and Justin share a maker ethos and a drive to translate ideas into working systems quickly and effectively. She’s excited to contribute alongside teammates who bring deep expertise in hardware and computer vision.

Madelyn is drawn to this project by a long-standing interest in environmental ecosystems and the intersection of AI with real-world sensing. She believes that field-deployable intelligence like this can unlock better understanding, resilience, and protection of the natural world.

Additional Information

Systematic review and meta-analysis of AR in medicine, retail, and games: https://pmc.ncbi.nlm.nih.gov/a... ("Limited battery life, large devices, and cumbersome cables are the limitations of the technology"). This supports the proposition that current AR devices are highly limited by these issues; because those limits matter far less in the diving context (a diver already carries heavy, cumbersome equipment), no major technical breakthroughs are needed for existing AR technology to work underwater.

A data collection algorithm for underwater optical-acoustic sensor networks: https://www.sciencedirect.com/... ("Acoustic modems are capable of transmitting data at low data rates (few kbps) for several kilometers, while optical modems can send large amounts of data per second for just some meters."). This supports the core problem this device would solve: effectively broadcasting video (or rather, what is in the video) from subsurface to surface at significant distance. High-bandwidth transmission is not possible today, but it is not needed here; the machine-learning analysis of the video serves essentially as "compression" by turning it into text descriptions that are small enough to transmit with current technology. Therefore, no fundamental breakthroughs in transmission are needed for this device to serve its purpose.
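A quick back-of-envelope comparison shows why text fits where video cannot. All of the figures below are assumed round numbers (compressed 720p video at roughly 5 Mbps, an acoustic modem at a few kbps, an 8-byte per-sighting record), not measurements.

# All figures are illustrative assumptions, not measurements.
VIDEO_BITRATE_BPS = 5_000_000      # ~5 Mbps for compressed 720p video
ACOUSTIC_MODEM_BPS = 5_000         # "few kbps" acoustic link
SIGHTING_RECORD_BYTES = 8          # compact per-sighting record
SIGHTINGS_PER_MINUTE = 30          # a busy reef scene

text_bps = SIGHTINGS_PER_MINUTE * SIGHTING_RECORD_BYTES * 8 / 60

print(f"video needs {VIDEO_BITRATE_BPS / ACOUSTIC_MODEM_BPS:,.0f}x the modem capacity")
print(f"text detections need {text_bps:.0f} bps, "
      f"or {100 * text_bps / ACOUSTIC_MODEM_BPS:.1f}% of the modem capacity")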

Underwater Optical Wireless Communications: Overview: https://pmc.ncbi.nlm.nih.gov/a... (see the chart comparing underwater wireless communication technologies). This provides further evidence of the current capabilities of the various underwater communication options: RF is highly restricted in range; acoustic is long range but low bandwidth; and optical is short range but high bandwidth.

The project image is generated with AI to illustrate the basic concept and does not represent an actual video of the hardware/software in action.


Project Backers

  • 2 backers
  • 51% funded
  • $5,020 total donations
  • $2,510.00 average donation
