Shyam Madhusudhana virtual talk: Machine learning in marine bioacoustics

IEEE OCEANIC ENGINEERING SOCIETY - TECHNICAL ACTIVITY

An exciting virtual talk-cum-demonstration by Shyam Madhusudhana, Postdoctoral Fellow, K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, USA
Moderator: Gopu R Potty, Assoc. Research Professor, University of Rhode Island
Contact: gpotty@uri.edu
Title: Machine learning in marine bioacoustics
Zoom Link: https://uri-edu.zoom.us/j/96392816010?pwd=Z0NFazAyeS9MSEZNczl3cHo2aW1nQT09
Date: 21st July 2021; Time: 8:30 PM EDT (0830 hrs SGT)

The talk will be followed by an optional, informal networking session. Details will be provided at the time of the event.

Abstract: Passive acoustic monitoring (PAM) methods are used for monitoring and studying a wide variety of soniferous marine fauna. The use of automatic recognition techniques has largely underpinned the successes of PAM undertakings by improving the ease and repeatability of analyses. Over the past decade, the adoption of machine learning (ML) based recognition techniques has brought improved accuracy and reliability to the mining of large acoustic datasets, facilitating a suite of ecological studies such as call- or cue-based density estimation, stock identification, and studies of cultural transmission.

This talk will provide an overview of PAM undertakings, briefly survey the various automation techniques in use, and contrast them with modern ML-based techniques. We will then present a gentle introduction to ML concepts as they apply to acoustic event recognition and provide a hands-on demonstration of developing an ML model using real underwater acoustic recordings (please see the preparation instructions below).

About the Speaker:

Shyam Madhusudhana (shyamm@cornell.edu) is a postdoctoral researcher at the K. Lisa Yang Center for Conservation Bioacoustics (CCB) within the Cornell Lab of Ornithology. His research interests are largely multidisciplinary, as is his academic background: a Bachelor's in Engineering, a Master's in Computer Science, and a PhD in Applied Physics. He has also worked as a speech scientist for a leading Automatic Speech Recognition solutions provider. Prior to joining CCB, he was a research associate at the Centre for Marine Science and Technology in Australia, a research associate at the National Institute of Oceanography in Goa, India, and a postdoctoral research fellow at the Indian Institute of Science Education and Research in Tirupati, India. His current research involves developing deep-learning techniques for realizing effective and efficient machine listening in the big-data realm, with applications in the monitoring of both marine and terrestrial fauna.

He is a Senior Member of the IEEE and currently serves as an Administrative Committee member of the IEEE Oceanic Engineering Society (OES). He is also the Coordinator of Technology Committees in OES and a co-Chair of the Student Poster Competitions at the biannual OCEANS conference. He referees manuscripts for journals focused on animal bioacoustics, pattern recognition, and machine learning.

Preparation for the hands-on demo:

As part of the session, we will demonstrate the use of a machine learning (ML) based approach to automating bioacoustic data analyses. The demonstration will follow a hands-on approach in which participants can follow along to experience developing and using an ML-based solution for automatic recognition, using a dataset containing North Atlantic Right Whale (NARW) calls. Familiarity with the Python programming language will be helpful, but is not critical.
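For orientation, the sketch below shows the general shape of such a workflow: spectrogram patches labelled as call/no-call are used to train a small neural-network classifier. It is purely illustrative; the data, patch dimensions, and model architecture are placeholders and not the workshop code, which will be provided on the day.

# Illustrative sketch only: a minimal spectrogram-patch classifier of the kind
# used for call recognition. All shapes and hyperparameters are placeholders.
import numpy as np
import tensorflow as tf

# Assume spectrogram patches of 64 frequency bins x 128 time frames,
# labelled 1 (call present) or 0 (background noise). Random placeholder data.
num_train, patch_shape = 512, (64, 128, 1)
x_train = np.random.rand(num_train, *patch_shape).astype("float32")
y_train = np.random.randint(0, 2, size=num_train)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=patch_shape),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # call / no-call score
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=32)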

The dataset used for the exercise is part of the publicly available annotated NARW recordings from the 2013 Detection, Classification, Localization and Density Estimation (DCLDE) challenge [1]. The original dataset consists of 7 days of continuous underwater recordings, 4 of which were earmarked for training and the remaining 3 for testing. Given the short session, the demonstration will use only a subset comprising 2 days of recordings: one for training and one for testing. The chosen audio data were downsampled and compressed for efficiency. The corresponding manual annotations of NARW up-call occurrences (comprising the start and end times and the lower and upper frequency bounds of each call) were converted into the RavenPro selection table format, which presents the data as a tab-delimited text file.
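If you would like to peek at the annotations ahead of time, a selection table can be read with pandas, roughly as sketched below. The file name here is hypothetical, and the column names shown are the typical RavenPro defaults, so check the actual headers in the files provided.

# Rough sketch: inspect a RavenPro selection table (tab-delimited text).
import pandas as pd

# "train_annotations.selections.txt" is a hypothetical file name.
selections = pd.read_csv("train_annotations.selections.txt", sep="\t")
print(selections.columns.tolist())  # inspect the available fields first
# Typical Raven columns include "Begin Time (s)", "End Time (s)",
# "Low Freq (Hz)" and "High Freq (Hz)".
print(selections[["Begin Time (s)", "End Time (s)"]].head())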

The demonstration during the workshop will utilize Google Colaboratory, a free platform (for non-commercial use) offering cloud-based computation. We assume that participants have a Google account (having a Gmail account will suffice). Clicking on the link below

https://drive.google.com/drive/folders/1xyLZf63ixLzXECHpbUB_Tf5V2UosOe62?usp=sharing

will take you to where the dataset for the workshop demo is available. Once on the page, create a link to the dataset in your Google Drive storage by selecting ‘Add shortcut to Drive’, as shown below:

At the subsequent prompt, make sure “My Drive” is highlighted and then click on “ADD SHORTCUT”. This will create a “shared” folder link under your “My Drive”. You can verify this by clicking on the “Drive” icon at the top-left of the page and checking that an item named “koogu_demo_data” appears there. That’s it! Leave that item as is, and we will give you the program to process the data on the workshop day.
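For reference, once the shortcut is in place, the folder typically becomes accessible from within a Colab notebook after mounting Google Drive, roughly as sketched below. The exact mount path may differ slightly between Colab versions, and the workshop program may handle this step for you.

# Rough sketch: make the shared demo folder visible inside Colab.
from google.colab import drive
import os

drive.mount("/content/drive")  # prompts you to authorize access to your Google Drive
# Path assumes the shortcut was added directly under "My Drive".
demo_dir = "/content/drive/MyDrive/koogu_demo_data"
print(os.listdir(demo_dir))    # list the demo dataset files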

References:
[1] Gillespie, D. (2019). DCLDE 2013 Workshop dataset. University of St Andrews Research Portal. https://doi.org/10.17630/62c3eebc-5574-4ec0-bfef-367ad839fe1a