Automatic identification of curve shapes with applications to ultrasonic vocalization


Gao, Zhikun
Tang, Yanlin
Wang, Huixia Judy
Wu, Guangying K.
Lin, Jeff

KAUST Grant Number: OSR-2015-CRG4-2582


Like human beings, many animals produce sounds for communication and social interactions. The vocalizations of mice have the characteristics of songs, consisting of syllables of different types determined by the frequency modulations and structure variations. To characterize the impact of social environments and genotypes on vocalizations, it is important to identify the patterns of syllables based on the shapes of frequency contours. Using existing hypothesis testing methods to determine the shape classes would require testing various null and alternative hypotheses for each curve, which is impractical for vocalization studies where interest lies in a large number of frequency contours. A new penalization-based method is proposed, which provides function estimation and automatic shape identification simultaneously. The method estimates the functional curve through quadratic B-spline approximation, and captures the shape feature by penalizing the positive and negative parts of the first two derivatives of the spline function in a group manner. It is shown that under some regularity conditions, the proposed method can identify the correct shape with probability approaching one, and the resulting nonparametric estimator can achieve the optimal convergence rate. Simulation results show that the proposed method gives more stable curve estimation and more accurate curve classification than the unconstrained B-spline estimator, and that it is competitive with the shape-constrained estimator that assumes prior knowledge of the curve shape. The proposed method is applied to the motivating vocalization study to examine the effect of the Methyl-CpG binding protein 2 gene on the vocalizations of mice during courtship.
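The link between the derivative parts and the shape class can be illustrated with a small sketch. This is not the authors' estimator: the penalized fit itself is omitted, and the function names (derivative_coefs, classify_shape), the tolerance, and the class labels are hypothetical choices made for illustration. The sketch assumes quadratic B-spline coefficients have already been estimated on a clamped knot vector with distinct interior knots. It relies only on the standard B-spline derivative formula: the first derivative of a quadratic spline is piecewise linear and the second is piecewise constant, so their signs are determined by their coefficients. If an entire group of negative (or positive) parts is zero, the corresponding shape follows.

```python
def derivative_coefs(coefs, knots, degree):
    """Coefficients and knot vector of the derivative of a B-spline,
    using the standard formula d_j = degree * (c_{j+1} - c_j) / (t_{j+degree+1} - t_{j+1})."""
    out = []
    for j in range(len(coefs) - 1):
        span = knots[j + degree + 1] - knots[j + 1]
        out.append(degree * (coefs[j + 1] - coefs[j]) / span)
    return out, knots[1:-1]


def positive_negative_parts(values):
    """Split each value into its positive part and its negative part."""
    pos = [max(v, 0.0) for v in values]
    neg = [max(-v, 0.0) for v in values]
    return pos, neg


def group_is_zero(values, tol):
    """True when every element of the group is numerically zero."""
    return all(abs(v) <= tol for v in values)


def label(pos, neg, tol, both_zero, neg_zero, pos_zero, neither):
    """Map the zero pattern of the two groups to a (hypothetical) label."""
    if group_is_zero(pos, tol) and group_is_zero(neg, tol):
        return both_zero
    if group_is_zero(neg, tol):
        return neg_zero
    if group_is_zero(pos, tol):
        return pos_zero
    return neither


def classify_shape(coefs, knots, tol=1e-8):
    """Classify a quadratic B-spline from its coefficients (illustrative only)."""
    d1, knots1 = derivative_coefs(coefs, knots, 2)   # piecewise-linear f'
    d2, _ = derivative_coefs(d1, knots1, 1)          # piecewise-constant f''
    pos1, neg1 = positive_negative_parts(d1)
    pos2, neg2 = positive_negative_parts(d2)
    monotone = label(pos1, neg1, tol,
                     "constant", "increasing", "decreasing", "non-monotone")
    curvature = label(pos2, neg2, tol,
                      "linear", "convex", "concave", "mixed")
    return monotone, curvature


# Clamped quadratic knot vector on [0, 1] with one interior knot at 0.5.
knots = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
print(classify_shape([0.0, 0.1, 0.5, 1.0], knots))
print(classify_shape([0.0, 1.0, 1.0, 0.0], knots))
```

In a penalized fit of the kind the abstract describes, a group penalty would be expected to shrink whole groups of such parts to zero, so the shape class can be read off the fitted coefficients rather than tested curve by curve.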

Gao, Z., Tang, Y., Wang, H. J., Wu, G. K., & Lin, J. (2020). Automatic identification of curve shapes with applications to ultrasonic vocalization. Computational Statistics & Data Analysis, 148, 106956. doi:10.1016/j.csda.2020.106956

The authors would like to thank two reviewers, an associate editor, and the editor for constructive comments and helpful suggestions. The research was partly supported by National Science Foundation, United States of America grant DMS-1712760, the OSR-2015-CRG4-2582 grant from KAUST, Saudi Arabia, the National Natural Science Foundation of China grant 11871376, Shanghai Pujiang Program, China grant 18PJ1409800, the Whitehall Foundation, United States of America, the Brain and Behavior Research Foundation, United States of America, and the IR/D program from the National Science Foundation, United States of America. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Elsevier BV

Computational Statistics & Data Analysis

