MLSP Group - Publications

🚧 last updated May 2, 2024 🚧

2024

A closer look at reinforcement learning-based automatic speech recognition
Fan Yang and Muqiao Yang and Xiang Li and Yuxuan Wu and Zhiyuan Zhao and Bhiksha Raj and Rita Singh Computer Speech & Language 2024

Improving Continual Learning of Acoustic Scene Classification via Mutual Information Optimization
Muqiao Yang and Umberto Cappellazzo and Xiang Li and Bhiksha Raj ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

Prompting Audios Using Acoustic Properties for Emotion Representation
Hira Dhamyal and Benjamin Elizalde and Soham Deshmukh and Huaming Wang and Bhiksha Raj and Rita Singh ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models
Muqiao Yang and Chunlei Zhang and Yong Xu and Zhongweiyang Xu and Heming Wang and Bhiksha Raj and Dong Yu ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

Importance of negative sampling in weak label learning
Ankit Shah and Fuyu Tang and Zelin Ye and Rita Singh and Bhiksha Raj ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

Training audio captioning models without audio
Soham Deshmukh and Benjamin Elizalde and Dimitra Emmanouilidou and Bhiksha Raj and Rita Singh and Huaming Wang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

Fixed Inter-Neuron Covariability Induces Adversarial Robustness
Muhammad A Shah and Bhiksha Raj ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization
Kai Hu and Weichen Yu and Tianjun Yao and Xiang Li and Wenhe Liu and Lijun Yu and Yining Li and Kai Chen and Zhiqiang Shen and Matt Fredrikson NeurIPS 2024 2024

Learning with Noisy Foundation Models
Hao Chen and Jindong Wang and Zihan Wang and Ran Tao and Hongxin Wei and Xing Xie and Masashi Sugiyama and Bhiksha Raj arXiv preprint arXiv:2403.06869 2024

ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li and Hao Chen and Kai Qiu and Jason Kuen and Jiuxiang Gu and Bhiksha Raj and Zhe Lin arXiv 2024

ControlVAR: Exploring Controllable Visual Autoregressive Modeling
Xiang Li and Hao Chen and Kai Qiu and Jason Kuen and Jiuxiang Gu and Bhiksha Raj and Zhe Lin arXiv 2024

Efficient Autoregressive Audio Modeling via Next-Scale Prediction
Kai Qiu and Xiang Li and Hao Chen and Jie Sun and Jinglu Wang and Zhe Lin and Marios Savvides and Bhiksha Raj arXiv 2024

R^2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations
Xiang Li and Kai Qiu and Jinglu Wang and Xiaohao Xu and Rita Singh and Kashu Yamazaki and Hao Chen and Xiaonan Huang and Bhiksha Raj ECCV 2024 2024

AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Zhaorun Chen and Zhuokai Zhao and Zhihong Zhu and Ruiqi Zhang and Xiang Li and Bhiksha Raj and Huaxiu Yao NAACL 2024

Evaluating and Improving Continual Learning in Spoken Language Understanding
Muqiao Yang and Xiang Li and Umberto Cappellazzo and Shinji Watanabe and Bhiksha Raj arXiv preprint arXiv:2402.10427 2024

SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios
Hazim Bukhari and Soham Deshmukh and Hira Dhamyal and Bhiksha Raj and Rita Singh INTERSPEECH 2024 2024

Domain Adaptation for Contrastive Audio-Language Models
Soham Deshmukh and Rita Singh and Bhiksha Raj INTERSPEECH 2024 2024

🐧 Pengi: An audio language model for audio tasks
Soham Deshmukh and Benjamin Elizalde and Rita Singh and Huaming Wang Advances in Neural Information Processing Systems 2024

Weakly-Supervised Audio-Visual Segmentation
Shentong Mo and Bhiksha Raj Advances in Neural Information Processing Systems 2024

PaintSeg: Painting Pixels for Training-free Segmentation
Xiang Li and Chung-Ching Lin and Yinpeng Chen and Zicheng Liu and Jinglu Wang and Rita Singh and Bhiksha Raj Advances in Neural Information Processing Systems 2024

Training on Foveated Images Improves Robustness to Adversarial Attacks
Muhammad Shah and Aqsa Kashaf and Bhiksha Raj Advances in Neural Information Processing Systems 2024

Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments
Thanh-Dat Truong and Hoang-Quan Nguyen and Bhiksha Raj and Khoa Luu Advances in Neural Information Processing Systems 2024

Customizable Perturbation Synthesis for Robust SLAM Benchmarking
Xiaohao Xu and Tianyi Zhang and Sibo Wang and Xiang Li and Yongqi Chen and Ye Li and Bhiksha Raj and Matthew Johnson-Roberson and Xiaonan Huang arXiv preprint arXiv:2402.08125 2024

A General Framework for Learning from Weak Supervision
Hao Chen and Jindong Wang and Lei Feng and Xiang Li and Yidong Wang and Xing Xie and Masashi Sugiyama and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2402.01922 2024

On Catastrophic Inheritance of Large Foundation Models
Hao Chen and Bhiksha Raj and Xing Xie and Jindong Wang arXiv preprint arXiv:2402.01909 2024

PAM: Prompting Audio-Language Models for Audio Quality Assessment
Soham Deshmukh and Dareen Alharthi and Benjamin Elizalde and Hannes Gamper and Mahmoud Al Ismail and Rita Singh and Bhiksha Raj and Huaming Wang INTERSPEECH 2024 2024

AugSumm: towards generalizable speech summarization using synthetic labels from large language model
Jee-weon Jung and Roshan Sharma and William Chen and Bhiksha Raj and Shinji Watanabe arXiv preprint arXiv:2401.06806 2024

2023

Online Active Learning For Sound Event Detection
Mark Lindsey and Ankit Shah and Francis Kubala and Richard M Stern arXiv preprint arXiv:2309.14460 2023

Automatic Detection of Dyspnea in Real Human–Robot Interaction Scenarios
Eduardo Alvarado and Nicolás Grágeda and Alejandro Luzanto and Rodrigo Mahu and Jorge Wuth and Laura Mendoza and Richard M Stern and Néstor Becerra Yoma Sensors 2023

Respiratory Distress Estimation in Human-Robot Interaction Scenario
Eduardo Alvarado and Nicolás Grágeda and Alejandro Luzanto and Rodrigo Mahu and Jorge Wuth and Laura Mendoza and Richard Stern and Néstor Becerra Yoma Proceedings of the Interspeech, Dublin, Ireland 2023

Unsupervised Voice Type Discrimination Score Adaptation Using X-Vector Clusters
Mark Lindsey and Tyler Vuong and Richard M Stern ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

Investigating the Important Temporal Modulations for Deep-Learning-Based Speech Activity Detection
Tyler Vuong and Nikhil Madaan and Rohan Panda and Richard M Stern 2022 IEEE Spoken Language Technology Workshop (SLT) 2023

Audio Retrieval with WavText5K and CLAP Training
Soham Deshmukh and Benjamin Elizalde and Huaming Wang Interspeech 2023 2023

Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems
Roshan Sharma and William Chen and Takatomo Kano and Ruchira Sharma and Siddhant Arora and Shinji Watanabe and Atsunori Ogawa and Marc Delcroix and Rita Singh and Bhiksha Raj 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023

Towards noise-tolerant speech-referring video object segmentation: Bridging speech and text
Xiang Li and Jinglu Wang and Xiaohao Xu and Muqiao Yang and Fan Yang and Yizhou Zhao and Rita Singh and Bhiksha Raj Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023

FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding in Open World
Thanh-Dat Truong and Utsav Prabhu and Bhiksha Raj and Jackson Cothren and Khoa Luu arXiv preprint arXiv:2311.15965 2023

Token Prediction as Implicit Classification to Identify LLM-Generated Text
Yutian Chen and Hao Kang and Vivian Zhai and Liangze Li and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2311.08723 2023

Rethinking Voice-Face Correlation: A Geometry View
Xiang Li and Yandong Wen and Muqiao Yang and Jinglu Wang and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2311.08723 2023

Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms
Joseph Konan and Ojas Bhargave and Shikhar Agnihotri and Shuo Han and Yunyang Zeng and Ankit Shah and Bhiksha Raj arXiv preprint arXiv:2310.07161 2023

Privacy-oriented manipulation of speaker representations
Francisco Teixeira and Alberto Abad and Bhiksha Raj and Isabel Trancoso arXiv preprint arXiv:2310.06652 2023

Continual Contrastive Spoken Language Understanding
Umberto Cappellazzo and Enrico Fini and Muqiao Yang and Daniele Falavigna and Alessio Brutti and Bhiksha Raj arXiv preprint arXiv:2310.02699 2023

Loft: Local proxy fine-tuning for improving transferability of adversarial attacks against large language model
Muhammad Ahmed Shah and Roshan Sharma and Hira Dhamyal and Raphael Olivier and Ankit Shah and Dareen Alharthi and Hazim T Bukhari and Massa Baali and Soham Deshmukh and Michael Kuhlmann and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2310.04445 2023

Evaluating speech synthesis by training recognizers on synthetic speech
Dareen Alharthi and Roshan Sharma and Hira Dhamyal and Soumi Maiti and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2310.00706 2023

Completing visual objects via bridging generation and segmentation
Xiang Li and Yinpeng Chen and Chung-Ching Lin and Rita Singh and Bhiksha Raj and Zicheng Liu arXiv preprint arXiv:2310.00808 2023

Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiulian Peng and Rita Singh and Yan Lu and Bhiksha Raj arXiv preprint arXiv:2310.00132 2023

Understanding and mitigating the label noise in pre-training on downstream tasks
Hao Chen and Jindong Wang and Ankit Shah and Ran Tao and Hongxin Wei and Xing Xie and Masashi Sugiyama and Bhiksha Raj arXiv preprint arXiv:2309.17002 2023

Rethinking audiovisual segmentation with semantic quantization and decomposition
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiulian Peng and Rita Singh and Yan Lu and Bhiksha Raj arXiv e-prints 2023

Understanding political polarization using language models: A dataset and method
Samiran Gode and Supreeth Bare and Bhiksha Raj and Hyungon Yoo AI Magazine 2023

Transferable Adversarial Perturbations between Self-Supervised Speech Recognition Models
Raphael Olivier and Hadi Abdullah and Bhiksha Raj The Second Workshop on New Frontiers in Adversarial Machine Learning 2023

The hidden dance of phonemes and visage: Unveiling the enigmatic link between phonemes and facial features
Liao Qu and Xianwei Zou and Xiang Li and Yandong Wen and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2307.13953 2023

BASS: Block-wise Adaptation for Speech Summarization
Roshan Sharma and Kenneth Zheng and Siddhant Arora and Shinji Watanabe and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2307.08217 2023

How many perturbations break this model? evaluating robustness beyond adversarial accuracy
Raphael Olivier and Bhiksha Raj International Conference on Machine Learning 2023

Panoramic video salient object detection with ambisonic audio guidance
Xiang Li and Haoyuan Cao and Shijie Zhao and Junlin Li and Li Zhang and Bhiksha Raj Proceedings of the AAAI Conference on Artificial Intelligence 2023

VLTinT: visual-linguistic transformer-in-transformer for coherent video paragraph captioning
Kashu Yamazaki and Khoa Vo and Quang Sang Truong and Bhiksha Raj and Ngan Le Proceedings of the AAAI Conference on Artificial Intelligence 2023

Utopia: Unconstrained tracking objects without preliminary examination via cross-domain adaptation
Pha Nguyen and Kha Gia Quach and John Gauch and Samee U Khan and Bhiksha Raj and Khoa Luu arXiv preprint arXiv:2306.09613 2023

An Approach to Ontological Learning from Weak Labels
Ankit Shah and Larry Tang and Po Hao Chou and Yi Yu Zheng and Ziqian Ge and Bhiksha Raj ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

👏 CLAP: learning audio concepts from natural language supervision
Benjamin Elizalde and Soham Deshmukh and Mahmoud Al Ismail and Huaming Wang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

Privacy-Preserving Automatic Speaker Diarization
Francisco Teixeira and Alberto Abad and Bhiksha Raj and Isabel Trancoso ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

The Interactive Machine Learning Paradigm
Mark Lindsey and Richard M Stern and Bhiksha Raj and Aswin Sankaranarayanan and Rita Singh and Francis Kubala ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement
Yunyang Zeng and Joseph Konan and David Bick and Muqiao Yang and Anurag Kumar and Shinji Watanabe and Bhiksha Raj ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023

PaintSeg: Training-free Segmentation via Painting
Xiang Li and Chung-Ching Lin and Yinpeng Chen and Zicheng Liu and Jinglu Wang and Bhiksha Raj arXiv preprint arXiv:2305.19406 2023

Imprecise label learning: A unified framework for learning with various imprecise label configurations
Hao Chen and Ankit Shah and Jindong Wang and Ran Tao and Yidong Wang and Xing Xie and Masashi Sugiyama and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2305.12715 2023

Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms
Joseph Konan and Ojas Bhargave and Shikhar Agnihotri and Hojeong Lee and Ankit Shah and Shuo Han and Yunyang Zeng and Amanda Shu and Haohui Liu and Xuankai Chang and Hamza Khalid and Minseon Gwak and Kawon Lee and Minjeong Kim and Bhiksha Raj arXiv preprint arXiv:2303.09048 2023

Approach to Learning Generalized Audio Representation Through Batch Embedding Covariance Regularization and Constant-Q Transforms
Ankit Shah and Shuyi Chen and Kejun Zhou and Yue Chen and Bhiksha Raj arXiv preprint arXiv:2303.03591 2023

Improving sound event detection with ontologies
Bhiksha Raj The Journal of the Acoustical Society of America 2023

Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session
Laurie M Heller and Benjamin Elizalde and Bhiksha Raj and Soham Deshmukh The Journal of the Acoustical Society of America 2023

Softmatch: Addressing the quantity-quality trade-off in semi-supervised learning
Hao Chen and Ran Tao and Yue Fan and Yidong Wang and Jindong Wang and Bernt Schiele and Xing Xie and Bhiksha Raj and Marios Savvides arXiv preprint arXiv:2301.10921 2023

Robust referring video object segmentation with cyclic structural consensus
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiao Li and Bhiksha Raj and Yan Lu Proceedings of the IEEE/CVF International Conference on Computer Vision 2023

Pairwise Similarity Learning is SimPLE
Yandong Wen and Weiyang Liu and Yao Feng and Bhiksha Raj and Rita Singh and Adrian Weller and Michael J Black and Bernhard Schölkopf Proceedings of the IEEE/CVF International Conference on Computer Vision 2023

Fredom: Fairness domain adaptation approach to semantic scene understanding
Thanh-Dat Truong and Ngan Le and Bhiksha Raj and Jackson Cothren and Khoa Luu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023

Aoe-net: Entities interactions modeling with adaptive attention mechanism for temporal action proposals generation
Khoa Vo and Sang Truong and Kashu Yamazaki and Bhiksha Raj and Minh-Triet Tran and Ngan Le International Journal of Computer Vision 2023

2022

Connecting human voice profiling to genomics: A predictive algorithm for linking speech phenotypes to genetic microdeletion syndromes
Rita Singh bioRxiv, doi: https://doi.org/10.1101/2022.05.23. 2022

Learnable Front Ends Based on Temporal Modulation for Music Tagging
Yinghao Ma and Richard M Stern arXiv preprint arXiv:2211.15254 2022

Effect of Titrated Exposure to Non-Traumatic Noise on Unvoiced Speech Recognition in Human Listeners with Normal Audiological Profiles
Mengchao Zhang and Richard M Stern and Deborah Moncrieff and Catherine Palmer and Christopher A Brown arXiv preprint arXiv:2211.15254 2022

Improved Modulation-Domain Loss for Neural-Network-based Speech Enhancement
Tyler Vuong and Richard Stern Proc. Interspeech 2022 2022

L3DAS22: Exploring Loss Functions for 3D Speech Enhancement
Tyler Vuong and Mark Lindsey and Yangyang Xia and Richard Stern Proc. L3DAS22: Machine Learning for 3D Audio Signal Processing 2022

Usb: A unified semi-supervised learning benchmark for classification
Yidong Wang and Hao Chen and Yue Fan and Wang Sun and Ran Tao and Wenxin Hou and Renjie Wang and Linyi Yang and Zhi Zhou and Lan-Zhe Guo and Heli Qi and Zhen Wu and Yu-Feng Li and Satoshi Nakamura and Wei Ye and Marios Savvides and Bhiksha Raj and Takahiro Shinozaki and Bernt Schiele and Jindong Wang and Xing Xie and Yue Zhang Advances in Neural Information Processing Systems 2022

5 and L Shaped ACS Fed MIMO Antenna with Improved Isolation and Diversity Metrics
Praveen V Naidu and B Sri Hasa and B Anjani Reddy and B Samuel Raj and Naveen Kalasani and Aravind Kumar 2022 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI) 2022

An embarrassingly simple baseline for imbalanced semi-supervised learning
Hao Chen and Yue Fan and Yidong Wang and Jindong Wang and Bernt Schiele and Xing Xie and Marios Savvides and Bhiksha Raj arXiv preprint arXiv:2211.11086 2022

Describing emotions with acoustic property prompts for speech emotion recognition
Hira Dhamyal and Benjamin Elizalde and Soham Deshmukh and Huaming Wang and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2211.07737 2022

Cross-utterance context for multimodal video transcription
Roshan Sharma and Bhiksha Raj 2022 56th Asilomar Conference on Signals, Systems, and Computers 2022

R^2-VOS: Robust Referring Video Object Segmentation via Relational Cycle Consistency
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiao Li and Yan Lu and Bhiksha Raj 2022 56th Asilomar Conference on Signals, Systems, and Computers 2022

Xnor-former: Learning accurate approximations in long speech transformers
Roshan Sharma and Bhiksha Raj arXiv preprint arXiv:2210.16643 2022

Unifying the discrete and continuous emotion labels for speech emotion recognition
Roshan Sharma and Hira Dhamyal and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2210.16642 2022

There is more than one kind of robustness: Fooling whisper with adversarial examples
Raphael Olivier and Bhiksha Raj arXiv preprint arXiv:2210.17316 2022

R^2-VOS: Robust Referring Video Object Segmentation via Relational Cycle Consistency
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiao Li and Yan Lu and Bhiksha Raj arXiv preprint arXiv:2210.17316 2022

Less Is More: Training on Low-Fidelity Images Improves Robustness to Adversarial Attacks
Muhammad A Shah and Bhiksha Raj arXiv preprint arXiv:2210.17316 2022

Training image classifiers using semi-weak label data
Ankit Parag Shah and Bhiksha Raj arXiv preprint arXiv:2210.17316 2022

Watch What You Pretrain For: Targeted, Transferable Adversarial Examples on Self-Supervised Speech Recognition models
Raphael Olivier and Hadi Abdullah and Bhiksha Raj arXiv preprint arXiv:2209.13523 2022

Hear: Holistic evaluation of audio representations
Joseph Turian and Jordie Shier and Humair Raj Khan and Bhiksha Raj and Björn W Schuller and Christian J Steinmetz and Colin Malloy and George Tzanetakis and Gissel Velarde and Kirk McNally and Max Henry and Nicolas Pinto and Camille Noufi and Christian Clough and Dorien Herremans and Eduardo Fonseca and Jesse Engel and Justin Salamon and Philippe Esling and Pranay Manocha and Shinji Watanabe and Zeyu Jin and Yonatan Bisk NeurIPS 2021 Competitions and Demonstrations Track 2022

Online video instance segmentation via robust context fusion
Xiang Li and Jinglu Wang and Xiaohao Xu and Bhiksha Raj and Yan Lu arXiv preprint arXiv:2207.05580 2022

R^2VOS: Robust Referring Video Object Segmentation via Relational Multimodal Cycle Consistency
Xiang Li and Jinglu Wang and Xiaohao Xu and Xiao Li and Yan Lu and Bhiksha Raj arXiv preprint arXiv:2207.01203 2022

Improving speech enhancement through fine-grained speech characteristics
Muqiao Yang and Joseph Konan and David Bick and Anurag Kumar and Shinji Watanabe and Bhiksha Raj arXiv preprint arXiv:2207.00237 2022

Self-supervision and learnable strfs for age, emotion, and country prediction
Roshan Sharma and Tyler Vuong and Mark Lindsey and Hira Dhamyal and Rita Singh and Bhiksha Raj arXiv preprint arXiv:2206.12568 2022

Towards End-to-End Private Automatic Speaker Recognition
Francisco Teixeira and Alberto Abad and Bhiksha Raj and Isabel Trancoso arXiv preprint arXiv:2206.11750 2022

Bear the Query in Mind: Visual Grounding with Query-conditioned Convolution
Chonghan Chen and Qi Jiang and Chih-Hao Wang and Noel Chen and Haohan Wang and Xiang Li and Bhiksha Raj arXiv preprint arXiv:2206.09114 2022

Freematch: Self-adaptive thresholding for semi-supervised learning
Yidong Wang and Hao Chen and Qiang Heng and Wenxin Hou and Yue Fan and Zhen Wu and Jindong Wang and Marios Savvides and Takahiro Shinozaki and Bhiksha Raj and Bernt Schiele and Xing Xie arXiv preprint arXiv:2205.07246 2022

On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice
Ankit Shah and Hira Dhamyal and Yang Gao and Daniel Arancibia and Mario Arancibia and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2204.04802 2022

Recent improvements of asr models in the face of adversarial attacks
Raphael Olivier and Bhiksha Raj arXiv preprint arXiv:2203.16536 2022

Point3D: tracking actions as moving points with 3D CNNs
Shentong Mo and Jingfei Xia and Xiaoqing Tan and Bhiksha Raj arXiv preprint arXiv:2203.10584 2022

Sphereface revived: Unifying hyperspherical face recognition
Weiyang Liu and Yandong Wen and Bhiksha Raj and Rita Singh and Adrian Weller IEEE Transactions on Pattern Analysis and Machine Intelligence 2022

Ontological Learning from Weak Labels
Larry Tang and Po Hao Chou and Yi Yu Zheng and Ziqian Ge and Ankit Shah and Bhiksha Raj arXiv preprint arXiv:2203.02483 2022

Positional Encoding for Capturing Modality Specific Cadence for Emotion Detection
Hira Dhamyal and Bhiksha Raj and Rita Singh Proc. Interspeech 2022 2022

Not all broken defenses are equal: The dead angles of adversarial accuracy
Raphaël Olivier and Bhiksha Raj CoRR 2022

Usb: A unified semi-supervised learning benchmark
Yidong Wang and Hao Chen and Yue Fan and Wang Sun and Ran Tao and Wenxin Hou and Renjie Wang and Linyi Yang and Zhi Zhou and Lan-Zhe Guo and Heli Qi and Zhen Wu and Yu-Feng Li and Satoshi Nakamura and Wei Ye and Marios Savvides and Bhiksha Raj and Takahiro Shinozaki and Bernt Schiele and Jindong Wang and Xing Xie and Yue Zhang Conference on Neural Information Processing Systems (NeurIPS) 2022

SphereFace2: Binary Classification is All You Need for Deep Face Recognition
Yandong Wen and Weiyang Liu and Adrian Weller and Bhiksha Raj and Rita Singh International Conference on Learning Representations (ICLR) 2022

2021

The Application of Learnable STRF Kernels to the 2021 Fearless Steps Phase-03 SAD Challenge
Tyler Vuong and Yangyang Xia and Richard M Stern Interspeech 2021

Temporal Context in Speech Emotion Recognition
Yangyang Xia and Li-Wei Chen and Alexander Rudnicky and Richard M Stern Interspeech 2021

A modulation-domain loss for neural-network-based real-time speech enhancement
Tyler Vuong and Yangyang Xia and Richard M Stern ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

Discriminative dictionary learning for autism Spectrum disorder identification
Wenbo Liu and Ming Li and Xiaobing Zou and Bhiksha Raj Frontiers in Computational Neuroscience 2021

Sequential randomized smoothing for adversarially robust speech recognition
Raphael Olivier and Bhiksha Raj arXiv preprint arXiv:2112.03000 2021

Identifying actions for sound event classification
Benjamin Elizalde and Radu Revutchi and Samarjit Das and Bhiksha Raj and Ian Lane and Laurie M Heller 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2021

Interpreting glottal flow dynamics for detecting covid-19 from voice
Soham Deshmukh and Mahmoud Al Ismail and Rita Singh ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

Detection of COVID-19 through the analysis of vocal fold oscillations
Mahmoud Al Ismail and Soham Deshmukh and Rita Singh ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

The in-the-wild speech medical corpus
Joana Correia and Francisco Teixeira and Catarina Botelho and Isabel Trancoso and Bhiksha Raj ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

High-frequency adversarial defense for speech and audio
Raphael Olivier and Bhiksha Raj and Muhammad Shah ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

Towards adversarial robustness via compact feature representations
Muhammad A Shah and Raphael Olivier and Bhiksha Raj ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

Foolhd: Fooling speaker identification by highly imperceptible adversarial disturbances
Ali Shahin Shamsabadi and Francisco Sepúlveda Teixeira and Alberto Abad and Bhiksha Raj and Andrea Cavallaro and Isabel Trancoso ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021

Constant random perturbations provide adversarial robustness with minimal effect on accuracy
Bronya Roni Chernyak and Bhiksha Raj and Tamir Hazan and Joseph Keshet arXiv preprint arXiv:2103.08265 2021

Constant Random Perturbations Provide Adversarial Robustness with Minimal Effect on Accuracy
Bronya Roni Chernyak and Bhiksha Raj and Tamir Hazan and Joseph Keshet arXiv e-prints 2021

Detection and evaluation of human and machine generated speech in spoofing attacks on automatic speaker verification systems
Yang Gao and Jiachen Lian and Bhiksha Raj and Rita Singh 2021 IEEE Spoken Language Technology Workshop (SLT) 2021

Optimal Strategies For Comparing Covariates To Solve Matching Problems
Muhammad A Shah and Raphael Olivier and Bhiksha Raj 2020 25th International Conference on Pattern Recognition (ICPR) 2021

Exploiting non-linear redundancy for neural model compression
Muhammad A Shah and Raphael Olivier and Bhiksha Raj 2020 25th International Conference on Pattern Recognition (ICPR) 2021

Hierarchical routing mixture of experts
Wenbo Zhao and Yang Gao and Shahan Ali Memon and Bhiksha Raj and Rita Singh 2020 25th International Conference on Pattern Recognition (ICPR) 2021

Masked Proxy Loss for Text-Independent Speaker Verification
Jiachen Lian and Aiswarya Vinod Kumar and Hira Dhamyal and Bhiksha Raj and Rita Singh Interspeech 2021 2021

Improving Weakly Supervised Sound Event Detection with Self-Supervised Auxiliary Tasks
Soham Deshmukh and Rita Singh and Bhiksha Raj Interspeech 2021 2021

Shadowing as peer experiential learning for faculty instructional development strategy: A case study on a computer science course
Rodolfo M Vega and Enrique Peláez and Bhiksha Raj International Journal of Educational Research Open 2021

Self-supervised 3d face reconstruction via conditional estimation
Yandong Wen and Weiyang Liu and Bhiksha Raj and Rita Singh Proceedings of the IEEE/CVF International Conference on Computer Vision 2021

Contrast and order representations for video self-supervised learning
Kai Hu and Jie Shao and Yuan Liu and Bhiksha Raj and Marios Savvides and Zhiqiang Shen Proceedings of the IEEE/CVF International Conference on Computer Vision 2021

The right to talk: An audio-visual transformer approach
Thanh-Dat Truong and Chi Nhan Duong and Hoang Anh Pham and Bhiksha Raj and Ngan Le and Khoa Luu Proceedings of the IEEE/CVF International Conference on Computer Vision 2021

2020

Speech-based parameter estimation of an asymmetric vocal fold oscillation model and its application in discriminating vocal fold pathologies
Wenbo Zhao and Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

Introduction to Neural Networks
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

Learnable spectro-temporal receptive fields for robust voice type discrimination
Tyler Vuong and Yangyang Xia and Richard Stern arXiv preprint arXiv:2010.09151 2020

Non causal deep learning based dereverberation
Jorge Wuth and Richard M Stern and Nestor Becerra Yoma arXiv preprint arXiv:2009.02832 2020

Binaural Technology for Machine Speech Recognition and Understanding
Richard M Stern and Anjali Menon The Technology of Binaural Understanding 2020

Sherlock: A crowd-sourced system for automatic tagging of indoor floor plans
Muhammad A Shah and Khaled A Harras and Bhiksha Raj 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS) 2020

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances
Ali Shahin Shamsabadi and Francisco Sepúlveda Teixeira and Alberto Abad and Bhiksha Raj and Andrea Cavallaro and Isabel Trancoso arXiv e-prints 2020

Multi-task learning for interpretable weakly labelled sound event detection
Soham Deshmukh and Bhiksha Raj and Rita Singh arXiv preprint arXiv:2008.07085 2020

Exploring optimal dnn architecture for end-to-end beamformers based on time-frequency references
Yuichiro Koyama and Bhiksha Raj arXiv preprint arXiv:2005.12683 2020

Exploring the best loss function for DNN-based low-latency speech enhancement with temporal convolutional networks
Yuichiro Koyama and Tyler Vuong and Stefan Uhlich and Bhiksha Raj arXiv preprint arXiv:2005.11611 2020

Efficient integration of multi-channel information for speaker-independent speech separation
Yuichiro Koyama and Oluwafemi Azeez and Bhiksha Raj arXiv preprint arXiv:2005.11612 2020

Deriving compact feature representations via annealed contraction
Muhammad A Shah and Bhiksha Raj ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

Automatic in-the-wild dataset annotation with deep generalized multiple instance learning
Joana Correia and Isabel Trancoso and Bhiksha Raj Proceedings of the Twelfth Language Resources and Evaluation Conference 2020

Artificial Creative Intelligence: Breaking the Imitation Barrier
Rowland Chen and Roger B Dannenberg and Bhiksha Raj and Rita Singh ICCC 2020

Controlled autoencoders to generate faces from voices
Hao Liang and Lulan Yu and Guikang Xu and Bhiksha Raj and Rita Singh Advances in Visual Computing: 15th International Symposium, ISVC 2020, San Diego, CA, USA, October 5–7, 2020, Proceedings, Part I 15 2020

Mask Proxy Loss for Text-Independent Speaker Recognition
Jiachen Lian and Aiswarya Vinod Kumar and Hira Dhamyal and Bhiksha Raj and Rita Singh CoRR 2020

Is normalization indispensable for training deep neural network?
Jie Shao and Kai Hu and Changhu Wang and Xiangyang Xue and Bhiksha Raj Advances in Neural Information Processing Systems 2020

2019

Profiling humans from their voice
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Feature Engineering for Profiling
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Mechanisms for Profiling
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Profiling and its facets
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Relations Between Voice and Profile Parameters
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

The Voice Signal and Its Information Content—2
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

The Voice Signal and Its Information Content—1
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Qualitative Aspects of the Voice Signal
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Production and perception of voice
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Reconstruction of the Human Persona in 3D from Voice, and its Reverse
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Applied Profiling: Uses, Reliability and Ethics
Rita Singh ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

The fMRI Data of Thompson et al. (2006) do not constrain how the human midbrain represents interaural time delay
Richard M Stern and H Steven Colburn and Leslie R Bernstein and Constantine Trahiotis Journal of the Association for Research in Otolaryngology 2019

Weighted delay-and-sum beamforming guided by visual tracking for human-robot interaction
José Novoa and Rodrigo Mahu and Alejandro Díaz and Jorge Wuth and Richard Stern and Nestor Becerra Yoma arXiv preprint arXiv:1906.07298 2019

On combining features for single-channel robust speech recognition in reverberant environments
José Novoa and Josué Fredes and Jorge Wuth and Fernando Huenupán and Richard M Stern and Nestor Becerra Yoma arXiv preprint arXiv:1906.07299 2019

Robust Recognition of Reverberant and Noisy Speech Using Coherence-based Processing
Anjali Menon and Chanwoo Kim and Richard M Stern ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

“Straightness” versus “briefness” in binaural cue extraction
G Christopher Stecker and Mathias Dietz and Richard M Stern The Journal of the Acoustical Society of America 2019

Emotion Recognition from Voice in the Wild
Oren Wright and Rita Singh and Richard M Stern The Journal of the Acoustical Society of America 2019

In-the-wild end-to-end detection of speech affecting diseases
Joana Correia and Isabel Trancoso and Bhiksha Raj 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019

Optimizing neural network embeddings using a pair-wise loss for text-independent speaker verification
Hira Dhamyal and Tianyan Zhou and Bhiksha Raj and Rita Singh 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019

Face Reconstruction from Voice using Generative Adversarial Networks
Yandong Wen and Rita Singh and Bhiksha Raj Neural Information Processing Systems 2019

The phonetic bases of vocal expressed emotion: natural versus acted
Hira Dhamyal and Shahan Ali Memon and Bhiksha Raj and Rita Singh arXiv preprint arXiv:1911.05733 2019

Preserving privacy in speaker and speech characterisation
Andreas Nautsch and Abelino Jiménez and Amos Treiber and Jascha Kolberg and Catherine Jasserand and Els Kindt and Héctor Delgado and Massimiliano Todisco and Mohamed Amine Hmani and Aymen Mtibaa and Mohammed Ahmed Abdelraheem and Alberto Abad and Francisco Teixeira and Driss Matrouf and Marta Gomez-Barrero and Dijana Petrovska-Delacrétaz and Gérard Chollet and Nicholas Evans and Thomas Schneider and Jean-François Bonastre and Bhiksha Raj and Isabel Trancoso and Christoph Busch arXiv preprint arXiv:1911.05733 2019

W-Net BF: DNN-based beamformer using joint training approach
Yuichiro Koyama and Bhiksha Raj arXiv preprint arXiv:1910.14262 2019

Detecting gender differences in perception of emotion in crowdsourced data
Shahan Ali Memon and Hira Dhamyal and Oren Wright and Daniel Justice and Vijaykumar Palat and William Boler and Bhiksha Raj and Rita Singh arXiv preprint arXiv:1910.11386 2019

Neural regression trees
Shahan Ali Memon and Wenbo Zhao and Bhiksha Raj and Rita Singh 2019 International Joint Conference on Neural Networks (IJCNN) 2019

Non-Determinism in Neural Networks for Adversarial Robustness
Daanish Ali Khan and Linhong Li and Ninghao Sha and Zhuoran Liu and Abelino Jimenez and Bhiksha Raj and Rita Singh arXiv preprint arXiv:1905.10906 2019

Nonlinear semi-parametric models for survival analysis
Chirag Nagpal and Rohan Sangave and Amit Chahar and Parth Shah and Artur Dubrawski and Bhiksha Raj arXiv preprint arXiv:1905.05865 2019

Time signal classification using random convolutional features
Abelino Jiménez and Bhiksha Raj ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Human behaviour recognition using WiFi channel state information
Daanish Ali Khan and Saquib Razak and Bhiksha Raj and Rita Singh ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Cross modal audio search and retrieval with joint embeddings based on text and audio
Benjamin Elizalde and Shuayb Zarar and Bhiksha Raj ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019

Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
Yandong Wen and Mahmoud Al Ismail and Weiyang Liu and Bhiksha Raj and Rita Singh Seventh International Conference on Learning Representations 2019

Reconstructing Faces from Voices
Yandong Wen and Rita Singh and Bhiksha Raj Seventh International Conference on Learning Representations 2019

Reconstructing Faces from Voices
Yandong Wen and Rita Singh and Bhiksha Raj arXiv preprint arXiv:1905.10604 2019

Sound event detection in the DCASE 2017 challenge
Annamaria Mesaros and Aleksandr Diment and Benjamin Elizalde and Toni Heittola and Emmanuel Vincent and Bhiksha Raj and Tuomas Virtanen IEEE/ACM Transactions on Audio, Speech, and Language Processing 2019

Hide and speak: Towards deep neural networks for speech steganography
Felix Kreuk and Yossi Adi and Bhiksha Raj and Rita Singh and Joseph Keshet arXiv preprint arXiv:1902.03083 2019

Speech as a (private?) biomarker for speech affecting diseases
Isabel Trancoso and Maria Joana Ribeiro Folgado Correia and Francisco Teixeira and Alberto Abad and Maria Catarina Tavares Botelho and Bhiksha Raj In ICIEA 2019

Hide and speak: Deep neural networks for speech steganography
Felix Kreuk and Yossi Adi and Bhiksha Raj and Rita Singh and Joseph Keshet arXiv preprint arXiv:1902.03083 2019

2018

Automatic guitar tablature transcription from audio using inharmonicity regression and Bayesian classification
Jonathan Michelson and Richard Stern and Thomas Sullivan Audio Engineering Society Convention 145 2018

Model compensation and matched condition methods for robust speech recognition
Rita Singh and Bhiksha Raj and Richard M Stern Audio Engineering Society Convention 145 2018

Signal and feature compensation methods for robust speech recognition
Rita Singh and Richard M Stern and Bhiksha Raj Audio Engineering Society Convention 145 2018

Sound source separation using phase difference and reliable mask selection
Chanwoo Kim and Anjali Menon and Michiel Bacchiani and Richard Stern 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018

An improved DNN-based spectral feature mapping that removes noise and reverberation for robust automatic speech recognition
Juan Pablo Escudero and José Novoa and Rodrigo Mahu and Jorge Wuth and Fernando Huenupán and Richard Stern and Néstor Becerra Yoma arXiv preprint arXiv:1803.09016 2018

Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments
José Novoa and Juan Pablo Escudero and Jorge Wuth and Victor Poblete and Simon King and Richard Stern and Néstor Becerra Yoma arXiv preprint arXiv:1803.09013 2018

A framework for testing and comparing binaural models
Mathias Dietz and Jean-Hugues Lestang and Piotr Majdak and Richard M Stern and Torsten Marquardt and Stephan D Ewert and William M Hartmann and Dan FM Goodman arXiv preprint arXiv:1803.09013 2018

Highly-reverberant real environment database: HRRE
Juan Pablo Escudero and Victor Poblete and José Novoa and Jorge Wuth and Josué Fredes and Rodrigo Mahu and Richard Stern and Néstor Becerra Yoma arXiv preprint arXiv:1801.09651 2018

A Priori SNR Estimation Based on a Recurrent Neural Network for Robust Speech Enhancement.
Yangyang Xia and Richard M Stern INTERSPEECH 2018

A Comparative Study of Spatial Speech Separation Techniques to Improve Speech Recognition
Xinhui Zhou and Chiman Kwan and Bulent Ayhan and Chanwoo Kim and Kshitiz Kumar and Richard Stern Advances in Neural Networks–ISNN 2018: 15th International Symposium on Neural Networks, ISNN 2018, Minsk, Belarus, June 25–28, 2018, Proceedings 15 2018

Sound source separation using phase difference and reliable mask selection
Chanwoo Kim and Anjali Menon and Michiel Bacchiani and Richard M Stern Advances in Neural Networks–ISNN 2018: 15th International Symposium on Neural Networks, ISNN 2018, Minsk, Belarus, June 25–28, 2018, Proceedings 15 2018

Querying depression vlogs
Joana Correia and Bhiksha Raj and Isabel Trancoso 2018 IEEE Spoken Language Technology Workshop (SLT) 2018

Interactive evaluation of classifiers under limited resources
Sabit Hassan and Shaden Shaar and Bhiksha Raj and Saquib Razak 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) 2018

AudioPairBank: towards a large-scale tag-pair-based audio content analysis
Sebastian Säger and Benjamin Elizalde and Damian Borth and Christian Schulze and Bhiksha Raj and Ian Lane EURASIP Journal on Audio, Speech, and Music Processing 2018

Learning sound events from webly labeled data
Anurag Kumar and Ankit Shah and Bhiksha Raj and Alex Hauptmann arXiv preprint arXiv:1811.09967 2018

Higher-order Network for Action Recognition
Kai Hu and Bhiksha Raj arXiv preprint arXiv:1811.07519 2018

Analysing Speech for Clinical Applications
Bhiksha Raj and Alberto Abad Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Model compensation and matched condition methods for robust speech recognition
Rita Singh and Bhiksha Raj and Richard M Stern Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Signal and feature compensation methods for robust speech recognition
Rita Singh and Richard M Stern and Bhiksha Raj Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Neural Regression Tree
Wenbo Zhao and Shahan Ali Memon and Bhiksha Raj and Rita Singh Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Analysing speech for clinical applications
Isabel Trancoso and Joana Correia and Francisco Teixeira and Bhiksha Raj and Alberto Abad Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Semantic Analysis of Audio Content
Sourish Chaudhuri and Bhiksha Raj Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Speech analytics for medical applications
Isabel Trancoso and Joana Correia and Francisco Teixeira and Bhiksha Raj and Alberto Abad Statistical Language and Speech Processing: 6th International Conference, SLSP 2018, Mons, Belgium, October 15–16, 2018, Proceedings 2018

Optimal strategies for matching and retrieval problems by comparing covariates
Yandong Wen and Mahmoud Al Ismail and Bhiksha Raj and Rita Singh arXiv preprint arXiv:1807.04834 2018

Classifier risk estimation under limited labeling resources
Anurag Kumar and Bhiksha Raj arXiv preprint arXiv:1807.04834 2018

A closer look at weak label learning for audio events
Ankit Shah and Anurag Kumar and Alexander G Hauptmann and Bhiksha Raj arXiv preprint arXiv:1804.09288 2018

NELS–Never-Ending Learner of Sounds
Benjamin Elizalde and Rohan Badlani and Ankit Shah and Anurag Kumar and Bhiksha Raj arXiv preprint arXiv:1801.05544 2018

DCASE 2017 task 1: Acoustic scene classification using shift-invariant kernels and random features
Abelino Jimenez and Benjamin Elizalde and Bhiksha Raj arXiv preprint arXiv:1801.02690 2018

Sound event classification using ontology-based neural networks
Benjamin Elizalde and Abelino Jimenez and Bhiksha Raj NIPS 2018 Workshop 2018

Privacy-Preserving Cloud Computing
Bhiksha Raj and Gérard Chollet International Conference on Advanced Technologies for Signal and Image Processing 2018

Sound event classification using ontology-based neural networks
Abelino Jimenez and Benjamin Elizalde and Bhiksha Raj Proceedings of the Annual Conference on Neural Information Processing Systems 2018

Mining multimodal repositories for speech affecting diseases
Joana Correia and Bhiksha Raj and Isabel Trancoso and Francisco Teixeira Proceedings of the Annual Conference on Neural Information Processing Systems 2018

Framework For Evaluation Of Sound Event Detection in Web Videos
Rohan Badlani and Ankit Shah and Benjamin Elizalde and Anurag Kumar and Bhiksha Raj IEEE International Conference on Acoustics Speech and Signal Processing 2018

Acoustic Scene Classification Using Discrete Random Hashing for Laplacian Kernel Machines
Abelino Jimenez and Benjamin Elizalde and Bhiksha Raj IEEE International Conference on Acoustics Speech and Signal Processing 2018

Voice impersonation using Generative Adversarial Networks
Yang Gao and Bhiksha Raj and Rita Singh IEEE International Conference on Acoustics Speech and Signal Processing 2018

Content-based representations of audio using Siamese Neural Networks
Pranay Manocha and Rohan Badlani and Anurag Kumar and Ankit Shah and Benjamin Elizalde and Bhiksha Raj IEEE International Conference on Acoustics Speech and Signal Processing 2018

A corrective learning approach for text-independent speaker verification
Yandong Wen and Tianyan Zhou and Rita Singh and Bhiksha Raj IEEE International Conference on Acoustics Speech and Signal Processing 2018

Future perspective
Dan Ellis and Tuomas Virtanen and Mark D Plumbley and Bhiksha Raj IEEE International Conference on Acoustics Speech and Signal Processing 2018

A comparative analysis of human-mediated and system-mediated interruptions for multi-user, multitasking interactions
Nia Peters and Griffin Romigh and George Bradley and Bhiksha Raj Advances in Human Factors and Systems Interaction: Proceedings of the AHFE 2017 International Conference on Human Factors and Systems Interaction, July 17–21, 2017, The Westin Bonaventure Hotel, Los Angeles, California, USA 8 2018

2017

Binaural processing for robust recognition of degraded speech
Anjali Menon and Chanwoo Kim and Umpei Kurokawa and Richard M Stern 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2017

Fingerprinting field programmable gate arrays
Vinayaka Jyothi and Ashik Poojari and Richard Stern and Ramesh Karri 2017 IEEE International Conference on Computer Design (ICCD) 2017

Synchrony-based feature extraction for robust automatic speech recognition
Fernando de-La-Calle-Silos and Richard M Stern IEEE Signal Processing Letters 2017

An initiative for testability and comparability of binaural models
Mathias Dietz and Torsten Marquardt and Piotr Majdak and Richard M Stern and William M Hartmann and Dan F Goodman and Stephan D Ewert The Journal of the Acoustical Society of America 2017

Predicting binaural lateralization and discrimination using the position-variable model
Richard M Stern The Journal of the Acoustical Society of America 2017

Locally normalized filter banks applied to deep neural-network-based robust speech recognition
Josué Fredes and José Novoa and Simon King and Richard M Stern and Nestor Becerra Yoma IEEE Signal Processing Letters 2017

Binaural processing for robust speech recognition of degraded speech
Anjali Menon and Chanwoo Kim and Umpei Kurokawa and Richard M Stern IEEE Signal Processing Letters 2017

Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction.
José Novoa and Jorge Wuth and Juan Pablo Escudero and Josué Fredes and Rodrigo Mahu and Richard M Stern and Nestor Becerra Yoma INTERSPEECH 2017

Robust speech recognition based on binaural auditory processing
Anjali Menon and Chanwoo Kim and Richard M Stern INTERSPEECH 2017

Robust features in deep-learning-based speech recognition
Vikramjit Mitra and Horacio Franco and Richard M Stern and Julien Van Hout and Luciana Ferrer and Martin Graciarena and Wen Wang and Dimitra Vergyri and Abeer Alwan and John HL Hansen New Era for Robust Speech Recognition: Exploiting Deep Learning 2017

Audition for multimedia computing
Gerald Friedland and Paris Smaragdis and Josh McDermott and Bhiksha Raj Advances in Human Factors and Systems Interaction: Proceedings of the AHFE 2017 International Conference on Human Factors and Systems Interaction, July 17–21, 2017, The Westin Bonaventure Hotel, Los Angeles, California, USA 8 2017

A two factor transformation for speaker verification through ℓ1 comparison
Abelino Jiménez and Bhiksha Raj 2017 IEEE Workshop on Information Forensics and Security (WIFS) 2017

DCASE 2017 challenge setup: Tasks, datasets and baseline system
Annamaria Mesaros and Toni Heittola and Aleksandr Diment and Benjamin Elizalde and Ankit Shah and Emmanuel Vincent and Bhiksha Raj and Tuomas Virtanen DCASE 2017-Workshop on Detection and Classification of Acoustic Scenes and Events 2017

DCASE 2017 Challenge Setup: Tasks, Datasets and Baseline System
Annamaria Mesaros and Toni Heittola and Aleksandr Diment and Benjamin Elizalde and Ankit Shah and Emmanuel Vincent and Bhiksha Raj and Tuomas Virtanen DCASE 2017

Topic and prosodic modeling for interruption management in multi-user multitasking communication interactions
Nia Peters and Bhiksha Raj and Griffin Romigh 2017 AAAI Fall Symposium Series 2017

Inferring room semantics using acoustic monitoring
Muhammad Ahmed Shah and Bhiksha Raj and Khaled A Harras 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP) 2017

An approach for self-training audio event detectors using web data
Benjamin Elizalde and Ankit Shah and Siddharth Dalmia and Min Hun Lee and Rohan Badlani and Anurag Kumar and Bhiksha Raj and Ian Lane 2017 25th European Signal Processing Conference (EUSIPCO) 2017

Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery.
Janek Ebbers and Jahn Heymann and Lukas Drude and Thomas Glarner and Reinhold Haeb-Umbach and Bhiksha Raj InterSpeech 2017

Be careful what you backpropagate: A case for linear output activations & gradient boosting
Anders Oland and Aayush Bansal and Roger B Dannenberg and Bhiksha Raj arXiv preprint arXiv:1707.04199 2017

Deep CNN framework for audio event recognition using weakly labeled web data
Anurag Kumar and Bhiksha Raj arXiv preprint arXiv:1707.02530 2017

Audio event and scene recognition: A unified approach using strongly and weakly labeled data
Anurag Kumar and Bhiksha Raj 2017 International Joint Conference on Neural Networks (IJCNN) 2017

Supervised monaural source separation based on autoencoders
Keiichi Osako and Yuki Mitsufuji and Rita Singh and Bhiksha Raj 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017

Discovering sound concepts and acoustic relations in text
Anurag Kumar and Bhiksha Raj and Ndapandula Nakashole 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017

Privacy preserving distance computation using somewhat-trusted third parties
Abelino Jimenez and Bhiksha Raj 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017

On the origin of deep learning
Haohan Wang and Bhiksha Raj 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2017

Privacy preserving biometric identity verification
Gérard Chollet and Abelino Jimenez and Dijana Petrovska-Delacrétaz and Bhiksha Raj COST IC1206 Training School 2017

The incredible shrinking neural network: New perspectives on learning representations through the lens of pruning
Aditya Sharma and Nikolas Wolfe and Bhiksha Raj arXiv preprint arXiv:1701.04465 2017

2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)(2014)
Guilin Liu and Y Wen and Z Yu and M Li and B Raj and L Song arXiv preprint arXiv:1701.04465 2017

Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR) (Dagstuhl Seminar 16442)
Roger Moore and Serge Thill and Ricard Marxer Dagstuhl Reports 2017

On the origin of deep learning
Haohan Wang and Bhiksha Raj and EP Xing arXiv preprint arXiv:1702.07800 2017

SphereFace: deep hypersphere embedding for face recognition
Weiyang Liu and Yandong Wen and Zhiding Yu and Ming Li and Bhiksha Raj and Le Song arXiv preprint arXiv:1704.08063 2017

Sphereface: Deep hypersphere embedding for face recognition
Weiyang Liu and Yandong Wen and Zhiding Yu and Ming Li and Bhiksha Raj and Le Song Proceedings of the IEEE conference on computer vision and pattern recognition 2017

The REVERB challenge: A benchmark task for reverberation-robust ASR techniques
Keisuke Kinoshita and Marc Delcroix and Sharon Gannot and Emanuël AP Habets and Reinhold Haeb-Umbach and Walter Kellermann and Volker Leutnant and Roland Maas and Tomohiro Nakatani and Bhiksha Raj and Armin Sehr and Takuya Yoshioka New Era for Robust Speech Recognition: Exploiting Deep Learning 2017

When to interrupt: A comparative analysis of interruption timings within collaborative communication tasks
Nia Peters and Griffin Romigh and George Bradley and Bhiksha Raj Advances in Human Factors and System Interactions: Proceedings of the AHFE 2016 International Conference on Human Factors and System Interactions, July 27-31, 2016, Walt Disney World®, Florida, USA 2017

2016

FPGA Trust Zone: Incorporating trust and reliability into FPGA designs
Vinayaka Jyothi and Manasa Thoonoli and Richard Stern and Ramesh Karri 2016 IEEE 34th International Conference on Computer Design (ICCD) 2016

A framework for auditory model comparability and applicability
Mathias Dietz and Piotr Majdak and Richard M Stern and Torsten Marquardt and William M Hartmann and Dan F Goodman and Stephan D Ewert Journal of the Acoustical Society of America 2016

The position-variable model at 40
Richard M Stern The Journal of the Acoustical Society of America 2016

A subband-based stationary-component suppression method using harmonics and power ratio for reverberant speech recognition
Byung Joon Cho and Haeyong Kwon and Ji-Won Cho and Chanwoo Kim and Richard M Stern and Hyung-Min Park IEEE Signal Processing Letters 2016

Power-normalized cepstral coefficients (PNCC) for robust speech recognition
Chanwoo Kim and Richard M Stern IEEE/ACM Transactions on audio, speech, and language processing 2016

Applying Physiologically-Motivated Models of Auditory Processing to Automatic Speech Recognition: Promise and Progress
Richard Stern IEEE/ACM Transactions on audio, speech, and language processing 2016

Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel-and Noise-Degraded Speech.
Vikramjit Mitra and Julien van Hout and Wen Wang and Chris Bartels and Horacio Franco and Dimitra Vergyri and Abeer Alwan and Adam Janin and John HL Hansen and Richard M Stern and Abhijeet Sangwan and Nelson Morgan INTERSPEECH 2016

The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms.
Víctor Poblete and Juan Pablo Escudero and Josué Fredes and José Novoa and Richard M Stern and Simon King and Néstor Becerra Yoma INTERSPEECH 2016

Adaptation of SVM for MIL for inferring the polarity of movies and movie reviews
Joana Correia and Isabel Trancoso and Bhiksha Raj 2016 IEEE Workshop on Spoken Language Technology 2016

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
Keisuke Kinoshita and Marc Delcroix and Sharon Gannot and Emanuël A P. Habets and Reinhold Haeb-Umbach and Walter Kellermann and Volker Leutnant and Roland Maas and Tomohiro Nakatani and Bhiksha Raj and Armin Sehr and Takuya Yoshioka EURASIP Journal on Advances in Signal Processing 2016

The incredible shrinking neural network: New perspectives on learning representations through the lens of pruning
Nikolas Wolfe and Aditya Sharma and Lukas Drude and Bhiksha Raj EURASIP Journal on Advances in Signal Processing 2016

Audio event detection using weakly labeled data
Anurag Kumar and Bhiksha Raj EURASIP Journal on Advances in Signal Processing 2016

On the Appropriateness of Complex-Valued Neural Networks for Speech Enhancement.
Lukas Drude and Bhiksha Raj and Reinhold Haeb-Umbach Interspeech 2016

DCASE challenge task 1
Anurag Kumar and Benjamin Elizalde and Ankit Shah and Rohan Badlani and Emmanuel Vincent and Bhiksha Raj and Ian Lane Tech. Rep., DCASE2016 Challenge 2016

Experimentation on the DCASE challenge 2016: Task 1-acoustic scene classification and task 3-sound event detection in real life audio
Benjamin Elizalde and Anurag Kumar and Ankit Shah and Rohan Badlani and Emmanuel Vincent and Bhiksha Raj and Ian Lane Detection and Classification of Acoustic Scenes and Events 2016

Experiments on the DCASE challenge 2016: Acoustic scene classification and sound event detection in real life recording
Benjamin Elizalde and Anurag Kumar and Ankit Shah and Rohan Badlani and Emmanuel Vincent and Bhiksha Raj and Ian Lane arXiv preprint arXiv:1607.06706 2016

Features and kernels for audio event recognition
Anurag Kumar and Bhiksha Raj arXiv preprint arXiv:1607.05765 2016

Weakly supervised scalable audio content analysis
Anurag Kumar and Bhiksha Raj 2016 IEEE International Conference on Multimedia and Expo (ICME) 2016

Audiosentibank: Large-scale semantic ontology of acoustic concepts for audio content analysis
Sebastian Sager and Damian Borth and Benjamin Elizalde and Christian Schulze and Bhiksha Raj and Ian Lane and Andreas Dengel arXiv preprint arXiv:1607.03766 2016

Audio content based geotagging in multimedia
Anurag Kumar and Benjamin Elizalde and Bhiksha Raj arXiv preprint arXiv:1606.02816 2016

Viral spread via entertainment and voice-messaging among telephone users in India
Agha Ali Raza and Rajat Kulshreshtha and Spandana Gella and Sean Blagsvedt and Maya Chandrasekaran and Bhiksha Raj and Roni Rosenfeld arXiv preprint arXiv:1606.02816 2016

Forensic anthropometry from voice: an articulatory-phonetic approach
Rita Singh and Bhiksha Raj and Deniz Gencaga 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) 2016

The relationship of voice onset time and voice offset time to physical age
Rita Singh and Joseph Keshet and Deniz Gencaga and Bhiksha Raj 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2016

Short-term analysis for estimating physical parameters of speakers
Rita Singh and Bhiksha Raj and James Baker 2016 4th International Conference on Biometrics and Forensics (IWBF) 2016

Formant manipulations in voice disguise by mimicry
Rita Singh and Deniz Gencaga and Bhiksha Raj 2016 4th International Conference on Biometrics and Forensics (IWBF) 2016

Content-based video indexing and retrieval using Corr-LDA
Rahul Radhakrishnan Iyer and Sanjeel Parekh and Vikas Mohandoss and Anush Ramsurat and Bhiksha Raj and Rita Singh arXiv preprint arXiv:1602.08581 2016

Content-based Video Indexing and Retrieval Using Corr-LDA
Rahul Radhakrishnan Iyer and Sanjeel Parekh and Vikas Mohandoss and Anush Ramsurat and Bhiksha Raj and Rita Singh arXiv e-prints 2016

Environmental noise embeddings for robust speech recognition
Suyoun Kim and Bhiksha Raj and Ian Lane arXiv preprint arXiv:1601.02553 2016

Learning model-based sparsity via projected gradient descent
Sohail Bahmani and Petros T Boufounos and Bhiksha Raj IEEE Transactions on Information Theory 2016

Revisiting Exploding Gradient: A Ghost That Never Leaves
Kai Hu and Matt Fredrikson IEEE Transactions on Information Theory 2016

An approach for self-training audio event detectors using web data
Ankit Shah and Rohan Badlani and Anurag Kumar and Benjamin Elizalde and Bhiksha Raj arXiv preprint arXiv:1609.06026 2016

The best of both worlds: Combining data-independent and data-driven approaches for action recognition
Zhenzhong Lan and Shoou-I Yu and Dezhong Yao and Ming Lin and Bhiksha Raj and Alexander Hauptmann Proceedings of the IEEE conference on computer vision and pattern recognition workshops 2016

Crowdsourced Video Subtitling with Adaptation Based on User-Corrected Lattices
João Miranda and Ramón F Astudillo and Ângela Costa and André Silva and Hugo Silva and João Graça and Bhiksha Raj Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016, Lisbon, Portugal, November 23-25, 2016, Proceedings 3 2016

Detecting psychological distress in adults through transcriptions of clinical interviews
Joana Correia and Isabel Trancoso and Bhiksha Raj Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016, Lisbon, Portugal, November 23-25, 2016, Proceedings 3 2016

2015

A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
Victor Poblete and Felipe Espic and Simon King and Richard M Stern and Fernando Huenupan and Josue Fredes and Nestor Becerra Yoma Computer Speech & Language 2015

Efficient real spherical harmonic representation of head-related transfer functions
Griffin D Romigh and Douglas S Brungart and Richard M Stern and Brian D Simpson IEEE Journal of Selected Topics in Signal Processing 2015

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop
Hynek Hermansky and Lukáš Burget and Jordan Cohen and Emmanuel Dupoux and Naomi Feldman and John Godfrey and Sanjeev Khudanpur and Matthew Maciejewski and Sri Harish Mallidi and Anjali Menon and Tetsuji Ogawa and Vijayaditya Peddinti and Richard Rose and Richard Stern and Matthew Wiesner and Karel Veselý 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

Efficient audio declipping using regularized least squares
Mark J Harvilla and Richard M Stern 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

Robust automatic speech recognition in reverberation: onset enhancement versus binaural source separation
Hyung-Min Park and Matthew Maciejewski and Chanwoo Kim and Richard M Stern The Journal of the Acoustical Society of America 2015

A Bayesian framework for the estimation of head-related transfer functions
Griffin D Romigh and Richard M Stern and Douglas S Brungart and Brian D Simpson The Journal of the Acoustical Society of America 2015

CMU Informedia @ TRECVID 2015 MED/SIN/LNK/SED
Shoou-I Yu and Lu Jiang and Zhongwen Xu and Zhenzhong Lan and Shicheng Xu and Xiaojun Chang and Xuanchong Li and Zexi Mao and Chuang Gan and Yajie Miao and Xingzhong Du and Yang Cai and Lara Martin and Nikolas Wolfe and Anurag Kumar and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard Stern and Alexander Hauptmann TREC Video Retrieval Evaluation 2015 2015

A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
Víctor Poblete Ramírez and Felipe Espic and Simon King and Richard M Stern and Fernando Huenupán and Josué Abraham Fredes Sandoval and Néstor Becerra Yoma TREC Video Retrieval Evaluation 2015 2015

Robustness to additive noise of locally-normalized cepstral coefficients in speaker verification
Josué Fredes and José Novoa and Victor Poblete and Simon King and Richard M Stern and Nestor Becerra Yoma Sixteenth Annual Conference of the International Speech Communication Association 2015

Robust parameter estimation for audio declipping in noise
Mark J Harvilla and Richard M Stern Sixteenth Annual Conference of the International Speech Communication Association 2015

Secure modular hashing
Abelino Jiménez and Bhiksha Raj and Jose Portelo and Isabel Trancoso 2015 IEEE international workshop on information forensics and security (WIFS) 2015

Handcrafted local features are convolutional neural networks
Zhenzhong Lan and Shoou-I Yu and Ming Lin and Bhiksha Raj and Alexander G Hauptmann arXiv preprint arXiv:1511.05045 2015

Complex recurrent neural networks for denoising speech signals
Keiichi Osako and Rita Singh and Bhiksha Raj 2015 IEEE workshop on applications of signal processing to audio and acoustics (WASPAA) 2015

A survey: Time travel in deep learning space: An introduction to deep learning models and how deep learning models evolved from the initial ideas
Haohan Wang and Bhiksha Raj arXiv preprint arXiv:1510.04781 2015

Binary sparse coding of convolutive mixtures for sound localization and separation via spatialization
Afsaneh Asaei and Mohammad Javad Taghizadeh and Saeid Haghighatshoar and Bhiksha Raj and Hervé Bourlard and Volkan Cevher IEEE Transactions on Signal Processing 2015

Efficient autism spectrum disorder prediction with eye movement: A machine learning framework
Wenbo Liu and Zhiding Yu and Bhiksha Raj and Li Yi and Xiaobing Zou and Ming Li 2015 International conference on affective computing and intelligent interaction (ACII) 2015

Rapid development of public health education systems in low-literacy multilingual environments: combating ebola through voice messaging.
Nikolas Wolfe and Juneki Hong and Agha Ali Raza and Bhiksha Raj and Roni Rosenfeld SLaTE 2015

Privacy-preserving multi-document summarization
Luís Marujo and José Portêlo and Wang Ling and David Martins de Matos and João P Neto and Anatole Gershman and Jaime Carbonell and Isabel Trancoso and Bhiksha Raj arXiv preprint arXiv:1508.01420 2015

Improving headphone spatialization for stereo music
Muhammad Haris Usmani arXiv preprint arXiv:1508.01420 2015

A novel ranking method for multiple classifier systems
Anurag Kumar and Bhiksha Raj 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

Privacy-preserving query-by-example speech search
José Portêlo and Alberto Abad and Bhiksha Raj and Isabel Trancoso 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

Reducing communication overhead in distributed learning by an order of magnitude (almost)
Anders Øland and Bhiksha Raj 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015

Logsum using garbled circuits
José Portêlo and Bhiksha Raj and Isabel Trancoso Plos one 2015

Compositional models for audio processing: Uncovering the structure of sound mixtures
Tuomas Virtanen and Jort Florent Gemmeke and Bhiksha Raj and Paris Smaragdis IEEE Signal Processing Magazine 2015

Unsupervised fusion weight learning in multiple classifier systems
Anurag Kumar and Bhiksha Raj arXiv preprint arXiv:1502.01823 2015

Gaussian Pyramid: Multi-skip Feature Stacking for Action Recognition
Zhenzhong Lan and Ming Lin and Xuanchong Li and Alexander G Hauptmann and Bhiksha Raj Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2015

Locality constrained transitive distance clustering on speech data.
Wenbo Liu and Zhiding Yu and Bhiksha Raj and Ming Li INTERSPEECH 2015

Beyond gaussian pyramid: Multi-skip feature stacking for action recognition
Zhenzhong Lan and Ming Lin and Xuanchong Li and Alex G Hauptmann and Bhiksha Raj Proceedings of the IEEE conference on computer vision and pattern recognition 2015

2014

In vivo treatment sensitivity testing with positron emission tomography/computed tomography after one cycle of chemotherapy for Hodgkin lymphoma
Martin Hutchings and Lale Kostakoglu and Jan Maciej Zaucha and Bogdan Malkowski and Alberto Biggi and Iwona Danielewicz and Annika Loft and Lena Specht and Dominick Lamonica and Myron S Czuczman and Christina Nanni and Pier Luigi Zinzani and Louis Diehl and Richard Stern and Morton Coleman J Clin Oncol 2014

Signal separation system and method for automatically selecting threshold to separate sound sources
Chan Woo Kim and Ki Wan Eom and Jae Won Lee and Richard M Stern J Clin Oncol 2014

An analysis of binaural spectro-temporal masking as nonlinear beamforming
Amir R Moghimi and Richard M Stern 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014

Time and intensity with Tino
Richard M Stern The Journal of the Acoustical Society of America 2014

Optimization of the parameters characterizing sigmoidal rate-level functions based on acoustic features
Víctor Poblete Ramírez and Néstor Becerra Yoma and Richard Stern The Journal of the Acoustical Society of America 2014

Informedia@ trecvid 2014 med and mer
Shoou-I Yu and Lu Jiang and Zexi Mao and Xiaojun Chang and Xingzhong Du and Chuang Gan and Zhenzhong Lan and Zhongwen Xu and Xuanchong Li and Yang Cai and Anurag Kumar and Yajie Miao and Lara Martin and Nikolas Wolfe and Shicheng Xu and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard Stern and Alexander Hauptmann NIST TRECVID Video Retrieval Evaluation Workshop 2014

Post-masking: A hybrid approach to array processing for speech recognition
Amir R Moghimi and Bhiksha Raj and Richard M Stern Fifteenth Annual Conference of the International Speech Communication Association 2014

Robust speech recognition in reverberant environments using subband-based steady-state monaural and binaural suppression
Hyung-Min Park and Matthew Maciejewski and Chanwoo Kim and Richard M Stern Fifteenth Annual Conference of the International Speech Communication Association 2014

Least squares signal declipping for robust speech recognition
Mark J Harvilla and Richard M Stern Fifteenth Annual Conference of the International Speech Communication Association 2014

Robust speech recognition using temporal masking and thresholding algorithm
Chanwoo Kim and Kean K Chin and Michiel Bacchiani and Richard M Stern Fifteenth Annual Conference of the International Speech Communication Association 2014

Optimization of the parameters characterizing sigmoidal rate-level functions based on acoustic features
Victor Poblete and Néstor Becerra Yoma and Richard M Stern Speech Communication 2014

Bach in 2014: Music composition with recurrent neural network
I Liu and Bhiksha Ramakrishnan arXiv preprint arXiv:1412.3191 2014

Privacy-preserving speaker verification using garbled GMMs
José Portêlo and Bhiksha Raj and Alberto Abad and Isabel Trancoso 2014 22nd European Signal Processing Conference (EUSIPCO) 2014

Detecting sound objects in audio recordings
Anurag Kumar and Rita Singh and Bhiksha Raj 2014 22nd European Signal Processing Conference (EUSIPCO) 2014

Privacy-preserving important passage retrieval
Luís Marujo and José Portêlo and David Martins de Matos and João P Neto and Anatole Gershman and Jaime Carbonell and Isabel Trancoso and Bhiksha Raj arXiv preprint arXiv:1407.5416 2014

Privacy-preserving speaker verification using secure binary embeddings
José Portêlo and Bhiksha Raj and Alberto Abad and Isabel Trancoso 2014 37th International convention on information and communication technology, electronics and microelectronics (MIPRO) 2014

Active-set newton algorithm for non-negative sparse coding of audio
Tuomas Virtanen and Bhiksha Raj and Jort F Gemmeke 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014

Iterative Bayesian word segmentation for unsupervised vocabulary discovery from phoneme lattices
Jahn Heymann and Oliver Walter and Reinhold Haeb-Umbach and Bhiksha Raj 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014

4.4 Unsupervised Learning for Audio
Tuomas Virtanen and Jon Barker and Shrikanth Narayanan and Alexandros Potamianos and Bhiksha Raj and Gaël Richard and Rita Singh and Paris Smaragdis and Stefano Squartini and Shiva Sundaram Computational Audio Analysis 2014

Informedia@ TrecVID 2014: MED and MER
Shoou-I Yu and Lu Jiang and Zhongwen Xu and Zhenzhong Lan and Shicheng Xu and Xiaojun Chang and Xuanchong Li and Zexi Mao and Chuang Gan and Yajie Miao and Xingzhong Du and Yang Cai and Lara Martin and Nikolas Wolfe and Anurag Kumar and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard A Stern and Alexander G Hauptmann TREC Video Retrieval Evaluation 2014 2014

Identification of Safest Path using Crime Records
Puneet Singh and Vasu Sharma and Rajat Kulshreshtha and Nishant Agrawal and Akshay Kumar and Bhiksha Raj and Rita Singh TREC Video Retrieval Evaluation 2014 2014

Post-masking: a hybrid approach to array processing for speech recognition.
Amir R Moghimi and Bhiksha Raj and Richard M Stern INTERSPEECH 2014

2013

Cmu-informedia at trecvid 2013 multimedia event detection
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Shourabh Rawat and Yang Cai and Chenqiang Gao and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Richard Stern and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann TRECVID 2013 Workshop 2013

Lateralization, discrimination, detection, and Ervin
Richard M Stern The Journal of the Acoustical Society of America 2013

Cognition in older breast cancer patients prior to systemic therapy: The Thinking and Living With Cancer (TLC) study
J Mandelblatt and R Stern and G Luta and J Clapp and A Hurria and P Jacobsen and A Saykin and T Ahles Journal of geriatric oncology 2013

PERCEPTION-BASED MEDIA PROCESSING
K Brandenburg and C Faller and J Herre and JD Johnston and WB Kleijn and S Spors and H Wierstorf and A Raake and F Melchior and M Frank and F Zotter and G Richard and S Sundaram and S Narayanan and S Möller and R Heusdens and H Hermansky and JR Cohen and RM Stern and J Wouters and S Doclo and R Koning and T Francart and E Reinhard and AA Efros and J Kautz and HP Seidel and AC Bovik and HR Wu and AR Reibman and W Lin and F Pereira and SS Hemami and LB Kish and CG Granqvist and LJ Karam and K MacLean and R Garner Proceedings of the IEEE 2013

Optimization of sigmoidal rate-level function based on acoustic features.
Víctor Poblete and Néstor Becerra Yoma and Richard M Stern INTERSPEECH 2013

Perceptual properties of current speech recognition technology
Hynek Hermansky and Jordan R Cohen and Richard M Stern Proceedings of the IEEE 2013

The role of spatial detail in sound-source localization: Impact on HRTF modeling and personalization.
Griffin D Romigh and Douglas Brungart and Richard M Stern and Brian D Simpson Proceedings of Meetings on Acoustics 2013

The role of spatial detail in sound-source localization: Impact on head-related transfer function modeling and personalization
Griffin D Romigh and Douglas S Brungart and Richard M Stern and Brian D Simpson The Journal of the Acoustical Society of America 2013

Master of Science in Music and Technology
Dalong Cheng and Roger Dannenberg and Richard Stern and Richard Randall The Journal of the Acoustical Society of America 2013

Informedia E-Lamp@ TRECVID 2013: Multimedia Event Detection and Recounting (MED and MER)
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Chenqiang Gao and Shourabh Rawat and Yang Cai and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Nicolas Ballas and Deyu Meng and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg The Journal of the Acoustical Society of America 2013

A hierarchical system for word discovery exploiting DTW-based initialization
Oliver Walter and Timo Korthals and Reinhold Haeb-Umbach and Bhiksha Raj 2013 IEEE Workshop on Automatic Speech Recognition and Understanding 2013

Unsupervised word segmentation from noisy input
Jahn Heymann and Oliver Walter and Reinhold Haeb-Umbach and Bhiksha Raj 2013 IEEE workshop on automatic speech recognition and understanding 2013

Swara Histogram Based Structural Analysis And Identification Of Indian Classical Ragas.
Pranay Dighe and Harish Karnick and Bhiksha Raj ISMIR 2013

Hidden Markov Models
Bhiksha Raj TRECVID 2013 Workshop 2013

The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
Keisuke Kinoshita and Marc Delcroix and Takuya Yoshioka and Tomohiro Nakatani and Emanuel Habets and Reinhold Haeb-Umbach and Volker Leutnant and Armin Sehr and Walter Kellermann and Roland Maas and Sharon Gannot and Bhiksha Raj 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2013

Event detection in short duration audio using gaussian mixture model and random forest classifier
Anurag Kumar and Rajesh M Hegde and Rita Singh and Bhiksha Raj 21st European Signal Processing Conference (EUSIPCO 2013) 2013

Speaker verification using secure binary embeddings
José Portêlo and Bhiksha Raj and Petros Boufounos and Isabel Trancoso and Alberto Abad 21st European Signal Processing Conference (EUSIPCO 2013) 2013

Secure binary embeddings of front-end factor analysis for privacy preserving speaker verification.
José Portelo and Alberto Abad and Bhiksha Raj and Isabel Trancoso INTERSPEECH 2013

Doppler based speed estimation of vehicles using passive sensor
Shubhranshu Barnwal and Rohit Barnwal and Rajesh Hegde and Rita Singh and Bhiksha Raj 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) 2013

Scale independent raga identification using chromagram patterns and swara based features
Pranay Dighe and Parul Agrawal and Harish Karnick and Siddartha Thota and Bhiksha Raj 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) 2013

Measuring prevalence of other-oriented transactive contributions using an automated measure of speech style accommodation
Gahgene Gweon and Mahaveer Jain and John McDonough and Bhiksha Raj and Carolyn P Rosé International Journal of Computer-Supported Collaborative Learning 2013

Joint constrained maximum likelihood regression for overlapping speech recognition
Kenichi Kumatani and Rita Singh and Friedrich Faubel and John McDonough and Youssef Oualil 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013

Optimization of the DET curve in speaker verification under noisy conditions
Leibny Paola Garcia Perera and Bhiksha Raj and Juan Arturo Nolazco Flores 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2013

Unsupervised hierarchical structure induction for deeper semantic analysis of audio
Sourish Chaudhuri and Bhiksha Raj 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013

Speaker tracking with spherical microphone arrays
John McDonough and Kenichi Kumatani and Takayuki Arakawa and Kazumasa Yamamoto and Bhiksha Raj 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013

Active-set Newton algorithm for overcomplete non-negative representations of audio
Tuomas Virtanen and Jort Florent Gemmeke and Bhiksha Raj IEEE Transactions on Audio, Speech, and Language Processing 2013

A unifying analysis of projected gradient descent for ℓp-constrained least squares
Sohail Bahmani and Bhiksha Raj Applied and Computational Harmonic Analysis 2013

Privacy-preserving probabilistic inference based on hidden Markov models
Shantanu Rane and Wei Sun and Manas A Pathak and Bhiksha Raj Applied and Computational Harmonic Analysis 2013

Robust 1-bit compressive sensing via gradient support pursuit
Sohail Bahmani and Petros T Boufounos and Bhiksha Raj arXiv preprint arXiv:1304.6627 2013

Greedy sparsity-constrained optimization
Sohail Bahmani and Bhiksha Raj and Petros T Boufounos The Journal of Machine Learning Research 2013

Privacy-preserving speech processing: cryptographic and string-matching frameworks show promise
Manas A Pathak and Bhiksha Raj and Shantanu D Rane and Paris Smaragdis IEEE signal processing magazine 2013

Ensemble approach in speaker verification
Leibny Paola Garcia Perera and Bhiksha Raj and Juan Arturo Nolazco Flores Proc. Interspeech 2013 2013

Differentially private aggregate classifier for multiple databases
Bhiksha Ramakrishnan and Shantanu Rane and Manas A. Pathak Proc. Interspeech 2013 2013

Block-sparse basis sets for improved audio content estimation
Sourish Chaudhuri and Rita Singh and Bhiksha Raj ICASSP 2013

Informedia E-Lamp@ TRECVID 2013: Multimedia Event Detection and Recounting (MED and MER)
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Chenqiang Gao and Shourabh Rawat and Yang Cai and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Nicolas Ballas and Deyu Meng and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg ICASSP 2013

A Comparative Study Of Indian And Western Music Forms.
Parul Agarwal and Harish Karnick and Bhiksha Raj ISMIR 2013

Discriminatively trained dependency language modeling for conversational speech recognition.
Benjamin Lambert and Bhiksha Raj and Rita Singh INTERSPEECH 2013

A multipath sparse beamforming method
Afsaneh Asaei and Bhiksha Raj and Hervé Bourlard and Volkan Cevher Signal Processing with Adaptive Sparse Structured Representations (SPARS) 2013

Unsupervised word discovery from phonetic input using nested pitman-yor language modeling
Oliver Walter and Reinhold Haeb-Umbach and Sourish Chaudhuri and Bhiksha Raj ICRA Workshop on Autonomous Learning 2013

2012

Department of Electrical Engineering and Biomedical Engineering Program Carnegie-Mellon University Pittsburgh, Pennsylvania 15213 USA
Richard M Stern and Stephen J Bachorski HEARING-Physiological Bases and Psychophysics 2012

Optimization of the DET curve in speaker verification
L Paola Garcia-Perera and Juan A Nolazco-Flores and Bhiksha Raj and Richard Stern 2012 IEEE Spoken Language Technology Workshop (SLT) 2012

Features based on auditory physiology and perception
Richard M Stern and Nelson Morgan Techniques for Noise Robustness in Automatic Speech Recognition 2012

Informedia e-lamp@ trecvid 2012: multimedia event detection and recounting (med and mer)
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Sohail Bahmani and Antonio Juarez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann Techniques for Noise Robustness in Automatic Speech Recognition 2012

Hearing is believing: Biologically inspired methods for robust automatic speech recognition
Richard M Stern and Nelson Morgan IEEE signal processing magazine 2012

Pretherapy metabolic tumor burden (MTV) may risk-stratify lymphoma patients: Comparison with early metabolic response
Lale Kostakoglu and Neetha Gandikota and Martin Hutchings and Ryan Cotter and Dominick Lamonica and Josef Machac and Richard Stern and Morton Coleman IEEE signal processing magazine 2012

Two-microphone source separation algorithm based on statistical modeling of angle distributions
Chanwoo Kim and Charbel Khawand and Richard M Stern 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

Histogram-based subband powerwarping and spectral averaging for robust speech recognition under matched and multistyle training
Mark J Harvilla and Richard M Stern 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

Review of research exploring school attitude and related constructs
Mandy Stern 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

Microphone array processing for distant speech recognition: Spherical arrays
John McDonough and Kenichi Kumatani and Bhiksha Raj Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2012

Microphone array processing for distant speech recognition: Towards real-world deployment
Kenichi Kumatani and Takayuki Arakawa and Kazumasa Yamamoto and John McDonough and Bhiksha Raj and Rita Singh and Ivan Tashev Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2012

Techniques for noise robustness in automatic speech recognition
Tuomas Virtanen and Rita Singh and Bhiksha Raj 2012 IEEE Spoken Language Technology Workshop (SLT) 2012

The problem of robustness in automatic speech recognition
Bhiksha Raj and Tuomas Virtanen and Rita Singh Techniques for Noise Robustness in Automatic Speech Recognition 2012

Missing‐Data Techniques: Feature Reconstruction
Jort Florent Gemmeke and Ulpu Remes Techniques for Noise Robustness in Automatic Speech Recognition 2012

Acoustic model training for robust speech recognition
Michael L Seltzer Techniques for Noise Robustness in Automatic Speech Recognition 2012

Uncertainty Decoding.
Hank Liao and Tuomas Virtanen and Bhiksha Raj and Rita Singh Techniques for Noise Robustness in Automatic Speech Recognition 2012

Microphone array processing for distant speech recognition
Kenichi Kumatani and John McDonough and Bhiksha Raj IEEE Signal Processing Magazine 2012

Informedia e-lamp@ trecvid 2012: multimedia event detection and recounting (med and mer)
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Sohail Bahmani and Antonio Juarez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann IEEE Signal Process. Mag 2012

Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors
Kenichi Kumatani and John McDonough and Bhiksha Raj IEEE Signal Processing Magazine 2012

Ultrasonic doppler sensor for speaker recognition
Bhiksha Raj Ramakrishnan and Kaustubh Kalgaonkar IEEE Signal Processing Magazine 2012

Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia.
Sourish Chaudhuri and Rita Singh and Bhiksha Raj INTERSPEECH 2012

Plagiarism Detection in Polyphonic Music using Monaural Signal Separation.
Soham De and Indradyumna Roy and Tarunima Prabhakar and Kriti Suneja and Sourish Chaudhuri and Rita Singh and Bhiksha Raj INTERSPEECH 2012

Microphone array post-filter based on spatially-correlated noise measurements for distant speech recognition
Kenichi Kumatani and Bhiksha Raj and Rita Singh and John McDonough Proc. Interspeech, Portland, OR 2012

Privacy-preserving speaker verification and identification using gaussian mixture models
Manas A Pathak and Bhiksha Raj IEEE Transactions on Audio, Speech, and Language Processing 2012

Method for determining distributions of unobserved classes of a classifier
Bhiksha Raj Ramakrishnan and Evandro Bacci Gouvêa IEEE Transactions on Audio, Speech, and Language Processing 2012

Large margin Gaussian mixture models with differential privacy
Manas A Pathak and Bhiksha Raj IEEE Transactions on dependable and secure computing 2012

An unsupervised dynamic bayesian network approach to measuring speech style accommodation
Mahaveer Jain and John McDonough and Gahgene Gweon and Bhiksha Raj and Carolyn Rose Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics 2012

Spectrographic seam patterns for discriminative word spotting
Shubhranshu Barnwal and Kamal Sahni and Rita Singh and Bhiksha Raj 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

Audio event detection from acoustic unit occurrence patterns
Anurag Kumar and Pranay Dighe and Rita Singh and Sourish Chaudhuri and Bhiksha Raj 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP) 2012

Attacking a privacy preserving music matching algorithm
José Portêlo and Bhiksha Raj and Isabel Trancoso 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

Privacy-preserving speaker verification as password matching
Manas A Pathak and Bhiksha Raj 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012

The Markov selection model for concurrent speech recognition
Paris Smaragdis and Bhiksha Raj Neurocomputing 2012

Ultrasonic doppler sensing in hci
Bhiksha Raj and Kaustubh Kalgaonkar and Chris Harrison and Paul Dietz IEEE Pervasive Computing 2012

Informedia@ TRECVID 2012.
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Nisarga Markandaiah and Sohail Bahmani and Antonio Juárez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Richard M Stern and Teruko Mitamura and Eric Nyberg and Lu Jiang and Qiang Chen and Lisa M Brown and Ankur Datta and Quanfu Fan and Rogério Schmidt Feris and Shuicheng Yan and Alexander G Hauptmann and Sharath Pankanti TRECVID 2012

Foundations 2. The Basics of Automatic Speech Recognition
Tuomas Virtanen and Rita Singh and Bhiksha Raj TRECVID 2012

Microphone Arrays
T Virtanen and R Singh and B Raj TRECVID 2012

Method for indexing for retrieving documents using particles
Bret A. Harsham and Bhiksha Ramakrishnan and Evandro B. Gouvêa and Bent Schmidt-Nielsen and Garrett Weinberg TRECVID 2012

Segment and conquer
Benjamin Elizalde and Bhiksha Raj and Gerald Friedland and Juan Nolazco and Leibny Garcia 2nd Multimedia and Vision Meeting in Greater New York Area 2012

Demonstration of advanced multi-modal, network-centric communication management suite
Victor Finomore Jr and John Stewart and Rita Singh and Bhiksha Raj and Ron Dallman Thirteenth Annual Conference of the International Speech Communication Association 2012

Distant Multi-Speaker Voice Activity Detection Using Relative Energy Ratio
Gang Chen and Kenichi Kumatani and John McDonough and Bhiksha Raj International Conference on Acoustics, Speech and Signal Processing 2012

Unsupervised structure discovery for semantic analysis of audio
Sourish Chaudhuri and Bhiksha Raj Advances in Neural Information Processing Systems 2012

Language identification using spectro-temporal patch features
Kamal Sahni and Pranay Dighe and Rita Singh and Bhiksha Raj SAPA-SCALE Conference 2012

Privacy-preserving speaker authentication
Manas Pathak and Jose Portelo and Bhiksha Raj and Isabel Trancoso Information Security: 15th International Conference, ISC 2012, Passau, Germany, September 19-21, 2012. Proceedings 15 2012

Structured Sparse Coding for Microphone Array Location Calibration
Afsaneh Asaei and Bhiksha Raj and Hervé Bourlard and Volkan Cevher The 5th ISCA workshop on statistical and perceptual audition (SAPA2012) 2012

Predicting idea co-construction in speech data using insights from sociolinguistics
Gahgene Gweon and Mahaveer Jain and John McDonough and Bhiksha Raj and Carolyn Rose The 5th ISCA workshop on statistical and perceptual audition (SAPA2012) 2012

2011

Applying physiologically-motivated models of auditory processing to automatic speech recognition
Richard M Stern Proceedings of the International Symposium on Auditory and Audiological Research 2011

Acoustical Society of America Distinguished Service Citation
Richard Stern The Journal of the Acoustical Society of America 2011

Learning-based auditory encoding for robust speech recognition
Yu-Hsiang Bosco Chiu and Bhiksha Raj and Richard M Stern IEEE transactions on audio, speech, and language processing 2011

Gammatone sub-band magnitude-domain dereverberation for ASR
Kshitiz Kumar and Rita Singh and Bhiksha Raj and Richard Stern 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011

An iterative least-squares technique for dereverberation
Kshitiz Kumar and Bhiksha Raj and Rita Singh and Richard M Stern 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011

Binaural sound source separation motivated by auditory processing
Chanwoo Kim and Kshitiz Kumar and Richard M Stern 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011

Delta-spectral cepstral coefficients for robust speech recognition
Kshitiz Kumar and Chanwoo Kim and Richard M Stern 2011 IEEE International conference on acoustics, speech and signal processing (ICASSP) 2011

Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
Wooil Kim and Richard M Stern Speech Communication 2011

Modelling the combined effect of binaural hearing and reverberation
Birger Kollmeier and Jan Rennies and Anna Warzybok and Thomas Brand Proceedings of the International Symposium on Auditory and Audiological Research 2011

Maximum kurtosis beamforming with a subspace filter for distant speech recognition
Kenichi Kumatani and John McDonough and Bhiksha Raj 2011 IEEE Workshop on Automatic Speech Recognition & Understanding 2011

Scalable audio-content analysis
Bhiksha Raj and Paris Smaragdis and Malcolm Slaney and Chung-Hsien Wu and Liming Chen and Hyoung-Gook Kim 2011 IEEE Workshop on Automatic Speech Recognition & Understanding 2011

Sphinx-4: A flexible open source framework for speech recognition (2004)
Willie Walker and Paul Lamere and Philip Kwok and Bhiksha Raj and Rita Singh and Evandro Gouvea and Peter Wolf and Joe Woelfel Technical Report. Available at: <http://cmusphinx.sourceforge.net>. Accessed 2011

Efficient Protocols for Principal Eigenvector Computation over Private Data.
Manas A Pathak and Bhiksha Raj Trans. Data Priv. 2011

Missing data imputation for time-frequency representations of audio signals
Paris Smaragdis and Bhiksha Raj and Madhusudana Shashanka Journal of Signal Processing Systems 2011

An information filter for voice prompt suppression
John McDonough and Wei Chu and Kenichi Kumatani and Bhiksha Raj and Jill Fain Lehman 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR) 2011

Learning contextual relevance of audio segments using discriminative models over AUD sequences
Sourish Chaudhuri and Bhiksha Raj 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2011

Block-wise incremental adaptation algorithm for maximum kurtosis beamforming
Kenichi Kumatani and John McDonough and Bhiksha Raj 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2011

On the combination of voice prompt suppression with maximum kurtosis beamforming
John McDonough and Bhiksha Raj and Kenichi Kumatani 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2011

Joint sparsity models for wideband array processing
Petros T Boufounos and Paris Smaragdis and Bhiksha Raj Wavelets and Sparsity XIV 2011

On the implementation of a secure musical database matching
José Portêlo and Bhiksha Raj and Alberto Abad and Isabel Trancoso 2011 19th European Signal Processing Conference 2011

Privacy Preserving Speaker Verification Using Adapted GMMs.
Manas A Pathak and Bhiksha Raj Interspeech 2011

Reconstructing noise-corrupted spectrographic components for robust speech recognition
Bhiksha Raj and Rita Singh Interspeech 2011

A comparison of latent variable models for conversation analysis
Sourish Chaudhuri and Bhiksha Raj Proceedings of the SIGDIAL 2011 Conference 2011

Channel selection based on multichannel cross-correlation coefficients for distant speech recognition
Kenichi Kumatani and John McDonough and Jill Fain Lehman and Bhiksha Raj 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays 2011

A paired test for recognizer selection with untranscribed data
Bhiksha Raj and Rita Singh and James Baker 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011

Privacy preserving probabilistic inference with hidden Markov models
Manas Pathak and Shantanu Rane and Wei Sun and Bhiksha Raj 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011

Speech Communication: Preface
Martin Heckmann and Bhiksha Raj and Paris Smaragdis Speech Communication 2011

Perceptual and Statistical Audition Preface
Martin Heckmann and Bhiksha Raj and Paris Smaragdis Speech Communication 2011

Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and Classification.
Sourish Chaudhuri and Mark Harvilla and Bhiksha Raj Interspeech 2011

Privacy preserving spam filtering
Manas A Pathak and Mehrbod Sharifi and Bhiksha Raj arXiv preprint arXiv:1102.4021 2011

Design and implementation of speech recognition systems
Bhiksha Raj and Rita Singh Carnegie Mellon School of Computer Science 2011

The automatic assessment of knowledge integration processes in project teams
Pulkit Agrawal and Mikesh Udani Long papers 2011

System And Method For Acquiring Acoustic Signals Using Doppler Techniques
Bhiksha Ramakrishnan Acoustical Society of America Journal 2011

Nonlinear Dimensionality Reduction of Spectrograms
Bhiksha R. Ramakrishnan and Kevin W. Wilson Acoustical Society of America Journal 2011

Method for expanding audio signal bandwidth
Bhiksha R. Ramakrishnan and Paris Smaragdis Acoustical Society of America Journal 2011

Method for interacting with users of speech recognition systems
Bret A. Harsham and Garrett Weinberg and Bhiksha Ramakrishnan and Bent Schmidt-Nielsen Acoustical Society of America Journal 2011

Denoising acoustic signals using constrained non-negative matrix factorization
Paris Smaragdis and Kevin W. Wilson and Ajay Divakaran and Bhiksha Ramakrishnan Acoustical Society of America Journal 2011

Topic Models for Signal Processing
P Smaragdis and B Raj IEEE International Conference on Acoustics, Speech and Signal Processing 2011

A mutual information criterion for voice activity detection
John McDonough and Kenichi Kumatani and Bhiksha Raj and Jill Fain Lehman Proc. Interspeech, submitted for publication 2011

The Automatic Assessment of Knowledge Integration Processes in Project Teams
Carolyn Rose and Gahgene Gweon and Pulkit Agarwal and Mikesh Udani and Bhiksha Raj Proceedings of the 9th International Conference on Computer-Supported Collaborative Learning CSCL 2011 2011

Perceptual and Statistical Audition
Bhiksha Raj and Paris Smaragdis Proceedings of the 9th International Conference on Computer-Supported Collaborative Learning CSCL 2011 2011

A comparison of prosody modification using instants of significant excitation and mel-cepstral vocoder
B Bajibabu and Ronanki Srikanth and Sathya Adithya Thati and Bhiksha Raj and B Yegnanarayana and Kishore Prahallad Proceedings of the Centenary Conference on Electrical Engineering (CEE’11), Indian Institute of Science, Bangalore 2011

A Paradigm for Limited Vocabulary Speech Recognition Based on Redundant Spectro-Temporal Feature Sets.
Sourish Chaudhuri and Bhiksha Raj and Tony Ezzat INTERSPEECH 2011

Phoneme-dependent NMF for speech enhancement in monaural mixtures
Bhiksha Raj and Rita Singh and Tuomas Virtanen Twelfth Annual Conference of the International Speech Communication Association 2011

2010

The impact of the distribution of internal delays in binaural models on predictions for psychoacoustical data.
Richard M Stern The Journal of the Acoustical Society of America 2010

Automatic selection of thresholds for signal separation algorithms based on interaural delay.
Chanwoo Kim and Richard M Stern and Kiwan Eom and Jaewon Lee INTERSPEECH 2010

Nonlinear enhancement of onset for robust speech recognition.
Chanwoo Kim and Richard M Stern INTERSPEECH 2010

A hybrid physical and statistical dynamic articulatory framework incorporating analysis-by-synthesis for improved phone classification
Ziad Al Bawab and Bhiksha Raj and Richard M Stern 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Maximum-likelihood-based cepstral inverse filtering for blind speech dereverberation
Kshitiz Kumar and Richard M Stern 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring
Chanwoo Kim and Richard M Stern 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Non-negative hidden Markov modeling of audio with application to source separation
Gautham J Mysore and Paris Smaragdis and Bhiksha Raj Twelfth Annual Conference of the International Speech Communication Association 2010

Privacy preserving protocols for eigenvector computation
Manas Pathak and Bhiksha Raj Twelfth Annual Conference of the International Speech Communication Association 2010

Large margin multiclass Gaussian classification with differential privacy
Manas A Pathak and Bhiksha Raj Twelfth Annual Conference of the International Speech Communication Association 2010

System and method for acquiring acoustic signals using doppler techniques
Bhiksha Ramakrishnan and Paul H Dietz and Bent Schmidt-Nielsen Twelfth Annual Conference of the International Speech Communication Association 2010

Creating a linguistic plausibility dataset with non-expert annotators.
Benjamin Lambert and Rita Singh and Bhiksha Raj INTERSPEECH 2010

Non-negative matrix factorization based compensation of music for automatic speech recognition.
Bhiksha Raj and Tuomas Virtanen and Sourish Chaudhuri and Rita Singh Interspeech 2010

Ultrasonic Doppler System and Method for Gesture Recognition
Bhiksha Raj Ramakrishnan and Kaustubh Kalgaonkar Interspeech 2010

Spectrogram dimensionality reduction with independence constraints
Kevin W Wilson and Bhiksha Raj 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Synthesizing speech from Doppler signals
Arthur R Toth and Kaustubh Kalgaonkar and Bhiksha Raj and Tony Ezzat 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Ultrasonic sensing for robust speech recognition
Sundararajan Srinivasan and Bhiksha Raj and Tony Ezzat 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Latent-variable decomposition based dereverberation of monaural and multi-channel signals
Rita Singh and Bhiksha Raj and Paris Smaragdis 2010 IEEE International Conference on Acoustics, Speech and Signal Processing 2010

Synthesizing speech from surface electromyography and acoustic Doppler sonar.
Arthur R Toth and Michael Wand and Szu‐Chen Stan Jou and Tanja Schultz and Bhiksha Raj and Kaustubh Kalgaonkar and Tony Ezzat The Journal of the Acoustical Society of America 2010

Ultrasonic Sensing for Robust Speech Recognition
Bhiksha Raj and Sundararajan Srinivasan and Tony Ezzat The Journal of the Acoustical Society of America 2010

Single-channel speech separation based on instantaneous frequency
Lingyun Gu The Journal of the Acoustical Society of America 2010

Method and system for FFT-based companding for automatic speech recognition
Rahul Sarpeshkar and Bhiksha Ramakrishnan and Bent Schmidt-Nielsen and Lorenzo Turicchia The Journal of the Acoustical Society of America 2010

Ultrasonic Doppler System and Method for Gesture Recognition
Kaustubh Kalgaonkar and Bhiksha Raj Ramakrishnan The Journal of the Acoustical Society of America 2010

Constructing broad-band acoustic signals from lower-band acoustic signals
Paris Smaragdis and Bhiksha Ramakrishnan The Journal of the Acoustical Society of America 2010

Method and system for identifying moving objects using acoustic signals
Bhiksha Ramakrishnan The Journal of the Acoustical Society of America 2010

The use of sense in unsupervised training of acoustic models for ASR systems
Rita Singh and Benjamin Lambert and Bhiksha Raj Eleventh Annual Conference of the International Speech Communication Association 2010

Ungrounded independent non-negative factor analysis.
Bhiksha Raj and Kevin W Wilson and Alexander Krueger and Reinhold Haeb-Umbach Interspeech 2010

Subword unit approaches for retrieval by voice
Evandro Gouvea and Tony Ezzat and Bhiksha Raj SpokenQuery Workshop on Voice Search 2010

Multiparty differential privacy via aggregation of locally trained classifiers
Manas Pathak and Shantanu Rane and Bhiksha Raj Advances in neural information processing systems 2010

2009

Robust speech recognition using a small power boosting algorithm
Chanwoo Kim and Kshitiz Kumar and Richard M Stern 2009 IEEE Workshop on Automatic Speech Recognition & Understanding 2009

Power function-based power distribution normalization algorithm for robust speech recognition
Chanwoo Kim and Richard M Stern 2009 IEEE Workshop on Automatic Speech Recognition & Understanding 2009

Deriving vocal tract shapes from electromagnetic articulograph data via geometric adaptation and matching.
Ziad Al Bawab and Lorenzo Turicchia and Richard M Stern and Bhiksha Raj Interspeech 2009

Minimum variance modulation filter for robust speech recognition
Yu-Hsiang Bosco Chiu and Richard M Stern 2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009

Speaker Segmentation and Clustering for Simultaneously Presented Speech
Lingyun Gu and Richard M Stern Tenth Annual Conference of the International Speech Communication Association 2009

Unsupervised Training Scheme with Non-Stereo Data for Empirical Feature Vector Compensation
Luis Buera and Antonio Miguel and Alfonso Ortega and Eduardo Lleida and Richard M Stern Tenth Annual Conference of the International Speech Communication Association 2009

Towards fusion of feature extraction and acoustic model training: A top down process for robust speech recognition
Yu-Hsiang Bosco Chiu and Bhiksha Raj and Richard M Stern Tenth Annual Conference of the International Speech Communication Association 2009

Spatial separation of speech signals using amplitude estimation based on interaural comparisons of zero-crossings
Hyung-Min Park and Richard M Stern Speech Communication 2009

Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain
Chanwoo Kim and Kshitiz Kumar and Bhiksha Raj and Richard M Stern Tenth Annual Conference of the International Speech Communication Association 2009

Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction
Chanwoo Kim and Richard M Stern Tenth Annual Conference of the International Speech Communication Association 2009

Towards fusion of feature extraction and acoustic model training: a top down process for robust speech recognition.
Yu-Hsiang Bosco Chiu and Bhiksha Raj and Richard M Stern INTERSPEECH 2009

Missing data imputation for spectral audio signals
Paris Smaragdis and Bhiksha Raj and Madhusudana Shashanka 2009 IEEE International Workshop on Machine Learning for Signal Processing 2009

Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain.
Chanwoo Kim and Kshitiz Kumar and Bhiksha Raj and Richard M Stern INTERSPEECH 2009

Single-channel Speech Separation Based on Instantaneous Frequency
Bhiksha Raj INTERSPEECH 2009

One-handed gesture recognition using ultrasonic Doppler sonar
Kaustubh Kalgaonkar and Bhiksha Raj 2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009

A joint decoding algorithm for multiple-example-based addition of words to a pronunciation lexicon
Dhananjay Bansal and Nishanth Nair and Rita Singh and Bhiksha Raj 2009 IEEE International Conference on Acoustics, Speech and Signal Processing 2009

Feature Computation: Representing the Speech Signal
Bhiksha Raj and Rita Singh Signal 2009

Probabilistic factorization of non-negative data with entropic co-occurrence constraints
Paris Smaragdis and Madhusudana Shashanka and Bhiksha Raj and Gautham J Mysore Signal 2009

One-handed gesture recognition using ultrasonic sonar
Kaustubh Kalgaonkar and Bhiksha Raj Signal 2009

Method and system for retrieving documents with spoken queries
Peter P. Wolf and Joseph K. Woelfel and Bhiksha Ramakrishnan Signal 2009

Topic models for audio mixture analysis
Paris Smaragdis and Madhusudana Shashanka and Bhiksha Raj Proc. of the NIPS workshop on applications for topic models: text and beyond 2009

A Knowledge-Based Architecture for using Semantics in Automatic Speech Recognition
Benjamin E Lambert and Scott E Fahlman and Bhiksha Raj and Roni Rosenfeld and Candy Sidner Ph. D. thesis 2009

Word particles applied to information retrieval
Evandro B Gouvêa and Bhiksha Raj Advances in Information Retrieval: 31st European Conference on IR Research, ECIR 2009, Toulouse, France, April 6-9, 2009. Proceedings 31 2009

A sparse non-parametric approach for single channel separation of known sounds
Paris Smaragdis and Madhusudana Shashanka and Bhiksha Raj Advances in neural information processing systems 2009

2008

Underwater
David L Bradley and Richard Stern tenth annual conference of the international speech communication association 2008

Binaural and multiple-microphone signal processing motivated by auditory perception
Richard M Stern and Evandro Gouvêa and Chanwoo Kim and Kshitiz Kumar and Hyung-Min Park 2008 Hands-Free Speech Communication and Microphone Arrays 2008

“Polyaural” array processing for robust automatic speech recognition in noisy and reverberant environments
Richard M Stern and Evandro B Gouvea and Kshitiz Kumar The Journal of the Acoustical Society of America 2008

Single-channel speech separation based on modulation frequency
Lingyun Gu and Richard M Stern 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Analysis-by-synthesis features for speech recognition
Ziad Al Bawab and Bhiksha Raj and Richard M Stern 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Environment-invariant compensation for reverberation using linear post-filtering for minimum distortion
Kshitiz Kumar and Richard M Stern 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Flammendes Inferno: The Towering Inferno
Richard Martin Stern and Thomas N Scortia 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

The Towering Inferno
Frank M Robinson and Stirling Silliphant and Richard Martin Stern 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Compensation approaches for far-field speaker identification
Qin Jin and Kshitiz Kumar and Tanja Schultz and Richard Stern NIST SRE Workshop 2008

Analysis of physiologically-motivated signal processing for robust speech recognition
Yu-Hsiang Bosco Chiu and Richard M Stern Ninth Annual Conference of the International Speech Communication Association 2008

Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis
Chanwoo Kim and Richard M Stern Ninth Annual Conference of the International Speech Communication Association 2008

Discovery of temporal patterns in continuous nonrandom sound sequences.
Rita Singh and Bhiksha Raj The Journal of the Acoustical Society of America 2008

Inferring missing spectral data.
Paris Smaragdis and Bhiksha Raj The Journal of the Acoustical Society of America 2008

Regularized non-negative matrix factorization with temporal dependencies for speech denoising.
Kevin W Wilson and Bhiksha Raj and Paris Smaragdis Interspeech 2008

Recognizing talking faces from acoustic doppler reflections
Kaustubh Kalgaonkar and Bhiksha Raj 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition 2008

Ultrasonic Doppler sensor for speech-based user interface
Bhiksha Ramakrishnan and Kaustubh Kalgaonkar 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition 2008

Probabilistic latent variable models as nonnegative factorizations
Madhusudana Shashanka and Bhiksha Raj and Paris Smaragdis Computational Intelligence and Neuroscience 2008

Ultrasonic doppler sensor for speaker recognition
Kaustubh Kalgaonkar and Bhiksha Raj 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Speech denoising using nonnegative matrix factorization with priors
Kevin W Wilson and Bhiksha Raj and Paris Smaragdis and Ajay Divakaran 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Sparse and shift-invariant feature extraction from non-negative data
Paris Smaragdis and Bhiksha Raj and Madhusudana Shashanka 2008 IEEE International Conference on Acoustics, Speech and Signal Processing 2008

Probabilistic latent variable models as nonnegative factorizations
Paris Smaragdis and Madhusudana Shashanka and Bhiksha Raj Computational Intelligence and Neuroscience 2008

Ultrasonic Doppler sensor for speech-based user interface
Kaustubh Kalgaonkar and Bhiksha Ramakrishnan Computational Intelligence and Neuroscience 2008

Shift-invariant probabilistic latent component analysis
Paris Smaragdis and Bhiksha Ramakrishnan Computational Intelligence and Neuroscience 2008

Separating multiple audio signals recorded as a single mixed signal
Aarthi M. Reddy and Bhiksha Ramakrishnan Computational Intelligence and Neuroscience 2008

Speech-based UI design for the automobile
Bent Schmidt-Nielsen and Bret Harsham and Bhiksha Raj and Clifton Forlines Computational Intelligence and Neuroscience 2008

2007

Continuous feature adaptation for non-native speech recognition
Yunbin Deng and Xiaokun Li and Chiman Kwan and B Raj and R Stern International Journal of Computer and Information Engineering 2007

Missing feature speech recognition using dereverberation and echo suppression in reverberant environments
Hyung-Min Park and Richard M Stern 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Profile view lip reading
Kshitiz Kumar and Tsuhan Chen and Richard M Stern 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Automatic Threshold Selection Algorithm for Sound Source Separation Based on Inter-microphone Time Difference
Chanwoo Kim and R Stern Journal of Latex Class Files 2007

“Polyaural” Array Processing for Automatic Speech Recognition in Degraded Environments
Richard M Stern and Evandro B Gouvêa and Govindarajan Thattai Eighth Annual Conference of the International Speech Communication Association 2007

An FFT-based companding front end for noise-robust automatic speech recognition
Bhiksha Raj and Lorenzo Turicchia and Bent Schmidt-Nielsen and Rahul Sarpeshkar EURASIP Journal on Audio, Speech, and Music Processing 2007

Example-driven bandwidth expansion
Paris Smaragdis and Bhiksha Raj 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2007

Ultrasonic doppler sensor for voice activity detection
Kaustubh Kalgaonkar and Rongquiang Hu and Bhiksha Raj IEEE Signal Processing Letters 2007

Supervised and semi-supervised separation of sounds from single-channel mixtures
Paris Smaragdis and Bhiksha Raj and Madhusudana Shashanka IEEE Signal Processing Letters 2007

Acoustic doppler sonar for gait recognition
Kaustubh Kalgaonkar and Bhiksha Raj 2007 IEEE Conference on Advanced Video and Signal Based Surveillance 2007

Soft mask methods for single-channel speaker separation
Aarthi M Reddy and Bhiksha Raj IEEE Transactions on Audio, Speech, and Language Processing 2007

Sensor and data systems, audio-assisted cameras and acoustic Doppler sensors
Kaustubh Kalgaonkar and Paris Smaragdis and Bhiksha Raj 2007 IEEE Conference on Computer Vision and Pattern Recognition 2007

Bandwidth expansion with a Pólya urn model
Bhiksha Raj and Rita Singh and Madhusudana Shashanka and Paris Smaragdis 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Sparse overcomplete decomposition for single channel speaker separation
Madhusudana VS Shashanka and Bhiksha Raj and Paris Smaragdis 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Compression of language model structures and word identifiers for automated speech recognition systems
Bhiksha Ramakrishnan and Edward W. D. Whittaker 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Classifier-based non-linear projection for continuous speech segmentation
Bhiksha Ramakrishnan and Rita Singh 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Classification in likelihood spaces
Bhiksha Ramakrishnan and Rita Singh 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07 2007

Probabilistic deduction of symbol mappings for extension of lexicons.
Rita Singh and Evandro B Gouvêa and Bhiksha Raj INTERSPEECH 2007

Separating a foreground singer from background music
Bhiksha Raj and Paris Smaragdis and Madhusudhana Shashanka and Rita Singh Proc. Int. Symp. Frontiers Res. Speech Music 2007

Probabilistic latent variable model for sparse decompositions of non-negative data
M Shashanka and B Raj and P Smaragdis IEEE Transactions on Pattern Analysis and Machine Intelligence 2007

Sparse overcomplete latent variable decomposition of counts data
Madhusudana Shashanka and Bhiksha Raj and Paris Smaragdis Advances in neural information processing systems 2007

2006

Subband likelihood-maximizing beamforming for speech recognition in reverberant environments
Michael L Seltzer and Richard M Stern IEEE Transactions on Audio, Speech, and Language Processing 2006

Fluctuations in amplitude and frequency enable interaural delays to foster the identification of speech-like stimuli
Richard M Stern and Constantine Trahiotis and Angelo M Ripepi and P Divenyi and S Greenberg and G Meyer Dynamics of speech production and perception 2006

Spatial separation of speech signals using continuously-variable masks estimated from comparisons of zero crossings
Hyung-Min Park and Richard M Stern 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings 2006

Band-independent mask estimation for missing-feature reconstruction in the presence of unknown background noise
Wooil Kim and Richard M Stern 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings 2006

Improved breast and axillary lesion detection on PET imaging through the novel use of a breast holding device
Steven Parmett and Dana Rausch and Ping Lu and Ash Rafique and Nazar Golewale and Melissa Quispe and Chun Kim and Josef Machac and Richard Stern and Borys Krynyckyi 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings 2006

Mask Estimation Based on Band-Independent Bayesian Classifier for Missing-Feature Reconstruction
Richard M Stern and Hanseok Ko The Journal of the Acoustical Society of Korea 2006

An integrated approach to improve speech recognition rate for non-native speakers
Yunbin Deng and Xiaokun Li and Chiman Kwan and Roger Xu and Bhiksha Raj and Richard M Stern and David Williamson Ninth International Conference on Spoken Language Processing 2006

Voting for two speaker segmentation
Balakrishnan Narayanaswamy and Rashmi Gangadharaiah and Richard M Stern Ninth International Conference on Spoken Language Processing 2006

Physiologically-motivated synchrony-based processing for robust automatic speech recognition
Chanwoo Kim and Yu-Hsiang Chiu and Richard M Stern Ninth International Conference on Spoken Language Processing 2006

Binaural sound localization
DeLiang Wang and Guy J Brown Ninth International Conference on Spoken Language Processing 2006

An acoustic Doppler-based front end for hands free spoken user interfaces
Kaustubh Kalgaonkar and Bhiksha Raj 2006 IEEE Spoken Language Technology Workshop 2006

A probabilistic latent variable model for acoustic modeling
Paris Smaragdis and Bhiksha Raj and Madhusudana Shashanka Advances in models for acoustic processing, NIPS 2006

Latent dirichlet decomposition for single channel speaker separation
Bhiksha Raj and Madhusudana VS Shashanka and Paris Smaragdis 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings 2006

Latent Dirichlet decomposition for single channel speaker separation
Paris Smaragdis and Madhusudana V Shashanka and Bhiksha Raj IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France 2006

Multi-sensor noise suppression and bandwidth extension for enhancement of speech
Rongqiang Hu IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France 2006

Two New Techniques for Natural Spoken User Interfaces
Garrett Weinberg and Bhiksha Raj and Kaustubh Kalgaonkar IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France 2006

Tracking noise via dynamic systems with a continuum of states
Bhiksha Ramakrishnan and Rita Singh Powder Technology 2006

An integrated approach to improve speech recognition rate for non-native speakers.
Yunbin Deng and Xiaokun Li and Chiman Kwan and Roger Xu and Bhiksha Raj and Richard M Stern and David Williamson INTERSPEECH 2006

Distributed speech recognition with codec parameters
Bhiksha Raj Acoustical Society of America Journal 2006

Special section on statistical and perceptual audio processing
Daniel PW Ellis and Bhiksha Raj and Judith C Brown and Malcolm Slaney and Paris Smaragdis Acoustical Society of America Journal 2006

2005

Missing-feature approaches in speech recognition
Bhiksha Raj and Richard M Stern IEEE Signal Processing Magazine 2005

Voice driven applications in non-stationary and chaotic environment
Chiman Kwan and Xiaokun Li and Debang Lao and Yunbin Deng and Zhubing Ren and Bhiksha Raj and Rita Singh and R Stern 2005 IEEE International Conference on Robotics and Biomimetics-ROBIO 2005

Feature compensation based on switching linear dynamic model
Nam Soo Kim and Woohyung Lim and Richard M Stern IEEE Signal Processing Letters 2005

Missing-feature methods for robust automatic speech recognition
B Raj and RM Stern IEEE Signal Processing Letters 2005

Signal separation motivated by human auditory perception: Applications to automatic speech recognition
Richard M Stern IEEE Signal Processing Letters 2005

Environment-independent mask estimation for missing-feature reconstruction
Wooil Kim and Richard M Stern and Hanseok Ko Ninth European Conference on Speech Communication and Technology 2005

Interaural correlation as the basis of a working model of binaural processing: an introduction
Constantine Trahiotis and Leslie R Bernstein and Richard M Stern and Thomas N Buell Sound source localization 2005

Recognizing speech from simultaneous speakers.
Bhiksha Raj and Rita Singh and Paris Smaragdis and BRR Singh INTERSPEECH 2005

Reconstructing spectral vectors with uncertain spectrographic masks for robust speech recognition
Bhiksha Raj and Rita Singh IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. 2005

A robust voice activity detector using an acoustic Doppler radar
Rongqiang Hu and Bhiksha Raj IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. 2005

Surveillance system with acoustically augmented video monitoring
Paris Smaragdis and Bhiksha Ramakrishnan IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. 2005

Latent variable decomposition of spectrograms for single channel speaker separation
Bhiksha Raj and Paris Smaragdis IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005. 2005

A comparison between spoken queries and menu-based interfaces for in-car digital music selection
Clifton Forlines and Bent Schmidt-Nielsen and Bhiksha Raj and Kent Wittenburg and Peter Wolf IEEE Signal Processing Magazine 2005

Feature compensation with secondary sensor measurements for robust speech recognition
Bhiksha Raj and Rita Singh 2005 13th European Signal Processing Conference 2005

Bandwidth expansion of narrowband speech using non-negative matrix factorization.
Dhananjay Bansal and Bhiksha Raj and Paris Smaragdis INTERSPEECH 2005

Improving recognition accuracy in noise by using partial spectrographic information
Bhiksha Raj and Richard M Stern IEEE Signal Processing Magazine 2005

A companding front end for noise-robust automatic speech recognition
Jethran Guinness and Bhiksha Raj and Bent Schmidt-Nielsen and Lorenzo Turicchia and R Sarpeshkar Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. 2005

Missing-feature methods for robust automatic speech recognition
B Raj and RM Stern Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. 2005

Method and system for retrieving documents with spoken queries
Peter P. Wolf and Bhiksha Ramakrishnan Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. 2005

Speech Recognizer Based Maximum Likelihood Beamforming
Bhiksha Raj and Michael Seltzer and Manuel Jesus Reyes-Gomez Proceedings.(ICASSP’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005. 2005

2004

A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
Michael L Seltzer and Bhiksha Raj and Richard M Stern Speech Communication 2004

Reconstruction of missing features for robust speech recognition
Bhiksha Raj and Michael L Seltzer and Richard M Stern Speech communication 2004

Likelihood-maximizing beamforming for robust hands-free speech recognition
Michael L Seltzer and Bhiksha Raj and Richard M Stern IEEE Transactions on speech and audio processing 2004

Parameter sharing in subband likelihood-maximizing beamforming for speech recognition using microphone arrays
Michael L Seltzer and Richard M Stern 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004

Feature generation based on maximum normalized acoustic likelihood for improved speech recognition
Xiang Li and Richard M Stern 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004

Creating multi-modal, user-centric records of meetings with the Carnegie Mellon meeting recorder architecture
Satanjeev Banerjee and Jason Cohen and Thomas Quisel and Arthur Chan and Yash Patodia and Ziad Al Bawab and Rong Zhang and Alan Black and Roxana Sarbu and Alexander Rudnicky and Paul E Rybski and Manuela Veloso 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004

On tracking noise with linear dynamical system models
Bhiksha Raj and Rita Singh and Richard Stern 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004

Normalization of time-derivative parameters for robust speech recognition in small devices
Yasunari Obuchi and Nobuo Hataoka and Richard M Stern IEICE Transactions on Information and Systems 2004

Parallel feature generation based on maximizing normalized acoustic likelihood
Xiang Li and Richard Stern Eighth International Conference on Spoken Language Processing 2004

N-Best list rescoring using syntactic trigrams
Luis R Salgado-Garza and Richard M Stern and Juan A Nolazco F MICAI 2004: Advances in Artificial Intelligence: Third Mexican International Conference on Artificial Intelligence, Mexico City, Mexico, April 26-30, 2004. Proceedings 3 2004

Spokenquery: an alternate approach to chosing items with speech.
Peter Wolf and Joseph Woelfel and Jan C van Gemert and Bhiksha Raj and David Wong INTERSPEECH 2004

A minimum mean squared error estimator for single channel speaker separation.
Aarthi M Reddy and Bhiksha Raj Interspeech 2004

Classification in likelihood spaces
Rita Singh and Bhiksha Raj Technometrics 2004

Sphinx-4: A flexible open source framework for speech recognition
Willie Walker and Paul Lamere and Philip Kwok and Bhiksha Raj and Rita Singh and Evandro Gouvea and Peter Wolf and Joe Woelfel 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004

Sphinx-4: A Flexible Open Source Framework for Speech Recognition; SMLI TR2004-0811; SUN Microsystems Inc
Willie Walker and Paul Lamere and Philip Kwok and Bhiksha Raj and Rita Singh and Evandro Gouvea and Peter Wolf and Joe Woelfel Santa Clara, CA, USA 2004

Sphinx-4: A flexible open source framework for voice recognition
W Walker and P Lamere and P Kwok and B Raj and R Singh and E Gouvea and P Wolf and J Woelfel Sun Microsystems, Inc. Mountain View, CA, USA 2004

Sphinx-4: A flexible open source framework for speech recognition
Willie Walker and Paul Lamere and Philip Kwok and Bhiksha Raj and R Singh and E Gouvea and P Wolf and J Woelfel Sun Microsystems, Inc, Tech. Rep. SMLI TR-2004-139 2004

Soft mask estimation for single channel speaker separation
Aarthi M Reddy and Bhiksha Raj ISCA Tutorial and Research Workshop (ITRW) on Statistical and Perceptual Audio Processing 2004

A speech-in list-out approach to spoken user interfaces
Vijay Divi and Clifton Forlines and Jan Van Gemert and Bhiksha Raj and Bent Schmidt-Nielsen and Kent Wittenburg and Joseph Woelfel and Fang-Fang Zhang Proceedings of HLT-NAACL 2004: Short Papers 2004

2003

Feature generation based on maximum classification probability for improved speech recognition.
Xiang Li and Richard M Stern INTERSPEECH 2003

Normalization of time-derivative parameters using histogram equalization.
Yasunari Obuchi and Richard M Stern INTERSPEECH 2003

Training of stream weights for the decoding of speech using parallel feature streams
Xiang Li and Richard M Stern 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003

Subband parameter optimization of microphone arrays for speech recognition in reverberant environments
Michael L Seltzer and Richard M Stern 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003

Duration Normalization and Hypothesis Combination for Improved Spontaneous Speech Recognition
Jon P Nedel and Richard M Stern Eighth European Conference on Speech Communication and Technology 2003

Investigation on effectiveness of mid-level feature representation for semantic boundary detection in news video
Regunathan Radhakrishnan and Ziyou Xiong and Ajay Divakaran and Bhiksha Raj Internet Multimedia Management Systems IV 2003

Multi-channel source separation by beamforming trained with factorial HMMs
Manuel J Reyes-Gomez and Bhiksha Raj and Daniel PW Ellis 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No. 03TH8684) 2003

Classification with free energy at raised temperatures.
Rita Singh and Manfred K Warmuth and Bhiksha Raj and Paul Lamere INTERSPEECH 2003

Design of the CMU sphinx-4 decoder.
Paul Lamere and Philip Kwok and William Walker and Evandro B Gouvêa and Rita Singh and Bhiksha Raj and Peter Wolf Interspeech 2003

Multi-channel source separation by factorial HMMs
Manuel J Reyes-Gomez and Bhiksha Raj and Daniel P. W. Ellis Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003 IEEE International Conference on 2003

The CMU SPHINX-4 speech recognition system
Paul Lamere and Philip Kwok and Evandro Gouvea and Bhiksha Raj and Rita Singh and William Walker and Manfred Warmuth and Peter Wolf IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2003), Hong Kong 2003

Lossless compression of language model structure and word identifiers
Bhiksha Raj and Edward WD Whittaker 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003

Tracking noise via dynamical systems with a continuum of states
Rita Singh and Bhiksha Raj 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). 2003

Speech-recognizer-based filter optimization for microphone array processing
Michael L Seltzer and Bhiksha Raj IEEE Signal Processing Letters 2003

Design of the CMU sphinx-4 decoder
Paul Lamere Proc. of Eurospeech 2003 2003

Lossless compression of ordered integer lists
Edward Whittaker and Bhiksha Ramakrishnan Proc. of Eurospeech 2003 2003

Classifier-based non-linear projection for adaptive endpointing of continuous speech
Bhiksha Raj and Rita Singh Computer Speech & Language 2003

2002

Mitsubishi Electric Research Laboratories
荒井兼秀 システム/制御/情報 2002

Duration normalization for improved automatic speech recognition
Jon P Nedel and Richard M Stern The Journal of the Acoustical Society of America 2002

REFERENCES TO CONTEMPORARY PAPERS ON ACOUSTICS 2002 Edition
Richard Stern J. Acoust. Soc. Am 2002

Combining search spaces of heterogeneous recognizers for improved speech recognition.
Xiang Li and Rita Singh and Richard M Stern INTERSPEECH 2002

Speech recognizer-based microphone array processing for robust hands-free speech recognition
Michael L Seltzer and Bhiksha Raj and Richard M Stern 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing 2002

Automatic generation of subword units for speech recognition systems
Rita Singh and Bhiksha Raj and Richard M Stern IEEE Transactions on Speech and Audio Processing 2002

Lattice combination for improved speech recognition
Xiang Li and Rita Singh and Richard M Stern ICSLP’02 2002

The MERL SpokenQuery information retrieval system: a system for retrieving pertinent documents from a spoken query
Peter Wolf and Bhiksha Raj Proceedings. IEEE International Conference on Multimedia and Expo 2002

Key word and key phrase based speech recognizer for information retrieval systems
D McDonald and P Wolf and B Ramakrishnan IEEE Transactions on Speech and Audio Processing 2002

Key word and key phrase based speech recognizer for information retrieval systems
Peter Wolf and Bhiksha Ramakrishnan and David McDonald IEEE Transactions on Speech and Audio Processing 2002

2001

Robust speech recognition: the case for restoring missing features
Bhiksha Raj and Michael L Seltzer and Richard M Stern Proc. of Eurospeech, The Workshop on Consistent and Reliable Acoustic Cues, Aalborg, Denmark 2001

Duration normalization for improved recognition of spontaneous and read speech via missing feature methods
Jon P Nedel and Richard M Stern 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221) 2001

Speech in noisy environments: robust automatic segmentation, feature extraction, and hypothesis combination
Rita Singh and Michael L Seltzer and Bhiksha Raj and Richard M Stern 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221) 2001

Distortion-class modeling for robust speech recognition under GSM RPE-LTP coding
Juan M Huerta and Richard M Stern Speech Communication 2001

Dynamic methods for measuring the elastic properties of solids
Arthur G Every and Wolfgang Sachse and Veerle Keppens 2001

Distributed speech recognition with codec parameters
Bhiksha Raj and Joshua Migdal and Rita Singh IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU’01. 2001

Quantization-based language model compression.
Edward WD Whittaker and Bhiksha Raj INTERSPEECH 2001

Calibration of microphone arrays for improved speech recognition.
Michael L Seltzer and Bhiksha Raj INTERSPEECH 2001

A boosting approach for confidence scoring.
Pedro J Moreno and Beth Logan and Bhiksha Raj Interspeech 2001

Comparison of width-wise and length-wise language model compression.
Edward WD Whittaker and Bhiksha Raj INTERSPEECH 2001

2000

Automatic generation of phone sets and lexical transcriptions
Rita Singh and Bhiksha Raj and Richard M Stern 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 00CH37100) 2000

Inter-class MLLR for speaker adaptation
S-J Doh and R Stern 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 00CH37100) 2000

The 1999 CMU 10x real time broadcast news transcription system
Mosur Ravishankar and Rita Singh and Bhiksha Raj and Richard M Stern Proc. DARPA workshop on automatic transcription of broadcast news 2000

A monitoring tool for self-organizing overlay network
TungFai Chan and Annie Cheng and Hui Zhang and Dave Nagle and Richard Stern Proc. darpa workshop on automatic transcription of broadcast news 2000

Instantaneous-Distortion Based Weighted Acoustic Modeling for Robust Recognition of Coded Speech
Juan M Huerta and Richard M Stern Sixth International Conference on Spoken Language Processing 2000

Using class weighting in inter-class MLLR
Sam-Joo Doh and Richard M Stern Sixth International Conference on Spoken Language Processing 2000

Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems.
Jon P Nedel and Rita Singh and Richard M Stern INTERSPEECH 2000

Automatic subword unit refinement for spontaneous speech recognition via phone splitting.
Jon P Nedel and Rita Singh and Richard M Stern INTERSPEECH 2000

Structured redefinition of sound units by merging and splitting for improved speech recognition
Rita Singh and Bhiksha Raj and Richard M Stern Sixth International Conference on Spoken Language Processing 2000

Classifier-based mask estimation for missing feature methods of robust speech recognition
Michael L Seltzer and Bhiksha Raj and Richard M Stern Sixth International Conference on Spoken Language Processing 2000

Reconstruction of damaged spectrographic features for robust speech recognition
Bhiksha Raj and Michael L Seltzer and Richard M Stern Sixth International Conference on Spoken Language Processing 2000

Structured redefinition of sound units by merging and splitting for improved speech recognition.
Rita Singh and Bhiksha Raj and Richard M Stern INTERSPEECH 2000

Structured redefinition of sound units for improved speech recognition
Rita Singh and Bhiksha Raj and Richard M Stern Proceedings of the 6th International Conference on Speech and Language Processing 2000

Structured Redefinition of Sound Units by Merging and Splitting
Rita Singh and Bhiksha Raj and Richard M Stern Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2000

Reconstruction of incomplete spectrograms for robust speech recognition
Bhiksha Raj Ramakrishnan Proceedings of the International Conference on Acoustics, Speech and Signal Processing 2000

1999

Weighted principal component MLLR for speaker adaptation
Sam-Joo Doh and Richard M Stern Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 1999

Selected aspects of across‐frequency processing in binaural hearing
Constantine Trahiotis and Leslie R Bernstein and Richard M Stern The Journal of the Acoustical Society of America 1999

Automatic clustering and generation of contextual questions for tied states in hidden Markov models
Rita Singh and Bhiksha Raj and Richard M Stern 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No. 99CH36258) 1999

The 1998 Carnegie Mellon University Sphinx-3 Spanish broadcast news transcription system
Juan M Huerta and SJ Chen and Richard M Stern Proc. of the DARPA Broadcast News Transcription and Understanding Workshop 1999

Sturm ist ihre Ernte: Roman
Richard Martin Stern Proc. of the DARPA Broadcast News Transcription and Understanding Workshop 1999

References to Contemporary Papers on Acoustics
Richard Stern Journal of the Acoustical Society of America 1999

Distortion-class weighted acoustic modeling for robust speech recognition under GSM RPE-LTP coding
Juan M Huerta and Richard M Stern Proceedings of the Robust Methods for Speech Recognition in Adverse Conditions, Tampere Finland 1999

Domain Adduced State Tying for Cross-Domain Acoustic Modelling
Rita Singh and Bhiksha Raj and Richard M Stern Sixth European Conference on Speech Communication and Technology 1999

Domain adduced state tying for cross-domain acoustic modelling.
Rita Singh and Bhiksha Raj and Richard M Stern EUROSPEECH 1999

1998

Inference of missing spectrographic features for robust speech recognition.
Bhiksha Raj and Rita Singh and Richard M Stern ICSLP 1998

Speech recognition from GSM codec parameters.
Juan M Huerta and Richard M Stern ICSLP 1998

Data-driven environmental compensation for speech recognition: A unified approach
Pedro J Moreno and Bhiksha Raj and Richard M Stern Speech Communication 1998

The development of the 1997 CMU Spanish broadcast news transcription system
Juan M Huerta and Eric Thayer and Mosur Ravishankar and Richard M Stern Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA 1998

The 1997 CMU Sphinx-3 English broadcast news transcription system
Kristie Seymore and Ronald Rosenfeld and S Chen and Maxine Eskenazi and E Gouvea and Raj Reddy and Mosur Ravishankar and Matthew Siegler and Richard Stern and Eric Thayer Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA 1998

Binaural mechanisms that emphasize consistent interaural timing information over frequency
Richard M Stern and Constantine Trahiotis Proceedings of the XI International Symposium on Hearing, AR Palmer, A. Rees, AQ Summerfield, and R. Meddis, Eds 1998

The 1997 CMU Sphinx-3 English broadcast news transcription system
Kristie Seymore and Ronald Rosenfeld and S Chen and Maxine Eskenazi and E Gouvea and Raj Reddy and Mosur Ravishankar and Matthew Siegler and Richard Stern and Eric Thayer Speech Communication 1998

Data-driven environmental compensation for speech recognition
Pedro J Moreno and Bhiksha Raj and Richard M Stern Speech Communication 1998

1997

Speaker normalization through formant-based warping of the frequency scale.
Evandro B Gouvêa and Richard M Stern Eurospeech 1997

Using the international normalized ratio to standardize prothrombin time
Richard Stern and Vasiliki Karlis and Lisa Kinney and Robert Glickman The Journal of the American Dental Association 1997

The effects of background music on speech recognition accuracy
Bhiksha Raj and Vipul N Parikh and Richard M Stern 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing 1997

Cepstral compensation using statistical linearization
Bhiksha Raj and Evandro Gouvêa and Richard M Stern Proc. of the ESCA Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-à-Mousson, France 1997

Automatic segmentation, classification and clustering of broadcast news audio
Matthew A Siegler and Uday Jain and Bhiksha Raj and Richard M Stern Proc. DARPA speech recognition workshop 1997

Speaker adaptation and environmental compensation for the 1996 broadcast news task
Vipul N Parikh and Bhiksha Raj and Richard M Stern Proc. DARPA Speech Recognition Workshop 1997

The 1996 hub-4 sphinx-3 system
Paul Placeway and Scotte Chen and Maxine Eskenazi and Uday Jain and Vipul Parikh and Bhiksha Raj and Mosur Ravishankar and Roni Rosenfeld and Kristie Seymore and M Siegler and R Stern and Eric Thayer Proc. DARPA Speech recognition workshop 1997

The 1996 Hub-4 Sphinx-3 system
Sumin Chen and M Eskenazi and U Jain and V Parikh and B Raj and M Ravishankar and R Rosenfeld and K Seymore and M Siegler and R Stern and E Thayer Proceedings of the ARPA Speech Recognition Workshop 1997

Compensation for Environmental Degradation in Automatic Speech Recognition
RM Stern ESCA-NATO Workshop on Robust Speech Recognition for Unknown Communication Channels 1997

Vector polynomial approximations for robust speech recognition
Bhiksha Raj and Evandro B Gouvêa and Richard M Stern Proc. of the ESCA Workshop ETRW on Speech Processing in Adverse Conditions 1997

Compensation for environmental and speaker variability by normalization of pole locations.
Juan M Huerta and Richard M Stern EUROSPEECH 1997

Specification of the 1996 Hub4 broadcast news evaluation
R Stern Proc. ARPA Speech Recognition Workshop 1997

Models of binaural perception
Richard M Stern and Constantine Trahiotis Binaural and spatial hearing in real and virtual environments 1997

Compensation for environmental degradation in automatic speech recognition
Richard M Stern and Bhiksha Raj and Pedro J Moreno Binaural and spatial hearing in real and virtual environments 1997

Compensation for environmental degradation in automatic speech recognition
Richard M Stern and Bhiksha Raj and Pedro J Moreno Proc. of the ESCA Tutorial and Research Workshop on Robust Speech Recognition for Unknown Communication Channels, Pont-à-Mousson, France 1997

Speaker adaptation and environmental compensation for the 1996 broadcast news task
R Stern and V Parikh and B Raj Proc. DARPA Speech recognition workshop 1997

1996

Cepstral compensation by polynomial approximation for environment-independent speech recognition
Bhiksha Raj and Evandro Bacci Gouvêa and Pedro J Moreno and Richard M Stern Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP’96 1996

Compensation for speech recognition in degraded acoustical environments
Richard M Stern and Pedro J Moreno and Bhiksha Raj The Journal of the Acoustical Society of America 1996

Lateralization and detection of low‐frequency binaural stimuli: Effects of distribution of internal delay
Richard M Stern and Glenn D Shear The Journal of the Acoustical Society of America 1996

Lateralization and Detection of Low-Frequency Binaural Stimuli: Specification of the Extended Position-Variable Model
RM Stern and GD Shear J. Acoust. Soc. Am. AIP Document No. E–PAPS E–JASMA–100–2278 1996

A vector Taylor series approach for environment-independent speech recognition
Pedro J Moreno and Bhiksha Raj and Richard M Stern 1996 IEEE international conference on acoustics, speech, and signal processing conference proceedings 1996

Signal processing for robust speech recognition
Richard M Stern and Alejandro Acero and Fu-Hua Liu and Yoshiaki Ohshima 1996 IEEE international conference on acoustics, speech, and signal processing conference proceedings 1996

Adaptation and compensation: Approaches to microphone and speaker independence in automatic speech recognition
Evandro B Gouvêa and Pedro J Moreno and Bhiksha Raj and Thomas M Sullivan and Richard M Stern Proc. DARPA Speech Recognition Workshop 1996

Specification of the 1995 ARPA hub 3 evaluation: Unlimited vocabulary NAB news baseline
Richard M Stern Proceedings of the DARPA Speech Recognition Workshop 1996

Recognition of continuous broadcast news with multiple unknown speakers and environments
Uday Jain and Matthew A Siegler and Sam-Joo Doh and Evandro Gouvea and Juan Huerta and Pedro J Moreno and Bhiksha Raj and Richard M Stern Proc. DARPA Speech Recognition Workshop 1996

Robust speech recognition using signal processing based on binaural perception
RM Stern and TM Sullivan ACUSTICA 1996

Compensation for speech recognition in degraded acoustical environments
Richard M Stern and Pedro J Moreno and Bhiksha Raj Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP’96 1996

1995

A unified approach for robust speech recognition.
Pedro J Moreno and Bhiksha Raj and Richard M Stern EUROSPEECH 1995

Multivariate-Gaussian-based cepstral normalization for robust speech recognition
Pedro J Moreno and Bhiksha Raj and Evandro Gouvea and Richard M Stern 1995 International Conference on Acoustics, Speech, and Signal Processing 1995

On the effects of speech rate in large vocabulary speech recognition systems
Matthew A Siegler and Richard M Stern 1995 international conference on acoustics, speech, and signal processing 1995

Automatic speech recognition using signal processing based on auditory physiology and perception
Richard Stern The Journal of the Acoustical Society of America 1995

RABBITPOX PS/HR (VV-WR-B5R ORF) GENE-PRODUCT INHIBITS THE GENERATION OF AN INFLAMMATORY RESPONSE IN-VIVO
GJ PALUMBO and R STERN and WC GLASGOW and L MARTINEZ and RM BULLER and RW MOYER JOURNAL OF CELLULAR BIOCHEMISTRY 1995

Approaches to Environment Compensation in Automatic Speech Recognition
Pedro J Moreno and Bhiksha Raj and Richard M Stern Proceeding of the 1995 International Conference in Acoustics ICA’95 1995

Approaches to microphone independence in Automatic Speech Recognition
Pedro J Moreno and Uday Jain and Bhiksha Raj and Richard M Stern Proceedings of the ARPA Spoken Language Systems Technology Workshop 1995

Robust speech recognition
Richard M Stern Proceedings of the ARPA Spoken Language Systems Technology Workshop 1995

Continuous Recognition of Large-Vocabulary Telephone-Quality Speech
Pedro J Moreno and Matthew A Siegler and Uday Jain and Richard M Stern Proc. ARPA Spoken Language Systems Technology Workshop 1995

Models of binaural interaction
Richard M Stern and Constantine Trahiotis Handbook of perception and cognition 1995

1994

Across‐frequency interaction in lateralization of complex binaural stimuli
Constantine Trahiotis and Richard M Stern The Journal of the Acoustical Society of America 1994

Environmental robustness in automatic speech recognition using physiologically-motivated signal processing.
Yoshiaki Ohshima and Richard M Stern ICSLP 1994

Robust speech recognition in the automobile.
Nobutoshi Hanai and Richard M Stern ICSLP 1994

Consistency over frequency in high‐frequency binaural lateralization
Wonseok Lee and Richard M Stern The Journal of the Acoustical Society of America 1994

Environment normalization for robust speech recognition using direct cepstral comparison
Fu-Hua Liu and Richard M Stern and Alejandro Acero and Pedro J Moreno Proceedings of ICASSP’94. IEEE International Conference on Acoustics, Speech and Signal Processing 1994

Sources of degradation of speech recognition in the telephone network
Pedro J Moreno and Richard M Stern Proceedings of ICASSP’94. IEEE International Conference on Acoustics, Speech and Signal Processing 1994

Session 14: New Directions/Applications
Richard M Stern Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994 1994

Signal processing for robust speech recognition
Fu-Hua Liu and Pedro J Moreno and Richard M Stern and Alejandro Acero Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994 1994

1993

Multi-microphone correlation-based processing for robust speech recognition
Thomas M Sullivan and Richard M Stern 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing 1993

Multi‐microphone cross‐correlation based processing for robust speech recognition
Thomas M Sullivan and Richard M Stern The Journal of the Acoustical Society of America 1993

Efficient cepstral normalization for robust speech recognition
Fu-Hua Liu and Richard M Stern and Xuedong Huang and Alejandro Acero Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993 1993

1992

Additive versus multiplicative combination of differences of interaural time and intensity.
Samuel H Tao and Richard M Stern The Journal of the Acoustical Society of America 1992

Efficient joint compensation of speech for the effects of additive noise and linear filtering
F-H Liu and Alejandro Acero and Richard M Stern [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing 1992

Cepstral normalization for robust speech recognition
Alejandro Acero and Richard M Stern Speech processing in adverse conditions 1992

Speech understanding in open tasks
Wayne Ward and Sunil Issar and Xuedong Huang and Hsiao-Wuen Hon and Mei-Yuh Hwang and Sheryl Young and Mike Matessa and Fu-Hua Liu and Richard M Stern Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992 1992

The role of consistency of interaural timing over frequency in binaural lateralization
Richard M Stern and Constantine Trahiotis Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992 1992

Multiple approaches to robust speech recognition
Richard M Stern and Fu-Hua Liu and Yoshiaki Ohshima and Thomas M Sullivan and Alejandro Acero Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992 1992

1991

Erratum: Lateralization of complex binaural stimuli: A weighted‐image model [J. Acoust. Soc. Am. 84, 156–165 (1988)]
Richard M Stern and Andrew S Zeiberg and Constantine Trahiotis The Journal of the Acoustical Society of America 1991

Lateralization of rectangularly modulated noise: Explanations for counterintuitive reversals
Richard M Stern and Torsten Zeppenfeld and Glenn D Shear The Journal of the Acoustical Society of America 1991

Robust speech recognition by normalization of the acoustic space.
Alejandro Acero and Richard M Stern ICASSP 1991

Speaker adaptation in continuous speech recognition via estimation of correlated mean vectors
William Anthony Michael Rozzi ICASSP 1991

1990

ACOUSTICAL PRE-PROCESSING
Alejandro Acero and Richard M Stern ICASSP 1990

An approach to cardiac arrhythmia analysis using hidden Markov models
Douglas A Coast and Richard M Stern and Gerald G Cano and Stanley A Briller IEEE Transactions on biomedical Engineering 1990

Environmental robustness in automatic speech recognition
Alejandro Acero and Richard M Stern International Conference on Acoustics, Speech, and Signal Processing 1990

Tsunami: jeder Tag zählt; Roman
Richard Martin Stern International Conference on Acoustics, Speech, and Signal Processing 1990

Overview of the Third DARPA Speech and Natural Language Workshop
Richard M Stern Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27, 1990 1990

References to contemporary papers on acoustics.
Richard Stern The Journal of the Acoustical Society of America 1990

Acoustical Pre-Processing for Robust Spoken Language Systems
Alejandro Acero and Richard M Stern First International Conference on Spoken Language Processing 1990

Towards environment-independent spoken language systems
Alejandro Acero and Richard M Stern Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27, 1990 1990

1989

Acoustical preprocessing for automatic speech recognition
Richard Stern and Alex Acero DARPA Speech and Natural Language Workshop 1989

Lateralization of bands of noise: Effects of bandwidth and differences of interaural time and phase
Constantine Trahiotis and Richard M Stern The Journal of the Acoustical Society of America 1989

Sound Localization by Human Observers Symposium Proceedings Held in National Academy of Sciences on 14-16 October 1988
David M Green and George F Kuhn and Robert Butler and HS Colburn and John C Middlebrooks and Richard M Stern and James R Lackner and William A Yost The Journal of the Acoustical Society of America 1989

Tödliche Flut: Roman
Richard Martin Stern The Journal of the Acoustical Society of America 1989

Die Todesbrücke: Roman
Richard Martin Stern 1989

Spoken language systems II
Richard M Stern Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, Massachusetts, October 15-18, 1989 1989

Acoustical pre-processing for robust speech recognition
Richard M Stern and Alejandro Acero Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, Massachusetts, October 15-18, 1989 1989

1988

Lateralization predictions for high‐frequency binaural stimuli
Richard M Stern and Glenn D Shear and Torsten Zeppenfeld The Journal of the Acoustical Society of America 1988

Lateralization of complex binaural stimuli: A weighted‐image model
Richard M Stern and Andrew S Zeiberg and Constantine Trahiotis The Journal of the Acoustical Society of America 1988

Parsing spoken phrases despite missing words.
Wayne H Ward and Alexander G Hauptmann and Richard M Stern and Thomas Chanak ICASSP 1988

Hölle im Schnee: Roman
Richard Martin Stern and Karl-Otto von Czernicki ICASSP 1988

Die Himmelsmaschine: Roman
Richard Martin Stern and Helmug Bergner ICASSP 1988

differences of interaural time and phase
Constantine Trahiotis and Richard M Stern Acoust. Soc. Am 1988

1987

A satellite classroom for advanced acoustic studies
Richard Stern The Journal of the Acoustical Society of America 1987

Dynamic speaker adaptation for feature-based isolated word recognition
RM Stern and M Lasry IEEE transactions on acoustics, speech, and signal processing 1987

Extending the position‐variable model: Dependence of lateralization on frequency and bandwidth
Glenn D Shear and Richard M Stern The Journal of the Acoustical Society of America 1987

Sentence parsing with weak grammatical constraints
R Stern and W Ward and A Hauptmann and Juan Leon ICASSP’87. IEEE International Conference on Acoustics, Speech, and Signal Processing 1987

Waldfeuer: Roman
Richard Martin Stern ICASSP’87. IEEE International Conference on Acoustics, Speech, and Signal Processing 1987

1986

A weighted image model for binaural lateralization
Richard M Stern and Andrew S Zeiberg The Journal of the Acoustical Society of America 1986

Perception of modulations in pitch and lateralization
Laurel Beecher and Richard M Stern The Journal of the Acoustical Society of America 1986

Performing fine phonetic distinctions: templates versus features
Ronald Cole and Richard M Stern and Moshe J Lasry Invariance and Variability of Speech Processes 1986

1985

Perception of modulations in pitch and lateralization
Laurel Beecher and Richard M Stern The Journal of the Acoustical Society of America 1985

Lateral‐position‐based models of interaural discrimination
Richard M Stern Jr and H Steven Colburn The Journal of the Acoustical Society of America 1985

1984

Interaural time discrimination in tonal maskers
Richard M Stern and Ann E Elsner and Jeffrey L Schiano The Journal of the Acoustical Society of America 1984

A posteriori estimation of correlated jointly Gaussian mean vectors
Moshe J Lasry and Richard M Stern IEEE transactions on pattern analysis and machine intelligence 1984

Unsupervised adaptation to new speakers in feature-based letter recognition
M Lasry and R Stern ICASSP’84. IEEE International Conference on Acoustics, Speech, and Signal Processing 1984

Fast computation of the difference of low-pass transform
James L Crowley and Richard M Stern IEEE transactions on pattern analysis and machine intelligence 1984

Geschenk zum Abschied: eine Kurzfassung des Buches
Frances Sharkey and Heiner Simon and Judith Richards and Richard Martin Stern IEEE transactions on pattern analysis and machine intelligence 1984

1983

Unsupervised speaker adaptation in feature‐based isolated letter recognition
Moshé J Lasry and Richard M Stern The Journal of the Acoustical Society of America 1983

Subjective lateral position and interaural discrimination
Richard M Stern Jr The Journal of the Acoustical Society of America 1983

Subjective Lateral Position and Interaural Discrimination
Richard M Stern Jr and H Steven Colburn The Journal of the Acoustical Society of America 1983

Interaural time and amplitude discrimination in noise
Richard M Stern Jr and Janet E Slocum and Michael S Phillips The Journal of the Acoustical Society of America 1983

Dynamic cues in binaural perception
Stephen J Bachorski and Richard M Stern The Journal of the Acoustical Society of America 1983

Dynamic speaker adaptation for isolated letter recognition using MAP estimation
R Stern and M Lasry ICASSP’83. IEEE International Conference on Acoustics, Speech, and Signal Processing 1983

Feature-based speaker-independent recognition of isolated English letters
Ronald Cole and Richard Stern and M Phillips and S Brill and A Pilant and Philippe Specker ICASSP’83. IEEE International Conference on Acoustics, Speech, and Signal Processing 1983

Can Software Be Tied to Hardware? Part II
Richard Stern IEEE Micro 1983

Erratum: “Refractive and other acoustic effects produced by a prism‐shaped network of rigid strips” [J. Acoust. Soc. Am. 70, 1463–1472 (1981)]
Maurice Amram and Richard Stern The Journal of the Acoustical Society of America 1983

Dynamic cues in binaural perception
Richard M Stern and Stephen J Bachorski The Journal of the Acoustical Society of America 1983

1982

Tuning to the speaker: dynamic adaptation of statistical parameters in isolated letter recognition
Richard M Stern and Moshe J Lasry The Journal of the Acoustical Society of America 1982

Decisions about features
Scott M Brill and Michael S Phillips and Moshe J Lasry and Richard M Stern The Journal of the Acoustical Society of America 1982

The phase angle of addition in temporal masking for diotic and dichotic listening conditions
William A Yost and D Wesley Grantham and Robert A Lutfi and Richard M Stern Hearing Research 1982

Software and copyright law: court judgments remain unpredictable
Richard Stern IEEE Micro 1982

Classification of spectral patterns obtained from eustachian tube sonometry
Krishna G Murti and Richard M Stern and Erdem I Cantekin and Charles D Bluestone IEEE Transactions on Biomedical Engineering 1982

1981

Interaural time determination in tonal maskers
Donald L Kaiser and Richard M Stern The Journal of the Acoustical Society of America 1981

Die Söhne: Roman
Richard Martin Stern The Journal of the Acoustical Society of America 1981

Satan ist auf Gottes seite
Hans Herlin and Joan Martin Hundley and Richard Martin Stern and Hugo Hartung The Journal of the Acoustical Society of America 1981

1980

Interaural time and amplitude discrimination in noise
Janet E Slocum and Richard M Stern The Journal of the Acoustical Society of America 1980

Audibility of phase changes in vowel sounds and complex tones
Richard M Stern and Alexander H Waibel The Journal of the Acoustical Society of America 1980

Sonometric evaluation of eustachian tube function using broadband stimuli
Krishna G Murti and Erdem I Cantekin and Richard M Stern and Charles D Bluestone Annals of Otology, Rhinology & Laryngology 1980

Hölle im Schnee: eine Kurzfassung des Buches
Richard Martin Stern and Hans Blickensdörfer and Roger Bourgeon and Henry Denker Annals of Otology, Rhinology & Laryngology 1980

A FAVORABLE PROGNOSIS FOR DRUG STOCKS
S REID and N SWEIG and G FARR and D SAKS and M HARSHBARGER and R STERN INSTITUTIONAL INVESTOR 1980

1979

Discrimination of symmetric time‐intensity traded binaural stimuli
B Robert Ruotolo and Richard M Stern Jr and H Steven Colburn The Journal of the Acoustical Society of America 1979

Effects of binaural maskers on the subjective laterality of diotic targets
Eliot M Rubinov and Richard M Stern The Journal of the Acoustical Society of America 1979

On the use of multiple perceptual images in binaural discrimination experiments
Richard M Stern The Journal of the Acoustical Society of America 1979

A forced‐choice paradigm for pulsation‐threshold measurements
Gregory J DuMond and Richard M Stern The Journal of the Acoustical Society of America 1979

1978

Eardrum impedance effects on the free field frequency response of occluded ears
S Gilman and D Dirks and R Stern The Journal of the Acoustical Society of America 1978

Theory of binaural interaction based on auditory‐nerve data. IV. A model for subjective lateral position
Richard M Stern Jr and H Steven Colburn The Journal of the Acoustical Society of America 1978

Die Himmelsmaschine (I hide, we seek, dt.) Roman
Richard Martin Stern and Wulf Bergner The Journal of the Acoustical Society of America 1978

Sturm ist ihre Ernte (Stanfield harvest, dt.) Roman
Richard Martin Stern and Georg Albrecht von Ihering The Journal of the Acoustical Society of America 1978

1977

Lateralization and the MLD: detection‐threshold performance of an auditory‐nerve‐based model for lateral position
Richard M Stern The Journal of the Acoustical Society of America 1977

Discrimination of symmetric, time‐intensity traded stimuli
B Robert Ruotolo and Richard M Stern Jr and H Steven Colburn The Journal of the Acoustical Society of America 1977

Der weisse Hai: eine Kurzfassung des Buches
Peter Benchley and Ephraim Kishon and Richard Martin Stern and Hans Blickensdörfer The Journal of the Acoustical Society of America 1977

1976

Current problems in binaural hearing research
HS Colburn and RH Domnitz and RM Stern Jr and NI Durlach The Journal of the Acoustical Society of America 1976

Lateral position, interaural discrimination, and binaural detection: Model based on auditory‐nerve activity
Richard M Stern The Journal of the Acoustical Society of America 1976

Flammendes Inferno
Richard Martin Stern The Journal of the Acoustical Society of America 1976

Die Herren von El Rancho: Roman
Richard Martin Stern The Journal of the Acoustical Society of America 1976

Die Herren von El Rancho (Power. dt) Roman
Richard Martin Stern and Charlotte von Ihering and Georg Albrecht von Ihering The Journal of the Acoustical Society of America 1976

Lateralization, discrimination, and detection of binaural pure tones
Richard Martin Stern The Journal of the Acoustical Society of America 1976