🚧 last updated May 2, 2024 🚧
ESPnet-SUMM: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems
Roshan Sharma and William Chen and Takatomo Kano and Ruchira Sharma and Siddhant Arora and Shinji Watanabe and Atsunori Ogawa and Marc Delcroix and Rita Singh and Bhiksha Raj
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023
LoFT: Local proxy fine-tuning for improving transferability of adversarial attacks against large language model
Muhammad Ahmed Shah and Roshan Sharma and Hira Dhamyal and Raphael Olivier and Ankit Shah and Dareen Alharthi and Hazim T Bukhari and Massa Baali and Soham Deshmukh and Michael Kuhlmann and Bhiksha Raj and Rita Singh
arXiv preprint arXiv:2310.04445 2023
Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms
Joseph Konan and Ojas Bhargave and Shikhar Agnihotri and Hojeong Lee and Ankit Shah and Shuo Han and Yunyang Zeng and Amanda Shu and Haohui Liu and Xuankai Chang and Hamza Khalid and Minseon Gwak and Kawon Lee and Minjeong Kim and Bhiksha Raj
arXiv preprint arXiv:2303.09048 2023
USB: A unified semi-supervised learning benchmark for classification
Yidong Wang and Hao Chen and Yue Fan and Wang Sun and Ran Tao and Wenxin Hou and Renjie Wang and Linyi Yang and Zhi Zhou and Lan-Zhe Guo and Heli Qi and Zhen Wu and Yu-Feng Li and Satoshi Nakamura and Wei Ye and Marios Savvides and Bhiksha Raj and Takahiro Shinozaki and Bernt Schiele and Jindong Wang and Xing Xie and Yue Zhang
Advances in Neural Information Processing Systems 2022
HEAR: Holistic evaluation of audio representations
Joseph Turian and Jordie Shier and Humair Raj Khan and Bhiksha Raj and Björn W Schuller and Christian J Steinmetz and Colin Malloy and George Tzanetakis and Gissel Velarde and Kirk McNally and Max Henry and Nicolas Pinto and Camille Noufi and Christian Clough and Dorien Herremans and Eduardo Fonseca and Jesse Engel and Justin Salamon and Philippe Esling and Pranay Manocha and Shinji Watanabe and Zeyu Jin and Yonatan Bisk
NeurIPS 2021 Competitions and Demonstrations Track 2022
USB: A unified semi-supervised learning benchmark
Yidong Wang and Hao Chen and Yue Fan and Wang Sun and Ran Tao and Wenxin Hou and Renjie Wang and Linyi Yang and Zhi Zhou and Lan-Zhe Guo and Heli Qi and Zhen Wu and Yu-Feng Li and Satoshi Nakamura and Wei Ye and Marios Savvides and Bhiksha Raj and Takahiro Shinozaki and Bernt Schiele and Jindong Wang and Xing Xie and Yue Zhang
Conference on Neural Information Processing Systems (NeurIPS) 2022
FoolHD: Fooling speaker identification by highly imperceptible adversarial disturbances
Ali Shahin Shamsabadi and Francisco Sepúlveda Teixeira and Alberto Abad and Bhiksha Raj and Andrea Cavallaro and Isabel Trancoso
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
Preserving privacy in speaker and speech characterisation
Andreas Nautsch and Abelino Jiménez and Amos Treiber and Jascha Kolberg and Catherine Jasserand and Els Kindt and Héctor Delgado and Massimiliano Todisco and Mohamed Amine Hmani and Aymen Mtibaa and Mohammed Ahmed Abdelraheem and Alberto Abad and Francisco Teixeira and Driss Matrouf and Marta Gomez-Barrero and Dijana Petrovska-Delacrétaz and Gérard Chollet and Nicholas Evans and Thomas Schneider and Jean-François Bonastre and Bhiksha Raj and Isabel Trancoso and Christoph Busch
arXiv preprint arXiv:1911.05733 2019
A Comparative Study of Spatial Speech Separation Techniques to Improve Speech Recognition
Xinhui Zhou and Chiman Kwan and Bulent Ayhan and Chanwoo Kim and Kshitiz Kumar and Richard Stern
Advances in Neural Networks–ISNN 2018: 15th International Symposium on Neural Networks, ISNN 2018, Minsk, Belarus, June 25–28, 2018, Proceedings 15 2018
Sound source separation using phase difference and reliable mask selection
Chanwoo Kim and Anjali Menon and Michiel Bacchiani and Richard M Stern
Advances in Neural Networks–ISNN 2018: 15th International Symposium on Neural Networks, ISNN 2018, Minsk, Belarus, June 25–28, 2018, Proceedings 15 2018
A comparative analysis of human-mediated and system-mediated interruptions for multi-user, multitasking interactions
Nia Peters and Griffin Romigh and George Bradley and Bhiksha Raj
Advances in Human Factors and Systems Interaction: Proceedings of the AHFE 2017 International Conference on Human Factors and Systems Interaction, July 17–21, 2017, The Westin Bonaventure Hotel, Los Angeles, California, USA 8 2018
Robust features in deep-learning-based speech recognition
Vikramjit Mitra and Horacio Franco and Richard M Stern and Julien Van Hout and Luciana Ferrer and Martin Graciarena and Wen Wang and Dimitra Vergyri and Abeer Alwan and John HL Hansen
New Era for Robust Speech Recognition: Exploiting Deep Learning 2017
Audition for multimedia computing
Gerald Friedland and Paris Smaragdis and Josh McDermott and Bhiksha Raj
Advances in Human Factors and Systems Interaction: Proceedings of the AHFE 2017 International Conference on Human Factors and Systems Interaction, July 17–21, 2017, The Westin Bonaventure Hotel, Los Angeles, California, USA 8 2017
DCASE 2017 challenge setup: Tasks, datasets and baseline system
Annamaria Mesaros and Toni Heittola and Aleksandr Diment and Benjamin Elizalde and Ankit Shah and Emmanuel Vincent and Bhiksha Raj and Tuomas Virtanen
DCASE 2017-Workshop on Detection and Classification of Acoustic Scenes and Events 2017
The REVERB challenge: A benchmark task for reverberation-robust ASR techniques
Keisuke Kinoshita and Marc Delcroix and Sharon Gannot and Emanuël AP Habets and Reinhold Haeb-Umbach and Walter Kellermann and Volker Leutnant and Roland Maas and Tomohiro Nakatani and Bhiksha Raj and Armin Sehr and Takuya Yoshioka
New Era for Robust Speech Recognition: Exploiting Deep Learning 2017
When to interrupt: A comparative analysis of interruption timings within collaborative communication tasks
Nia Peters and Griffin Romigh and George Bradley and Bhiksha Raj
Advances in Human Factors and System Interactions: Proceedings of the AHFE 2016 International Conference on Human Factors and System Interactions, July 27-31, 2016, Walt Disney World®, Florida, USA 2017
Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech
Vikramjit Mitra and Julien van Hout and Wen Wang and Chris Bartels and Horacio Franco and Dimitra Vergyri and Abeer Alwan and Adam Janin and John HL Hansen and Richard M Stern and Abhijeet Sangwan and Nelson Morgan
INTERSPEECH 2016
A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
Keisuke Kinoshita and Marc Delcroix and Sharon Gannot and Emanuël AP Habets and Reinhold Haeb-Umbach and Walter Kellermann and Volker Leutnant and Roland Maas and Tomohiro Nakatani and Bhiksha Raj and Armin Sehr and Takuya Yoshioka
EURASIP Journal on Advances in Signal Processing 2016
Experimentation on the DCASE challenge 2016: Task 1 - acoustic scene classification and task 3 - sound event detection in real life audio
Benjamin Elizalde and Anurag Kumar and Ankit Shah and Rohan Badlani and Emmanuel Vincent and Bhiksha Raj and Ian Lane
Detection and Classification of Acoustic Scenes and Events 2016
Crowdsourced Video Subtitling with Adaptation Based on User-Corrected Lattices
João Miranda and Ramón F Astudillo and Ângela Costa and André Silva and Hugo Silva and João Graça and Bhiksha Raj
Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016, Lisbon, Portugal, November 23-25, 2016, Proceedings 3 2016
Detecting psychological distress in adults through transcriptions of clinical interviews
Joana Correia and Isabel Trancoso and Bhiksha Raj
Advances in Speech and Language Technologies for Iberian Languages: Third International Conference, IberSPEECH 2016, Lisbon, Portugal, November 23-25, 2016, Proceedings 3 2016
Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop
Hynek Hermansky and Lukáš Burget and Jordan Cohen and Emmanuel Dupoux and Naomi Feldman and John Godfrey and Sanjeev Khudanpur and Matthew Maciejewski and Sri Harish Mallidi and Anjali Menon and Tetsuji Ogawa and Vijayaditya Peddinti and Richard Rose and Richard Stern and Matthew Wiesner and Karel Veselý
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015
CMU informedia@ TrecVID 2015 MED/SIN/LNK/SED
Shoou-I Yu and Lu Jiang and Zhongwen Xu and Zhenzhong Lan and Shicheng Xu and Xiaojun Chang and Xuanchong Li and Zexi Mao and Chuang Gan and Yajie Miao and Xingzhong Du and Yang Cai and Lara Martin and Nikolas Wolfe and Anurag Kumar and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard Stern and Alexander Hauptmann
TREC Video Retrieval Evaluation 2015 2015
A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
Víctor Poblete Ramírez and Felipe Espic and Simon King and Richard M Stern and Fernando Huenupán and Josué Abraham Fredes Sandoval and Néstor Becerra Yoma
TREC Video Retrieval Evaluation 2015 2015
In vivo treatment sensitivity testing with positron emission tomography/computed tomography after one cycle of chemotherapy for Hodgkin lymphoma
Martin Hutchings and Lale Kostakoglu and Jan Maciej Zaucha and Bogdan Malkowski and Alberto Biggi and Iwona Danielewicz and Annika Loft and Lena Specht and Dominick Lamonica and Myron S Czuczman and Christina Nanni and Pier Luigi Zinzani and Louis Diehl and Richard Stern and Morton Coleman
J Clin Oncol 2014
Informedia@ TRECVID 2014 MED and MER
Shoou-I Yu and Lu Jiang and Zexi Mao and Xiaojun Chang and Xingzhong Du and Chuang Gan and Zhenzhong Lan and Zhongwen Xu and Xuanchong Li and Yang Cai and Anurag Kumar and Yajie Miao and Lara Martin and Nikolas Wolfe and Shicheng Xu and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard Stern and Alexander Hauptmann
NIST TRECVID Video Retrieval Evaluation Workshop 2014
Informedia@ TrecVID 2014: MED and MER
Shoou-I Yu and Lu Jiang and Zhongwen Xu and Zhenzhong Lan and Shicheng Xu and Xiaojun Chang and Xuanchong Li and Zexi Mao and Chuang Gan and Yajie Miao and Xingzhong Du and Yang Cai and Lara Martin and Nikolas Wolfe and Anurag Kumar and Huan Li and Ming Lin and Zhigang Ma and Yi Yang and Deyu Meng and Shiguang Shan and Pinar Duygulu Sahin and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Teruko Mitamura and Richard A Stern and Alexander G Hauptmann
TREC Video Retrieval Evaluation 2014 2014
CMU-Informedia at TRECVID 2013 multimedia event detection
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Shourabh Rawat and Yang Cai and Chenqiang Gao and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Richard Stern and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann
TRECVID 2013 Workshop 2013
PERCEPTION-BASED MEDIA PROCESSING
K Brandenburg and C Faller and J Herre and JD Johnston and WB Kleijn and S Spors and H Wierstorf and A Raake and F Melchior and M Frank and F Zotter and G Richard and S Sundaram and S Narayanan and S Möller and R Heusdens and H Hermansky and JR Cohen and RM Stern and J Wouters and S Doclo and R Koning and T Francart and E Reinhard and AA Efros and J Kautz and HP Seidel and AC Bovik and HR Wu and AR Reibman and W Lin and F Pereira and SS Hemami and LB Kish and CG Granqvist and LJ Karam and K MacLean and R Garner
Proceedings of the IEEE 2013
Informedia E-Lamp@ TRECVID 2013: Multimedia Event Detection and Recounting (MED and MER)
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Chenqiang Gao and Shourabh Rawat and Yang Cai and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Nicolas Ballas and Deyu Meng and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg
The Journal of the Acoustical Society of America 2013
The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
Keisuke Kinoshita and Marc Delcroix and Takuya Yoshioka and Tomohiro Nakatani and Emanuel Habets and Reinhold Haeb-Umbach and Volker Leutnant and Armin Sehr and Walter Kellermann and Roland Maas and Sharon Gannot and Bhiksha Raj
2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics 2013
Informedia E-Lamp@ TRECVID 2013: Multimedia Event Detection and Recounting (MED and MER)
Zhen-Zhong Lan and Lu Jiang and Shoou-I Yu and Chenqiang Gao and Shourabh Rawat and Yang Cai and Shicheng Xu and Haoquan Shen and Xuanchong Li and Yipei Wang and Waito Sze and Yan Yan and Zhigang Ma and Nicolas Ballas and Deyu Meng and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg
ICASSP 2013
Informedia E-Lamp@ TRECVID 2012: Multimedia Event Detection and Recounting (MED and MER)
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Sohail Bahmani and Antonio Juarez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann
Techniques for Noise Robustness in Automatic Speech Recognition 2012
Pretherapy metabolic tumor burden (MTV) may risk-stratify lymphoma patients: Comparison with early metabolic response
Lale Kostakoglu and Neetha Gandikota and Martin Hutchings and Ryan Cotter and Dominick Lamonica and Josef Machac and Richard Stern and Morton Coleman
IEEE signal processing magazine 2012
Microphone array processing for distant speech recognition: Towards real-world deployment
Kenichi Kumatani and Takayuki Arakawa and Kazumasa Yamamoto and John McDonough and Bhiksha Raj and Rita Singh and Ivan Tashev
Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2012
Informedia E-Lamp@ TRECVID 2012: Multimedia Event Detection and Recounting (MED and MER)
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Sohail Bahmani and Antonio Juarez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Roxana Sarbu and Teruko Mitamura and Eric Nyberg and Alexander Hauptmann
IEEE Signal Process. Mag 2012
Informedia@ TRECVID 2012
Shoou-I Yu and Zhongwen Xu and Duo Ding and Waito Sze and Francisco Vicente and Zhenzhong Lan and Yang Cai and Shourabh Rawat and Peter F Schulam and Nisarga Markandaiah and Sohail Bahmani and Antonio Juárez and Wei Tong and Yi Yang and Susanne Burger and Florian Metze and Rita Singh and Bhiksha Raj and Richard M Stern and Teruko Mitamura and Eric Nyberg and Lu Jiang and Qiang Chen and Lisa M Brown and Ankur Datta and Quanfu Fan and Rogério Schmidt Feris and Shuicheng Yan and Alexander G Hauptmann and Sharath Pankanti
TRECVID 2012
A comparison of prosody modification using instants of significant excitation and mel-cepstral vocoder
B Bajibabu and Ronanki Srikanth and Sathya Adithya Thati and Bhiksha Raj and B Yegnanarayana and Kishore Prahallad
Proceedings of the Centenary Conference on Electrical Engineering (CEE’11), Indian Institute of Science, Bangalore 2011
Improved breast and axillary lesion detection on PET imaging through the novel use of a breast holding device
Steven Parmett and Dana Rausch and Ping Lu and Ash Rafique and Nazar Golewale and Melissa Quispe and Chun Kim and Josef Machac and Richard Stern and Borys Krynyckyi
2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings 2006
Ajikumar, PK, 145 Alterkop, B., 215 Arratia, PE, 202
Z Lj Arsenijevic and Y Bao and Z Barkay and N Bay and JI Bech and F Berruti and AF Blandin and RL Boxman and C Briens and H Chang and FF Chen and G Cheng and Y Cheng and C DellaCorte and H Dong and N-h Duong and M Eriksen and E Forssberg and G Frenkel and BS Gardiner and RV Garic-Grulovic and P Godbole and O Goldstein and MH Hancock and M He and C Huang and KJ Hwang and HD Jang and Y Jin and M Kamruddin and F Kayihan and HC Kim and WB Kim and JP Klein and M Kolenbrander and M Kwapinska and A Lange and Q Li and W Li and CL Lin and Y Liu and Y-f Liu and CY Lu and D-y Lu and Y-n Lu and SY Lyu and J McMillan and MH Moys and FJ Muzzio and MS Nielsen and R Nithya and P Pandey and N Parkansky and W Peukert and T-s Qian and Z Qian and B Raj and S Reynolds and A Rivoire and CM Romo-Krfger and Yu Rosenberg and V Rudolph and G Saage and M Saberian and R Sahoo and S-z Shi and Y Song and MK Stanford and C Subero-Couroyer and E Tang and A Tordesillas and E Tsotsas and R Turton and AK Tyagi and J Wang and Y Wang and Z Wang and F Wei and MY Wey and C Xu and M Xu and Q Yin and AB Yu and V Zaspalis and X Zeng and J Zhang and M Zhang and Q Zhang and D Zhou and HP Zhu and J Zhu
Powder Technology 2006
Creating multi-modal, user-centric records of meetings with the Carnegie Mellon Meeting Recorder architecture
Satanjeev Banerjee and Jason Cohen and Thomas Quisel and Arthur Chan and Yash Patodia and Ziad Al Bawab and Rong Zhang and Alan Black and Roxana Sarbu and Alexander Rudnicky and Paul E Rybski and Manuela Veloso
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing 2004
The 1997 CMU Sphinx-3 English broadcast news transcription system
Kristie Seymore and Ronald Rosenfeld and S Chen and Maxine Eskenazi and E Gouvea and Raj Reddy and Mosur Ravishankar and Matthew Siegler and Richard Stern and Eric Thayer
Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA 1998
Sound Localization by Human Observers Symposium Proceedings Held in National Academy of Sciences on 14-16 October 1988
David M Green and George F Kuhn and Robert Butler and HS Colburn and John C Middlebrooks and Richard M Stern and James R Lackner and William A Yost
The Journal of the Acoustical Society of America 1989