Xiao-Lei Zhang
Professor,
1. School of Marine Science and Technology, Northwestern Polytechnical University,
China
2. Research and
Development Institute of Northwestern Polytechnical University in Shenzhen,
China
Email:
xiaolei.zhang@nwpu.edu.cn
Address: 127 Youyi West Road, Beilin District, Xi'an, Shaanxi
710072, China.
Research interests:
Audio and Speech Processing, Machine
Learning, Statistical Signal Processing, Artificial Intelligence.
Biography
Xiao-Lei Zhang
received the Ph.D. degree with honors in Information and Communication
Engineering from Tsinghua University, Beijing, China, in 2012. He was a postdoctoral
researcher with the Department of Electronic Engineering, Tsinghua University,
from 2012 to 2014. He was a visiting scholar with the Perception and
Neurodynamics Laboratory, The Ohio State University, Columbus, OH, USA, from 2013
to 2014. He was a postdoctoral researcher with the Department of Computer
Science and Engineering, The Ohio State University, Columbus, OH, USA, from 2014
to 2016. In 2016, he joined Northwestern Polytechnical University, Xi'an,
China, where he is currently a full professor.
His research
interests include speech processing, machine learning,
statistical signal processing, and artificial intelligence. He has published
numerous peer-reviewed articles in journals and conference proceedings, including
IEEE TPAMI, IEEE TASLP, IEEE TCYB, IEEE TSMC, Neural Networks, Pattern
Recognition, Computer Speech and Language, and Speech Communication. He has
authored the book "Speech Signal Processing Based on Deep Learning in Adverse
Environments" and co-authored a textbook in statistics.
He received the 2020
Neural Networks Best Paper Award from the
International Neural Network Society, the
Excellent Paper Award of Ubi-Media 2019 (with Mou Wang and Susanto
Rahardja), and the First-class Beijing Science and Technology Award in 2014. He was an APSIPA Distinguished Lecturer in
2018-2019. He was selected into the Youth Program of National Distinguished
Experts of China in 2017 and the Hundred Talents Plan of Shaanxi Province,
China, in 2017.
He serves or has served as an action editor of Neural Networks, an
associate editor of IEEE/ACM Transactions on Audio, Speech, and Language
Processing, EURASIP Journal on Audio, Speech, and Music Processing, IEEE
Signal Processing Letters, and
many other journals, and has helped organize many conferences and challenges. He is
an elected member of the IEEE SPS
Speech and Language Processing Technical Committee (SLTC)
(2021-2023), and an elected vice chairman of the Speech Information
Technical Committee of the Chinese Information Processing Society of China
(2023-).
He is a senior member of IEEE and a member of the IEEE SPS,
ISCA, CAA, CAAI, CCF, CIPS, etc.
Publications
Books
-
张晓雷(著).
复杂环境下语音信号处理的深度学习方法. 清华大学出版社, 2021年12月. ISBN:9787302590002.
[Xiao-Lei Zhang. Speech signal processing based on deep
learning in adverse environments. Tsinghua University Press,
December 2021. (in Chinese)]
Journal articles
-
Linfeng Feng,
Xiao-Lei Zhang,
and Xuelong Li,
“Eliminating quantization errors in classification-based sound source
localization,”
Neural Networks (NNJ), volume
181, number 106679,
pages 1-11, 2024. [code]
-
Linfeng Feng,
Yijun Gong, Zhi Liu, Xiao-Lei Zhang,
and Xuelong Li,
“Learning multi-dimensional speaker localization: Axis partitioning,
unbiased label distribution, and data augmentation,”
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE
TASLP), volume
32,
pages 4013-4025, 2024.
-
Xiao-Lei Zhang and Xuelong Li, "Robust
multilayer bootstrap networks in ensemble for unsupervised
representation learning and clustering," Pattern Recognition,
volume 156, pages 110739-110752, 2024. [supplement][code]
-
Yibo Bai, Xiao-Lei Zhang, and
Xuelong Li, "Diffusion-based adversarial purification for speaker
verification,"
IEEE Signal Processing Letters (IEEE SPL), volume 31,
pages 2300-2304, 2024.
-
Lei Zhao, Wenbo
Zhu, Shengqiang Li, Hong Luo, Xiao-Lei Zhang,
and Susanto Rahardja,
“Multi-resolution convolutional residual neural networks for
monaural speech dereverberation,”
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE
TASLP), volume
32,
pages 2338-2351, 2024.
-
Jiadi Yao, Hong
Luo, Jun Qi, and Xiao-Lei Zhang,
“Interpretable spectrum transformation attacks to speaker recognition,”
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IEEE
TASLP), volume
32,
pages 1531-1545, 2024. [code]
-
Xueqing Li,
Shengqiang Li, Xiao-Lei Zhang, and
Susanto Rahardja, "Transformer-based
end-to-end speech translation with rotary position embedding,"
IEEE Signal Processing Letters (IEEE SPL), volume 31,
pages 371-375, 2024.
-
Jie Wang, Xing
Chen, and Xiao-Lei Zhang, “Zeroth and first-order difference discrimination
for unsupervised domain adaptation,” Complex & Intelligent Systems, 2023.
-
Xing Chen, Jie
Wang, Xiao-Lei Zhang, Wei-Qiang Zhang, and Kunde Yang, "LMD: A
learnable mask network to detect adversarial examples for speaker
verification," IEEE/ACM Transactions on Audio, Speech, and Language
Processing (IEEE TASLP), volume 31, pages 2476-2490, 2023.
-
Jie Wang and Xiao-Lei Zhang, “Improving
pseudo labels with intra-class similarity for unsupervised domain
adaptation,” Pattern Recognition, volume 138, number 109379,
pages 1-11, 2023. [code]
-
Jiadi Yao,
Chengdong Liang, Xing Chen, Xiao-Lei Zhang, Wei-Qiang Zhang, and
Kunde Yang, “Symmetric
saliency-based adversarial attack to speaker identification,”
IEEE Signal Processing Letters (IEEE SPL), volume 30, pages 1-5,
2023.
-
Mou Wang, Junqi
Chen, Xiao-Lei Zhang, and Susanto Rahardja, “End-to-end
multi-modal speech recognition on an air and bone conducted speech
corpus,” IEEE/ACM Transactions on Audio, Speech, and Language
Processing (IEEE TASLP), volume 31, pages 513-524, 2023.
-
Xiao-Lei
Zhang, Lei Xie, Eric Fosler-Lussier, Emmanuel Vincent, “Special
issue on advances in deep learning based speech processing,”
Neural Networks (NNJ), volume 158, pages 328-330, 2023.
-
Jianyu Wang and
Xiao-Lei Zhang, “Deep
NMF topic modeling,” Neurocomputing, volume 515, pages
157-173, 2023.
-
Xiao-Lei
Zhang and Menglong Xu, “AUC
optimization for deep learning based voice activity detection,” EURASIP Journal on Audio, Speech, and Music Processing, volume 2022,
pages 1-12, 2022.
-
Mou Wang, Junqi
Chen, Xiao-Lei Zhang, Zhiyong Huang, and Susanto Rahardja. “Multi-modal
speech enhancement with bone-conducted speech in time domain.” Applied Acoustics, volume 200, number 109058, 2022.
-
Qian Wang, Mou
Wang, Yan Yang, and Xiaolei Zhang, “Multi-modal
emotion recognition using EEG and speech signals,” Computers in
Biology and Medicine, volume 105907, pages 1-16, 2022.
-
Ziye Yang,
Shanzheng Guan, and Xiao-Lei Zhang. “Deep
ad-hoc beamforming based on speaker extraction for target-dependent
speech separation,” Speech Communication, volume 140, pages
87-97, 2022.
-
Zhongxin Bai,
Jianyu Wang, Xiao-Lei Zhang, and Jingdong Chen, “End-to-end
speaker verification via curriculum bipartite ranking weighted binary
cross-entropy,” IEEE/ACM Transactions on Audio, Speech, and
Language Processing (IEEE TASLP), volume 30, pages 1330-1344, 2022.
-
Jianyu Wang,
Shanzheng Guan, Shupei Liu, and Xiao-Lei Zhang.
Minimum-volume
multichannel nonnegative matrix factorization for blind audio source
separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing
(IEEE TASLP),
volume 29, pages 3089-3103, 2021.
-
Zhongxin Bai and
Xiao-Lei Zhang.
Speaker recognition based on deep learning: An overview. Neural
Networks (NNJ), volume 140, pages 65-99, 2021.
-
Xiao-Lei Zhang.
Deep ad-hoc beamforming.
Computer Speech and Language, volume 68, number 101201, pages
1-18, 2021 (arXiv preprint arXiv:1811.01233,
6th November 2018,
last revised on 7th January 2019.)
[arXiv]
-
Mou Wang, Xiao-Lei Zhang, and Susanto Rahardja.
An unsupervised deep learning system for acoustic scene analysis.
Applied Sciences, volume 10, number 6, pages 2076-2080, 2020.
-
Zhongxin Bai, Xiao-Lei Zhang,
and Jingdong Chen.
Speaker
verification by partial AUC optimization with Mahalanobis distance
metric learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing
(IEEE TASLP),
volume 28, pages 1533-1548, 2020.
-
Zhongxin Bai, Xiao-Lei Zhang,
and Jingdong Chen.
Cosine metric learning based speaker verification.
Speech
Communication, volume 118, pages 10-20, 2020.
-
Naijun Zheng and Xiao-Lei
Zhang.
Phase-aware speech enhancement based on deep neural networks.
IEEE/ACM Transactions on Audio, Speech, and Language Processing
(IEEE TASLP),
volume 27, number 1, pages 63-76, 2019.
-
Xiao-Lei Zhang.
Multilayer
bootstrap networks.
Neural Networks (NNJ), volume 103,
pages 29-43, 2018. [arXiv][project
page]
(recipient of the 2020 Best Paper Award
from the journal Neural Networks and the International Neural
Network Society)
-
Xiao-Lei Zhang and
DeLiang Wang.
A deep ensemble learning method for monaural speech separation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing
(IEEE TASLP),
volume 24, number 5, pages 967-977, 2016. [supplement]
-
Xiao-Lei Zhang and
DeLiang Wang.
Boosting contextual information for deep neural network based voice activity
detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing
(IEEE TASLP), volume
24, number 2, pages 252-264, February 2016. [supplement][example_of_automatic_labeling (6.0MB)]
-
Xiao-Lei
Zhang.
Heuristic ternary error correcting output codes via weight
optimization and layered clustering-based
approach.
IEEE Transactions on Cybernetics
(IEEE TCYB), volume 45, number 2, pages
289-301, February 2015.
-
Xiao-Lei Zhang.
Convex discriminative multitask clustering.
IEEE Transactions on Pattern Analysis and
Machine Intelligence (IEEE TPAMI), volume 37, number 1, pages 28-40, January
2015. [supplement]
-
Xiao-Lei Zhang and Ji Wu.
Deep belief networks based voice activity detection.
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), volume 21,
number 4, pages 697-710, April 2013.
-
Xiao-Lei Zhang and Ji Wu.
Linearithmic time sparse and convex maximum margin clustering.
IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics (IEEE
TSMCB), volume 42, number 6, pages 1669-1692, December 2012.
-
Ji Wu and Xiao-Lei Zhang,
Efficient multiple kernel support vector machine based voice activity
detection, IEEE Signal Processing Letters (IEEE SPL), volume 18,
number 8, pages 466-469, August 2011.
-
Ji Wu and Xiao-Lei Zhang,
An efficient voice activity detection algorithm by combining statistical
model and energy detection, EURASIP Journal on Advances in Signal
Processing, volume 2011, number 1, pages 18-27, July 2011.
-
Ji Wu and Xiao-Lei Zhang,
Maximum margin clustering based statistical VAD with multiple observation
compound feature, IEEE Signal Processing Letters (IEEE SPL),
volume 18, number 5, pages 283-286, May 2011.
Conference papers
-
Qing Wang,
Hongmei Guo, Jian Kang, Mengjie Du, Jie Li,
Xiao-Lei Zhang, and Lei Xie,
“Speaker
Contrastive Learning for Source Speaker Tracing,”
Proceedings of
2024 IEEE Spoken Language Technology Workshop (SLT'24),
Macao, China, December 2024, pages X-X.
-
Yibo Bai,
Xiao-Lei Zhang, and Xuelong Li,
“Adversarial
Purification for Speaker Verification by Two-Stage Diffusion Models,”
Proceedings of
2024 IEEE Spoken Language Technology Workshop (SLT'24),
Macao, China, December 2024, pages X-X.
-
Yuanyuan Zhu,
Jiaxu He, Ruihao Jing, Yaodong Song, Jie Lian,
Xiao-Lei Zhang, and Jie Li,
“LLM-based expressive text-to-speech synthesizer with style and timbre
disentanglement,”
Proceedings of 14th International Symposium on Chinese Spoken Language
Processing (ISCSLP'24), Beijing, China, November 2024, pages X-X.
-
Linfeng Feng,
Xiao-Lei Zhang,
and Xuelong Li,
“Quantization-error-free soft label for 2D sound source localization,”
Proceedings of 14th International Symposium on Chinese Spoken Language
Processing (ISCSLP'24), Beijing, China, November 2024, pages X-X.
-
Hongmei Guo,
Yijiang Chen, Xiao-Lei Zhang, and Xuelong
Li,
“Graph attention based multi-channel U-Net for speech dereverberation with
ad-hoc microphone arrays,” In Proceedings of Interspeech,
Kos Island, Greece, September 2024, pages 617-621.
-
Sizhou Chen,
Yibo Bai, Jiadi Yao, Xiao-Lei Zhang,
and Xuelong Li, “Textual-driven adversarial purification for speaker
verification,” In Proceedings of Interspeech,
Kos Island, Greece, September 2024, pages 527-531.
-
Xianyan Fu,
Xiao-Lei Zhang, Chao-Han Huck Yang, and Jun
Qi, "Exploiting a quantum multiple kernel learning approach for low-resource
spoken command recognition," In Proceedings of the 2024 IEEE
International Conference on Acoustics, Speech, and Signal Processing
(ICASSP'24), Seoul, Korea, April 2024, pages 12931-12935.
-
Lei Zhao and
Xiao-Lei Zhang, "A hierarchical multi-proxy
loss with dynamic main-proxy for deep metric learning," In Proceedings of the 2024 IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP'24), Seoul, Korea, April 2024, pages
2695-2699.
-
Jiadi Yao, Chengdong
Liang, Zhendong Peng, Binbin Zhang, and Xiao-Lei Zhang,
“Branch-ECAPA-TDNN: A parallel branch architecture to capture local and
global features for speaker verification,” In Proceedings of Interspeech,
Dublin, Ireland, August 2023, pages 1943-1947.
-
Jie Wang, Menglong
Xu, Jingyong Hou, Binbin Zhang, Xiao-Lei Zhang, Lei Xie, and Fuping
Pan, “WeKws: A production first small-footprint end-to-end Keyword Spotting
Toolkit,” In Proceedings of the 47th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP’23), Rhodes Island,
Greece, June 2023.
-
Linfeng Feng, Yijun
Gong, and Xiao-Lei Zhang, “Soft label coding for end-to-end sound
source localization with ad-hoc microphone arrays,” In Proceedings of the
47th IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP’23), Rhodes Island, Greece, June 2023.
-
Jun Qi, Xiao-Lei
Zhang, and Javier Tejedor, “Optimizing quantum federated learning based
on federated quantum natural gradient descent,” In Proceedings of the
47th IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP’23), Rhodes Island, Greece, June 2023.
-
Chengdong Liang,
Xiao-Lei Zhang, Binbin Zhang, Di Wu, Shengqiang Li, Xingchen Song,
Zhendong Peng, and Fuping Pan, “Fast-U2++: Fast and Accurate End-to-End
Speech Recognition in Joint CTC/Attention Frames,” In Proceedings of the
47th IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP’23), Rhodes Island, Greece, June 2023.
-
Xing Chen, Jiadi
Yao, and Xiao-Lei Zhang, “Masking speech feature to detect adversarial examples
for speaker verification,” In Proceedings of Asia-Pacific Signal and
Information Processing Association (APSIPA ASC’22), Chiang Mai,
Thailand, November 2022,
pages 191-195.
-
Yijun Gong, Shupei
Liu, and Xiao-Lei Zhang, “End-to-end
two-dimensional sound source localization with
ad-hoc microphone arrays,” In Proceedings of Asia-Pacific Signal and
Information Processing Association (APSIPA ASC’22), Chiang Mai,
Thailand, November 2022,
pages 1944-1949.
-
Menglong Xu,
Shengqiang Li, Chengdong Liang, and Xiao-Lei Zhang. “Multiclass AUC optimization
for robust small-footprint keyword spotting with limited training data,” In
Proceedings of Interspeech, Incheon, Korea, September
2022, pages 3278-3282.
-
Chengdong Liang,
Yijiang Chen, Jiadi Yao, and Xiao-Lei Zhang. “Multi-Channel Far-Field
Speaker Verification with Large-Scale Ad-hoc Microphone Arrays,” In
Proceedings of Interspeech, Incheon, Korea, September 2022, pages
3679-3683.
-
Yijun Gong and
Xiao-Lei Zhang. “DP-Means:
an efficient Bayesian nonparametric model for speaker diarization,” In
Odyssey Workshop, Beijing, China, June 2022, pages 156-161.
-
Junqi Chen, Mou
Wang, Xiao-Lei Zhang, Zhiyong Huang, and Susanto Rahardja. “End-to-end
multi-modal speech recognition with air and bone conducted speech.” In
Proceedings
of the 46th IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP'22),
Singapore, Singapore, May 2022, pages 6052-6056.
-
Wenbo Zhu, Mou Wang,
Xiao-Lei Zhang, and Susanto Rahardja.
A comparison of handcrafted,
parameterized, and learnable features for speech separation. In Proceedings of Asia-Pacific Signal and Information Processing Association
(APSIPA ASC'21), Tokyo, Japan, December 2021, pages 635-639.
-
Shengqiang Li,
Menglong Xu, and Xiao-Lei Zhang.
Efficient conformer-based speech
recognition with linear attention. In Proceedings of Asia-Pacific Signal
and Information Processing Association (APSIPA ASC'21), Tokyo, Japan,
December 2021, pages 448-453.
-
Shengqiang Li,
Menglong Xu, and Xiao-Lei Zhang.
Conformer-based end-to-end speech
recognition with rotary position embedding. In Proceedings of
Asia-Pacific Signal and Information Processing Association (APSIPA ASC'21),
Tokyo, Japan, December 2021, pages 443-447.
-
Shanzheng Guan,
Shupei Liu, Junqi Chen, Wenbo Zhu, Shengqiang Li, Xu Tan, Ziye Yang,
Menglong Xu, Yijiang Chen, Chengdong Liang, Jianyu Wang, and Xiao-Lei
Zhang.
Libri-adhoc40: A dataset collected from synchronized ad-hoc
microphone arrays. In Proceedings of Asia-Pacific Signal and Information
Processing Association (APSIPA ASC'21), Tokyo, Japan, December 2021,
pages 1116-1120.
-
Chengdong Liang,
Junqi Chen, Shanzheng Guan, and Xiao-Lei Zhang.
Attention-based
multi-channel speaker verification with ad-hoc microphone arrays. In Proceedings of Asia-Pacific Signal and Information Processing Association
(APSIPA ASC'21), Tokyo, Japan, December 2021, pages 1111-1115.
-
Jianyu Wang,
Shanzheng Guan, and Xiao-Lei Zhang.
Minimum-volume regularized ILRMA
for blind audio source separation. In Proceedings of Asia-Pacific Signal
and Information Processing Association (APSIPA ASC'21), Tokyo, Japan,
December 2021, pages 630-634.
-
Junqi Chen and
Xiao-Lei Zhang. Scaling sparsemax based channel selection for speech
recognition with ad-hoc microphone arrays.
In Proceedings of Interspeech, Brno, Czech Republic, September 2021, pages
291-295.
-
Xu Tan and Xiao-Lei Zhang.
Speech enhancement aided end-to-end multi-task learning for voice activity
detection.
In Proceedings of the 45th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2021), Toronto, Ontario, Canada,
June 2021,
pages
6823-6827.
-
Menglong Xu,
Shengqiang Li, and Xiao-Lei Zhang.
Transformer-based
end-to-end speech recognition with local dense synthesizer attention.
In Proceedings of the 45th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2021), Toronto, Ontario, Canada,
June 2021,
pages
5899-5903.
-
Ziye Yang, Xiao-Lei Zhang,
and Zhonghua Fu.
Multi-channel speech separation using deep embedding
with multilayer bootstrap networks.
In
Proceedings of Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference
(APSIPA ASC 2020),
Auckland, New Zealand, December 2020,
pages
716-719.
-
Menglong Xu and
Xiao-Lei Zhang.
Depthwise separable convolutional ResNet with
squeeze-and-excitation blocks for small-footprint keyword spotting.
In
Proceedings of Interspeech,
Shanghai, China, October 2020, pages 2547-2551.
-
Jian-Yu Wang and
Xiao-Lei Zhang.
Deep topic modeling by multilayer bootstrap networks and lasso.
In Proceedings of the 25th IEEE International Conference on Pattern
Recognition (ICPR 2020), Milan, Italy, January 2021, pages 2470-2475.
-
Zhongxin Bai, Xiao-Lei Zhang, and Jingdong Chen.
Partial AUC metric learning based speaker verification back-end. In
Proceedings of the Odyssey Workshop (Odyssey 2020), Tokyo, Japan,
November 2020,
pages
380-384.
(one of the three
inaugural Jack
Godfrey's Best Student Paper Award finalists)
-
Zhongxin Bai, Xiao-Lei Zhang,
and Jingdong Chen.
Partial AUC optimization based deep speaker embeddings with class-center
learning for text-independent speaker verification.
In Proceedings of the 44th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2020), Virtual Barcelona, May 2020,
pages
6819-6823.
-
Mou Wang, Rui Wang, Xiao-Lei Zhang, and Susanto
Rahardja.
A Hybrid constant-Q transform based CNN ensemble for acoustic
scene classification. In
Proceedings of Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference
(APSIPA ASC 2019),
Lanzhou, China, November 2019,
pages
205-209.
-
Rui Wang, Mou Wang, Xiao-Lei Zhang, and Susanto
Rahardja.
Domain adaptation neural network for acoustic scene classification in
mismatched conditions. In
Proceedings of Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference
(APSIPA ASC
2019),
Lanzhou, China, November 2019,
pages
1501-1505.
-
Ziye Yang and Xiao-Lei Zhang.
Boosting spatial information for deep learning based multichannel
speaker-independent speech separation in reverberant environments.
In Proceedings of
Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference
(APSIPA ASC 2019),
Lanzhou, China, November 2019,
pages
1506-1510.
-
Mou Wang, Xiao-Lei Zhang, and Susanto Rahardja.
A hybrid approach for mobile phone clustering with speech recordings.
In Proceedings of
the 12th International Conference on Ubi-Media Computing (Ubi-Media 2019),
Bali, Indonesia, 2019,
pages
205-209.
(Excellent Paper Award
Winner)
-
Jingli Xie, Danqi
Jin, Wen Zhang, Xiao-Lei Zhang, Jie Chen, and DeLiang Wang.
Robust sparse multichannel active noise control.
In Proceedings of the 44th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2019), Brighton, United Kingdom,
May 2019,
pages
521-525.
-
Zi-Chen Fan,
Zhongxin Bai, Xiao-Lei Zhang, Susanto Rahardja, and Jingdong Chen.
AUC optimization for deep learning based voice activity detection. In
Proceedings of the 44th IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP 2019), Brighton, United Kingdom, May 2019,
pages
6760-6764.
-
Mengzhen Li and
Xiao-Lei Zhang,
An Investigation of Speaker Clustering Algorithms in Adverse Acoustic
Environments, In Proceedings of Asia-Pacific Signal and Information
Processing Association (APSIPA ASC 2018), Honolulu, Hawaii, November
2018,
pages
1462-1466.
-
Zhongxin Bai,
Xiao-Lei Zhang and
Jingdong Chen.
Cosine metric learning for speaker verification in the i-vector space.
In Proceedings of Interspeech,
Hyderabad, India, September 2018,
pages
1126-1130.
-
Xiao-Lei Zhang.
Speech separation by cost sensitive deep learning.
In Proceedings of
Asia-Pacific Signal and Information Processing Association
Annual Summit and Conference (APSIPA ASC 2017), Malaysia, December 2017,
pages
159-162.
-
Xiao-Lei Zhang.
Universal background sparse coding and multilayer bootstrap network for
speaker clustering.
In Proceedings of Interspeech, San Francisco, USA, September 2016,
pages
1858-1862. [supplement][UBSC+MBN_code
(3.0MB)]
-
Xiao-Lei Zhang.
Nonlinear dimensionality reduction of data by deep distributed random
samplings.
In JMLR Workshop and Conference Proceedings 39: the
6th Asian Conference on Machine Learning (ACML 2014), Nha Trang, Vietnam, November 2014,
pages 221-233.
-
Xiao-Lei Zhang and
DeLiang Wang.
Boosted deep neural networks and multi-resolution cochleagram features for
voice activity detection.
In Proceedings of Interspeech, Singapore, 2014, pages 1534-1538.
-
Xiao-Lei Zhang.
Unsupervised domain adaptation for deep neural network based voice
activity detection.
In Proceedings of the 39th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2014), Florence, Italy, May 2014, pages 6864-6868.
-
Xiao-Lei Zhang and
Ji Wu.
Denoising deep neural networks based voice activity detection.
In Proceedings of the 38th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP 2013), Vancouver, Canada, May 2013, pages
853-857.
-
Xiao-Lei Zhang and
Ji Wu.
Weight optimization and layered clustering-based ECOC.
In Proceedings of the 38th IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 2013), Vancouver, Canada, May 2013, pages
3477-3481.
-
Xiao-Lei Zhang,
Ji Wu, Zhi-Peng Chen, and Ping Lv.
Optimized weighted decoding for error-correcting output codes.
In Proceedings of the 37th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan,
April 2012, pages 2101-2104.
Preprint articles
-
Xiao-Lei Zhang. Linear regression for speaker verification. arXiv preprint arXiv:1802.04113,
12th February 2018.
[pdf][arXiv]
-
Xiao-Lei Zhang. An investigation of universal background sparse coding based speaker
verification on TIMIT. arXiv preprint arXiv:1509.07298,
24th September 2015, last updated on 15th March 2017. [pdf][supplement][code][arXiv]
Call for papers
Special issue on Deep Learning applied to
Music Signal Processing, in EURASIP Journal on Audio, Speech, and
Music Processing
Important Dates
Submission deadline: 31 January 2023 (extended from 1 September 2022)
Lead Guest Editor
Xiaolei Zhang, Northwestern Polytechnical University
Email: xiaolei.zhang@nwpu.edu.cn
Guest Editors
Wenwu Wang, University of Surrey, UK
Ivan Lee, University of South Australia, Australia
Important: Authors should select "Deep
Learning applied to Music Signal Processing" when they reach the “Article
Type” step in the submission system.
Submission guidelines: https://asmp-eurasipjournals.springeropen.com/submission-guidelines
For detailed information, see the following
URL:
https://asmp-eurasipjournals.springeropen.com/call-for-papers--deep-learning-applied-to-music-signal-processin
Recruitment advertisement:
1) Multiple postdoctoral positions are available at Northwestern
Polytechnical University (NWPU) in
machine learning, deep learning, audio and speech processing, and language
processing.
2) Multiple faculty positions are available at
NWPU at the assistant professor,
associate professor, and full professor levels.
For detailed information, please
contact me via
xiaolei.zhang@nwpu.edu.cn.