Selected publications

(* denotes the corresponding author)


  1. Liangchen Luo#, Yuanhao Xiong#, Yan Liu, Xu Sun*.
    Adaptive Gradient Methods with Dynamic Bound of Learning Rate.
    ICLR 2019

  2. Xu Sun*#, Xuancheng Ren#, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang.
    Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method.
    IEEE Transactions on Knowledge and Data Engineering (TKDE) 2019

  3. Xu Sun*, Shuming Ma, Yi Zhang, Xuancheng Ren.
    Towards Easier and Faster Sequence Labeling for Natural Language Processing: A Search-based Probabilistic Online Learning Framework (SAPO).
    Information Sciences. Elsevier. 478:303-317, 2019

  4. Shuming Ma, Lei Cui, Damai Dai, Furu Wei, Xu Sun*.
    LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts.
    AAAI 2019
    [pdf][bibtex] [code]

  5. Liangchen Luo, Wenhao Huang, Qi Zeng, Zaiqing Nie, Xu Sun*.
    Learning Personalized End-to-End Goal-Oriented Dialog
    AAAI 2019


  1. Jingjing Xu#, Xu Sun*#, Qi Zeng, Xiaodong Zhang, Xuancheng Ren, Houfeng Wang, Wenjie Li.
    Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach.
    ACL 2018
    [pdf][code] [bibtex][ppt]

  2. Wei Wu, Xu Sun, Houfeng Wang.
    Question Condensing Networks for Answer Selection in Community Question Answering.
    ACL 2018

  3. Shuming Ma, Xu Sun*, Junyang Lin, Houfeng Wang.
    Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization.
    ACL 2018
    [pdf][code] [bibtex][ppt]

  4. Shuming Ma, Xu Sun*, Yizhong Wang, Junyang Lin.
    Bag-of-Words as Target for Neural Machine Translation.
    ACL 2018
    [pdf][code] [bibtex][ppt]

  5. Pengcheng Yang, Xu Sun*, Wei Li, Shuming Ma.
    Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network.
    ACL 2018
    [pdf][code] [bibtex][ppt]

  6. Junyang Lin, Xu Sun*, Shuming Ma, Qi Su.
    Global Encoding for Abstractive Summarization.
    ACL 2018
    [pdf][code] [bibtex][ppt]

  7. Pengcheng Yang, Xu Sun*, Wei Li, Shuming Ma, Wei Wu, Houfeng Wang
    SGM: Sequence Generation Model for Multi-label Classification
    COLING 2018 (Best Paper Award)
    [pdf][code] [bibtex][ppt]

  8. Yi Zhang, Xu Sun*, Shuming Ma, Yang Yang, Xuancheng Ren
    Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data?
    COLING 2018
    [pdf][code] [bibtex][ppt]

  9. Junyang Lin, Xu Sun*, Xuancheng Ren, Shuming Ma, Jinsong Su, Qi Su
    Deconvolution-Based Global Decoding for Neural Machine Translation
    COLING 2018
    [pdf][code] [bibtex][ppt]

  10. Hao Wang, Xiaodong Zhang, Shuming Ma, Xu Sun, Houfeng Wang
    An End-to-End Question Answering Model Based on Semi-Structured Tables
    COLING 2018

  11. Jingjing Xu, Hangfeng He, Xuancheng Ren, Sujian Li, Xu Sun*
    Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media: A Unified Model
    IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP) 26: 2142-2152, 2018

  12. Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun*
    DP-GAN: A Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  13. Fenglin Liu#, Xuancheng Ren#, Yuanxin Liu, Houfeng Wang, Xu Sun*
    Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  14. Jingjing Xu, Yi Zhang, Qi Zeng, Xuancheng Ren, Xiaoyan Cai, Xu Sun*
    A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  15. Yi Zhang, Jingjing Xu, Pengcheng Yang, Xu Sun*
    Learning Sentiment Memories for Sentiment Modification without Parallel Data
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  16. Liangchen Luo#, Jingjing Xu#, Junyang Lin, Qi Zeng, Xu Sun*
    An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  17. Junyang Lin, Qi Su, Pengcheng Yang, Shuming Ma, Xu Sun*
    Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  18. Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su
    Learning When to Concentrate or Divert Attention: Automatic Control of Attention Temperature for Neural Machine Translation
    EMNLP 2018
    [pdf][code] [bibtex][ppt]

  19. C. Shi, Q. Che, L. Sha, S. Li, X. Sun, H. Wang, T. Lin
    Labeling Dialogue Data with Unsupervised Learning
    EMNLP 2018

  20. Shuming Ma, Xu Sun*, Junyang Lin, Xuancheng Ren.
    A Hierarchical End-to-End Model for Jointly Improving Text Summarization and Sentiment Classification.
    IJCAI 2018
    [pdf][bibtex] [ppt]

  21. Xiaodong Zhang, Xu Sun*, Houfeng Wang*.
    Duplicate Question Identification by Integrating FrameNet with Neural Networks.
    AAAI 2018

  22. Chengyao Chen, Zhitao Wang, Wenjie Li, Xu Sun.
    Modeling Scientific Influence for Research Trending Topic Prediction.
    AAAI 2018

  23. Shuming Ma, Xu Sun*, Wei Li, Sujian Li, Wenjie Li, Xuancheng Ren.
    Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation.
    NAACL 2018
    [pdf] [code][bibtex] [ppt]

  24. Ji Wen, Xu Sun*, Xuancheng Ren, Qi Su.
    Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text.
    NAACL 2018 (Short Paper)
    [pdf] [data][bibtex]

  25. Yi Zhang, Xu Sun*.
    A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction
    LREC 2018
    [pdf] [data][bibtex]

  26. Xuancheng Ren, Xu Sun*, Ji Wen, Bingzhen Wei, Weidong Zhan, Zhiyuan Zhang.
    Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
    LREC 2018
    [pdf] [data][bibtex]

  27. Shuming Ma, Xu Sun*, Yi Zhang, Bingzhen Wei
    Accelerating Graph-based Dependency Parsing with Lock-Free Parallel Perceptron
    NLPCC 2018

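  28. Lexical Analysis: Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition (词法分析:切词、词性标注、和命名实体识别)
    Shanghai Jiao Tong University Press, 2018 (to appear)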


  1. Xu Sun, Xuancheng Ren, Shuming Ma, Houfeng Wang.
    meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting.
    ICML 2017
    [pdf] [code][bibtex]

  2. Shuming Ma, Xu Sun*, Jingjing Xu, Houfeng Wang, Wenjie Li, Qi Su.
    Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization.
    ACL 2017
    [pdf][code] [bibtex][ppt]

  3. Hangfeng He, Xu Sun*.
    A Unified Model for Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media.
    AAAI 2017: 3216-3222.

  4. Hangfeng He, Xu Sun*.
    F-Score Driven Max Margin Neural Network for Named Entity Recognition in Chinese Social Media.
    EACL 2017

  5. Shen Huang, Xu Sun, Houfeng Wang.
    Addressing Domain Adaptation for Chinese Word Segmentation with Global Recurrent Structure.
    IJCNLP 2017

  6. Yizhong Wang, Sujian Li, Jingfeng Yang, Xu Sun, Houfeng Wang.
    Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification.
    IJCNLP 2017

  7. Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang, Xu Sun.
    Cascading Multiway Attentions for Document-level Sentiment Classification.
    IJCNLP 2017

  8. Jingjing Xu, Shuming Ma, Yi Zhang, Bingzhen Wei, Xiaoyan Cai, and Xu Sun*.
    Transfer Deep Learning for Low-Resource Chinese Word Segmentation with a Novel Neural Network.
    NLPCC 2017

  9. Xu Sun, Shuming Ma.
    Lock-Free Parallel Perceptron for Graph-based Dependency Parsing.
    arXiv 2017: 1703.00782.

  10. Jingjing Xu, Xu Sun*.
    Transfer Learning for Low-Resource Chinese Word Segmentation with a Novel Neural Network.
    arXiv 2017: 1702.04488.

  11. Shuming Ma, Xu Sun*.
    A Generic Online Parallel Learning Framework for Large Margin Models.
    arXiv 2017: 1703.00786.

  12. Xu Sun, Bingzhen Wei, Xuancheng Ren, Shuming Ma.
    Label Embedding Network: Learning Label Representation for Soft Training of Deep Networks.
    arXiv 2017: 1710.10393.

  13. Shuming Ma, Xu Sun*.
    A Semantic Relevance Based Neural Network for Text Summarization and Text Simplification.
    arXiv 2017: 1710.02318.


  1. Xu Sun.
    Asynchronous Parallel Learning for Neural Networks and Structured Models with Dense Features.
    COLING 2016: 192-202

  2. Jingjing Xu, Xu Sun*.
    Dependency-based Gated Recursive Neural Network for Chinese Word Segmentation.
    ACL 2016: 567-572 (Short paper)

  3. C. Shi, S. Liu, S. Ren, S. Feng, M. Li, M. Zhou, X. Sun, H. Wang.
    Knowledge-Based Semantic Embedding for Machine Translation.
    ACL 2016

  4. X. Sun, Yansong Feng.
    Methods and Theories for Large-scale Structured Prediction
    EMNLP 2016 Tutorial
    [download PPT] [download PDF]

  5. Shuming Ma, Xu Sun*.
    A New Recurrent Neural CRF for Learning Non-linear Edge Features.
    arXiv 2016: 1611.04233.


  1. L. Li, B. Chang, S. Zhao, L. Sha, X. Sun, H. Wang.
    Multi-label Text Categorization with Joint Learning Predictions-as-Features Method.
    EMNLP 2015: 835-839

  2. A Guide to Memory-Based Natural Language Processing (基于记忆的自然语言处理导读)

  3. Xu Sun.
    Towards Shockingly Easy Structured Classification: A Search-based Probabilistic Online Learning Framework.
    (Probabilistic Perceptron: A method with better accuracy than CRFs and almost as fast as perceptrons)
    arXiv:1503.08381. 22 pages. 2015


  1. Xu Sun.
    Structure Regularization for Structured Prediction.
    NIPS 2014:2402-2410
    [pdf][full version with proofs] [code & notes] [bibtex] [slide]

  2. Xu Sun, Wenjie Li, Houfeng Wang, Qin Lu.
    Feature-Frequency-Adaptive Online Training for Fast and Accurate Natural Language Processing.
    Computational Linguistics. 40(3): 563-586. MIT Press. 2014.
    [pdf][code] [bibtex]

  3. L. Zhang, H. Wang, X. Sun.
    Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction.
    EMNLP 2014: 1881-1890

  4. L. Zhang, L. Li, H. Wang, X. Sun.
    Predicting Chinese Abbreviations with Minimum Semantic Unit and Global Constraints.
    EMNLP 2014: 1405-1414


  1. Xu Sun, Hisashi Kashima, Naonori Ueda.
    Large-Scale Personalized Human Activity Recognition using Online Multi-Task Learning.
    IEEE Transactions on Knowledge and Data Engineering (TKDE). 25(11): 2551-2563. IEEE. 2013
    [pdf][code] [bibtex]

  2. Xu Sun, Takuya Matsuzaki, Wenjie Li.
    Latent Structured Perceptrons for Large-Scale Learning with Hidden Information.
    IEEE Transactions on Knowledge and Data Engineering (TKDE). 25(9): 2063-2075. IEEE. 2013
    [pdf][code] [bibtex]

  3. X. Sun, N. Okazaki, J. Tsujii, H. Wang.
    Learning Abbreviations from Chinese and English Terms by Modeling Non-local Information.
    ACM Transactions on Asian Language Information Processing (TALIP). Vol. 12, No. 2, Article 5, 17 pages. 2013

  4. X. Sun, Y. Zhang, T. Matsuzaki, Y. Tsuruoka, J. Tsujii.
    Probabilistic Chinese Word Segmentation with Non-Local Information and Stochastic Training.
    Information Processing & Management (IPM). 49. 626-636. Elsevier. 2013

  5. X. Sun, W. Li, F. Meng, H. Wang.
    Generalized Abbreviation Prediction with Negative Full Forms and Its Application on Improving Chinese Web Search.
    IJCNLP. 641-647. 2013
    [pdf][bibtex] [slide]

  6. F. Meng, D. Gao, W. Li, X. Sun, Y. Hou.
    A Unified Graph Model for Personalized Query-oriented Reference Paper Recommendation.
    CIKM. 1509-1512. 2013

  7. L. Zhang, H. Wang, X. Sun, M. Mansur.
    Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation.
    EMNLP. 311-321. 2013


  1. Xu Sun, Houfeng Wang, Wenjie Li.
    Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection.
    ACL. 253-262. 2012
    [pdf][code] [bibtex] [slide]

  2. X. Sun, A. Shrivastava, P. Li.
    Fast Multi-task Learning for Query Spelling Correction.
    CIKM. 285-294. 2012

  3. X. Sun, A. Shrivastava, P. Li.
    Query Spelling Correction Using Multi-task Learning.
    International Conference on World Wide Web (WWW). Poster. 613-614. 2012


  1. X. Sun, H. Kashima, R. Tomioka, N. Ueda, P. Li.
    Online Multi-Task Learning for Personalized Activity Recognition.
    International Conference on Data Mining (ICDM). 1218-1223. 2011

  2. X. Sun, H. Kashima, R. Tomioka, N. Ueda.
    Large Scale Real-life Action Recognition Using Conditional Random Fields with Stochastic Training.
    PAKDD. 222-233. 2011


  1. X. Sun, H. Kashima, T. Matsuzaki and N. Ueda.
    Averaged Stochastic Gradient Descent with Feedback: An Accurate, Robust, and Fast Training Method.
    International Conference on Data Mining (ICDM). 1067-1072. 2010

  2. Xu Sun, Jianfeng Gao, Daniel Micol, Chris Quirk.
    Learning Phrase-Based Spelling Error Models from Clickthrough Data.
    ACL. 266-274. 2010

  3. J. Gao, X. Li, D. Micol, C. Quirk, X. Sun.
    A Large Scale Ranker-Based System for Search Query Spelling Correction.
    COLING. 358-366. 2010


  1. Xu Sun, Naoaki Okazaki, Junichi Tsujii.
    Robust Approach to Abbreviating Terms: A Discriminative Latent Variable Model with Global Information.
    ACL. 905-913. 2009

  2. Xu Sun, Takuya Matsuzaki, Daisuke Okanohara, Junichi Tsujii.
    Latent Variable Perceptron Algorithm for Structured Classification.
    IJCAI. 1236-1242. 2009
    [pdf][code] [bibtex] [slide]

  3. X. Sun, Y. Zhang, T. Matsuzaki, Y. Tsuruoka, J. Tsujii.
    A Discriminative Latent Variable Chinese Segmenter with Hybrid Word/Character Information.
    NAACL. 56-64. 2009
    [pdf] [bibtex]

  4. X. Sun, J. Tsujii.
    Sequential Labeling with Latent Variables: An Exact Inference Algorithm and An Efficient Approximation.
    EACL. 772-780. 2009


  1. X. Sun, H. Wang, B. Wang.
    Predicting Chinese Abbreviations from Definitions: An Empirical Learning Approach Using Support Vector Regression.
    Journal of Computer Science and Technology (JCST) 23(4): 602-611. Springer. 2008.

  2. X. Sun, L. Morency, D. Okanohara, J. Tsujii.
    Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference.
    COLING. 841-848. 2008
    [pdf] [bibtex]