Jia Xu (jxu70)

Assistant Professor


  • PhD (2010) RWTH-Aachen University (Computer Science)


I develop methods for competitive machine translation systems that often push beyond the current state of the art. To achieve this, I devise general machine learning methods, study their empirical and theoretical limitations, and introduce techniques in ensemble learning and subsampling, bringing geometric tools to the study of structured prediction.

General Information

I am an assistant professor at the Stevens Institute of Technology. My current research interests are in machine learning, with a focus on highly competitive machine translation systems. Lately, I have developed an interest in, and devised techniques that explore, the underlying metric and geometric properties of machine translation systems. I publish in mainstream venues in computational linguistics and machine learning (e.g., AAAI, ICML, ACL).

Institutional Service

  • Member, Hiring Committee
  • Member, working group on improving student-faculty interaction at Stevens

Professional Service

  • AAAI'19 Program Committee
  • ACL Program Committee
  • EMNLP Program Committee
  • IJCAI Program Committee
  • AAAI'20 Program Committee
  • Reviewer, Transactions on Knowledge Discovery from Data

Honors and Awards

  • WMT 2022, SIT: 1st, Code-Mixing MT subtask II, Hinglish->English
  • WMT 2022, SIT: 1st (w.r.t. WER), Code-Mixing MT subtask I, Hindi+English->Hinglish
  • WMT 2018, Hunter: 1st, French-English Biomedical track (team leader)
  • WMT 2017, Hunter: 1st (w.r.t. BLEU), Finnish-English News track (team leader)
  • NIST 2015, ICT-DCU: 1st among academic institutions, 4th overall (team leader and main contributor)
  • WMT 2011, DFKI: 1st (w.r.t. BLEU), English-German News track (team leader)
  • NIST 2008, MSR: 1st (intern at MSR)
  • NIST 2006, RWTH-Aachen: 4th
  • NIST 2005, RWTH-Aachen: 4th
  • NIST 2004, RWTH-Aachen: 2nd
  • GALE 2008, RWTH-Aachen: 2nd in NightInGale
  • GALE 2007, RWTH-Aachen: 2nd in NightInGale
  • GALE 2006, RWTH-Aachen: 2nd in NightInGale
  • TC-Star 2006, RWTH-Aachen: 1st
  • TC-Star 2005, RWTH-Aachen: 1st
  • TC-Star 2004, RWTH-Aachen: 1st

Professional Societies

  • Reviewer, IEEE journals
  • Program Committee member, ACL (2019)
  • Program Committee member, IJCAI
  • Program Committee member, AAAI
  • Program Committee member, EMNLP

Grants, Contracts and Funds

NSF CRAFT Pilot, 30,000 USD, Principal Investigator, 2022. Center for Research toward Advanced Financial Technologies (CRAFT), NSF IUCRC.

NSF grant, 299,000 USD, Co-PI, 2018-2023. IUCRC Phase I, Rutgers, Newark: Center for Accelerated Real Time Analytics (CARTA). ID: 1747728.

NSFC (NSF-China) grant, 660,000 RMB (100,000 USD), Co-PI, 2017-2019. Key Problems for Tightly-coupled, Multi-signal Fusion based Simultaneously Locating and Mapping. ID: 61672524.

ICT-CAS grant (Innovation subjects), 500,000 RMB (83,000 USD), Principal Investigator, 2015-2017. Ensemble Learning in Machine Translation. ID: 20156020.

KLIIP-ICT-CAS grant, 200,000 RMB (33,000 USD), Principal Investigator, 2015-2016. Novel Machine Learning Methods. ID: 20156020.

NSFC grant, 660,000 RMB (100,000 USD), Co-PI, 2014-2017. New Approaches to the Limits of Efficient Propositional Reasoning: Algorithms, Approximations and Foundations. ID: 20131351464.

IIIS-Tsinghua grant, 150,000 RMB (25,000 USD), Principal Investigator, 2012-2015. Machine Learning and Machine Translation.

Patents and Inventions

Unsupervised Chinese Word Segmentation for Statistical Machine Translation
Jianfeng Gao, Kristina Nikolova Toutanova, and Jia Xu

Selected Publications

Book Chapter

  1. Xu, J.; Gao, J.; Toutanova, K.; Ney, H. (2011). Synchronous Learning of Chinese Word Segmentation and Word Alignment. Handbook of Natural Language Processing and Machine Translation.

Conference Proceeding

  1. Sajjad, H.; Durrani, N.; Dalvi, F.; Alam, F.; Khan, A. R.; Xu, J. (2022). Analyzing Encoded Concepts in Transformer Language Models. Proceedings of NAACL.
  2. Dalvi, F.; Khan, A.; Alam, F.; Durrani, N.; Xu, J.; Sajjad, H. (2022). Discovering Latent Concepts Learned in BERT. Proceedings of ICLR.
  3. Tang, X.; Khan, A. R.; Wang, S.; Xu, J. (2022). Learning by Interpreting. Proceedings of IJCAI.
  4. Chubarian, K.; Khan, A.; Sidiropoulos, A.; Xu, J. (2021). Grouping Words with Semantic Diversity. NAACL.
  5. Khan, A.; Xu, J. (2021). Interpreting Criminal Charge Prediction and Its Algorithmic Bias via Quantum-Inspired Complex Valued Networks. XAI Workshop at ICML.
  6. Khan, A. R.; Xu, J.; Sun, W. (2020). Coding Textual Inputs Boosts the Accuracy of Neural Networks. Proceedings of EMNLP.
  7. Lyu, W.; Huang, S.; Zhang, S.; Khan, A. R.; Sun, W.; Xu, J. (2019). CUNY-PKU Parser at SemEval-2019 Task 1: Cross-lingual Semantic Parsing with UCCA. Proceedings of SemEval 2019.
  8. Cuong, H.; Xu, J. (2018). Assessing Quality Estimation Models for Sentence-Level Prediction. COLING.
  9. Khan, A.; Panda, S.; Xu, J.; Flokas, L. (2018). Hunter NMT System for WMT'18 Biomedical Translation Task: Transfer Learning in Neural Machine Translation.
  10. Xu, J.; Kuang, Y.; Baijoo, S.; Lee, J.; Shahazad, U.; Lancaster, M.; Carlan, C. (2017). Hunter MT: A Course for Young Researchers in WMT'17. Conference on Machine Translation at EMNLP.
  11. Lei, Z.; Ye, X.; Wang, Y.; Li, D.; Xu, J. (2017). On the Efficient Online Model Adaptation by Incremental Simplex Tableau. AAAI.
  12. Papakonstantinou, P.; Xu, J.; Yang, G. (2016). On the Power and Limits of Distance-Based Learning. ICML.
  13. Javadi, S.; Khadivi, S.; Shiri, M.; Xu, J. (2014). An Ant Colony Optimization Method to Detect Communities in Social Networks. ASONAM.
  14. Papakonstantinou, P.; Xu, J.; Cao, Z. (2014). Bagging by Design (On the Sub-optimality of Bagging). AAAI.
  15. Dong, M.; Cheng, Y.; Liu, Y.; Xu, J.; Sun, M. (2014). Query Lattice for Translation Retrieval. COLING.
  16. Gan, C.; Qin, Z.; Xu, J.; Wan, T. (2013). Salient Object Detection in Image Sequences via Spatial-Temporal Cue. Conference on Visual Communications and Image Processing.
  17. Sun, W.; Xu, J. (2011). Enhancing Chinese Word Segmentation Using Unlabeled Data. EMNLP.
  18. Xu, J.; Sun, W. (2011). Generating Virtual Parallel Corpus: A Compatibility Centric Method. Machine Translation Summit.
  19. Federmann, C.; Eisele, A.; Chen, Y.; Hunsicker, S.; Xu, J.; Uszkoreit, H. (2010). Further Experiments with Shallow Hybrid MT Systems. ACL Workshop on Statistical Machine Translation.
  20. Eisele, A.; Xu, J. (2010). Improving Machine Translation Performance Using Comparable Corpora. LREC.
  21. Xu, J.; Gao, J.; Toutanova, K.; Ney, H. (2008). Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation. COLING.
  22. Deng, Y.; Xu, J.; Gao, Y. (2008). Phrase Table Training for Precision and Recall: What Makes a Good Phrase and a Good Phrase Pair? ACL.
  23. Xu, J.; Deng, Y.; Gao, Y.; Ney, H. (2007). Domain Dependent Machine Translation. Machine Translation Summit.
  24. Vilar, D.; Xu, J.; D'Haro, L.; Ney, H. (2006). Error Analysis of Statistical Machine Translation Output. LREC.
  25. Xu, J.; Zens, R.; Ney, H. (2006). Partitioning Parallel Documents Using Binary Segmentation. Workshop on Statistical Machine Translation at NAACL.
  26. Xu, J.; Matusov, E.; Zens, R.; Ney, H. (2005). Integrated Chinese Word Segmentation in Statistical Machine Translation. IWSLT.
  27. Zens, R.; Bender, O.; Hasan, S.; Khadivi, S.; Matusov, E.; Xu, J.; Zhang, Y.; Ney, H. (2005). The RWTH-Aachen Phrase-based Statistical Machine Translation System. IWSLT.
  28. Xu, J.; Zens, R.; Ney, H. (2004). Do We Need Chinese Word Segmentation for Statistical Machine Translation?. SIGHAN Workshop on Chinese Language Processing at ACL.

Journal Article

  1. Khan, A.; Karim, A.; Sajjad, H.; Kamiran, F.; Xu, J. (2020). A Clustering Framework for Lexical Normalization of Roman Urdu. Journal of Natural Language Engineering (Impact factor: 1.065 in 2016). Cambridge Press.
  2. Sun, T.; Wang, Y.; Li, D.; Gu, Z. WCS: Robust Network Localization by Weighted Component Stitching. IEEE/ACM Transactions on Networking.

