Publications

Andy Yang and David Chiang. Counting like transformers: compiling temporal counting logic into softmax transformers. 2024. arXiv:2404.04393. PDF BibTeX
Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, and Ashish Sabharwal. Transformers as transducers. 2024. arXiv:2404.02040. PDF BibTeX
Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, and Antonios Anastasopoulos. DIALECTBENCH: a NLP benchmark for dialects, varieties, and closely-related languages. 2024. arXiv:2403.11009. PDF BibTeX
Stephen Bothwell, Brian DuSell, David Chiang, and Brian Krostenko. PILA: a historical-linguistic dataset of Proto-Italic and Latin. In Proc. LREC-COLING. 2024. To appear. BibTeX
Chihiro Taguchi, Jefferson Saransig, Dayana Velásquez, and David Chiang. KILLKAN: the automatic speech recognition dataset for Kichwa with morphosyntactic information. In Proc. LREC-COLING. 2024. To appear. BibTeX
Dana Angluin, David Chiang, and Andy Yang. Masked hard-attention transformers and Boolean RASP recognize exactly the star-free languages. 2023. arXiv:2310.13897. PDF BibTeX
Lena Strobl, William Merrill, Gail Weiss, David Chiang, and Dana Angluin. What formal languages can transformers express? A survey. Transactions of the Association for Computational Linguistics, 2024. To appear. PDF BibTeX
Brian DuSell and David Chiang. Stack attention: improving the ability of transformers to model hierarchical patterns. In Proc. ICLR. 2024. PDF BibTeX
Stephen Bothwell, Justin DeBenedetto, Theresa Crnkovich, Hildegund Müller, and David Chiang. Introducing rhetorical parallelism detection: a new task with datasets, metrics, and baselines. In Proc. EMNLP, 5007–5039. 2023. doi:10.18653/v1/2023.emnlp-main.305. PDF BibTeX
Alexandra Butoi, Tim Vieira, Ryan Cotterell, and David Chiang. Efficient algorithms for recognizing weighted tree-adjoining languages. In Proc. EMNLP. 2023. PDF BibTeX
Aarohi Srivastava and David Chiang. BERTwich: extending BERT's capabilities to model dialectal and noisy text. In Findings of ACL: EMNLP. 2023. PDF BibTeX
Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, and David Chiang. Universal automatic phonetic transcription into the International Phonetic Alphabet. In Proc. INTERSPEECH. 2023. doi:10.21437/Interspeech.2023-2584. PDF BibTeX
Alexandra Butoi, Ryan Cotterell, and David Chiang. Convergence and diversity in the control hierarchy. In Proc. ACL. 2023. PDF BibTeX
David Chiang, Peter Cholak, and Anand Pillay. Tighter bounds on the expressivity of transformer encoders. In Proc. ICML, 5544–5562. 2023. PDF BibTeX
Aarohi Srivastava and David Chiang. Fine-tuning BERT with character-level noise for zero-shot transfer to dialects and closely-related languages. In Proc. Workshop on NLP for Similar Languages, Varieties and Dialects. 2023. PDF BibTeX
Patrick Soga and David Chiang. Bridging graph position encodings for transformers with weighted graph-walking automata. Transactions on Machine Learning Research, 2023. PDF BibTeX
Brian DuSell and David Chiang. The surprising computational power of nondeterministic stack RNNs. In Proc. ICLR. 2023. PDF BibTeX
David Chiang, Colin McDonald, and Chung-chieh Shan. Exact recursive probabilistic programming. PACMPL, 2023. doi:10.1145/3586050. PDF BibTeX
Chihiro Taguchi and David Chiang. Introducing morphology in Universal Dependencies Japanese. In Proc. Workshop on Universal Dependencies, 65–72. 2023. PDF BibTeX
David Chiang, Alexander M. Rush, and Boaz Barak. Named tensor notation. Transactions on Machine Learning Research, 2023. PDF BibTeX
Darcey Riley and David Chiang. A continuum of generation tasks for investigating length bias and degenerate repetition. In Proc. BlackboxNLP. 2022. PDF BibTeX
Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang. Algorithms for weighted pushdown automata. In Proc. EMNLP. 2022. PDF BibTeX
David Chiang and Peter Cholak. Overcoming a theoretical limitation of self-attention. In Proc. ACL. 2022. PDF BibTeX
Brian DuSell and David Chiang. Learning hierarchical structures with differentiable nondeterministic stacks. In Proc. ICLR. 2022. PDF BibTeX
Samuel Grieggs, Bingyu Shen, Greta Rauch, Pei Li, Jiaqi Ma, David Chiang, Brian Price, and Walter Scheirer. Measuring human perception to improve handwritten document transcription. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021. doi:10.1109/TPAMI.2021.3092688. DOI BibTeX
Toan Q. Nguyen, Kenton Murray, and David Chiang. Data augmentation by concatenation for low-resource translation: a mystery and a solution. In Proc. Conference on Spoken Language Translation. 2021. PDF BibTeX
Colin McDonald and David Chiang. Syntax-based attention masking for neural machine translation. In Proc. NAACL Student Research Workshop. 2021. PDF BibTeX
David Chiang and Chung-chieh Shan. Translating recursive probabilistic programs to factor graph grammars. 2020. Presented at PROBPROG 2020. PDF BibTeX
David Chiang and Darcey Riley. Factor graph grammars. In Proc. NeurIPS, 6648–6658. 2020. PDF BibTeX
Brian DuSell and David Chiang. Learning context-free languages with nondeterministic stack RNNs. In Proc. CoNLL, 507–519. 2020. PDF BibTeX
Justin DeBenedetto and David Chiang. Representing unordered data using complex-weighted multiset automata. In Proc. ICML, 2412–2420. 2020. PDF BibTeX
Kenton Murray, Jeffery Kinnison, Toan Q. Nguyen, Walter Scheirer, and David Chiang. Auto-sizing the Transformer network: improving speed, efficiency, and performance for low-resource machine translation. In Proc. Workshop on Neural Generation and Translation, 231–240. 2019. PDF BibTeX
Kenton Murray, Brian DuSell, and David Chiang. Efficiency through auto-sizing: Notre Dame NLP's submission to the WNGT 2019 efficiency task. In Proc. Workshop on Neural Generation and Translation, 297–301. 2019. doi:10.18653/v1/D19-5634. PDF BibTeX
Arturo Argueta and David Chiang. Accelerating sparse matrix operations in neural networks on graphics processing units. In Proc. ACL, 6215–6224. 2019. PDF BibTeX
Antonios Anastasopoulos, Alison Lui, Toan Q. Nguyen, and David Chiang. Neural machine translation of text from non-native speakers. In Proc. NAACL: HLT, volume 1, 3070–3080. 2019. PDF BibTeX
Kenton Murray and David Chiang. Correcting length bias in neural machine translation. In Proc. WMT, 212–223. 2018. PDF BibTeX
Xinyi Wang, Salvador Aguinaga, Tim Weninger, and David Chiang. Growing better graphs with latent-variable probabilistic graph grammars. In Proc. Workshop on Mining and Learning with Grammars. 2018. PDF BibTeX
Antonios Anastasopoulos, Marika Lekakou, Josep Quer, Eleni Zimianiti, Justin DeBenedetto, and David Chiang. Part-of-speech tagging on an endangered language: a parallel Griko-Italian resource. In Proc. COLING, 2529–2539. 2018. PDF BibTeX
Arturo Argueta and David Chiang. Composing finite state transducers on GPUs. In Proc. ACL, 2697–2705. 2018. PDF BibTeX
Justin DeBenedetto and David Chiang. Algorithms and training for weighted multiset automata and regular expressions. In Proc. Conference on Implementation and Applications of Automata, 146–158. 2018. PDF BibTeX
Antonios Anastasopoulos and David Chiang. Leveraging translations for speech transcription in low-resource settings. In Proc. INTERSPEECH. 2018. PDF BibTeX
Corey Pennycuff, Satyaki Sikdar, Catalina Vajiac, David Chiang, and Tim Weninger. Synchronous hyperedge replacement graph grammars. In Proc. Conference on Graph Transformations. 2018. BibTeX
Antonios Anastasopoulos and David Chiang. Tied multitask learning for neural speech translation. In Proc. NAACL: HLT, volume 1, 82–91. 2018. PDF BibTeX
Toan Nguyen and David Chiang. Improving lexical choice in neural machine translation. In Proc. NAACL: HLT, volume 1, 334–343. 2018. PDF BibTeX
Huadong Chen, Shujian Huang, David Chiang, Xinyu Dai, and Jiajun Chen. Combining character and word information in neural machine translation using a multi-level attention. In Proc. NAACL: HLT, volume 1, 1284–1293. 2018. PDF BibTeX
Salvador Aguinaga, David Chiang, and Tim Weninger. Learning hyperedge replacement grammars for graph generation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(3):625–638, 2019. doi:10.1109/TPAMI.2018.2810877. PDF BibTeX
David Chiang, Frank Drewes, Daniel Gildea, Adam Lopez, and Giorgio Satta. Weighted DAG automata for semantic graphs. Computational Linguistics, 44(1):119–186, 2018. PDF BibTeX
Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, and Pengcheng Yin. DyNet: the dynamic neural network toolkit. 2017. arXiv:1701.03980. PDF BibTeX
Toan Q. Nguyen and David Chiang. Transfer learning across low-resource, related languages for neural machine translation. In Proc. IJCNLP, volume 2, 296–301. 2017. PDF BibTeX
Huadong Chen, Shujian Huang, David Chiang, Xin-Yu Dai, and Jiajun Chen. Top-rank enhanced listwise optimization for statistical machine translation. In Proc. CoNLL, 90–99. 2017. PDF BibTeX
Antonios Anastasopoulos, Sameer Bansal, David Chiang, Sharon Goldwater, and Adam Lopez. Spoken term discovery for language documentation using translations. In Proc. Workshop on Speech-Centric NLP, 53–58. 2017. PDF BibTeX
Antonios Anastasopoulos and David Chiang. A case study on using speech-to-translation alignments for language documentation. In Proc. Workshop on Use of Computational Methods in Study of Endangered Languages, 170–178. 2017. PDF BibTeX
Huadong Chen, Shujian Huang, David Chiang, and Jiajun Chen. Improved neural machine translation with a syntax-aware encoder and decoder. In Proc. ACL, volume 1, 1936–1945. 2017. PDF BibTeX
Arturo Argueta and David Chiang. Decoding with finite-state transducers on GPUs. In Proc. EACL, volume 1, 1044–1052. 2017. PDF BibTeX
Ulf Hermjakob, Qiang Li, Daniel Marcu, Jonathan May, Sebastian J. Mielke, Nima Pourdamghani, Michael Pust, Xing Shi, Kevin Knight, Tomer Levinboim, Kenton Murray, David Chiang, Boliang Zhang, Xiaoman Pan, Di Lu, Ying Lin, and Heng Ji. Incident-driven machine translation and name tagging for low-resource languages. Machine Translation, 32(1–2):59–89, 2018. doi:10.1007/s10590-017-9207-1. DOI BibTeX
Antonios Anastasopoulos, David Chiang, and Long Duong. An unsupervised probability model for speech-to-translation alignment of low-resource languages. In Proc. EMNLP, 1255–1263. 2016. PDF BibTeX
Salvador Aguiñaga, Rodrigo Palacios, David Chiang, and Tim Weninger. Growing graphs from hyperedge replacement graph grammars. In Proc. CIKM, 469–478. 2016. doi:10.1145/2983323.2983826. DOI BibTeX
Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, and Trevor Cohn. An attentional model for speech translation without transcription. In Proc. NAACL: HLT, 949–959. 2016. PDF BibTeX
Tomer Levinboim and David Chiang. Supervised phrase table triangulation with neural word embeddings for low-resource languages. In Proc. EMNLP, 1079–1083. 2015. PDF BibTeX
Tomer Levinboim and David Chiang. Multi-task word alignment triangulation for low-resource languages. In Proc. NAACL: HLT, 1221–1226. 2015. PDF BibTeX
Kenton Murray and David Chiang. Auto-sizing neural networks: with applications to \(n\)-gram language models. In Proc. EMNLP, 908–916. 2015. PDF BibTeX
Tomer Levinboim, Ashish Vaswani, and David Chiang. Model invertibility regularization: sequence alignment with or without parallel data. In Proc. NAACL: HLT, 609–618. 2015. PDF Code BibTeX
Steven Bird, David Chiang, Friedel Frowein, Florian Hanke, and Ashish Vaswani. Documentary linguistics and computational linguistics: a response to Brooks. Language Documentation and Conservation, 9:10–11, 2015. BibTeX
Theerawat Songyot and David Chiang. Improving word alignment using word similarity. In Proc. EMNLP, 1840–1845. 2014. PDF BibTeX
Hui Zhang and David Chiang. Kneser-Ney smoothing on expected counts. In Proc. ACL, volume 1, 765–774. 2014. PDF BibTeX
Ashish Vaswani, Yinggong Zhao, Victoria Fossum, and David Chiang. Decoding with large-scale neural language models improves translation. In Proc. EMNLP, 1387–1392. 2013. PDF BibTeX
David Chiang, Jacob Andreas, Daniel Bauer, Karl Moritz Hermann, Bevan Jones, and Kevin Knight. Parsing graphs with hyperedge replacement grammars. In Proc. ACL, volume 1, 924–932. 2013. PDF BibTeX
Yuval Marton, David Chiang, and Philip Resnik. Soft syntactic constraints for Arabic-English hierarchical phrase-based translation. Machine Translation, 26(1–2):137–157, 2012. doi:10.1007/s10590-011-9111-z. DOI BibTeX
Steven Bird, David Chiang, Friedel Frowein, Andrea L. Berez, Mark Eby, Florian Hanke, Ryan Shelby, Ashish Vaswani, and Ada Wan. The International Workshop on Language Preservation: an experiment in text collection and language technology. Language Documentation and Conservation, pages 155–167, 2013. PDF BibTeX
Steven Bird and David Chiang. Machine translation for language preservation. In Proc. COLING, 125–134. 2012. PDF BibTeX
David Chiang. Hope and fear for discriminative training of statistical translation models. J. Machine Learning Research, 13:1159–1187, 2012. A few typos corrected, in particular in the definition of the loss function. PDF BibTeX
Ashish Vaswani, Liang Huang, and David Chiang. Smaller alignment models for better translations: unsupervised word alignment with the \(\ell _0\)-norm. In Proc. ACL, volume 1, 311–319. 2012. PDF BibTeX
David Chiang. Grammars for Language and Genes: Theoretical and Empirical Investigations. Theory and Applications of Natural Language Processing. Springer, 2012. BibTeX
Hui Zhang and David Chiang. An exploration of forest-to-string translation: does translation help or hurt parsing? In Proc. ACL, volume 2, 317–321. 2012. PDF BibTeX
David Chiang. An introduction to synchronous grammars. 2006. Notes from a tutorial given at ACL 2006. PDF BibTeX
Ashish Vaswani, Haitao Mi, Liang Huang, and David Chiang. Rule Markov models for fast tree-to-string translation. In Proc. ACL: HLT, 856–864. 2011. PDF BibTeX
Dirk Hovy, Ashish Vaswani, Stephen Tratz, David Chiang, and Eduard Hovy. Models and training for unsupervised preposition sense disambiguation. In Proc. ACL: HLT, 323–328. 2011. PDF BibTeX
David Chiang, Steve DeNeefe, and Michael Pust. Two easy improvements to lexical weighting. In Proc. ACL: HLT, 455–460. 2011. PDF BibTeX
Shu Cai, David Chiang, and Yoav Goldberg. Language-independent parsing with empty elements. In Proc. ACL: HLT, 212–216. 2011. PDF BibTeX
Ashish Vaswani, Adam Pauls, and David Chiang. Efficient optimization of an MDL-inspired objective function for unsupervised part-of-speech tagging. In Proc. ACL, 209–214. 2010. PDF BibTeX
David Chiang. Learning to translate with source and target syntax. In Proc. ACL, 1443–1452. 2010. PDF BibTeX
David Chiang, Jonathan Graehl, Kevin Knight, Adam Pauls, and Sujith Ravi. Bayesian inference for finite-state transducers. In Proc. HLT: NAACL, 447–455. 2010. PDF BibTeX
Adam Pauls, Dan Klein, David Chiang, and Kevin Knight. Unsupervised syntactic alignment with inversion transduction grammars. In Proc. HLT: NAACL, 118–126. 2010. PDF BibTeX
Sujith Ravi, Ashish Vaswani, Kevin Knight, and David Chiang. Fast, greedy model minimization for unsupervised tagging. In Proc. COLING, 940–948. 2010. PDF BibTeX
John DeNero, David Chiang, and Kevin Knight. Fast consensus decoding over translation forests. In Proc. ACL-IJCNLP, 567–575. 2009. PDF BibTeX
David Chiang, Kevin Knight, and Wei Wang. 11,001 new features for statistical machine translation. In Proc. HLT: NAACL, 218–226. 2009. Best paper award. PDF BibTeX
David Chiang, Steve DeNeefe, Yee Seng Chan, and Hwee Tou Ng. Decomposability of translation metrics for improved evaluation and efficient algorithms. In Proc. EMNLP, 610–619. 2008. PDF BibTeX
David Chiang, Yuval Marton, and Philip Resnik. Online large-margin training of syntactic and structural translation features. In Proc. EMNLP, 224–233. 2008. PDF BibTeX
Hao Zhang, Daniel Gildea, and David Chiang. Extracting synchronous grammar rules from word-level alignments in linear time. In Proc. COLING, 1081–1088. 2008. PDF BibTeX
David Chiang and Tatjana Scheffler. Flexible composition and delayed tree-locality. In Proc. TAG+, 17–24. 2008. PDF BibTeX
Liang Huang and David Chiang. Forest rescoring: faster decoding with integrated language models. In Proc. ACL, 144–151. 2007. PDF BibTeX
Yee Seng Chan, Hwee Tou Ng, and David Chiang. Word sense disambiguation improves statistical machine translation. In Proc. ACL, 33–40. 2007. PDF BibTeX
Ken A. Dill, Adam Lucas, Julia Hockenmaier, Liang Huang, David Chiang, and Aravind K. Joshi. Computational linguistics: a new tool for exploring biopolymer structures and statistical mechanics. Polymer, 48(15):4289–4300, 2007. doi:10.1016/j.polymer.2007.05.018. DOI BibTeX
David Chiang. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228, 2007. doi:10.1162/coli.2007.33.2.201. DOI BibTeX
David Chiang and Owen Rambow. The hidden TAG model: synchronous grammars for parsing resource-poor languages. In Proc. TAG+, 1–8. 2006. PDF BibTeX
David Chiang, Mona Diab, Nizar Habash, Owen Rambow, and Safiullah Shareef. Parsing Arabic dialects. In Proc. EACL. 2006. PDF BibTeX
David Chiang. The weak generative capacity of linear tree-adjoining grammars. In Proc. TAG+, 25–32. 2006. PDF BibTeX
David Chiang, Aravind K. Joshi, and Ken A. Dill. A grammatical theory for the conformational changes of simple helix bundles. J. Computational Biology, 13(1):21–42, 2006. doi:10.1089/cmb.2006.13.21. DOI BibTeX
David Chiang, Aravind K. Joshi, and David B. Searls. Grammatical representations of macromolecular structure. J. Computational Biology, 13(5):1077–1100, 2006. doi:10.1089/cmb.2006.13.1077. DOI BibTeX
David Chiang, Adam Lopez, Nitin Madnani, Christof Monz, Philip Resnik, and Michael Subotin. The Hiero machine translation system: extensions, evaluation, and analysis. In Proc. HLT-EMNLP, 779–786. 2005. PDF BibTeX
Liang Huang and David Chiang. Better \(k\)-best parsing. In Proc. IWPT, 53–64. 2005. PDF BibTeX
David Chiang. A hierarchical phrase-based model for statistical machine translation. In Proc. ACL, 263–270. 2005. doi:10.3115/1219840.1219873. Best paper award. PDF BibTeX
David Chiang. Evaluation of Grammar Formalisms for Applications to Natural Language Processing and Biological Sequence Analysis. PhD thesis, University of Pennsylvania, 2004. Rubinoff Award. PDF BibTeX
Mark Dras, David Chiang, and William Schuler. On relations of constituency and dependency grammars. Research on Language and Computation, 2:281–305, 2004. BibTeX
David Chiang. Uses and abuses of intersected languages. In Proc. TAG+, 9–15. 2004. PDF BibTeX
David Chiang. Statistical parsing with an automatically extracted tree adjoining grammar. In Rens Bod, Remko Scha, and Khalil Sima'an, editors, Data Oriented Parsing, pages 299–316. CSLI Publications, Stanford, 2003. PDF BibTeX
David Chiang. Mildly context sensitive grammars for estimating maximum entropy models. In Gerald Penn, editor, Proc. Conference on Formal Grammar. 2003. PDF BibTeX
David Chiang. Putting some weakly context-free formalisms in order. In Proc. TAG+, 11–18. 2002. PDF BibTeX
David Chiang and Aravind K. Joshi. Formal grammars for estimating partition functions of double-stranded chain molecules. In Proc. HLT, 63–67. 2002. BibTeX
David Chiang and Daniel M. Bikel. Recovering latent information in treebanks. In Proc. COLING. 2002. PDF BibTeX
David Chiang. Constraints on strong generative power. In Proc. ACL, 132–139. 2001. doi:10.3115/1073012.1073030. PDF BibTeX
Fudong Chiou, David Chiang, and Martha Palmer. Facilitating treebank annotation using a statistical parser. In Proc. HLT. 2001. PDF BibTeX
David Chiang. Statistical parsing with an automatically-extracted tree adjoining grammar. In Proc. ACL, 456–463. 2000. doi:10.3115/1075218.1075276. PDF BibTeX
Daniel M. Bikel and David Chiang. Two statistical parsing models applied to the Chinese Treebank. In Proc. Chinese Language Processing Workshop, 1–6. 2000. doi:10.3115/1117769.1117771. PDF BibTeX
David Chiang, William Schuler, and Mark Dras. Some remarks on an extension of synchronous TAG. In Proc. TAG+, 61–66. 2000. PDF BibTeX
Mark Dras, David Chiang, and William Schuler. A multi-level TAG approach to dependency. In Proc. ESSLLI Workshop on Linguistic Theory and Grammar Implementation, 33–46. 2000. BibTeX
William Schuler, David Chiang, and Mark Dras. Multi-component TAG and notions of formal power. In Proc. ACL, 448–455. 2000. doi:10.3115/1075218.1075275. PDF BibTeX