
Journal News | SSCI Journal Transactions of the Association for Computational Linguistics (TACL), Volume 11, 2023

Followed by 60,000+ scholars → 语言学心得 (Linguistics Insights) · 2024-02-19



Transactions of the Association for Computational Linguistics

Volume 11, 2023

Transactions of the Association for Computational Linguistics (SSCI Q1; 2021 IF: 9.194) has published 20 papers in Volume 11 (2023) to date. The articles cover open-domain question answering, abstract syntactic representations, semantic parsing, language models for end-to-end task-oriented dialogue, cross-lingual parsing, cross-lingual dialogue dataset creation, and more. Please feel free to share! (Volume 11 continues to be updated throughout 2023.)

Previously recommended:

Journal News | SSCI Journal Transactions of the Association for Computational Linguistics (TACL), Volume 10, 2022

Table of Contents


ARTICLES

Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering, by Shamane Siriwardhana, Rivindu Weerasekera, Elliott Wen, Tharindu Kaluarachchi, Rajib Rana, and Suranga Nanayakkara, Pages 1–17.

Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement, by Bingzhi Li, Guillaume Wisniewski, and Benoît Crabbé, Pages 18–33.

On the Role of Negative Precedent in Legal Outcome Prediction, by Josef Valvoda, Ryan Cotterell, Simone Teufel, Pages 34–48.

Meta-Learning a Cross-lingual Manifold for Semantic Parsing, by Tom Sherborne and Mirella Lapata, Pages 49–67.

OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue, by Zhi Chen, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu, Kai Yu, Pages 68–84.

Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation, by Llion Jones, Richard Sproat, Haruko Ishikawa, Alexander Gutkin, Pages 85–101.

Locally Typical Sampling, by Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell, Pages 102–121.

Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization, by Thomas Effland and Michael Collins, Pages 122–138.

Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation, by Olga Majewska, Evgeniia Razumovskaia, Edoardo M. Ponti, Ivan Vulić, Anna Korhonen, Pages 139–156.

Modeling Emotion Dynamics in Song Lyrics with State Space Models, by Yingjin Song and Daniel Beck, Pages 157–175.

FeelingBlue: A Corpus for Understanding the Emotional Connotation of Color in Context, by Amith Ananthram, Olivia Winn, and Smaranda Muresan, Pages 176–190.

An Empirical Survey of Data Augmentation for Limited Data Learning in NLP, by Jiaao Chen, Derek Tam, Colin Raffel, Mohit Bansal, Diyi Yang, Pages 191–211.

Coreference Resolution through a seq2seq Transition-Based System, by Bernd Bohnet, Chris Alberti, Michael Collins, Pages 212–226.

Transformers for Tabular Data Representation: A Survey of Models and Applications, by Gilbert Badaro, Mohammed Saeed, Paolo Papotti, Pages 227–249.

Generative Spoken Dialogue Language Modeling, by Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoît Sagot, Abdelrahman Mohamed, Emmanuel Dupoux, Pages 250–266.

Discontinuous Combinatory Constituency Parsing, by Zhousi Chen and Mamoru Komachi, Pages 267–283.

Efficient Long-Text Understanding with Short-Text Models, by Maor Ivgi, Uri Shaham, Jonathan Berant, Pages 284–299.

Hate Speech Classifiers Learn Normative Social Stereotypes, by Aida Mostafazadeh Davani, Mohammad Atari, Brendan Kennedy, Morteza Dehghani, Pages 300–319.

Domain-Specific Word Embeddings with Structure Prediction, by David Lassner, Stephanie Brandl, Anne Baillot, Shinichi Nakajima, Pages 320–335.

Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?, by Byung-Doh Oh and William Schuler, Pages 336–350.

Abstracts

Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering

Shamane Siriwardhana, Augmented Human Lab, Auckland Bioengineering Institute, the University of Auckland, New Zealand

Rivindu Weerasekera, Augmented Human Lab, Auckland Bioengineering Institute, the University of Auckland, New Zealand

Elliott Wen, Augmented Human Lab, Auckland Bioengineering Institute, the University of Auckland, New Zealand

Tharindu Kaluarachchi, Augmented Human Lab, Auckland Bioengineering Institute, the University of Auckland, New Zealand

Rajib Rana, University of Southern Queensland, Australia

Suranga Nanayakkara, Department of Information Systems & Analytics, National University of Singapore, Singapore; Augmented Human Lab, Auckland Bioengineering Institute, the University of Auckland, New Zealand

Abstract Retrieval Augmented Generation (RAG) is a recent advancement in Open-Domain Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia-based external knowledge base and is not optimized for use in other specialized domains such as healthcare and news. In this paper, we evaluate the impact of joint training of the retriever and generator components of RAG for the task of domain adaptation in ODQA. We propose RAG-end2end, an extension to RAG that can adapt to a domain-specific knowledge base by updating all components of the external knowledge base during training. In addition, we introduce an auxiliary training signal to inject more domain-specific knowledge. This auxiliary signal forces RAG-end2end to reconstruct a given sentence by accessing the relevant information from the external knowledge base. Our novel contribution is that, unlike RAG, RAG-end2end does joint training of the retriever and generator for the end QA task and domain adaptation. We evaluate our approach with datasets from three domains: COVID-19, News, and Conversations, and achieve significant performance improvements compared to the original RAG model. Our work has been open-sourced through the HuggingFace Transformers library, attesting to our work’s credibility and technical consistency.
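
To make the joint objective described above concrete, here is a minimal, hypothetical sketch of one training step that combines the end-to-end QA loss with the auxiliary statement-reconstruction signal. The function names (retrieve, generate_loss, reconstruct_loss), the batch fields, and the weighting alpha are placeholders for illustration, not the authors' released HuggingFace implementation.

```python
# Hypothetical sketch only: one RAG-end2end-style training step that mixes the QA
# objective with the auxiliary reconstruction signal described in the abstract.
# `retrieve`, `generate_loss`, and `reconstruct_loss` are placeholder callables.
def rag_end2end_step(batch, retrieve, generate_loss, reconstruct_loss, alpha=0.5):
    # Retrieve supporting passages from the (periodically re-encoded) domain knowledge base.
    qa_docs = retrieve(batch["question"])
    qa_loss = generate_loss(batch["question"], qa_docs, batch["answer"])

    # Auxiliary signal: reconstruct a domain sentence from retrieved evidence,
    # pushing both the retriever and the generator toward domain knowledge.
    recon_docs = retrieve(batch["statement"])
    aux_loss = reconstruct_loss(batch["statement"], recon_docs)

    return qa_loss + alpha * aux_loss
```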


Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement

Bingzhi Li, Université Paris Cité, LLF, CNRS, 75013 Paris, France

Guillaume Wisniewski, Université Paris Cité, LLF, CNRS, 75013 Paris, France

Benoît Crabbé, Université Paris Cité, LLF, CNRS, 75013 Paris, France

Abstract Many studies have shown that transformers are able to predict subject-verb agreement, demonstrating their ability to uncover an abstract representation of the sentence in an unsupervised way. Recently, Li et al. (2021) found that transformers were also able to predict the object-past participle agreement in French, the modeling of which in formal grammar is fundamentally different from that of subject-verb agreement and relies on a movement and an anaphora resolution.

To better understand transformers’ internal working, we propose to contrast how they handle these two kinds of agreement. Using probing and counterfactual analysis methods, our experiments on French agreements show that (i) the agreement task suffers from several confounders that partially question the conclusions drawn so far and (ii) transformers handle subject-verb and object-past participle agreements in a way that is consistent with their modeling in theoretical linguistics.


On the Role of Negative Precedent in Legal Outcome Prediction

Josef Valvoda, University of Cambridge, UK

Ryan Cotterell, ETH Zurich, Switzerland

Simone Teufel, University of Cambridge, UK

Abstract Every legal case sets a precedent by developing the law in one of the following two ways. It either expands its scope, in which case it sets positive precedent, or it narrows it, in which case it sets negative precedent. Legal outcome prediction, the prediction of positive outcome, is an increasingly popular task in AI. In contrast, we turn our focus to negative outcomes here, and introduce a new task of negative outcome prediction. We discover an asymmetry in existing models’ ability to predict positive and negative outcomes. Where the state-of-the-art outcome prediction model we used predicts positive outcomes at 75.06 F1, it predicts negative outcomes at only 10.09 F1, worse than a random baseline. To address this performance gap, we develop two new models inspired by the dynamics of a court process. Our first model significantly improves positive outcome prediction score to 77.15 F1 and our second model more than doubles the negative outcome prediction performance to 24.01 F1. Despite this improvement, shifting focus to negative outcomes reveals that there is still much room for improvement for outcome prediction models.


Meta-Learning a Cross-lingual Manifold for Semantic Parsing

Tom Sherborne and Mirella Lapata

Institute for Language, Cognition and Computation

School of Informatics, University of Edinburgh

10 Crichton Street, Edinburgh EH8 9AB, UK

Abstract Localizing a semantic parser to support new languages requires effective cross-lingual generalization. Recent work has found success with machine-translation or zero-shot methods, although these approaches can struggle to model how native speakers ask questions. We consider how to effectively leverage minimal annotated examples in new languages for few-shot cross-lingual semantic parsing. We introduce a first-order meta-learning algorithm to train a semantic parser with maximal sample efficiency during cross-lingual transfer. Our algorithm uses high-resource languages to train the parser and simultaneously optimizes for cross-lingual generalization to lower-resource languages. Results across six languages on ATIS demonstrate that our combination of generalization steps yields accurate semantic parsers sampling ≤10% of source training data in each new language. Our approach also trains a competitive model on Spider using English with generalization to Chinese similarly sampling ≤10% of training data.


OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue

Zhi Chen, X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; State Key Lab of Media Convergence Production Technology and Systems, Beijing, China

Yuncong Liu, X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; State Key Lab of Media Convergence Production Technology and Systems, Beijing, China

Lu Chen, X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; State Key Lab of Media Convergence Production Technology and Systems, Beijing, China

Su Zhu, AISpeech Co., Ltd., Suzhou, China

Mengyue Wu, X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; State Key Lab of Media Convergence Production Technology and Systems, Beijing, China

Kai Yu, X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; State Key Lab of Media Convergence Production Technology and Systems, Beijing, China

Abstract This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD). Unlike chit-chat dialogue models, task-oriented dialogue models fulfill at least two task-specific modules: dialogue state tracker (DST) and response generator (RG). The dialogue state consists of the domain-slot-value triples, which are regarded as the user’s constraints to search the domain-related databases. Large-scale task-oriented dialogue data with annotated structured dialogue states are usually inaccessible, which prevents the development of pretrained language models for task-oriented dialogue. We propose a simple yet effective pretraining method to alleviate this problem, which consists of two pretraining phases. The first phase is to pretrain on large-scale contextual text data, where the structured information of the text is extracted by an information extraction tool. To bridge the gap between the pretraining method and downstream tasks, we design two pretraining tasks: ontology-like triple recovery and next-text generation, which simulate the DST and RG, respectively. The second phase is to fine-tune the pretrained model on the TOD data. The experimental results show that our proposed method achieves an exciting boost and obtains competitive performance even without any TOD data on the CamRest676 and MultiWOZ benchmarks.


Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation

Llion Jones, Google Japan

Richard Sproat, Google Japan

Haruko Ishikawa, Google Japan

Alexander Gutkin, Google UK

Abstract If one sees the place name Houston Mercer Dog Run in New York, how does one know how to pronounce it? Assuming one knows that Houston in New York is pronounced /ˈhaʊstən/ and not /ˈhjuːstən/ like the Texas city, then one can probably guess that /ˈhaʊstən/ is also used in the name of the dog park. We present a novel architecture that learns to use the pronunciations of neighboring names in order to guess the pronunciation of a given target feature. Applied to Japanese place names, we demonstrate the utility of the model for finding and proposing corrections for errors in Google Maps. To demonstrate the utility of this approach to structurally similar problems, we also report on an application to a totally different task: Cognate reflex prediction in comparative historical linguistics. A version of the code has been open-sourced.

Locally Typical Sampling

Clara Meister, ETH Zurich, Switzerland

Tiago Pimentel, University of Cambridge, UK

Gian Wiher, ETH Zurich, Switzerland

Ryan Cotterell, ETH Zurich, Switzerland; University of Cambridge, UK

Abstract Today’s probabilistic language generators fall short when it comes to producing coherent and fluent text despite the fact that the underlying models perform well under standard metrics (e.g., perplexity). This discrepancy has puzzled the language generation community for the last few years. In this work, we posit that the abstraction of natural language generation as a discrete stochastic process—which allows for an information-theoretic analysis—can provide new insights into the behavior of probabilistic language generators, for example, why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, aiming to do so in a simultaneously efficient and error-minimizing manner; in fact, psycholinguistics research suggests humans choose each word in a string with this subconscious goal in mind. We formally define the set of strings that meet this criterion: Those for which each word has an information content close to the expected information content, namely, the conditional entropy of our model. We then propose a simple and efficient procedure for enforcing this criterion when generating from probabilistic models, which we call locally typical sampling. Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, locally typical sampling offers competitive performance (in both abstractive summarization and story generation) in terms of quality while consistently reducing degenerate repetitions.
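
As a rough illustration of the procedure described above, the sketch below performs one decoding step of locally typical sampling: it keeps the smallest set of tokens whose information content is closest to the conditional entropy of the next-token distribution and samples from that set. The threshold tau and the other parameter values are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of locally typical sampling for a single decoding step, assuming
# `probs` is the model's next-token distribution as a 1-D numpy array.
import numpy as np

def locally_typical_sample(probs, tau=0.95, rng=None):
    rng = rng or np.random.default_rng()
    probs = probs / probs.sum()
    surprisal = -np.log(probs + 1e-12)          # information content of each token
    entropy = np.sum(probs * surprisal)         # conditional entropy of the distribution
    deviation = np.abs(surprisal - entropy)     # distance from the "typical" information content
    order = np.argsort(deviation)               # tokens closest to the entropy come first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, tau) + 1
    kept = order[:cutoff]                       # smallest locally typical set with mass >= tau
    kept_probs = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=kept_probs))  # sampled token id
```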


Improving Low-Resource Cross-lingual Parsing

with Expected Statistic Regularization

Thomas Effland, Columbia University, USA

Michael Collins, Google Research, USA

Abstract We present Expected Statistic Regularization (ESR), a novel regularization technique that utilizes low-order multi-task structural statistics to shape model distributions for semi-supervised learning on low-resource datasets. We study ESR in the context of cross-lingual transfer for syntactic analysis (POS tagging and labeled dependency parsing) and present several classes of low-order statistic functions that bear on model behavior. Experimentally, we evaluate the proposed statistics with ESR for unsupervised transfer on 5 diverse target languages and show that all statistics, when estimated accurately, yield improvements to both POS and LAS, with the best statistic improving POS by +7.0 and LAS by +8.5 on average. We also present semi-supervised transfer and learning curve experiments that show ESR provides significant gains over strong cross-lingual-transfer-plus-fine-tuning baselines for modest amounts of labeled data. These results indicate that ESR is a promising and complementary approach to model transfer approaches for cross-lingual parsing.
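
The central idea, as the abstract describes it, is to penalize the gap between a statistic's expected value under the model and a target value estimated elsewhere (for example, from related high-resource treebanks). The following sketch shows only that penalty shape under simplifying assumptions; the actual method works with differentiable expectations over model distributions, which is not reproduced here, and the statistic name in the example is hypothetical.

```python
# Hypothetical sketch: a squared-error penalty between model-expected statistics and
# target values, standing in for the expected statistic regularizer described above.
def esr_penalty(expected_stats, target_stats, weight=1.0):
    """Both arguments are dicts mapping statistic names to scalar values."""
    return weight * sum(
        (expected_stats[name] - target_stats[name]) ** 2
        for name in target_stats
    )

# Example with a made-up statistic name:
# esr_penalty({"punct_ratio": 0.18}, {"punct_ratio": 0.12}, weight=10.0)
```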


Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation

Olga Majewska, Language Technology Lab, University of Cambridge, United Kingdom

Evgeniia Razumovskaia, Language Technology Lab, University of Cambridge, United Kingdom

Edoardo M. Ponti, Institute for Language, Cognition and Computation, University of Edinburgh, United Kingdom; Language Technology Lab, University of Cambridge, United Kingdom

Ivan Vulić, Language Technology Lab, University of Cambridge, United Kingdom

Anna Korhonen, Language Technology Lab, University of Cambridge, United Kingdom

Abstract Multilingual task-oriented dialogue (ToD) facilitates access to services and information for many (communities of) speakers. Nevertheless, its potential is not fully realized, as current multilingual ToD datasets—both for modular and end-to-end modeling—suffer from severe limitations. 1) When created from scratch, they are usually small in scale and fail to cover many possible dialogue flows. 2) Translation-based ToD datasets might lack naturalness and cultural specificity in the target language. In this work, to tackle these limitations we propose a novel outline-based annotation process for multilingual ToD datasets, where domain-specific abstract schemata of dialogue are mapped into natural language outlines. These in turn guide the target language annotators in writing dialogues by providing instructions about each turn’s intents and slots. Through this process we annotate a new large-scale dataset for evaluation of multilingual and cross-lingual ToD systems. Our Cross-lingual Outline-based Dialogue dataset (COD) enables natural language understanding, dialogue state tracking, and end-to-end dialogue evaluation in 4 diverse languages: Arabic, Indonesian, Russian, and Kiswahili. Qualitative and quantitative analyses of COD versus an equivalent translation-based dataset demonstrate improvements in data quality, unlocked by the outline-based approach. Finally, we benchmark a series of state-of-the-art systems for cross-lingual ToD, setting reference scores for future work and demonstrating that COD prevents over-inflated performance, typically met with prior translation-based ToD datasets.


Modeling Emotion Dynamics in Song Lyrics with State Space Models

Yingjin Song, Department of Information and Computing Sciences, Utrecht University, Netherlands

Daniel Beck, School of Computing and Information Systems, University of Melbourne, Australia

Abstract Most previous work in music emotion recognition assumes a single or a few song-level labels for the whole song. While it is known that different emotions can vary in intensity within a song, annotated data for this setup is scarce and difficult to obtain. In this work, we propose a method to predict emotion dynamics in song lyrics without song-level supervision. We frame each song as a time series and employ a State Space Model (SSM), combining a sentence-level emotion predictor with an Expectation-Maximization (EM) procedure to generate the full emotion dynamics. Our experiments show that applying our method consistently improves the performance of sentence-level baselines without requiring any annotated songs, making it ideal for limited training data scenarios. Further analysis through case studies shows the benefits of our method while also indicating the limitations and pointing to future directions.
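
To make the state-space idea tangible, here is a minimal sketch that smooths noisy sentence-level emotion scores with a one-dimensional linear-Gaussian state space model (a Kalman filter over a random-walk latent emotion). The paper additionally estimates parameters with EM and uses its own emission model; those parts are omitted, and all parameter values below are illustrative assumptions.

```python
# Hypothetical sketch: smooth per-sentence emotion scores with a 1-D Kalman filter.
def kalman_filter_1d(observations, q=0.1, r=1.0, m0=0.0, p0=1.0):
    """observations: sentence-level emotion scores; q/r: process/observation noise variances."""
    means, variances = [], []
    m, p = m0, p0
    for y in observations:
        p = p + q                  # predict: random-walk transition of the latent emotion
        k = p / (p + r)            # Kalman gain
        m = m + k * (y - m)        # update with the noisy sentence-level score
        p = (1 - k) * p
        means.append(m)
        variances.append(p)
    return means, variances

# Example: kalman_filter_1d([0.2, 0.9, 0.1, 0.8, 0.7])
```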


FeelingBlue: A Corpus for Understanding the Emotional Connotation of Color in Context

Amith Ananthram, Department of Computer Science, Columbia University, USA

Olivia Winn, Department of Computer Science, Columbia University, USA

Smaranda Muresan, Department of Computer Science, Columbia University, USA; Data Science Institute, Columbia University, USA

Abstract While the link between color and emotion has been widely studied, how context-based changes in color impact the intensity of perceived emotions is not well understood. In this work, we present a new multimodal dataset for exploring the emotional connotation of color as mediated by line, stroke, texture, shape, and language. Our dataset, FeelingBlue, is a collection of 19,788 4-tuples of abstract art ranked by annotators according to their evoked emotions and paired with rationales for those annotations. Using this corpus, we present a baseline for a new task: Justified Affect Transformation. Given an image I, the task is to 1) recolor I to enhance a specified emotion e and 2) provide a textual justification for the change in e. Our model is an ensemble of deep neural networks which takes I, generates an emotionally transformed color palette p conditioned on I, applies p to I, and then justifies the color transformation in text via a visual-linguistic model. Experimental results shed light on the emotional connotation of color in context, demonstrating both the promise of our approach on this challenging task and the considerable potential for future investigations enabled by our corpus. 


An Empirical Survey of Data Augmentation for Limited Data Learning in NLP

Jiaao Chen, Georgia Institute of Technology, USA

Derek Tam, UNC Chapel Hill, USA

Colin Raffel, UNC Chapel Hill, USA

Mohit Bansal, UNC Chapel Hill, USA

Diyi Yang, Georgia Institute of Technology, USA

Abstract NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings. In this paper, we provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods (including token-level augmentations, sentence-level augmentations, adversarial augmentations, and hidden-space augmentations) and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks. Based on the results, we draw several conclusions to help practitioners choose appropriate augmentations in different settings and discuss the current challenges and future directions for limited data learning in NLP. 
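
As one concrete instance of the token-level augmentation family surveyed above, the sketch below applies a random word swap to a tokenized sentence; it is purely illustrative and not tied to any specific method or hyperparameters evaluated in the paper.

```python
# Illustrative token-level augmentation: randomly swap two positions in a token list.
import random

def random_swap(tokens, n_swaps=1, rng=random):
    tokens = list(tokens)
    for _ in range(n_swaps):
        if len(tokens) < 2:
            break
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

# Example: random_swap("the movie was surprisingly good".split(), n_swaps=1)
```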


Coreference Resolution through a seq2seq Transition-Based System

Bernd Bohnet, Google Research, The Netherlands

Chris Alberti,  Google Research, USA

Michael Collins, Google Research, USA

Abstract Most recent coreference resolution systems use search algorithms over possible spans to identify mentions and resolve coreference. We instead present a coreference resolution system that uses a text-to-text (seq2seq) paradigm to predict mentions and links jointly. We implement the coreference system as a transition system and use multilingual T5 as an underlying language model. We obtain state-of-the-art accuracy on the CoNLL-2012 datasets with 83.3 F1-score for English (a 2.3 higher F1-score than previous work [Dobrovolskii, 2021]) using only CoNLL data for training, 68.5 F1-score for Arabic (+4.1 higher than previous work), and 74.3 F1-score for Chinese (+5.3). In addition we use the SemEval-2010 data sets for experiments in the zero-shot setting, a few-shot setting, and a supervised setting using all available training data. We obtain substantially higher zero-shot F1-scores for 3 out of 4 languages than previous approaches and significantly exceed previous supervised state-of-the-art results for all five tested languages. We provide the code and models as open source.


Transformers for Tabular Data Representation: A Survey of Models and Applications

Gilbert Badaro, Mohammed Saeed, Paolo Papotti

EURECOM, France

Abstract In the last few years, the natural language processing community has witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in tabular data, recent research efforts extend LMs by developing neural representations for structured data. In this article, we present a survey that analyzes these efforts. We first abstract the different systems according to a traditional machine learning pipeline in terms of training data, input representation, model training, and supported downstream tasks. For each aspect, we characterize and compare the proposed solutions. Finally, we discuss future work directions. 


Generative Spoken Dialogue Language Modeling

Tu Anh Nguyen, Meta AI Research, France; Inria, Paris, France

Eugene Kharitonov, Meta AI Research

Jade Copet, Meta AI Research

Yossi Adi, Meta AI Research, Israel

Wei-Ning Hsu, Meta AI Research, United States

Ali Elkahky, Meta AI Research, United States

Paden Tomasello, Meta AI Research, United States

Robin Algayres, Meta AI Research, France

Benoît Sagot, Inria, Paris, France

Abdelrahman Mohamed, Meta AI Research, France

Emmanuel Dupoux, Meta AI Research, France; EHESS, ENS-PSL, CNRS, Paris, France

Abstract We introduce dGSLM, the first "textless" model able to generate audio samples of naturalistic spoken dialogues. It uses recent work on unsupervised spoken unit discovery coupled with a dual-tower transformer architecture with cross-attention trained on 2000 hours of two-channel raw conversational audio (Fisher dataset) without any text or labels. We show that our model is able to generate speech, laughter, and other paralinguistic signals in the two channels simultaneously and reproduces more naturalistic and fluid turn-taking compared to a text-based cascaded model.


Discontinuous Combinatory Constituency Parsing

Zhousi Chen and Mamoru Komachi

Faculty of Systems Design

Tokyo Metropolitan University

6-6 Asahigaoka, Hino, Tokyo 191-0065, Japan

Abstract We extend a pair of continuous combinator-based constituency parsers (one binary and one multi-branching) into a discontinuous pair. Our parsers iteratively compose constituent vectors from word embeddings without any grammar constraints. Their empirical complexities are subquadratic. Our extension includes 1) a swap action for the orientation-based binary model and 2) biaffine attention for the chunker-based multi-branching model. In tests conducted with the Discontinuous Penn Treebank and TIGER Treebank, we achieved state-of-the-art discontinuous accuracy with a significant speed advantage.


Efficient Long-Text Understanding with Short-Text Models

Maor Ivgi, Uri Shaham, Jonathan Berant

The Blavatnik School of Computer Science, Tel-Aviv University, Israel

Abstract Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles, and long documents due to their quadratic complexity. While a myriad of efficient transformer variants have been proposed, they are typically based on custom implementations that require expensive pretraining from scratch. In this work, we propose SLED: SLiding-Encoder and Decoder, a simple approach for processing long sequences that re-uses and leverages battle-tested short-text pretrained LMs. Specifically, we partition the input into overlapping chunks, encode each with a short-text LM encoder and use the pretrained decoder to fuse information across chunks (fusion-in-decoder). We illustrate through controlled experiments that SLED offers a viable strategy for long text understanding and evaluate our approach on SCROLLS, a benchmark with seven datasets across a wide range of language understanding tasks. We find that SLED is competitive with specialized models that are up to 50x larger and require a dedicated and expensive pretraining step.
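
A minimal sketch of the sliding-chunk idea described above, assuming some short-text encoder encode that maps a token chunk to per-token vectors: the chunk size, stride, and function names are illustrative, and the decoder-side fusion (fusion-in-decoder) is only indicated in a comment rather than implemented.

```python
# Hypothetical sketch of SLED-style encoding: slide a short-text encoder over
# overlapping chunks and concatenate the per-token states for the decoder to fuse.
def sled_style_encode(tokens, encode, chunk_size=256, stride=192):
    states = []
    start = 0
    while start < len(tokens):
        chunk = tokens[start:start + chunk_size]   # overlapping window (overlap = chunk_size - stride)
        states.extend(encode(chunk))               # encode each chunk independently
        if start + chunk_size >= len(tokens):
            break
        start += stride
    # A fusion-in-decoder model would cross-attend over `states` during generation.
    return states
```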


Hate Speech Classifiers Learn Normative Social Stereotypes

Aida Mostafazadeh Davani, Mohammad Atari, Brendan Kennedy, Morteza Dehghani

University of Southern California, USA

Abstract Social stereotypes negatively impact individuals’ judgments about different groups and may have a critical role in understanding language directed toward marginalized groups. Here, we assess the role of social stereotypes in the automated detection of hate speech in the English language by examining the impact of social stereotypes on annotation behaviors, annotated datasets, and hate speech classifiers. Specifically, we first investigate the impact of novice annotators’ stereotypes on their hate-speech-annotation behavior. Then, we examine the effect of normative stereotypes in language on the aggregated annotators’ judgments in a large annotated corpus. Finally, we demonstrate how normative stereotypes embedded in language resources are associated with systematic prediction errors in a hate-speech classifier. The results demonstrate that hate-speech classifiers reflect social stereotypes against marginalized groups, which can perpetuate social inequalities when propagated at scale. This framework, combining social-psychological and computational linguistic methods, provides insights into sources of bias in hate-speech moderation, informing ongoing debates regarding machine learning fairness.


Domain-Specific Word Embeddings with Structure Prediction

David Lassner, TU Berlin, Germany; BIFOLD, Germany

Stephanie Brandl, TU Berlin, Germany; BIFOLD, Germany; University of Copenhagen, Denmark

Anne Baillot, Le Mans Université, France

Shinichi Nakajima, TU Berlin, Germany; BIFOLD, Germany; RIKEN Center for AIP, Japan

Abstract Complementary to finding good general word embeddings, an important question for representation learning is to find dynamic word embeddings, for example, across time or domain. Current methods do not offer a way to use or predict information on structure between sub-corpora, time, or domain, and dynamic embeddings can only be compared after post-alignment. We propose novel word embedding methods that provide general word representations for the whole corpus, domain-specific representations for each sub-corpus, sub-corpus structure, and embedding alignment simultaneously. We present an empirical evaluation on New York Times articles and two English Wikipedia datasets with articles on science and philosophy. Our method, called Word2Vec with Structure Prediction (W2VPred), provides better performance than baselines in terms of the general analogy tests, domain-specific analogy tests, and multiple specific word embedding evaluations as well as structure prediction performance when no structure is given a priori. As a use case in the field of Digital Humanities we demonstrate how to raise novel research questions for high literature from the German Text Archive.


Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?

Byung-Doh Oh, Department of Linguistics, The Ohio State University, USA

William Schuler, Department of Linguistics, The Ohio State University, USA




About the Journal

Transactions of the Association for Computational Linguistics (TACL) is an ACL-sponsored journal published by MIT Press that publishes papers in all areas of computational linguistics and natural language processing. TACL has the following features:

1. TACL publishes conference-length papers, but has a journal-style reviewing process (for example, the option for an action editor to recommend the "revise and resubmit" category for a paper).

2. Papers appearing at TACL are eligible for presentation at certain ACL-sponsored conferences. The model thus combines the benefits of a journal with the ability to present the work at a major conference. (Presentation is optional; authors do not have to present their papers at the conference.)

3. TACL accepts submissions all year (the 1st day of each month is a submission deadline).

4. TACL is committed to fast-turnaround reviewing.


Official website:

https://direct.mit.edu/tacl

Source: the TACL official website




Course Recommendations




Research Essentials | A Minimalist Guide to Writing and Publishing SSCI Journal Papers

2023-05-11

Journal News | SSCI Journal Language and Intercultural Communication, 2022, Issues 1-6

2023-05-11

Journal News |《语言教学与研究》2023, Issue 2

2023-05-10

Journal News |《海外华文教育》2022, Issue 5

2023-05-09

Journal News | SSCI Journal Assessing Writing, 2022, Volumes 52-54

2023-05-08

Journal News |《中国语文研究》2022, Issues 1-2

2023-05-07

Journal News |《新疆大学学报(哲社版)》2022 Articles (Linguistics)

2023-05-06

Journal News | CSSCI《华中学术》2022 Articles (Linguistics)

2023-05-05

Journal News | SSCI Journal Language & Communication, 2022, Volumes 84-87

2023-05-04

Linguistics Annual Review · Journal Updates |《中国语文》(2022)

2023-05-03

Journal News |《中国社会语言学》2019, Issue 1

2023-05-02

Journal News | SSCI Journal Applied Linguistics, 2022, Issues 4-6

2023-05-01

Welcome to join the "Linguistics Insights Discussion & Sharing Group" and the "Linguistics PhD / Master's / Recommended-Admission Group".

To join, please add "心得君" on WeChat and be sure to note "school + research area/major".

Editor on duty: Kī

Reviewed by: 心得小蔓

For reprints and cooperation, please contact "心得君", WeChat: xindejun_yyxxd
