
Journal News | SSCI Journal Language Testing, 2022 Issue 2

语言学心得 | 2022-12-22


LANGUAGE TESTING

Volume 39, Issue 2, April 2022

LANGUAGE TESTING (SSCI, Q1; 2020 IF: 3.551) published six research articles in its 2022 Issue 2, covering receptive vocabulary assessment, learner washback, analytic versus holistic speaking assessment, multidimensional IRT analysis of L2 reading, comparative judgment of sign language interpreting, and generalizability theory.

Previous issue:

Journal News | SSCI Journal Language Testing, 2022 Issue 1

Contents


ARTICLES

■ Gauging the impact of literacy and educational background on receptive vocabulary test scores, by Bart Deygers, Marieke Vanbuel, Pages 191–211.

■ Hong Kong secondary students’ perspectives on selecting test difficulty level and learner washback: Effects of a graded approach to assessment, by Chi Lai Tsang, Talia Isaacs, Pages 212–238.

■ What the analytic versus holistic scoring of international teaching assistants can reveal: Lexical grammar matters, by Wenyue Ma, Pages 239–264.

■ Reading is a multidimensional construct at child-L2-English-literacy onset, but comprises fewer dimensions over time: Evidence from multidimensional IRT analysis, by Shangchao Min, Kyoungwon Bishop, Howard Gary Cook, Pages 265–288.

■ A comparative judgment approach to assessing Chinese Sign Language interpreting, by Chao Han, Xiaoyan Xiao, Pages 289–312.

■ Investigating and optimizing score dependability of a local ITA speaking test across language groups: A generalizability theory approach, by Ji-young Shin, Pages 313–340.

Abstracts

Gauging the impact of literacy and educational background on receptive vocabulary test scores

Bart Deygers, Ghent University, Belgium

Marieke Vanbuel, Ghent University, Belgium

Abstract The Peabody Picture Vocabulary Test (PPVT) is a widely used test of receptive vocabulary, but no researchers to date have examined the performance of low-educated, low-literate L2 adults, or compared these individuals’ performances to their more highly educated peers. In this study, we used many-facet Rasch analysis and mixed-effects linear regression to determine the impact of educational background and other demographic variables on PPVT test performance. The analyses rely on the performance data of 1,014 adult learners of Dutch as a second language on the Dutch version of the PPVT (PPVT-III-NL). The results show that a substantial proportion of score variance can be attributed to educational background variables and to the educational tracks the participants followed. These tracks, which cater to the needs of different L2 learner profiles, appear to exacerbate rather than mediate any performance differences. Although this study provides evidence of performance differences and differential item functioning resulting from linguistic, demographic, and educational variables, it offers no data to invalidate the use of the PPVT on low-educated L2 adults.


Key words Literacy, Peabody Picture Vocabulary Test, receptive vocabulary, second language acquisition, testing
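For orientation, the many-facet Rasch model named in this abstract expresses the log-odds of a correct response as an additive combination of facet parameters. The sketch below is a generic dichotomous formulation, not the authors’ exact specification; the group facet is our illustrative addition:

\[ \ln\frac{P_{nig}}{1 - P_{nig}} = \theta_n - \delta_i - \gamma_g \]

Here \(\theta_n\) is the ability of examinee \(n\), \(\delta_i\) the difficulty of PPVT item \(i\), and \(\gamma_g\) the effect of group \(g\) (for example, an educational-background category); differential item functioning then shows up as item-by-group interactions that this base model does not absorb.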


Hong Kong secondary students’ perspectives on selecting test difficulty level and learner washback: Effects of a graded approach to assessment

Chi Lai Tsang, University College London, UK; St. Joseph’s College, Hong Kong

Talia Isaacs, University College London, UK

Abstract This sequential mixed-methods study investigates washback on learning in a high-stakes school exit examination by examining learner perceptions and reported behaviours in relation to learners’ beliefs and language learning experience, the role of other stakeholders in the washback mechanism, and socio-educational forces. The focus is the graded approach of the Hong Kong Diploma of Secondary Education English Language Examination (HKDSE-English), incorporated in 2012, that allows test-takers to choose between easier and more difficult sections for reading and listening-integrated skills papers. Inductive coding of focus groups involving 12 secondary students fed into the development of the Washback on Students’ Learning questionnaire, which was administered to another 150 learners. Exploratory factor analyses of identified washback effects revealed four major types straddling different settings (classrooms, tutorial schools, learners’ personal environment), and seven categories of mediating variables pertaining to learners themselves, other stakeholders, and societal influences. Simultaneous multiple regressions identified influential clusters of mediating variables and showed the strongest predictors for each macro-level washback type varied. At least one intrinsic and one extrinsic factor category significantly contributed to all types, reaffirming learner washback as a socially situated, negotiated construct. Implications related to the consequences, use, and fairness of the graded approach are discussed.


Key words Factor analysis, impact, multiple regression, second language learners, test preparation, test-taker perceptions, washback
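As a rough illustration of the simultaneous multiple regressions reported here, each macro-level washback type can be regressed on the mediating-variable categories entered together; the symbols below are placeholders, not the authors’ measures:

\[ W_k = \beta_0 + \sum_{j=1}^{7} \beta_j M_j + \varepsilon \]

where \(W_k\) is a learner’s score on washback type \(k\) and \(M_1, \dots, M_7\) are the seven categories of mediating variables (learner-internal, stakeholder-related, and societal). Entering all predictors simultaneously lets each \(\beta_j\) be read as the unique contribution of its category after controlling for the others, which is how the strongest predictors per washback type can be identified.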


What the analytic versus holistic scoring of international teaching assistants can reveal: Lexical grammar matters

Wenyue Ma

Abstract Second-language (L2) testing researchers have explored the relationship between speakers’ overall speaking ability, reflected by holistic scores, and the speakers’ performance on speaking subcomponents, reflected by analytic scores (e.g., McNamara, 1990; Sato, 2011). These research studies have advanced applied linguists’ understanding of how raters view the components of effective speaking skills, but the authors of the studies either used analytic composite scores, instead of true holistic ratings, or ran regression analyses with highly correlated subscores, which is problematic. To address these issues, 10 experienced ITA raters rated the speaking of 127 international teaching assistant (ITA) candidates using a four-component analytic rubric. In addition, holistic ratings were provided for the 127 test takers from a separate (earlier) scoring by two experienced ITA raters. The two types of scores differentiated examinees in similar ways. The variability observed in students’ holistic scores was reflected in their analytic scores. However, among the four analytic subscales, examinees’ scores on Lexical and Grammatical Competence had the greatest differentiating power. Its scores indicated with a high level of accuracy who passed the test and who did not. The paper discusses the components contributing to ITAs’ L2 oral speaking proficiency, and reviews pedagogical implications.


Key words Analytic scoring, holistic scoring, ITA test, Rasch analysis, speaking assessment


Reading is a multidimensional construct at child-L2-English-literacy onset, but comprises fewer dimensions over time: Evidence from multidimensional IRT analysis

Shangchao Min, Zhejiang University, China

Kyoungwon Bishop, University of Wisconsin-Madison, USA

Howard Gary Cook, University of Wisconsin-Madison, USA

Abstract This study explored the interplay between content knowledge and reading ability in a large-scale multistage adaptive English for academic purposes (EAP) reading assessment at a range of ability levels across 1–12 graders. The datasets for this study were item-level responses to the reading tests of ACCESS for ELLs Online 2.0. A sample of 10,000 test takers were each time randomly drawn from the test-taking population at five grade clusters without manipulation on proficiency levels, and then with manipulation on proficiency levels. The results indicated that although the bi-factor multidimensional item response theory (MIRT) model fit the data significantly better than the unidimensional two-parameter logistic (2PL) model for Grade 1, no clear evidence can be found regarding the dimensionality of the test for Grades 2–12. However, content knowledge was consistently found to contribute substantially to test performance for low-ability-level test takers across all grade clusters. The findings indicate that EAP reading ability is a multidimensional construct in the onset of EAP reading ability development, but the presence of multidimensionality decreases as proficiency level and grade level increase. This study provides insights into the developmental pattern of the interplay between language and content in EAP reading contexts.


Key words Bi-factor MIRT model, content knowledge, EAP reading, K–12 context, proficiency level
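For readers less familiar with the modelling, a bi-factor MIRT model of the kind compared against the unidimensional 2PL gives every item a loading on a general reading factor plus a loading on one specific factor (here, plausibly content knowledge). The logistic form below is the standard parameterization, shown only for orientation:

\[ P(X_{ij} = 1 \mid \theta_i^{G}, \theta_i^{S}) = \frac{1}{1 + \exp\!\left[-\left(a_j^{G}\theta_i^{G} + a_j^{S}\theta_i^{S} + d_j\right)\right]} \]

where \(\theta_i^{G}\) is examinee \(i\)’s general reading ability, \(\theta_i^{S}\) the specific dimension item \(j\) also taps, \(a_j^{G}\) and \(a_j^{S}\) the corresponding discriminations, and \(d_j\) the item intercept. When the specific loadings contribute little, the model collapses toward the unidimensional 2PL, which is consistent with the weaker evidence of multidimensionality reported for Grades 2–12.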


A comparative judgment approach to assessing Chinese Sign Language interpreting

Chao Han, Xiamen University, China

Xiaoyan Xiao, Xiamen University, China

Abstract The quality of sign language interpreting (SLI) is a gripping construct among practitioners, educators and researchers, calling for reliable and valid assessment. There has been a diverse array of methods in the extant literature to measure SLI quality, ranging from traditional error analysis to recent rubric scoring. In this study, we want to expand the terrain of SLI assessment, by exploring and evaluating a novel method, known as comparative judgment (CJ), to assess SLI quality. Briefly, CJ involves judges to compare two like objects/items and make a decision by choosing the one with higher quality. The binary outcomes from repeated comparisons by a group of judges are then modelled statistically to produce standardized estimates of perceived quality for each object/item. We recruited 12 expert judges to operationalize CJ via a computerized system to assess the quality of Chinese Sign Language interpreting produced by 36 trainee interpreters. Overall, our analysis of quantitative and qualitative data provided preliminary evidential support for the validity and utility of CJ in SLI assessment. We discussed these results in relation to previous SLI literature, and suggested future research to cast light on CJ’s usefulness in applied assessment contexts.


Key words Chinese Sign Language, comparative judgment, interpreting quality, sign language interpreting, testing and assessment
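Comparative judgment data of this kind are conventionally scaled with a Bradley–Terry-type model, in which repeated binary “better/worse” decisions are converted into interval-scale quality estimates. The formulation below is the generic model, given for illustration; the paper’s exact estimation procedure may differ:

\[ P(A \text{ preferred to } B) = \frac{e^{v_A}}{e^{v_A} + e^{v_B}} \]

where \(v_A\) and \(v_B\) are latent quality parameters for two interpreting performances. Fitting the model (e.g., by maximum likelihood) to all pairwise judgments from the judges yields a standardized quality estimate for each trainee interpretation, which is what the abstract describes as the statistical modelling of repeated comparisons.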


Investigating and optimizing score dependability of a local ITA speaking test across language groups: A generalizability theory approach

Ji-young Shin

Abstract With the present study I investigated the sources of score variance and dependability in a local oral English proficiency test for potential international teaching assistants (ITAs) across four first language (L1) groups, and suggested alternative test designs. Using generalizability theory, I examined the relative importance of L1s (i.e., Indian, Korean, Mandarin, and Spanish), examinees, tasks, and ratings to score variability, and estimated dependability across the L1s. The analyses identified examinees as the largest contributor, which is important for high dependability and validity arguments for test scores. Effects of ratings and tasks were small, but L1 effects on score variance were considerable, with the Indian group’s dependability lowest. Unlike previous generalizability theory studies on L1 effects, however, further analyses revealed that the L1 effects highly likely reflect proficiency differences rather than strong bias when comparing the percent agreement of the ratings, external criteria of examinee English proficiency, and underlying score distributions. I discuss the proficiency differences related to varied socio-linguistic contexts of using and learning English. Lastly, I suggest an alternative design with fewer items and one additional rating for improved dependability. Considering multiple test purposes specific to ITA testing (i.e., efficiency, construct representation, formative advantages), I propose a flexible approach.


Key words Dependability, generalizability theory, L1 effect, local testing, score variance, ITA testing
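In generalizability theory, observed-score variance is first decomposed into components for each facet, and a dependability (Phi) coefficient is then built from those components. Assuming a fully crossed persons × tasks × ratings design, which the study’s actual design may not match exactly, the index takes the standard form:

\[ \Phi = \frac{\sigma_p^2}{\sigma_p^2 + \dfrac{\sigma_t^2 + \sigma_{pt}^2}{n_t} + \dfrac{\sigma_r^2 + \sigma_{pr}^2}{n_r} + \dfrac{\sigma_{tr}^2 + \sigma_{ptr,e}^2}{n_t\,n_r}} \]

where \(\sigma_p^2\) is the variance attributable to examinees and \(n_t\), \(n_r\) are the numbers of tasks and ratings. The decision-study logic behind the suggested alternative design is visible here: reducing \(n_t\) raises the task-related error terms while adding a rating lowers the rating-related terms, so the two can be traded off to keep \(\Phi\) acceptably high.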



About the Journal

Language Testing is an international peer reviewed journal that publishes original research on foreign, second, additional, and bi-/multi-/trans-lingual (henceforth collectively called L2) language testing, assessment, and evaluation. The journal's scope encompasses the testing of L2s being learned by children and adults, and the use of tests as research and evaluation tools that are used to provide information on the knowledge and performance abilities of L2 learners.



In addition, the journal publishes submissions that deal with L2 testing policy issues, including the use of tests for making high-stakes decisions about L2 learners in fields as diverse as education, employment, and international mobility. The journal welcomes the submission of papers that deal with ethical and philosophical issues in L2 testing, as well as issues centering on L2 test design, validation, and technical matters. Primary studies, replication studies, and secondary analyses of pre-existing data are welcome. Authors are encouraged to adhere to Open Science Initiatives.



Official website:

https://journals.sagepub.com/home/ltj

Source: the official website of Language Testing




