Description
World Englishes as a field of research has gained considerable traction; however, two gaps remain. First, most studies on variation in different English varieties use either qualitative methods with limited generalizability or quantitative, corpus-based methods lacking extensive speaker metadata (e.g., Botha & Bernaisch, 2024). Second, postcolonial English varieties are often examined in isolation, an approach which neglects their multilingual contexts (e.g., Vida-Mannl et al., 2025). Integrating detailed sociodemographic data with corpus-linguistic methods and the context of local dominant language constellations (e.g., Aronin, 2019) is a powerful approach to addressing these gaps.
Northeast India is considerably more linguistically diverse than mainland India: it is home to around 220 languages from the Indo-Aryan, Tibeto-Burman, and Austro-Asiatic families (Fuchs et al., 2025; Moral, 1994). English plays an important role in the region as a medium of instruction and lingua franca, partly owing to resistance to the central government’s promotion of Hindi (Fuchs et al., 2025). Investigating the role of English within Northeast Indian multilingualism and its interactions with social dynamics provides a replicable, integrated approach to examining variation in this English variety.
To this end, my PhD project statistically models sociolinguistic variation and dominant language constellations in Northeast India and analyzes their influence on the region’s spoken English variety. Data was collected as part of the project “English as a local lingua franca in the multilingual ecology of Northeast India”, funded by the German Research Foundation (research unit FOR 5728). A stratified random sample of Northeast Indian participants (n = 180) completed a researcher-administered questionnaire, which covered sociodemographics, education, language repertoire, proficiency, use, and attitudes. A subset of the informants (n = 60) additionally completed semi-structured interviews of approximately two hours each, forming a corpus of spoken English.
This poster presents the first phase of the project: identifying underlying dimensions of sociodemographic variation in Northeast India using Multiple Correspondence Analysis (MCA). MCA models relationships between individuals and categorical variables to identify underlying dimensions of variation in the data (e.g., Clarke et al., 2021). In this study, the dimensions resulting from the MCA computed on the questionnaire data are expected to reveal several components of social dynamics, such as core sociodemographics (e.g., gender), generational dynamics (e.g., age), and educational trajectory (e.g., schooling, occupation). In the future, the project will examine interactions of the resulting dimensions with dominant language constellations and their effects on variation in spoken Northeast Indian English. The poster will thus shed light on the social dynamics of Northeast India and provide a replicable approach for modeling sociolinguistic variation in World Englishes. The proposed approach contributes to bridging the gap between sociolinguistics in World Englishes, multilingualism, and corpus-linguistic methods.
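To make the MCA step concrete, the following is a minimal sketch of one standard formulation: correspondence analysis of the respondent-by-category indicator matrix, computed with a singular value decomposition. It is an illustration under stated assumptions, not the project's actual pipeline. The function name `mca`, the variable names, the category labels, and the six toy respondents are all hypothetical and are not drawn from the questionnaire data.

```python
import numpy as np


def mca(rows):
    """Indicator-matrix MCA. `rows` is a list of dicts: variable -> category label."""
    variables = sorted(rows[0])
    # One indicator column per (variable, category) pair observed in the data.
    columns = sorted({(v, r[v]) for r in rows for v in variables})
    index = {col: j for j, col in enumerate(columns)}
    Z = np.zeros((len(rows), len(columns)))
    for i, r in enumerate(rows):
        for v in variables:
            Z[i, index[(v, r[v])]] = 1.0

    P = Z / Z.sum()              # correspondence matrix
    r_mass = P.sum(axis=1)       # row (respondent) masses
    c_mass = P.sum(axis=0)       # column (category) masses
    expected = np.outer(r_mass, c_mass)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals

    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sing ** 2                       # principal inertia per dimension
    # Principal coordinates of respondents and of categories.
    row_coords = (U * sing) / np.sqrt(r_mass)[:, None]
    col_coords = (Vt.T * sing) / np.sqrt(c_mass)[:, None]
    return columns, inertia, row_coords, col_coords


# Hypothetical toy respondents (invented for illustration only).
toy = [
    {"gender": "f", "age_group": "18-29", "schooling": "english_medium"},
    {"gender": "m", "age_group": "18-29", "schooling": "english_medium"},
    {"gender": "f", "age_group": "30-49", "schooling": "english_medium"},
    {"gender": "m", "age_group": "30-49", "schooling": "other_medium"},
    {"gender": "f", "age_group": "50+", "schooling": "other_medium"},
    {"gender": "m", "age_group": "50+", "schooling": "other_medium"},
]

columns, inertia, row_coords, col_coords = mca(toy)
share = inertia / inertia.sum()
print("Share of inertia, first two dimensions:", share[:2].round(3))
for label, xy in zip(columns, col_coords[:, :2].round(2)):
    print(label, xy)
```

In this formulation, categories that tend to be chosen by the same respondents receive similar coordinates, which is what allows a dimension to be read as, for example, a generational or an educational contrast. Raw inertia shares from the indicator matrix tend to understate the importance of the leading dimensions, so applied work often reports adjusted inertias instead.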