Xixuan Zhang

Hello! I'm Xixuan, a data scientist and PhD candidate in Computational Social Science at the Free University of Berlin. With a dual background in computer engineering and social science, I research how people create, share, and debate information on digital platforms, applying advanced computational methods to analyze large-scale data.

When I'm not immersed in a project, you’ll probably find me reading, running, playing the piano, or cuddling with my cats 🐈🐈.


Selected Research Projects

Discourse Cohesion: Mapping How Conversations Evolve

with Annett Heft and Yangliu Fan
LLMs | fine-tuning BERT | feature embedding | trajectory analysis | multilevel regression

Together with my co-authors, I developed a structure-aware embedding space to capture how online discourse forms and unfolds over time. We encoded interactions among key discourse characteristics (submission type, topic, stance, political positions, sentiment, politeness, and toxicity) using 231,042 comments from 36,384 threads on r/climate. (Work in progress: poster presented at IC2S2 2025.)


How Wikipedia Edits Evolve — and What Drives Them

NLP | fine-tuning BERT | active learning | text mining | frailty model | meta-analysis

Using a novel, fine-grained approach, I analyzed 140,593 revisions of 76,525 sentences from 537 Wikipedia articles to uncover how collaborative content evolves. By reconstructing detailed revision sequences and extracting time, content, editor, and context factors, I examined what shapes revision dynamics. The study reveals how epistemic power is negotiated through collective editing, as community moderation and bureaucratic rules guide which revisions persist.


NEOVEX: Detecting and Analyzing Conspiracy Theories at Scale

with Annett Heft, Kilian Buehling, Joana Becker, and Juni Schindler
NLP | fine-tuning BERT | word embedding | scraping | data engineering | HPC

As part of the NEOVEX research project, I collaborated with colleagues to develop a graph-based dictionary expansion method, built on a fine-tuned GloVe model, that detects previously unknown domain-specific keywords for targeted data collection. Together, we built an eleven-year corpus of 32 million conspiracy theory–related posts on the Great Replacement and New World Order from social media, legacy media, and alternative media via large-scale scraping and API integration. To detect conspiracy narratives, I fine-tuned multiple BERT models and created custom NLP methods for this project, including entity recognition, dependency parsing, and both supervised and unsupervised text classification. This work supports large-scale analysis of conspiracy discourse across platforms, enabling deeper insights into misinformation dynamics and informing content moderation strategies.


Diffusion Dynamics of #FridaysForFuture: Mapping Tweet Cascades and Communities

transfer learning | topic model | clustering | network analysis | time series analysis

I inferred the early diffusion of #FridaysForFuture from 237,892 retweet sequences and the follower–following links of 51,803 participants. Using a top-down perspective, I examined how networks, tweets, and retweet cascades evolved, integrating both actor dynamics and content patterns to reveal how messages spread within and between communities in the movement’s formative stages.


Peer-Reviewed Journal Publications

  • Decoding revision mechanisms in Wikipedia: collaboration, moderation, and collectivities
    Zhang, X. — New Media & Society, 2025
    View Online | PDF
  • LGDE: Local Graph-based Dictionary Expansion
    Schindler, J., Jha, S., Zhang, X., Buehling, K., Heft, A., & Barahona, M. — Computational Linguistics, 2025
    View Online | PDF
  • Veiled conspiracism. Particularities and convergence in styles and functions of conspiracy-related communication across digital platforms
    Buehling, K., Zhang, X., & Heft, A. — New Media & Society, 2025
    View Online | PDF
  • Diffusion Dynamics and Digital Movement: the Emergence and Proliferation of the German-speaking #FridaysForFuture Network on Twitter
    Zhang, X. — Social Movement Studies, 2023
    View Online | PDF
  • Challenges of and approaches to data collection across platforms and time: Conspiracy-related digital traces as examples of political contention
    Heft, A., Buehling, K., Zhang, X., Schindler, D., & Milzner, M. — Journal of Information Technology & Politics, 2023
    View Online | PDF
  • Mainstreaming Political Extremism: Intermediary Networks and Movement-Party Coordination of a Global Anti-immigration Campaign in Germany
    Klinger, U., Bennett, L., Knüpfer, C., Martini, F., & Zhang, X. — Information, Communication & Society, 2022
    View Online | PDF
  • Understanding Digitally Networked Action: A Case Study of #HomeToVote and the Irish Abortion Referendum 2018
    Zhang, X. — SCM Studies in Communication and Media, 2021
    View Online | PDF