Strategies and challenges for constructing and collecting visual corpora from image-based social media platforms

Authors

Y. Samofalova

DOI:

https://doi.org/10.24434/j.scoms.2024.01.3881

Keywords:

visual corpus, social media, data collection, Instagram

Abstract

Visual elements play an important role within the multimodal nature of social media (Pearce et al., 2020). A growing body of research has analyzed still and moving images from different social media platforms from various perspectives within communication and media studies (Hautea, Parks, Takahashi, & Zeng, 2021; Li & Xie, 2020; Veum & Undrum, 2018). Although these studies describe how their visual data were collected, their principal focus is on data analysis rather than on the collection itself. Little attention has been paid to the challenges of collecting visual datasets (Highfield & Leaver, 2016). In this paper, I offer a methodological overview of several strategies for collecting large corpora of visual data from image-based social media platforms. Drawing on exemplary publications, I review five strategies for collecting visual corpora: hashtag-based, account-based, metadata-based, random sampling, and a mixed approach. Lastly, I present a case study of my own mixed approach to collecting visual data from Instagram. By considering the usage, advantages, and limitations of each strategy, the article contributes to the developing science of social media research. I believe that this literature analysis of visual data collection strategies, together with the case study, can help researchers optimize visual data collection from image-based social media.

Published

2024-01-10

How to Cite

Samofalova, Y. (2023). Strategies and challenges for constructing and collecting visual corpora from image-based social media platforms. Studies in Communication Sciences, 1–16. https://doi.org/10.24434/j.scoms.2024.01.3881

Section

Thematic Section: Images, clusters and types