Strategies and challenges for constructing and collecting visual corpora from image-based social media platforms
Keywords: visual corpus, social media, data collection, Instagram
Visual elements play an important role in the multimodal nature of social media (Pearce et al., 2020). A growing body of research analyzes still and moving images from different social media platforms from various perspectives within communication and media studies (Hautea, Parks, Takahashi, & Zeng, 2021; Li & Xie, 2020; Veum & Undrum, 2018). Although these studies describe how their visual data were collected, their principal focus is on data analysis rather than on the collection itself. Little attention has been paid to the challenges of collecting visual datasets (Highfield & Leaver, 2016). In this paper, I propose a methodological overview of several strategies for collecting large corpora of visual data from image-based social media platforms. Drawing on exemplary publications, I review five strategies for collecting visual corpora: hashtag-based, account-based, metadata-based, random sampling, and a mixed approach. Lastly, I present a case study of my own mixed approach to collecting visual data from Instagram. By considering the usage, advantages, and limitations of each strategy, the article contributes to the developing science of social media research. I believe that a literature analysis of visual data collection strategies, together with the case study, can help researchers optimize visual data collection from image-based social media.
Copyright (c) 2024 Yuliya Samofalova
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Electronic contributions published online are distributed under the "Creative Commons Attribution – NonCommercial – NoDerivatives 4.0 International" license (CC BY-NC-ND 4.0). This license allows others to share the work in any medium or format, provided they acknowledge the work's authorship and its initial publication in Studies in Communication Sciences (SComS). However, the work may not be altered or transformed, and it may not be used for commercial purposes. These conditions are irrevocable. The full text of the license may be read at http://creativecommons.org/licenses/by-nc-nd/4.0/deed.en