Sentiment/tone (Automated Content Analysis)


  • Valerie Hase



Keywords: sentiment analysis, emotions, dictionary, supervised machine learning


Sentiment/tone captures the way issues or specific actors are evaluated in coverage. Many analyses differentiate between negative, neutral/balanced, and positive sentiment/tone as broader categories, but analyses might also measure more granular types of sentiment/tone, for example expressions of incivility, fear, or happiness. Analyses can detect sentiment/tone in full texts (e.g., general sentiment in financial news) or concerning specific issues or actors (e.g., sentiment towards the stock market in financial news).

The datasets referred to in the table are described in the following paragraph:

Puschmann (2019) uses four data sets to demonstrate how sentiment/tone may be analyzed automatically. Using Sherlock Holmes stories (19th century, N = 12), tweets (2016, N = 18,826), Swiss newspaper articles (2007-2012, N = 21,280), and debate transcripts (2013-2017, N = 205,584), he illustrates how dictionaries may be applied for such a task. Rauh (2018) uses three data sets to validate his organic German-language dictionary for sentiment/tone. His data consist of sentences from German parliament speeches (1991-2013, N = 1,500), German-language quasi-sentences from German, Austrian, and Swiss party manifestos (1998-2013, N = 14,008), and newspaper, journal, and news wire articles (2011-2012, N = 4,038). Silge and Robinson (2020) use six Jane Austen novels to demonstrate how dictionaries may be used for sentiment analysis. Van Atteveldt and Welbers (2020) use State of the Union speeches (1789-2017, N = 58) for the same purpose. In an earlier tutorial, the same authors (van Atteveldt & Welbers, 2019) show, based on a data set of movie reviews (N = 2,000), how supervised machine learning can be used for this task. In their Quanteda tutorials, Watanabe and Müller (2019) demonstrate the use of dictionaries and supervised machine learning for sentiment analysis on UK newspaper articles (2012-2016, N = 6,000) as well as the same set of movie reviews (N = 2,000). Lastly, Wiedemann and Niekler (2017) use State of the Union speeches (1790-2017, N = 233) to demonstrate how sentiment/tone can be coded automatically via a dictionary approach.
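The dictionary approach mentioned above can be illustrated briefly: a text is coded by counting how many of its words appear in lists of positive and negative terms. The following is a minimal sketch in Python, assuming a hypothetical mini-dictionary for illustration only (the studies cited use validated lexicons such as Rauh's German sentiment dictionary or the lexicons shipped with tidytext and Quanteda):

```python
# Illustrative sketch of a dictionary approach to sentiment/tone.
# POSITIVE/NEGATIVE are hypothetical mini-lexicons, not a validated resource.
POSITIVE = {"good", "gain", "strong", "optimistic", "success"}
NEGATIVE = {"bad", "loss", "weak", "fear", "crisis"}

def dictionary_sentiment(text: str) -> str:
    """Classify a text as positive/negative/neutral by counting lexicon hits."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(dictionary_sentiment("Strong and optimistic forecasts"))  # positive
print(dictionary_sentiment("Fear of crisis after heavy loss"))  # negative
```

Real analyses additionally handle negation, weighting, and word inflections, which a simple hit count like this ignores.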

Field of application/theoretical foundation:

Related to theories of “Framing” and “Bias” in coverage, many analyses are concerned with the way the news evaluates and interprets specific issues and actors.

References/combination with other methods of data collection:

Manual coding is needed for many automated analyses, including those concerned with sentiment. For example, studies use manual content analysis to develop dictionaries, to create training sets on which algorithms for automated classification are trained, or to validate the results of automated analyses (Song et al., 2020).
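Such a validation against manually coded labels as the benchmark can be sketched as follows (toy labels in plain Python; actual validations report these metrics per category on much larger gold-standard samples, cf. Song et al., 2020):

```python
# Sketch of validating automated labels against a manual gold standard:
# accuracy overall, plus precision/recall/F1 for one category.
def validation_scores(manual, automated, category):
    tp = sum(m == category and a == category for m, a in zip(manual, automated))
    fp = sum(m != category and a == category for m, a in zip(manual, automated))
    fn = sum(m == category and a != category for m, a in zip(manual, automated))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

manual    = ["pos", "neg", "neg", "pos", "neutral", "neg"]  # human coders
automated = ["pos", "neg", "pos", "pos", "neg",     "neg"]  # classifier output
accuracy = sum(m == a for m, a in zip(manual, automated)) / len(manual)
p, r, f1 = validation_scores(manual, automated, "neg")
print(f"accuracy={accuracy:.2f}  neg: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```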


Table 1. Measurement of “Sentiment/Tone” using automated content analysis.

| Source | Data | Method | Formal validity check with manual coding as benchmark* |
|---|---|---|---|
| Puschmann (2019) | (a) Sherlock Holmes stories; (b) tweets; (c) Swiss newspaper articles; (d) German parliament transcripts | Dictionary approach | Not reported |
| Rauh (2018) | (a) Bundestag speeches; (b) quasi-sentences from German, Austrian, and Swiss party manifestos; (c) newspapers, journals, agency reports | Dictionary approach | |
| Silge & Robinson (2020) | Books by Jane Austen | Dictionary approach | Not reported |
| van Atteveldt & Welbers (2020) | State of the Union speeches | Dictionary approach | |
| van Atteveldt & Welbers (2019) | Movie reviews | Supervised machine learning approach | |
| Watanabe & Müller (2019) | Newspaper articles | Dictionary approach | Not reported |
| Watanabe & Müller (2019) | Movie reviews | Supervised machine learning approach | |
| Wiedemann & Niekler (2017) | State of the Union speeches | Dictionary approach | Not reported |

*Please note that many of the sources listed here are tutorials on how to conduct automated analyses and are therefore not focused on the validation of results. This column merely indicates which sources readers can consult if they are interested in the validation of results.
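As a minimal sketch of the supervised machine learning approach listed in the table, the following Python snippet trains a tiny multinomial Naive Bayes classifier on hand-labeled toy examples (the cited tutorials instead train established models on thousands of manually coded movie reviews):

```python
# Toy supervised sentiment classification: multinomial Naive Bayes
# with Laplace smoothing, trained on manually labeled example texts.
from collections import Counter, defaultdict
import math

def train_nb(docs):
    """docs: list of (text, label). Returns class priors, word counts, vocabulary."""
    labels = Counter(lab for _, lab in docs)
    counts = defaultdict(Counter)
    for text, lab in docs:
        counts[lab].update(text.lower().split())
    vocab = {w for c in counts.values() for w in c}
    return labels, counts, vocab

def classify_nb(text, labels, counts, vocab):
    """Return the label with the highest (log) posterior probability."""
    total = sum(labels.values())
    best, best_lp = None, -math.inf
    for lab, n in labels.items():
        lp = math.log(n / total)  # class prior
        denom = sum(counts[lab].values()) + len(vocab)
        for w in text.lower().split():
            lp += math.log((counts[lab][w] + 1) / denom)  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = lab, lp
    return best

train = [
    ("great film loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible boring film", "neg"),
    ("awful plot hated it", "neg"),
]
labels, counts, vocab = train_nb(train)
print(classify_nb("loved the wonderful plot", labels, counts, vocab))  # prints "pos"
```

Unlike a dictionary, the classifier learns which words indicate each category from the manually coded training set, which is why the quality of the manual coding directly bounds the quality of the automated results.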


Puschmann, C. (2019). Automatisierte Inhaltsanalyse mit R. Retrieved from

Rauh, C. (2018). Validating a sentiment dictionary for German political language—A workbench note. Journal of Information Technology & Politics, 15(4), 319–343. doi:10.1080/19331681.2018.1485608

Silge, J., & Robinson, D. (2020). Text mining with R. A tidy approach. Retrieved from

Song, H., Tolochko, P., Eberl, J.-M., Eisele, O., Greussing, E., Heidenreich, T., Lind, F., Galyga, S., & Boomgaarden, H. G. (2020). In validations we trust? The impact of imperfect human annotations as a gold standard on the quality of validation of automated content analysis. Political Communication, 37(4), 550–572.

van Atteveldt, W., & Welbers, K. (2019). Supervised Text Classification. Retrieved from

van Atteveldt, W., & Welbers, K. (2020). Supervised Sentiment Analysis in R. Retrieved from

Watanabe, K., & Müller, S. (2019). Quanteda tutorials. Retrieved from

Wiedemann, G., & Niekler, A. (2017). Hands-on: A five day text mining course for humanists and social scientists in R. Proceedings of the 1st Workshop Teaching NLP for Digital Humanities (Teach4DH@GSCL 2017), Berlin. Retrieved from



How to Cite

Hase, V. (2021). Sentiment/tone (Automated Content Analysis). DOCA - Database of Variables for Content Analysis.
