Thursday 19 November 2020

Language in 3.5 Million Books... Beautiful Women and Brave Men

Analysing a data set of 3.5 million books (using an AI), fiction and non-fiction, published in English between 1900 and 2008, a research team extracted the adjectives and verbs associated with gender-specific nouns (e.g. daughter, boy) and examined whether their sentiment was positive, neutral or negative. They concluded that the words chosen for women primarily described their appearance (negative verbs appeared five times as often for females as for males, and positive and neutral adjectives twice as often in descriptions of women), while the adjectives chosen for men referred to their behaviour and personal qualities. Women were mostly "beautiful" and "sexy" while men were "righteous", "rational" and "brave" (via and via).

"If the language we use to describe men and women differs, in employee recommendations for example, it will influence who is offered a job when companies use IT systems to sort through job applications."
Isabelle Augenstein

- - - - - - - - -
- Alexander Hoyle et al. (2019). Unsupervised Discovery of Gendered Language through Latent-Variable Modeling. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1706-1716
- photograph by Jeff Mermelstein via


  1. Thanks for sharing this find!

  2. I think I already commented on this, yesterday :-) Anyways. I just wanted to point out how much I love these street photography pieces you post in company with your findings. And Mermelstein's a magnus of his own.

    1. Oh, how lovely, thanks! It feels like hunting and I'm always superhappy when I find beautiful (street) photographs.