New York Times Word Usage Frequency Chart

(davidrozado.substack.com)

73 points by cubefox 3 days ago | 34 comments

The NYT probably produces 10x as many words as it did in 2015, and 50x as many as 2000, and 100x as many as 1980. These charts don't give any sense of whether they report frequency divided by total words or just raw counts. I suppose the KKK and AIDS examples indicate the former, but it isn't clear.

Additionally, the y axis seems to be from 0 to 1: least popular years vs most popular year. This means every graph has a different Y scale. Very misleading, imho.
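To make the "frequency divided by total words" distinction concrete, here is a minimal sketch. The counts below are invented purely for illustration and are not taken from the article or from any NYT data; `relative_frequency` is a hypothetical helper name.

```python
def relative_frequency(word_counts, total_words):
    """Return occurrences per million words for each year."""
    return {
        year: 1_000_000 * word_counts[year] / total_words[year]
        for year in word_counts
    }

# Hypothetical numbers: raw mentions of one word, and total words published.
word_counts = {1980: 50, 2000: 500, 2015: 2500}
total_words = {1980: 1_000_000, 2000: 10_000_000, 2015: 50_000_000}

per_million = relative_frequency(word_counts, total_words)
print(per_million)  # {1980: 50.0, 2000: 50.0, 2015: 50.0}
```

In this toy case the raw count grows 50x while the per-million rate stays flat, which is why it matters which of the two a chart is plotting.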

Also, it is interesting to look at some of the words in Google ngram viewer for the same period.

itishappy 2 days ago | root | parent |

Rescaling the axis is egregious. Normalizing the max to 1 is fine, since we're investigating trends, not comparing frequencies, but dropping the min to 0 is not. A frequency of zero is meaningful, and its removal destroys pretty much anything these charts could tell us. If a word occurred 199 times in 1970 and 200 times in 2022, it can look exactly the same as a word that was never used in 1970 and spiked in 2022.

Edit: To expand a bit, some of this info can be gleaned from context cues, such as usage hovering close to the bottom line for many charts, implying that's the zero point. But half the charts look like "bigotry," "upsetting," "discrimination," or "duties." Are these strong trends or statistical noise? Who knows!
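A small sketch of the 199-vs-200 example, assuming the charts apply plain min-max scaling (the article does not state its exact method, so that is an assumption). The function names are made up for illustration.

```python
def scale_max(values):
    """Divide by the maximum: the peak becomes 1 and zero stays zero."""
    peak = max(values)
    return [v / peak for v in values]

def scale_min_max(values):
    """Map the minimum to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A perfectly flat series has no range to stretch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

steady = [199, 200]  # used 199 times in 1970, 200 times in 2022
spike = [0, 200]     # never used in 1970, 200 times in 2022

print(scale_min_max(steady))  # [0.0, 1.0]
print(scale_min_max(spike))   # [0.0, 1.0]
print(scale_max(steady))      # [0.995, 1.0]
print(scale_max(spike))       # [0.0, 1.0]
```

With min-max scaling the two series are indistinguishable; scaling by the max alone keeps zero meaningful and shows the steady word as nearly flat.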

hammock 2 days ago | root | parent |

Many of those questions can be answered.

The charts show absolute frequency.

The max is not normalized across the words; each chart is 0 to 1. That’s how the data comes out of Google ngram viewer when you search terms one at a time, which is what this is.

itishappy 2 days ago | root | parent |

They can be answered by consulting a better chart!

The charts show rise and fall, nothing more. That's enough info to see peaks, but we can't tell absolute anything without values. Zero would have been an easy value to include, but alas.

Google ngram viewer gives significantly more information. It actually gives absolute measurements, to start.

What's going on with "duties"? It dropped by a factor of two. Can't get that info from the article.

https://books.google.com/ngrams/graph?content=duties&year_st...

joshdavham 3 days ago | prev | next |

Just a note to readers that this article is from Feb 19, 2023.

It'd be interesting to see if the data has changed much. Especially since last week!

0points 3 days ago | prev | next |

"duties" continues its downward spiral into oblivion

leobg 2 days ago | root | parent |

Which is funny, since one man’s rights are, by definition, another man’s duties.

(I mean, of course “person” and “their”.)

will-burner 3 days ago | prev | next |

It would be cool to see one of those word diagrams where the size of the word is how often it appears (a word cloud), and to have one word cloud for the words in 1970 and one for the words in 2018, with maybe some years in between. That would make it a lot easier to digest the information than a grid of frequency line plots. It's information overload when I open the page, and it takes a lot of energy to read all the different words and compare the plots. The word clouds would get the point across more easily and clearly imo.

rgbswan 2 days ago | root | parent |

https://pudding.cool/projects/vocabulary/index.html

causality0 3 days ago | prev | next |

Interesting that certain terms have dropped in usage as precipitously as they rose, such as micro-aggressions, cultural appropriation, and mansplaining.

akomtu 2 days ago | prev | next |

A natural upgrade of this chart:

1. Analyze titles only to get a sense of what narrative is being pushed. Titles are usually made of screaming hot-words to grab attention.

2. Draw each word as a colored circle on a 2D plane, using some word2vec model. This is going to capture changes in meaning. For example, we would see how the word "safety" migrates from one neighborhood to another.

3. Show an animation of how the usage of hot-words changes with time, from the 80s to today. This will show how the narrative changes.

Edit: another upgrade would be to do the same, but for newspapers of the opposite camp. This would show two evolving clouds of hot-words: a red cloud and a blue cloud. It's not obvious how the two evolve: whether they stay away from each other, or interact. My hypothesis is that they keep a good distance from each other to keep their readers in separate bubbles. This is necessary because the readers give energy of their attention to the clouds, and a rational self-serving tactic is to fence off their faithful readers.
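A rough sketch of the "safety migrates from one neighborhood to another" idea in point 2. The 2-D vectors below are invented for illustration and are not output from any trained word2vec model; the neighbor words are arbitrary placeholders. Comparing nearest neighbors within each period avoids having to align two separately trained embedding spaces, which are generally not directly comparable.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(word, vectors):
    """Return the other word whose vector is most similar to `word`."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))

# Invented toy embeddings, one set per period (assumption, not real data).
vectors_1980s = {"safety": (0.9, 0.1), "seatbelt": (1.0, 0.0), "comfort": (0.0, 1.0)}
vectors_2020s = {"safety": (0.2, 0.9), "seatbelt": (1.0, 0.0), "comfort": (0.0, 1.0)}

print(nearest("safety", vectors_1980s))  # seatbelt
print(nearest("safety", vectors_2020s))  # comfort
```

A real version would train or load embeddings per time slice of the corpus and then project them to 2D for plotting; this only shows the neighbor-comparison step.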

PeterStuer 3 days ago | prev | next |

Word frequency is one thing, shifting word semantics another. Some of those words meant very different things in the 1970s than they do today.

rgbswan 2 days ago | root | parent |

Wonderful topic! Reframing or reinterpretation over time. And then data science origins together and track the path into mainstream culture and the effects the reinterpretation had on language in old/new context and how it normalised/polarised certain concepts, topics, perspectives, ideologies and so on ... with a sub-focus on teens and their social media use or the cool, alternative, avant-garde kids in art and music.

rgbswan 2 days ago | prev | next |

Has anyone compared national/international media outlets?

Is there an automated LLM comparing word usage, phrasing, maybe even cultural/class-driven subtext in context of time/events in print/TV or a project comparing national/international news outlets that target specific population segments/demographics?

One could analyse the same in advertisement/social media/even 'Hollywood' and data science all of that stuff together. Create a chain or rather 'graph of thought'. See what the ministry of truth has been up to in recent decades ... I'm joking, of course, it's a complex world and people cooperate so what the hell, but it would be interesting nevertheless.

Sadly, I can't even scrape websites that have a robots.txt.

I remember reading a book about the above topic a decade ago where multiple authors did something like that to analyse cooptation in media (German: "Gleichschaltung").

rfarley04 3 days ago | prev | next |

More recent analysis of the same thing from The Economist: https://archive.ph/knWrN

cubefox 3 days ago | root | parent |

I think this is a good rebuttal which discusses the above: https://ncofnas.com/p/wokism-is-just-beginning

guax 3 days ago | root | parent |

The wikipedia article on the word is an interesting read: https://en.wikipedia.org/wiki/Woke

The reductionist choice of the author and the way things are phrased seem like they're fact- and argument-based but are just sugar-coated, simple, and dull opinions.

gruez 3 days ago | root | parent |

>The reductionist choice of the author and the way things are phrased seem like they're fact- and argument-based but are just sugar-coated, simple, and dull opinions.

The article was in reply to an Economist article, which does define what "woke" means, at least in the context of the article.

>The term woke was originally used on the left to describe people who are alert to racism. Later it came to encompass those eager to fight any form of prejudice. By that definition, it is obviously a good thing. But Democrats seldom use the word any more, because it has become associated with the most strident activists, who tend to divide the world into victims and oppressors. This outlook elevates group identity over the individual sort and sees unequal outcomes for different groups as proof of systemic discrimination. That logic is then used to justify illiberal means to correct entrenched injustices, such as reverse discrimination and the policing of speech. It is this sort of “woke warrior” that Republicans love to lambast.

>Our analysis subsumes both the advocates and the denigrators of woke thinking, by looking at ideas and actions associated with this sort of activism, for good or for ill. It measures, for example, talk of “diversity, equity and inclusion” (DEI) in the corporate world, regardless of whether it is being invoked as a way to correct the under-representation of women and racial minorities or as an example of pious window-dressing. Some of the yardsticks we use apply only to the more doctrinaire form of woke activism, such as the number of drives to censure academics for views deemed offensive. Others capture only the more positive aspects of the movement, such as polling data on the proportion of Americans who worry about racial injustice. Either way, the results are consistent: America has passed “peak woke”.

cubefox 2 days ago | root | parent | next |

Yeah, but Cofnas provides counterarguments to this. Young people are much more woke than older people, who are still often in positions of power. As the generations shift, ideologies that are more prevalent among the young take hold. Moreover,

> The argument that we are “passed peak woke,” which recently got a boost from articles in the Economist and the New York Times, misinterprets the consolidation of the woke victory as a decline in the ideology’s power (like Crusaders buying fewer swords in the year 1100).

He goes into more detail in the article.

guax 2 days ago | root | parent | prev |

Democrats stopped using that word likely because hardly anyone actually used it in any significant way before it got co-opted by rhetoric that uses it to denigrate their critics. I've seen it on Twitter in the past but never in mainstream political discourse before it came out of Trump's mouth (in a way that crossed the Atlantic at least). Just being critical of issues that are not even controversial in Europe is enough to be tagged as infected by the woke mind virus on X/Twitter.

Coming from an "outside" view of American politics, the American left is incredibly conservative, both economically and morally, so these labels are at best distractions, used by people to reinforce the duality of something that should be plural. This is a guilt that both Democrats and Republicans carry. It's just that Republicans are particularly nonchalant about what I view as anti-democratic discourse.

I've seen something very similar in Brazil, with the conservative parties there adopting "Petralha" (as in someone sympathetic to the Workers' Party) or "Communist" as a way to denigrate any non-conservative view, independent of actual affiliation or opinion. It's not meant to be accurate, just meant to group all criticism together, because if you can do that, then it's easier to use the stupid criticism (from radicals, as you pointed out) as a reason to classify any criticism coming from "that kind of people" as stupid.

From the article shared (not the Economist one), it's clear that the definition of woke adopted by the author is cherry-picked to be inflammatory. It is likely real for some people but certainly not something shared among mainstream Democrats, and I would bet not even by a sizeable minority. Parts of it do resonate with the majority, but that's part of the allure of using it: you create something obviously controversial that no one believes in full to tie everyone into the same bucket.

marklubi 2 days ago | prev | next |

I was doing this in 2007. Most of my time was spent trying to figure out how to automatically de-duplicate the tripe, and filter out the AP re-posted stuff.
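For readers wondering what de-duplicating re-posted wire copy can look like, here is one common approach, sketched with word shingles and Jaccard similarity. This is not a description of what was actually built in 2007; the function names, the shingle size `k=3`, and the `0.5` threshold are all assumptions chosen for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_near_duplicate(text_a, text_b, threshold=0.5):
    """Treat two texts as near-duplicates if their shingle sets overlap enough."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold

original = "the city council voted on tuesday to approve the new budget"
repost = "the city council voted on tuesday to approve the new budget plan"
other = "the home team won the championship game on sunday night"

print(jaccard(shingles(original), shingles(repost)))  # 0.9
print(is_near_duplicate(original, repost))            # True
print(is_near_duplicate(original, other))             # False
```

Pairwise comparison like this scales quadratically, so a large archive would typically need something like MinHash to find candidate pairs first.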

neverrroot 3 days ago | prev | next |

Quite a strong change over the years. I wonder still, what this all will lead to, even more so given the recent developments.

up2isomorphism 3 days ago | prev | next |

One question is: Will this make AI-generated content harder to recognize, since particular topics/keywords are now used so much more frequently that the information gain from them is relatively lower compared to, let's say, 10 years ago?

MrSkelter 2 days ago | prev | next |

What an utterly valueless “analysis”. The use of a word says nothing about context and without that you can say nothing about any imagined editorial stance.

This simply conflates any use of a word with advocacy for a position.

Even more tellingly we are supposed to think that saying things like “discrimination based on race and gender is unfair” constitutes an arguable position.

The takeaway is that people like the author have entirely bought into a radical right wing position and somehow believe they are impartial observers.

zxcvbnm69 2 days ago | root | parent |

[dead]

jrflowers 3 days ago | prev | next |

[flagged]

giraffe_lady 3 days ago | root | parent |

Some of the inclusion choices are extremely revealing. It is wokeness to talk about white supremacy, bullying, xenophobia, it seems.

ars 3 days ago | root | parent | next |

Talking about it is not wokeness, but acting like it's a widespread phenomenon is.

I do not believe that incidences of these things have dramatically skyrocketed in the last decades, which means the word frequency should go down not up.

giraffe_lady 3 days ago | root | parent |

Assuming we were talking about them the correct amount before? If they were, for some reason, given less attention than their impact deserved previously, it would be reasonable for use frequency to go up as awareness of that failure spreads.

jrflowers 2 days ago | root | parent |

The implicit assumption that tends to come across from folks that complain about “wokeness” with a straight face (e.g. people mad about video game character design) is that white supremacy, xenophobia, bullying etc. happen at a rate in society such that these phenomena should be considered so acceptable that there is no rightful place to even use the “woke” words other than very occasional whispers behind closed doors.

Sometimes people that share this view will start with their personal opinions on these topics and then work backwards to try and create a mathematical or scientific justification. The result is often a substack full of charts that say stuff like “The y axis (wokeness) exploded under Obama” and/or quotes from Arthur Jensen.

giraffe_lady 2 days ago | root | parent |

Yeah.

spondylosaurus 2 days ago | root | parent | prev |

Or this comment:

> One issue is that the woke fad constantly generates hot new buzzwords, such as "equity"...

I wasn't aware that "equity" is new or a buzzword, lol.

tropicalfruit 3 days ago | prev |

rubs hands together

wokeness is a neo liberal individualist ideology designed to expand consumer markets