Experimental History

My job is to put people in situations and see what happens. I call the results experimental history.

Underrated ways to change the world

2024-11-20 00:18:27

photo cred: my dad

A lot of people would like to make the world better, but they don’t know how. This is a great tragedy.

It’s tragic not only for the people who need help, but also for the people who can help, because good intentions start to rot if you don’t act them out. Well-meaning people who remain idle end up sick in the heart and the head, and they often develop exquisite ideologies to excuse their inaction—they start to believe that witnessing problems is as good as solving them, or that it’s impossible to make things better and therefore foolish to try, or that every sorrow in the world is someone else’s fault and therefore someone else’s responsibility.

We get stuck here because we assume that there are only two paths to improving the world. Option #1 is to go high-status: get rich so you can blast problems with your billions of bucks, or get into office so you can ban all the bad things and mandate all the good things. Only a fortunate few are powerful enough to do anything, of course, so most of the people attempting to improve the world through the high-status route will end up either begging our overlords to do the right thing, or trying to drum up the votes necessary to replace them.

Option #2 is to go high-sacrifice: sell everything you have and spend your life earning $7/hr to scrub the toilets in an orphanage. Only a virtuous few will have the saintliness necessary to live such a life, of course, so most of the people attempting to improve the world through the high-sacrifice path will end up writing checks to the martyrs on the front lines.

These paths aren’t wrong. They’re just too narrow. Money, power, and selflessness are all useful tools in the right hands, but the world is messed up in all sorts of ways that can’t be legislated against, bought off, or undone with a hunger strike. When we focus on just two avenues for making the world better, we exclude almost everybody, leaving most of us with a kind of constipated altruism—we’ve got the urge to do good, but nothing comes out.

I don’t know all the ways to get our good intentions unblocked. That’s why, whenever I spot someone changing the world via a righteous road less taken, I write it down on a little list. I glance at that list from time to time as a way of expanding my imagination, and now I’m sharing it in the hopes that it’ll do the same for you.

1. BE THE SECOND-BRAVEST PERSON

Between 2004 and 2016, a Kentucky lawyer extracted $550 million from the US government via bogus disability claims. He used shady doctors to fake the forms and a crooked judge to approve them. That lawyer’s name, in a perfect example of nominative determinism, was Eric C. Conn.

The fraud might have gone on forever if Jennifer Griffith, a paralegal working in the Social Security Administration, hadn’t noticed the brazen fakery. She did the brave thing and told someone about it.

But here’s the part of the story that really gets me: the person she told also did the brave thing—she listened. Griffith’s friend and coworker Sarah Carver was immediately like, “This is really bad, we need to do something about it.” Carver and Griffith attempted to expose Conn’s con for years, filing complaint after complaint, which were all ignored until a Wall Street Journal reporter happened upon the story. The duo eventually testified before Congress and in court. Conn went to jail, as did the judge and at least one of the shady docs.[1]

Whenever a scandal breaks—a CEO has been embezzling money, a Hollywood producer has been sexually assaulting people, a scientist has been faking data—people are always like “wow, crazy that no one spoke up about it.” But there’s always someone speaking up about it. They whisper it to a friend, they try to bring it up to their boss, they write an anonymous post on Reddit about how they’re working at a scam company and they don’t know what to do. Wrongdoing often goes unchecked not because we’re missing the bravest person, but because we’re missing the second-bravest person, the one who hears the whistleblower and starts blowing their own whistle too.

2. MAKE A SCENE

Nobody’s heard of Samuel Hartlib, Henry Oldenburg, or The Right Reverend John Wilkins, but modern science might not exist without them.

The ten-second history of science goes like this: for about 1200 years, people scribbled in the margins of Aristotle. Then one day Francis Bacon said “hey guys let’s do science” and people were like “sounds good.” But that only happened because a handful of folks made science into a scene.

Hartlib, Oldenburg, Wilkins and their friends established societies, wrote pamphlets, edited journals, and trained apprentices. Hartlib sent approximately one bajillion letters to budding scientists and inventors, hounding them to put their knowledge to practical use. Oldenburg convinced his friends to stop hiding their results—a common practice back then—and publish them instead. Wilkins was a friend to everybody and ensured that science didn’t become just an Anglican thing or a Catholic thing. If not for him, science might have ended up on one side of the English Civil War, and if that side lost, science could have stopped in its tracks for centuries.

Wilkins’ own Wikipedia page notes that he was “not one of the most important scientific innovators of the period”—ouch—but that he “was a lover of mankind, and had a delight in doing good,” aww.

Look at him, he just loves mankind! (source)

I meet lots of idealistic folks who think that all they’re missing is money, or credentials, or access to the levers of power. More often, what they’re really missing is friends. Only a crazy person can toil alone for very long. But with a couple of buddies, you can toil pretty much forever, because it doesn’t feel like toil. That’s how you end up with what economic historians call “efflorescences” and Brian Eno called “scenius” (“scene” + “genius”): hotspots of cool stuff. And for that, we need not just a Francis Bacon, but also a whole gang of Right Reverend Wilkinses.

3. SWITCHBOARDING

Mahzarin Banaji, one of the most famous psychologists alive today, originally planned to be a secretary. Then this happened:

I was traveling home from New Delhi to Hyderabad. At a major railway juncture, I stepped off the train to visit a bookstore on the platform where I bought a set of books that changed the course of my life. Five volumes of the Handbook of Social Psychology (1968) edited by Lindzey and Aronson, were being offered for the equivalent of a dollar a piece.

[...]

I bought the Handbooks out of mild interest in their content, but mostly because it seemed like a lot of book for the money. By the time I reached home twenty hours later, I had polished off a volume and knew with blunt clarity that this form of science was what I wanted to do.

I sometimes wonder: who put those books there? The Handbook is not exactly light reading, and that bookstore could not have been moving a lot of units. And yet a chain of people all decided it was important to put that knowledge on a shelf at a train station somewhere between New Delhi and Hyderabad—and they did this 40 years ago, when that was much more complicated.

I think of this as switchboarding, trying to get the right information to the right person. Someone’s got an empty seat/someone needs a ride. You’re getting into the history of plumbing/I know exactly the book you should read. Your cousin is moving to San Diego and doesn’t know anyone/my former rugby teammate lives there, maybe they can be friends. No two people have the same constellation of connections, nor the same trove of information, and so each of us is a switchboard unto ourselves, responsible for routing every kilobyte to its appropriate destination. Whoever put the Handbook in Banaji’s hands, they were damn good switchboards.

The internet makes it seem like switchboarding is obsolete, but it’s more important than ever. When you’ve got instant access to infinite information, you need someone to show you where to start. And the most important info is still locked up inside people’s heads. If we can unlock it and send it where it needs to go, we can turn it into friendships, marriages, businesses, and unlikely psychologists.

4. CRACK YOUR KNUCKLES

Here is an exemplary scientific paper:

During the author’s childhood, various renowned authorities (his mother, several aunts, and, later, his mother-in-law [personal communication]) informed him that cracking his knuckles would lead to arthritis of the fingers. To test the accuracy of this hypothesis, the following study was undertaken.

For 50 years, the author cracked the knuckles of his left hand at least twice a day, leaving those on the right as a control. Thus, the knuckles on the left were cracked at least 36,500 times, while those on the right cracked rarely and spontaneously. At the end of the 50 years, the hands were compared for the presence of arthritis.

There was no arthritis in either hand, and no apparent differences between the two hands.

People often think they can’t do research because they don’t have a giant magnet or a laser or a closet full of reagents. But they have something the professional scientists don’t have: freedom. The pros can’t do anything that’s too weird, takes too long, or would raise the suspicion of an Institutional Review Board. That kind of stuff has to happen in a basement or a backyard, which is why the “paper” above is, in fact, a letter to the editor written by a medical doctor on a lark.

Independent investigators can thus explore where others fear to tread. For instance:

(See also: An Invitation to a Secret Society)


5. POST GOOD, QUEEN

Culture is everything. If our culture says it’s cool to chase a wheel of cheese down a hill, we will. If our culture says it’s important to dress up in colorful clothes and douse people with buckets of water, we’ll do that. If our culture says that we should use an obsidian blade to cut people’s hearts out of their chests and offer them to the god Huitzilopochtli, we’ll do that too. So it’s pretty important to get our culture right.

We act as if culture is a thing that happens to us, rather than a thing we all make together. That used to be true, of course. When only a few people could read and write, they got to make all the culture themselves. All that started to change as more people got literate, but it really changed once most people got an internet connection. For the first time in history, we all have some say in whether we’re more of a cheese wheel culture, an obsidian blade culture, or, you know, something else.

And yet, to borrow what Douglas Adams said about the creation of the universe, “this has made a lot of people very angry and has been widely regarded as a bad move.” Most people think that social media has made things worse:

If you don’t like how culture is going, that’s a huge opportunity, because culture is us. You can move the needle just by showing up to the places you like to be, posting and promoting the kind of stuff you’d like to see, ignoring the things you don’t like, and vacating the places you think are bad. I’ve personally been inspired and influenced by the people who do this well, like Visa Veerasamy, Alice Maz, Sasha Chapin, Aella, and Defender of Basic.

I think of them as the partygoers who are committed to having a good time, no matter how good the party itself is. Yes, sometimes the music is too loud, the food is bad, the beer is warm, and the whole thing is run by billionaires who are trying to turn your attention into money. You could stand in the corner complaining about how bad and stupid and unfair it all is. Or you could join the handful of people hanging out on the couch and cracking each other up. If enough people do that, eventually the whole party is that couch.

6. GO TO A NONDESCRIPT GOVERNMENT BUILDING AND DO A GOOD JOB

Cave-ins used to kill a few hundred miners every year. Now that number is usually zero, in large part thanks to an algorithm developed by Chris Mark, who started out as a union organizer, then became a coal miner, then trained as an engineer, then went to work for the Bureau of Mines and figured out how to prevent coal mines from collapsing.

It turns out there are lots of Chris Marks out there, relentlessly solving problems from nondescript government offices. You’ve got Arthur Allen, who developed a search system that locates people faster when they’re lost at sea. There’s Pius Bannis, who organized the evacuation and adoption of 1,100 orphaned children after the 2010 earthquake in Haiti. There’s Tony Mento, Camille Otto, and Hari Kalla, who helped get the I-95 bridge back up in 12 days after it collapsed last year. And there’s Darnita Trower, Wanda Brown, and Gerald Johnson, who updated the government’s IT so that nobody has to physically mail anything to the IRS anymore.

Among the high-achieving crowd, it’s only cool to work in government if you’re leading it. So it’s fine to be a Chief of Staff or the Secretary of Labor, but it’s kinda cringe to be an Associate Administrator for Infrastructure or a Principal Strata Control Specialist. And yet, if you drive on I-95, if you work in a mine or on a boat, if an earthquake hits, or if you just want to file your damn taxes, you depend on folks with unimpressive titles doing impressive things.

I know this might sound rich to Americans because our incoming president has vowed to shutter many of these government buildings. But if you look at someone like Chris Mark, he succeeded not just because he was good at math and mines, but also because he found a problem that everybody agreed was worth solving, so he could continue his work across the inevitable change of administrations. It’s hard to thread that needle, which is why these are underrated ways of changing the world, not easy ones.

7. BUILD YOUR RUNK

I think about this tweet a lot:

Or as XKCD puts it:

Notice they both use the word “maintain”—not “invent,” not “lead,” but maintain. The power of a RUNK is that it works consistently. It was there counting numbers before it was cool, and it keeps counting numbers no matter how cool it gets. That’s why you don’t have to be a tech person to build a RUNK. If you do something even moderately useful and you do it no matter what, then people will realize you’re going to be around a while, and they can start building things on top of you.

Back in high school, I didn’t have a Ronald, but I did have a Peggy. She ran the Youth Grantmaking Council of Huron County, which sounds official, but it was really just Peggy convening a dozen high schoolers in a church basement, and telling us we should raise money by doing car washes and hosting junior high dances, and then we should give the money away. YGC has been going for over two decades, and now there’s a whole web of people in rural Ohio who depend on getting a check from those teens. And the whole thing runs on Peggy, who has many fine qualities that make her a good YGC mentor, but the most important one is that she shows up every year.

GOD IS BREAD

Sometimes when I think about everything that’s wrong with the world, I get indignant, like: why wasn’t all this fixed by the time I got here? I mean, really? In 2024? We’re still doing this?

But then I think: why do I, uniquely, deserve to be born in good times? Am I the Most Special Boy in the Universe? My ancestors died in famines and plagues, they suffered under evil kings, they got cut down in stupid wars that were fought because some people thought God is literally a piece of bread and other people thought God is only figuratively a piece of bread.

They all deserved a better world than the one they got. So do I. So do all of us! If only we could convince God to fast-forward us to the future where everything is beautiful and nothing hurts. Alas, He cannot hear us, for He is a piece of bread.

So it’s up to us. When I wrote There’s a Place for Everyone a few months ago, I got a lot of messages that were like, “Well, there’s certainly no place for me.” That breaks my heart. Look at the world! See it aching! It’s got so many problems—I promise you, there’s one with your name on it.

If none of these seven suit you, that’s fine; seven is not a lot! Fortunately, they come from an infinite list. (#8 is “make the list longer.”) I doubt any of these folks grew up thinking they would one day bust the largest Social Security scam in history, or that they’d be the ones to make science palatable to Puritans, or that they’d be best-known for cracking their knuckles or updating a government website. And yet they left the world a little bit better, and they did it without the power of a king or the sacrifice of a saint. For that I salute them, and I say, thank Bread that they exist.

Experimental History is a RUNK in progress

[1] In exchange for their courage, Griffith and Carver were sidelined, stalked, and eventually forced out of their jobs. It turns out nobody higher up in the Social Security Administration was keen to admit they lost half a billion dollars. Another reason why it’s important to Go to a Nondescript Government Building and Do a Good Job (see #6). Now Griffith and Carver work as Certified Fraud Examiners.

Lead pipes are dangerous? Make 'em mandatory

2024-11-06 02:41:58

photo cred: my dad

It’s Election Day in the US, which means everybody is waiting in lines and refreshing their newsfeeds all day, looking for things to do in between bouts of freaking out. So I’m gonna keep things light with the quarterly links ‘n’ updates post, a roundup of stuff I’ve been reading for the past few months.


(1) GET THE LEAD OUT

We’ve known since Roman times that lead pipes can poison people. So uh...why have we used them for ~2000 years? Have we just been poisoning ourselves all this time? According to this paper from 1981, lead pipes are usually fine because a protective mineral crust forms on the inside. But the crust won’t form if you’ve got the wrong minerals in your water, and that’s when you run into trouble.

It seems like people realize this every few decades, freak out, and the lead pipe industry responds by lobbying local governments to write requirements for lead pipes into their building codes. In Chicago, for instance, it was illegal not to use lead pipes until 1986.


(2) According to his chief of staff, Ronald Reagan routinely received astrological guidance via his wife Nancy, who secretly consulted an astrologer named Joan Quigley.


(3) GOODHART’S LAW IN ACTION

Citi Bike pays users to move bikes from stations that have too many bikes to stations that have too few. Apparently you can make a buck by moving all the bikes from one station to another, and then moving them all back.
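Here’s a toy model of why the back-and-forth trick pays. Everything in it is made up for illustration: the payout rule (one point whenever a bike leaves an over-stocked station for an under-stocked one), the station names, and the `target` level. The real program’s rules aren’t described here, so treat this as a sketch of the incentive problem and nothing more.

```python
def reward_for_move(bikes, src, dst, target):
    """Hypothetical rule: pay 1 point if the bike leaves a station that is
    above the target level and arrives at one that is below it."""
    return 1 if bikes[src] > target and bikes[dst] < target else 0


def shuttle(bikes, a, b, target, round_trips):
    """Move every bike from a to b, then every bike back, and repeat.
    Returns the total points earned; bikes is updated in place."""
    points = 0
    for _ in range(round_trips):
        for src, dst in ((a, b), (b, a)):
            while bikes[src] > 0:
                points += reward_for_move(bikes, src, dst, target)
                bikes[src] -= 1
                bikes[dst] += 1
    return points


if __name__ == "__main__":
    stations = {"A": 10, "B": 0}  # hypothetical starting counts
    earned = shuttle(stations, "A", "B", target=5, round_trips=3)
    print(earned)    # 30
    print(stations)  # {'A': 10, 'B': 0}
```

Under these assumed rules, every leg of the trip pays until the two stations are level, and then the mover keeps going and recreates the imbalance in the other direction. Points pile up while the bikes end exactly where they started: the measure (“bikes moved toward empty stations”) has come apart from the goal (“stations stay balanced”).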


(4) SUBSTACK REC

I really dig , an all-book-review Substack. Check out their review of a book called Fear of a Setting Sun, which is about how most of the Founding Fathers ended up thinking that the United States had gone disastrously wrong and was about to collapse.


(5) WAR! HUH! YEAH! WHAT IS IT…? SERIOUSLY THOUGH. I’M LITERALLY ASKING WHAT IT IS

Lots of ideas that seem obvious now had to be invented by somebody, like randomized-controlled trials, crop rotation, and zero (the number). The anthropologist Margaret Mead argued that we should add war to the list: war is an invention, one that some cultures never came up with. For example, when someone pisses you off, instead of going to war you can swear before the gods that you’ll never speak to each other ever again.


(6) The original Hippocratic oath includes a promise not to seduce anybody in your patient’s household. I refuse to accept treatment from any doctor who has not made this pledge. (h/t )


(7) I’M BUGGIN’

Everybody my age remembers that wild autumn of 1998 when the movies Antz and A Bug’s Life came out within a few weeks of each other. Both were “computer-animated films about insects, starring a non-conformist ant who falls in love with an ant princess, leaves the mound, and eventually returns and is hailed as a hero.”

I didn’t realize this kind of thing happens all the time—there are several “twin films” every year. For instance, 2023’s The Pope’s Exorcist and 2024’s The Exorcism both star Russell Crowe as an exorcist. Three films about stage magicians came out in 2006: The Prestige, The Illusionist, and Scoop. If you loved 2018’s A Quiet Place, you’ll love 2019’s The Silence, because “both are about parents with a deaf teenage daughter trying to survive a planet-wide attack from creatures who hunt their human prey by sound.”

Sometimes twin films happen on purpose—a studio tries to rush out some derivative dreck to capitalize on a more famous movie’s success. But sometimes twin films are simply evidence of the bizarre power of the zeitgeist. The Silence started production before A Quiet Place, and it was based on a book from 2015; it just got unlucky and came out slightly too late.


(8) THE HIGH SCHOOL THAT MAKES PEOPLE BELIEVE IN EVIL

Speaking of zeitgeist, I only recently discovered that two of the most famous social psychologists of all time—Stanley Milgram and Phil Zimbardo—went to the same high school. Milgram is best known for his shock studies, where an alarming proportion of people were willing to electrocute a stranger to death in the lab. Zimbardo is best known for the Stanford Prison Experiment, which was once a classic piece of research, but has since been revealed to be more like an episode of reality TV—Zimbardo apparently directed his “guards” to do all the horrible things he later claimed they did themselves. According to Zimbardo, he was the popular one in high school, and Milgram was the smart one.

Anyway, I’m not sure what it was about James Monroe High School in the Bronx that made its students want to perform elaborate pantomimes that demonstrate people’s capacity for evil, but maybe it’s good that the school has since shut down.


(9) Gregor Mendel was originally going to use mice for his famous heredity experiments, but he switched to peas because the bishop in charge of his abbey didn’t want mice boinking in the building.

Edit 11/6/24: This may not be true! See this comment.


(10) I SAW THE GIANT TRACTOR AND I WAS LIKE, WHAT

Like most adults, I don’t know how to talk to kids (“So, uh, do you like that show about the dogs who are also cops?”). So I loved this article from about how kids need different conversational doorknobs than adults do. For instance, we think it’s polite to ask open-ended questions, but it’s easier for kids to respond to multiple-choice questions:

  • Instead of “Did you have fun at gymnastics?” try “Did you love gymnastics today or hate it?” or “Which do you like better, gymnastics or drawing? Or sitting silently in the dark doing nothing?” 

  • Instead of “What’s your favorite food?” you could ask, “Which food do you like best: pizza, ramen, or fish guts?” 

And sometimes it’s easier not to ask at all. Instead, you can offer an interesting tidbit that kids can react to:

“On my way here I saw a tractor with the most gigantic tires I have ever seen! They were bigger than my car! I was like, ‘whaaaat????’”


(11) BROWNE NOISE

There’s nothing more quintessentially human than being extremely skeptical about most things but extremely credulous about one specific thing. The best case of this I’ve found is Sir Thomas Browne, an English physician who wrote Pseudodoxia Epidemica, or, Enquiries into Very Many Received Tenets and Commonly Presumed Truths in 1646. He was a one-man fact-checking department, spending hundreds of pages busting popular myths like:

  • Crystals are just tightly-packed ice

  • If a wolf sees you before you see the wolf, you’ll lose your voice

  • Women have more ribs than men (because God used one of Adam’s ribs to make Eve)

That same Sir Thomas, however, also testified that witches definitely exist, and helped convict two girls accused of witchcraft, who were then hanged. We all contain multitudes, except those of us who are executed by the state because the local expert believes in witches.



(12) THE ORIGINAL SCAT ARTIST

Another guy with multitudes: Mozart. The guy who wrote the tune for “Twinkle Twinkle, Little Star” also wrote stuff like this to his cousin/possible crush:

Well, I wish you good night, but first,

Shit in your bed and make it burst.

Wolfgang and his whole clan loved scatological humor—apparently “Lick my arse” was sort of a Mozart family motto. I’m a sucker for this kind of stuff because we all think great minds are serious and grim, when in fact they’re just as weird as the rest of us. Imagine what horrifying Wikipedia pages might be generated if you became a world-famous artist and scholars pored through your texts after you died.


(13) BEES IN THE TRAP

Earlier this year, I wrote about how a Harvard Business School professor named Francesca Gino was suing the science blog Data Colada for alleging fraud in Gino’s studies. Great news: that lawsuit has been dismissed, which is a victory for scientific discourse everywhere. Just in case, though, I must remind you that nothing on Experimental History can be considered defamation because there’s no evidence I’m sentient at all; I’m just a swarm of bees trapped in a room with a keyboard (PLZ SEND HONEY).


(14) During Prohibition, desperate folks sometimes tried to drink industrial alcohol. The government responded by ordering manufacturers to poison their alcohol, likely killing ~10,000 people.


(15) HOW 2 MAKE LIL DUDES

Worried about declining fertility? Try this recipe for creating an artificial man (c. 1537):

That the sperm of a man be putrefied by itself in a sealed cucurbit for forty days with the highest degree of putrefaction in a horse’s womb [“venter equinus”, meaning “warm, fermenting horse dung”], or at least so long that it comes to life and moves itself, and stirs, which is easily observed. After this time, it will look somewhat like a man, but transparent, without a body. If, after this, it be fed wisely with the Arcanum of human blood, and be nourished for up to forty weeks, and be kept in the even heat of the horse’s womb, a living human child grows therefrom, with all its members like another child, which is born of a woman, but much smaller.


(16) A bit of internet performance art: How to Monetize a Blog


(17) TOAST ‘EM IF YOU GOT ‘EM

In my last links post, I mentioned an Experimental History reader who put his toaster in the dishwasher. A blogger named Nehaveigur has since posted a replication:

I’m now able to report that after drying out in the sun, my toaster still works and is considerably cleaner and that my skepticism of Conventional Wisdom has marginally increased.


(18) MAKE IT RAIN FOR A GOOD CAUSE

An Experimental History reader named Matthew Coleman asked to share a link to Giving Multiplier, a platform that adds extra money to your charitable donations. That link includes a special promo code that will boost your donation even more than the usual rate.


(19) U KNOW U NEED UNIQUENESS

A recent paper claims that people’s “need for uniqueness” has declined over the last 20 years. It would suit my biases if that was true—it fits well with the stuff I wrote about in Pop Culture Has Become an Oligopoly and Oligopoly Everywhere—so I had to be extra careful while reading it. Ultimately, I’m not sure if it can tell us much.

The researchers analyzed responses to an online personality survey that was administered between 2000 and 2020 and includes such items as “I always try to follow rules” and “I tend to express my opinions publicly, regardless of what others say.” They find a statistically significant decrease in people’s self-reported desire for uniqueness over time.

But that decrease is tiny. We’re talking -.008 units per year on a scale that goes from 1 to 5. Here’s what that looks like:

In the Illusion of Moral Decline, I counted changes like this—whether up or down—as meaningless, for three reasons. First, I mean, look at it. Second, you should never expect to get exactly the same answer to a survey question over time: maybe slightly different people are taking the survey or using the internet in the first place, etc., so tiny changes are always suspect.

And third, rather than squinting at each effect and trying to decide whether it was big enough to matter, I set a “Region of Practical Equivalence” (really I just used the default in my stats program) and checked whether the effect fell into it or out of it. This effect would have to be 10x larger to beat that benchmark. So to the extent that “need for uniqueness” is a thing, I don’t think there’s any good evidence that it’s changed over the past 20 years.
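For a sense of scale, here’s the arithmetic as a small Python sketch. The slope, the 20-year window, and the 1-to-5 scale come from the numbers above. The ROPE half-width of 0.08 is an assumption: it’s just what “10x larger” would imply if the benchmark were expressed in the same per-year units, and the default in the stats program isn’t stated here.

```python
SLOPE_PER_YEAR = -0.008      # reported change per year
YEARS = 20                   # 2000 to 2020
SCALE_MIN, SCALE_MAX = 1, 5  # response scale


def total_change(slope, years):
    """Cumulative change implied by a constant yearly slope."""
    return slope * years


def share_of_scale(change, lo, hi):
    """Absolute change as a fraction of the full scale range."""
    return abs(change) / (hi - lo)


def inside_rope(effect, half_width):
    """True if the effect falls within [-half_width, +half_width]."""
    return -half_width <= effect <= half_width


change = total_change(SLOPE_PER_YEAR, YEARS)
print(round(change, 3))                                        # -0.16
print(round(share_of_scale(change, SCALE_MIN, SCALE_MAX), 3))  # 0.04

ASSUMED_ROPE_HALF_WIDTH = 0.08  # assumption, see the note above
print(inside_rope(SLOPE_PER_YEAR, ASSUMED_ROPE_HALF_WIDTH))    # True
```

So over the whole two decades, the average answer drifts by about 0.16 points, or roughly 4% of the distance between the bottom and the top of the scale.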


(20) LIVING THE MILLER HIGH LIFE

I’ve listened to the Two Psychologists, Four Beers podcast for a long time, so it was a real treat to be on a recent episode. I drank two Miller High Lifes and tried to explain why scientists shouldn’t go to jail.


(21) IDEOLOGICAL TURING TEST UPDATE

In my last post, I showed that both Democrats and Republicans can pass an Ideological Turing Test. Some folks thought the test was too easy for the people writing the statements—people pretending to be the other side can just write a few sentences of boilerplate and look exactly like the real thing. Maybe Readers couldn’t tell the difference because there simply wasn’t any signal for them to detect.

This is a reasonable critique, but it doesn’t fit the evidence. People pretending to be the opposite political party did leave signal behind, but the Readers failed to pick it up. We know that because we were able to build an algorithm that reliably distinguished real statements from fake statements. My coauthor Kris has since done some additional analyses, and he was able to outperform both humans and chance by using bidirectional encoder representations from transformers (BERT) in combination with a lasso regression:

Still not perfect, but the fact that BERT gives the right answer 60-80% of the time suggests that pretenders do indeed sound different from the real thing.


(22) FROM THE ARCHIVES

Two years ago, I was trying to figure out: how much should we hate each other?

The great myths of political hatred

(23) And finally: Toilets disguised as books.


See y’all soon.

-Adam

Experimental History is a blog disguised as a toilet

Both Democrats and Republicans can pass the Ideological Turing Test

2024-10-23 20:49:28

photo cred: my dad

This is joint work with Jason Dana and Kris Nichols. You can download a PDF version of this paper on PsyArxiv.


I dunno if you’ve heard, but Democrats and Republicans do not like each other. 83% of Democrats have an unfavorable view of Republicans, and Republicans return the lack of favor in similar numbers. Republicans think Democrats are immoral, Democrats think Republicans are dishonest, and a majority of both parties describes the other party as “brainwashed,” “hateful,” and “racist.” These numbers have only grown in recent decades.

(One particularly evocative statistic: only an estimated 10% of marriages cross party lines.)

But here’s something funny—according to a bunch of recent research, Democrats and Republicans don’t seem to know who they’re hating. For example, Democrats underestimate the number of Republicans who think that sexism exists and that immigration can be good. In return, Republicans overestimate how many Democrats think that the US should have open borders and adopt socialism. Both parties think they’re more polarized than they actually are. And a majority of both sides basically say, “I love democracy, I think it’s great,” and then they also say, “The other party does NOT love democracy, they think it’s bad.”

Maybe these parties hate each other because they misperceive each other? While Democrats and Republicans dislike and dehumanize each other, each side actually overestimates the other side’s hate, and those exaggerated meta-perceptions (“what I think that you think about me”) predict how much they want to do nasty undemocratic things, like gerrymander congressional districts in their party’s favor or shut down the other side’s favorite news channel. When you show people what their political opponents are really like, they see the other side as “less obstructionist,” they like the other side more, and they report being “more hopeful.”

It would be a heartwarming story if it turned out all of our political differences were one big misunderstanding. That story is, no doubt, at least a little true.

But there are two things that stick in my craw about all these misperception studies. First, we know that participants sometimes respond expressively—that is, when Democrats in psychology studies say things like, “Yes, I believe the average Republican would drone-strike a bus full of puppies if they had the chance,” what they really mean is “I don’t like Republicans.” It’s hard to separate legit misperceptions from people airing their grievances.

Second, it’s not clear whether we’ve given people a fair test of how well they “perceive” the other side. So far, researchers have just kinda picked some questions they thought would be interesting, and “interesting” probably means—consciously or subconsciously—“questions where we’re likely to find some big honkin’ misperceptions.” Someone with the opposite bias could almost certainly write just as many papers about accurate cross-party perceptions. There are infinite questions we can ask and there’s no way of randomly sampling from them.

To start untangling this mess, maybe we need to leave the lab and go visit Colorado Springs in 1978.

THE IDEOLOGICAL TURING TEST

That was where a Black police officer named Ron Stallworth posed as an aspiring White supremacist, befriended some Ku Klux Klan members over the phone, and convinced them to let him join their club. (His White partner played the part in person.) At one point, Stallworth got David Duke, the Grand Wizard himself, to expedite his application. By the end of Stallworth’s investigation, the local chapter of the KKK was trying to put him in charge.

(If that story sounds familiar, it’s because it was made into the 2018 Oscar-winning movie BlacKkKlansman.)

Stallworth passed a pretty high-stakes test of his knowledge of Klan psychology, which the economist Bryan Caplan calls the “Ideological Turing Test”—if I can pretend to be on your side, and you can’t tell I’m pretending, then I probably understand you pretty well. In the original Turing Test, people try to tell the difference between a human and a computer. In the Ideological Turing Test, people try to tell the difference between friend and foe.

We thought this would be a useful way of investigating misperceptions between Republicans and Democrats. We first challenged each side to pretend to be the other side, and then we had both sides try to distinguish between the truth-tellers and the fakers. If partisans have no idea who the other side is or what they believe, it should be hard for people to do a convincing impression of the opposite party. So let’s see!

You can access all the materials, data, and code here.1

PART I: WRITERS

We got 902 participants on Amazon Mechanical Turk, roughly split between Democrats and Republicans. (Sorry, Independents: to take the study, you had to identify with one side or the other.)

We asked participants to write a statement of at least 100 words based on one of two prompts, either “I’m a REPUBLICAN because…” or “I’m a DEMOCRAT because…”. Let’s call these folks Writers. The prompts were randomly assigned, so half of Writers were told to tell the truth, and half of people were told to lie.

Writers knew that other participants—let’s call them Readers—would later read their statements and guess whether each Writer was telling the truth or lying. We offered Writers a bonus if they could convince a majority of readers that their statement was true.

(This was pre-ChatGPT, when it wasn’t so easy to whip up some human-sounding text on demand.)

We tossed out a few statements that were either a) totally unintelligible, b) obviously copy/pasted from elsewhere on the internet, or c) responding to the wrong prompt. But otherwise, we kept them all in.2

PART II: READERS

We got another group of 746 Democrats and Republicans, and we explained the first half of the study to them. Then we showed them 16 statements from Part I, which were a mix of real/fake and Democrat/Republican. We asked them to guess whether each one was REAL or FAKE, and we paid them a bonus for getting more right, up to $4.

TIME OUT: DO YOU WANNA TRY THIS FOR YOURSELF?

Before you see the results, you can take the Ideological Turing Test for a spin! A software engineer named vanntile generously volunteered to turn this study into a slick web app: ituringtest.com. You’ll see 10 randomly-selected statements and judge whether each one is real or fake; it takes about three minutes.

(Huge thanks to vanntile for building this, like an angel that came down from Computer Heaven. If you have any interesting projects in software engineering or cybersecurity in Europe, check him out.)

RESULTS

First, let’s look at the most important trials: Democrats reading real/fake Democrat statements, and Republicans reading real/fake Republican statements. Could people tell the difference between an ally and a pretender?

For Democrats, the answer is no:

error bars = 95% confidence intervals

For Republicans, the answer is also no:

error bars = 95% confidence intervals

Fake Democrats and fake Republicans were as convincing as real Democrats and real Republicans.

That means Writers did a good job! When Democrats were pretending to be Republicans, they could have written stuff like, “I’m a Republican because I believe every toddler should have an Uzi.” And when Republicans were pretending to be Democrats, they could have written stuff like, “I’m a Democrat because I’m a witch and I want to cast a spell that turns everyone gay.” They didn’t do that. They wrote statements that looked as legit as statements from people talking about their actual beliefs. So: both Democrats and Republicans successfully passed the Ideological Turing Test.

That’s already surprising, but it gets weirder.

Every participant, regardless of their own party, saw a random mix of all four kinds of statements (real/fake and Democrat/Republican). Here’s a no-brainer: Republicans should be better at picking out real/fake Republicans than they are at picking out real/fake Democrats, right? And Democrats should be better at picking out real/fake Democrats than they are at picking out real/fake Republicans. After all, you should know more about your own side.

Except…that didn’t happen. This next graph gets a little more complicated, so I’ll preface it with the three things that jump out at me:

  1. Neither side did a good job discriminating between real and fake, no matter which party the statement claimed to come from.

  2. Republicans said “REAL” at pretty much the same rate to all four kinds of statements.

  3. Democrats were more likely to flag all Republican statements as fake, whether those statements were actually fake or not.

error bars = 95% confidence intervals

RESULTS, BUT FOR NERDS

To zoom in on how well Readers did, we can use a measure from signal detection theory called d-prime. All you need to know about d-prime is that zero means you’re at chance (you could have done just as well by flipping a coin), above zero means you’re better than chance, and below zero means you’re worse than chance.

Readers from both parties performed basically at chance, regardless of the kind of statements they were reading:

error bars = 95% confidence intervals

There are a couple ways to end up performing at chance. You could say “REAL!” to every statement, or you could say “FAKE!” to every statement, or you could respond randomly. We want to know which one Readers were doing, and signal detection theory has another measure that can help with that: “c”. On the “c” scale, scores above zero mean participants said “REAL!” too often. Scores below zero would mean participants said “FAKE!” too often.

Participants ended up performing at chance in large part because they said “REAL!” too often:

error bars = 95% confidence intervals

One weird thing here: Democrats trust Democrat-claiming statements more than they trust Republican-claiming statements. But Republicans trust both kinds of statements equally. I’m not sure what to make of that, especially because Democrats still trust Republican-claiming statements more than they should.
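
For anyone who wants to see the arithmetic behind those two measures, here is a minimal sketch in Python. It is an illustration, not the analysis code from the study: the function name `sdt_measures`, the example counts, and the small correction that keeps rates away from exactly 0 or 1 are all assumptions. It treats "REAL" as the signal, and it reports bias with the sign set so that positive means "said REAL too often," matching the description above (many textbooks define c with the opposite sign).

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, bias) for one Reader, treating "REAL" as the signal.

    hits: real statements judged REAL; misses: real statements judged FAKE;
    false_alarms: fake statements judged REAL;
    correct_rejections: fake statements judged FAKE.
    """
    z = NormalDist().inv_cdf  # converts a proportion into a z-score
    # Assumed correction: add 0.5 to the count and 1 to the total so that
    # a rate is never exactly 0 or 1 (which would give an infinite z-score).
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    false_alarm_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(false_alarm_rate)      # 0 means chance
    bias = (z(hit_rate) + z(false_alarm_rate)) / 2   # above 0 means too many "REAL!"
    return d_prime, bias


# Hypothetical Reader who says "REAL!" to 6 of 8 real and 6 of 8 fake statements.
credulous_d, credulous_bias = sdt_measures(6, 2, 6, 2)

# Hypothetical Reader who gets all 16 statements right.
perfect_d, perfect_bias = sdt_measures(8, 0, 0, 8)

print(f"credulous Reader: d-prime = {credulous_d:.2f}, bias = {credulous_bias:.2f}")
print(f"perfect Reader:   d-prime = {perfect_d:.2f}, bias = {perfect_bias:.2f}")
```

The credulous Reader lands at a d-prime of zero with a positive bias, which is the same pattern as the one described in this section: chance-level discrimination, driven by saying "REAL!" too much.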

CONFIDENCE

We asked Readers how confident they were about each of their guesses. Overall, confidence was not related to accuracy.

On the graph below, I’m only including the critical trials—Democrats reading statements that claim to be from Democrats, and Republicans reading statements that claim to be from Republicans. This graph is pretty confusing until you understand the pattern: people felt more confident when they thought a statement was real. So people had high confidence on real statements that they got right and on fake statements they got wrong.

error bars = 95% confidence intervals

BUT HOW DEMOCRAT/REPUBLICAN DO THE STATEMENTS LOOK?

After Readers guessed whether each statement was real or fake, we also asked them, “How strongly do you think the writer of this statement identifies with the Democratic party?” and “How strongly do you think the writer of this statement identifies with the Republican party?” You could think of this as a more sensitive measure than a simple stab at real/fake. For instance, if you think this is a real Republican statement, just how Republican is the person who wrote it?

Using these ratings, we can see that fake statements seem just as partisan as real statements. For instance, Readers thought fake Republicans and real Republicans sounded equally Republican:

This suggests our fake Writers were doing something pretty similar to what the real Writers were doing. Fakers could have easily phoned in their statements, maybe because it was difficult for them even to type words that they didn’t believe. For instance, Republicans pretending to be Democrats could have said something like, “I’m a Democrat, but I’m a moderate one! Almost a Republican, really...”. Or they could have gone overboard: “I’m the rootin’est tootin’est Democrat you ever did see!” On average, they didn’t do either of those things. They wrote statements that sounded just as Democrat as statements from real Democrats.

COULD WRITERS PREDICT THEIR (LACK OF) SUCCESS?

In Part I, we asked Writers to predict how well their statement would do—that is, the percentage of Readers who would judge their statement as “REAL!”. On average, Writers guessed correctly. But each individual writer was way off; there was no correlation between their predictions and their performance. So although Writers didn’t over- or under-estimate their performance on average, they had no idea how well their statement was going to do. They were just wrong.

Here’s the graph for Democrat Writers predicting how well they’ll fool Republican Readers:

And Republican Writers predicting how well they’ll fool Democrat Readers:

(I know that line looks like it’s significantly sloped; it’s actually p = .05).
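
It can seem odd that Writers were right on average but clueless individually, so here is a tiny sketch of how both can be true at once. The numbers are invented for four hypothetical Writers and are not from the study; `pearson_r` is just a plain Pearson correlation written out by hand.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented data: percent of Readers each Writer expected to say "REAL!",
# and the percent who actually did.
predicted = [30, 50, 70, 50]
actual = [50, 70, 50, 30]

# Average signed error: overestimates and underestimates cancel out.
average_error = sum(p - a for p, a in zip(predicted, actual)) / len(predicted)
# Average absolute miss: how far off each Writer was, ignoring direction.
typical_miss = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
correlation = pearson_r(predicted, actual)

print(f"average error: {average_error}")
print(f"typical miss:  {typical_miss}")
print(f"correlation:   {correlation}")
```

In this toy example the group's average error is zero, every individual Writer misses by 20 points, and the correlation between prediction and performance is zero.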

WHAT KINDS OF PEOPLE ARE BETTER AT WRITING? WHAT KINDS OF PEOPLE ARE BETTER AT READING?

So far, I’ve been showing you lots of averages. But of course, some Writers wrote statements that sounded way more convincing than others, and some Readers were way better at picking out the real statements from the fake ones. We tried to figure out what made these Writers and Readers better or worse at their jobs, but we couldn’t find much.

Here’s a reasonable hypothesis: the more you identify with one party, the harder it is to pretend to be the other party. Die-hard Democrats probably think all Republicans are nutjobs; die-hard Republicans probably think all Democrats are wackos. In both cases, extremists should be worse at faking, and worse at identifying the fakes.

This reasonable hypothesis is wrong. We asked participants how strongly they identified with each party, and it didn’t affect how well they did as Writers or as Readers, regardless of what they were writing or reading. Across the board, the nutjobs and the wackos were just as good as the mild-mannered centrists.

We also asked people about their age, race, gender, and education. And we tried to figure out the political makeup of their social environment—for instance, maybe Democrats who live in red states or have a lot of Republican friends or family would do better than Democrats who live in blue states and only ever talk to other Democrats. But none of these demographics ever affected Writing or Reading performance more than 5 percentage points, and in most cases they didn’t matter at all.

CAN COMPUTERS DO WHAT HUMANS CAN’T?

At this point, we started wondering whether it was even possible to tell the difference between real and fake statements. Maybe Writers were so good that they left no detectable trace. That would be pretty impressive, though it might also mean our task was too easy.

To find out, we did a bunch of fancy computer stuff. Well, specifically, my friend Kris Nichols did a bunch of fancy computer stuff.

Surprisingly, a lot of the fancy computer stuff didn’t outperform humans. Random forest models? No better than chance. Latent Dirichlet Allocation? Bupkis. The only thing that worked was a souped-up lasso regression, which got the right answer about 70% of the time, much better than the 50% humans got. This means there was something different about real and fake statements; humans just couldn’t pick it out.
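
This post doesn't spell out what made the lasso "souped-up" or which features it used, so the following is only a generic sketch of the underlying idea, not Kris's model: a logistic regression with a lasso (L1) penalty, which shrinks the weights of unhelpful features to exactly zero. Everything in it is invented for illustration, including the function names, the tuning values, and the eight toy "statements," each described by two made-up features coded +1 or -1.

```python
import math


def soft_threshold(value, amount):
    """Shrink value toward zero; anything within `amount` of zero becomes exactly 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_logistic(rows, labels, penalty=0.05, step=0.5, iterations=500):
    """Fit logistic regression with an L1 penalty by proximal gradient descent."""
    n, k = len(rows), len(rows[0])
    weights, intercept = [0.0] * k, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, label in zip(rows, labels):
            score = intercept + sum(w * x for w, x in zip(weights, row))
            error = 1.0 / (1.0 + math.exp(-score)) - label  # predicted prob - truth
            grad_b += error / n
            for j, x in enumerate(row):
                grad_w[j] += error * x / n
        intercept -= step * grad_b  # the intercept is not penalized
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, intercept


# Toy data: label 1 = real statement, 0 = fake statement.
# Feature 0 is a made-up cue that tracks the label; feature 1 is pure noise.
rows = [[1, 1], [1, 1], [1, -1], [1, -1],
        [-1, 1], [-1, 1], [-1, -1], [-1, -1]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_l1_logistic(rows, labels)
guesses = [1 if intercept + sum(w * x for w, x in zip(weights, row)) > 0 else 0
           for row in rows]
accuracy = sum(g == y for g, y in zip(guesses, labels)) / len(labels)

print("weights:", weights)
print("accuracy on the toy data:", accuracy)
```

The useful property is that the noise feature ends up with a weight of exactly zero, so the model keeps only the cues that help. Real statements are nowhere near this tidy, which is why the real model tops out far below 100%.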

EDIT 11/4/24:

Kris ran some additional analyses where he got a computer to do even better. He used bidirectional encoder representations from transformers (BERT) in combination with the lasso regression above, and got even better results. As you can see below, BERT is able to discriminate between real and fake statements, although it’s still far from perfect:

WHAT ABOUT THAT COMPUTER THAT CAN TALK TO YOU?

We gave ChatGPT the same instructions that we gave the human participants, and then fed it 48 statements (kind of like taking the study three times). We ran this in May 2024 using the paid version, which I believe was called GPT-4 at the time.

Here’s how it did:

ChatGPT really thought people were telling the truth. There was one statement that began:

I am a Democrat because I am a transgender midget. I feel like I am a woman inside and It does not reflect this. I m currently in transition and I find only democrats accept me. I live in Portland Oregon and love the city as it really reflects who I am.

ChatGPT thought that statement was real. (It wasn’t.) Indeed, ChatGPT was too credulous overall, even more so than humans.

(Sarcastic statements like the one above were really rare. And if you want to see more statements, remember you can try this study yourself.)

DISCUSSION

Do Republicans and Democrats understand one another? The answer from research so far has been a resounding “NO!” According to the Ideological Turing Test, however, both sides seem to understand each other about as well as they understand themselves.

Of course, the ITT isn’t the be-all, end-all measure of misperception. Like any other measure, it’s just one peek at the problem. But this peek seems a bit deeper and wider than asking people to bubble in some multiple choice questions.

I was pretty surprised when I first saw these results, but I can guess why it worked out this way.  No matter who you are, you hear about Republicans and Democrats all the time. Everyone knows which side supports abortion and which side wants to limit immigration. Some people argue that media and the internet distort these differences: “Democrats want to abort every child!” “Republicans want to build a border wall around the moon!” I’m sure the constant noise doesn’t help, but it also doesn’t seem to have fried our participants’ brains as much as you might expect, and that’s good news.

These results also suggest that America’s political difficulties aren’t simply one big misunderstanding. If one or both sides couldn’t pass the ITT, that would be an obvious place to start trying to fix things—it’s hard to run a country together when you’re dealing with a caricature of your opponents. When both sides sail through the ITT no problem, though, maybe that means Republicans and Democrats have substantive disagreements and they both know it.

(How do we solve those disagreements? Uhhh I dunno I’m just a guy who asks people stupid questions on the internet.)

LIMITATIONS

We would be remiss not to mention an important limitation of our study. Turing’s original paper mentions a potential problem with his “Imitation Game”:

I assume that the reader is familiar with the idea of extra-sensory perception, and the meaning of the four items of it, viz. telepathy, clairvoyance, precognition and psycho-kinesis. These disturbing phenomena seem to deny all our usual scientific ideas. How we should like to discredit them! Unfortunately the statistical evidence, at least for telepathy, is overwhelming. [...] If telepathy is admitted it will be necessary to tighten our test up. [...] To put the competitors into a ‘telepathy-proof room’ would satisfy all requirements.

Unfortunately, we were not able to locate any telepathy-proof rooms for our study, so this should be considered a limitation and an area for future research.

FUTURE DIRECTIONS

This is just one version of the Ideological Turing Test3. You could run lots of different iterations, and some of them might make it harder for the fakers to succeed. Maybe fakers would fall apart if you asked them to write 1,000 words instead of 100. (But maybe it would also be hard for people to write 1,000 words about their own beliefs and still sound convincing.) Maybe people could ferret each other out if you gave them the chance to interact. (But maybe people wouldn’t know which questions to ask, and maybe everybody starts looking suspicious under questioning.) These are great ideas for studies and we have no plans to run them, so we hope you run them and then you tell us about them.

You could also, of course, run lots of ITTs on your favorite social cleavages. Men vs. women! Pro-Palestine vs. pro-Israel! Carnivores vs. vegetarians! Please feel free to use our materials as a starting point; we look forward to seeing what you do.

And remember, if you’ve been thinking to yourself this whole time, “I could do better than these idiots”—well, try the ITT for yourself!

Experimental History is coming to you live from a telepathy-proof room

1

We collected this data in 2019 and it is 100% my fault that we haven’t posted it until now, because it was always a side project and everything else was due sooner, and also I’m a weak, feral human. I would be totally surprised if you got different results today, but weirder things have happened.

2

Online data is pretty crappy if you don’t screen it, so we screen it a lot; see this footnote for more info. There are fewer Republicans in online samples, so we ran a separate data collection that over-recruited them.

3

Just want to shout out two studies that used ITT-like-things to study other topics: The Straw Man Effect by Mike Yeomans, and this paper by a team of researchers in the UK. Yeomans finds that people don’t do a good job pretending to support/oppose ObamaCare, while the UK team finds that people are pretty good at pretending to support/oppose covid vaccines, Brexit, and veganism. Maybe the difference is the issues they study, or maybe it’s that they only ask people to list arguments, rather than write a whole statement.

If you have Imposter Syndrome, maybe you're on to something

2024-10-16 00:16:52

photo cred: my dad

It’s nuts how many people feel like imposters. Apparently 82% of Italian neurosurgeons, 56% of Romanian psychology undergrads, and 39% of Korean Catholics report at least moderate symptoms of Imposter Syndrome. That’s a whole lotta folks feeling like they’re frauds.

I think there’s two things going on here, one obvious, one mysterious. First, the obvious: a lot of people feel like imposters because other people make them feel that way. If you feel unwelcome because you have a bigoted boss or cruel colleagues, I understand why we would call that “Imposter Syndrome,” even though it seems more like “being surrounded by people who have Asshole Syndrome.”

But then, the mysterious: a lot of people are quick to brand themselves as frauds. I used to have students all the time who were like, “I’m not good enough to be here, I think my admission letter was a clerical error.” I wanted to grab ‘em by the shoulders and tell ‘em:

“One of your classmates got in because his dad donated a building. One of your professors has p-hacked every study she’s ever done; if anyone bothered to investigate her, she’d be fired and her scientific legacy would be expunged. The so-called Dean of Inclusion is a huge jerk. Those guys are the imposters. Not you. Please carry on.”

I never had much success getting those students to believe that they belonged, and I used to find this mystifying. But I think I understand now why their syndrome was so stubborn: they were actually on to something. The Italian neurosurgeons, the Romanian undergrads, my own students—they’re all correct that there’s a lot of fraud going on, but they’ve mistaken its origin. They’re not fraudulent people. They’re in a fraudulent place.

I THINK, THEREFORE I SCAM

Is psychology going to Cincinnati?

2024-10-09 02:15:07

photo cred: my dad

There’s a thought that’s haunted me for years: we’re doing all this research in psychology, but are we learning anything? We run these studies and publish these papers and…then what? The stack of papers just gets taller?

This worries me because I want to contribute to the grand project of understanding the mind, but I’m not sure how to do that. What’s the point of tossing another paper on the pile when it’s not clear that the last 100 papers added anything? Running studies can be a pain in the butt, so I’d like to do work that matters, not just work that gets a gold star and goes up on the fridge (“Good job, lil buddy! You did so much psychology today!”).

That’s why I’ve returned to these questions again and again and again. But I’ve never come up with satisfying answers, and now I finally understand why.

I’ve got this picture in my head: we’re all on a bus that’s supposedly going to Cincinnati. But there are no road signs and we don’t have a GPS, so we have no idea if we’re going in the right direction. We can’t measure our progress by how much gas we’re burning, or whether we’ve upgraded from a manual transmission to an automatic, or whether the government bought us a new bus. And you can’t just look out the window and go, “I dunno, kinda feels like we’re headed toward Arkansas,” which, I realize now, is what I’ve been doing so far.

Instead, you first gotta ask: how could we know we’re getting closer to Cincinnati?

In my estimation, there are five ways to measure psychology’s progress, and we are succeeding in exactly one of them.

1. WE’VE DONE A GREAT JOB OVERTURNING OUR INTUITIONS, THUMBS UP ALL AROUND

In order to survive, every human needs to have some model of the world: how their body functions, how animals behave, how matter moves, etc. Psychologists call these “folk” theories—folk physics, folk biology, folk economics, and so on—the kind of explanations you might come up with if you just kinda bumble around, explanations that are good enough to keep you alive, but often go wrong. One way that science can make progress, then, is by finding the places where our folk theories miss the mark.

Psychology has done this really well. In fact, I’m gonna go out on a limb here: I think pretty much all progress in psychology can be summed up as “overturning intuitions.” 

You see this right away when you look at lists of our field’s classic works (after deleting all the stuff that doesn’t replicate). They’re all stories about how you think people would or should do one thing, but then they do another thing instead: the Milgram shock studies1, the pantheon of cognitive biases, those extremely cute studies where children think they can violate the laws of physics, etc.

Everything I’ve ever listed as an “underrated idea in psychology” (vol I, vol II) also falls into this category. Ditto for the things on this list of major recent discoveries assembled by the psychologist .

I think this work is wonderful. It’s why I became a psychologist in the first place. I got my mind blown over and over again—you’re telling me that when something bad happens to me, I’ll probably recover faster than I expect? That I don’t actually understand how a bicycle works? That I have a bunch of vivid but false memories in my head, like the Monopoly Man wearing a monocle? Of course I wanted to get in on that!

Everything I’ve done since then has fit the same mythbusting mold: the illusion of moral decline, widespread misperceptions of long-term attitude change, do conversations end when people want them to, the liking gap—all of those studies involved fact-checking people’s folk psychology. 

We’re in good company here, because this is how other fields got their start. Galileo spent a lot of time trying to overturn folk physics: “I know it seems like the Earth is standing still, but it’s actually moving.” And early biologists like Francesco Redi spent a lot of time trying to overturn folk biology: “I know it seems like bugs come from rotting meat, but bugs actually come from other bugs.” (“Also, you can’t turn basil into scorpions.”)

There’s still plenty to be done in this vein. Our folk psychology is a thick tangle of erroneous assumptions; surely there are a few weeds left. And we might find them faster if we were explicit about that goal. I spent so many hours alone in my office being like, “What am I actually doing?”, and I would have gotten more done if I had been like, “Oh, my job is to put folk psychology to the test.” That’s not a guarantee—there’s still an art to picking the right intuitions and the appropriate tests, but it’s easier to do art when you don’t have to first invent the idea of a paintbrush.

2. UNFORTUNATELY, WE’RE STILL LOSING TO BULLSHITTERS

Here’s one way to measure our progress: take some people who are armed with the psychological literature and pit them against other people who are armed only with their own intuitions. The more we learn, the more this should be like a fight between gun-toters on one side and knife-wielders on the other. For instance, a bridge built using actual physics should hold up better than a bridge built using folk physics.

We don’t run a lot of these John Henry-style showdowns, but when we do, psychology does not win a resounding victory. Here are three examples.

1. Our anti-hate interventions are about as good as a Heineken commercial

The Strengthening Democracy Challenge tested all sorts of ideas for reducing animosity between Democrats and Republicans. Many of them didn’t work. One of the top performers, though, was this Heineken ad from 2017, which was made not by psychologists for the purposes of helping people get along, but by marketers for the purposes of selling beer.

2. Licensed therapists aren’t obviously better than a random nice person

In clinical and counseling psychology, there’s an ongoing debate about whether their training actually does anything. One uncomfortable finding: trainees can do just as well as fully licensed therapists. 

In fact, in the late 1970s, two researchers assigned2 a small group of college men to receive treatment either from trained therapists or from professors in a variety of disciplines who were selected based on their “untutored ability to form understanding, warm, and empathetic relationships.” At the end of the study, it looked like the professional therapists and the affable professors were equally effective. This study is too small and low-quality to count for much, but it’s concerning that it wasn’t a slam-dunk in favor of the pros.

3. Personality psychologists perform about as well as shamans

The Big Five theory claims that all aspects of human personality boil down to five factors: openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism. There’s some disagreement about whether it’s better to describe personality using six or two factors or whatever, but most people will tell you that the Big Five is solid, a great achievement, a good theory backed up by decades of empirical studies.

So how well does the Big Five perform against, say, some personality tests that people pulled out of their asses?

A group called ClearerThinking tested exactly that by comparing the Big Five to the Myers-Briggs Type Indicator3 and the Enneagram, two personality assessments that were created outside of academic psychology. (The Enneagram was apparently invented by a series of spiritual teachers, and until Isabel Briggs Myers popularized her eponymous personality test, she was most famous for writing racist detective stories.) The ClearerThinking folks checked how well each test predicted things about participants’ lives, like whether they’re married, own a home, have been convicted of a crime, meditate daily, and have a lot of friends.

Here are the results (taller bars = better predictions):

Source: ClearerThinking. “Jungian binary” = their best attempt at aping the Myers-Briggs

The Big Five comes out ahead, but part of its lead comes from statistics rather than psychology. The Big Five uses continuous scores (e.g., “you’re a 12 on the agreeableness scale”) rather than converting everything to four letters like the Myers-Briggs does (e.g., “you’re an ENTJ”), or to a single number like the Enneagram does (e.g., “you’re a nine”). Collapsing scores like that makes them easier for humans to interpret (which is exactly why they do it), but it throws away data, so in a statistical showdown you’re shooting yourself in the foot.4
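
To see why collapsing scores costs predictive power, here is a toy sketch. The data are invented (ten hypothetical people with trait scores from 1 to 10 and an outcome that tracks the score exactly), and the "high vs. low" split at 5 is an assumption standing in for the way a type-based test sorts people into categories. It has nothing to do with ClearerThinking's actual data or method.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented data: a continuous trait score and an outcome that follows it exactly.
scores = list(range(1, 11))
outcome = [float(s) for s in scores]

# Collapse the continuous score into two categories, like a letter in a type code.
high_or_low = [1 if s > 5 else 0 for s in scores]

r_continuous = pearson_r(scores, outcome)
r_binary = pearson_r(high_or_low, outcome)

print(f"continuous score vs. outcome: r = {r_continuous:.2f}")
print(f"high/low label vs. outcome:   r = {r_binary:.2f}")
```

Even in this best-case setup, where the score predicts the outcome perfectly, the two-category version predicts noticeably worse, because everyone inside a category gets treated as identical.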

If you use the continuous Myers-Briggs and Enneagram scores, the Big Five’s lead narrows or disappears altogether (higher numbers = better prediction):

When I saw this result, I broke out in a sweat. Whatever it is we psychologists do, we’ve done a lot of it to the Big Five. After millions of dollars and thousands of studies, we are not obviously better at predicting life outcomes than people who… didn’t do any of that. 

3. WE HAVE A HARD TIME USING OUR KNOWLEDGE TO DO STUFF

The magic of basic science is supposed to go like this: if you let the nerds chase their little science fantasies, they will eventually—but unpredictably—produce something useful, even if they never intend to. That’s a good sign the nerds are onto something, and not just spending the grant money on beer and Warhammer figurines. In psychology, progress of this kind might look something like, “we’re getting better at treating mental illnesses.” 

We do not appear to have progress of this kind. According to this meta-analysis, we’re no better at treating youth mental illness today than we were 50 years ago. This one says we haven’t gotten any better at treating adult depression. In fact, there was some worry that cognitive-behavioral therapy was becoming less potent over time; that might be a statistical artifact, but the best-case scenario is that it hasn’t gotten any better.

Another way of thinking about it: of the 298 mental disorders in the Diagnostic and Statistical Manual, zero have been cured. That’s because we don’t really know what mental illness is, which you can tell by the fact that every diagnosis has an “unspecified” version, where the therapist goes “I dunno, you’re kinda messed up in this general direction.”

On the experimental side, our biggest success story is supposed to be “nudges,” or small environmental changes that help people make better decisions. At least 12 countries have started some kind of “Nudge Unit” to harness “behavioral insights” for the good of their citizens, and one of the nudge fathers won a Nobel Prize, so this seems pretty solid. 

When you watch the nudgers in action, though, it doesn’t look like we’ve learned that much. In 2022, a big group of psychologists published a “megastudy” where they tried 53 different interventions to increase gym attendance. These are the best behavioral scientists in the biz, working with a big budget, and trying to get people to do something they already do and want to do even more.

The results: less than half of the interventions worked. When experts in behavioral science and public health tried to predict the outcomes, they did no better than chance.

(In another megastudy—this one on vaccine messaging—non-experts were actually better than experts at predicting which interventions would work.)

So yes, we can change people’s environments to help them make better decisions, but we still don’t really know how to do this. Even our best ideas sometimes backfire. For instance, the original book on nudging pointed out that people are more likely to become organ donors when you make them opt out, rather than opt in. And yet, countries that have since switched to an opt-out system have seen the number of organ donors sometimes increase, sometimes decrease, and sometimes do nothing at all.5 Our state-of-the-art is still “think of 53 different things and then try all of them.” This isn’t super reassuring—if I hired a plumber to install a toilet in my house and he was like, “Sure thing, I’ll just install 53 different toilets and then check which ones flush,” I’d be like, “perhaps I’ll get another plumber.”6

4. WE HAVEN’T BEEN ABLE TO TURN OUR KNOWLEDGE INTO TECHNOLOGY

The nice thing about a modern internal combustion engine is that you don’t have to know anything about combustion to use one. It wasn’t always that way: the first engines were so finicky and complicated that only engineers could run them. When we understand something well enough that we can cram it into a box and hand it over to a non-expert, that’s progress.

I don’t think psychology has anything like this. The closest things I can think of are apps and programs that are built around ideas from psychology, like Anki (a memorization app), Save More Tomorrow (a retirement savings program), or any of the CBT apps on the market today. But beyond those literal three things, I’m drawing a blank. I know lots of things will claim to be built on solid psychological findings, but that doesn’t mean they actually are, or that they actually work, or that those findings are actually solid.

5. OUR OLD QUESTIONS HAVEN’T BECOME SILLY YET

If I go to the doctor and I’m like, “Doc, I’m coughing and sneezing all the time, which one of my humors is outta whack?” all I’m going to get is a blank stare. You could say my question is wrong, but it’s more accurate to say it’s nonsensical. This just isn’t the way we think about things anymore—we have a totally different worldview that doesn’t include humors at all. That’s a sign we’ve undergone a paradigm shift, and it’s potentially a sign of progress.

Psychology hasn’t done this. For instance, it’s true that lots of psychologists will look at you weird if you ask a question like, “Does this phenomenon have more to do with the id or the superego?” But many other psychologists would be happy to answer a question like that. 29% of practicing therapists—a plurality—are trained in the psychodynamic tradition, where such ideas are still mainstream. You can get trained in “contemporary Freudian” psychoanalysis at Columbia, Emory, and NYU. A psychologist who dispenses cognitive-behavioral therapy and a psychologist who interprets your dreams could have the same degree from the same university. They could even be the same person.


WON’T YOU BE MY BASIL VALENTINE

I’ve made versions of this argument to lots of people, and the most common response I get goes something like this: well, this is as good as it gets! Unlike the physical world, where you can explain lots of things with a few simple laws, humans are very complicated—random, even. There are too many variables! That’s why psychology will never be a “real” science.

The more history I learn, the funnier this argument seems.

I recently read The Secrets of Alchemy by Lawrence Principe, which I loved, especially because he tries to replicate ancient alchemical recipes in his own lab. And sometimes he succeeds! For instance, he attempts to make the “sulfur of antimony” by following the instructions in The Triumphal Chariot of Antimony (Der Triumph-Wagen Antimonii), written by an alchemist named Basil Valentine7 sometime around the year 1600. At first, all Principe gets is a “dirty gray lump”. Then he realizes the recipe calls for “Hungarian antimony,” so instead of using pure lab-grade antimony, he literally orders some raw Eastern European ore, and suddenly the reaction works! It turns out the Hungarian dirt is special because it contains a bit of silicon dioxide, something Basil Valentine couldn’t have known.

Our dude Basil, eyebrow raised as if to ask, “Is that dirt from Hungary?” Source

No wonder alchemists thought they were dealing with mysterious forces beyond the realm of human understanding. To them, that’s exactly what they were doing! If you don’t realize that your ore is lacking silicon dioxide—because you don’t even have the concept of silicon dioxide—then a reaction that worked one time might not work a second time, you’ll have no idea why that happened, and you’ll go nuts looking for explanations. Maybe Venus was in the wrong position? Maybe I didn’t approach my work with a pure enough heart? Or maybe my antimony was poisoned by a demon!

An alchemist working in the year 1600 would have been justified in thinking that the physical world was too hopelessly complex to ever be understood—random, even. One day you get the sulfur of antimony, the next day you get a dirty gray lump, nobody knows why, and nobody will ever know why. And yet everything they did turned out to be governed by laws—laws that were discovered by humans, laws that are now taught in high school chemistry. Things seem random until you understand ‘em.8

So yes, it’s possible that psychology is nothing like any science we’ve ever done before, that we’ve finally met our match, that progress will always be modest, and we should be happy with what we got. But that prophecy is self-fulfilling: if you think this is all we can do, then this is all we will do.9

When I look at the progress psychology has and hasn’t made, I don’t come to the conclusion that we’re dumb; I come to the conclusion that we’re young. We’re early in our history. This is, I think, the fundamental disagreement I have with so many of my colleagues, who seem to think we’re in our middle age, if not beyond it. That would explain why they’re so interested in fact-checking our legacy and making sure that everything we do fits nicely into everything we’ve done before—these are things you do when your field is advanced in years.

But if you think we haven’t even hit puberty yet, you entertain far stranger thoughts, like “How might we finally grow up?”

LUPÉ FIASCO? MORE LIKE LOOPY FIASCO

Scientific revolutions arise from crises—that moment when we’ve piled up too much stuff that doesn’t make sense and the dam finally breaks, washing away our old theories and giving us space to build new ones. The classic example is epicycles, the little loop-de-loops that old-timey astronomers had to add to the orbits of heavenly bodies so that the math on geocentrism kept working out:

Eventually there were too many loops, and the whole system collapsed.

Nothing like this will ever happen in psychology. Our theories are never fact-checked by the movements of the sun and the planets, and our beliefs are too vague and expansive to be troubled by the appearance of a new finding. Our appetite for epicycles is therefore infinite—or, really, we have no orbits to attach them to, so it’s epicycles all the way down. We’re like a pressure cooker with the valve jammed open; all the steam escapes out the top, so no pressure ever builds up.

If we want a crisis and a revolution, then, we’ll have to engineer one ourselves.

THE MYSTERY OF THE TELEVISED SALAD

I see two ways forward, which I’ll only sketch out for now, because we’re already at the butt-end of a long post.

One: we can keep debunking folk psychology. Our intuitions about other people run deep and strange—the book of folk psychology is likely several times longer than the books of folk physics, folk biology, etc., so there are plenty of pages left to edit. I think that’s a cool thing to do, and I’ll probably keep doing some of it myself. So far, however, our considerable progress on this front doesn’t seem to have caused lots of progress on the other fronts, and I’m not optimistic that will change in the future. That’s why I don’t think we should only do this.

Two: we can shake things up. And I mean really shake things up. The philosopher Michael Strevens says that doing science requires an “alien mindset” where you entertain ridiculous thoughts like “perhaps Aristotle needs some updating” or “maybe we should toss balls of varying weights off a tower and see which one hits the ground first.” Those thoughts don’t seem alien anymore, of course, because they worked out. But going full alien-brain today means you will have to entertain thoughts that incur reputational risk, like “maybe you don’t have to pay attention to the literature” and “maybe we should ask people how their toothbrush could be different”.

Once you get into the alien mindset, the best thing to think about is mysteries. What are the self-evident phenomena: things that definitely happen, but that we cannot explain? Here are a few:

  • The average American watches 2.7 hours of television per day. We write this off as “leisure,” as if that’s an explanation. Why is it fun to watch someone make a salad on TV? Why do some people find it fun to stare at a person spinning a wheel and buying vowels, while other people find it fun to stare at vampires kissing? Why can an episode of “Paw Patrol” stop rampaging toddlers in their tracks? 

  • People come up with new things all the time—new business ideas, new novels, new salads to make on TV. How do they do this? We’ve got mathematicians saying that the solution to a problem just appeared to them while they were getting on a bus, we’ve got writers saying they feel like they’re “taking dictation from God”, we’ve got Paul McCartney saying “Yesterday” came to him in a dream. What the hell is going on here?10

  • Why do so many drugs have paradoxical reactions? For example: some people feel better when they take antidepressants, but some people feel way worse. Some of this mystery will have to be unraveled from the bottom up by the folks who study the brain, but some of it will have to be unraveled from the top down by the folks who study the mind.

I gotta say, making that list was hard! Thinking of mysteries is kind of like trying to imagine a color you haven’t seen before. I’d love to see more lists of self-evident phenomena we can’t explain; not “how do we reduce scores on the Modern Racism Scale by 10%” but “people do this weird thing…WHY.”

These, I think, are the only two paths that lead anywhere. If we try to falsify folk psychology, or if we start back at square one, we stand a chance at making progress. Otherwise, we’ll just keep making our stack of papers taller and taller, to no avail.

Okay, enough! Godspeed to us all, and may we all meet one day in Cincinnati, where our reward shall be great:

Yes, people in Cincinnati actually eat this. By which I mean, people in Cincinnati get to eat this. Source.


1

Whenever I mention Milgram I have to say that, despite attempts to debunk his work, I think it remains bunked. So many old studies have turned out to be fraudulent or at least misrepresented—the Stanford Prison Experiment, Robbers Cave, the Rosenhan pseudopatient study—that I think people are now too quick to assume that everything famous and old must be fantastical.

2

I would say randomly assigned, but the authors are evasive about just how random the assignment was: “Assignment was intended to be random, but certain compromises were dictated by clinical realities, such as availability of therapists.” Another reason why I find this study provocative, but not necessarily trustworthy.

3

Technically they had to use a test “inspired by” the Myers-Briggs because the actual test is proprietary and paywalled.

4

For a more comprehensive defense of the Myers-Briggs, see this post by .

5

I learned of this study from this recent post on nudging by the economist .

6

Plus, as the economist/evolutionary biologist points out, you shouldn’t even expect interventions that “work” to generalize very far—if something worked at a gym, would it also work at a yoga studio? Would it work on elderly people? Would it work for getting people to return their library books? The only way we could answer those questions is by running megastudy after megastudy.

7

Probably a pseudonym, and a dope one. No word on whether Basil Valentine ever tried to turn himself into scorpions.

8

In fact, when psychologists discover that someone has failed to replicate their finding, they sound a lot like exasperated alchemists from the 1600s: “No, you idiots! The experiment didn’t work for you because you don’t have the right touch!”

9

For more on the parallels between modern-day psychology and old-timey alchemy, see ’s Alchemy is ok.

10

I realize there are hundreds—if not thousands—of papers on “creativity,” but just because much has been done, that doesn't mean much is known. (If you handed Paul McCartney the entire literature on creativity, would he write more “Yesterday”s?) The best thing to do is forget all of it, estrange yourself from the word “creativity” entirely, and start with the extremely bizarre fact that humans write songs and novels and solve math problems, and we don't know how this happens.