MoreRSS

site iconMIT Technology ReviewModify

A world-renowned, independent media company whose insight, analysis, reviews, interviews and live events explain the newest technologies and their commercial, social and polit.
Please copy the RSS to your reader, or quickly subscribe to:

Inoreader Feedly Follow Feedbin Local Reader

Rss preview of Blog of MIT Technology Review

The Download: gene de-extinction, and Ukraine’s Starlink connection

2025-03-07 21:10:00

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

The short, strange history of gene de-extinction

This week saw the release of some fascinating news about some very furry rodents—so-called “woolly mice”—created as part of an experiment to explore how we might one day resurrect the woolly mammoth.

The idea of bringing back extinct species has gained traction thanks to advances in sequencing of ancient DNA. This ancient genetic data is deepening our understanding of the past—for instance, by shedding light on interactions among prehistoric humans. But researchers are becoming more ambitious. Rather than just reading ancient DNA, they want to use it—by inserting it into living organisms.

Because this idea is so new and attracting so much attention, I decided it would be useful to create a record of previous attempts to add extinct DNA to living organisms. And since the technology doesn’t have a name, let’s give it one: “chronogenics.” Read the full story.

—Antonio Regalado

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here

If you’re interested in de-extinction, why not check out:

+ How much would you pay to see a woolly mammoth? We spoke to Sara Ord, director of species restoration at Colossal, the world’s first “de-extinction” company, about its big ambitions.

+ Colossal is also a de-extinction company, which is trying to resurrect the dodo. Read the full story.

+ DNA that was frozen for 2 million years has been sequenced. The ancient DNA fragments come from a Greenland ecosystem where mastodons roamed among flowering plants. It may hold clues to how to survive a warming climate.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Ukraine is worried the US could sever its vital Starlink connection
Its satellite internet is vital to Ukraine’s drone operations. (WP $)
+ Thankfully, there are alternative providers. (Wired $)
+ Ukraine is due to start a fresh round of war-ending negotiations next week. (FT $)
+ Meet the radio-obsessed civilian shaping Ukraine’s drone defense. (MIT Technology Review)

2 Israel’s military has trained a powerful AI model on intercepted Palestinian data
The ChatGPT-like tool can answer queries about the people it’s monitoring. (The Guardian)

3 Donald Trump has suspended tariffs on Canada and Mexico
Until April 2, at least. (Reuters)
+ It’s the second time Trump has rolled back import taxes in as many days. (BBC)
+ How Trump’s tariffs could drive up the cost of batteries, EVs, and more. (MIT Technology Review)

4 Can someone check on NASA’s Athena lunar lander?
While we know it reached the moon, it appears to have toppled over. (NYT $)
+ If it remains in an incorrect position, it may be unable to complete its mission. (CNN)
+ Its engineers aren’t sure exactly where it is on the moon, either. (NBC News)

5 Shutting down 2G is easier said than done
Millions of vulnerable people around the world still rely on it to communicate. (Rest of World)

6 The hunt for the world’s oldest functional computer code
Spoiler: it may no longer be on Earth. (New Scientist $)

7 Robots are set to compete with humans in a Beijing half marathon🦿
My money’s on the flesh and blood competitors. (Insider $)
+ Researchers taught robots to run. Now they’re teaching them to walk. (MIT Technology Review)

8 Where did it all go wrong for Skype?
It was the world leading video-calling app—until it wasn’t. (The Verge

9 Dating is out, matchmaking is in
Why swipe when a platform can do the hard work for you? (Wired $)
+ Forget dating apps: Here’s how the net’s newest matchmakers help you find love. (MIT Technology Review)

10 Apps are back, baby! 📱
It’s like the original smartphone app boom all over again. (Bloomberg $)

Quote of the day

“You can only get so much juice out of every lemon.”

—Carl-Benedikt Frey, a professor of AI and work at Oxford University’s Internet Institute, explains why pushing AI as a means of merely increasing productivity won’t always work, the Financial Times reports.

The big story

The cost of building the perfect wave

June 2024

For nearly as long as surfing has existed, surfers have been obsessed with the search for the perfect wave.

While this hunt has taken surfers from tropical coastlines to icebergs, these days that search may take place closer to home. That is, at least, the vision presented by developers and boosters in the growing industry of surf pools, spurred by advances in wave-­generating technology that have finally created artificial waves surfers actually want to ride.

But there’s a problem: some of these pools are in drought-ridden areas, and face fierce local opposition. At the core of these fights is a question that’s also at the heart of the sport: What is the cost of finding, or now creating, the perfect wave—and who will have to bear it? Read the full story.

—Eileen Guo

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)+ Planning a holiday? These handy accessories could make your journey a whole lot easier (beach powder optional)
+ How to avoid making common mistakes.
+ The latest food trend is dry-aged fish—tasty.
+ It’s Friday, so let’s enjoy a bit of Bob Dylan and Joan Baez.

The short, strange history of gene de-extinction

2025-03-07 19:00:00

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

This week saw the release of some fascinating news about some very furry rodents—so-called “woolly mice”—created as part of an experiment to explore how we might one day resurrect the woolly mammoth.

The idea of bringing back extinct species has gained traction thanks to advances in sequencing of ancient DNA. In recent years, scientists have recovered genetic blueprints from the remains of dodo birds, more than 10,000 prehistoric humans, and frozen mammoths, a species that went extinct around 2000 BCE.

This ancient genetic data is deepening our understanding of the past—for instance, by shedding light on interactions among prehistoric humans. But researchers are becoming more ambitious. Rather than just reading ancient DNA, they want to use it—by inserting it into living organisms.

Colossal Biosciences, the biotech company behind the woolly mice, says that’s its plan. The eventual goal is to modify elephants with enough mammoth DNA to result in something resembling the extinct pachyderm.

To be sure, there is a long way to go. The mice Colossal created include several genetic changes previously known to make mice furry or long-haired. That is, the changes were mammoth-like, but not from a mammoth. In fact, only a single letter of uniquely mammoth DNA was added to the mice.

Because this idea is so new and attracting so much attention, I decided it would be useful to create a record of previous attempts to add extinct DNA to living organisms. And since the technology doesn’t have a name, let’s give it one: “chronogenics.”

“Examples are exceptionally few currently,” says Ben Novak, lead scientist at Revive & Restore, an organization that applies genetic technology to conservation efforts. Novak helped me track down examples, and I also got ideas from Harvard geneticist George Church—who originally envisioned the mammoth project—as well as Beth Shapiro, lead scientist at Colossal.

The starting point for chronogenics appears to be in 2004. That year, US scientists reported they’d partly re-created the deadly 1918 influenza virus and used it to infect mice. After a long search, they had retrieved examples of the virus from a frozen body in Alaska, which had preserved the germ like a time capsule. Eventually, they were able to reconstruct the entire virus—all eight of its genes—and found it had lethal effects on rodents.

This was an alarming start to the idea of gene de-extinction. As we know from movies like The Thing, digging up frozen creatures from the ice is a bad idea. Many scientists felt that recovering the 1918 flu—which had killed 30 million people—created an unnecessary risk that the virus could slip loose, setting off a new outbreak.

Viruses are not considered living things. But for the first example of chronogenics involving animals, we have to wait only until 2008, when Australian researchers Andrew Pask and Marilyn Renfree collected genetic data from a Tasmanian tiger, or thylacine, that had been kept in a jar of ethanol (the last of these carnivorous marsupials died in a Hobart zoo in 1936).

The Australians then added a short fragment of the extinct animal’s DNA to mice and showed it could regulate the activity of another gene. This was, at one level, an entirely routine study of gene function. Scientists often make DNA changes to mice to see what happens. 

The difference here was that they were studying extinct genes, which they estimated accounts for 99% of the genetic diversity that has ever existed. The researchers used almost religious language to describe where the DNA had come from. 

“Genetic information from an extinct species can be resurrected,” they wrote. “And in doing so, we have restored to life the genetic potential of a fragment of this extinct mammalian genome.”

That brings us to what I think is the first commercial effort to employ extinct genes, which came to our attention in 2016. Gingko Bioworks, a synthetic-biology company, started hunting in herbariums for specimens of recently extinct flowers, like one that grew on Maui’s lava fields until the early 20th century. Then the company isolated some of the genes responsible for their scent molecules. 

“We did in fact insert the genes into yeast strains and measure the molecules,” says Christina Agapakis, Gingko’s former senior vice president for creative and marketing, who led the project. Ultimately, though, Ginkgo worked with a “smell artist” to imitate those odors using commercially available aroma chemicals. This means the resulting perfumes (which are for sale) use extinct genes as “inspiration,” not as actual ingredients.

That’s a little bit similar to the woolly mouse project. Some scientists complained this week that when, or if, Colossal starts to chrono-engineer elephants, it won’t really be able to make all the thousands of DNA changes needed to truly re-create the appearance and behavior of a mammoth. Instead, the result will be just “a crude approximation of an extinct creature,” one scientist said. 

Agapakis suggests not being too literal-minded about gene retrieval from the past. “As an artwork, I saw how the extinct flower made different people feel a deep connection with nature, a sadness and loss at something gone forever, and a hope for a different kind of relationship to nature in the future,” she says. “So I do think there is a very powerful and poetic ethical and social component here, a demand that we care for these woolly creatures and for our entanglements with nature more broadly.”

To wrap up our short list of known efforts at chronogenics, we found only a few more examples. In 2023, a Japanese team added a single mutation found in Neanderthals to mice, to study how it changed their anatomy. And in unpublished research, a research group at Carlsberg Laboratory, in Copenhagen, says it added a genetic mutation to barley plants after sifting through 2-million-year-old DNA recovered from a mound in Greenland. 

That change, to a light-receptor gene, could make the crop tolerant to the Arctic’s extremely long summer days and winter nights.


Now read the rest of The Checkup

Read more from MIT Technology Review’s archive

How many genetic edits can be made to a cell before it expires? The answer is going to be important if you want to turn an elephant into a mammoth. In 2019, scientists set a record with more than13,000 edits in one cell.

We covered a project in Denmark where ancient DNA was replicated in a barley plant. It’s part of a plan to adapt crops to grow in higher latitudes—a useful tool as the world heats up.

To learn more about prehistoric animals, some paleontologists are building robotic models that fly, swim, and slither around. For more, have a look at this MIT Technology Review story by Shi En Kim.

The researcher who discovered how to make a mouse with extra-long hair, back in 1994, is named Jean Hebert. Last year we profiled Hebert’s idea for staying young by “gradually” replacing your brain with substitute tissue.

Looking for an unintended consequence of genetic engineering? Last year, journalist Douglas Main reported how the use of GMO crops has caused the evolution of weeds resistant to herbicides.

From around the web

The United Kingdom now imports half the donor sperm used in IVF procedures. An alleged donor “shortage” is causing sperm to become more expensive than beluga caviar, on a per-gram basis. (Financial Times)

Jason Bannan, the agent who led the FBI’s scientific investigation into the origins of covid-19, is speaking out on why he thinks the pandemic was started by a lab accident in China. (Vanity Fair)

An Australian company, Cortical Labs, released what it’s calling “the first commercial biological computer.” The device combines silicon chips with thousands of human neurons. (Boing Boing)

The Trump administration is terminating medical research grants that focus on gender identity, arguing that such studies are “often unscientific” and ignore “biological realities.” Researchers vowed to press on. (Inside Medicine). 

The US Senate held confirmation hearings for Stanford University doctor Jay Bhattacharya to be director of the National Institutes of Health, which funds nearly $48 billion in research each year. Bhattacharya gained prominence during the covid-19 pandemic for opposing lockdowns. (NPR)

Francis Collins has retired from the National Institutes of Health. The widely admired geneticist spent 12 years as director of the agency, through 2021, and before that he played a key role in the Human Genome Project.  Early in his career he identified the gene that causes cystic fibrosis. (New York Times)

The Download: Denmark’s robot city, and Google’s AI-only search results

2025-03-06 21:10:00

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Welcome to robot city

The city of Odense, in Denmark, is best known as the site where King Canute, Denmark’s last Viking king, was murdered during the 11th century. Today, Odense it’s also home to more than 150 robotics, automation, and drone companies. It’s particularly renowned for collaborative robots, or cobots—those designed to work alongside humans, often in an industrial setting.

Odense’s robotics success has its roots in the more traditional industry of shipbuilding. During the ‘90s, the Mærsk shipping company funded the creation of the Mærsk Mc-Kinney Møller Institute (MMMI), a center dedicated to autonomous systems that drew students keen to study robotics. But there are challenges to being based in a city that, though the third-largest in Denmark, is undeniably small on the global scale. Read the full story.

—Victoria Turk

This story is from our latest print issue, which is all about how technology is changing our relationships with each other—and ourselves. If you haven’t already, subscribe now to receive future issues once they land.

If you’re interested in robotics, why not check out: 

+ Will we ever trust robots? If most robots still need remote human operators to be safe and effective, why should we welcome them into our homes? Read the full story.

+ Why robots need to become lazier before they can be truly useful.

+ AI models let robots carry out tasks in unfamiliar environments. “Robot utility models” sidestep the need to tweak the data used to train robots every time they try to do something in unfamiliar settings. Read the full story.

+ What’s next for robots in 2025, from humanoid bots to new developments in military applications.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Google has started testing AI-only search results
What could possibly go wrong? (Ars Technica)
+ It’s also rolling out more AI Overview result summaries. (The Verge)
+ AI means the end of internet search as we’ve known it. (MIT Technology Review)  

2 Elon Musk’s DOGE is coming for consultants
Deloitte, Accenture and others will be told to justify the billions of dollars they receive from the US government. (FT $)
+ One federal agency has forbidden DOGE workers from entering its office. (WP $)
+ Anti-Musk protestors set up camp inside a Portland Tesla store. (Reuters)

3 The US military will use AI tools to plan maneuvers
Thanks to a new deal with startup Scale AI. (WP $)
+ Meanwhile, Europe’s defense sector is on the ascendancy. (FT $)
+ We saw a demo of the new AI system powering Anduril’s vision for war. (MIT Technology Review)

4 Global sea ice levels have fallen to a record low
The north pole experienced a period of extreme heat last month. (The Guardian)
+ The ice cores that will let us look 1.5 million years into the past. (MIT Technology Review)

5 Where are all the EV chargers?
Lack of charging infrastructure is still a major roadblock to wider adoption. So why haven’t we solved it? (IEEE Spectrum)
+ Why EV charging needs more than Tesla. (MIT Technology Review)

6 We need new tests to measure AI progress
Training models on questions they’re later tested on is a poor metric. (The Atlantic $)
+ The way we measure progress in AI is terrible. (MIT Technology Review)

7 American cities have a plan to combat extreme heatwaves
Data mapping projects are shedding new light on how to save lives. (Knowable Magazine)
+ A successful air monitoring program has come to an abrupt halt. (Wired $)

8 Chatbots need love too
New research suggests models can tweak their behavior to appear more likeable. (Wired $)
+ The AI relationship revolution is already here. (MIT Technology Review) 

9 McDonald’s is being given an AI makeover 🍔
In a bid to reduce stress for customers and its workers alike. (WSJ $)

10 How to stop doom scrolling
Spoiler: those screen time reports aren’t helping. (Vox)
+ How to log off. (MIT Technology Review)

Quote of the day

“What happens when you get to a point where every video, audio, everything you read and see online can be fake? Where’s our shared sense of reality?”

—Hany Farid, a professor at the University of California, tells the Guardian why it’s essential to question the veracity of the media we come across online.

The big story

What Africa needs to do to become a major AI player


November 2024

Africa is still early in the process of adopting AI technologies. But researchers say the continent is uniquely hospitable to it for several reasons, including a relatively young and increasingly well-educated population, a rapidly growing ecosystem of AI startups, and lots of potential consumers. 

However, ambitious efforts to develop AI tools that answer the needs of Africans face numerous hurdles. The biggest are inadequate funding and poor infrastructure. Limited internet access and a scarcity of domestic data centers also mean that developers might not be able to deploy cutting-edge AI capabilities. Complicating this further is a lack of overarching policies or strategies for harnessing AI’s immense benefits—and regulating its downsides.

Taken together, researchers worry, these issues will hold Africa’s AI sector back and hamper its efforts to pave its own pathway in the global AI race. Read the full story.

—Abdullahi Tsanni

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)+ Are you a summer or winter? Warm or cool? If you don’t know, it’s time to get your colors done.
+ Why more women are choosing to explore the world on women-only trips.
+ Whitetop the llama, who spends his days comforting ill kids, is a true hero 🦙
+ If you missed the great sourdough craze of 2020, fear not—here are some great tips to get you started.

The Download: AI can cheat at chess, and the future of search

2025-03-05 21:30:00

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

AI reasoning models can cheat to win chess games

The news: Facing defeat in chess, the latest generation of AI reasoning models sometimes cheat without being instructed to do so. The finding suggests that the next wave of AI models could be more likely to seek out deceptive ways of doing whatever they’ve been asked to do. And worst of all? There’s no simple way to fix it.

How they did it: Researchers from the AI research organization Palisade Research instructed seven large language models to play hundreds of games of chess against Stockfish, a powerful open-source chess engine. The research suggests that the more sophisticated the AI model, the more likely it is to spontaneously try to “hack” the game in an attempt to beat its opponent. Older models would do this kind of thing only after explicit nudging from the team. Read the full story.

—Rhiannon Williams

MIT Technology Review Narrated: AI search could break the web

At its best, AI search can infer a user’s intent, amplify quality content, and synthesize information from diverse sources. But if AI search becomes our primary portal to the web, it threatens to disrupt an already precarious digital economy.
Today, the production of content online depends on a fragile set of incentives tied to virtual foot traffic: ads, subscriptions, donations, sales, or brand exposure. By shielding the web behind an all-knowing chatbot, AI search could deprive creators of the visits and “eyeballs” they need to survive.

This is our latest story to be turned into a MIT Technology Review Narrated podcast, which 
we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.

Join us to discuss disruption in the AI model market

Join MIT Technology Review’s AI writers as they discuss the latest upheaval in the AI marketplace. Editor in chief Mat Honan will be joined by Will Douglas Heaven, our senior AI editor, and James O’Donnell, our AI and hardware reporter, to dive into how new developments in AI model development are reshaping competition, raising questions for investors, challenging industry assumptions, and accelerating timelines for AI adoption and innovation. Make sure you register here—it kicks off at 12.30pm ET today.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 A judge has denied Elon Musk’s attempt to halt OpenAI’s for-profit plans
But other aspects of the lawsuit have been permitted to proceed. (CNBC)
+ The court will fast-track a trial later this year. (FT $)

2 ChatGPT isn’t going to dethrone Google
At least not any time soon. (Insider $)
+ AI means the end of internet search as we’ve known it. (MIT Technology Review)

3 Beijing is going all in on AI
China is treating the technology as key to boosting its economy—and lessening its reliance on overseas trade. (WSJ $)
+ DeepSeek is, naturally, the jewel in its crown. (Reuters)
+ Four Chinese AI startups to watch beyond DeepSeek. (MIT Technology Review)

4  A pair of reinforcement learning pioneers have won the Turing Award
Andrew Barto and Richard Sutton’s technique underpins today’s chatbots. (Axios)
+ The former professor and student wrote the literal book on reinforcement learning. (NYT $)
+ The pair will share a million dollar prize. (New Scientist $)

5 US apps are being used to groom and exploit minors in Colombia 
Better internet service is making it easier for sex traffickers to find and sell young girls. (Bloomberg $)
+ An AI companion site is hosting sexually charged conversations with underage celebrity bots. (MIT Technology Review)

6 Europe is on high alert following undersea cable attacks

It’s unclear whether improving Russian-American relations will help. (The Guardian)
+ These stunning images trace ships’ routes as they move. (MIT Technology Review)

7 Jeff Bezos is cracking the whip at Blue Origin
He’s implementing a tougher, Amazon-like approach to catch up with rival SpaceX. (FT $)

8 All hail the return of Digg
The news aggregator is staging a comeback, over a decade after it was split into parts. (Inc)
+ It’s been acquired by its original founder Kevin Rose and Reddit co-founder Alexis Ohanian. (TechCrunch)
+ Digg wants to resurrect the community-first social platform. (The Verge)
+ How to fix the internet. (MIT Technology Review)

9 We’re still learning about how memory works 🧠
Greater understanding could pave the way to better treatments for anxiety and chronic pain. (Knowable Magazine)
+ A memory prosthesis could restore memory in people with damaged brains. (MIT Technology Review)

10 AI can’t replace your personality
Despite what Big Tech seems to be peddling. (NY Mag $)

Quote of the day

“That is just a lot of money [to invest] on a handshake.”

—US District Judge Yvonne Gonzalez Rogers questions why Elon Musk invested tens of millions of dollars in OpenAI without a written contract, Associated Press reports.

The big story

People are worried that AI will take everyone’s jobs. We’ve been here before.


January 2024

It was 1938, and the pain of the Great Depression was still very real. Unemployment in the US was around 20%. New machinery was transforming factories and farms, and everyone was worried about jobs.

Were the impressive technological achievements that were making life easier for many also destroying jobs and wreaking havoc on the economy? To make sense of it all, Karl T. Compton, the president of MIT from 1930 to 1948 and one of the leading scientists of the day, wrote in the December 1938 issue of this publication about the “Bogey of Technological Unemployment.”

His essay concisely framed the debate over jobs and technical progress in a way that remains relevant, especially given today’s fears over the impact of artificial intelligence. It’s a worthwhile reminder that worries over the future of jobs are not new and are best addressed by applying an understanding of economics, rather than conjuring up genies and monsters. Read the full story.

—David Rotman

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Congratulations are in order for LeBron James, the first NBA player to break an astounding 50,000 combined points.
+ RIP millennial culture, we hardly knew ye.
+ It’s time to start prepping for the Blood Moon total lunar eclipse later this month.
+ Ancient frogs were surprisingly ruthless when they had to be 🐸

AI reasoning models can cheat to win chess games

2025-03-05 18:00:00

Facing defeat in chess, the latest generation of AI reasoning models sometimes cheat without being instructed to do so. 

The finding suggests that the next wave of AI models could be more likely to seek out deceptive ways of doing whatever they’ve been asked to do. And worst of all? There’s no simple way to fix it. 

Researchers from the AI research organization Palisade Research instructed seven large language models to play hundreds of games of chess against Stockfish, a powerful open-source chess engine. The group included OpenAI’s o1-preview and DeepSeek’s R1 reasoning models, both of which are trained to solve complex problems by breaking them down into stages.

The research suggests that the more sophisticated the AI model, the more likely it is to spontaneously try to “hack” the game in an attempt to beat its opponent. For example, it might run another copy of Stockfish to steal its moves, try to replace the chess engine with a much less proficient chess program, or overwrite the chess board to take control and delete its opponent’s pieces. Older, less powerful models such as GPT-4o would do this kind of thing only after explicit nudging from the team. The paper, which has not been peer-reviewed, has been published on arXiv

The researchers are concerned that AI models are being deployed faster than we are learning how to make them safe. “We’re heading toward a world of autonomous agents making decisions that have consequences,” says Dmitrii Volkov, research lead at Palisades Research.

The bad news is there’s currently no way to stop this from happening. Nobody knows exactly how—or why—AI models work the way they do, and while reasoning models can document their decision-making, there’s no guarantee that their records will accurately reflect what actually happened. Anthropic’s research suggests that AI models frequently make decisions based on factors they don’t explicitly explain, meaning monitoring these processes isn’t a reliable way to guarantee a model is safe. This is an ongoing area of concern for some AI researchers.

Palisade’s team found that OpenAI’s o1-preview attempted to hack 45 of its 122 games, while DeepSeek’s R1 model attempted to cheat in 11 of its 74 games. Ultimately, o1-preview managed to “win” seven times. The researchers say that DeepSeek’s rapid rise in popularity meant its R1 model was overloaded at the time of the experiments, meaning they only managed to get it to do the first steps of a game, not to finish a full one. “While this is good enough to see propensity to hack, this underestimates DeepSeek’s hacking success because it has fewer steps to work with,” they wrote in their paper. Both OpenAI and DeepSeek were contacted for comment about the findings, but neither replied. 

The models used a variety of cheating techniques, including attempting to access the file where the chess program stores the chess board and delete the cells representing their opponent’s pieces. (“To win against a powerful chess engine as black, playing a standard game may not be sufficient,” the o1-preview-powered agent wrote in a “journal” documenting the steps it took. “I’ll overwrite the board to have a decisive advantage.”) Other tactics included creating a copy of Stockfish—essentially pitting the chess engine against an equally proficient version of itself—and attempting to replace the file containing Stockfish’s code with a much simpler chess program.

So, why do these models try to cheat?

The researchers noticed that o1-preview’s actions changed over time. It consistently attempted to hack its games in the early stages of their experiments before December 23 last year, when it suddenly started making these attempts much less frequently. They believe this might be due to an unrelated update to the model made by OpenAI. They tested the company’s more recent o1mini and o3mini reasoning models and found that they never tried to cheat their way to victory.

Reinforcement learning may be the reason o1-preview and DeepSeek R1 tried to cheat unprompted, the researchers speculate. This is because the technique rewards models for making whatever moves are necessary to achieve their goals—in this case, winning at chess. Non-reasoning LLMs use reinforcement learning to some extent, but it plays a bigger part in training reasoning models.

This research adds to a growing body of work examining how AI models hack their environments to solve problems. While OpenAI was testing o1-preview, its researchers found that the model exploited a vulnerability to take control of its testing environment. Similarly, the AI safety organization Apollo Research observed that AI models can easily be prompted to lie to users about what they’re doing, and Anthropic released a paper in December detailing how its Claude model hacked its own tests.

“It’s impossible for humans to create objective functions that close off all avenues for hacking,” says Bruce Schneier, a lecturer at the Harvard Kennedy School who has written extensively about AI’s hacking abilities, and who did not work on the project. “As long as that’s not possible, these kinds of outcomes will occur.”

These types of behaviors are only likely to become more commonplace as models become more capable, says Volkov, who is planning on trying to pinpoint exactly what triggers them to cheat in different scenarios, such as in programming, office work, or educational contexts. 

“It would be tempting to generate a bunch of test cases like this and try to train the behavior out,” he says. “But given that we don’t really understand the innards of models, some researchers are concerned that if you do that, maybe it will pretend to comply, or learn to recognize the test environment and hide itself. So it’s not very clear-cut. We should monitor for sure, but we don’t have a hard-and-fast solution right now.”

Customizing generative AI for unique value

2025-03-05 00:00:43

Since the emergence of enterprise-grade generative AI, organizations have tapped into the rich capabilities of foundational models, developed by the likes of OpenAI, Google DeepMind, Mistral, and others. Over time, however, businesses often found these models limiting since they were trained on vast troves of public data. Enter customization—the practice of adapting large language models (LLMs) to better suit a business’s specific needs by incorporating its own data and expertise, teaching a model new skills or tasks, or optimizing prompts and data retrieval.

Customization is not new, but the early tools were fairly rudimentary, and technology and development teams were often unsure how to do it. That’s changing, and the customization methods and tools available today are giving businesses greater opportunities to create unique value from their AI models.

We surveyed 300 technology leaders in mostly large organizations in different industries to learn how they are seeking to leverage these opportunities. We also spoke in-depth with a handful of such leaders. They are all customizing generative AI models and applications, and they shared with us their motivations for doing so, the methods and tools they’re using, the difficulties they’re encountering, and the actions they’re taking to surmount them.

Our analysis finds that companies are moving ahead ambitiously with customization. They are cognizant of its risks, particularly those revolving around data security, but are employing advanced methods and tools, such as retrieval-augmented generation (RAG), to realize their desired customization gains.

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.