Learning while Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies

Even the best-trained robots struggle when they leave the lab. They face “distribution shifts”—situations they didn’t see in training, like a brand of cereal with a new box design or a human suddenly walking into their personal space. Static datasets (fixed instructions) simply can’t prepare a robot for every “what if” scenario.

To make sense of all this messy real-world data, the researchers introduced two key technical innovations to the robot’s “Vision-Language-Action” (VLA) brain.


Imagine bringing home a single robot to be your all-in-one kitchen assistant—you want it to brew your morning Gongfu tea, make fresh juice in the afternoon, and mix the perfect cocktail at night. While it might have been trained extensively in a lab, in your house, the counter is slightly higher, the fruit is shaped differently, and your cocktail shaker is transparent. Pre-trained Vision-Language-Action (VLA) models provide an incredible starting point, yet real-world deployment is never a fixed test distribution. This leaves a critical, unsolved challenge: how do we take the heterogeneous experience generated across a fleet of robots and use it to post-train a single, generalist model across a wide range of tasks simultaneously?

We present Learning While Deploying (LWD), a fleet-scale offline-to-online RL framework for continual post-training of generalist VLA policies. Instead of treating deployment as the finish line where a policy is merely evaluated, LWD turns it into a training loop through which the policy improves. A pre-trained policy is deployed across a robot fleet, and both autonomous rollouts and human interventions are aggregated into a shared replay buffer for offline and online updates. The updated policy is then redeployed, enabling continuous improvement by leveraging interaction data from the entire fleet.
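The deploy, aggregate, update, redeploy loop described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the LWD implementation: `ToyPolicy`, `SharedReplayBuffer`, `collect`, `learn_while_deploying`, the scalar "task" (ideal action = 2 × observation), the intervention rule, and the reward-weighted regression update are all hypothetical stand-ins chosen to keep the example runnable.

```python
import math
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    robot_id: int
    observation: float
    action: float
    reward: float
    is_intervention: bool  # True when a human operator supplied the action


class SharedReplayBuffer:
    """Fleet-wide buffer (hypothetical); oldest transitions are evicted first."""

    def __init__(self, capacity: int):
        self._data = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> list:
        items = list(self._data)
        return rng.sample(items, min(batch_size, len(items)))

    def __len__(self) -> int:
        return len(self._data)


class ToyPolicy:
    """Hypothetical stand-in for a VLA policy: action = gain * observation."""

    def __init__(self):
        self.gain = 0.0
        self.version = 0

    def act(self, observation: float) -> float:
        return self.gain * observation

    def update(self, batch: list, lr: float = 0.1) -> None:
        # Placeholder update (reward-weighted regression toward logged actions).
        # This is an assumption for illustration, NOT the paper's RL objective.
        grad = 0.0
        for t in batch:
            weight = math.exp(t.reward)  # rewards are <= 0, so weight is in (0, 1]
            grad += 2.0 * weight * (self.act(t.observation) - t.action) * t.observation
        self.gain -= lr * grad / len(batch)
        self.version += 1


TARGET_GAIN = 2.0  # toy task: the ideal action is 2 * observation


def collect(policy, robot_id, steps, rng, buffer):
    """Run one robot for a few steps and log everything to the shared buffer."""
    for step in range(steps):
        obs = rng.uniform(0.5, 1.5)
        action = policy.act(obs)
        ideal = TARGET_GAIN * obs
        # Toy rule: a human corrects every other step while the policy is still off.
        intervene = step % 2 == 1 and abs(action - ideal) > 0.1
        if intervene:
            action = ideal
        reward = -(action - ideal) ** 2
        buffer.add(Transition(robot_id, obs, action, reward, intervene))


def learn_while_deploying(rounds=20, fleet_size=4, steps=8, seed=0):
    rng = random.Random(seed)
    policy = ToyPolicy()
    buffer = SharedReplayBuffer(capacity=256)
    for _ in range(rounds):
        for robot_id in range(fleet_size):      # deploy across the fleet
            collect(policy, robot_id, steps, rng, buffer)
        policy.update(buffer.sample(32, rng))   # update from shared experience
        # The updated policy is the one deployed in the next round.
    return policy, buffer
```

Calling `policy, buffer = learn_while_deploying()` runs twenty deploy-and-update rounds; `policy.gain` can then be compared with `TARGET_GAIN` to see how far the toy policy has moved. The point of the sketch is the data flow: every robot writes autonomous steps and human corrections into one buffer, and a single policy is updated from that pooled experience before it is sent back out.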

A Generalist Learns Beyond Demonstrations

Some robot learning systems have explored data flywheels: deploying a policy, collecting new robot data, extracting high-quality behaviors, and training the next policy to imitate them. While this supports scalable improvement, it still treats deployment mainly as a source of expert demonstrations. Prior post-training systems mainly focus on specialist policies, leaving fleet-scale post-training of a single generalist policy across diverse tasks unresolved.

Negative effects of artificial sweeteners may pass on to next-generation, study suggests

Health organizations are starting to raise concerns about the potential long-term impacts of artificial sweeteners, which taste sweet but—unlike sugar—contain no calories, suggesting they could interfere with energy metabolism and increase the eventual risk of diabetes or cardiovascular disease.

Now a new study in mice indicates that the popular sweeteners sucralose and stevia have negative effects on the gut microbiome and gene expression, potentially compromising metabolic health, which can be transmitted between generations.

“We found it intriguing that despite the growing consumption of these additives, the prevalence of obesity and metabolic disorders such as insulin resistance has not declined,” said Dr. Francisca Concha Celume of the Universidad de Chile, lead author of the article in Frontiers in Nutrition.

The Universe might not be flat (and cosmologists are quietly freaking out)

Everything we know about the shape of the Universe could be completely wrong.

This is one of the most fascinating unsolved problems in cosmology, and it almost never gets talked about outside of research papers. It’s called the curvature tension, and it links in to the…

You have no free will at all | Stanford professor Robert Sapolsky

How your biology and environment make your decisions for you, according to Dr. Robert Sapolsky.

Robert Sapolsky, PhD is an author, researcher, and professor of biology, neurology, and neurosurgery at Stanford University. In this interview with Big Think’s Editor-in-Chief, Robert Chapman Smith, Sapolsky discusses the content of his most recent book, “Determined: The Science of Life Without Free Will.”

Being held as a child, growing up in a collectivist culture, or experiencing any sort of brain trauma – among hundreds of other things – can shape your internal biases and ultimately influence the decisions you make. This, explains Sapolsky, means that free will is not – and never has been – real. Even physiological factors like hunger can discreetly influence decision making, as discovered in a study that found judges were more likely to grant parole after they had eaten.

This insight is key for interpreting human behavior, helping not only scientists but those who aim to evolve education systems, mental health research, and even policy making.

Do We Have Free Will? with Robert Sapolsky & Neil deGrasse Tyson

Is there a quantum reason we could have free will? Neil deGrasse Tyson and comedian Chuck Nice explore the concept of free will and predetermination with neuroscientist, biologist, and author of Determined: The Science of Life Without Free Will, Robert Sapolsky.

A special thanks from our editors to Robert Sapolsky’s dog.

Could we put an end to the question of whether or not we have free will? Discover “The Hungry Judge Effect” and how little bits of biology affect our actions. We break down a physicist’s perspective of free will, The Big Bang, and chaos theory. Is it enough to just feel like we have free will? Why is it an issue to think you have free will if you don’t?

We discuss the difference between free will in big decisions versus everyday decisions. How do you turn out to be the type of person who chooses vanilla ice cream over strawberry? We explore how quantum physics and virtual particles factor into predetermination. Could quantum randomness change the actions of an atom? How can society best account for a lack of free will? Are people still responsible for their actions?

What would Chuck do if he could do anything he wanted? We also discuss the benefits of a society that acknowledges powers outside of our control, and the scientific advancements that have been made. How is meritocracy impacted by free will? Plus, can you change whether people believe in free will if they have no free will in believing so?

Groundbreaking Study on Chimp Warfare Shows Us the Nature of War

Hello and welcome! My name is Anton, and in this video we will talk about war and what it means in the animal kingdom.
Links:
https://www.science.org/doi/epdf/10.1…

0:00 War never changes — but what is war?
2:28 Gombe Chimp War in the 1970s
4:30 New study — largest war ever
5:20 Ngogo chimp project
5:55 Something happened in 2014 resulting in violence
7:20 Why did the violence start?
8:40 Implications for humans
9:45 Ant warfare
11:40 What does this tell us about our nature?
12:55 Conclusions.

DAMPE satellite reveals cosmic rays share spectral break near 15 teravolts

A century after their discovery, cosmic rays—particles of extreme energy originating from the far reaches of the universe—remain a mystery to scientists. The DAMPE (Dark Matter Particle Explorer) space telescope is tackling this phenomenon, particularly investigating the role that dark matter may play in their formation. This international mission, which includes the University of Geneva (UNIGE), has made a major breakthrough by highlighting a universal feature of these particles. The results are published in the journal Nature.

Cosmic rays are the most energetic particles observed in the universe, far surpassing the energies of particles produced by man-made accelerators on Earth. Their exact origin is still under study, and it is believed that they originate from extreme astrophysical phenomena, such as supernovae, black hole jets, or pulsars.

The DAMPE space telescope, launched in December 2015, aims to provide answers regarding the origin and nature of these cosmic rays. The astrophysics group from the Department of Nuclear and Particle Physics (DPNC) at the University of Geneva (UNIGE) is one of the main contributors to this space mission, which has now made a crucial breakthrough. By analyzing high-precision measurements collected by the telescope, scientists have identified a universal feature in the energy spectra of primary cosmic ray nuclei, ranging from protons to iron.