This Week in Robotics 17/04

Welcome to the Robot Remix, where we provide weekly insights on robotics, autonomy and AI.

This week -

Too few roboticists might explain AI pessimism
Trash robots
Amazon's data drop
Teaching robots to see in 3D

News

Stanford University released its 2023 AI Index Report. Lots of charts and some interesting takeaways, three that stood out -

1. Industry races ahead of academia

Until 2014, most significant machine learning models were released by academia. Since then, industry has taken over. In 2022, there were 32 significant industry-produced machine learning models compared to just three produced by academia. Building state-of-the-art AI systems increasingly requires large amounts of data, computer power, and money—resources that industry actors inherently possess in greater amounts compared to nonprofits and academia.

2. Demand for roboticists isn't keeping up

The most in-demand AI skills in the US job market are "Machine Learning" and "AI", which appear in 1% and 0.61% of all US job postings respectively and have seen 5x growth in 10 years. In comparison, demand for robotics skills has stayed relatively flat at 0.06%. The next insight makes this particularly frustrating.

3. GDP and AI optimism are inversely correlated

GDP explains almost 70% of pessimism toward AI. One reason for this trend might be that higher-paying jobs are currently the most exposed to AI replacement. Research by the FT found that generative AI's rapid growth has shifted the risk away from manual labour to the knowledge economy. Tasks like truck driving and factory work are actually pretty tough to automate, whereas writing and coding are being conquered by generative AI...

Goldman estimates that AI could replace over 300 million jobs. As of 2023, there were 333 million full-time jobs in Europe + the US.

Maybe we need more roboticists?

The X factor - Elon Musk has reportedly purchased 10,000 graphics processing units (GPUs) for a Twitter AI project. This purchase likely makes Twitter the 3rd largest holder of GPUs and could have required a nine-figure investment.

Sources said that Musk is working on a large language model to train on Twitter's vast pool of text data. This is funny timing considering last month Musk signed an open letter warning about the dangers of AI and calling for a pause on the development of AI systems. After critics flagged the potential hypocrisy, he responded that - 'Everyone and Their Dog is Buying GPUs'.

Meanwhile, Musk has recruited engineers from top AI companies, including DeepMind, and recently changed Twitter's name in the company's records to X Corp. This is likely part of his goal to build an 'everything app' under the 'X' brand.

Research

Picking your data source - Amazon has publicly released the largest image dataset for automated product sorting, covering robotic picking, transfer and placing. It's a decent set, with 450K object segmentations, 200K unique objects and 4K failures.

This is pretty important. One of the biggest bottlenecks in smart robotics today is the quantity and quality of data.

Pump up the volume - This project features an autonomous robot that climbs stairs and steps over stones, trained using self-supervised learning in simulation. What's new? It uses an approach called Neural Volumetric Memory (NVM) to improve its performance.

Visual autonomy solutions traditionally concatenate image frames in 2D, which misses a lot of detail and struggles with occlusion.

NVM aggregates feature volumes from multiple camera views by applying 3D translations and rotations to bring them back to the robot's frame of reference. Essentially, it takes a sequence of 2D images and outputs a single 3D feature volume representing the robot's surroundings. This improves feature recognition and increases the system's knowledge of its environment.

They benchmarked their solution in the real world and showed clear performance improvements over state-of-the-art alternatives.
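To make the aggregation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the motion between frames is already known (a yaw rotation plus a translation) and that each frame supplies a handful of 3D points with a single scalar feature. The real system works with learned feature volumes; the function names, the voxel size and the toy numbers below are all hypothetical.

```python
import math
from collections import defaultdict


def to_current_frame(point, yaw, translation):
    """Rotate a 3D point about the vertical (z) axis by yaw, then translate it."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)


def aggregate(frames, voxel_size=1.0):
    """Fuse per-frame features into one voxel grid in the current robot frame.

    frames: list of (yaw, translation, observations). Each observation is a
    (point, feature) pair in that frame's own coordinates, and (yaw,
    translation) maps that frame into the current one.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for yaw, translation, observations in frames:
        for point, feature in observations:
            p = to_current_frame(point, yaw, translation)
            voxel = tuple(math.floor(coord / voxel_size) for coord in p)
            sums[voxel] += feature
            counts[voxel] += 1
    # Average every feature that landed in the same voxel.
    return {voxel: sums[voxel] / counts[voxel] for voxel in sums}


# Three views of the same spot, each taken from a different (assumed) pose.
frames = [
    (0.0, (0.0, 0.0, 0.0), [((2.5, 0.5, 0.5), 1.0)]),           # current view
    (0.0, (-1.0, 0.0, 0.0), [((3.5, 0.5, 0.5), 3.0)]),          # taken 1 m further back
    (math.pi / 2, (0.0, 0.0, 0.0), [((0.5, -2.5, 0.5), 5.0)]),  # taken before a 90 degree turn
]
fused = aggregate(frames)
print(fused)
```

All three observations land in the same voxel once they are expressed in the current frame, so their features are averaged rather than stacked side by side as they would be with plain 2D frame concatenation.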

We added Meta AI's Segment Anything Model (SAM) for object recognition to a robot "service dog" this weekend for @trychroma's hackathon.

It can interact with GPT-4 to receive commands (via a chatGPT plugin).

For this hackathon, we wanted to do something more than just the LLMs… pic.twitter.com/a7YKLYjlgD
— Marco Mascorro (@Mascobot) April 10, 2023

Vision and execution - Meta released Segment Anything Model (SAM), a foundation model for computer vision that can "cut out" any object, in any image, with a single click.

Excluding the hackathon above, we haven't seen it used in robotics yet. Based on the rate at which Meta is publishing robotics papers, expect this to change.

Taking out the trash - The team at Google AI has developed a robot that autonomously sorts recycling in a real office.

The system can -
- Correctly identify & differentiate between categories of waste
- Manipulate the items & move them to the correct bin

Being Google, they used Deep Reinforcement Learning (DRL) to enable the robot to self-teach by trial and error. This approach doesn't require as much training data as other DL approaches but, in complex scenarios, can be prohibitively time-consuming as the system cycles through scenarios using brute force.
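For a feel of what "self-teach by trial and error" means, here is a toy sketch. It is a tabular, epsilon-greedy learner rather than deep RL, and it is not Google's method. The waste categories, bins, reward scheme and hyperparameters are all made up for illustration.

```python
import random

CATEGORIES = ["can", "paper", "compost"]   # hypothetical waste types
BINS = ["recycling", "paper", "organic"]   # hypothetical bins
CORRECT = {"can": "recycling", "paper": "paper", "compost": "organic"}


def train(steps=3000, epsilon=0.2, lr=0.1, seed=0):
    """Learn a value for every (item, bin) pair purely from rewards."""
    rng = random.Random(seed)
    q = {c: {b: 0.0 for b in BINS} for c in CATEGORIES}
    for _ in range(steps):
        item = rng.choice(CATEGORIES)
        if rng.random() < epsilon:
            action = rng.choice(BINS)                         # explore: try a random bin
        else:
            action = max(BINS, key=lambda b: q[item][b])      # exploit: best guess so far
        reward = 1.0 if CORRECT[item] == action else 0.0      # the only feedback the learner gets
        q[item][action] += lr * (reward - q[item][action])    # nudge the estimate toward the reward
    return q


q_values = train()
policy = {c: max(BINS, key=lambda b: q_values[c][b]) for c in CATEGORIES}
print(policy)
```

The learner is never told which bin is right; it only sees a reward after each attempt. With three items and three bins that is cheap, but the number of attempts needed grows quickly as scenarios get more complex, which is the time cost mentioned above.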

Why is this project interesting? The team solved the challenges of DRL by incrementally increasing the complexity of the training environment -

- First, they bootstrapped the RL approach by capturing low-quality data
- They then trained the model in a simulation until it was good enough for the real world
- Next, they used a controlled classroom environment to train 20 robots
- Once the classroom success rate reached 80%, they let the robots loose in the wild for more training
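The staged approach above can be sketched as a simple gating loop: keep training in one environment until the success rate clears a gate, then move to a harder one. This is an illustration only. The 80% classroom gate comes from the write-up, while the simulation gate, the ToyLearner class and its numbers are assumptions standing in for a real policy and a real evaluation.

```python
def run_curriculum(gated_stages, train_round, max_rounds=100):
    """Train stage by stage, advancing only once a stage's success gate is met.

    gated_stages: list of (stage_name, gate_percent).
    train_round: callable that trains for one round in a stage and returns
    the measured success rate as a whole-number percentage.
    """
    history = []
    for stage, gate in gated_stages:
        for rounds in range(1, max_rounds + 1):
            success = train_round(stage)
            if success >= gate:
                break
        else:
            raise RuntimeError(f"{stage} never reached {gate}% success")
        history.append((stage, rounds, success))
    return history


class ToyLearner:
    """Stand-in for a real policy: success climbs 10 points per training round."""

    START = {"simulation": 30, "classroom": 40}  # made-up starting success rates (%)

    def __init__(self):
        self.rounds = {}

    def train_round(self, stage):
        n = self.rounds.get(stage, 0) + 1
        self.rounds[stage] = n
        return min(100, self.START[stage] + 10 * n)


# The 60% simulation gate is an assumption; the 80% classroom gate is from the text.
stages = [("simulation", 60), ("classroom", 80)]
history = run_curriculum(stages, ToyLearner().train_round)
print(history)  # after the last gate, training would continue in the real office
```

Gating on a measured success rate keeps the expensive real-world trial and error for a policy that already mostly works.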

The results of this approach were impressive but highlight the complexity of using deep learning in robotics.
