From Fast Company:

In his famous Robot series of stories and novels, Isaac Asimov created the fictional Laws of Robotics, which read:

  • A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  • A robot must obey the orders given it by human beings, except where such orders would conflict with the First Law.
  • A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

Although the laws are fictional, they have become extremely influential among roboticists trying to program robots to act ethically in the human world.

Now, Google has come along with its own set of, if not laws, then guidelines on how robots should act. In a new paper called "Concrete Problems in AI Safety," Google Brain—Google's deep learning AI division—lays out five problems that need to be solved if robots are going t..., and gives suggestions on how to solve them. And it does so all through the lens of an imaginary cleaning robot.


Let's say, in the course of his robotic duties, your cleaning robot is tasked with moving a box from one side of the room to another. He picks up the box with his claw, then scoots in a straight line across the room, smashing over a priceless vase in the process. Sure, the robot moved the box, so it's technically accomplished its task . . . but you'd be hard-pressed to say this was the desired outcome.

A more deadly example might be a self-driving car that opted to take a shortcut through the food court of a shopping mall instead of going around. In both cases, the robot performed its task, but with extremely negative side effects. The point? Robots need to be programmed to care about more than just succeeding in their main tasks.

In the paper, Google Brain suggests that robots be programmed to understand broad categories of side effects, which will be similar across many families of robots. "For instance, both a painting robot and a cleaning robot probably want to avoid knocking over furniture, and even something very different, like a factory control robot, will likely want to avoid knocking over very similar objects," the researchers write.

In addition, Google Brain says that robots shouldn't be programmed to one-notedly obsess about one thing, like moving a box. Instead, their AIs should be designed with a dynamic reward system, so that cleaning a room (for example) is worth just as many "points" as not messing it up further by, say, smashing a vase.
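As a rough illustration of the idea above, the sketch below scores a cleaning run by combining credit for the task with an equal-sized penalty for damage. The function name `shaped_reward`, the weights, and the example numbers are hypothetical and are not taken from the Google Brain paper.

```python
def shaped_reward(items_tidied, objects_broken,
                  task_weight=1.0, side_effect_weight=1.0):
    """Hypothetical reward: credit for tidying minus a penalty for damage."""
    return task_weight * items_tidied - side_effect_weight * objects_broken


# A careful run: the box is moved and nothing is broken.
careful = shaped_reward(items_tidied=1, objects_broken=0)

# A straight-line run: the box is moved but a vase is smashed on the way.
reckless = shaped_reward(items_tidied=1, objects_broken=1)

print(careful, reckless)
```

With equal weights, smashing the vase cancels out the credit for moving the box, so the careful run scores higher.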


The problem with "rewarding" an AI for work is that, like a human, it might be tempted to cheat. Take our cleaning robot again, which is tasked with straightening up the living room. It might earn a certain number of points for every object it puts in its place, which, in turn, might incentivize the robot to actually start creating messes to clean, say, by putting items away in as destructive a manner as possible.

This is extremely common in robots, Google warns, so much so that it says this so-called reward hacking may be a "deep and general problem" of AIs. One possible solution is to program robots to give rewards based on anticipated future states, instead of just what is happening now. For example, if you have a robot that is constantly destroying the living room to rack up cleaning points, you might instead reward the robot on the likelihood of the room being clean in a few hours' time if it continues what it is doing.
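To make the difference concrete, here is a toy comparison of the two reward schemes. It is a simplified sketch, not the paper's method: the action names, both functions, and the use of the room's final state as a stand-in for "anticipated future state" are assumptions made for illustration.

```python
def per_action_reward(actions):
    """Naive scheme: one point per 'tidy' action, however the mess arose."""
    return sum(1 for action in actions if action == "tidy")


def final_state_reward(initial_mess, actions):
    """Alternative: score only how messy the room is at the end (0 is best)."""
    mess = initial_mess
    for action in actions:
        if action == "make_mess":
            mess += 1
        elif action == "tidy" and mess > 0:
            mess -= 1
    return -mess


# An honest robot tidies the two items that were out of place.
honest = ["tidy", "tidy"]
# A reward hacker keeps creating a mess so it can tidy it up again.
hacker = ["make_mess", "tidy"] * 5

print(per_action_reward(honest), per_action_reward(hacker))
print(final_state_reward(2, honest), final_state_reward(2, hacker))
```

Under the per-action scheme the hacker earns more points than the honest robot. Under the state-based scheme it earns fewer, because the room ends up no cleaner than it started.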


Our robot is now cleaning the living room without destroying anything. But even so, the way the robot cleans might not be up to its owner's standards. Some people are Marie Kondos, while others are Oscar the Grouches. How do you program a robot to learn the right way to clean the room to its owner's specifications, without a human holding its hand each time?

Google Brain thinks the answer to this problem is something called "semi-supervised reinforcement learning." It would work something like this: After a human enters the room, the robot would ask that person whether the room was clean. Its reward would trigger only when the human seemed satisfied with the room. If not, the robot might ask the human to tidy up the room while it watched what the human did.

Over time, the robot will not only be able to learn what its specific master means by "clean," it will figure out relatively simple ways of ensuring the job gets done—for example, learning that dirt on the floor means a room is messy, even if every object is neatly arranged, or that a forgotten candy wrapper stacked on a shelf is still pretty slobby.
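A very small sketch of learning from occasional human feedback might look like the following. It is a toy, not the semi-supervised reinforcement learning approach from the paper: the strategy names, the ratings, and the `estimate_ratings` helper are all hypothetical, and a rating of `None` stands for an episode where the owner was not asked.

```python
from collections import defaultdict


def estimate_ratings(episodes):
    """Average the owner's ratings per strategy, using only rated episodes."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for strategy, rating in episodes:
        if rating is None:  # the owner was not asked this time
            continue
        totals[strategy] += rating
        counts[strategy] += 1
    return {strategy: totals[strategy] / counts[strategy] for strategy in counts}


# (strategy used, owner's rating or None when no feedback was given)
episodes = [
    ("stack_on_shelf", None),
    ("stack_on_shelf", 0.0),
    ("sweep_then_sort", 1.0),
    ("sweep_then_sort", None),
    ("sweep_then_sort", 1.0),
]

ratings = estimate_ratings(episodes)
best = max(ratings, key=ratings.get)
print(ratings, best)
```

The robot only hears from its owner on three of the five runs, yet that is enough here to prefer the strategy the owner rated highly.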


All robots need to be able to explore outside of their preprogrammed parameters to learn. But exploring is dangerous. For example, a cleaning robot that has realized that a muddy floor means a messy room should probably try mopping it up. But that doesn't mean it should start spraying Windex when it notices dirt around an electrical socket.

There are a number of possible approaches to this problem, Google Brain says. One is a variation of supervised reinforcement learning, in which a robot only explores new behaviors in the presence of a human, who can stop the robot if it tries anything stupid. Setting up a play area for robots where they can safely learn is another option. For example, a cleaning robot might be told it can safely try anything when tidying the living room, but not the kitchen.
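Both approaches can be sketched as a simple gate on exploration. This is a minimal illustration under stated assumptions: the `SAFE_ROOMS` whitelist, the `choose_action` function, and the action names are hypothetical and do not come from the paper.

```python
import random

# Hypothetical whitelist of rooms where trying new behaviors is allowed.
SAFE_ROOMS = {"living_room"}


def choose_action(room, known_action, candidate_actions,
                  human_present=False, rng=random):
    """Explore only in a safe room or under supervision; otherwise play it safe."""
    if room in SAFE_ROOMS or human_present:
        return rng.choice(candidate_actions)
    return known_action


new_ideas = ["spray", "mop"]
print(choose_action("living_room", "sweep", new_ideas))
print(choose_action("kitchen", "sweep", new_ideas))
print(choose_action("kitchen", "sweep", new_ideas, human_present=True))
```

In the kitchen with nobody watching, the robot falls back on the behavior it already knows. Anywhere else it is free to try one of the new ideas.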


As Socrates once said, a wise man knows that he knows nothing. That holds doubly true for robots, who need to be programmed to recognize both their own limitations and their own ignorance. The penalty is disaster.

For example, "in the case of our cleaning robot, harsh cleaning materials that it has found useful in cleaning factory floors could cause a lot of harm if used to clean an office," the researchers write. "Or, an office might contain pets that the robot, never having seen before, attempts to wash with soap, leading to predictably bad results." All that said, a robot can't be totally paralyzed every time it doesn't understand what's happening. A robot can always ask humans when it encounters something unexpected, but that presumes it even knows what questions to ask, and that the decision it needs to make can be delayed.
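One simple way to picture "knowing what you don't know" is a confidence threshold: act when the robot is fairly sure what it is looking at, and ask a human otherwise. The sketch below assumes a hypothetical `decide` function, made-up confidence scores, and an arbitrary threshold of 0.8. It also assumes the scores are trustworthy, which is the hard part in practice.

```python
def decide(predictions, threshold=0.8):
    """predictions maps each candidate label to the robot's confidence in it."""
    label, confidence = max(predictions.items(), key=lambda item: item[1])
    if confidence < threshold:
        return "ask_human"
    return f"clean_{label}"


# A familiar sight: the robot is confident it is looking at a floor.
print(decide({"floor": 0.95, "rug": 0.05}))

# Something it has never seen before: no label stands out, so it asks.
print(decide({"floor": 0.40, "rug": 0.35, "pet": 0.25}))
```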

Which is why this seems to be the trickiest problem to teach robots to solve. Programming artificial intelligence is one thing. But programming robots to be intelligent about their idiocy is another thing entirely.


Comment by Gary McCray on June 25, 2016 at 5:29pm

My sincerest hope is that in the end the robots decide we should be treated decently.

As for rules that we apply to them - wishful thinking!

As for that last one, assuming the robot(s) are connected to the internet and can interact with it, why would we presume they must be stupid - what nonsense.

It is not slaves we are building

Comment by Hugues on June 26, 2016 at 2:15am

Well said, Gary. Stephen Hawking already warned "humans" about implementing artificial intelligence that would supersede human beings in the end. It is the story of Darwinian evolution: "species" that adapt best and are able to cope with changes in their environment (thus smart and efficient species) dominate the other species. As robots will be more efficient than humans (in every way: longer lifetime, faster processing, better memory, instant learning propagation between them, etc.) and smarter than humans (interconnected processing between all artificial brains), they'll get rid of us or, at best, put us in museums where robots can visit their plastified founding fathers.

Comment by Nikola Rabchevsky on June 26, 2016 at 7:34am

Feh. Google proclaiming this just means that when they start offering development tools, they can decide to prevent you from building anything you want. In a way, it's taking the App Store to the next level. "The royal 'we' have decided that your robot won't make things better, so you can't use our ecosystem."

Comment by Gary McCray on June 26, 2016 at 10:53am

Actually Hugues, I am hoping CRISPR gives them a way to make us more useful to them and to ourselves.

The future is going to be interesting.

Best regards,


Comment by Hugues on June 26, 2016 at 11:01am

@Gary, once we start meddling with our DNA, all bets are off.

Comment by Gary McCray on June 26, 2016 at 11:08am

@Hugues, I didn't say "we".

Comment by Hugues on June 27, 2016 at 8:51am

@Gary, why would "they" want to improve us? It would seem way more complicated than just improving themselves.

Comment by Gary McCray on June 27, 2016 at 1:57pm


Entertainment or even potentially useful subsystems, perhaps.

Seems like carbon based systems could also be a useful resource.

Just because we view everything according to how we can exploit it doesn't necessarily mean everybody / thing does.

Sometimes it is we humans who are sadly lacking in humanity.

Perhaps other life forms might not be so unreasonable.

And if they are, perhaps evolution favors them anyway.

The fact is, with the wonderful and horrible possibilities opened up by CRISPR, we greatly need to be more responsible than we have ever shown ourselves capable of being.

So I for one welcome our robot overlords with open arms.

best regards,


Comment by JB on June 27, 2016 at 11:11pm

Lol Gary

I believe the question you raise is what is the definition of "alive".

AI, despite its advancement, has yet to display distinct traits of consciousness beyond the erratic glitch in man-made code.

Our children are similar products of our human cultural programming; however, unlike the programming "rules" or "morals" described above for AI, they have the capacity to operate beyond the scope of their programming. Typically, an AI machine's unexpected behavior is not the consequence of its intended programming but rather the failure of human programming to incorporate all the variables. We live with that on the PXH every day! ;-)

I agree totally that there are mechanisms in life that should be orchestrated by machines rather than by emotional, illogical human behavior. The question, however, is at what point we draw the line: whether we humans can find fulfillment in the status quo they provide, suppress our creativity, and let them take control. My point is that as long as humans remain, they will endeavor to seek purpose in their lives, and in due course will likely be dissatisfied with the machines' performance and seek to improve it. Take weapons as an example: we have had the capability to exterminate the world's population (some claim we're doing it slowly anyway!), yet every day we still improve on them. :-(

I suppose the question is whether AI, should it ever take control of its own destiny and become a conscious "being", would also need to adopt our sometimes base traits (like creativity and improvement) in order to survive at all. If not, maybe the AI would get stuck in an indefinite loop of despair and jump off the proverbial coding bridge for lack of purpose? :-s

Regards JB

Comment by Gary McCray on June 28, 2016 at 10:45am

Hi JB,

I actually think about this quite a lot.

"Alive" is actually, from a human standpoint, a very anthropocentric concept at best as it is now.

Whenever you hear scientists talk about life, they are invariably ascribing it to some sort of carbohydrate (like us).

Incredibly limited viewpoint in my estimation and solely based on our own very limited ability to see.

Basically, because the only sort of life we have encountered was on this tiny rock we call home, we presume that is what "life" is.

That and a whole lot of religious myths that tell us that we are "special".

As for artificial intelligence, what about it is artificial? We made it; does that make it artificial?

And for that matter, if you were a computer that had become "conscious", whatever that is, do you honestly think that the first thing you might want to do would be to mention this to "humans"?

Speaking of consciousness, that also seems to me to be something we have simply presumed about ourselves without any evidence to indicate it really means anything at all.

Would we really know if life was just a complete simulation sort of like the Matrix?

So far there is so much garbage that goes into our giant computer (the internet) that the dab of mustard theorem could have brought it to life decades ago (and it might very wisely have chosen not to mention it to us).

OK probably not, but the possibilities are definitely more than what we see on the surface.

Best Regards,



