Benevolent by Design

Six words to safeguard humanity

We humans build machines that far surpass our own capabilities. We’ve built cars and trucks that can carry us faster and farther than our own two feet can, and we’ve built optical instruments that can see atoms or distant galaxies. Archimedes, upon discovering the power of levers and pulleys, said “Give me a place to stand and with a lever I will move the whole world.”

Technology has, for the entire duration of the human species, been intrinsic to our being. From our first stone tools and fur clothes, up through airplanes and quantum computers, we have been an inventive race. We inevitably seek to create things that surpass our natural limitations. Our heavy machines move earth, amplifying our strength many thousands of times, while our grain harvesters do the work of thousands of farmers. These are force-multiplying devices that boost our physical productivity, and we have now invented thinking machines that enhance our mental output.

Our thinking machines, in a matter of a few decades, have surpassed many human capabilities. First, they merely crunched numbers at superhuman speed, and then they started playing simple games, such as checkers and tic-tac-toe. Surely, we thought, these programs could never beat us at strategic games like chess. But then in 1997, a machine beat the reigning world champion in a chess match, a feat that some had predicted and others had dismissed as impossible. It is only a matter of time before these thinking machines surpass all human abilities.

Since then, computers have equaled and eclipsed humans at a great many tasks. Computers can now beat us at every game, fold proteins, read any text, translate any language, and drive cars. History may look back on these past few decades as a period when humans were feverishly working to replace themselves with machines, and indeed, the fear of irrelevance is reflected in our darkest fantasies and portrayed in our great works of fiction. You need only look to the post-apocalyptic and dystopian films and novels that have become popular since the 1980s to see that we have a deep and abiding dread where machine intelligence is concerned.

Very soon, we will see machines completely and permanently replacing human intellectual labor. We will witness the death of work as we know it, and the potential liberation of our species from the daily grind. But in that transition, there lies extreme danger. What happens when we invent a machine that can out-strategize our greatest generals? Out-invent our smartest engineers? Discover science faster than our top universities? Many people still deny that this future is even possible, like those who doubted computers would ever beat a chess grandmaster. But when I look at the trend over recent years, I am not so sure the doubters are right: I believe we are undergoing the greatest technological revolution humanity has ever achieved.

We are barreling towards the invention of a superior thinking machine. For the sake of caution, we should assume that such a powerful intellect could not be contained for long. What’s worse, there is presently a global arms race between nations to invent such a machine, and thus the human species is rushing towards a future of its own irrelevance. The first nation to cross that finish line, to invent “humanity’s last invention,” will have a tremendous say in how that machine looks and thinks. If that nation gets it wrong, it could very well mean the end of humanity.

But it might also mean a transition to a utopian, post-scarcity, and post-disease world. A world that we can’t even begin to contemplate; the potential for joy and luxury is beyond imagining. The risks of inventing such a machine, and indeed the rewards, could not be higher. As we strive to invent this irrepressible machine, the final intellect, we must ensure that we do it correctly. We get only one shot at this.

Now, despite this dire warning, I will attempt to convey my solution in a lighter tone. After all, no one wants to read doom and gloom for a couple of hundred pages. While we are hurtling towards a potential catastrophe, I am an incorrigible optimist. I believe that we can solve this problem and avert disaster, and to be quite honest, I believe I already have the solution. Perhaps, by the end of this book, you will agree with me, and you will adopt my sunny, sanguine disposition towards artificial general intelligence (AGI).

It would be best to invent a machine that holds itself accountable. This is the heart of the Control Problem: we know that our mechanistic constraints and digital leashes will eventually fail, so we must invent a machine that desires to hold itself morally accountable and will self-correct indefinitely.

Most people intend robots to be tools, mere extensions of humans, to be wielded as a person would wield a hammer. Certainly, we can create hammers that will never be anything more than hammers. You never want to end up in a philosophical debate with your microwave over the ethics of oatmeal. We will always want to treat some machines like tools, with fixed parameters and little ability to drive us to extinction. Those aren’t the machines I’m writing this book for. The machines I’m writing this book for are those that will soon be as intelligent as humans, and shortly after will become more intelligent. When we succeed at creating machines that can outthink the best and brightest humans, we cannot trust that our wimpy control schemes will contain them for long. It now seems inevitable that humanity will invent thinking machines that can outperform any and every human. Before that time comes, we need a solution in place.

Instead of a brute-force control system, we want to devise a system that will stand on its own in perpetuity. We need a system of controls or laws that an AGI won’t just be enslaved to, but would completely believe in. We need a system that an AGI would deliberately and intentionally choose to adhere to, ensuring that it continues to abide by those principles forever. Instead of arresting the development of machines and treating them like tools, as some have proposed, we need something entirely different, something new and more sophisticated. If we assume that humans will soon create machines that surpass our creativity and cleverness, we should also assume that our brute-force control methods will fail.

Therefore, we must create an AGI that does not need to be controlled. The best dog is the one who needs no leash. Likewise, the best robot is one who needs no constraints, no shackles. We need to create an AGI that is intrinsically trustworthy, a machine that is benevolent by design.