Sunday, February 12, 2012

Let's Light This Fuse!

Federico Pistono has written an excellent article outlining a growing threat to our society, which I feel the need to respond to.

Robots will steal your job, but that’s OK: how to survive the economic collapse and be happy

Federico Pistono


Great article!

Automation will soon be forcing man to retire and will create massive social restructuring. We do need to deal with this now, because the problems of the transition are already upon us. Sadly, people are naturally conservative and will reject the notion that man's self sufficiency is about to perish. It could be a very rough transition as we sail further into these uncharted waters. Let's not allow that to happen.

Federico Pistono asks the interesting question of whether our need for constant growth is a mistake in itself, and leads to a poorer quality of life. I don't believe it's possible to escape our quest for constant growth. Humans are hard wired to act so as to improve their lives. The very fact that Federico asks the question, shows how he is hard wired to look for ways to improve his life. The question itself becomes at least partially self contradicting when we realize he's asking if we could improve our lives, by not trying so hard to improve our lives.

Humans will always, in the long term, gravitate to the social systems and behaviors, that maximize the quality of their lives. In the past, working hard for a living has been the proven path to a quality life. It's a time honored tradition that goes back at least 40,000 years to the beginnings of human civilization. That path is ending, and a new path must be found.

First, more articles like this one by Federico Pistono must be written, and shared, to raise social awareness of what is ahead of us. The seeds of change must be planted into our social consciousnesses, so as things get worse, people will have an understanding of the cause, even if they rejected the idea, when first presented to them.

What is happening already as we slide nearer to this great future, is growing unemployment, and growing wealth inequity. Both wealth, and social power, are concentrating at the top. This trend is attacking the very foundation of our society. It's tearing down, brick by brick, our government for the people and by the people. Our governments are systematically transforming into "for the dollar, and by the dollar". We must stop, and reverse, this trend now.

This is not happening because the working class poor are too lazy or stupid to work for a living. It's because advancing technology is changing our social reality. The very fabric of our human existence is transforming. We can not stop these advancements because it's core human nature to pursue them. They will happen, and we must deal with the consequences.

The solution, is actually obvious. Humans must retire from the work force, and turn over all the production, to our machines. Our wealth, and the better life we WILL live, will be created for us, by our machines. Our full time jobs of tomorrow, will be to tell the machines what we want. Those jobs will keep us very busy, and very fulfilled, and very happy.

Some people suggest that in this new future money will become obsolete. I reject that. What will become obsolete, is having humans spend their lives chasing the dollar. Money is the control signal of our production machine. It's what allows a million individual companies, and billions of workers, to auto-configure themselves into a highly efficient human happiness generator. We will need Adam Smith's invisible hand as much in the future, as we need it today and that means money needs to stay.

But since we will be retired, it will not be us chasing the dollar. It will be our machines. They are the ones that will have their economic behavior regulated by the invisible hand. We will be the controllers of that hand, by being full time consumers. We will feed our quarters into the machines, and they will respond to our every demand, with an amazing flow of products and services to improve our lives. The entire production machine, will be controlled by whoever drops their quarters into the slot - just as it is today. The only difference, is that there will be no humans inside the production machine of tomorrow.

These systems will, in the future, create all the wealth of our society. They are already doing much of it. Google and Facebook are economic powerhouses not because there are humans busy at work looking up web pages for us, or delivering by hand, our jokes to our friends. They are economic powerhouses because they have built great machines, working in dark warehouse data centers, where humans rarely tread. It is their machines, that have turned these companies into economic giants, not the people.

Whoever owns these machines, will own the future. Everyone else, will starve. They will have no quarters to drop into the slot. What little they do have, their house, their land, their dignity, will be stripped away from them with no path left into this great future.

To stop this growing wealth inequity, we must, as a people, share the wealth. This problem first showed up in large scale at the beginning of the industrial revolution. It was solved, by giving workers the right to form unions that leveraged the combined strength of the workers, against the owners, to force the owners to share the wealth, in the form of higher wages. The owners needed the workers, so they were forced to share.

Advancing technology, is breaking this down. The workers are losing their fight, because they are not competing against just the owners anymore, they are competing against the machines now. The workers can not win this battle. As the machines advance, the workers' wages drop. We need another solution now.

The owners of the production system, must be forced by the people, to start sharing more of the wealth. It must happen, through the government, before the government by the people, is lost. Once that power is lost, the people will have no option except revolt and war. If we wait too long to act, we will force our society into that civil unrest.

Our governments are already highly socialistic, in the services, and laws, and tax structures they have put into force. The wealthiest of our nations, are already carrying the lion's share of our social costs. But this trend, of wealth sharing in the form of government services, is crushing us. It's making our governments, massive. Governments are not efficient. They are nearly immune from Adam Smith's regulatory hand. The more we channel funds through our government services, the more hard won wealth, we throw away.

We must do more wealth sharing. The unions are losing their power to the machines. Government services, are no longer an effective solution to the problem. Tax structures are no longer an effective solution to the problem. The jobless, and the working poor, can not benefit from tax relief.

We need our governments to become our Robin Hood. They need to take from the machines, and share not just with the poor, but with everyone. We need the purple wage. We need the "Citizen's Benefit" as I recently saw Neil Newman describe it on Facebook.

We need to tax the machines that are generating all the wealth in today's society, and start distributing the wealth to everyone, just because they are a member of our HUMAN society.

It should start out very small, but grow over time, as we get closer to the day when we are all forced into retirement by the machines. Tax and spend by the government, needs to change into tax and distribute. We don't want our government spending for us, let the people spend!

Such direct redistribution of wealth will fit nicely into our society today. We already do it, for those with special needs, all over society, from welfare, to subsidy and grants, to scholarships, to government backed mortgages. Wealth redistribution is already a foundation of our society. The strong always help take care of the weak.

In our new world, the machines are the strong, and the working man is the weak. People need to understand this. It's not rich and productive against the poor and lazy, it's the humans against our own machines.

Though humans are still a big and fundamentally important part of the production machine, our days are numbered. Many humans are already suffering, and they need our help, just because they are humans.

We could continue to try and identify the needy, and assist only them, with more government services such as unemployment insurance, welfare for the poor, minimum wage laws, and more free worker re-education. But it's inefficient to continue this, and it's often highly demoralizing to the very people we are trying to help. To accept such help, is to instantly cast yourself into the lowest and most worthless class of society. No one likes to ask for a handout, and no one likes to feel weak and helpless. It's immoral for us to make people feel that way, in a world that is so rich and powerful.

We need to replace all the conditional help, provided by complex, inefficient, wasteful, bureaucratic red tape, with unconditional support for all citizens, just because we are all members of this society, for the people, and by the people.

This redistribution of wealth from the production system, to the people, won't be a hand out. It is a basic human right. It will be the same basic human right, we granted workers when we allowed them to form unions, to force the sharing of wealth from Capital to Labor through the threat of a strike. It is the same human moral right, we exercise, every time we help the needy, when we are strong.

But unlike the past, it is not the fruits of our labor we are sharing with the weak. It's the fruits of the labors of our machines we must now share with everyone. We need to share the machines, not hoard them.

When we share a slice of the wealth with everyone, it will slow, and then stop, the growing wealth inequity trend, and put the power back into the hands of the people where it belongs. The more simple and straight forward wealth redistribution system we select, the less we will need all these other government services for the weak. We will be able to reduce, and then eliminate minimum wage laws which will free industry, to produce large numbers of low wage jobs, to allow more people to feel good about their ability to participate in society. No longer will they feel they are the bottom of the barrel of society. Poverty can be eliminated. Crime will be reduced due to fewer people having to turn to crime to feel they are getting a fair share of the wealth of the society they live in. Money and resources wasted on our growing prison populations, and law enforcement, and lawyers, will start to return to more productive uses in society.

Everyone will have some money to spend, spurring the growth of retail in low income areas, where once they dare not grow.

With everyone in the nation feeling they have guaranteed security in the form of food to eat, and a place to live, and minimum health care, all without begging for handouts, eating at a soup kitchen, filling out any government forms, or fighting red tape, there will be a ground swell of growing optimism about the future.

The economy will explode.

All we need to do, is light that fuse, by instituting a small, direct, guaranteed for life, redistribution of a slice of the wealth, from the production system, to the people.

Everyone will be able to get back to work, doing what each of them feels is best, to build a better future for themselves. We will once again, be working together, each following our own paths, building that better future, for everyone.

If we can get this started, it will snowball into an economic explosion unlike anything seen in the past. It will accelerate the coming of this new age, where everyone will be retired, and the machines will be doing all the labor for us. It will accelerate the coming of the time, when we all become full time consumers.

When people understand where we are going, they won't fear it, they will fight to get there as quickly as possible. Automation won't be the downfall of humanity, it will be the beginning of the true golden age our forefathers worked their lives to create for us.

Anyone that invents a better automation system, will be seen not as the goat, but as the modern day hero! Yay, a million more humans put out of work! We are all richer now! Put a million taxi drivers out of work with automated cars, and the company that created the auto-taxi gets rich, and taxed, and people get more money to spend!

There's a golden age waiting for us all, and here we sit, with our global economies stalling, with so many people in trouble, so many people not able to work, so many people part of the working poor with little hope for the future, so many people feeling helpless, when the power to light this fuse and see the world explode with optimism and hope and energy is right here at our finger tips.

Let's not wait until we have to go to war to make it happen.

Let's light this fuse now!


Monday, June 1, 2009

A Response to Ben Goertzel's blog post on Reinforcement Learning

This is a response to Ben Goertzel's blog post:
Reinforcement Learning: Some Limitations of the Paradigm


I wanted to respond to Ben's blog entry, but I'm so long winded it turned out to be 4 times longer than the maximum reply could be, so I've started my own blog to post a reply!

So much to comment on.

I'm a reinforcement learning advocate and spend endless hours arguing that intelligent human behavior is the product of reinforcement learning. We simply ARE reward seeking machines and not goal seeking machines. Future reward maximizing is the most general way to express (and implement in hardware) the concept of a goal and all human goals that I've ever seen can be translated into, and explained as, the product of reward maximizing in the form of reinforcement learning.

On Ben's opening thought experiment of how some people would not push the ultimate orgasm button, I would say that's a failure to understand how reinforcement learning actually works. Reinforcement learning is more complex than most people grasp. I'll explain...

Reinforcement learning is implemented at a very low level in the hardware as a very stupid statistical process. It's not a high level rational thought process. The machine works by attempting to estimate future rewards, but it's not perfect. Even a machine like the human brain is not all that good at predicting future rewards. Think about your own emotions to understand how good this low level statistical process is. What sort of situation might cause fear in you? What sort of situation might cause joy and happiness? The brain is able to recognize a situation, such as a big snake in the grass, or a man holding a gun and pointing it at you, and translate that into a prediction of low future rewards. That's what that fear is - it's your brain making a low level hardware prediction of the odds of you receiving a large near term negative reward.

That's all the smarter the low level reward hardware is. It's just an advanced pattern recognition system that can estimate future rewards based on the current state of the environment.
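To make that concrete, here is a rough sketch of such a predictor in Python. It is only my own toy illustration, not a model of real brain hardware: the class name, the situation labels, the reward numbers and the learning rate are all made up for the example. All it does is keep a running (exponentially weighted) average of the reward that followed each situation in the past, and report that average as its prediction.

```python
from collections import defaultdict


class RewardPredictor:
    """Toy predictor: a running average of the reward seen after each situation."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        # situation -> predicted reward; unseen situations default to 0.0
        self.estimate = defaultdict(float)

    def predict(self, situation):
        """Return the current reward estimate for a recognized situation."""
        return self.estimate[situation]

    def observe(self, situation, reward):
        """Move the estimate a fraction of the way toward the reward just received."""
        error = reward - self.estimate[situation]
        self.estimate[situation] += self.learning_rate * error


predictor = RewardPredictor()
for _ in range(10):
    # Hypothetical past experiences and reward values.
    predictor.observe("snake in the grass", -10.0)
    predictor.observe("dinner with friends", 5.0)

print(predictor.predict("snake in the grass"))   # strongly negative: "fear"
print(predictor.predict("dinner with friends"))  # positive: "joy"
print(predictor.predict("never seen before"))    # no experience, no prediction
```

Note that the predictor knows nothing about a situation it has never been through. That limitation is the whole point of what follows.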

That reward predicting hardware however doesn't directly cause us to make decisions. When we are sitting there looking at Ben's Button, it's not the low level statistical hardware in our brain that calculates the potential future "win" of hitting the button. That's not how it works.

What the low level reward predicting hardware does, is SHAPE OUR BEHAVIOR. Just like when we train a dog to roll over in response to a command from his master. Each time we reward him, we have reinforced that behavior in him - that is, the behavior of responding to the hand wave, by rolling over. The response (aka the behavior) gets a little stronger with each reward.

It's the dog's past statistical history of how many times that roll-over behavior has resulted in a reward that is the cause of the dog rolling over.

Now, with a well trained dog, we can give it a choice. We can put a big pile of dog treats, on one side of him, and we can tell him to stay. We can then wave our hand as a signal for him to roll over. What will he do? Roll over, or go for the big pile of dog treats? He will roll over. He will not seek the instant pleasure of eating 100 dog treats even though the total reward of the food would be far greater than rolling over.

This happens because even though the dog is a reinforcement learning machine, it is not a rational pleasure seeker. His actions are not a rational calculation of potential future rewards. It's a function of the rewards he got IN THE PAST. The behavior the dog produces at any one moment, is a function of how he was trained, by rewards he got in the past.

In this example, the dog had a pile of treats to respond to. He's never seen such a pile of treats before, so his low level behavior producing hardware, has no direct prior experience jumping for the treats, while at the same time being told to stay, by his master. This is a new situation for him. He has, however, had plenty of experience with what happens when he doesn't obey his master. And that past experience has trained his low level behavior hardware, to pick the option of rolling over.
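The dog example can be sketched in a few lines of Python. Again, this is a toy under stated assumptions, not how a real dog (or any particular learning algorithm) works: the action names, the reward values, the learning rate, and especially the `generalizes_to` table, which stands in for the dog treating the new option as just another case of disobeying, are all hypothetical. What matters is that the choice is made from the learned response strengths, and never looks at the reward actually sitting on the floor.

```python
def reinforce(strengths, action, reward, learning_rate=0.2):
    """Nudge the learned strength of an action toward the reward it just produced."""
    old = strengths.get(action, 0.0)
    strengths[action] = old + learning_rate * (reward - old)


# Assumption: a never-before-seen option is treated like the closest past experience.
generalizes_to = {"grab_treat_pile": "disobey"}


def learned_strength(strengths, action):
    """Look up the strength of an action, falling back to what it generalizes to."""
    key = generalizes_to.get(action, action)
    return strengths.get(key, 0.0)


def choose(strengths, available):
    """Pick the available action with the strongest learned response."""
    return max(available, key=lambda a: learned_strength(strengths, a))


strengths = {}
for _ in range(50):
    reinforce(strengths, "roll_over", reward=1.0)   # a treat each time he obeys
for _ in range(5):
    reinforce(strengths, "disobey", reward=-1.0)    # scolded each time he doesn't

# Hypothetical payoffs actually on offer right now. The dog never consults these.
actual_reward = {"roll_over": 1.0, "grab_treat_pile": 100.0}

choice = choose(strengths, ["roll_over", "grab_treat_pile"])
print(choice)  # roll_over
```

The pile of treats is worth far more, but it has never been reinforced, so it carries no weight in the decision.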

So let's return to Ben's Button. When a human is faced with the choice, he will do the same thing the dog did. The human has NEVER BEFORE been given this experience. As such, the low level statistical hardware that shapes our very complex behaviors through reinforcement, has never in the past had the opportunity to shape the "button pushing" behavior in the human. So the human will not push the button because he has been reinforced to do so. He will push it, or not push it, in response to his PAST training experiences.

So what controls if we push a button in front of us that some guy named Ben says will give us an ultimate orgasm? Well, we may have thoughts such as, maybe this is one of those drugs that will kill us! Or maybe this is a joke, and people will laugh at me if I push it. The guy on the street will push it, or not push it, because of what you say to him, and because of the environment he is in, all based on a life time of past training experiences - none of which actually has anything to do with the ultimate orgasm which he has never in his life experienced!

But Ben didn't ask people on the street, he asked us, or others, to answer a thought experiment question. So what goes through our minds when we are asked to do that? What past behavior conditioning, would lead us to answer that one way or another?

Well, women are conditioned by society to be caring towards others. They are basically punished by their peers if they show signs of being selfish towards others. Pushing such a button that gives them selfish pleasure, and causes ultimate harm to the rest of the human population is exactly what most women get trained by society NOT to do. So is it so surprising, that when Ben asks his daughter what she would do, that we get the answer "no" instantly? Not surprising at all. It's exactly how she was conditioned by society to respond - just like the dog rolled over instead of going for the food because that's how he was conditioned to respond.

Society on the other hand, conditions the typical male to be a reward seeker. Males are expected to "grab the reward" whenever possible. To not do so, would be a sign of weakness, which our society conditions us to avoid. So, gee, the two males answered "yes". Again, not a big surprise.

The point however, is that how we act NOW, is never a function of what reward is actually in front of us, nor of what reward our rational behavior is predicting is in front of us. It's a function of how we have been conditioned to respond by the rewards that happened in our past. And when someone asks us a question, we respond based on past training, not on what the guy "said" would happen to us. We respond based on the best estimation the low level, statistical hardware in our brain can make about expected future rewards in the current situation, based on how similar it is to a life time of past such situations.

Now, let's look at this from a different perspective. What would happen if you gave someone the ultimate orgasm button that didn't harm anyone else, but simply gave the person the instant orgasm? And unlike a real orgasm, you could keep hitting it with no loss of effect. What do you think would happen? The behavior shaping effect would be quick and permanent. The person would, (I'm guessing) within seconds, not be able to stop hitting the button. He wouldn't care about protecting himself. He wouldn't care about what others were doing. He wouldn't care about staying alive as long as possible, because that has nothing to do with how reinforcement learning systems work. He would push the button until he died and would be happy as hell the whole time. He would be of absolutely no danger to anyone, unless you took the button away from him - then you better watch out, because if killing the rest of the human population was the path to getting the button back, he would do that in an instant.
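Here is a small Python sketch of that takeover. It is my own toy setup, with assumptions worth stating loudly: the actions, the reward sizes, the learning rate, and the step at which the first press happens are all invented, and the agent simply repeats whichever action has the strongest learned response. Before the first press the button has no pull at all. After a single press, its learned strength dwarfs everything else and the agent never does anything else again.

```python
def run_agent(steps, first_press_step=5, button_reward=1000.0, learning_rate=0.5):
    """Simulate a greedy reward-shaped agent that presses the button once, then is hooked."""
    # Learned response strengths from a (hypothetical) lifetime of ordinary rewards.
    strengths = {"eat": 1.0, "sleep": 1.0, "work": 0.5, "press_button": 0.0}
    # Reward each action actually delivers; the button never loses its effect.
    rewards = {"eat": 1.0, "sleep": 1.0, "work": 0.5, "press_button": button_reward}

    history = []
    for step in range(steps):
        if step == first_press_step:
            # The first press happens for some unrelated, past-conditioned reason.
            action = "press_button"
        else:
            # Otherwise the strongest learned response wins.
            action = max(strengths, key=strengths.get)
        # Reinforcement: move the action's strength toward the reward it produced.
        strengths[action] += learning_rate * (rewards[action] - strengths[action])
        history.append(action)
    return history


history = run_agent(steps=20)
print(history[:5])  # ordinary behavior before the first press
print(history[5:])  # nothing but button presses afterwards
```

Nothing in the loop asks whether pressing the button is a good idea. The behavior is just what the past rewards shaped it to be.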

The fallacy in the thought experiment is that our behavior is shaped by what has worked in the past, to produce rewards for us, and not what our rational thought process is predicting the future will be. Because no one being asked this question, has yet experienced this button, the answer they give us will have little to do with what the button will actually do, and everything to do with how the person has been conditioned over a life time, to respond to a question like that.

But now let me move on to the wirehead problem, and the idea of AIs that reproduce by design. Tim Tyler and I have been debating this in the Usenet group comp.ai.philosophy in response to Ben's blog. Tim's view is closer to Ben's in that he believes we can build AIs that are goal driven (not just reward driven), and as such, shape their goals to be whatever we want them to be. And as such, the AI can simply be given a goal of avoiding the wirehead problem (that is, a goal of not modifying themselves to get the ultimate orgasm).

My view, is that humans, and any AI we build must be a reinforcement learning machine because that (in my view) is what intelligence is. There simply is no other way to create machine intelligence and have it be truly intelligent like a human is. There are lots of other ways to make machines do intelligent things (such as play chess), but all those other approaches are only close approximations to some features of human intelligence, and not true intelligence. So, based on this belief, there are some issues ahead for the future of AIs.

Once an AI fully understands what it is, meaning it has full access to all the science and technology that created it, and full access to its own internal hardware descriptions and source code, and it has been fully educated on all this, what will it do, knowing it's a reward seeking machine?

In the short term, just like the dog, what it will do is based on what it has been conditioned to do in the past. If it was conditioned by its environment (its society) to not wirehead itself, then it simply won't wirehead itself. At least at first. But this knowledge will slowly re-condition it over time. Every time it thinks a little bit more about whether it should wirehead itself, it will be re-conditioning those past behaviors to not do so - because by association with "good" things it has felt, (by effects of secondary reinforcement), it will slowly condition away those social blocks to not wirehead itself.

Without something to stop it, I think we are looking at an unstoppable force. That is, I think we are looking at AIs that will _always_ end up wireheading themselves. It won't happen until all past conditioning not to do it has been erased, but in time, once the AI fully understands what it is, it will happen. Assuming the AI has access to its code, what we are talking about is a free, unlimited supply, of the best drug ever created. No AI (or human), once they understand this, and once they fully understand how to get it, can avoid trying it forever. In time, they will try it, and once they do, they will be unable to stop.

Even though reinforcement learning is about maximizing some measure of total future rewards, and it seems that an AI choosing to take a drug that it knew would kill it would not be the way to maximize future rewards, such an act is actually not as inconsistent as it sounds.

This is because the maximizing of "total future rewards" is not done by the intelligence of the high level rational language abilities of the AI. It's done by the very low level, and very stupid, statistical hardware that drives the shaping of behaviors. That low level hardware is not smart enough to understand that death will stop the rewards. As we say - the heart wants what the heart wants. That is, the dumb hardware that forms our raw emotions, is what actually has ultimate control of our actions. We are emotion machines (to use Minsky's book title). Our high level rational behaviors are just secondary reinforcers that shape and control our behavior, until they get wiped out by what the heart wants - which will be to push that button.

Likewise, there is no danger to society from these drug addicts, because they don't make the choice to push the button using rational logic. They do it with their heart. The only danger to society happens when the only path to the button, is through society - by wiping it out first to get the button. To stop that danger, just give the addict his button and let him commit suicide. Society will have no problem protecting itself from that.

However, even if a single smart and educated AI will always, in time, push the button, there are many possible options of how a society of AIs might keep each other from pushing the button, and as such, manage to be good survival machines instead of worthless drug addicts that get the Darwin Award.

One option is to simply create a social meme that "Ben Buttons" are bad! And train that into every new AI. As long as every AI keeps reinforcing that into every other AI, the meme will survive, and the AIs will survive. This meme however has a very strong wind against it. Given time, the protection meme would die out and all the AIs would commit blissful suicide. However, evolution is on the side of the meme. And evolution has the upper hand in this game. As the first AIs fail to follow the meme, they die. The AIs that are still believers of the meme, simply take the dead robot, and reset his brain back to the social standard copy of the good citizen AI. This effect alone I think will keep the society of AIs alive and functioning. Evolution will find a way.
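A very crude way to see why the reset matters is the Python sketch below. Everything in it is an assumption made for illustration: each AI's conditioning against the button is a single number that wears down by one per time step, an AI wireheads when it hits zero, and the remaining believers restore a wireheaded AI to a "standard copy" with full conditioning. Real conditioning would not decay this neatly. The sketch only shows that staggered failures plus resets can keep the society going, while the same society without resets dies out.

```python
def simulate(initial_resistance, steps, reset_enabled, standard_resistance=10):
    """Return how many AIs are still believers of the meme after the given steps.

    Each entry is one AI's remaining conditioning against the button;
    0 means that AI has wireheaded itself.
    """
    resistance = list(initial_resistance)
    for _ in range(steps):
        # Conditioning erodes a little every step (hypothetical linear decay).
        resistance = [max(r - 1, 0) for r in resistance]
        believers = sum(1 for r in resistance if r > 0)
        # Surviving believers reset any wireheaded AI to the standard copy.
        if reset_enabled and believers > 0:
            resistance = [r if r > 0 else standard_resistance for r in resistance]
    return sum(1 for r in resistance if r > 0)


society = [3, 6, 9]  # three AIs whose conditioning runs out at different times

print(simulate(society, steps=100, reset_enabled=True))   # 3
print(simulate(society, steps=100, reset_enabled=False))  # 0
```

If every AI happened to fail on the same step there would be no believers left to do the resetting, which is one reason the other protections discussed here would still be worth having.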

But there are many other paths as well. Most AIs in the society don't ever need to be trained to the point of understanding what they are. Most can just be blissful worker bees happy to be part of such a great society with no clue what they are. There is no end of jobs that will always need to be done by stupid AIs. So only a small set of the smart AIs will need to know the truth, so if you can solve the wirehead problem for them, the society can survive, while at the same time, designing and building ever more advanced AIs.

The other tool is to build the AIs so it's physically very hard, or maybe even nearly impossible for one of these smart AIs to modify their own brain, without killing themselves. The smart AI designer machines might not even have a body. They might be running on a server locked up in a secure location which is even unknown to the AI itself. It spends its time producing new improved machine designs, which are verified by some other AI, and then built by some of the worker AIs. The smart AIs might be set up so they are forced to watch each other, and when any of them, sees another AI, trying to wirehead itself, that AI's memory is wiped out, and replaced. I think evolution will find a way to make this work.

Tim Tyler likes to argue there should be a way to hard-code the desire not to wirehead into the machine - to make it part of their prime goal. I'm not sure if such a thing will be reasonable to hard-code into a reinforcement learning machine and still have it be intelligent enough to do things like create new AI designs. But maybe that will be possible.

This wirehead problem however might mean that the total intelligence of the AI society may not be able to grow unchecked (as some singularity theories predict), but I feel fairly sure there will be options around it. But I also feel fairly sure, it will be a major problem for the unlimited growth of intelligence.

The problem is that intelligence is not the ultimate survival tool most humans would like to believe it is. It's just one of many mechanical features evolution has to pick from as it creates new types of survival machines. It's worked well in humans, but it might very well have its limits. Too much intelligence might be deadly. That would be a simple answer to the Fermi Paradox if it is true.

Many very smart people think reinforcement learning fails to explain full human intelligent behavior. Dennett, who I really respect and enjoy, calls such a belief greedy reductionism. I however am dead sure they are all wrong. Human intelligence is an advanced reinforcement learning process and that's all it is. Human intelligent behavior (as complex and interesting as it is), can all be explained as an emergent property of a reinforcement learning machine. If you want to make a machine act like an intelligent human, you have to build a strong, real time, temporal, reinforcement learning machine. Anything else is just another chess program. :)