Monday 21 March 2011

Some advantages of ANNs

In cognitive science artificial neural networks often come under the banner of 'connectionism' or 'connectionist systems'. This distinguishes them from the symbol processing systems that have traditionally been the foundation of computational work in cog science. It also distinguishes them from ANNs used for purposes other than cognitive science, since ANNs can be applied to all sorts of problems in industry, finance, etc.
Connectionist systems have several advantages. The ones we'll focus on here are graceful degradation and generalisation to data which was not seen in training. Both of these arise because of the distributed nature of the processing in a neural network. To demonstrate these two phenomena, we'll build a simple little network that is able to distinguish fruits based upon their features.
The six green circles are the six feature sensors. The first detects a value of 1 if the fruit has yellowness and 0 if it doesn't, the second a value of 1 if the fruit has greenness and 0 if it doesn't, and so on for the six features listed. All six feature sensors are connected to both of the two neurons in the associative layer. The table in the bottom right-hand corner shows the weights for the network: the rows (labelled i) are the two neurons and the columns (labelled j) are the six feature sensors. So the orange square is the weight on the link from feature sensor 1 to neuron 1, which has a value of 0.5 (the value isn't printed on the diagram, to avoid clutter), and the green square is the weight between feature sensor 4 and neuron 2. Now, let's put this into Excel...
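Before we do, it's worth spelling out what each neuron will compute. If s1 to s6 are the six feature-sensor readings, then the activation of neuron i is
w(i,1)*s1 + w(i,2)*s2 + w(i,3)*s3 + w(i,4)*s4 + w(i,5)*s5 + w(i,6)*s6
and the network's answer is simply whichever of the two neurons ends up with the larger activation.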

There's no need to have this network learning, so we can use a similar basic set-up to the Darwinian Grak exercise. However, there are going to be six inputs, so we need six weights going into the neuron instead of just two. We also don't need a threshold function here, because the network is simply going to choose a fruit according to which neuron is more strongly activated. Set up your neuron like this:
The input spots are empty at the moment because the network isn't currently looking at anything. The six weights are set to the values in row 1 of the table we saw above. Just as in the Darwinian Grak case, the activation is the sum of all the input*weight products, so you will need to extend the formula with the extra four sensors and weights. After you've done this, double clicking cell C4 should bring up the following:
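If you've used the same layout as me, with the six feature sensors in cells A3:A8 and the six weights for neuron 1 in cells B3:B8 (the weight cells are an assumption - use whichever cells you put your weights in), the formula should look something like this:
=A3*B3+A4*B4+A5*B5+A6*B6+A7*B7+A8*B8
Excel's SUMPRODUCT function gives the same result more compactly, if you prefer: =SUMPRODUCT(A3:A8,B3:B8).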
That's neuron 1 set up. Now do the same for neuron 2, bearing in mind that the weights should be the values in row 2 of the table from the first picture.
When it comes to creating the formula for the activation, remember that the inputs to both neurons are the same, so double clicking the cell that contains the activation for neuron 2 should lead to the following display:
And that is our two neuron network finished!
The next step is to create the inputs. I'm going to create a little table below the network to hold these. When we present a fruit to the network, it will be a case of telling the feature sensors to refer to one of the lines of this table. Mine looks like this:
From this table we can see that line 18 is the banana because it stimulates the feature sensors for yellowness, longness, white flesh and thick peelable skin. The line below, line 19, is the apple because it stimulates the feature sensors for greenness, roundness and white flesh.
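If your columns run yellowness, greenness, longness, roundness, white flesh, thick peelable skin - that ordering is an assumption about my table, and any consistent order will do - line 18 (the banana) reads 1, 0, 1, 0, 1, 1 and line 19 (the apple) reads 0, 1, 0, 1, 1, 0.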
We can also state at this point that we would like neuron 1 to fire when the network thinks it's a banana that it's looking at, and neuron 2 to fire when it thinks it's an apple it's looking at.

I don't want to have to change all six inputs every time I present a different fruit to the system, so I'm going to automate the presentation process.
  • Click on cell E14 and enter the number 1. This will refer to line 1 of the input table we've just created. When we've finished the following few steps, we'll be able to change all the inputs just by changing this to a 2.
  • Click on the first of our feature sensors - cell A3.
  • Go to the Formulas tabbed menu and click the Insert Function button on the left.
  • In the floating window that pops up, select the CHOOSE function (if it's not there, you may need to select 'All' in the drop down category menu).
  • In the new floating window that appears for the CHOOSE function, type E14 into the first box (Index_num). This means that the feature sensor checks cell E14 to find out which row of the inputs table to look at.
  • In the box for Value1, type A18. This is the yellowness feature of the banana. In the box for Value2, type A19. This is the yellowness feature of the apple. Press OK (the finished formula for this cell is shown just after this list).
  • Open a new CHOOSE box for our second feature sensor - cell A4. This should again reference cell E14, but this time the two values should be B18 and B19, the greenness features of the banana and apple.
  • Repeat this process for the other four feature sensors, cells A5, A6, A7 and A8.
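As a check, the finished CHOOSE formula in the first feature sensor (cell A3) should look like this:
=CHOOSE(E14,A18,A19)
and the one in cell A4 should be =CHOOSE(E14,B18,B19), with the pattern continuing across the columns for the other four.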
Let's check that it works as intended. Go to cell E14 and change the 1 to a 2. Do all the values in our feature sensor cells (A3:A8) change from the banana to the apple? Change E14 back to 1. Do they change back to the banana?

That's the network finished. If you present different fruits to the network, it should tell you whether it's an apple or banana by outputting a greater value from either the banana cell (neuron 1 - cell C4) or the apple cell (neuron 2 - cell C11). Just to make it pretty, I like to add a final little bit of code. Click on cell D7 and input the following:
=IF(C4>C11,"banana","apple")
You'll see what this does if you play with the system!

Just in case you have got stuck anywhere along the way, here's a video that shows what's in each of the cells:


Ok, that's the network. Now we can go ahead and demonstrate generalisation and graceful degradation. Generalisation first...

We've seen that the network recognises normal bananas and apples with no problems, but what about an abnormal one? Bananas are green before they ripen. Let's add an input for a green banana. Modify your input table like this:
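A green banana is just a banana that hasn't turned yellow yet, so (keeping the column order assumed earlier) its new row - row 20, directly below the apple - reads 0 for yellowness, 1 for greenness, and 1, 0, 1, 1 for longness, roundness, white flesh and thick peelable skin.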
You'll also need to update the CHOOSE functions in the six feature sensor cells (A3:A8). Do this by simply adding a comma and the appropriate feature of the green banana to the function. So the CHOOSE function for feature sensor 1 (cell A3) will look like this:
=CHOOSE(E14,A18,A19,A20)
where the final input, A20, is the part I've just added. Do this for all six feature sensors.

Now test the network by changing the value in cell E14 to a 3. What is the output: apple or banana? Check all three inputs again. Is the difference between the two output values the same when you present a green banana as when you present a normal banana? Why or why not?

What features would these things have: a Golden Delicious apple, a peeled banana, apple sauce?
Design your own inputs for the above and test the outputs. Does the network perform as expected on all of them? Pay attention to the difference between the output values in all cases.

My apple sauce was zero for all of the features except white flesh, and the output was 0.5 from both neurons. This means the network has no idea whether it's looking at an apple or a banana. If you'd never experienced apple sauce before and were unable to smell or taste it, do you think you would be able to tell the difference?
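That result is easy to check by hand: with only the white flesh sensor set to 1, each neuron's activation is simply 1 multiplied by its white-flesh weight, so the matching outputs of 0.5 tell us that both neurons weight white flesh at 0.5 and neither fruit gets the upper hand.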

From the above tests, you should gather that this simple little network is able to generalise from things it has experienced before to things it has never experienced, using the knowledge it has about normal apples and bananas. Next, onto graceful degradation...

If a system exhibits graceful degradation, it doesn't suddenly break completely because a little bit of it is damaged or because the input to the system is noisy (not perfect). Instead, performance decreases a little for every little bit of damage or extra noise.

Firstly, we'll simulate a noisy input. In the picture below you can see that I've changed the value of the 'white flesh' feature from a 1 to a 0 (in the orange cell). You could think of this as being caused by a speck on the camera lens, or a cataract on the animal's eye.

Try playing with noisy inputs by changing the values of different features of the banana and apple. You can also try changing the values to values other than one or zero - how about 2? 5? -1? -3? How many features can be noisy before the system becomes unable to distinguish apples and bananas?

Now onto what happens when we damage the system a little bit. In this case, damage refers to tampering with the weights since this is where the information about how to perform is stored. In the picture below you can see that I've changed the weight from the first feature sensor to neuron 1 from a 0.5 to a 0.

How does the network perform if you do this? How much can you change the weight value before the network is unable to perform correctly? How many weights can you change just a little before the network can't perform correctly?

You should find that performance decreases in proportion to the damage you inflict on the weights. This is strongly analogous to the way animal brains can sustain damage without breaking completely: Parkinson's, Alzheimer's, lesions, etc.

Monday 14 March 2011

Building a Neural Network in Excel 1

In this post we're going to build a simple neural network in Microsoft Excel. Excel is surprisingly good for this, as it's familiar to many people and simple to use after minimal exposure. Moreover, it has a lot of built-in functions (IF, CHOOSE, etc.) that work much like their counterparts in programming languages (Matlab, C#, Java), plus many other useful functions.

The scenario we'll use is that of the Grak: the imaginary creature in the picture opposite. The Grak hangs from a branch by its single foot and waits for its dinner to walk by beneath. It has two sensors - one for temperature and one for butyric acid. The sensors are binary, meaning they only pick up values of 1 or 0 (though we can change this later). In the place where the Grak lives there are three other kinds of animals: Wampuses (cold but smelly), Wiggles (hot but not smelly), and Fraggles (both hot and smelly). These three creatures, plus the option of no creature at all (cold and not smelly), give us the following logic table:
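Assuming S1 is the temperature sensor (1 = hot) and S2 is the butyric acid sensor (1 = smelly), and remembering that the Grak should only drop onto something it can eat, the table works out as:
S1  S2  Drop?
0   0   0
0   1   0
1   0   0
1   1   1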

So, columns S1 and S2 are our network's (Grak's) sensors, and the column marked Drop? is the message from the neuron's axon to the muscle telling it to drop (1) or not (0).

You can see from the table that our Darwinian Grak only eats one kind of animal. Which one?



Darwinian Grak

1. Open a new worksheet in Microsoft Excel.
2. We are going to input the data for the following Darwinian Grak into the worksheet:

The two boxes, S1 and S2, are the two sensors that the Grak has. At the moment it is sensing a zero value in each one of them. Each of these sensors connects to the neuron body via a synapse; these synapses are what we call the weights, w1 and w2. In this case the weights are set to the values shown. The threshold is the limit that the activity of the neuron must reach in order to fire - in this case 1.



Enter the information into Excel in the following way:

What you’ve said here is that there are two inputs, each with a value of zero. There are two weights, with values of -0.5 and 0.2 respectively. There is a threshold of 1. There are also activation and output values, but these are not set yet – this is because we are going to have the network (Grak) calculate these itself.

3. Now we need to add S1*w1 to S2*w2. To do this, first click on cell C4. Now go to the formula bar and enter the following:
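Assuming the sensor values S1 and S2 sit in cells A3 and A4 and the weights w1 and w2 in cells B3 and B4 (that layout is an assumption - substitute your own cells if yours differ), the formula would be:
=A3*B3+A4*B4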

When you press return, the value zero should appear in cell C4. This is because S1*w1 + S2*w2 = 0.

4. Finally, we need to compare the value we’ve just calculated with the threshold, and output a 1 if the value equals or exceeds the threshold and a 0 otherwise.

Click on cell E4. Now go up to the Formulas tabbed menu and select 'Insert Function':

When you have done this, a separate window will pop up that gives the option to “Select a function:”. From this list, choose IF. Another window will pop up. Input the following data and click OK:

What you have said is: if the data in cell C4 is greater than or equal to ( >= ) the value in cell D4, then make the value in cell E4 1. If not, make it 0. Simple, isn’t it!
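If you'd rather type the formula straight into the formula bar instead of using the dialog, the equivalent is:
=IF(C4>=D4,1,0)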

5. Now try each of the following input pairs for the Darwinian Grak's sensors:
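With two binary sensors there are only four possible pairs to try: (S1, S2) = (0, 0), (1, 0), (0, 1) and (1, 1). Enter each pair into the two sensor cells in turn and note the output in cell E4.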

Bearing in mind that the Darwinian Grak only feeds on Fraggles, does the Grak you have made survive? Why, or why not?

6. The Darwinian Grak’s synapse strengths are genetically determined. The Grak above has a sister whose synapse strengths are 0.5 and 0.6. Adapt your Excel worksheet for this new Grak, then test it with each of the input pairs above. Again bearing in mind that Fraggles are the only thing this Grak can eat, does it survive?
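To check each case by hand, work out S1*w1 + S2*w2 and compare it with the threshold of 1; for instance, with both sensors active the sister's activation is 1*0.5 + 1*0.6 = 1.1.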


So far we’ve seen two sibling Graks whose behaviours are pre-determined at birth (perhaps genetically). Should one of them be born with a mutation that makes it perform badly, it will die. This was the case with the first of the two, but not with the second. This is all fine, but it would be better if a baby were able to get some instruction from its mother so that any birth defect in synapse strength could be corrected and it could survive... it would be better if it could learn! This is the case with the Common Grak that we'll look at in the next post.


Here's a video of me going through the steps from this post:

Sunday 13 March 2011

Introduction to Artificial Neural Nets 2
Finishing off the Perceptron

In the last post we laid down the basics of an artificial neuron, but we're not quite finished on our road to the generally used artificial neuron yet. I left you with a single neuron that had two dendrites, each of which could be stimulated to activity or not.
However, if you know about neurons, you will know a few facts that cause problems with this simple model.
1) Any number of pre-synaptic neurons can connect to a dendrite, and our model supposes a one input - one dendrite relationship.
2) Pre-synaptic neurons can also connect to the cell body, and our model supposes they only connect to dendrites.
3) The synapse (the connection between the pre-synaptic and post-synaptic neuron) of a biological neuron is capable of changing its strength in order to amplify or attenuate (weaken) the message from the pre-synaptic neuron.

These facts are all easy to incorporate into our model. Firstly, we need to make a distinction between the inputs to the cell and the synapses that amplify or attenuate the input. To do this, all we need to do is slide the decimals that we used as inputs in our previous model down onto the dendrites. The two pictures below show this:

Now we have a number on each of the dendrites - in this case 0.5 and 0.7, though the number could be anything. We also have an x and a y left as inputs. These variables can take on any value, and are capable of changing from one presentation to the next. So, for example, if x is blue and y is yellow, an input of [1,0] will mean pure blue, [0,1] pure yellow, [1,1] green, and [0,0] no colour at all. These inputs can be presented to the network at any time, just like various sights can be presented to our eye at any time, and the network (or our eye) will recognise them.
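With this change the cell body no longer sums the raw inputs; it sums each input multiplied by the weight sitting on its line, so the activation here is x*0.5 + y*0.7 rather than simply x + y.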

The modification above accounts for point 3) in the list I gave. After doing this modification, we can take care of the other two points by a simple semantic change. Instead of thinking of the numbered lines as dendrites, we think of them as synapses, and for every single connection between any pre-synaptic neuron and this post-synaptic neuron, we add a weighted line (a line with a number on it saying how strong the connection is). In fact, the only reason I talked about dendrites at all in the previous post is that the two weighted lines have always looked like dendrites to me, and I find them a good stepping stone between the well-known physical shape of a biological neuron and the abstraction of an artificial neuron.



Saturday 12 March 2011

Introduction to Artificial Neural Nets 1
The simplest model of a neuron – the perceptron

In this picture we can see a neuron with the dendrites, cell body and axon labelled. The dendrites receive information from other neurons via chemical messengers (neurotransmitters); this information is converted to an electrical charge which accumulates in the cell body and which, if it exceeds a certain threshold charge, sends a pulse (actually, a series of pulses) down the axon. This pulse is the output message from this neuron, and when it reaches the end of the axon it is converted back to chemical messengers to be sent to the dendrites of other neurons, and so the process repeats.


This whole process is easy to model on a computer. The following two diagrams show how this is done.

This picture is the same as the first we saw, but I've stripped away most of the dendrites to leave just two. This is not a necessity – neurons with any number of dendrites (inputs) can be modelled, but just to keep it simple let's start with two. The two dendrites are labelled x and y. The cell body is still present, and still sums the inputs of the two dendrites, and the axon is still there to send the information on if the threshold is exceeded.


Drawing this in a more formal manner, we get the diagram on the right. All the information in the above picture is included in this one too.

The Σ sign represents the summation of x + y in the cell body.


Ok, so we have a simple model of a neuron (called a perceptron now that it's on our computer). So what is it capable of doing? Well, we can answer this question just by thinking about what comes out of the axon: the output. We only have one neuron, so one axon, and therefore one output. In the case of a biological neuron, it's either firing, or it isn't. We can represent this as either 1 (when it fires) or 0 (when it doesn't). So there's our answer: the perceptron is capable of taking its inputs, summing them together and categorising them into one of two groups depending on whether or not the sum reaches the threshold.

Our model does not actually do this yet though. At the moment it just sums the inputs together and tells us what that sum is. So if the x input is 0.5 and the y input is 0.7, the axon will give an output of 1.2. We need to add the threshold function so that it outputs only 1 or 0 (which could mean 'yes' or 'no', 'cat' or 'dog', or any pair of categories). The following diagram includes a threshold function:

The dashed line is the threshold. We haven't yet determined what the value of the threshold is. In a normal biological neuron the threshold is about 15mV (it rests at about -70mV and fires at about -55mV, so the threshold is the difference between these). For our model, let's keep things simple and set the threshold as 1. Therefore if the sum of x + y equals or exceeds 1, the output from the axon will be 1. If the sum is less than 1, the output will be 0.

Right, now we have everything we need to give a little demonstration of the perceptron.

It's going to be a simple demonstration. We want to know when both of the inputs into the neuron are active, so if both x and y receive a message from other neurons, then we want our neuron also to send a message by outputting a 1. On the other hand, if only one of our inputs is receiving a message, or if neither is, then we don't want our neuron to fire, and so a 0 should be outputted. We can use the input values that we set before: the input to x = 0.5 and the input to y = 0.7.

The above diagram shows the case when the inputs to both dendrites, x and y, are active. The sum exceeds the threshold value, and thus the perceptron outputs a 1, as we wanted.

The three other possible cases are shown in the three diagrams below; when there is an input to x, but not to y, when there's an input to y, but not to x, and when there is no input at all. In all these cases the sum in the cell body does not equal or exceed the threshold value of 1, so the output of the perceptron is 0, equivalent to a neuron not firing. The perceptron does what we have asked it to – it tells us when both of its inputs are active by outputting a 1 as opposed to a 0 in any other case.
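Working through the arithmetic for all four cases with a threshold of 1: both inputs active gives 0.5 + 0.7 = 1.2, which reaches the threshold, so the output is 1; x alone gives 0.5, y alone gives 0.7, and neither gives 0, all below 1, so the output is 0 in each of those three cases.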

Well, if you're anything like me, at this point you'll be thoroughly unimpressed by what a perceptron does! It's really no more than adding a few numbers together and then saying whether the sum is above or below a given threshold value. This can all be written much more concisely in mathematical form:

If sum(inputs) >= 1, then output = 1

else output = 0

So what exactly is the use of the perceptron?

I'll provide three answers to this question:

  1. The perceptron above doesn't actually learn anything. Things don't get interesting with neural networks until you get them learning, so hang in there!
  2. Neural networks are able to generalise from the things they know to things they've never seen before. They can make educated guesses!
  3. Nervous systems, including brains, are stunningly good at processing information. They are not only responsible for (almost) everything we do, but also for who we are. They give rise to consciousness. Yet all of these amazing functions are based upon little processing units (more or less) like the perceptron we described above, working together.