The "travelling salesman problem" is the name commonly given to the task of finding the shortest path connecting a given set of points. The name comes from the idea of a salesman who needs to visit each town in a region and does not want to drive further than necessary, but the same problem might arise if one wanted, for example, to link several buildings to a mains electricity supply or several computers to a network. The following tasks explore one way of solving the problem.
The Excel workbook Salesman.xls contains a list of twenty points to be visited. Each point has a label, which is just an integer from 1 to 20, and the x and y coordinates of the point.
Make a chart illustrating the route connecting these points in the order they are given in the worksheet. Use an XY graph with the data points connected by straight lines.
Write a VBA function called Distance to calculate the distance
between any two points, given their x and y coordinates. The arguments
passed to your function should be two Range objects, each containing
the two coordinates of one of the points, and it should return a
number of type Double. In other words, the first line of your function
should look something like
Public Function Distance(v1 as Range, v2 as Range) As Double
To access the individual coordinates, you can use the Cells property of the Range object. For example, the first cell in the Range v1 is v1.Cells(1), and the number contained in that cell (the cell's value) is v1.Cells(1).Value. You will also need to use the built-in Sqr function, which calculates the square root of a number.
In cell D3, enter a formula using your function to calculate the distance between points 1 and 2. Check that the result is what you would expect. Fill the cells down to D21 with the same formula, so that each contains the distance between two consecutive points along the route. In cell D22 place the sum of all these distances, i.e. the total length of the route. Format the cells so that all distances are displayed to an appropriate number of decimal places.
The route given in the worksheet you downloaded is clearly not the shortest one linking the points! One way to find the shortest route would be to try all possible routes, calculate the length of each, and simply pick the smallest number. The problem with this approach is that the number of routes grows rapidly as the number of points to be visited increases. How many ways are there of connecting 20 points? How about 25 or 50 points?
An alternative approach is to try making small changes to the route and see which ones reduce the total length. For example, one such small change would be simply to swap two points in the list.
Write a subroutine SwapRows to swap two points in the list. You can start by writing a subroutine that always swaps the same two points, e.g. points 1 and 2. Remember that you only want to move the numbers in the first three columns. The distances in column D will be recalculated automatically.
You can start by creating a Range object to contain the whole region describing the route:
Dim route as Range Set route = Range("A2:C21")Then you can access individual rows using the Rows property of the Range object row1 = route.Rows(1) and individual cells as above using e.g. cell2 = row1.Cells(2). You can check that you are accessing the right cells and ranges by adding a line like row1.Select to your subroutine. Then you can easily see whether the region selected in the worksheet is the region you expected.
You can swap two rows by swapping the three fields one at a time, e.g. Point, then x, then y. Note that when you swap two numbers, you need an extra variable to hold one of the values temporarily, e.g.
Dim temp as Double temp = cell1.Value cell1.Value = cell2.Value cell2.Value = tempWhat happens if you don't use the variable temp?
What happens to the total length of the route when you run your subroutine? Run it again to swap the two points back to their original positions.
Now generalize your subroutine so that it takes three arguments - a Range object and two integers. The Range object will tell the function where the table of points on the route is, and the two integers will tell it which rows to swap.
Because your SwapRows subroutine now requires arguments, you cannot run it directly any more. Instead, you have to call it from another subroutine. Write a simple subroutine called Salesman to call SwapRows and swap the same two rows as you did above. Then try swapping different pairs of rows.
Swapping rows around "by hand" like this is a slow way to find the shortest route, but we can program the computer to try swapping rows at random and keep the changes that make the route shorter.
To decide which rows to swap, we can generate two random integers in the range from 1 to 20. The Rnd() function generates a (pseudo-)random floating point number in the range 0<=z<1. So we can get a number in the range we want using iRow = int(20*Rnd())+1. We can access the length of the route from the worksheet as follows:
Dim rLength as Range, routeLength as Double set rLength = Range("D22") routeLength = rLength.ValueModify your Salesman subroutine so that it swaps two rows chosen at random, and swaps them back again if the total length of the route increases.
Run the subroutine a few times and see what happens to the route. Then add a For loop to the routine to repeat the procedure a number of times. Start with a fairly small number of iterations until you find out how long your routine takes to run, then increase the number so that it runs for a minute or two. Does your program find the same route every time if you start with the points in the same order? (You can use "Sort" under the "Data" menu to sort the points into their original order.) If not, clearly it is not always finding the shortest possible route. Is there a reason why it might not be able to? How might you improve the algorithm to get round the problem?
One variation on the method we have been using is "simulated annealing". The name comes from the process of annealing a solid, such as steel, to change its properties by removing defects. Don't worry to much about the details, which are described briefly here to provide some background. The presence of defects in a metal raises its internal energy, so in its most stable state at low temperature, a metal should contain no defects. However, if a metal is cooled rapidly from its melting point, defects will be present because there is not enough time for them to be removed before the metal is too solid. Annealing involves cooling the metal slowly to avoid "freezing in" the defects. Keeping the temperature high for longer allows the metal to reach higher-energy states in the short term on its way between low-energy states.
In solving the travelling salesman problem, we can modify the procedure we have used so far by accepting some changes to the route even if they increase its length. This may seem a strange idea, but sometimes we can only get from one fairly short route to another, shorter one by going via a longer route. We can allow this to happen by accepting a longer route with a probability that depends on how much longer the route is. Then we gradually decrease this probability with each iteration until by the end we are only accepting shorter routes. The probability we will use is given by exp(-D/T), where D is the increase in length of the route and T is a quantity analogous to the temperature of the metal in real annealing. We gradually decrease the temperature T, for example by multiplying it by a factor a little below one after each iteration.
Change your Salesman routine so that instead of always rejecting a longer route, it accepts it with the probability given in the last paragraph. You may find the following example useful.
Dim prob as Double, z as Double z = Rnd() If (z<prob) Then ' Do something! ' (This part of the code will only be run with a probability ' given by prob.) End IfHow does the behaviour of your program differ from before? Is the route (or routes) it finds shorter or longer? How does its behaviour vary if you change the starting temperature, the rate of cooling and the number of iterations?
Modify your program so that instead of assuming that the route and the length are always in the same cells, it allows the user to select the relevant Ranges. You should not assume that the number of points will always be the same. You could also allow the user to select the starting temperature, rate of cooling and number of iterations.
There are many ways you could extend this program. For example: