Friday 30 October 2015

Lecture 9

The transportation problem and assignment problems are important problems, both practically and theoretically. They are simple without being trivial, and therefore are good problems on which to test new ideas about algorithmic methods.

I mentioned the fact that the Hirsch conjecture for the $n\times m$ Hitchcock transportation problem polytope states that no more than $m+n-1$ pivots should be needed to move between any two basic feasible solutions.

The diameter of a specific transportation polytope depends on the values of $n$, $m$, the $s_i$ and the $d_j$, all integers. The Hirsch conjecture can be checked combinatorially for small values of these. That has been done and the Hirsch conjecture has indeed been found to hold. (We allow pivot steps for which $\sum_{ij}x_{ij}c_{ij}$ does not decrease.)

A recent seminar on the subject is Kim's Transportation problem and the diameters of transportation polytopes (2015). Kim says that Hurkens has proved the upper bound $4(m+n-1)$. It is folklore that Hurkens proved $3(m+n-1)$ in 2007, but that is in an unpublished paper. It is not hard to prove that the Hirsch conjecture holds for the special case $n=2$ and general $m$. I cannot recall whether or not it has been shown for $n=3$.

The assignment problem polytope has diameter 2 because every permutation is the product of two cycles, and applying a cyclic permutation to an assignment corresponds to a pivot step in the assignment problem. See On the assignment polytope, by Balinski and Russakoff.

The network simplex algorithm which we used in today's lecture to solve the transportation problem is not a polynomial time algorithm, but in practice it works very well. There do exist simplex-like pivoting algorithms for the transportation problem that run in polynomial time. I will explain next time a polynomial time algorithm for the special case of the assignment problem. This is the Hungarian algorithm, which solves the $n\times n$ assignment problem in time $O(n^3)$.
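If you would like to experiment before then, here is a quick sketch (my own toy cost matrix; scipy's routine is a shortest augmenting path variant rather than the Hungarian algorithm as we will develop it, but it solves the same problem in cubic time):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j] is the cost of assigning worker i to job j (made-up data).
cost = np.array([[4, 1, 3],
                 [2, 0, 5],
                 [3, 2, 2]])

rows, cols = linear_sum_assignment(cost)               # an optimal assignment
print(list(zip(rows, cols)), cost[rows, cols].sum())   # matched pairs; total cost 5
```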

Thursday 29 October 2015

Lecture 8

The minimum cost flow problem (MCFP) is important because a huge number of problems are special cases of it. We saw in 8.5 that the longest-path problem can be viewed as an MCFP and leads to the critical path method, one of the oldest of all operational research techniques.
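As an illustration of the critical path idea, here is a minimal sketch (my own toy activity network, not the notes' formulation): longest paths in a directed acyclic graph, computed by relaxing edges in topological order.

```python
from collections import defaultdict

def critical_path(edges, source):
    """Longest-path distances from `source` in a DAG; edge (u, v, d) takes time d."""
    graph, indeg, nodes = defaultdict(list), defaultdict(int), set()
    for u, v, d in edges:
        graph[u].append((v, d))
        indeg[v] += 1
        nodes.update((u, v))
    order, stack = [], [v for v in nodes if indeg[v] == 0]
    while stack:                                  # Kahn's topological sort
        u = stack.pop()
        order.append(u)
        for v, _ in graph[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    dist = {v: float("-inf") for v in nodes}
    dist[source] = 0.0
    for u in order:                               # relax edges in topological order
        if dist[u] > float("-inf"):
            for v, d in graph[u]:
                dist[v] = max(dist[v], dist[u] + d)
    return dist

dist = critical_path([("s", "a", 3), ("s", "b", 2),
                      ("a", "t", 4), ("b", "t", 6)], "s")
print(dist["t"])   # 8.0: the critical path is s -> b -> t
```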

We have studied the solution of the MCFP by the network simplex algorithm. As with a general linear programming problem, the possibility of making a bad choice of pivots can give the algorithm a worse than polynomial running time. However, there do exist simplex algorithms which run in polynomial time. See

James B. Orlin (1997). "A polynomial time primal network simplex algorithm for minimum cost flows". Mathematical Programming 78: 109–129.

Suppose $n$ and $m$ are the numbers of vertices and edges, respectively, and each cost is either $\infty$ or an integer bounded by $C$. Orlin describes a simplex-like pivoting algorithm with running time $O\bigl(\min(n^2m \log nC, n^2m^2 \log n)\bigr)$.

Monday 26 October 2015

Lecture 7

You will have seen that the details of the Ellipsoid algorithm are really quite intricate. I have tried to show you the key ideas in Lemmas 7.1-7.3, but have not dealt with the issue of approximating square roots in polynomial running time. If you have a question about the explanation or think you see an error in the notes, please let me know so I can fix it.
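For those who want to see the key computation in isolation, here is a minimal sketch of the single update step at the heart of the method (these are the standard central-cut formulas, written in my own notation rather than that of the notes):

```python
import numpy as np

# E = {x : (x-c)^T Q^{-1} (x-c) <= 1}. If a constraint a^T x <= beta is
# violated at the centre c, the half-ellipsoid {x in E : a^T x <= a^T c}
# is enclosed in a new ellipsoid whose volume is smaller by a factor of
# at most exp(-1/(2(n+1))).

def ellipsoid_update(c, Q, a):
    n = len(c)
    Qa = Q @ a
    a_tilde = Qa / np.sqrt(a @ Qa)       # a normalized in the metric of Q
    c_new = c - a_tilde / (n + 1)
    Q_new = (n**2 / (n**2 - 1.0)) * (Q - (2.0 / (n + 1)) * np.outer(a_tilde, a_tilde))
    return c_new, Q_new

# One step in R^2, cutting the unit disc with the constraint x_1 <= 0.
c, Q = ellipsoid_update(np.zeros(2), np.eye(2), np.array([1.0, 0.0]))
print(c, np.linalg.det(Q))   # centre moves to (-1/3, 0); volume shrinks
```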

I think the proof of Lemma 7.3 is particularly cute for the way it uses the primal-dual theory of linear programming to show that $\text{P}=\emptyset\implies \text{P}_\epsilon=\emptyset$. Question 10 on Examples Sheet 1 is similar in spirit.

Remember that the importance of the Ellipsoid algorithm is theoretical rather than practical. In practice, linear programs are solved by high quality implementations of the simplex algorithm or by an interior point method like Karmarkar's method.

In Lemma 7.2 we used the fact that if $Q\subset \mathbb{R}^n$ is the convex hull of $n+1$ vectors $v^0,v^1,\dotsc,v^n$ that do not all lie in the same hyperplane, then
$$ \text{vol}(Q)=\frac{1}{n!}\left| \det \left( \begin{array}{ccc} 1 & \cdots & 1 \\ v^0 & \cdots & v^n \end{array} \right)\right|. $$ The proof of this result is not part of our course, but if you are interested see P. Stein, A Note on the Volume of a Simplex, The American Mathematical Monthly, Vol. 73, No. 3 (Mar. 1966), pp. 299-301.
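As a quick numerical sanity check of the formula (my own check, not part of the course), the triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$ should have area $1/2$:

```python
import math
import numpy as np

v = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
n = 2
M = np.vstack([np.ones(n + 1), np.column_stack(v)])  # row of 1s, then the v^i as columns
print(abs(np.linalg.det(M)) / math.factorial(n))     # 0.5
```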

Sunday 25 October 2015

Lecture 7 postponed

Unfortunately I had to cancel our lecture on Friday 23 October because of illness. I am hoping to be well enough to lecture on Monday. I have taken the opportunity to make some small corrections and refinements to the lecture notes for Lecture 7.

Wednesday 21 October 2015

Lecture 6

The proof that the Ellipsoid algorithm can solve linear programming in polynomial time is due to Leonid Khachiyan (1979). This discovery took place a full 20 years after the invention of the simplex algorithm. Today we sketched its main idea. In the next lecture we will be looking at some of the details.

I mentioned today some examples of problems in different complexity classes. Let me recap here:

P includes "Is there a path from $i$ to $j$ in this graph with cost $\leq k$?", 2SAT and "Is the set $P=\{x : Ax\leq b\}$ nonempty?"

NP-complete includes SAT, 3SAT, Subset Sum, Hamiltonian Path, Travelling Salesman Decision Problem ("Does there exist a tour of length $\leq k$?"), and Subgraph Isomorphism.

NP-hard, but not in NP, includes the TSP optimization problem ("What is the shortest TSP tour?") and the Halting Problem. Note that it is wrong to say, as some do, that the travelling salesman problem of finding the best tour is NP-complete. It is not NP-complete. It is NP-hard. In general, optimization forms of NP-complete decision problems should be spoken of as being NP-hard.

NP, but not in P or NP-complete, is conjectured to include Graph Isomorphism. If this could be proved then it would follow that P$\neq $NP.

EXAMPLES SHEET 1. Let me make some remarks on the examples sheet which you should now be attempting.

Q10. Think about the dual to "maximize $0^T x$ s.t. $Ax\leq b$".

Q11. Here you are being asked to show SAT $\leq_p$ 3SAT. So you need to find a way to make a polynomial time reduction from a SAT instance to a 3SAT instance. (A sketch of the standard clause-splitting trick appears after these hints.)

Q12(b). Here you are being given some hints as to how to make a polynomial time reduction from SAT by which to show SAT $\leq_p$ Hamiltonian Path Problem.
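Returning to Q11: here is a minimal sketch (my own encoding conventions, and certainly not a model answer) of the standard clause-splitting trick behind SAT $\leq_p$ 3SAT. A clause with $k>3$ literals is split into $k-2$ three-literal clauses using fresh auxiliary variables; shorter clauses are padded.

```python
def reduce_to_3sat(clauses, num_vars):
    """Return an equisatisfiable formula in which every clause has 3 literals.

    A clause is a tuple of nonzero ints: literal k means variable k, -k its negation.
    """
    out, next_var = [], num_vars + 1          # next_var: first unused variable
    for clause in clauses:
        c = list(clause)
        if len(c) <= 3:
            while len(c) < 3:                 # pad, e.g. (a,) -> (a, a, a)
                c.append(c[0])
            out.append(tuple(c))
            continue
        # (l1 v l2 v ... v lk)  becomes  (l1 v l2 v y1) ^ (-y1 v l3 v y2) ^ ...
        y = next_var
        out.append((c[0], c[1], y))
        for lit in c[2:-2]:
            out.append((-y, lit, y + 1))
            y += 1
        out.append((-y, c[-2], c[-1]))
        next_var = y + 1
    return out

# One 5-literal clause becomes three 3-literal clauses with two new variables.
print(reduce_to_3sat([(1, 2, 3, 4, 5)], num_vars=5))
```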

Monday 19 October 2015

A Brief History of NP-Completeness, 1954–2012

David S. Johnson has written a paper called A Brief History of NP-Completeness, 1954–2012. The year 2012 was the 100th anniversary of the birth of Alan Turing and the 40th anniversary of Richard Karp's important paper detailing 21 NP-complete problems. If the subject of NP-completeness and P$=$NP interests you, then you might enjoy reading this paper about the history of computational complexity theory.

Lecture 5

I had not anticipated that it would take so long to describe the basic elements of complexity theory and therefore was not able to finish all that I had put in the notes for Lecture 5. However, this is of no matter. Next time I will talk about the important question of whether or not P$=$NP.

Computational complexity theory is a very large subject, of which this lecture can only scratch the surface. You should aim to understand the definition of problem classes P, NP, NP-hard and NP-complete. There are other complexity classes, such as PSPACE (problems that can be solved using memory space that is polynomial in the size of the problem) and EXPTIME. It is known that P $\subseteq$ NP $\subseteq$ PSPACE $\subseteq$ EXPTIME, with at least one of these inclusions being strict, but it is not known which.

I mentioned the famous book, Computers and Intractability: A Guide to the Theory of NP-Completeness by Michael Garey and David S. Johnson. This is perhaps the most cited reference text in all of computer science. For this book they received the 1979 Lanchester Prize from the Operations Research Society of America.

The following few remarks are certainly non-examinable, but you might find them interesting.

1. In the figure on page 24 of our notes one can see that there are potentially problems which are NP-hard but not NP-complete. Optimization forms of NP-complete decision problems are in this class. That is, "Does there exist a solution to a given travelling salesman problem with path length $\leq k$?" is NP-complete. However, the harder problem, "Find the minimum length travelling salesman path", is NP-hard and not in NP. One reason it is not in NP is that it is not a decision problem. An example of a decision problem that is NP-hard but not in NP is the halting problem: "given a program and an input, does the program terminate when run on this input?"

2. Problems that are in NP but neither in P nor NP-complete are designated NPI (the letter I standing for "intermediate"). It has been proved that if P$\neq$NP then NPI problems must exist, but no natural ones are known. It is suspected that graph isomorphism is NPI, since it has not been proved to be either in P or NP-complete. This is the problem of deciding whether or not two graphs are isomorphic. On the one hand, graph isomorphism is in P for many special types of graph (such as planar graphs). On the other hand, the subgraph isomorphism problem is known to be NP-complete. This is the problem of deciding whether or not one graph contains a subgraph which is isomorphic to a second graph.

3. Sometimes changing a problem very slightly can change its complexity. For example, 2SAT is the Boolean satisfiability problem in which each clause is restricted to have just two literals. This problem is in P, whereas 3SAT is NP-complete. (A sketch of a polynomial time algorithm for 2SAT appears after these remarks.)

4. The question of whether or not it is true that P$=$NP is unlikely to be solved soon. Research success, such as it is, has succeeded only in explaining why certain proof ideas must be incapable of resolving this question. There are thousands of theorems in the research literature which are stated in the form "... is true, unless P$=$NP".
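Referring back to remark 3, here is a minimal sketch (the standard implication-graph method; my own code, and certainly non-examinable) of why 2SAT is in P. A clause $(u\vee w)$ yields the implications $\neg u\Rightarrow w$ and $\neg w\Rightarrow u$, and the formula is satisfiable if and only if no variable lies in the same strongly connected component of this implication graph as its negation.

```python
from collections import defaultdict

def two_sat(clauses, num_vars):
    """Clauses are pairs of nonzero ints: literal k means variable k, -k its negation."""
    graph = defaultdict(list)
    for u, w in clauses:
        graph[-u].append(w)                 # not u  implies  w
        graph[-w].append(u)                 # not w  implies  u
    lits = [l for k in range(1, num_vars + 1) for l in (k, -k)]
    # Kosaraju's algorithm: order literals by DFS finish time ...
    order, visited = [], set()
    def dfs1(v):
        visited.add(v)
        for x in graph[v]:
            if x not in visited:
                dfs1(x)
        order.append(v)
    for v in lits:
        if v not in visited:
            dfs1(v)
    # ... then label components by DFS on the reversed graph.
    rgraph = defaultdict(list)
    for v in lits:
        for w in graph[v]:
            rgraph[w].append(v)
    comp = {}
    def dfs2(v, label):
        comp[v] = label
        for x in rgraph[v]:
            if x not in comp:
                dfs2(x, label)
    for label, v in enumerate(reversed(order)):
        if v not in comp:
            dfs2(v, label)
    return all(comp[k] != comp[-k] for k in range(1, num_vars + 1))

# (x1 or x2) and (not x1 or x2) is satisfiable (take x2 true).
print(two_sat([(1, 2), (-1, 2)], num_vars=2))   # True
```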

Friday 16 October 2015

Lecture 4

On page 20 of the notes the first tableau should have $a_{21}=-2$. In fact it does in the notes, but not in the overhead slide I was using.

It is not obvious that Gomory's cutting plane algorithm should terminate. If interested, you can read this proof that it terminates.
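For reference, an individual cut is simple to generate. Here is a minimal sketch (my own small example) of reading a Gomory cut off a tableau row $x_i+\sum_j \bar a_j x_j=\bar b$ with $\bar b$ fractional: taking fractional parts componentwise, the cut $\sum_j \{\bar a_j\} x_j \geq \{\bar b\}$ is satisfied by every integer feasible point but violated by the current fractional BFS.

```python
import math

def gomory_cut(abar, bbar):
    """Cut coefficients and right-hand side from a tableau row with fractional bbar."""
    frac = lambda t: t - math.floor(t)            # fractional part, in [0, 1)
    return [frac(a) for a in abar], frac(bbar)

# Row  x_1 + 0.5 x_3 - 0.25 x_4 = 1.75  gives the cut  0.5 x_3 + 0.75 x_4 >= 0.75.
print(gomory_cut([0.5, -0.25], 1.75))
```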

The paper I mentioned about using linear programming to find a potential function needed for proofs in an online bin packing problem is

E. G. Coffman, Jr, D. S. Johnson, P. W. Shor and R. R. Weber. Markov chains, computer proofs, and average-case analysis of best fit bin packing. In Proc. 25th Annual ACM Symposium on Theory of Computing, San Diego, May 1993, pages 412-421.

I had forgotten the date. Obviously computers are now much more powerful. It might be possible to extend the table in this paper, and perhaps formulate some conjectures concerning which $\{j,k\}$ lead to bounded or linear waste.

I mentioned that CPLEX is one of the leading commercial packages for linear programming. One can solve problems of medium size quite quickly with Mathematica. I have used Mathematica for problems of a few hundred variables and constraints.
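If you want a free alternative, scipy will also handle problems of this size; here is a minimal sketch (made-up data) using its interface to the HiGHS solver:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])                # maximize x1 + 2 x2, i.e. minimize -x1 - 2 x2
A = np.array([[1.0, 1.0], [1.0, 3.0]])
b = np.array([4.0, 6.0])
res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)                    # optimal point (3, 1) and value 5
```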

Wednesday 14 October 2015

Non-degeneracy assumptions

I said we would make two assumptions.

(i) The rows of $A$ are linearly independent and thus it has rank $m$. (If some rows are linearly dependent then a constraint can be removed without changing the feasible set.)

(ii) Every set of $m$ columns of $A$ is linearly independent. (If some columns are linearly dependent then one of the corresponding variables can be removed.)

Thanks to a student query, I have thought more about this. What are we trying to achieve with these assumptions? Suppose $x$ is a BFS with support of size $m$, but the columns of this support are linearly dependent, so $Ay=0$ for some $y\neq 0$ with $S(y)\subseteq S(x)$. Then by considering $x'=x+\epsilon y$ and moving $\epsilon$ away from $0$ (in either the positive or negative direction) until some component of $x'$ becomes $0$, we see that it is possible to find a new feasible solution $x'$ such that $|S(x')| < m$. This is known as a "degenerate solution": a solution to the $m$ constraints $Ax=b$ in which strictly fewer than $m$ components of $x$ are nonzero. In the simplex tableau this would show up as a $0$ in the rightmost column of the tableau. This is more than we can expect in general: after a slight perturbation of $A$ it should not happen. For the purposes of giving an algebraic explanation of the simplex algorithm we want $A_B$ to be non-singular.

Degeneracy can also occur in dual solutions. This would show up as a $0$ in the bottom row of the tableau, in the column of a non-basic variable. It would be better to restate our assumptions as follows (and I will amend the notes).

(i) The rows of $A$ are linearly independent and thus it has rank $m$.
(ii) Every set of $m$ columns of $A$ is linearly independent.

These assumptions rule out degeneracies of the following types: that we might have $Ax+z=b$ in the primal for some $(x,z)$ having fewer than $m$ non-zero components, or in the dual we might have $\lambda^TA-w^T=c^T$ for some $(\lambda,w)$ having fewer than $n$ non-zero components.
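To see degeneracy concretely, here is a small numerical illustration (my own example, not from the notes): we enumerate the bases of a system $Ax=b$, $x\geq 0$ and flag the basic feasible solutions in which a basic variable is zero. Notice that the same degenerate vertex is represented by several different bases, which is exactly what allows cycling.

```python
import numpy as np
from itertools import combinations

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([2.0, 2.0])
m, n = A.shape

for B in combinations(range(n), m):
    A_B = A[:, B]
    if abs(np.linalg.det(A_B)) < 1e-12:
        continue                          # columns dependent: not a basis
    x_B = np.linalg.solve(A_B, b)
    if (x_B >= -1e-12).all():             # feasible basic solution
        tag = "degenerate" if (np.abs(x_B) < 1e-12).any() else "nondegenerate"
        print(B, x_B, tag)                # three bases share the vertex (2,0,0,0)
```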

Degeneracy is not really a problem in practice, but it is with degeneracy that cycling can occur. In the example in lectures the pivot column is being chosen as the one with the most negative entry. One way to avoid cycling is to choose amongst pivot columns in a different way: by preferring the candidate whose variable has the smallest index (Bland's rule). You can check that with this pivot column selection rule the tableau considered in lectures does not cycle.
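Here is a minimal sketch contrasting the two pivot-column rules (the tableau layout is an assumption of mine, reduced costs in the bottom row and right-hand sides in the last column, and may differ from the notes' convention):

```python
import numpy as np

def entering_column(tableau, rule="bland"):
    """Return the index of the pivot column, or None if the tableau is optimal."""
    reduced_costs = tableau[-1, :-1]
    negative = np.flatnonzero(reduced_costs < 0)
    if negative.size == 0:
        return None                        # no negative reduced cost: optimal
    if rule == "bland":
        return int(negative[0])            # smallest index: cycling cannot occur
    return int(np.argmin(reduced_costs))   # most negative entry: Dantzig's rule

T = np.array([[1.0, 0.0, 2.0, 1.0, 4.0],
              [0.0, 1.0, 1.0, 3.0, 6.0],
              [0.0, 0.0, -2.0, -3.0, 0.0]])
print(entering_column(T, "bland"), entering_column(T, "dantzig"))  # 2 and 3
```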

Lecture 3

The simplex algorithm is due to George Dantzig. In his paper on the origins of the simplex algorithm he writes:

"In the summer of 1947, when I first began to work on the simplex method for solving linear programs, the first idea that occurred to me is one that would occur to any trained mathematician, namely the idea of step by step descent (with respect to the objective function) along edges of the convex polyhedral set from one vertex to an adjacent one. I rejected this algorithm outright on intuitive grounds - it had to be inefficient because it proposed to solve the problem by wandering along some path of outside edges until the optimal vertex was reached. I therefore began to look for other methods which gave more promise of being efficient, such as those that went directly through the interior."

This is interesting because Dantzig says that the simplex method is something that any trained mathematician would think of - but that on first sight it appears to be inefficient.

In practice these days computers solve problems with hundreds of thousands of constraints and variables. There is a nice online simplex method tool which you might like to use to check answers that you obtain by hand when doing questions on Examples sheet 1. There is also a very nice program, which runs under Windows:

http://trin-hosts.trin.cam.ac.uk/fellows/dpk10/IB/simple2x.html

It was written by Doug Kennedy at Trinity.  It takes the hard labour out of performing pivot operations and it may be configured to prompt the choice of pivot elements, or to solve problems automatically. 

Tuesday 13 October 2015

Examples sheet 1

I have uploaded Examples sheet 1. This contains questions on Lagrangian methods, linear programming and computational complexity. We will have a session at which I will discuss these questions on Tuesday November 3: 16:00-17:30 in MR14.

Monday 12 October 2015

Lecture 2

We did not prove Theorem 2.1 (Supporting Hyperplane Theorem). It is an obvious-looking theorem, but actually fairly tricky to prove. The proof can be found in Boyd and Vandenberghe, Section 2.5. One first proves the "separating hyperplane theorem", which says that two non-intersecting convex sets can always be separated by a hyperplane. The supporting hyperplane theorem follows by applying the separating hyperplane theorem to two specific sets. One is the set of points $\{(c,y): y>\phi(c), c\in \mathbb{R}^m\}$, and the other is the set consisting of the single point $(b,\phi(b))$. The first of these sets is convex if $\phi$ is convex.

I have revised the proof of Theorem 2.4 to make it shorter, but I believe it is still clear.

I did talk through Section 2.5 today. I will return to the subject of shadow prices another time.

Water-filling solution. The problem I referred to as having a "water-filling solution" is motivated by a problem in information theory. Suppose one is trying to apportion a constrained amount of total transmission power across a number of noisy Gaussian communication channels. In channel $i$ the received signal is $Y_i=x_i+Z_i$, where $x_i$ is the input signal, for which on average $x_i^2=p_i$, and $Z_i\sim N(0,n_i)$. When the power to noise ratio in the $i$th channel is $p_i/n_i$, the capacity of this channel is proportional to $\log_2(1+p_i/n_i)$. The problem of maximizing $\sum_i\log_2(1+p_i/n_i)$ is that of distributing a fixed amount of power $\sum_ip_i=P$, say, across multiple channels so as to maximize the total communication capacity of these channels, i.e. the rate at which information can be reliably transmitted.
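For interest, here is a sketch (a Lagrangian calculation in my own notation; non-examinable) of why a common "water level" appears. Stationarity of $\sum_i\log(1+p_i/n_i)-\lambda\sum_ip_i$ in $p_i$ gives $1/(n_i+p_i)=\lambda$, and respecting $p_i\geq 0$ gives
$$ p_i=\bigl(\lambda^{-1}-n_i\bigr)^+, \qquad \text{with } \lambda \text{ chosen so that } \textstyle\sum_i p_i=P. $$
The common level $\lambda^{-1}$ is the water level. A minimal numerical sketch, finding the level by bisection:

```python
import numpy as np

# Water-filling: given noise levels n_i and a power budget P, find the level mu
# with p_i = max(mu - n_i, 0) and sum_i p_i = P, then allocate p by "flooding".

def water_fill(noise, total_power, tol=1e-10):
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + total_power
    while hi - lo > tol:                       # bisect on the water level mu
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - noise, 0.0).sum() > total_power:
            hi = mu
        else:
            lo = mu
    return np.maximum(lo - noise, 0.0)

p = water_fill([1.0, 4.0, 2.0], total_power=5.0)
print(p, p.sum())   # power goes preferentially to the less noisy channels
```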

Friday 9 October 2015

Lecture 1 homework

1. Consider $P_b$: minimize $f(x)=-x$, in two cases:

(a) $h(x)=\sqrt{x}$, $b=2$;

(b) $h(x)=x^2$, $b=16$.

What happens when you try to solve these problems using Lagrangian methods? In each case, find $\phi(b)$ and explain whether strong duality holds.

2. Given $a_1,\dotsc,a_n>0$,
\begin{align*}
\text{minimize}\quad & -\sum_{i=1}^n \log(a_i+x_i)\\[4pt]
\text{subject to}\quad & x_1,\dotsc,x_n\geq 0\text{ and } \sum_{i=1}^n x_i = b.
\end{align*}
The optimal $x$ is one that can be found by a so-called `water-filling algorithm'. Imagine placing bars of heights $a_i$ side by side in the fashion of a histogram and then flooding above these bars so as to cover an area of $b$. Draw a picture to illustrate this idea.

Biography of Alan Turing launched today

Alan Turing will be mentioned on our course for the Turing Machine (Lecture 5). This morning on the Radio 4 Today programme I heard: "Alan Turing was the British mathematician who broke German codes at Bletchley Park during World War II. He committed suicide after he was prosecuted for homosexuality. A new biography of Turing's life is being published by his own nephew Dermot Turing. It was launched yesterday at GCHQ."
This from the Gloucestershire Echo:
"A book commemorating the life of renowned codebreaker Alan Turing will be officially launched at GCHQ.
Prof: Alan Turing Decoded, written by Turing's nephew Sir Dermot Turing, will be unveiled at the government's Cheltenham-based listening hub today.
It celebrates Turing's work as a mathematician, codebreaker, computer scientist and biologist.
Alan Turing made a crucial contribution to the cryptanalytic success of GCHQ's forerunner, the Government Code and Cypher School, at Bletchley Park during World War II."
Read more: http://www.gloucestershireecho.co.uk/8203-Biography-legendary-codebreaker-Alan-Turing/story-27946199-detail/story.html

Thursday 8 October 2015

The Secret Rules of Modern Living: Algorithms

ALGORITHMS feature centrally in operations research. Last night I watched this fun program by Marcus du Sautoy, with the title "The Secret Rules of Modern Living: Algorithms".

Professor du Sautoy discusses many things that we will meet in our course, such as the Travelling Salesman Problem and the Gale-Shapley algorithm for stable matching.

It is on BBC iPlayer for just 20 more days. I recommend that you catch it.

http://www.bbc.co.uk/programmes/p030s6b3

The program begins with the example of a face recognition algorithm.

"From internet shopping to the airport runway, algorithms are everywhere in modern life. They are behind the success of tech giants like Google, and have saved lives by matching kidney patients with donor organs. As if by magic these invisible, mathematical procedures give us fast, accurate answers to a range of complex problems."

Lecture 1

Our first lecture was about the important and powerful technique of Lagrange multipliers, which are used to address problems of optimization under constraints. Additional reading can be found in S. Boyd and L. Vandenberghe: Convex Optimization. Cambridge University Press (2004), chapters 1 and 5.

 One person asked why, in a line of the proof of Theorem 1.3, we can use $\inf_{c\in \mathbb{R}^m} \inf_{x\in X(c)} =\inf_{x\in X}$. Another class member provided the correct answer: that every $x\in X$ satisfies $h(x)=c$ for some $c$. 

Subsequent to the lecture I corrected the numbering of Theorem 1.3 on page 3. I will sometimes make small changes to the notes after a lecture.

As we begin, some of you may be wondering: what is Operational Research? Here are some links that may help to explain.

PROFESSIONAL SOCIETIES
INFORMS is a US-based society, but has members throughout the world and organizes many conferences. They have sections in theoretical fields such as Applied Probability, and more applied ones such as Transportation Science & Logistics. On their web site they try to answer the question: What is Operations Research?

They say, "Operations Research (O.R.), or operational research in the U.K, is a discipline that deals with the application of advanced analytical methods to help make better decisions. The terms management science and analytics are sometimes used as synonyms for operations research. Employing techniques from other mathematical sciences, such as mathematical modeling, statistical analysis, and mathematical optimization, operations research arrives at optimal or near-optimal solutions to complex decision-making problems."

The UK society is the Operational Research Society: https://www.theorsociety.com/

RESEARCH FUNDING
Here is a graphic of research areas in the Mathematical Sciences supported by EPSRC

Total theme funding, £210.2 million (4.66% of whole portfolio). There are 350 grants in the Mathematical Sciences theme. Mathematical Aspects of Operational Research presently has 21 grants, totalling £8m and is represented by the circle at the bottom of the picture.

RESEARCH PUBLICATION
Mathematics of Operations Research is a leading journal. A look at the titles of some of its articles will show that some quite deep mathematics is being pursued.

Mathematics of Operations Research (MOR) publishes excellent foundational articles having significant mathematical contents and relevance to operations research and management science.