Kuhn's algorithm for finding the maximum matching in a bipartite graph

We are given a bipartite graph G with n vertices and m edges. The task is to find a
maximum matching, i.e. to select as many edges as possible so that no selected edge
shares a vertex with any other selected edge.

Description of the algorithm

Necessary definitions
A matching M is a set of pairwise non-adjacent edges of the graph (in other words,
every vertex of the graph is incident to at most one edge of M). The cardinality of a
matching is the number of edges in it. A matching is called maximum (or greatest) if
its cardinality is the largest among all matchings in the graph. A vertex that is
incident to an edge of the matching (i.e. that has degree exactly one in the subgraph
formed by M) is called saturated by the matching.
A chain of length k is a simple path (i.e. one with no repeated vertices or edges)
containing k edges.
An alternating chain (in a bipartite graph, relative to some matching) is a chain
whose edges alternately belong and do not belong to the matching.
An augmenting chain (in a bipartite graph, relative to some matching) is an
alternating chain whose first and last vertices are not saturated by the matching.
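As a small illustration of the definition, a matching can be checked mechanically.
The sketch below is not part of the algorithm; the function name is_matching and the
representation of edges as pairs of 0-based vertex numbers are assumptions made for
this example.

```cpp
#include <utility>
#include <vector>

// Returns true if no vertex of 0..n-1 is an endpoint of more than one edge,
// i.e. if the given edge set is a matching.
bool is_matching(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> degree(n, 0);
    for (const auto& e : edges) {
        if (++degree[e.first] > 1) return false;   // first endpoint already used
        if (++degree[e.second] > 1) return false;  // second endpoint already used
    }
    return true;
}
```

For instance, the edges (0,1) and (2,3) form a matching, while (0,1) and (1,2) do not,
because vertex 1 is incident to both.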

Berge's Theorem
Statement. A matching is maximum if and only if there is no augmenting chain relative to it.
Proof of necessity. We show that if a matching M is maximum, then there is no
augmenting chain relative to it. The proof is constructive: assuming an augmenting
chain P exists, we show how to use it to increase the cardinality of M by one, which
contradicts maximality.
To do this, we perform the so-called alternation of the matching along the chain P.
Recall that, by definition, the first edge of P does not belong to the matching, the
second does, the third again does not, the fourth does, and so on. We change the
status of every edge along P: the edges that are not in the matching (the first,
third, and so on up to the last) are added to it, and the edges that were in the
matching (the second, fourth, and so on up to the penultimate) are removed from it.
Clearly the cardinality of the matching has increased by one, because one more edge
was added than removed. It remains to check that the result is a valid matching, i.e.
that no vertex of the graph is incident to two of its edges. For every vertex of the
chain P except the first and the last, this follows from the alternation itself: each
such vertex lost its one matching edge and gained exactly one new one. For the first
and last vertices of P nothing could break either, since both were unsaturated before
the alternation. Finally, for all vertices outside the chain nothing has changed.
Thus we have indeed built a matching whose cardinality is one greater than before,
which completes the proof of necessity.
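
The alternation step used in the proof can be sketched in code. This is only an
illustration: the helper names make_edge and alternate, and the representation of the
matching as a set of normalized vertex pairs, are assumptions, and the function trusts
the caller that the given vertex sequence really is an augmenting chain.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// An undirected edge stored with the smaller endpoint first.
using Edge = std::pair<int, int>;

Edge make_edge(int a, int b) {
    return a < b ? Edge(a, b) : Edge(b, a);
}

// Flips the status of every edge along the chain, given as a vertex sequence:
// edges in the matching are removed, edges outside it are added.
// Assumes `chain` is an augmenting chain relative to `matching`.
std::set<Edge> alternate(std::set<Edge> matching, const std::vector<int>& chain) {
    for (std::size_t i = 0; i + 1 < chain.size(); ++i) {
        Edge e = make_edge(chain[i], chain[i + 1]);
        if (matching.count(e))
            matching.erase(e);
        else
            matching.insert(e);
    }
    return matching;
}
```

With the matching {(1,2)} and the chain 0, 1, 2, 3, the result is {(0,1), (2,3)}: one
edge was removed and two were added.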

Proof of sufficiency. We prove that if there is no augmenting chain relative to a
matching M, then M is maximum.

The proof is by contradiction. Suppose there is a matching M' with greater
cardinality than M; we may take M' to be a maximum matching. Consider the symmetric
difference Q of these two matchings, i.e. keep every edge that belongs to M or to M',
but not to both.
The edge set Q is, of course, not necessarily a matching. Let us examine what it
looks like; for convenience, we treat it as a graph. In this graph every vertex has
degree at most 2, because each vertex can be incident to at most two edges of Q: one
from M and one from M'. It follows that this graph consists only of cycles and paths,
none of which intersect each other.
Now note that a path in the graph Q cannot have arbitrary length: it must have even
length. Indeed, along any path in Q the edges alternate: an edge of M is followed by
an edge of M', and vice versa. If Q contained a path of odd length, that path would be
an augmenting chain in the original graph G, either for M or for M'. But this is
impossible: for M it contradicts the assumption, and for M' it contradicts its
maximality (we have already proved the necessity part of the theorem, which says that
a matching with an augmenting chain cannot be maximum).
We now prove the analogous claim for cycles: every cycle in the graph Q has even
length. This is quite simple: the edges in a cycle must also alternate (belonging in
turn to M, then to M'), but this is impossible in a cycle of odd length, which would
necessarily contain two adjacent edges from the same matching, contradicting the
definition of a matching.
Thus all paths and cycles of the graph Q have even length. Hence Q contains equally
many edges from M and from M'. Since Q contains all edges of M and M' except their
common edges, it follows that M and M' have the same cardinality. This is a
contradiction: by assumption M was not maximum, i.e. M' is strictly larger. The
theorem is proved.

Kuhn's algorithm
Kuhn's algorithm is a direct application of Berge's theorem. It can be summarized as
follows: start with an empty matching; then, as long as an augmenting chain can be
found in the graph, alternate the matching along that chain and repeat the search. As
soon as no augmenting chain can be found, the process stops, and the current matching
is maximum.
It remains to detail how augmenting chains are found. Kuhn's algorithm simply looks
for any such chain using depth-first or breadth-first search. It goes through all
vertices of the graph in turn and starts a search from each, trying to find an
augmenting chain that begins at that vertex.
It is more convenient to describe the algorithm assuming that the graph is already
split into two parts (although the algorithm can in fact be implemented so that it
does not need the input graph to be explicitly split into two parts).

The algorithm goes through all vertices v of the first part of the graph in turn. If
the current vertex v is already saturated by the current matching (i.e. some edge
incident to it has already been selected), the vertex is skipped. Otherwise the
algorithm tries to saturate it by starting a search for an augmenting chain from this
vertex.
The augmenting chain is found by a special depth-first or breadth-first search
(depth-first search is usually used for ease of implementation). Initially the
depth-first search stands at the current unsaturated vertex v of the first part. We
go through all edges from this vertex; let the current edge be (v, to). If the vertex
to is not yet saturated by the matching, we have found an augmenting chain: it
consists of the single edge (v, to); in this case we simply add this edge to the
matching and stop the search from v. Otherwise, if to is already saturated by some
edge (p, to), we try to go along that edge: that is, we try to find an augmenting
chain passing through the edges (v, to), (to, p). To do so, we simply move to the
vertex p in our search and now try to find an augmenting chain from that vertex.
One can see that this search, started from the vertex v, either finds an augmenting
chain and thereby saturates v, or finds no such chain (in which case the vertex v
cannot become saturated).
After all vertices v of the first part have been scanned, the current matching is
maximum.

Running time
Thus Kuhn's algorithm can be viewed as a series of n runs of depth-first or
breadth-first search over the entire graph, where n is the number of vertices and m
the number of edges. Consequently the whole algorithm runs in O(nm) time, which in
the worst case is O(n^3).

However, this estimate can be improved slightly. It turns out that for Kuhn's
algorithm it matters which part is chosen as the first and which as the second.
Indeed, in the implementation described above the searches start only from vertices
of the first part, so the whole algorithm runs in O(n1 m) time, where n1 is the
number of vertices in the first part. In the worst case this is O(n1^2 n2), where n2
is the number of vertices in the second part. This shows that it pays to make the
first part the one with fewer vertices. On very unbalanced graphs (when n1 and n2
differ greatly) this leads to a significant difference in running time.
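
One way to act on this observation is to orient the graph before running the
algorithm. The sketch below is a hypothetical helper (transpose and
orient_smaller_first are names invented here); it assumes that g[v] lists the
second-part neighbours of first-part vertex v, with 0-based numbering in each part.

```cpp
#include <vector>

// Builds the adjacency lists of the second part from those of the first part.
// g[v] lists second-part neighbours of first-part vertex v; k is the size of
// the second part.
std::vector<std::vector<int>> transpose(const std::vector<std::vector<int>>& g, int k) {
    std::vector<std::vector<int>> t(k);
    for (int v = 0; v < (int)g.size(); ++v)
        for (int to : g[v])
            t[to].push_back(v);
    return t;
}

// Returns adjacency lists oriented so that searches start from the smaller part.
// `swapped` reports whether the two parts were exchanged.
std::vector<std::vector<int>> orient_smaller_first(const std::vector<std::vector<int>>& g,
                                                   int k, bool& swapped) {
    swapped = (int)g.size() > k;
    return swapped ? transpose(g, k) : g;
}
```

If swapped comes back true, the roles of the parts are exchanged, so each pair read
from the resulting matching has to be swapped back before it is reported.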

Implementation
We give here an implementation of the algorithm described above, based on depth-first
search and taking the bipartite graph as explicitly split into two parts. This
implementation is very concise, and it is perhaps worth remembering in this form.
Here n is the number of vertices in the first part, k is the number in the second
part, and g[v] is the list of edges from vertex v of the first part (i.e. the list of
numbers of the second-part vertices these edges lead to). Vertices in the two parts
are numbered independently; in the code the first part uses indices 0 ... n-1 and the
second part uses indices 0 ... k-1.

Then there are two auxiliary arrays: mt and used. The first, mt, holds information
about the current matching. For programming convenience this information is stored
only for vertices of the second part: mt[i] is the number of the first-part vertex
joined by a matching edge to vertex i of the second part (or -1 if no matching edge
is incident to i). The second array, used, is the usual array of "visited" marks for
depth-first search (it is needed only so that the search does not enter the same
vertex twice).
The function try_kuhn is the depth-first search. It returns true if it managed to
find an augmenting chain from the vertex v; in that case the function is assumed to
have already performed the alternation of the matching along the chain it found.
Inside the function we go through all edges from the vertex v of the first part. For
each edge we check: if it leads to an unsaturated vertex to, or if to is saturated
but an augmenting chain can be found by a recursive call from mt[to], then we have
found an augmenting chain, and before returning true from the function we perform
the alternation on the current edge: the matching edge of the vertex to is redirected
to the vertex v.
The main program first marks the current matching as empty (the list mt is filled
with -1). Then it goes through the vertices v of the first part and starts the
depth-first search try_kuhn from each, resetting the array used beforehand.
Note that the size of the matching is easy to obtain as the number of calls to
try_kuhn in the main program that returned true. The required maximum matching itself
is stored in the array mt.
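#include <cstdio>
#include <vector>
using namespace std;

int n, k;                    // sizes of the first and second parts
vector < vector<int> > g;    // g[v]: second-part neighbours of first-part vertex v
vector<int> mt;              // mt[to]: first-part vertex matched to `to`, or -1
vector<char> used;           // visited marks for the current search

// Tries to find an augmenting chain from v; on success the matching is already updated.
bool try_kuhn (int v) {
    if (used[v])  return false;
    used[v] = true;
    for (size_t i=0; i<g[v].size(); ++i) {
        int to = g[v][i];
        if (mt[to] == -1 || try_kuhn (mt[to])) {
            mt[to] = v;
            return true;
        }
    }
    return false;
}

int main() {
    ... ...

    mt.assign (k, -1);
    for (int v=0; v<n; ++v) {
        used.assign (n, false);
        try_kuhn (v);
    }

    // Print the matched pairs using 1-based vertex numbers.
    for (int i=0; i<k; ++i)
        if (mt[i] != -1)
            printf ("%d %d\n", mt[i]+1, i+1);
}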
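
For experimenting with the code, the same logic can be wrapped so that it runs
without input handling and reports the matching size. This is a sketch rather than
the author's code: the struct Kuhn and its methods add_edge and solve are names
introduced here, while try_kuhn, g, mt and used keep the meaning they have in the
listing. The size is counted as the number of top-level try_kuhn calls that return
true.

```cpp
#include <vector>

struct Kuhn {
    int n, k;                          // sizes of the first and second parts
    std::vector<std::vector<int>> g;   // g[v]: second-part neighbours of v
    std::vector<int> mt;               // mt[to]: first-part partner of `to`, or -1
    std::vector<char> used;            // visited marks for the current search

    Kuhn(int n, int k) : n(n), k(k), g(n), mt(k, -1), used(n, 0) {}

    // Adds an edge between first-part vertex v and second-part vertex to (0-based).
    void add_edge(int v, int to) { g[v].push_back(to); }

    bool try_kuhn(int v) {
        if (used[v]) return false;
        used[v] = 1;
        for (int to : g[v]) {
            if (mt[to] == -1 || try_kuhn(mt[to])) {
                mt[to] = v;
                return true;
            }
        }
        return false;
    }

    // Runs one search per first-part vertex and returns the matching size.
    int solve() {
        int size = 0;
        for (int v = 0; v < n; ++v) {
            used.assign(n, 0);
            if (try_kuhn(v)) ++size;
        }
        return size;
    }
};
```

A possible use: create Kuhn a(3, 3), add the edges (0,0), (0,1), (1,0), (2,1), (2,2),
call a.solve(), and then read the pairs from a.mt.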
Note once more that Kuhn's algorithm is easy to implement so that it works on graphs
that are known to be bipartite but whose explicit split into two parts has not been
found. In that case we have to give up the convenient split into two parts and store
all information for all vertices. To do this, the array of lists g is now defined not
only for the vertices of the first part but for all vertices (of course, the vertices
of both parts are now numbered in a single common numbering). The arrays mt and used
are now also defined for the vertices of both parts and must be kept up to date
accordingly.
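
A minimal sketch of this variant, under the assumptions that the graph is bipartite,
vertices are numbered from 0 in one common numbering, and mt stores the partner of
every vertex (the struct name KuhnUnsplit and the methods add_edge and solve are
invented for this example):

```cpp
#include <vector>

struct KuhnUnsplit {
    std::vector<std::vector<int>> g;   // adjacency lists for all vertices
    std::vector<int> mt;               // mt[v]: partner of v, or -1 if unsaturated
    std::vector<char> used;            // visited marks for the current search

    explicit KuhnUnsplit(int n) : g(n), mt(n, -1), used(n, 0) {}

    // Adds an undirected edge; both endpoints store it.
    void add_edge(int a, int b) { g[a].push_back(b); g[b].push_back(a); }

    bool try_kuhn(int v) {
        if (used[v]) return false;
        used[v] = 1;
        for (int to : g[v]) {
            if (mt[to] == -1 || try_kuhn(mt[to])) {
                mt[to] = v;   // keep both directions of the pair in sync
                mt[v] = to;
                return true;
            }
        }
        return false;
    }

    // Starts a search from every vertex that is still unsaturated.
    int solve() {
        int size = 0;
        for (int v = 0; v < (int)g.size(); ++v) {
            if (mt[v] != -1) continue;
            used.assign(g.size(), 0);
            if (try_kuhn(v)) ++size;
        }
        return size;
    }
};
```

The sketch relies on bipartiteness: every neighbour of a vertex lies in the opposite
part, so the recursive calls only ever visit vertices from the same part as the
starting vertex.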

Better implementation
We modify the algorithm as follows. Before the main loop, we find some arbitrary
matching with a simple heuristic algorithm, and only then run the loop of calls to
try_kuhn(), which improves this matching. As a result the algorithm runs noticeably
faster on random graphs, because in most graphs a matching of fairly large size is
easy to build with a heuristic, and it can then be improved to a maximum one by the
usual Kuhn's algorithm. In this way we save on the depth-first searches from those
vertices that the heuristic has already included in the current matching.
For example, we can simply go through all vertices of the first part and, for each of
them, find any edge that can be added to the matching and add it. Even such a simple
heuristic can speed up Kuhn's algorithm several times.
Note that the main loop has to be modified slightly. Since a call to try_kuhn in the
main loop assumes that the current vertex is not yet in the matching, the
corresponding check must be added.
In the implementation only the code of the function main() changes:
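int main() {
    ... ...

    mt.assign (k, -1);
    // Greedy initial matching: give each first-part vertex its first free neighbour.
    vector<char> used1 (n);
    for (int i=0; i<n; ++i)
        for (size_t j=0; j<g[i].size(); ++j)
            if (mt[g[i][j]] == -1) {
                mt[g[i][j]] = i;
                used1[i] = true;
                break;  // vertex i is now saturated; it must not take a second edge
            }
    // Run the search only from vertices the heuristic left unsaturated.
    for (int i=0; i<n; ++i) {
        if (used1[i])  continue;
        used.assign (n, false);
        try_kuhn (i);
    }

    for (int i=0; i<k; ++i)
        if (mt[i] != -1)
            printf ("%d %d\n", mt[i]+1, i+1);
}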
Another good heuristic is the following. At each step, find the vertex of smallest
degree (but not isolated), choose any edge from it and add it to the matching, then
remove both endpoints of that edge from the graph together with all their incident
edges. This greedy approach works very well on random graphs and in most cases even
builds a maximum matching (although there are test cases against it on which it finds
a matching substantially smaller than the maximum).
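
This heuristic could be sketched as follows. The function name min_degree_greedy is
hypothetical, the graph is assumed to be a simple undirected graph given by adjacency
lists over one common 0-based numbering, and the minimum-degree vertex is found by a
plain linear scan, which keeps the code short at the cost of quadratic time.

```cpp
#include <initializer_list>
#include <utility>
#include <vector>

// Greedy heuristic: repeatedly match a non-isolated vertex of minimum degree
// with one of its remaining neighbours, then delete both vertices.
// adj is an undirected adjacency list over all vertices 0..n-1.
std::vector<std::pair<int, int>> min_degree_greedy(const std::vector<std::vector<int>>& adj) {
    int n = (int)adj.size();
    std::vector<char> removed(n, 0);
    std::vector<int> deg(n);            // number of neighbours not yet removed
    for (int v = 0; v < n; ++v) deg[v] = (int)adj[v].size();

    std::vector<std::pair<int, int>> matching;
    while (true) {
        // Pick the remaining non-isolated vertex of smallest degree.
        int best = -1;
        for (int v = 0; v < n; ++v)
            if (!removed[v] && deg[v] > 0 && (best == -1 || deg[v] < deg[best]))
                best = v;
        if (best == -1) break;          // only isolated vertices are left

        // Take any neighbour that is still present.
        int partner = -1;
        for (int to : adj[best])
            if (!removed[to]) { partner = to; break; }

        matching.emplace_back(best, partner);
        // Remove both endpoints and update the degrees of their neighbours.
        for (int x : {best, partner}) {
            removed[x] = 1;
            for (int to : adj[x])
                if (!removed[to]) --deg[to];
        }
    }
    return matching;
}
```

The resulting matching can be written into mt before the main loop, in the same way
as the simpler heuristic, so that try_kuhn only has to handle the vertices left
unsaturated.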

