Pure Theory: Cycle Detection with Tortoise and hare algorithm


Monday, October 31, 2011

Cycle Detection with Tortoise and hare algorithm

Cite: Part of this article is from Wiki Page.


Cycle Detection, in computer science, is the algorithmic problem of finding a repeated sequence (or a cycle) of iterated function values.
The function would map a set of finite values to the subset of itself(from empty set to full set), which is
x0, x1 = f(x0), x2 = f(x1), ... , xi = f(xi-1), ...
 The function must eventually have the same values twice, E.g i and j, such that xi == xj, which is the first repeatition, is the start of the cycle sequence. We say Let μ be the smallest index such that the value xμ reappears infinitely often within the sequence of values xi, and let λ (the loop length) be the smallest positive integer such that xμ = xλ+μ. The cycle detection problem is the task of finding λ and μ.


5, 6, 1, 4, 3, 6, 1, 4, 3, ...
Here  6, 1, 4, 3 will be the cycle, which has four elements and starts from index 1.
So μ is 1 and λ is 4.

Native solution:

Using hash table would be a good idea, which restore every boolean value for each distinct key, once the key comes again with the corresponding value is already true, then they cycle starts here. For this solution, both time complexity and space complexity are O(n), which is linear.

Tortoise and hare algorithm:

It's also called Floyd's cycle-finding algorithm, which only uses only two pointer or other values to keep the index of the array, but these two pointers moves with different speeds.

The key insight in the algorithm is that, for any integers i ≥ μ and k ≥ 0xi = xi + kλ, where λ is the length of the loop to be found. In particular, whenever i = kλ ≥ μ, it follows that xi = x2i. Thus, the algorithm only needs to check for repeated values of this special form, one twice as far from the start of the sequence than the other, to find a period ν of a repetition that is a multiple of λ. Once ν is found, the algorithm retraces the sequence from its start to find the first repeated value xμ in the sequence, using the fact that λ divides ν and therefore that xμ = xν + μ. Finally, once the value of μ is known it is trivial to find the length λ of the shortest repeating cycle, by searching for the first positionμ + λ for which xμ + λ = xμ.
The algorithm thus maintains two pointers into the given sequence, one (the tortoise) at xi, and the other (the hare) at x2i. At each step of the algorithm, it increases i by one, moving the tortoise one step forward and the hare two steps forward in the sequence, and then compares the sequence values at these two pointers. The smallest value of i > 0 for which the tortoise and hare point to equal values is the desired value ν.
Here is a snippet code (in python) from wiki link I mentioned above, 
def floyd(f, x0):
    # The main phase of the algorithm, finding a repetition x_mu = x_2mu
    # The hare moves twice as quickly as the tortoise
    tortoise = f(x0) # f(x0) is the element/node next to x0.
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
    # at this point the start of the loop is equi-distant from current tortoise
    # position and x0, so hare moving in circle and tortoise (set to x0 )
    # moving towards circle, will intersect at the beginning of the circle.
    # Find the position of the first repetition of length mu
    # The hare and tortoise move at the same speeds
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
    # Find the length of the shortest cycle starting from x_mu
    # The hare moves while the tortoise stays still
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return lam, mu
Post a Comment