All about flooble | fun stuff | Get a free chatterbox | Free JavaScript | Avatars
 perplexus dot info

 Lost my pointer 2 (Posted on 2003-12-18)
You are given a pointer to the head of a SINGLY linked list. Write a function that takes as an argument one pointer, and determines whether or not the list is "looped".

A "looped" list means that the linked list has a node that points to a PREVIOUS member (thereby creating a cycle).

Clearly, if a list is looped, and you attempt to simply go from node to each next node until you reach the end, you will never get to an end (just going around and around the loop forever).

Conditions: The list (and the loop, if it exists) may be of any given size (up to available memory), and your function has only 64 bytes (16 4-byte pointers) of available heap/stack space).

***Extra Credit: What is the speed (big-O notation) of your solution?
__________________________________

For those of you non-computer types: a singly linked list is simply a list of records where each record (node) points to the NEXT record. (The only way to get to a node is to first be at the previous node.) The first node is special in that we keep track of where it is.

So, we can always get to the first node, but to get to the second we must first get to the first node. To get to the third, we must first get to the second, and so on.

Therefore, it is an easy matter of traversing all the nodes in order. But it is impossible to go backwards (unless one keeps track of all previously visited nodes) or to access the nodes in any other order (like random access or alphabetically).

Normally, the last node points to NULL, and we generally know to stop at that point.

This problem arises when one of the nodes MIGHT point to a node that occurred in the list previously. (For instance, the 651st node points to the 83rd node.) Then we will end up going from the 651st -> 83rd -> 84th... and continue looping.

The question is... how do we know that we've looped? (And how quickly can we determine whether or not we looped.)

 See The Solution Submitted by SilverKnight Rating: 4.0000 (2 votes)

Comments: ( Back to comment list | You must be logged in to post comments.)
 a solution | Comment 4 of 22 |
We must compare each forward pointer we look at to a previous pointer to see if they are the same. Every once in a while we have to replace the one we're comparing against in case the loop comes back to a point after the one we had saved. But if we keep replacing the comparison pointer at fixed intervals we risk that the loop is larger than our interval of replacement, and we'll never recognize a loop. So start out replacing the comparison pointer every 2 (say) pointers that we look at, but then every time we replace the comparison pointer, double the interval at which the replacement is made the next time.

If there is no loop, then the time is obviously O(N). To see why the time is O(N) even when there is a loop, consider an approximate calculation based on a list of N elements with a terminal loop of x elements out of this. You must examine N - x elements before entering the loop. To that we have to add the expected number of elements that will be examined within the loop. In the following, I use log to represent the base-2 logarithm. We will have reset the replacement interval for the comparison pointer log(N-x) times. But each time we do that, we double the interval. So we expect the size of the interval to be 2^log(N-x), or just N-x. Compared to the size of the loop, x, this is (N-x)/x times as big, so we'd expect to have to go through the loop (N-x)/x times ("give or take" (actually multipy or divide by) a factor not too far from 1), but since it contains x elements, that is gone through by examining ((N-x)/x)*x = N-x times. So we get N-x+N-x, which is O(N).

The below is a sample, written in PL/I. Some object-oriented languages may be able to do this without regard to the way the structure is laid out, but here I've just put in a fictitious structure that contains a forward pointer. It's not tested, as I didn't wish to put together a driver program, and all that linkage work.

DCL LOOKPTR POINTER;
DCL CURRPTR POINTER;
DCL SKIPAMT FIXED BIN(31) INIT(2);
DCL NUM_SKIPPED FIXED BIN(31) INIT(0);
DCL 1 MYSTRUCT BASED (CURRPTR),
2 STRUCT_STUFF CHAR(20),
2 FWD_PTR POINTER;

CURRPTR=MYSTRUCT.FWD_PTR;
DO WHILE (CURRPTR ¬= LOOKPTR & CURRPTR¬=NULL);
NUM_SKIPPED = NUM_SKIPPED + 1;
IF NUM_SKIPPED = SKIPAMT THEN
DO;
SKIPAMT = 2 * SKIPAMT;
NUM_SKIPPED = 0;
LOOKPTR = CURRPTR;
END;
CURRPTR=MYSTRUCT.FWD_PTR;
END; /* DO WHILE */
IF CURRPTR = NULL THEN
RETURN ('0'B); /* NOT LOOPED */
ELSE
RETURN ('1'B); /* IS LOOPED */
END ISLOOPED;

To test out the idea, I wrote a program in Quick Basic to simulate pointers with an array of numbers 1-N and shuffled them. Each is considered a pointer (subscript actually) to the next in the list. If it points to N (in this case 4000) it is considered terminated. Thus the N as a length of the list is actually less than the length of the array. The program marks, in the REDIMed array useCt which are actually used (the algorithm looks at every element that's really in the "list" at least once), and the populated entries in that array are counted to determine the actual N.

The maximum N is the size of the array, 4000. The ratio of number of elements looked at to number of elements, N, is tabulated for bins of 1000 (i.e., under 1000, 1000-2000, etc.) I haven't done a chi-squared analysis but it looks as if the ratio is about constant, so as to confirm O(N). The tabulation is done only for those which found a loop. There's no point in counting in those which it's obvious it is O(N). (What may be confusing in the program is that variable N is the size of the array, acting as a null pointer, while the variable called used contains the size of the "list", which we are calling N in the write-up.)

The statistics produced for the 1000-wide bins' average ratio of examined nodes to nodes is:
0 2.343126213126932
1 2.360104875078315
2 2.613321757777493
3 2.238462445587248
where the number at the left is the number of thousands starting the range.

Actually, the above ratios show up approximately, consistently. But a graphics version of the stat program, plotting individual results of examined nodes as the y axis and nodes as the x axis, indicates a stepped function, with steps breaking at powers of 2, and each step sloped at a value of 1. The steps are what increase the average slope to the approximate 2.4? value.

The program is:

RANDOMIZE TIMER
DEFDBL A-Z
n = 4000
'SCREEN 12
DIM array(n)

DIM ctCtr(10)
DIM usedCtr(10)

FOR trial = 1 TO 2000
'  OPEN "looptrce.txt" FOR OUTPUT AS #2
REDIM useCt(n)
FOR i = 1 TO n
array(i) = i
NEXT

FOR i = 1 TO n
r = INT(RND(1) * n + 1)
IF i <> r THEN SWAP array(i), array(r)
NEXT

skipamt = 2
numSkipped = 0

ct = 1

useCt(currptr) = 1
currptr = array(currptr)
useCt(currptr) = 1
DO WHILE (currptr <> lookptr AND currptr <> n)
numSkipped = numSkipped + 1
IF numSkipped = skipamt THEN
skipamt = 2 * skipamt
numSkipped = 0
lookptr = currptr
END IF
currptr = array(currptr)
useCt(currptr) = 1
'    PRINT #2, skipamt; numSkipped; currptr
ct = ct + 1
LOOP
'  CLOSE
used = 0
FOR i = 1 TO n
IF useCt(i) THEN used = used + 1
NEXT

PRINT used, ct, currptr, ct / used
subsc = INT(used / 1000)
IF currptr < n THEN
ctCtr(subsc) = ctCtr(subsc) + ct
usedCtr(subsc) = usedCtr(subsc) + used
END IF
'  IF ct / used > 2 THEN END
'  IF currptr = n THEN
'   PSET (used * 630 / n, 400 - ct * 100 / n), 7
'  ELSE
'   PSET (used * 630 / n, 400 - ct * 100 / n), 14
'  END IF
NEXT
FOR i = 0 TO 10
IF usedCtr(i) > 0 THEN PRINT i, ctCtr(i) / usedCtr(i)
NEXT

 Posted by Charlie on 2003-12-18 15:45:49

 Search: Search body:
Forums (0)