Wow. These past 10 months or so have gone by in a weird way.
I was surprised to realize that it's been almost a year since I wrote this post on a method of sorting two sets of search results that I'd been thinking about for a while. I called it a "duck sort".
Anyway, when I wrote the function to do the sort, I was comparing two lists, specifically the first item of each. Whichever list's item "won" the comparison, I would remove that item from the winning list, then continue making comparisons until one of the lists became empty.
I'd always wondered whether it would be better to traverse the lists instead of repeatedly removing the first item of each. I went with the "remove" approach over the "traverse" approach because it was easier to hold in my head while I was just trying to work out the basic logic of the sort method. But now I have this Python timer thing I just posted about here …
So, I made a "traverse" version of the sort method and wrote a little test script to time the two approaches on an increasingly larger data set and see what happened.
Not surprisingly, traversing the lists was increasingly faster than my original "remove" approach, i.e. repeatedly altering/rebuilding the lists.
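The speed gap makes sense once you look at what `list.remove` costs. Removing the first element forces Python to shift every remaining element down by one slot, so emptying a list one removal at a time is quadratic overall, while walking an index through the list is linear. Here's a quick standalone sketch (with made-up list sizes, not the post's data) showing how the cost grows:

```python
import time

# Emptying a list by repeatedly removing its first element:
# each remove scans from the front and shifts everything after it,
# so n removals cost O(n^2) total. Doubling n roughly quadruples the time.
for n in (1000, 2000, 4000):
    data = list(range(n))
    start = time.perf_counter()
    while data:
        data.remove(data[0])  # O(n) per call
    elapsed = time.perf_counter() - start
    print(n, round(elapsed, 4))
```

By contrast, advancing an index (`yi += 1`) never moves any elements, which is why the "traverse" version stays nearly flat as the data grows.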
So, here is the new "traverse" approach to the "duck sort" and the original "remove" approach.
```python
""" This code is in Python. """

### duck sort via traversing lists.
def duck_sort_TRAVERSE(x, y):
    output = []
    yi, xi = 0, 0
    while yi < len(y) and xi < len(x):
        if y[yi] >= x[xi]:
            output.append(y[yi])
            yi += 1
        else:
            output.append(x[xi])
            xi += 1
    output = output + y[yi:]
    output = output + x[xi:]
    return output

### duck sort via removing items from lists.
def duck_sort_REMOVE(x, y):
    output = []
    while len(y) > 0 and len(x) > 0:
        first_x = x[0]
        first_y = y[0]
        if first_y >= first_x:
            output.append(first_y)
            y.remove(first_y)
        else:
            output.append(first_x)
            x.remove(first_x)
    output = output + y
    output = output + x
    return output
```
I ran the test on the same "Hansel and Gretel" data from my original post on the "duck sort". But this time, I simply multiplied the data set by a number starting at 0 and ending at 10,000, in increments of 5,000. This way I could compare the speed of the two approaches on larger and larger data sets.
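For reference, the test script can look roughly like the sketch below. The two sort functions are restated inline so it runs on its own, the `base_x`/`base_y` lists are placeholder stand-ins for the actual "Hansel and Gretel" data (which isn't reproduced here), and the multipliers are kept small so the example finishes quickly; the real test went up to 10,000.

```python
import time

# Restated from the post so this sketch is self-contained.
def duck_sort_TRAVERSE(x, y):
    output = []
    yi, xi = 0, 0
    while yi < len(y) and xi < len(x):
        if y[yi] >= x[xi]:
            output.append(y[yi])
            yi += 1
        else:
            output.append(x[xi])
            xi += 1
    return output + y[yi:] + x[xi:]

def duck_sort_REMOVE(x, y):
    output = []
    while len(y) > 0 and len(x) > 0:
        if y[0] >= x[0]:
            output.append(y[0])
            y.remove(y[0])
        else:
            output.append(x[0])
            x.remove(x[0])
    return output + y + x

# Placeholder stand-ins for the "Hansel and Gretel" data.
base_x = [9, 7, 5, 3, 1]
base_y = [8, 6, 4, 2]

results = []
for multiplier in (0, 100, 200):  # the actual test used 0 to 10,000
    x, y = base_x * multiplier, base_y * multiplier

    start = time.perf_counter()
    traverse_out = duck_sort_TRAVERSE(list(x), list(y))
    traverse_time = time.perf_counter() - start

    start = time.perf_counter()
    remove_out = duck_sort_REMOVE(list(x), list(y))  # mutates its copies
    remove_time = time.perf_counter() - start

    assert traverse_out == remove_out  # both approaches agree
    results.append((multiplier, traverse_time, remove_time))

for multiplier, t_time, r_time in results:
    print(multiplier, round(t_time, 4), round(r_time, 4))
```

Note that the "remove" version empties the lists it's given, which is why the sketch passes fresh copies (`list(x)`, `list(y)`) to each function.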
The results are interesting. Here's a table with the times (in seconds) of the "traverse" and "remove" approaches. The "data_length" column is simply the multiplier used to scale up the data set.
Perhaps more interesting is a graph of the two approaches, below. On the graph, the "traverse" line is so flat it's hard to even see it trending upward.