highest is a program that finds the highest numbers in a sequence of lines containing numbers.


  • Usage:
  • A comparison of some different methods of finding the largest files; n is the number of files checked, and m is the number of files desired in the report. Typically, n will be quite a bit larger than m.
    In the spirit of full disclosure, I should mention that part of the reason the treap version is faster than the heap version, is that the treap version is able to use a tuple without loss of reverse ordering, while the heap version uses a class.

    Here is a pair of graphs (in postscript format) describing how highest with a treap compares to highest with a heap, to highest with bisect.insort, and to GNU sort | tail. As a quick reminder, we're looking at n numbers, and keeping the m best values (lines):

    Here they are again as png images (they'll probably look better as postscript, but not everyone has a postscript previewer handy):

  • Portions of this software are owned by The University of California, Irvine - and are not under any version of the GPL. GPL is a fine series of licenses, but the partial owners of the software need it to be distributed under these terms.


    Back to Dan's tech tidbits

    You can e-mail the author with questions or comments: