
Sorting an array of structs in C

I have a structure:

  typedef struct book{
  double rating;
  double price;
  double relevance;
  int ID;
}B;

an array:

B *list;

and a file of these records, which I read in with this:

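int read_file(char* infile, int N)
{
  FILE *fp;
  int c = 0;

  if ((fp = fopen(infile, "rb")))
    {
      // Skip the header line; %*s reads a field without storing it.
      fscanf(fp, "%*s\t%*s\t%*s\t%*s\n");
      // Stop at N records, at end of file, or at the first malformed line.
      // Checking fscanf's result avoids the extra bogus record a feof() loop can add.
      while ((c < N) &&
             (fscanf(fp, "%lf\t%lf\t%lf\t%d\n", &list[c].rating, &list[c].price,
                     &list[c].relevance, &list[c].ID) == 4))
        {
          c++;
        }
      fclose(fp);
    }
  else
    {
      fprintf(stderr, "%s did not open. Exiting.\n", infile);
      exit(-1);
    }
  return (c);
}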

and a compare function:

int comp_on_price(const void *a, const void *b)
{

  if ((*(B *)a).price < (*(B *)b).price)
    return 1;
  else if ((*(B *)a).price > (*(B *)b).price)
    return -1;
  else
    return 0;  

}

I would like a stable sort with O(n log n) time, perhaps merge sort, ordered from lowest price to highest.

I only need the 20 lowest prices.

How would I implement this using my compare function?

Thanks.

"I would like a stable sort with O(n log n) time, perhaps merge sort, ordered from lowest price to highest. I only need the 20 lowest prices."

Then you can do this in O(N) time. You can find the 20 lowest-priced records in O(N) time with a selection algorithm, then sort just those 20, which is O(1) because 20 is a fixed constant.

See here for the STL C++ library version

Annotated Python implementation here

qsort is your friend :). (While it is not guaranteed to be N log(N) in the worst case, it is difficult to do anything faster.)

The function you want to use is qsort . C comes with a perfectly acceptable sort which does exactly what you seem to need.

qsort itself isn't a stable sort (well, it may be for a given implementation, but the standard doesn't guarantee it) but it can be made into one with some trickery. I've done that before by adding a pointer to the array elements which is initially populated with the address of the element itself (or an increasing integer value as you read the file will probably do here).

Then you can use that as a minor key, which ensures elements with the same major key are kept in order.

If you don't want to go to the trouble of changing the structures, Algorithmist is a good place to get code from. Myself, I tend to prefer minor modifications to re-implementations.

To actually make it stable, change your structure to:

typedef struct book {
  double rating;
  double price;
  double relevance;
  int ID;
  int seq;                                 // Added to store sequence number.
} B;

and change your file reading code to:

fscanf(fp, "%lf\t%lf\t%lf\t%d\n", ... 
list[c].seq = c;                           // Yes, just add this line.
c++;

then your comparison function becomes something like:

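int comp_on_price(const void *a, const void *b) {
    const B *aa = a;
    const B *bb = b;

    // A negative result means "a sorts first", so this gives lowest price first.
    if (aa->price < bb->price)
        return -1;
    if (aa->price > bb->price)
        return 1;
    // Same price: the record read earlier stays first. seq is unique,
    // so this is 0 only if an element is compared with itself.
    return (aa->seq > bb->seq) - (aa->seq < bb->seq);
}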
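
Putting the pieces of this answer together, a minimal sketch might look like the following. The helper name sort_by_price_stable is hypothetical, and it stamps seq itself instead of relying on the file-reading code, purely so the example is self-contained. It sorts ascending, since the question asks for lowest price first.

```c
#include <stdlib.h>

typedef struct book {
    double rating;
    double price;
    double relevance;
    int ID;
    int seq;   /* input position, used only to break ties */
} B;

/* Ascending by price; records with equal prices fall back to input order. */
static int comp_price_then_seq(const void *a, const void *b)
{
    const B *aa = a;
    const B *bb = b;

    if (aa->price < bb->price) return -1;
    if (aa->price > bb->price) return 1;
    return (aa->seq > bb->seq) - (aa->seq < bb->seq);
}

/* Hypothetical helper: records each element's position in seq,
 * then sorts list[0..n-1] in place with the standard qsort. */
void sort_by_price_stable(B *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        list[i].seq = (int)i;
    qsort(list, n, sizeof list[0], comp_price_then_seq);
}
```

Calling sort_by_price_stable(list, count) after read_file would leave the cheapest records in list[0] through list[19], assuming at least 20 records were read.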

Since you mentioned C and not C++, I would suggest implementing your own version of something similar to qsort().

Look at how the comparator for qsort is defined; you would need to define something similar yourself. For the actual sorting, you would need to implement your own version of StableSort() from scratch.
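
As a rough illustration of that idea, here is one way a hand-written stable sort with a qsort-style interface could look, using a top-down merge sort. The names stable_sort, merge_runs, sort_range and comp_on_price_asc are made up for this sketch, and the comparator is an ascending variant of the one in the question that returns 0 for equal prices.

```c
#include <stdlib.h>
#include <string.h>

typedef struct book {
    double rating;
    double price;
    double relevance;
    int ID;
} B;

typedef int (*cmp_fn)(const void *, const void *);

/* Ascending by price; returns 0 on ties and lets the sort keep their order. */
int comp_on_price_asc(const void *a, const void *b)
{
    const B *aa = a;
    const B *bb = b;
    return (aa->price > bb->price) - (aa->price < bb->price);
}

/* Merge the sorted runs [lo, mid) and [mid, hi) through tmp, then copy back. */
static void merge_runs(char *base, char *tmp, size_t lo, size_t mid, size_t hi,
                       size_t size, cmp_fn cmp)
{
    size_t i = lo, j = mid, k = lo;

    while (i < mid && j < hi) {
        /* "<= 0" takes from the left run on ties; that is what makes it stable. */
        if (cmp(base + i * size, base + j * size) <= 0) {
            memcpy(tmp + k * size, base + i * size, size);
            i++;
        } else {
            memcpy(tmp + k * size, base + j * size, size);
            j++;
        }
        k++;
    }
    /* Copy whichever run still has elements left. */
    if (i < mid)
        memcpy(tmp + k * size, base + i * size, (mid - i) * size);
    else if (j < hi)
        memcpy(tmp + k * size, base + j * size, (hi - j) * size);

    memcpy(base + lo * size, tmp + lo * size, (hi - lo) * size);
}

static void sort_range(char *base, char *tmp, size_t lo, size_t hi,
                       size_t size, cmp_fn cmp)
{
    if (hi - lo < 2)
        return;
    size_t mid = lo + (hi - lo) / 2;
    sort_range(base, tmp, lo, mid, size, cmp);
    sort_range(base, tmp, mid, hi, size, cmp);
    merge_runs(base, tmp, lo, mid, hi, size, cmp);
}

/* Same argument order as qsort. Returns 0 on success, or -1 if the
 * scratch buffer could not be allocated (the array is then untouched). */
int stable_sort(void *base, size_t nmemb, size_t size, cmp_fn cmp)
{
    if (nmemb < 2 || size == 0)
        return 0;
    char *tmp = malloc(nmemb * size);
    if (tmp == NULL)
        return -1;
    sort_range((char *)base, tmp, 0, nmemb, size, cmp);
    free(tmp);
    return 0;
}
```

A call would then look like stable_sort(list, count, sizeof list[0], comp_on_price_asc). The cost of this approach is a scratch buffer as large as the array itself.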

It takes just a slight change to your comparison function to make the library qsort stable. See link here

Something like below should do the trick (untested, be cautious):

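int comp_on_price(const void *a, const void *b)
{
    // a negative result means a sorts first, so this gives lowest price first
    if ((*(B *)a).price < (*(B *)b).price)
        return -1;
    else if ((*(B *)a).price > (*(B *)b).price)
        return 1;
    else
        // if equal, order by addresses (subtracting void pointers is not
        // standard C, so cast and compare instead)
        return ((const char *)a > (const char *)b) - ((const char *)a < (const char *)b);
}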

This would work if you can guarantee that a and b point into the same array and that every swap the sort makes brings the array closer to its final order, so that elements which compare lower only ever move toward lower addresses. This is true for bubble sort and similar algorithms. It would also work for a trivial implementation of QuickSort (which qsort is not). However, for other algorithms, or for any algorithm that uses additional storage for temporaries (perhaps as an optimization), this property will not hold.

If the items you sort contain a unique identifier (in the current example that is probably true for the field ID), another method is to fall back to comparing those identifiers when the prices are equal. That makes the result deterministic, and it matches the input order only if the identifiers were assigned in input order. You could also add such a unique key in a new field for that purpose, but as it uses more memory you should consider the third option described below before doing that.

My preferred method would still be a third one: do not sort the array of structures directly, but sort an array of pointers to the actual structure items. This has several good properties. First, you can compare the addresses of the structures pointed to, since those do not change during the sort, and that makes the sort stable.

The comparison function will become something like:

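int comp_on_price(const void *a, const void *b)
{
    // a negative result means a sorts first, so this gives lowest price first
    if ((*(B **)a)->price < (*(B **)b)->price)
        return -1;
    else if ((*(B **)a)->price > (*(B **)b)->price)
        return 1;
    else
        // if equal, order by addresses; a pointer difference may not fit
        // in an int, so compare instead of subtracting
        return (*(B **)a > *(B **)b) - (*(B **)a < *(B **)b);
}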

Another good property is that it avoids moving structures around while sorting; only pointers move, which can save time. You can also keep several such pointer arrays, which allows several different orderings of the same items at the same time.

Drawbacks are that it takes some memory and that access to items is slightly slower (one level of indirection more).
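
To make the pointer-array approach concrete, here is a small sketch. The helper name sorted_index_by_price and the comparator name comp_ptr_on_price are invented for illustration, and the sort is ascending by price. The records themselves never move; only the pointers in the index array are reordered.

```c
#include <stdlib.h>

typedef struct book {
    double rating;
    double price;
    double relevance;
    int ID;
} B;

/* qsort hands the comparator pointers to the array elements, and the
 * elements here are themselves pointers, hence the double indirection. */
static int comp_ptr_on_price(const void *a, const void *b)
{
    const B *pa = *(const B *const *)a;
    const B *pb = *(const B *const *)b;

    if (pa->price < pb->price) return -1;
    if (pa->price > pb->price) return 1;
    /* Equal prices: both structures live in the same array, so their
     * addresses can be compared, and the lower address came first. */
    return (pa > pb) - (pa < pb);
}

/* Hypothetical helper: builds an array of n pointers into list[] and
 * sorts the pointers by price. Returns NULL if allocation fails.
 * The caller releases the result with free(). */
B **sorted_index_by_price(B *list, size_t n)
{
    B **index = malloc((n ? n : 1) * sizeof *index);
    if (index == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        index[i] = &list[i];
    qsort(index, n, sizeof *index, comp_ptr_on_price);
    return index;
}
```

After the call, index[0]->price is the lowest price, while list[] is still in file order, so a second index sorted on another field could coexist with this one.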

You don't need to qsort everything. Create an empty B* array for the 20 lowest records, copy the first 20 (or fewer) records into it, and qsort them. If there are more than 20 records, then as you iterate over the rest, compare each one to the highest of the current 20: if it is higher, move on; otherwise compare it to the next highest, and so on down toward the lowest, then shift the entries above it to make space for the new record in the low-20. You do need a deterministic comparison. Listen to paxdiablo on that front: add an input record number or something similar to differentiate records.
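
A sketch of that bounded "low-20" idea follows. It is a variation rather than an exact transcription: it copies records instead of pointers and inserts every record in order instead of calling qsort on the first 20. The function name lowest_k_by_price is hypothetical. Each record costs at most k comparisons, so with k fixed at 20 the whole pass is linear in the number of records.

```c
#include <stddef.h>

typedef struct book {
    double rating;
    double price;
    double relevance;
    int ID;
} B;

/* Copies the (up to) k lowest-priced records of list[0..n-1] into out[],
 * sorted from lowest to highest price. out must have room for k records.
 * Records with equal prices keep their input order. Returns how many
 * records were stored, which is the smaller of n and k. */
size_t lowest_k_by_price(const B *list, size_t n, B *out, size_t k)
{
    size_t count = 0;

    if (k == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        size_t pos;

        if (count < k) {
            pos = count++;                 /* still filling the buffer */
        } else if (list[i].price < out[k - 1].price) {
            pos = k - 1;                   /* replaces the current highest */
        } else {
            continue;                      /* not cheaper than the highest kept */
        }

        /* Shift strictly more expensive entries up by one slot. Using ">"
         * leaves equal-priced earlier records ahead of the new one. */
        while (pos > 0 && out[pos - 1].price > list[i].price) {
            out[pos] = out[pos - 1];
            pos--;
        }
        out[pos] = list[i];
    }
    return count;
}
```

With a local array B low[20], the call lowest_k_by_price(list, count, low, 20) would fill low with the cheapest records in ascending order without sorting the full list.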

I finally did this using a counting sort; it took over 100 lines of code in C.

I then did it in one line in a shell script:

sort -nk 2,2 -s Wodehouse.txt | sort -rnk 3,3 -s| sort -rnk 1,1 -s|head -20
