How do I use mergesort to sort the names that have the same surname?
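Our instructor gave us a csv file, and we are supposed to sort the names alphabetically by surname. However, some names share the same surname, and my code only sorts by surname. I don't know what to add so that entries with the same surname are also sorted by first name.
Here is my code: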
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define people 11
struct list_people {
char FirstName[20];
char LastName[20];
char Name[20];
char Age[5];
};
typedef struct list_people Details;
int merge_sort();
int merge(Details *[11], int, int, int);
int main() {
FILE *info;
int i, j;
Details A[people];
Details *cell[people];
for (i = 0; i < (people + 1); i++) {
cell[i] = &A[i];
}
info = fopen("people.csv", "r");
for (i = 0; i < people; i++) {
fscanf(info," %[^,], %[^,], %8s", A[i].LastName, A[i].FirstName, A[i].Age);
}
fclose(info);
merge_sort(cell, 1, people - 1);
for (i = 1; i < people; i ++) {
printf( "\t %-20s %-20s Age:%-20s \n", A[i].FirstName, A[i].LastName, A[i].Age);
}
}
int merge_sort(Details *A[], int low, int high) {
int mid;
if (low < high) {
mid = (low + high) / 2;
merge_sort(A, low, mid);
merge_sort(A, mid + 1, high);
merge(A, low, mid, high);
}
return 0;
}
int merge(Details *A[], int low, int mid, int high) {
int leftIndex = low;
int rightIndex = mid + 1;
int combinedIndex = low;
int i, j;
Details tempA[people];
while (leftIndex <= mid && rightIndex <= high) {
if (strcasecmp((A[leftIndex]->LastName), (A[rightIndex]->LastName)) <= 0) {
tempA[combinedIndex] = *(*(A + leftIndex));
combinedIndex++;
leftIndex++;
} else {
tempA[combinedIndex] = *(*(A + rightIndex));
combinedIndex++;
rightIndex++;
}
}
if (leftIndex == mid + 1) {
while (rightIndex <= high) {
tempA[combinedIndex] = *(*(A + rightIndex));
combinedIndex++;
rightIndex++;
}
} else {
while (leftIndex <= mid) {
tempA[combinedIndex] = *(*(A + leftIndex));
combinedIndex++;
leftIndex++;
}
}
for (i = low; i <= high; i++) {
*(*(A + i)) = tempA[i];
}
return 0;
}
You can compare the elements as follows:
int cmp;
// compare the last names
cmp = strcasecmp( ( A[leftIndex]->LastName), ( A[rightIndex]->LastName) );
if (cmp == 0)
{
// last names are identical so compare the first names
cmp = strcasecmp( ( A[leftIndex]->FirstName), ( A[rightIndex]->FirstName) );
}
if (cmp <= 0)
{
// ...
}
else
{
// ...
}
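There are several problems in the code:
The array has people elements, so the initialization loop must exclude the index value people. Instead of for (i = 0; i < (people + 1); i++) you should write: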
for (i = 0; i < people; i++)
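To avoid buffer overflows, you should specify in the fscanf() format the maximum number of characters to store in each destination array (one less than the array size, leaving room for the null terminator):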
fscanf(info," %19[^,], %19[^,], %4s", A[i].LastName, A[i].FirstName, A[i].Age);
You should also check the return value of fscanf() to detect invalid input.
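As a rough sketch of the two checks described above, the helper below parses one CSV record with bounded field widths and reports failure when fewer than three fields convert. The name parse_person and the Person struct are made up for this illustration and are not part of the original code. Parsing an already-read line with sscanf() is one common pattern, not the only one.

```c
#include <stdio.h>

/* Hypothetical record type mirroring the fields used in the question. */
typedef struct {
    char FirstName[20];
    char LastName[20];
    char Age[5];
} Person;

/* Parse "Last, First, Age" from one line of text.
 * Returns 1 on success, 0 if fewer than three fields were converted.
 * Each width is one less than the array size to leave room for '\0'. */
int parse_person(const char *line, Person *p) {
    int fields = sscanf(line, " %19[^,], %19[^,], %4s",
                        p->LastName, p->FirstName, p->Age);
    return fields == 3;
}
```

A caller would typically read each line with fgets() and stop, or skip the line, when parse_person() returns 0. A name longer than 19 characters is rejected here rather than overflowing the array, because the comma expected after the field no longer matches.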
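To order entries with the same last name, you should use a comparison function that compares LastName, FirstName, and Age in turn, returning the first result that is not equal.
Here is an example: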
int compareDetails(const Details *a, const Details *b) {
int res, age_a, age_b;
if ((res = strcasecmp(a->LastName, b->LastName)) != 0)
return res;
if ((res = strcasecmp(a->FirstName, b->FirstName)) != 0)
return res;
age_a = atoi(a->Age);
age_b = atoi(b->Age);
return (age_a > age_b) - (age_a < age_b);
}
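Arrays are zero-based in C, so index values should start at 0. It is also idiomatic in C to specify array slices with the first index included and the last index excluded. This allows for empty slices and makes the merge_sort code simpler, avoiding error-prone +1 / -1 adjustments.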
The cell pointer array is redundant, since you sort the Details array in place. You can simplify the code this way:
#include <stdio.h>
#include <string.h>
#include <strings.h>  /* strcasecmp() is declared here on POSIX systems */
#include <stdlib.h>
#define PEOPLE 11
struct list_people {
char FirstName[20];
char LastName[20];
char Name[20];
char Age[5];
};
typedef struct list_people Details;
int merge_sort(Details A[], int low, int high);
int main() {
FILE *info;
int i, n;
Details A[PEOPLE];
info = fopen("people.csv", "r");
if (info == NULL)
return 1;
for (n = 0; n < PEOPLE; n++) {
if (fscanf(info," %19[^,], %19[^,], %4s", A[n].LastName, A[n].FirstName, A[n].Age) != 3)
break;
}
fclose(info);
merge_sort(A, 0, n);
for (i = 0; i < n; i++) {
printf( "\t %-20s %-20s Age:%-20s \n", A[i].FirstName, A[i].LastName, A[i].Age);
}
return 0;
}
int compareDetails(const Details *a, const Details *b) {
int res, age_a, age_b;
if ((res = strcasecmp(a->LastName, b->LastName)) != 0)
return res;
if ((res = strcasecmp(a->FirstName, b->FirstName)) != 0)
return res;
age_a = atoi(a->Age);
age_b = atoi(b->Age);
return (age_a > age_b) - (age_a < age_b);
}
void merge(Details A[], int low, int mid, int high) {
int leftIndex = low;
int rightIndex = mid;
int combinedIndex = 0;
int i;
Details tempA[high - low];
while (leftIndex < mid && rightIndex < high) {
if (compareDetails(&A[leftIndex], &A[rightIndex]) <= 0) {
tempA[combinedIndex] = A[leftIndex];
combinedIndex++;
leftIndex++;
} else {
tempA[combinedIndex] = A[rightIndex];
combinedIndex++;
rightIndex++;
}
}
while (leftIndex < mid) {
tempA[combinedIndex] = A[leftIndex];
combinedIndex++;
leftIndex++;
}
while (rightIndex < high) {
tempA[combinedIndex] = A[rightIndex];
combinedIndex++;
rightIndex++;
}
for (i = low; i < high; i++) {
A[i] = tempA[i - low];
}
}
int merge_sort(Details A[], int low, int high) {
if (high - low >= 2) {
int mid = low + (high - low) / 2;
merge_sort(A, low, mid);
merge_sort(A, mid, high);
merge(A, low, mid, high);
}
return 0;
}
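The merge() above allocates its temporary array as a variable-length array on the stack, which is fine for 11 records but could exhaust the stack for much larger inputs. One possible variation, sketched here with a simplified hypothetical Person record and made-up function names (sort_people, sort_range, merge_range), allocates a single scratch buffer with malloc() and reuses it for every merge. It keeps the same half-open [low, high) convention.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Simplified, hypothetical record for this sketch. */
typedef struct {
    char FirstName[20];
    char LastName[20];
} Person;

/* Order by last name, then first name, ignoring case. */
static int compare_people(const Person *a, const Person *b) {
    int res = strcasecmp(a->LastName, b->LastName);
    return res != 0 ? res : strcasecmp(a->FirstName, b->FirstName);
}

/* Merge sorted runs [low, mid) and [mid, high) through caller-provided scratch. */
static void merge_range(Person A[], Person tmp[], int low, int mid, int high) {
    int left = low, right = mid, out = low;
    while (left < mid && right < high) {
        /* <= takes the left element on ties */
        if (compare_people(&A[left], &A[right]) <= 0)
            tmp[out++] = A[left++];
        else
            tmp[out++] = A[right++];
    }
    while (left < mid)
        tmp[out++] = A[left++];
    while (right < high)
        tmp[out++] = A[right++];
    /* copy the merged run back into place */
    memcpy(&A[low], &tmp[low], (size_t)(high - low) * sizeof *A);
}

static void sort_range(Person A[], Person tmp[], int low, int high) {
    if (high - low >= 2) {
        int mid = low + (high - low) / 2;
        sort_range(A, tmp, low, mid);
        sort_range(A, tmp, mid, high);
        merge_range(A, tmp, low, mid, high);
    }
}

/* Sort A[0..n). Returns 0 on success, -1 if the scratch buffer
 * cannot be allocated. */
int sort_people(Person A[], int n) {
    Person *tmp;
    if (n < 2)
        return 0;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    sort_range(A, tmp, 0, n);
    free(tmp);
    return 0;
}
```

The trade-off is one extra parameter threaded through the recursion in exchange for a single allocation whose failure can be reported to the caller.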
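If you have two identical last names, you need to compare on another attribute and use that to decide the ordering.
First, to make things a little clearer, we'll take the call to strcasecmp out of the if condition and save its result in a variable: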
while ( leftIndex <= mid && rightIndex <= high )
{
int lastNameResult = strcasecmp( A[leftIndex]->LastName, A[rightIndex]->LastName );
Then, as a first pass, we'll split the <= 0 case into two separate cases:
if ( lastNameResult < 0 )
{
// left < right, process as before
}
else if ( lastNameResult == 0 )
{
// new case to handle same last names
}
else
{
// left > right, process as before
}
Now, in this new case, we need to compare on a second attribute; the usual choice is FirstName:
...
else if ( lastNameResult == 0 )
{
int firstNameResult = strcasecmp( A[leftIndex]->FirstName, A[rightIndex]->FirstName );
if ( firstNameResult < 0 )
{
// left < right, handle the same way you did for LastName
}
else if ( firstNameResult == 0 )
{
// left == right, need to compare against another attribute
}
else
{
// left > right, handle the same way you did for LastName
}
}
...
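If you run into two entries that also share the same first name, you need to pick yet another attribute to compare on (such as Age) and put that comparison in the else if ( firstNameResult == 0 ) branch. Alternatively, you can merge the < and == cases and let duplicate entries fall where they may.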
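Obviously, this approach doesn't scale well to multiple attributes, and you end up duplicating code. A better approach is to separate the comparison from the merge. Add a new variable, which we'll call mergeLeft, and set it to true (non-zero) if you need to merge from the left list and to false (0) if you need to merge from the right. We can compute it without a whole if-else chain: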
while ( leftIndex <= mid && rightIndex <= high )
{
int lastNameResult = strcasecmp( A[leftIndex]->LastName, A[rightIndex]->LastName );
int firstNameResult = strcasecmp( A[leftIndex]->FirstName, A[rightIndex]->FirstName );
/**
* Yes, you can use logical expressions outside of an if condition. The
* parentheses aren't necessary in this case, but it should make the
* expression easier to understand. The result of this expression
* will either be 0 or 1.
*/
int mergeLeft = lastNameResult < 0 || (lastNameResult == 0 && firstNameResult < 0);
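The mergeLeft variable is set to true (1) if the left LastName is less than the right LastName, or if the last names are equal and the left FirstName is less than the right FirstName. Poof, that ugly nested if statement magically goes away, and we're left with this as the body of the while loop: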
if ( mergeLeft )
{
tempA[combinedIndex++] = *A[leftIndex++]; // I've combined the updates
} // of combinedIndex and leftIndex
else // within these statements,
{ // just to save a little space
tempA[combinedIndex++] = *A[rightIndex++]; // I also replaced pointer
} // notation with array notation
} // because it's less eye-stabby
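One detail of the mergeLeft expression above: when both the last and first names are equal it evaluates to 0, so the tie is taken from the right-hand run. The output is still sorted, but records with identical names may swap their relative order, so the sort is not stable. A possible variation, sketched below with a hypothetical helper named should_merge_left and a pared-down Person record (neither is from the original code), takes the left element on ties and only compares first names when the last names match.

```c
#include <strings.h>   /* strcasecmp (POSIX) */

/* Pared-down, hypothetical record for this sketch. */
typedef struct {
    char FirstName[20];
    char LastName[20];
    char Age[5];
} Person;

/* Returns 1 if the left record should be merged next, 0 for the right one. */
int should_merge_left(const Person *left, const Person *right) {
    int lastNameResult = strcasecmp(left->LastName, right->LastName);
    if (lastNameResult != 0)
        return lastNameResult < 0;
    /* same last name: fall back to first name; <= keeps ties on the left */
    return strcasecmp(left->FirstName, right->FirstName) <= 0;
}
```

Inside the while loop this would be called as should_merge_left(A[leftIndex], A[rightIndex]), since A holds pointers there.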