
Read a tab-delimited CSV file in C properly

I have a tab-delimited CSV file that I am trying to read in C. The fields are separated by tabs, but a few of them are empty (null). I am using fscanf(fileptr,"[^\t]s",field); to read a single field, but this does not let me identify the fields that are empty. I also tried fscanf(fileptr,"%s\t%s\t",field1,field2); and ran into the same problem. The second approach also fails when a field contains a space, because %s only reads a string up to the first whitespace character. How can I read every field correctly, including the empty ones?

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

int main( void ) {
    FILE *fp = fopen( "your_file_here", "r" );
    if( fp == NULL ){
        return 1; //file could not be opened (missing, locked, no permission)
    }
    char mystr[1024];   //buffer for one line read from the file
    //fgets returns NULL at end of file or on error, so no feof() check is needed
    while( fgets( mystr, sizeof mystr, fp ) != NULL ) { //read line by line
        int len = (int)strlen( mystr ); //string length

        //trim trailing \r and \n from the end of the string
        for ( int i = len - 1; i >= 0; i-- ) {
            if( mystr[ i ] == '\r' || mystr[ i ] == '\n' ){
                mystr[ i ] = '\0'; //overwrite the line ending
            }
            else{
                break; //first character that is not a line ending
            }
        }//end trim

        //tokenize mystr by scanning for \t and copying the
        //characters into tokenBuff
        //when a \t is found, tokenBuff holds a complete token
        //(an empty token if two tabs are adjacent)
        //the last token ends at the terminating '\0'
        char tokenBuff[1024];
        memset( tokenBuff, 0, sizeof tokenBuff );
        int offset = 0;
        int tokenBuffOff = 0;
        while( true ){
            if( mystr[ offset ] == '\t' ){
                printf( "token: %s\n", tokenBuff );
                memset( tokenBuff, 0, sizeof tokenBuff );
                tokenBuffOff = 0;
            }
            else{
                if( mystr[ offset ] == '\0' ){
                    //end of line: print the last token, even if it is empty
                    printf( "token: %s\n", tokenBuff );
                    break;
                }
                else{
                    tokenBuff[ tokenBuffOff ] = mystr[ offset ];
                    tokenBuffOff++;
                }
            }
            offset++;
        }//end tokenize
    }//end while
    fclose( fp );
    return 0;
}
