How to read comma-delimited data with some values using commas between quotes

I have a data file that includes comma-delimited data that I am trying to read into Octave. Most of the data is fine, but some values are numbers in double quotes with commas between the quotes. Here's a sample section of data:

.123,4.2,"4,123",700,12pie
.34,4.23,602,701,23dj
.4345,4.6,"3,623,234",700,134nfg
.951,68.5,45,699,4lkj

I've been using textscan to read the data (since there's a mix of numbers and strings), specifying comma delimiters. That works most of the time, but occasionally the file contains these bigger integers in quotes scattered through that column. I was able to get around one of these quoted numbers earlier in the data file because I knew where it would be, but it wasn't pretty:

sclose = textscan(fid, '%n %n', 1, 'delimiter', ',');
junk = fgetl(fid, 1);
junk = textscan(fid, '%s', 1, 'delimiter', '"');
junk = fgetl(fid, 1);
sopen = textscan(fid, '%n %s', 1, 'delimiter', ',');

I don't care about the data in that column, but because it changes size and sometimes contains quoted values with extra commas that I want to ignore, I'm struggling with how to read or skip it. Any suggestions on how to handle it?

Here's my current (ugly) approach, which reads the column as a string, then uses strfind to check for a " within the string. If one is present, it reads another comma-delimited string and repeats the check until the closing " is found, and then resumes reading the data.

fid = fopen('sample.txt', 'r');
for k=1:4
  expdata1(k, :) = textscan(fid, '%n %n %s', 1, 'delimiter', ',');  #read first 3 data pts
  qcheck = char(expdata1(k,3));
  idx = strfind(qcheck, '"');  #look for "
  dloc = ftell(fid);
  for l=1:4
    if isempty(idx) #if no " present, continue reading data
      break
    endif
    dloc = ftell(fid);  #save location so can return to next data point
    expdata1(k, 3) = textscan(fid, '%s', 1, 'delimiter', ',');  #if " present, read next comma segment and check for "
    qcheck = char(expdata1(k,3));
    idx = strfind(qcheck, '"');
  endfor
  fseek(fid, dloc);
  expdata2(k, :) = textscan(fid, '%n %s', 1, 'delimiter', ',');
endfor
fclose(fid);

There's gotta be a better way...

I see this has a matlab tag on it. Are you using MATLAB's textscan or Octave's?

If in MATLAB, I would suggest using either readmatrix or readtable.

Also note that the format specifier for a quoted string is %q. This should be applicable to both languages, even for textscan.
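As a rough illustration of the %q suggestion, here is a minimal sketch that runs textscan on the question's sample rows. It assumes textscan accepts a string in place of a file identifier (so no file is needed) and that %q keeps a quoted field together despite its inner commas. The variable names txt, rows, c and col3 are made up for this example.

```octave
% Sample rows from the question, joined into one string so the
% sketch is self-contained (no file on disk required).
rows = {'.123,4.2,"4,123",700,12pie', ...
        '.34,4.23,602,701,23dj', ...
        '.4345,4.6,"3,623,234",700,134nfg', ...
        '.951,68.5,45,699,4lkj'};
txt = strjoin(rows, sprintf('\n'));

% %q reads a field that may be wrapped in double quotes, so the
% commas inside "4,123" are not treated as delimiters.
c = textscan(txt, '%f %f %q %f %s', 'Delimiter', ',');

% Column 3 is a cell array of strings. Strip any leftover quotes and
% the thousands separators so the remaining digits can be inspected.
col3 = strrep(strrep(c{3}, '"', ''), ',', '');
disp(col3);
```

If the third column is not needed at all, c{3} can simply be ignored. MATLAB's textscan also documents a skip form such as %*q; whether a given Octave version accepts it is worth checking before relying on it.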

Putting your sample data in data.csv, the following is possible:

>> readtable("data.csv", 'Format','%f%f%q%d%s')
ans =

  4×5 table

     Var1     Var2        Var3         Var4       Var5   
    ______    ____    _____________    ____    __________

     0.123     4.2    {'4,123'    }    700     {'12pie' }
      0.34    4.23    {'602'      }    701     {'23dj'  }
    0.4345     4.6    {'3,623,234'}    700     {'134nfg'}
     0.951    68.5    {'45'       }    699     {'4lkj'  }
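The third column comes back as text. If the numbers are wanted after all, one possible follow-up is to remove the thousands separators and convert. The sketch below works on a plain cell array of strings copied from the output above (the names var3 and nums are hypothetical), so it does not depend on readtable being available.

```octave
% Values of the third column as shown in the table output above.
var3 = {'4,123'; '602'; '3,623,234'; '45'};

% Remove the commas used as thousands separators, then convert the
% remaining digit strings to doubles.
nums = str2double(strrep(var3, ',', ''));
disp(nums);
```

With a table T returned by readtable, the same two calls could presumably be applied to T.Var3.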
