Selecting rows in a DataFrame based on a column value is limited to 16384 rows

Question

I have a huge *.csv file containing data like the example below, and I have loaded it into a DataFrame named "data".

I want to select the rows whose "CHR" column equals "1". My code is as follows:

selected_row = data.loc[data['CHR'] == '1']

The rows in selected_row are correct (rows 0/3/6/7/10/13 are selected in the example), but it does not contain every row whose "CHR" column equals "1". I eventually found that selected_row only contains rows with CHR == '1' up to row 16384 of data; row 16385 of data (and many later rows) with CHR == '1' are not selected. Please advise, thanks.

Answer 1

Try:

selected_row = data.loc[data['CHR'].isin([1, '1'])]

Answer 2

I think you have your filters mixed up. To make it easier, build the filter first and then apply it to your DataFrame. Try this:

filter_row = data['CHR'] == '1'  # this returns a boolean Series that you can reuse afterwards

data.loc[filter_row]

Answer 3

Thanks, everyone. By the way, it is strange that the problem also disappears if I specify the data type when reading the *.csv file. I don't really know the reason behind it; this is just for anyone's reference:

data = pandas.read_csv("mydata.csv",dtype={"CHR":"string"})
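
A possible explanation, offered as an assumption rather than something confirmed in this thread: with its default low_memory=True setting, read_csv parses a large file in chunks and may infer a different type for the same column in different chunks, so some "CHR" values can be stored as the integer 1 and others as the string '1' (pandas typically emits a DtypeWarning when this happens). Passing dtype forces a single type for the whole column. The sketch below uses a tiny in-memory CSV with made-up values in place of mydata.csv to show that every value is then a string and the original filter selects all matching rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for mydata.csv.
csv_text = "CHR,POS\n1,100\n2,200\n1,300\nX,400\n"

# Force the column to the pandas string dtype while reading.
data = pd.read_csv(io.StringIO(csv_text), dtype={"CHR": "string"})

# Every value is now a str, so comparing with '1' is reliable.
all_str = all(isinstance(v, str) for v in data["CHR"])
selected_row = data.loc[data["CHR"] == "1"]

print(all_str)
print(selected_row)
```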

3 answers

Answer 1: score 1, accepted, 2021-02-14 10:39:55
Answer 2: score 0, 2021-02-14 10:45:35
Answer 3: score 0, 2021-02-16 15:25:47
