如何根据python中的条件从数据框中删除行？

Question

So I have a CSV file that has data in the following manner:所以我有一个 CSV 文件，其中包含以下方式的数据：

|Variable |Time |Value|

|A1       |Jan  | 33  |

|         |Feb  | 21  |   

|         |Mar  | 08  |   

|         |Apr  | 17  |   

|         |May  | 04  |   

|         |Jun  | 43  |   

|         |Jul  | 40  |   

|         |Aug  | 37  |   

|         |Sep  | 30  |   

|         |Oct  | 46  |   

|         |Nov  | 10  | 

|         |Dec  | 13  | 

| B1      |Jan  | 20  |       

|         |Feb  | 11  |   

|         |Mar  | 02  |   

|         |Apr  | 18  |   

|         |May  | 10  |   

|         |Jun  | 35  |   

|         |Jul  | 45  |   

|         |Aug  | 32  |   

|         |Sep  | 39  |   

|         |Oct  | 42  |   

|         |Nov  | 15  | 

|         |Dec  | 18  |

Like this it goes on until A10 and B10.像这样一直持续到 A10 和 B10。

I need only A with time from Jan to Dec along with the values and drop values corresponding to B. How to do it?我只需要从一月到十二月的时间以及与 B 对应的值和删除值。怎么做？ What will be the condition?会是什么条件？

Answer 1

Two different methods:两种不同的方法：

If the column widths are fixed:如果列宽是固定的：

df = pd.read_fwf('file.csv', colspecs=[(1,9), (11,16), (17, 22)])
df = df[df.replace('', np.nan).ffill()['Variable'].str.startswith('A')]
print(df)

Output:输出：

   Variable Time  Value
0        A1  Jan     33
1            Feb     21
2            Mar      8
3            Apr     17
4            May      4
5            Jun     43
6            Jul     40
7            Aug     37
8            Sep     30
9            Oct     46
10           Nov     10
11           Dec     13

If things are more dirty:如果事情更脏：

with open('file.csv', 'r') as f:
    df = pd.DataFrame([[y.strip() for y in x.split('|')[1:4]] for x in f.readlines() if x.strip()])
df.columns = df.iloc[0].values
df = df.drop(0).reset_index(drop=True)
df['Value'] = pd.to_numeric(df['Value'])
print(df)

Output:输出：

   Variable Time  Value
0        A1  Jan     33
1            Feb     21
2            Mar      8
3            Apr     17
4            May      4
5            Jun     43
6            Jul     40
7            Aug     37
8            Sep     30
9            Oct     46
10           Nov     10
11           Dec     13
12       B1  Jan     20
13           Feb     11
14           Mar      2
15           Apr     18
16           May     10
17           Jun     35
18           Jul     45
19           Aug     32
20           Sep     39
21           Oct     42
22           Nov     15
23           Dec     18

Answer 2

Assuming your data is arranged as you described, with some extrapolation as below假设您的数据按照您的描述排列，并进行如下推断

Use pandas' ffill() to impute the variable column to facilitate the desired selection as below.使用 pandas 的ffill()来估算变量列，以方便进行所需的选择，如下所示。

sample = pd.read_csv('sample.csv')
sample['Variable'].ffill(axis=0,inplace=True)
sample = sample.loc[sample['Variable'].str.startswith('A')]
n_months = 12
indexes_to_impute_as_empty = list(range(0,len(sample),n_months))
sample.loc[indexes_to_impute_as_empty,'temp_Variable'] = sample.loc[indexes_to_impute_as_empty,'Variable']
sample['Variable'] = sample['temp_Variable']
sample.drop(columns=['temp_Variable'],inplace=True)
sample.replace(np.nan,"",inplace=True)
sample

如何根据python中的条件从数据框中删除行？

问题描述

2 个解决方案

解决方案1
0 2022-05-26 06:15:30

解决方案2
0 已采纳 2022-05-26 06:46:19

如何根据python中的条件从数据框中删除行？

问题描述

2 个解决方案

解决方案1 0 2022-05-26 06:15:30

解决方案2 0 已采纳 2022-05-26 06:46:19

解决方案1
0 2022-05-26 06:15:30

解决方案2
0 已采纳 2022-05-26 06:46:19