How do I delete rows from a DataFrame based on a condition in Python?

Question

I have a CSV file whose data is laid out like this:

|Variable |Time |Value|
|A1       |Jan  | 33  |
|         |Feb  | 21  |
|         |Mar  | 08  |
|         |Apr  | 17  |
|         |May  | 04  |
|         |Jun  | 43  |
|         |Jul  | 40  |
|         |Aug  | 37  |
|         |Sep  | 30  |
|         |Oct  | 46  |
|         |Nov  | 10  |
|         |Dec  | 13  |
| B1      |Jan  | 20  |
|         |Feb  | 11  |
|         |Mar  | 02  |
|         |Apr  | 18  |
|         |May  | 10  |
|         |Jun  | 35  |
|         |Jul  | 45  |
|         |Aug  | 32  |
|         |Sep  | 39  |
|         |Oct  | 42  |
|         |Nov  | 15  |
|         |Dec  | 18  |

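It continues like this all the way to A10 and B10.

I only need the January-to-December times and their values for the A variables, and I want to delete the rows that belong to the B variables. How can I do this? What would the condition be?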

Answer 1

Two different approaches:

If the column widths are fixed:

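import numpy as np
import pandas as pd
# Read by fixed character positions, forward-fill Variable, keep rows whose label starts with 'A'.
df = pd.read_fwf('file.csv', colspecs=[(1, 9), (11, 16), (17, 22)])
df = df[df.replace('', np.nan).ffill()['Variable'].str.startswith('A')]
print(df)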

Output:

   Variable Time  Value
0        A1  Jan     33
1            Feb     21
2            Mar      8
3            Apr     17
4            May      4
5            Jun     43
6            Jul     40
7            Aug     37
8            Sep     30
9            Oct     46
10           Nov     10
11           Dec     13
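
To make the condition itself easier to see, here is a minimal sketch on a small in-memory frame rather than the real file. The three-month sample data and the names `labels`, `keep`, and `a_only` are made up for illustration; the blank label cells are assumed to load as missing values.

```python
import pandas as pd

# Hypothetical miniature of the question's data: the label appears only on
# the first row of each block, and the remaining label cells are missing.
df = pd.DataFrame({
    "Variable": ["A1", None, None, "B1", None, None],
    "Time": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "Value": [33, 21, 8, 20, 11, 2],
})

# Forward-fill a copy of the label column so every row knows its block.
labels = df["Variable"].ffill()

# The condition: keep rows whose filled label starts with "A".
keep = labels.str.startswith("A")
a_only = df[keep]

print(a_only["Value"].tolist())  # [33, 21, 8]
```

Because the mask is built from a filled copy, the `Variable` column in the result still looks the way it did in the file.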

If things are messier:

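import pandas as pd

with open('file.csv', 'r') as f:
    # Split each non-blank line on '|' and keep the three stripped fields.
    df = pd.DataFrame([[y.strip() for y in x.split('|')[1:4]] for x in f if x.strip()])
# Promote the first parsed row to the header, then drop it from the data.
df.columns = df.iloc[0].values
df = df.drop(0).reset_index(drop=True)
df['Value'] = pd.to_numeric(df['Value'])
print(df)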

Output:

   Variable Time  Value
0        A1  Jan     33
1            Feb     21
2            Mar      8
3            Apr     17
4            May      4
5            Jun     43
6            Jul     40
7            Aug     37
8            Sep     30
9            Oct     46
10           Nov     10
11           Dec     13
12       B1  Jan     20
13           Feb     11
14           Mar      2
15           Apr     18
16           May     10
17           Jun     35
18           Jul     45
19           Aug     32
20           Sep     39
21           Oct     42
22           Nov     15
23           Dec     18
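
The parsing above only loads the data; the B rows are still there. A possible follow-up filter is sketched below, assuming the blank labels come through as empty strings, as they do with this kind of manual split. The inline `raw` text and the names `rows`, `labels`, and `a_only` are illustrative only.

```python
import pandas as pd

# Hypothetical sample lines in the question's pipe-delimited layout.
raw = """|Variable |Time |Value|
|A1       |Jan  | 33  |
|         |Feb  | 21  |
| B1      |Jan  | 20  |
|         |Feb  | 11  |
"""

# Same manual parse as above, but from a string instead of a file.
rows = [[cell.strip() for cell in line.split("|")[1:4]]
        for line in raw.splitlines() if line.strip()]
df = pd.DataFrame(rows[1:], columns=rows[0])
df["Value"] = pd.to_numeric(df["Value"])

# Blank labels are empty strings here, so turn them into missing values
# before forward-filling; ffill does not treat "" as missing.
labels = df["Variable"].mask(df["Variable"] == "").ffill()

# Keep only the rows that belong to an A block.
a_only = df[labels.str.startswith("A")].reset_index(drop=True)
print(a_only)
```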

Answer 2

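Assuming your data is arranged the way you describe, you can proceed as follows.

Use pandas' ffill() to fill in the Variable column so that the selection you need becomes easy, as shown below.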

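import numpy as np
import pandas as pd

sample = pd.read_csv('sample.csv')
# Fill the blank Variable cells downward so every row carries its label.
sample['Variable'] = sample['Variable'].ffill()
# Keep only the A rows; reset the index so positions 0, 12, 24, ... are valid labels.
sample = sample.loc[sample['Variable'].str.startswith('A')].reset_index(drop=True)
n_months = 12
# Restore the original layout: keep the label only on the first row of each 12-month block.
indexes_to_impute_as_empty = list(range(0, len(sample), n_months))
sample.loc[indexes_to_impute_as_empty, 'temp_Variable'] = sample.loc[indexes_to_impute_as_empty, 'Variable']
sample['Variable'] = sample['temp_Variable']
sample.drop(columns=['temp_Variable'], inplace=True)
sample.replace(np.nan, "", inplace=True)
sample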
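
If the blocks are not guaranteed to be exactly 12 rows long, one alternative (not part of the answer above, just a sketch) is to blank a label whenever it repeats the row above it, which avoids the temporary column and the fixed block size. The sample frame, its values, and the names `filled`, `kept`, and `result` are hypothetical.

```python
import pandas as pd

# Hypothetical two-month sample with the labels already sparse.
sample = pd.DataFrame({
    "Variable": ["A1", None, "B1", None, "A2", None],
    "Time": ["Jan", "Feb"] * 3,
    "Value": [33, 21, 20, 11, 7, 9],
})

# Fill the labels downward, then keep only the A rows.
filled = sample["Variable"].ffill()
result = sample[filled.str.startswith("A")].copy()

# Show a label only where it differs from the previous kept row.
kept = filled.loc[result.index]
result["Variable"] = kept.where(kept != kept.shift(fill_value=""), "")
result = result.reset_index(drop=True)
print(result)
```

This relies on consecutive blocks having different labels, which holds for data running A1, A2, ... A10.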

2 answers

Answer 1
0 2022-05-26 06:15:30

Answer 2
0 Accepted 2022-05-26 06:46:19
