Python: From Rows to Columns for Specific Sections in DF
I have the following table in Excel:
import io
import pandas as pd

# Pipe-delimited copy of the table; empty fields are read as NaN
data = """Col1|Col2|Col3|Col4
Value11|Value21|Value31|
stuff|stuff|stuff|2.0
stuff|stuff|stuff|3.0
||Total|5.0
Value12|Value22|Value32|
stuff|stuff|stuff|6.0
stuff|stuff|stuff|4.0
||Total|10.0"""
# io.StringIO replaces pd.compat.StringIO, which current pandas no longer provides
df = pd.read_csv(io.StringIO(data), header=0, delimiter="|")
print(df)
+---------+---------+---------+------+
| Col1 | Col2 | Col3 | Col4 |
+---------+---------+---------+------+
| Value11 | Value21 | Value31 | |
| stuff | stuff | stuff | 2.0 |
| stuff | stuff | stuff | 3.0 |
| | | Total | 5.0 |
| Value12 | Value22 | Value32 | |
| stuff | stuff | stuff | 6.0 |
| stuff | stuff | stuff | 4.0 |
| | | Total | 10.0 |
+---------+---------+---------+------+
And I would like it to look like this, so that I can do data analysis on it:
+-------+-------+--------+------+----------+---------+---------+
| Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 |
+-------+-------+--------+------+----------+---------+---------+
| stuff | stuff | stuff | 2.0 | Value11 | Value21 | Value31 |
| stuff | stuff | stuff | 3.0 | Value11 | Value21 | Value31 |
| stuff | stuff | stuff | 6.0 | Value12 | Value22 | Value32 |
| stuff | stuff | stuff | 4.0 | Value12 | Value22 | Value32 |
+-------+-------+--------+------+----------+---------+---------+
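That is, I want to take the Col1, Col2 and Col3 values from each section's header row and repeat them as extra columns next to every row of that section.
The only pattern I see is that there is a "Total" entry in Col3, right above the values that I want to turn into columns.
Any ideas on how to achieve this in Python?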
Are you looking for something like this?
import pandas as pd

# Sample data: each header row (Value..) is followed by the rows of its section
df = pd.DataFrame(
    {"Col1": ["Value11", "stuff1", "stuff1", "Value12", "stuff2", "stuff2"],
     "Col2": ["Value21", "stuff1", "stuff1", "Value22", "stuff2", "stuff2"],
     "Col3": ["Value31", "stuff1", "stuff1", "Value32", "stuff2", "stuff2"],
     "Col4": ["", 2, 3, "", 6, 4]},
    index=[1, 2, 3, 4, 5, 6])

# Data rows of each section
df1 = df.loc[df['Col1'] == 'stuff1']
df2 = df.loc[df['Col1'] == 'stuff2']
dfc = pd.concat([df1, df2])

# Header row of each section, repeated once per data row
df11 = df.loc[df['Col1'] == 'Value11']
df22 = df.loc[df['Col1'] == 'Value12']
dfc1 = pd.concat([df11, df11])
dfc2 = pd.concat([df22, df22])
df_fin1 = pd.concat([dfc1, dfc2])
print(df_fin1)

# Reset both indexes so the two frames line up row by row
dfc.reset_index(drop=True, inplace=True)
df_fin1.reset_index(drop=True, inplace=True)

# Place the repeated header values next to the data rows
# (the result has two sets of columns named Col1..Col4)
df_fin = pd.concat([dfc, df_fin1], axis=1)
print(df_fin)
So, if you just select the rows by position and work with those, the code looks like this:
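# These positions match the 8-row table from the question (including the "Total" rows)
# Header rows (positions 0 and 4), each repeated twice
df21 = pd.concat([df.iloc[0:1], df.iloc[0:1]])
df22 = pd.concat([df.iloc[4:5], df.iloc[4:5]])
df2 = pd.concat([df21, df22])
# Data rows (positions 1-2 and 5-6)
df1 = pd.concat([df.iloc[1:3], df.iloc[5:7]])
# Align the indexes, then join the two frames side by side
df1.reset_index(drop=True, inplace=True)
df2.reset_index(drop=True, inplace=True)
df_f = pd.concat([df1, df2], axis=1)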
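If the sections vary in number or length, hard-coded values and positions stop working. Below is a minimal sketch of a more general approach; it is not part of the answer above. It assumes a header row is one that has a value in Col1 but nothing in Col4, and that a total row has "Total" in Col3. The names is_header, is_total, labels and result are only illustrative.

```python
import io
import pandas as pd

# Same layout as the table in the question, without the padding spaces
data = """Col1|Col2|Col3|Col4
Value11|Value21|Value31|
stuff|stuff|stuff|2.0
stuff|stuff|stuff|3.0
||Total|5.0
Value12|Value22|Value32|
stuff|stuff|stuff|6.0
stuff|stuff|stuff|4.0
||Total|10.0"""
df = pd.read_csv(io.StringIO(data), sep="|")

# Assumption: header rows have a value in Col1 and nothing in Col4;
# total rows have "Total" in Col3
is_header = df["Col1"].notna() & df["Col4"].isna()
is_total = df["Col3"].eq("Total")

# Take the header values, spread them over the full index (NaN elsewhere),
# then forward-fill them down to every row of their section
labels = df.loc[is_header, ["Col1", "Col2", "Col3"]].reindex(df.index).ffill()
labels.columns = ["Col5", "Col6", "Col7"]

# Attach the labels and keep only the data rows
result = (
    pd.concat([df, labels], axis=1)
    .loc[~is_header & ~is_total]
    .reset_index(drop=True)
)
print(result)
```

The forward fill is what ties each data row to the nearest header row above it, so the same code should handle any number of sections as long as the two assumptions about header and total rows hold.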