How to iteratively fill a pandas DataFrame with columns
I am trying to build a pandas DataFrame by iteratively computing statistics from another DataFrame; the loop goes through columns that are selected with a regex filter. How can I create the result DataFrame?
Input DataFrame:
In [4]: control.head()
Out[4]:
  Patient Gender  Age  Left-Lateral-Ventricle_NVoxels  Left-Inf-Lat-Vent_NVoxels  ...  supramarginal_CurvInd_lh
0    P008      M   30                            9414                        311  ...                       7.5
1    P013      F   35                            7668                         85  ...                      10.4
2    P018      F   27                            7350                        202  ...                       8.0
3    P033      F   55                            7548                        372  ...                       9.2
4    P036      F   31                            8598                         48  ...                       8.0
[5 rows x 930 columns]
I wrote code to compute the statistics, but I am stuck on creating the result pandas DataFrame:
def select_volumes(group_c, group_k):
    Select_list = ["Amygdala", "Hippocampus", "Lateral-Ventricle",
                   "Pallidum", "Putamen", "Thalamus"]
    Side = ["Left", "Right"]
    for s in Side:
        for struct in Select_list:
            # keep only columns whose name contains the side, the structure and "Volume"
            volumes_c = group_c.filter(regex="^(?=.*" + s + ")(?=.*" + struct + ")(?=.*Volume)")
            volumes_k = group_k.filter(regex="^(?=.*" + s + ")(?=.*" + struct + ")(?=.*Volume)")
            k = cohens_d(volumes_c, volumes_k)
            meand = volumes_c.mean()
            # this is the part I am stuck on ("some result" is a placeholder)
            result_df = pd.DataFrame(
                {
                    "Cohen's norm": some result,
                    "Mean Value": meand,
                }
            )
    return k
The function select_volumes gives me this result:
Left-Amygdala_Volume_mm3 -0.29729
dtype: float64
Left-Hippocampus_Volume_mm3 0.33139
dtype: float64
Left-Lateral-Ventricle_Volume_mm3 -0.111853
dtype: float64
Left-Pallidum_Volume_mm3 0.28857
dtype: float64
Left-Putamen_Volume_mm3 0.696645
dtype: float64
Left-Thalamus-Proper_Volume_mm3 0.772492
dtype: float64
Right-Amygdala_Volume_mm3 -0.358333
dtype: float64
Right-Hippocampus_Volume_mm3 0.275668
dtype: float64
Right-Lateral-Ventricle_Volume_mm3 -0.092283
dtype: float64
Right-Pallidum_Volume_mm3 0.279258
dtype: float64
Right-Putamen_Volume_mm3 0.484879
dtype: float64
Right-Thalamus-Proper_Volume_mm3 0.809775
dtype: float64
I want Left-Amygdala_Volume_mm3 ... to be a row with the value -0.29729 in a column named Cohen's d, with one such row for every entry of Select_list.
I still do not fully understand how and where, but you showed that somewhere in the function you are able to build a float64 Series with, for example, Left-Amygdala_Volume_mm3 as index and -0.29729 as value. I assume that at the same point you also have the value of meand for the same index label.
More precisely, I will assume:
k = pd.Series([-0.29729], dtype=np.float64,index=['Left-Amygdala_Volume_mm3'])
because it prints as:
print(k)
Left-Amygdala_Volume_mm3 -0.29729
dtype: float64
I also assume that meand is a similar one-element Series, so its value can be read with meand.iloc[0] (let's say the value is 9174.1).
You should combine them to build the content of a row:
row = k.reset_index().iloc[0].tolist() + [meand.iloc[0]]
In the example, row is: ['Left-Amygdala_Volume_mm3', -0.29729, 9174.1]
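The row construction can be checked in isolation. This is a minimal sketch under the same assumptions as above: k and meand are one-element Series, and 9174.1 is an invented mean used only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed inputs: one-element Series shaped like the printed output above.
k = pd.Series([-0.29729], dtype=np.float64, index=["Left-Amygdala_Volume_mm3"])
meand = pd.Series([9174.1], index=["Left-Amygdala_Volume_mm3"])  # made-up mean value

# reset_index() moves the index label into an ordinary column, so the first
# row holds [label, Cohen's d]; the scalar mean is then appended as a third item.
row = k.reset_index().iloc[0].tolist() + [meand.iloc[0]]
print(row)
```

Using reset_index() here is just a compact way to get the label and the value together; [k.index[0], k.iloc[0], meand.iloc[0]] would build an equivalent row.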
So you now need to build a list of such rows:
import pandas as pd

def select_volumes(group_c, group_k):
    Select_list = ["Amygdala", "Hippocampus", "Lateral-Ventricle",
                   "Pallidum", "Putamen", "Thalamus"]
    Side = ["Left", "Right"]
    data = []
    for s in Side:
        for struct in Select_list:
            # keep only columns whose name contains the side, the structure and "Volume"
            volumes_c = group_c.filter(regex="^(?=.*" + s + ")(?=.*" + struct + ")(?=.*Volume)")
            volumes_k = group_k.filter(regex="^(?=.*" + s + ")(?=.*" + struct + ")(?=.*Volume)")
            k = cohens_d(volumes_c, volumes_k)
            meand = volumes_c.mean()
            # build a row of the result df: [column name, Cohen's d, mean]
            data.append(k.reset_index().iloc[0].tolist() + [meand.iloc[0]])
    # after the loop, combine the rows into a dataframe and return it:
    result = pd.DataFrame(data, columns=['index', "Cohen's d", 'Mean']).set_index('index')
    return result
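To try the approach end to end, here is a self-contained sketch on synthetic data. The cohens_d below is a hypothetical stand-in (pooled-standard-deviation Cohen's d per column), since the original function is not shown. The random values, the reduced structure list, and the DataFrame names are also assumptions made only for this example.

```python
import numpy as np
import pandas as pd


def cohens_d(a, b):
    """Hypothetical stand-in: Cohen's d per column using the pooled std."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var() + (n_b - 1) * b.var()) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def select_volumes(group_c, group_k):
    structures = ["Amygdala", "Hippocampus"]  # shortened list for the demo
    sides = ["Left", "Right"]
    data = []
    for s in sides:
        for struct in structures:
            pattern = "^(?=.*" + s + ")(?=.*" + struct + ")(?=.*Volume)"
            volumes_c = group_c.filter(regex=pattern)
            volumes_k = group_k.filter(regex=pattern)
            k = cohens_d(volumes_c, volumes_k)
            meand = volumes_c.mean()
            # one row per matched column: [column name, Cohen's d, mean]
            data.append(k.reset_index().iloc[0].tolist() + [meand.iloc[0]])
    return pd.DataFrame(data, columns=["index", "Cohen's d", "Mean"]).set_index("index")


# Synthetic stand-ins for the two groups (values are random, not real data).
rng = np.random.default_rng(0)
cols = [f"{s}-{st}_Volume_mm3"
        for s in ("Left", "Right") for st in ("Amygdala", "Hippocampus")]
control = pd.DataFrame(rng.normal(1500, 100, size=(20, 4)), columns=cols)
patolog = pd.DataFrame(rng.normal(1450, 100, size=(20, 4)), columns=cols)

result = select_volumes(control, patolog)
print(result)
```

The result has one row per matched volume column and two columns, Cohen's d and Mean. Collecting plain lists and building the DataFrame once after the loop also avoids growing a DataFrame row by row.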
I ended up collecting the rows in a list inside the function:
cohen = cohens_d(volumes_c, volumes_k)
meand = volumes_c.mean()
volumes_df.append([cohen.index[0], cohen.values[0], meand.iloc[0]])
return volumes_df
and outside the function I build the DataFrame with:
finaldf = pd.DataFrame(select_volumes(control, patolog))
finaldf.columns = ['Structure', 'Cohensd', 'Meand']