How to create a data frame from the results of an if statement on three data frames
Dynamically create a string from three data frames
I have three data frames, shown below. The first is df and the second holds the anomalies:
import pandas as pd

d = {'10028': [0], '1058': [25], '20120': [29], '20121': [22],'20122': [0], '20123': [0], '5043': [0], '5046': [0]}
df1 = pd.DataFrame(data=d)
The anomaly frame is essentially a mirror copy of df, except that its values are 0 or 1: 1 marks an anomaly and 0 a non-anomaly.
d = {'10028': [0], '1058': [1], '20120': [1], '20121': [0],'20122': [0], '20123': [0], '5043': [0], '5046': [0]}
df2 = pd.DataFrame(data=d)
The third data frame is as follows:
d = {'10028': ['US,IN'], '1058': ['NA, JO, US'], '20120': [''], '20121': ['US,PK'],'20122': ['IN'], '20123': ['Us,LN'], '5043': ['AI,AL'], '5046': ['AA,AB']}
df3 = pd.DataFrame(data=d)
details = (
    '\n' + 'Metric Name' + '\t' + 'Count' + '\t' + 'Anomaly' + '\t' + 'Country' +
    '\n' + '10028:' + '\t' + '\t' + str(df1.tail(1)['10028'][0]) + '\t' + str(df2['10028'][0]) + '\t' + str(df3['10028'][0]) +
    '\n' + '1058:' + '\t' + '\t' + str(df1.tail(1)['1058'][0]) + '\t' + str(df2['1058'][0]) + '\t' + str(df3['1058'][0]) +
    '\n' + '20120:' + '\t' + '\t' + str(df1.tail(1)['20120'][0]) + '\t' + str(df2['20120'][0]) + '\t' + str(df3['20120'][0]) +
    '\n' + '20121:' + '\t' + '\t' + str(round(df1.tail(1)['20121'][0], 2)) + '\t' + str(df2['20121'][0]) + '\t' + str(df3['20121'][0]) +
    '\n' + '20122:' + '\t' + '\t' + str(round(df1.tail(1)['20122'][0], 2)) + '\t' + str(df2['20122'][0]) + '\t' + str(df3['20122'][0]) +
    '\n' + '20123:' + '\t' + '\t' + str(round(df1.tail(1)['20123'][0], 3)) + '\t' + str(df2['20123'][0]) + '\t' + str(df3['20123'][0]) +
    '\n' + '5043:' + '\t' + '\t' + str(round(df1.tail(1)['5043'][0], 3)) + '\t' + str(df2['5043'][0]) + '\t' + str(df3['5043'][0]) +
    '\n' + '5046:' + '\t' + '\t' + str(round(df1.tail(1)['5046'][0], 3)) + '\t' + str(df2['5046'][0]) + '\t' + str(df3['5046'][0]) +
    '\n\n' + 'message:' + '\t' +
    'Something wrong with the platform as there is a spike in [values where anomalies == 1].'
)
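The problem is that the column names change on every run. In this run they are '10028', '1058', '20120', '20121', '20122', '20123', '5043', '5046',
but in the next run they might be '10029', '1038', '20121', '20122', '20123', '5083', '5946'.
How can I build details dynamically from whatever columns are present in the data frames? I don't want to hard-code them, and in the message I want to include the names of the columns whose value is 1.
For df1 and df2 the column values will always be 1 or 0, while for df3 they will be a list or blank.
For two data frames I already have a working solution, shown below: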
# first part of the string
s = '\n' + 'Metric Name' + '\t' + 'Count' + '\t' + 'Anomaly'
# dynamically add the data
# (Series.iteritems() was removed in pandas 2.0; items() is the replacement)
for idx, val in df1.iloc[-1].items():
    s += f'\n{idx}\t{val}\t{df2[idx][0]}'
# last part
s += ('\n\n' + 'message:' + '\t' +
      'Something wrong with the platform as there is a spike in [values where anomalies == 1].'
      )
If no matching value exists, print null.
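One possible way to extend the two-frame loop above to three frames is sketched here. It is only an illustration under assumptions: the helper names `last_value` and `build_details` are hypothetical, the sample frames are shortened copies of the ones in the question, and a column missing from a frame is rendered as the text null, as requested.

```python
import pandas as pd

# Shortened sample frames, same shape as in the question.
df1 = pd.DataFrame({'10028': [0], '1058': [25], '20120': [29]})
df2 = pd.DataFrame({'10028': [0], '1058': [1], '20120': [1]})
df3 = pd.DataFrame({'10028': ['US,IN'], '1058': ['NA, JO, US'], '20120': ['']})


def last_value(frame, column, default='null'):
    """Return the last-row value of `column`, or `default` if the column is absent."""
    if column not in frame.columns:
        return default
    return frame[column].iloc[-1]


def build_details(counts, anomalies, countries):
    # Header row, same layout as the hard-coded version.
    lines = ['Metric Name\tCount\tAnomaly\tCountry']
    # Iterate over whatever columns the count frame has in this run.
    for col in counts.columns:
        lines.append(
            f"{col}:\t\t{last_value(counts, col)}"
            f"\t{last_value(anomalies, col)}\t{last_value(countries, col)}"
        )
    # Names of the columns whose last anomaly flag equals 1.
    flagged = [c for c in anomalies.columns if anomalies[c].iloc[-1] == 1]
    message = ('Something wrong with the platform as there is a spike in '
               + ', '.join(flagged) + '.')
    return '\n' + '\n'.join(lines) + '\n\nmessage:\t' + message


details = build_details(df1, df2, df3)
print(details)
```

Nothing here depends on specific column names, so the same call should work when the next run delivers a different set of metrics. If no column is flagged, the message ends with an empty list of names; a caller may want to skip the message in that case.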
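To get the expected result, you can do the following (the input data must be dictionaries as shown in the question; if it is not, please provide the real input data):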
import pandas as pd

final_d = []
d = {'10028': 0, '1058': 25, '20120': 29, '20121': 22,'20122': 0, '20123': 0, '5043': 0, '5046': 0}
final_d.append(d)
d = {'10028': 0, '1058': 1, '20120': 1, '20121': 0,'20122': 0, '20123': 0, '5043': 0, '5046': 0, '91111':0}
final_d.append(d)
d = {'10028': ['US','IN'], '1058': ['NA', 'JO', 'US'], '20120': [''], '20121': ['US','PK'],'20122': ['IN'], '20123': ['Us','LN'], '5043': ['AI','AL'], '5046': ['AA','AB'], '00000':['kk','dd','ee']}
final_d.append(d)

# Now, we will merge the dictionaries on key
data = {}
for i, dt in enumerate(final_d):
    for k, v in dt.items():
        if k not in data:
            # one slot per input dictionary; a missing value stays ''
            data[k] = [''] * len(final_d)
        if isinstance(v, list):
            data[k][i] = ','.join(v)
        else:
            data[k][i] = v
maxlen = max([len(v) for v in data.values()])
data = {k: v if len(v) == maxlen else v + [''] * (maxlen - len(v)) for k, v in data.items()}

# Creating the base dataframe
df = pd.DataFrame.from_dict(data)
# Converting the column headers (metric names) into a row in the dataframe
df = pd.concat([pd.DataFrame.from_dict({k: [v] for k, v in zip(df.columns.tolist(), df.columns.tolist())}), df], ignore_index=True)
# removing column names
df.columns = [''] * len(df.columns)
# organising the dataframe according to your required output
result = df.T.reset_index(drop=True)
# Adding the column names as required
result.columns = ['Metric Name', 'Count', 'Anomaly', 'Country']
# Voila!
print(result.to_string(index=False))
The resulting dataframe:
Metric Name Count Anomaly  Country
      10028     0       0    US,IN
       1058    25       1 NA,JO,US
      20120    29       1
      20121    22       0    US,PK
      20122     0       0       IN
      20123     0       0    Us,LN
       5043     0       0    AI,AL
       5046     0       0    AA,AB
      91111             0
      00000               kk,dd,ee
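If the inputs are already data frames rather than dictionaries, a similar table can probably be built by reindexing the last row of each frame onto the combined list of metric names. The sketch below is an assumption-laden illustration, not part of the original answer: the names `metrics`, `last_row`, `table` and `flagged` are hypothetical, the sample frames are shortened, and missing entries are filled with the string 'null' instead of a blank.

```python
import pandas as pd

# Shortened sample frames; '91111' and '00000' each appear in only one frame.
df1 = pd.DataFrame({'10028': [0], '1058': [25], '20120': [29]})
df2 = pd.DataFrame({'10028': [0], '1058': [1], '20120': [1], '91111': [0]})
df3 = pd.DataFrame({'10028': ['US,IN'], '1058': ['NA, JO, US'], '00000': ['kk,dd,ee']})

# All metric names across the three frames, in first-seen order, no duplicates.
metrics = list(dict.fromkeys([*df1.columns, *df2.columns, *df3.columns]))


def last_row(frame):
    # Last row as an object Series, aligned to `metrics`; gaps become 'null'.
    return frame.iloc[-1].astype(object).reindex(metrics, fill_value='null').to_numpy()


table = pd.DataFrame({
    'Metric Name': metrics,
    'Count': last_row(df1),
    'Anomaly': last_row(df2),
    'Country': last_row(df3),
})

# Metric names whose anomaly flag is 1, for use in the message text.
flagged = table.loc[table['Anomaly'] == 1, 'Metric Name'].tolist()

print(table.to_string(index=False))
print('Something wrong with the platform as there is a spike in ' + ', '.join(flagged) + '.')
```

Casting each row to object before reindexing keeps the integer counts from being converted to floats when the 'null' placeholder is introduced.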