How to generate n columns based on the rows of a previous column in Python? I am very new to Python and pandas DataFrames.
I have CSV file data as below:
ModelNumber  Variables
-----------  ----------
208          TotalTerms
208          Children
208          Property
208          isMarried
207          HasLoan
207          Children
How can I generate the output below?
ModelNumber  Variable1   Variable2  Variable3  Variable4
-----------  ----------  ---------  ---------  ---------
208          TotalTerms  Children   Property   isMarried
207          HasLoan     Children
I think a better fit for your problem is to use pivot_table and make each variable its own column, instead of Variable1, Variable2, and so on. Then simply use 1/0 (True/False) to mark whether each variable occurs for each model number:
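import pandas as pd

# Sample data from the question
df_1 = pd.DataFrame({'ModelNumber': [208, 208, 208, 208, 207, 207],
                     'Variables': ['TotalTerms', 'Children', 'Property', 'isMarried', 'HasLoan', 'Children']})
# Count rows per (ModelNumber, Variables) pair; fill_value=0 marks absent pairs with 0
df_output = pd.pivot_table(df_1, index='ModelNumber', columns='Variables', aggfunc='size', fill_value=0)
print(df_output)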
Output:
Variables    Children  HasLoan  Property  TotalTerms  isMarried
ModelNumber
207                 1        1         0           0          0
208                 1        0         1           1          1
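If the Variable1 ... VariableN layout from the question is still needed, one possible sketch numbers each row within its ModelNumber using groupby().cumcount() and then pivots on that position. The frame is rebuilt from the sample data; the helper column name `pos` and the result name `wide` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ModelNumber': [208, 208, 208, 208, 207, 207],
    'Variables': ['TotalTerms', 'Children', 'Property', 'isMarried', 'HasLoan', 'Children'],
})

# Position of each row within its ModelNumber group: 1, 2, 3, ...
df['pos'] = df.groupby('ModelNumber').cumcount() + 1

# One row per ModelNumber, one column per position
wide = df.pivot(index='ModelNumber', columns='pos', values='Variables')

# Rename the position labels 1, 2, ... to Variable1, Variable2, ...
wide.columns = [f'Variable{i}' for i in wide.columns]

# pivot sorts the index, so restore the order of first appearance,
# then replace the gaps left by shorter groups with empty strings
wide = wide.reindex(df['ModelNumber'].unique()).fillna('').reset_index()
print(wide)
```

The number of Variable columns follows the largest group, so nothing has to be hard-coded for n.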
I'll write out the steps so it will be easier for you to follow.
Step 1: Read the CSV file.
Step 2: While reading, put the data in a dict, with ModelNumber as the key and its Variables as the elements of an array. If the ModelNumber is already in the dict, append the variable to its array; if not, add the ModelNumber to the dict with an empty array as its value, then append the variable to that array.
Example data representation based on your data:
{
"208": ["TotalTerms", "Children", "Property", "isMarried"],
"207": ["HasLoan", "Children"]
}
Step 3: Export this data back to CSV.
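The three steps above can be sketched with only the standard library. This is a minimal sketch rather than the answerer's own code: the function names `group_variables`, `write_wide` and `convert`, and the assumption that the input is comma-separated with a `ModelNumber,Variables` header row, are all illustrative.

```python
import csv

def group_variables(rows):
    """Step 2: map each ModelNumber to the list of its Variables, in file order."""
    grouped = {}
    for row in rows:
        # setdefault creates the empty list the first time a ModelNumber is seen
        grouped.setdefault(row['ModelNumber'], []).append(row['Variables'])
    return grouped

def write_wide(grouped, out_file):
    """Step 3: write one row per ModelNumber, padded to the longest list."""
    width = max((len(v) for v in grouped.values()), default=0)
    writer = csv.writer(out_file)
    writer.writerow(['ModelNumber'] + [f'Variable{i}' for i in range(1, width + 1)])
    for model, variables in grouped.items():
        writer.writerow([model] + variables + [''] * (width - len(variables)))

def convert(in_path, out_path):
    # Step 1: read the csv file (assumed comma-separated with a header row)
    with open(in_path, newline='') as src:
        grouped = group_variables(csv.DictReader(src))
    with open(out_path, 'w', newline='') as dst:
        write_wide(grouped, dst)
```

Calling `convert('input.csv', 'output.csv')`, with both file names being placeholders, would run all three steps in order. Keys stay strings here because the csv module does not convert types.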