How to insert a pandas dataframe having a single csv column into MySQL Database
I have a pandas DataFrame that I read from a Google Sheet. I then added the tag column using:
df['tag'] = df.filter(like = 'Subject', axis = 1).apply(lambda x: np.where(x == 'Y', x.name,'')).values.tolist()
df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= ''])
Resultant sample DataFrame:
   Id Name Subject-A Subject-B  Total                     tag
0   1    A         Y              100             [Subject-A]
1   2    B                   Y     98             [Subject-B]
2   3    C         Y         Y    191  [Subject-A, Subject-B]
3   4    D                   Y    100             [Subject-B]
4   5    E                   Y     95             [Subject-B]
Then I export the DataFrame to a MySQL database after converting the tag column into a comma-separated string:
df['tag'] = df['tag'].map(lambda x : ', '.join(str(i) for i in x)).str.replace('Subject-','')
df
   Id Name Subject-A Subject-B  Total   tag
0   1    A         Y              100     A
1   2    B                   Y     98     B
2   3    C         Y         Y    191  A, B
3   4    D                   Y    100     B
4   5    E                   Y     95     B
df.to_sql(name = 'table_name', con = conn, if_exists = 'replace', index = False)
But in the MySQL database the tag column is:
A,
,B
A,B
,B
,B
My actual data has many such "Subject" columns, so the result looks like:
, , , D
A, ,C,
...
...
Could someone please let me know why the output is as expected in pandas, but the column looks different once I save the DataFrame to Cloud SQL? The expected output in the MySQL database is the same as how the tag column appears in pandas.
Here is an alternative solution; this looks like a data-related problem.
First filter the Subject columns and remove the Subject- prefix from their names, then use DataFrame.dot with the column names plus a separator, and finally strip the separator from the right side:
df1 = df.filter(like = 'Subject').rename(columns=lambda x: x.replace('Subject-',''))
print (df1)
A B
0 Y NaN
1 NaN Y
2 Y Y
3 NaN Y
4 NaN Y
df['tag'] = df1.eq('Y').dot(df1.columns + ', ').str.rstrip(', ')
print (df)
Id Name Subject-A Subject-B Total tag
0 1 A Y NaN 100 A
1 2 B NaN Y 98 B
2 3 C Y Y 191 A, B
3 4 D NaN Y 100 B
4 5 E NaN Y 95 B
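To check the whole flow end to end, here is a minimal self-contained sketch. It rebuilds the sample data by hand (assuming blank sheet cells arrive as NaN) and writes to an in-memory SQLite database as a stand-in for MySQL, so the connection and the read-back query are assumptions for illustration, not the setup from the question:

```python
import sqlite3

import numpy as np
import pandas as pd

# Sample data mirroring the question; blanks are modelled as NaN (assumption).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Name": ["A", "B", "C", "D", "E"],
    "Subject-A": ["Y", np.nan, "Y", np.nan, np.nan],
    "Subject-B": [np.nan, "Y", "Y", "Y", "Y"],
    "Total": [100, 98, 191, 100, 95],
})

# Keep only the Subject columns and drop the "Subject-" prefix from their names.
subjects = df.filter(like="Subject").rename(columns=lambda c: c.replace("Subject-", ""))

# Boolean mask dotted with "name, " strings concatenates the names of the
# columns that hold 'Y'; the trailing separator is then stripped.
df["tag"] = subjects.eq("Y").dot(subjects.columns + ", ").str.rstrip(", ")

# SQLite in memory stands in for the MySQL connection (assumption).
conn = sqlite3.connect(":memory:")
df.to_sql(name="table_name", con=conn, if_exists="replace", index=False)

# Read the column back to compare it with what pandas shows.
stored = pd.read_sql("SELECT tag FROM table_name", conn)["tag"].tolist()
conn.close()

print(stored)  # ['A', 'B', 'A, B', 'B', 'B']
```

If the values read back match df['tag'] here but not with your real data, the difference most likely comes from the cell contents in the sheet rather than from to_sql.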