How to insert a pandas dataframe having a single csv column into MySQL Database

I have a pandas dataframe that I read from a Google Sheet. I then added the tag column using:

df['tag'] = df.filter(like = 'Subject', axis = 1).apply(lambda x: np.where(x == 'Y', x.name,'')).values.tolist()
df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= ''])

Resultant sample DataFrame:

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     [Subject-A]
1   2   B                   Y           98      [Subject-B]
2   3   C       Y           Y           191     [Subject-A, Subject-B]
3   4   D                   Y           100     [Subject-B]
4   5   E                   Y           95      [Subject-B]

Then I export the dataframe to a MySQL database after converting the tag column into a comma-separated string:

df['tag'] = df['tag'].map(lambda x : ', '.join(str(i) for i in x)).str.replace('Subject-','')
df

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     A
1   2   B                   Y           98      B
2   3   C       Y           Y           191     A, B
3   4   D                   Y           100     B
4   5   E                   Y           95      B

df.to_sql(name = 'table_name', con = conn, if_exists = 'replace', index = False)

But in the MySQL database, the tag column is:

A,
,B
A,B
,B
,B

My actual data has many such "Subject" columns, so the result looks like:

, , , D
A, ,C,
...
...

Could someone please let me know why Pandas shows the expected output, but the column looks different once I save the dataframe in Cloud SQL? The expected output in the MySQL database is the same as how the tag column appears in Pandas.
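One way to narrow this down is to inspect the raw values rather than the printed frame, since empty strings and whitespace-only cells look the same in a DataFrame printout. The following is only a minimal sketch under assumptions: the sample frame is made up, the blank cells are assumed to hold a mix of '', ' ' and NaN (as can happen with spreadsheet data), and the helper name build_tag is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blanks are a mix of '', ' ' and NaN.
df = pd.DataFrame({
    "Id": [1, 2, 3],
    "Name": ["A", "B", "C"],
    "Subject-A": ["Y", " ", "Y"],
    "Subject-B": ["", "Y", "Y "],
    "Subject-C": [np.nan, "", "Y"],
})


def build_tag(frame: pd.DataFrame) -> pd.Series:
    """Join the names of the Subject columns marked 'Y', ignoring stray whitespace."""
    subjects = frame.filter(like="Subject")
    # Convert each cell to a stripped string so ' Y' or 'Y ' still count as 'Y'.
    marked = subjects.apply(lambda col: col.astype(str).str.strip().eq("Y"))
    names = [c.replace("Subject-", "") for c in subjects.columns]
    # Keep only the names whose cell is marked, so no empty items are joined.
    return marked.apply(
        lambda row: ", ".join(n for n, hit in zip(names, row) if hit), axis=1
    )


df["tag"] = build_tag(df)

# repr() exposes hidden spaces or empty items that print(df) would hide.
print(df["tag"].map(repr).tolist())
```

If the repr of the values already contains stray commas or spaces before the call to to_sql, the difference comes from the data or from the order in which the cells were run, not from MySQL.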

Here is an alternative solution; this looks like a data-related problem.

First select the Subject columns and remove the Subject- prefix from their names, then use DataFrame.dot with the column names plus a separator, and finally strip the trailing separator from the right side:

df1 = df.filter(like = 'Subject').rename(columns=lambda x: x.replace('Subject-',''))
print (df1)
     A    B
0    Y  NaN
1  NaN    Y
2    Y    Y
3  NaN    Y
4  NaN    Y

df['tag'] = df1.eq('Y').dot(df1.columns  + ', ').str.rstrip(', ')
print (df)
   Id Name Subject-A Subject-B  Total   tag
0   1    A         Y       NaN    100     A
1   2    B       NaN         Y     98     B
2   3    C         Y         Y    191  A, B
3   4    D       NaN         Y    100     B
4   5    E       NaN         Y     95     B
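To check that the stored values match what Pandas shows, the frame can be written and read back. The sketch below is not the asker's setup: it assumes an in-memory SQLite connection as a stand-in for the MySQL connection, and it rebuilds the sample data by hand with empty cells as missing values.

```python
import sqlite3

import pandas as pd

# Sample data modelled on the frame above; empty cells are missing values.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "Name": list("ABCDE"),
    "Subject-A": ["Y", None, "Y", None, None],
    "Subject-B": [None, "Y", "Y", "Y", "Y"],
    "Total": [100, 98, 191, 100, 95],
})

df1 = df.filter(like="Subject").rename(columns=lambda x: x.replace("Subject-", ""))

# True * 'A, ' is 'A, ' and False * 'A, ' is '', so the dot product
# concatenates the labels of the columns that equal 'Y'.
df["tag"] = df1.eq("Y").dot(df1.columns + ", ").str.rstrip(", ")

# Assumption: SQLite in memory stands in for the real MySQL connection.
conn = sqlite3.connect(":memory:")
df.to_sql(name="table_name", con=conn, if_exists="replace", index=False)

# Read the column back and show the raw strings that were stored.
stored = pd.read_sql("SELECT Id, tag FROM table_name ORDER BY Id", conn)
print(stored["tag"].map(repr).tolist())
conn.close()
```

This only confirms that the pandas side writes the strings as displayed. A MySQL connection may behave differently, so the same read-back query is worth running against the real table.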
