简体   繁体   中英

How to insert a pandas dataframe having a single csv column into MySQL Database

I have a pandas dataframe that I read from google sheet. I then added the tag column using:

df['tag'] = df.filter(like = 'Subject', axis = 1).apply(lambda x: np.where(x == 'Y', x.name,'')).values.tolist()
df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= ''])

Resultant sample DataFrame:

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     [Subject-A]
1   2   B                   Y           98      [Subject-B]
2   3   C       Y           Y           191     [Subject-A, Subject-B]
3   4   D                   Y           100     [Subject-B]
4   5   E                   Y           95      [Subject-B]

Then I export the dataframe to a MySQL Database after converting the tag column into a comma separated string by:

df['tag'] = df['tag'].map(lambda x : ', '.join(str(i) for i in x)).str.replace('Subject-','')
df

    Id  Name    Subject-A   Subject-B   Total   tag
0   1   A       Y                       100     A
1   2   B                   Y           98      B
2   3   C       Y           Y           91      A, B
3   4   D                   Y           100     B
4   5   E                   Y           95      B

df.to_sql(name = 'table_name', con = conn, if_exists = 'replace', index = False)

But in the MySQL database the tag columns is:

A,
,B
A,B
,B
,B

My actual data has many such "Subject" columns so the result looks like:

, , , D
A, ,C,
...
...

Could someone please let me know why it's giving expected out in Pandas but when I save the dataframe in cloud SQL, the column looks different. The expected output in MySQL database is same as how the tag column is appearing in Pandas.

Here is alternative solution, seems some data related problem.

First filter Subject columns with remove Subject- and then use DataFrame.dot with columns names with separator, last strip separator from right side:

df1 = df.filter(like = 'Subject').rename(columns=lambda x: x.replace('Subject-',''))
print (df1)
     A    B
0    Y  NaN
1  NaN    Y
2    Y    Y
3  NaN    Y
4  NaN    Y

df['tag'] = df1.eq('Y').dot(df1.columns  + ', ').str.rstrip(', ')
print (df)
   Id Name Subject-A Subject-B  Total   tag
0   1    A         Y       NaN    100     A
1   2    B       NaN         Y     98     B
2   3    C         Y         Y    191  A, B
3   4    D       NaN         Y    100     B
4   5    E       NaN         Y     95     B

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM