I have a pandas dataframe that I read from google sheet. I then added the tag
column using:
df['tag'] = df.filter(like = 'Subject', axis = 1).apply(lambda x: np.where(x == 'Y', x.name,'')).values.tolist()
df['tag'] = df['tag'].apply(lambda x: [i for i in x if i!= ''])
Resultant sample DataFrame:
Id Name Subject-A Subject-B Total tag
0 1 A Y 100 [Subject-A]
1 2 B Y 98 [Subject-B]
2 3 C Y Y 191 [Subject-A, Subject-B]
3 4 D Y 100 [Subject-B]
4 5 E Y 95 [Subject-B]
Then I export the dataframe to a MySQL Database after converting the tag
column into a comma separated string by:
df['tag'] = df['tag'].map(lambda x : ', '.join(str(i) for i in x)).str.replace('Subject-','')
df
Id Name Subject-A Subject-B Total tag
0 1 A Y 100 A
1 2 B Y 98 B
2 3 C Y Y 91 A, B
3 4 D Y 100 B
4 5 E Y 95 B
df.to_sql(name = 'table_name', con = conn, if_exists = 'replace', index = False)
But in the MySQL database the tag
columns is:
A,
,B
A,B
,B
,B
My actual data has many such "Subject" columns so the result looks like:
, , , D
A, ,C,
...
...
Could someone please let me know why it's giving expected out in Pandas but when I save the dataframe in cloud SQL, the column looks different. The expected output in MySQL database is same as how the tag
column is appearing in Pandas.
Here is alternative solution, seems some data related problem.
First filter Subject
columns with remove Subject-
and then use DataFrame.dot
with columns names with separator, last strip separator from right side:
df1 = df.filter(like = 'Subject').rename(columns=lambda x: x.replace('Subject-',''))
print (df1)
A B
0 Y NaN
1 NaN Y
2 Y Y
3 NaN Y
4 NaN Y
df['tag'] = df1.eq('Y').dot(df1.columns + ', ').str.rstrip(', ')
print (df)
Id Name Subject-A Subject-B Total tag
0 1 A Y NaN 100 A
1 2 B NaN Y 98 B
2 3 C Y Y 191 A, B
3 4 D NaN Y 100 B
4 5 E NaN Y 95 B
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.