I want to apply FastText for my stack over flow tag predictor.
I have my tags as a dataframe:
df['Tags']
0 [php]
1 [firefox]
2 [r]
3 [c#]
4 [php, api]
...
179994 [php, flash]
179995 [delphi]
179996 [c]
179997 [android]
179998 [java, email]
Name: Tags, Length: 134222, dtype: object
I want to transform each element into a single string __label__XX__label__YY__
, so I tried:
tags=['__label__'.join(s) for s in df['Tags']]
This results in:
['php', 'firefox', 'r', 'c#', 'php__label__api', 'c#__label__asp.net', '.net__label__javascript', 'sql', '.net', 'algorithm', 'windows-7']
But I want my result as
['__label__php', '__label__firefox', '__label__r', '__label__c#', '__label__php__label__api', '__label__c#__label__asp.net', '__label__.net__label__javascript', '__label__sql', '__label__.net', '__label__algorithm', '__label__windows-7']
Try:
tags = ['__label__' + '__label__'.join(s) for s in df['Tags']]
Test:
labels = [['foo'], ['bar', 'baz']]
j = '__label__'
[j + j.join(l) for l in labels]
# out: ['__label__foo', '__label__bar__label__baz']
It's also worth looking at result = df['Tags'].applymap(lambda s: j+j.join(s))
but i did not test that.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.