I have a dataset like this. The first column is the word and the second column is the tag.
Pretty O
bad O
storm O
here O
last O
evening O
. O
From O
Green O
Newsfeed O
: O
AHFA B-group
extends O
deadline O
for O
Sage O
Award O
to O
Nov O
. O
I want to reconstruct the sentences, so that the output looks like this:
[[('Pretty', 'O'), ('bad', 'O'), ('storm', 'O'), ('here', 'O'), ('last', 'O'), ('evening', 'O'), ('.', 'B-geo')], [('From', 'O'), ('Green', 'O'), ('Newsfeed', 'O'), (':', 'O'), ('AHFA', 'B-group'), ('extends', 'O'), ('deadline', 'O'), ('for', 'O'), ('Sage', 'O'), ('Award', 'B-geo')], [('to', 'O'), ('Nov', 'O'), ('.', 'O')]]
Can someone help me reconstruct the sentences from this?
If you have:
import pandas as pd
a = pd.DataFrame([('Pretty', 'O'), ('bad', 'O'), ('storm', 'O'), ('here', 'O'), ('last', 'O'), ('evening', 'O'), ('.', 'B-geo')])
then to get: [('Pretty', 'O'), ('bad', 'O'), ('storm','O'), ('here', 'O'), ('last', 'O'), ('evening', 'O'), ('.', 'B-geo')]
You can do:
[tuple(u) for u in a.values.tolist()]
Then you can do this for each of your dataframes and collect the resulting lists of tuples into one list.
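The "do this for each dataframe and collect the results" step could be sketched roughly as follows. The helper name `frame_to_pairs` and the two small frames `first` and `second` are made up for illustration; they are not part of the original data.

```python
import pandas as pd

def frame_to_pairs(df):
    """Turn a two-column DataFrame into a list of (word, tag) tuples."""
    return [tuple(row) for row in df.values.tolist()]

# Hypothetical per-sentence frames, one DataFrame per sentence
first = pd.DataFrame([("Pretty", "O"), ("bad", "O"), ("storm", "O")])
second = pd.DataFrame([("The", "O"), ("World", "O")])

# One inner list per sentence, in the same order as the frames
sentences = [frame_to_pairs(df) for df in (first, second)]
print(sentences)
```

This keeps one inner list per sentence. If you wanted a single flat list instead, you would extend one list rather than append to it.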
If you have all your sentences in one dataframe like this:
a = pd.DataFrame([
    ('Pretty', 'O'),
    ('bad', 'O'),
    ('storm', 'O'),
    ('here', 'O'),
    ('last', 'O'),
    ('evening', 'O'),
    ('.', 'B-geo'),
    (' ', ''),
    ('The', 'O'),
    ('World', 'O'),
    ('is', 'O'),
    ('...', 'N-geo')
])
you can find the index of the " " row (the single-space value that separates sentences) and split your dataset like this:
index_list = a.index[a[0] == " "].tolist()
df1 = a.iloc[:index_list[0], :]
# start one row later so the separator row itself is not included
df2 = a.iloc[index_list[0] + 1:, :]
So in the end you'll have something like this:
def dataset_to_list_of_tuple(df):
    final_list = []
    # Positions of the separator rows (assumes the default 0..n-1 index)
    index_list = df.index[df[0] == " "].tolist()
    start = 0
    # Appending len(df) as a final boundary keeps the last sentence
    for end in index_list + [len(df)]:
        df_part = df.iloc[start:end, :]
        if not df_part.empty:
            sentence = [tuple(u) for u in df_part.values.tolist()]
            final_list.append(sentence)
        start = end + 1  # skip the separator row itself
    return final_list
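As an alternative to slicing by position, the same split can be sketched with a running counter and `groupby`. This is a different technique from the function above, not a replacement for it. The name `split_on_separator` and its `sep` parameter are assumptions made for this example, and it still assumes a " " row marks each sentence boundary.

```python
import pandas as pd

def split_on_separator(df, sep=" "):
    """Split a (word, tag) DataFrame into sentences at rows whose word == sep."""
    is_sep = df[0] == sep
    # The counter increases at every separator, so each sentence gets its own id
    group_id = is_sep.cumsum()
    words = df[~is_sep]  # drop the separator rows themselves
    return [
        list(part.itertuples(index=False, name=None))
        for _, part in words.groupby(group_id[~is_sep])
    ]

a = pd.DataFrame([
    ('Pretty', 'O'), ('bad', 'O'), ('storm', 'O'), ('here', 'O'),
    ('last', 'O'), ('evening', 'O'), ('.', 'B-geo'),
    (' ', ''),
    ('The', 'O'), ('World', 'O'), ('is', 'O'), ('...', 'N-geo'),
])

sentences = split_on_separator(a)
print(sentences)
```

Because the grouping key follows the row labels rather than row positions, this version does not depend on the dataframe having the default 0..n-1 index, and two separator rows in a row simply produce no empty sentence.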