Suppose I have per-sentence word counts, where each sentence is one instance.
For example, this is the data for the sentences "I love apple love" and "Oh my god apple apple apple": data = [[("I", 1), ("love", 2), ("apple", 1)], [("Oh", 1), ("my", 1), ("god", 1), ("apple", 3)]]
I want to convert this to a 2-D NumPy array in which each feature (column) is a word and each value is that word's frequency in the sentence. In this case:
sentence id | I | love | apple | Oh | my | god |
---|---|---|---|---|---|---|
0 | 1 | 2 | 1 | 0 | 0 | 0 |
1 | 0 | 0 | 3 | 1 | 1 | 1 |
>>> import pandas as pd
>>> data = [[("I", 1), ("love", 2), ("apple", 1)],[("Oh", 1), ("my", 1), ("god", 1), ("apple", 3)]]
>>> data
[[('I', 1), ('love', 2), ('apple', 1)], [('Oh', 1), ('my', 1), ('god', 1), ('apple', 3)]]
>>> dfs = []
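>>> for item in data:
...     val = dict(item)                     # word -> count for this sentence
...     index = [' '.join(val.keys())]       # label the row with the sentence's distinct words
...     df = pd.DataFrame(val, index=index)  # one-row frame, one column per word
...     dfs.append(df)
...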
>>> sent_df = pd.concat(dfs)
>>> sent_df
                   I  love  apple   Oh   my  god
I love apple     1.0   2.0      1  NaN  NaN  NaN
Oh my god apple  NaN   NaN      3  1.0  1.0  1.0
>>> sent_df.index.name = 'sentence'
>>> sent_df = sent_df.reset_index().fillna(0)
>>> sent_df
          sentence    I  love  apple   Oh   my  god
0     I love apple  1.0   2.0      1  0.0  0.0  0.0
1  Oh my god apple  0.0   0.0      3  1.0  1.0  1.0
# if you don't want sentence inside the dataframe
# ===============================================
>>> sent_df = sent_df.drop('sentence', axis=1)
>>> sent_df
     I  love  apple   Oh   my  god
0  1.0   2.0      1  0.0  0.0  0.0
1  0.0   0.0      3  1.0  1.0  1.0
>>> sent_df.index.name = 'sentence_id'
>>> sent_df.reset_index()
   sentence_id    I  love  apple   Oh   my  god
0            0  1.0   2.0      1  0.0  0.0  0.0
1            1  0.0   0.0      3  1.0  1.0  1.0
# if you want 2-D numpy array (numpy array doesn't preserve column names)
# =======================================================================
>>> sent_df.reset_index().to_numpy()
array([[0., 1., 2., 1., 0., 0., 0.],
       [1., 0., 0., 3., 1., 1., 1.]])
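As a possible shortcut, the per-sentence loop and `pd.concat` can probably be skipped: `pd.DataFrame` accepts a list of dicts and aligns the keys into columns by itself. The sketch below is not part of the session above; the names `counts`, `vocab` and `matrix` are made up for illustration. It also calls `to_numpy()` without `reset_index()`, so the array holds only the word counts and not the sentence id column.

```python
import pandas as pd

data = [
    [("I", 1), ("love", 2), ("apple", 1)],
    [("Oh", 1), ("my", 1), ("god", 1), ("apple", 3)],
]

# Each sentence becomes one dict (word -> count); pandas lines the keys up as columns.
counts = pd.DataFrame([dict(sentence) for sentence in data])

# Words that do not occur in a sentence come out as NaN; replace them with 0
# and cast back to integers, since the NaNs forced those columns to float.
counts = counts.fillna(0).astype(int)
counts.index.name = "sentence_id"

vocab = list(counts.columns)  # column names, kept separately because the array drops them
matrix = counts.to_numpy()    # 2-D array: one row per sentence, one column per word

print(vocab)
print(matrix)
```

Keeping `vocab` next to `matrix` is one way to remember which column belongs to which word once the labels are gone.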