
How to save custom embedding matrix to .txt file format?

I have built a dictionary that maps each word to its corresponding word vector, in the following format:

{'word1': array([ 4.530e-02, -1.170e-02, -1.201e-01,  2.439e-01,  4.670e-02], dtype=float32),
 'word2': array([ 4.530e-02, -1.170e-02, -1.201e-01,  2.439e-01,  4.670e-02], dtype=float32)}

I would like to save this dictionary to a custom_embeddings.txt file. The instructions I am following describe the required format like this:

The format of your custom_embeddings.txt file needs to be the token followed by the values of each of the dimensions for the embedding, all separated by a single space, e.g. here are two tokens with 5-dimensional embeddings:

word1 4.530e-02 -1.170e-02 -1.201e-01 2.439e-01 4.670e-02
word2 4.530e-02 -1.170e-02 -1.201e-01 2.439e-01 4.670e-02

It would be really helpful if you could tell me how to achieve this result.

Thanks in advance

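Python's .items() method lets you loop over every word and its vector in your dictionary. The code below writes one line per word to a text file. Note that '{:e}' prints six digits after the decimal point (for example 4.530000e-02); use '{:.3e}' if you want the three-digit form shown in the question.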

txt_filename = 'custom_embeddings.txt'

with open(txt_filename, 'w') as f:
    for word, vec in my_wordvec_dict.items():
        # One line per token: the word, then each value in scientific notation
        f.write('{} {}\n'.format(word, ' '.join('{:e}'.format(item) for item in vec)))
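
If you also want to check that the file can be read back, a round trip is a quick sanity test. The following is only a sketch under a few assumptions: the helper names save_embeddings and load_embeddings are made up for illustration, tokens are assumed to contain no spaces, and the default '{:.3e}' format is chosen to match the three-digit values in the question. The writer accepts any iterable of numbers, including NumPy float32 arrays.

```python
def save_embeddings(embeddings, path, fmt='{:.3e}'):
    """Write one 'token v1 v2 ... vn' line per entry, separated by single spaces.

    `embeddings` maps a token (str) to an iterable of numbers, for example
    a NumPy array. `fmt` is a hypothetical parameter controlling precision.
    """
    with open(path, 'w', encoding='utf-8') as f:
        for word, vec in embeddings.items():
            # float() turns NumPy scalars into plain Python floats before formatting
            values = ' '.join(fmt.format(float(x)) for x in vec)
            f.write('{} {}\n'.format(word, values))


def load_embeddings(path):
    """Read the file back into a dict of token -> list of floats.

    Assumes tokens contain no spaces, since a space is the field separator.
    """
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            embeddings[parts[0]] = [float(p) for p in parts[1:]]
    return embeddings
```

Calling save_embeddings(my_wordvec_dict, 'custom_embeddings.txt') and then load_embeddings('custom_embeddings.txt') should give back the same words with values rounded to the chosen precision. Keep in mind that three digits after the decimal point loses information compared with the original float32 values, so pick a larger precision if the embeddings need to be reproduced exactly.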

