I'm having trouble loading a model with gensim.models.FastText.load().
Here is the code and the error I get:
from gensim.models import FastText

class FastTextModel:
    def __init__(self, model_path, dim=300):
        self.dim = dim
        self.model = FastText.load(model_path).wv
    ...

class GeneralModel:
    def __init__(self, config):
        if config["type"] == "fasttext":
            # path - path to model
            # dim - dimension, here 300
            self.model = FastTextModel(config["path"], config["dim"])
  File "/project/preprocessing/pipeline.py", line 15, in __init__
    self.model_ru = GeneralModel(config["models"]["ru"])
  File "/project/models/nlp_models.py", line 101, in __init__
    self.model = FastTextModel(config["path"], config["dim"])
  File "/project/models/nlp_models.py", line 16, in __init__
    self.model = FastText.load(model_path).wv
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/fasttext.py", line 936, in load
    model = super(FastText, cls).load(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/base_any2vec.py", line 1244, in load
    model = super(BaseWordEmbeddingsModel, cls).load(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/models/base_any2vec.py", line 603, in load
    return super(BaseAny2VecModel, cls).load(fname_or_handle, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 423, in load
    obj._load_specials(fname, mmap, compress, subname)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 453, in _load_specials
    getattr(self, attrib)._load_specials(cfname, mmap, compress, subname)
  File "/usr/local/lib64/python3.6/site-packages/gensim/utils.py", line 464, in _load_specials
    val = np.load(subname(fname, attrib), mmap_mode=mmap)
  File "/usr/local/lib64/python3.6/site-packages/numpy/lib/npyio.py", line 447, in load
    pickle_kwargs=pickle_kwargs)
  File "/usr/local/lib64/python3.6/site-packages/numpy/lib/format.py", line 738, in read_array
    array.shape = shape
ValueError: cannot reshape array of size 67239904 into shape (445446,300)
I downloaded the models from a Google Drive folder and thought the download might have damaged the .npy files (they are quite big), so I downloaded each file separately (there are 7 files for that model), but this didn't help.
I also read that this error can sometimes be caused by bad unzipping in the 'load' method, but I'm already passing unzipped files to it, so that doesn't explain it either.
I'd be grateful for any help!
Where did the model(s) originate? The gensim FastText.load() method is only for FastText models created & saved from gensim (via its .save() method). Such models use a combination of Python-pickling & sibling .npy raw-array files (to store large arrays), which must be kept together.
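If the files did come from gensim's .save(), it may be worth checking whether any sibling .npy file is incomplete. The traceback shows a header declaring shape (445446, 300), i.e. 133,633,800 values, while only 67,239,904 were read, which would be consistent with a truncated file. Below is a minimal sketch of such a check. It reads only the .npy header, so it doesn't load the array. The helper names are hypothetical, and it assumes a plain numeric dtype such as '<f4' and that the sibling files are named model_path + ".<attribute>.npy":

```python
import ast
import glob
import os
import struct

NPY_MAGIC = b"\x93NUMPY"


def expected_npy_payload(path):
    """Return (shape, expected_bytes, actual_bytes) for a plain numeric .npy file."""
    with open(path, "rb") as f:
        if f.read(6) != NPY_MAGIC:
            raise ValueError(f"{path} is not an .npy file")
        major, _minor = f.read(2)
        # Format version 1 stores the header length in 2 bytes, later versions in 4.
        if major == 1:
            (header_len,) = struct.unpack("<H", f.read(2))
        else:
            (header_len,) = struct.unpack("<I", f.read(4))
        # The header is a Python dict literal describing dtype, order and shape.
        header = ast.literal_eval(f.read(header_len).decode("latin1").strip())
        data_start = f.tell()
    shape = header["shape"]
    # Assumption: a simple dtype string like '<f4', whose trailing digits are the item size.
    itemsize = int(header["descr"][2:])
    count = 1
    for dim in shape:
        count *= dim
    actual = os.path.getsize(path) - data_start
    return shape, count * itemsize, actual


def check_siblings(model_path):
    """Map each sibling .npy file name to (declared shape, size matches header)."""
    report = {}
    for npy in sorted(glob.glob(glob.escape(model_path) + ".*.npy")):
        shape, expected, actual = expected_npy_payload(npy)
        report[os.path.basename(npy)] = (shape, expected == actual)
    return report
```

Any file reported with False has fewer (or more) bytes on disk than its own header promises, and would need to be downloaded again.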
Models saved from Facebook's original FastText implementation are in a different format, for which you'd use the load_facebook_model() utility function:
https://radimrehurek.com/gensim/models/fasttext.html#gensim.models.fasttext.load_facebook_model
If you only need the vectors, as seems to be the case from your immediate use of only the .wv property, you can also use the load_facebook_vectors() function:
https://radimrehurek.com/gensim/models/fasttext.html#gensim.models.fasttext.load_facebook_vectors
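As a rough sketch of how the two cases might be handled in one place: the function name and the facebook_format flag below are hypothetical, and you'd have to know (or record in your config) which tool produced the file.

```python
def load_fasttext_vectors(model_path, facebook_format):
    """Return word vectors from either a Facebook-format or a gensim-saved model.

    facebook_format is a hypothetical flag supplied by the caller; nothing here
    detects the format automatically.
    """
    if facebook_format:
        # Binary model written by Facebook's original FastText tool.
        from gensim.models.fasttext import load_facebook_vectors
        return load_facebook_vectors(model_path)
    # Model written by gensim's own .save(); its sibling .npy files
    # must sit next to model_path.
    from gensim.models import FastText
    return FastText.load(model_path).wv
```

Either branch gives you vectors only. If you need to continue training a Facebook-format model, load_facebook_model() would be the one to use instead.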
(Also, not sure why you've wrapped the loaded model in your own FastTextModel class which allows the caller to specify a dimensionality. You can't change the dimensionality of a loaded model, so it'd make more sense to just read the existing vector_size from the model, rather than specify it outside.)
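For instance, the wrapper could look something like this. It's only a sketch: the from_gensim_file constructor is a hypothetical name, and it assumes the file really was saved by gensim.

```python
class FastTextModel:
    """Wrapper that takes its dimensionality from the loaded vectors."""

    def __init__(self, keyed_vectors):
        self.model = keyed_vectors
        # vector_size is fixed when the model is trained, so read it
        # from the vectors instead of accepting it as a parameter.
        self.dim = keyed_vectors.vector_size

    @classmethod
    def from_gensim_file(cls, model_path):
        # Hypothetical convenience constructor for gensim-saved models.
        from gensim.models import FastText
        return cls(FastText.load(model_path).wv)
```

That way config["dim"] can't drift out of sync with the model that's actually on disk.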