
Huggingface transformers unusual memory use

I have the following code attempting to use XL transformers to vectorize text:

  text = "Some string about 5000 characters long"

  tokenizer = TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103', cache_dir=my_local_dir, local_files_only=True)
  model = TransfoXLModel.from_pretrained("transfo-xl-wt103", cache_dir=my_local_dir, local_files_only=True)

  encoded_input = tokenizer(text, return_tensors='pt') 
  output = model(**encoded_input)

This produces:

    output = model(**encoded_input)
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 863, in forward
    output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 385, in forward
    dec_inp, r, attn_mask=dec_attn_mask, mems=mems, head_mask=head_mask, output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w//default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 338, in forward
    attn_score = attn_score.float().masked_fill(attn_mask[:, :, :, None], -1e30).type_as(attn_score)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2007869696 bytes. Error code 12 (Cannot allocate memory)

I'm a little perplexed by this, because it is asking for 2,007,869,696 bytes, which is only about 2 GB, and this machine has 64 GB of RAM. So I don't understand why it is asking for this much and, even more, why it is failing to get it.
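
For scale, the failing line in the traceback materializes the attention scores in float32 via attn_score.float(), and that tensor grows quadratically with the tokenized input length. Assuming (hypothetically) the ~5000-character string tokenizes to around 5,600 tokens, with transfo-xl-wt103's 16 attention heads the numbers roughly line up:

  # Rough, hypothetical arithmetic: n_tokens is an assumed tokenized length;
  # n_head = 16 comes from the transfo-xl-wt103 config.
  n_tokens = 5600
  n_head = 16
  print(n_tokens * n_tokens * n_head * 4)  # float32 -> 2007040000 bytes, i.e. ~2 GB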

Where can I change the setting that controls this and allow the process more RAM? This is such a small invocation of the example code, and I see very few places that would even accept such an argument.

Are you sure you are using the GPU instead of the CPU?
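
If the model is supposed to run on the GPU, a minimal sketch of moving it and the inputs over (assuming a CUDA device is available, and reusing model and encoded_input from the question):

  import torch

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model = model.to(device)
  # move every tensor in the tokenizer's BatchEncoding onto the same device
  encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
  output = model(**encoded_input)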

Try running the script with CUDA_LAUNCH_BLOCKING=1 python script.py. This will produce the correct Python stack trace, as CUDA calls are asynchronous and errors are otherwise reported at the wrong point.

You can also select which GPU is used by setting CUDA_VISIBLE_DEVICES, e.g. export CUDA_VISIBLE_DEVICES=device_number.

There is also a related issue still open on the PyTorch GitHub; try to check it out.
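
Independent of CPU vs. GPU, Transformer-XL's attention cost grows with the square of the input length, so another option is to feed the text in windows and carry the model's mems between them, which keeps each attention matrix small. A sketch, assuming a transformers 3.x-style tuple return and a hypothetical chunk_size of 512 tokens:

  import torch
  from transformers import TransfoXLModel, TransfoXLTokenizerFast

  text = "Some string about 5000 characters long"

  tokenizer = TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103')
  model = TransfoXLModel.from_pretrained('transfo-xl-wt103')
  model.eval()

  input_ids = tokenizer(text, return_tensors='pt')['input_ids']

  chunk_size = 512  # hypothetical window size; smaller means less peak memory
  mems = None
  hidden_chunks = []
  with torch.no_grad():
      for i in range(0, input_ids.size(1), chunk_size):
          # each forward pass attends over at most chunk_size new tokens plus mems
          last_hidden, mems = model(input_ids[:, i:i + chunk_size], mems=mems)[:2]
          hidden_chunks.append(last_hidden)

  output = torch.cat(hidden_chunks, dim=1)  # [batch, total_len, d_model]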
