Huggingface transformers unusual memory use

I have the following code attempting to use XL transformers to vectorize text:我有以下代码试图使用 XL 转换器来矢量化文本:

  from transformers import TransfoXLModel, TransfoXLTokenizerFast

  text = "Some string about 5000 characters long"

  tokenizer = TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103', cache_dir=my_local_dir, local_files_only=True)
  model = TransfoXLModel.from_pretrained("transfo-xl-wt103", cache_dir=my_local_dir, local_files_only=True)

  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)

This produces:

    output = model(**encoded_input)
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 863, in forward
    output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 385, in forward
    dec_inp, r, attn_mask=dec_attn_mask, mems=mems, head_mask=head_mask, output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w//default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 338, in forward
    attn_score = attn_score.float().masked_fill(attn_mask[:, :, :, None], -1e30).type_as(attn_score)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2007869696 bytes. Error code 12 (Cannot allocate memory)

I'm a little perplexed by this, because it is asking for 2007869696 bytes, which is only about 2 GB, and this machine has 64 GB of RAM. So I don't understand why it is asking for this much, and even less why it is failing to get it.
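As a sanity check on the number: the failing line materializes a float32 copy of the attention-score tensor, whose size grows quadratically with the sequence length. A back-of-the-envelope sketch (the qlen below is a hypothetical value inferred from the byte count, not taken from the question; n_head=16 and mem_len=1600 are the transfo-xl-wt103 defaults) reproduces the figure:

  # modeling_transfo_xl.py line 338 makes a float32 copy of attn_score,
  # shaped [qlen, klen, bsz, n_head] with klen = qlen + mem_len.
  # Assumptions: n_head=16, mem_len=1600 (transfo-xl-wt103 defaults),
  # qlen=4858 (hypothetical, inferred from the byte count).
  qlen, mem_len, n_head, bsz = 4858, 1600, 16, 1
  print(qlen * (qlen + mem_len) * bsz * n_head * 4)  # 2007869696 bytes, ~2 GB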

Where can I change the setting that controls this and allow this process more RAM? This is such a small invocation of the example code, and I see very few places that would even accept such an argument.
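One thing worth ruling out is a per-process address-space cap (e.g. ulimit -v or a cgroup/container limit), since error code 12 is ENOMEM from the OS rather than from PyTorch. A minimal check, assuming Linux:

  import resource

  # If a ~2 GB allocation fails while plenty of RAM is free, check the
  # per-process address-space limit; RLIM_INFINITY (-1) means no cap
  # is set at this level.
  soft, hard = resource.getrlimit(resource.RLIMIT_AS)
  print("RLIMIT_AS soft:", soft, "hard:", hard)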

Are you sure you are using the GPU instead of the CPU?
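If not, a minimal sketch of moving the model and inputs onto a GPU, assuming a CUDA device is available (from_pretrained() leaves the model on the CPU by default):

  import torch

  # Move the model and the encoded inputs onto the GPU before the
  # forward pass; otherwise everything runs (and allocates) on the CPU.
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model = model.to(device)
  encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
  output = model(**encoded_input)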

Try to run the Python script with CUDA_LAUNCH_BLOCKING=1 python script.py. This will produce the correct Python stack trace (as CUDA calls are asynchronous).

Also, you can set CUDA_VISIBLE_DEVICES using export CUDA_VISIBLE_DEVICES=device_number, as sketched below.
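Equivalently, both variables can be set from inside the script, as long as this happens before torch initializes CUDA (a sketch; the "0" is a placeholder device number):

  import os

  # Must run before the first CUDA call, or the variables are ignored.
  os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # placeholder device number
  os.environ["CUDA_LAUNCH_BLOCKING"] = "1"   # synchronous kernel launches

  import torch  # import after setting the environment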

There is also a related issue still open on the PyTorch GitHub; try to check it out.
