Huggingface transformers unusual memory use

I have the following code attempting to use XL transformers to vectorize text:我有以下代码试图使用 XL 转换器来矢量化文本:

  from transformers import TransfoXLModel, TransfoXLTokenizerFast

  text = "Some string about 5000 characters long"

  tokenizer = TransfoXLTokenizerFast.from_pretrained('transfo-xl-wt103', cache_dir=my_local_dir, local_files_only=True)
  model = TransfoXLModel.from_pretrained("transfo-xl-wt103", cache_dir=my_local_dir, local_files_only=True)

  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)

This produces:

    output = model(**encoded_input)
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 863, in forward
    output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w/default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 385, in forward
    dec_inp, r, attn_mask=dec_attn_mask, mems=mems, head_mask=head_mask, output_attentions=output_attentions,
  File "/home/user/w/default/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/w//default/lib/python3.7/site-packages/transformers/modeling_transfo_xl.py", line 338, in forward
    attn_score = attn_score.float().masked_fill(attn_mask[:, :, :, None], -1e30).type_as(attn_score)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 2007869696 bytes. Error code 12 (Cannot allocate memory)

I'm a little perplexed by this, because it is asking for 2007869696 bytes, which is only about 2 GB, and this machine has 64 GB of RAM. So I don't understand why it is asking for this much, and even less why it is failing to get it.
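As a sanity check on the number: the failing line materializes a float32 copy of the attention-score tensor, whose size grows quadratically with the sequence length. A back-of-the-envelope sketch (the qlen below is a hypothetical value inferred from the byte count, not taken from the question; n_head=16 and mem_len=1600 are the transfo-xl-wt103 defaults) reproduces the figure:

  # modeling_transfo_xl.py line 338 makes a float32 copy of attn_score,
  # shaped [qlen, klen, bsz, n_head] with klen = qlen + mem_len.
  # Assumptions: n_head=16, mem_len=1600 (transfo-xl-wt103 defaults),
  # qlen=4858 (hypothetical, inferred from the byte count).
  qlen, mem_len, n_head, bsz = 4858, 1600, 16, 1
  print(qlen * (qlen + mem_len) * bsz * n_head * 4)  # 2007869696 bytes, ~2 GB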

Where can I change the setting that controls this and allow this process more RAM? This is such a small invocation of the example code, and I see very few places that would even accept such an argument.
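One thing worth ruling out is a per-process address-space cap (e.g. ulimit -v or a cgroup/container limit), since error code 12 is ENOMEM from the OS rather than from PyTorch. A minimal check, assuming Linux:

  import resource

  # If a ~2 GB allocation fails while plenty of RAM is free, check the
  # per-process address-space limit; RLIM_INFINITY (-1) means no cap
  # is set at this level.
  soft, hard = resource.getrlimit(resource.RLIMIT_AS)
  print("RLIMIT_AS soft:", soft, "hard:", hard)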

Are you sure you are using the GPU instead of the CPU?
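If not, a minimal sketch of moving the model and inputs onto a GPU, assuming a CUDA device is available (from_pretrained() leaves the model on the CPU by default):

  import torch

  # Move the model and the encoded inputs onto the GPU before the
  # forward pass; otherwise everything runs (and allocates) on the CPU.
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model = model.to(device)
  encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
  output = model(**encoded_input)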

Try to run the Python script with CUDA_LAUNCH_BLOCKING=1 python script.py. This will produce the correct Python stack trace (as CUDA calls are asynchronous).

Also, you can set CUDA_VISIBLE_DEVICES using export CUDA_VISIBLE_DEVICES=device_number, as sketched below.
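Equivalently, both variables can be set from inside the script, as long as this happens before torch initializes CUDA (a sketch; the "0" is a placeholder device number):

  import os

  # Must run before the first CUDA call, or the variables are ignored.
  os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # placeholder device number
  os.environ["CUDA_LAUNCH_BLOCKING"] = "1"   # synchronous kernel launches

  import torch  # import after setting the environment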

There is also a related issue still open on the PyTorch GitHub; try to check it out.
