Remove downloaded TensorFlow and PyTorch (Hugging Face) models
I would like to remove TensorFlow and Hugging Face models from my laptop. I did find one link https://github.com/huggingface/transformers/issues/861, but is there no command that can remove them? As mentioned in the link, manually deleting can cause problems, because we don't know which other files are linked to those models or are expecting some model to be present in that location, or it may simply cause some error.
The transformers library will store the downloaded files in your cache. As far as I know, there is no built-in method to remove certain models from the cache, but you can code something yourself. The files are stored under a cryptic name alongside two additional files that have .json (.h5.json in the case of TensorFlow models) and .lock appended to the cryptic name. The json file contains some metadata that can be used to identify the file. The following is an example of such a file:
{"url": "https://cdn.huggingface.co/roberta-base-pytorch_model.bin", "etag": "\"8a60a65d5096de71f572516af7f5a0c4-30\""}
We can now use this information to create a list of your cached files as shown below:
import glob
import json
import re
from collections import OrderedDict
from transformers import TRANSFORMERS_CACHE

# Every cached file has a companion .json metadata file next to it
metaFiles = glob.glob(TRANSFORMERS_CACHE + '/*.json')

# Matches model weight files (PyTorch .bin or TensorFlow .h5) in the metadata URL
modelRegex = r"huggingface\.co/(.*)(pytorch_model\.bin$|resolve/main/tf_model\.h5$)"

cachedModels = {}
cachedTokenizers = {}
for file in metaFiles:
    with open(file) as j:
        data = json.load(j)
    isM = re.search(modelRegex, data['url'])
    if isM:
        # Strip the trailing slash from the captured model name
        cachedModels[isM.group(1)[:-1]] = file
    else:
        cachedTokenizers[data['url'].partition('huggingface.co/')[2]] = file

cachedTokenizers = OrderedDict(sorted(cachedTokenizers.items(), key=lambda k: k[0]))
Now all you have to do is check the keys of cachedModels and cachedTokenizers and decide whether you want to keep them or not. In case you want to delete them, just look up the value in the dictionary and delete that file from the cache. Don't forget to also delete the corresponding *.json and *.lock files.
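A small helper for that last step could look like this (a sketch; `deleteModelFromCache` is a hypothetical name, and it assumes the dictionaries built above, where each value is the path to a metadata .json file sitting next to the actual cached file):

```python
import os

def deleteModelFromCache(name, cachedFiles):
    """Delete a cached file plus its .json and .lock companions.

    `cachedFiles` is the cachedModels or cachedTokenizers dict from above;
    `name` is one of its keys.
    """
    metaFile = cachedFiles[name]              # path to the .json metadata file
    blobFile = metaFile[:-len('.json')]       # the cryptically named model file itself
    for path in (blobFile, metaFile, blobFile + '.lock'):
        if os.path.exists(path):
            os.remove(path)
```

This only removes the three files belonging to one cache entry; run it once per key you decided to drop.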
pip uninstall tensorflow
pip uninstall tensorflow-gpu
pip uninstall transformers
and find where you have saved gpt-2, e.g. with model.save_pretrained("./english-gpt2"), where english-gpt2 is your downloaded model name. From that path you can manually delete it.
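That manual deletion can also be done in a couple of lines (a sketch; "./english-gpt2" is the example path from above, and `removeSavedModel` is a hypothetical helper name):

```python
import os
import shutil

def removeSavedModel(path):
    """Remove a directory created by save_pretrained, if it exists."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # deletes the directory and all model files inside it

removeSavedModel("./english-gpt2")  # the directory passed to save_pretrained
```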