How can I take a list of strings and find files whose name matches a string in the list?

I have a list of 600+ numbers and a directory of 50,000+ files. All of the files are named like this:

99574404682_0.jpg
99574404682_1.jpg
99574437307_0.gif
99574437307_1.gif
99574437307_2.gif
99574449752.jpg
99574457597.jpg
99581722007.gif

I want to copy every file whose name, up to the underscore, matches a number in the list into a new directory.

For example, if my list contains:

99574404682
99574449752
99581722007

Then the files:

99574404682_0.jpg
99574404682_1.jpg
99574449752.jpg
99581722007.gif

would be copied to a new directory. I am on a Mac using bash 3.2. I think I need something like Python because the list is too large for grep or find, but I am not sure. Thanks!

You could iterate over both lists and pick the files from one based on a startswith condition:

files_lst = ['99574404682_0.jpg', '99574404682_1.jpg', '99574437307_0.gif', '99574437307_1.gif', '99574437307_2.gif', '99574449752.jpg', '99574457597.jpg', '99581722007.gif']

lst = [99574404682, 99574449752, 99581722007]

for x in files_lst:
    for y in lst:
        if x.startswith(str(y)):
            print(x)

# 99574404682_0.jpg
# 99574404682_1.jpg
# 99574449752.jpg
# 99581722007.gif

This gets all files whose names start with one of the numbers in lst.

You can use shutil.copy() to copy your files from a source to a destination.
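With 50,000+ files and 600+ numbers, the nested loop can make tens of millions of startswith calls, and a bare prefix test would also accept a longer number that merely begins with a listed one. A possible alternative is to cut each name at the first underscore or dot and look the result up in a set. This is only a sketch: the helper names `id_prefix` and `matching_files` are made up for illustration and are not from the answer above.

```python
import re


def id_prefix(filename):
    """Return the part of the name before the first '_' or '.'."""
    return re.split(r"[_.]", filename, maxsplit=1)[0]


def matching_files(filenames, numbers):
    """Keep the names whose prefix is exactly one of the given numbers."""
    wanted = {str(n) for n in numbers}  # set membership is O(1) per file
    return [name for name in filenames if id_prefix(name) in wanted]


files_lst = ['99574404682_0.jpg', '99574404682_1.jpg', '99574437307_0.gif',
             '99574437307_1.gif', '99574437307_2.gif', '99574449752.jpg',
             '99574457597.jpg', '99581722007.gif']
lst = [99574404682, 99574449752, 99581722007]

matches = matching_files(files_lst, lst)
print(matches)
# ['99574404682_0.jpg', '99574404682_1.jpg', '99574449752.jpg', '99581722007.gif']
```

Because the comparison is on the whole prefix, a file such as 995744046820.jpg would not be picked up for the number 99574404682.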

from os import listdir, makedirs
from os.path import exists, join, splitext
from shutil import copy

filenames = {'99574404682', '99574449752', '99581722007'}

src_path = "/path/to/source"        # placeholder: directory containing your files
dest_path = "/path/to/destination"  # placeholder: where you want to put them

# make the destination if it doesn't exist
if not exists(dest_path):
    makedirs(dest_path)

# go over each file in src_path
for file in listdir(src_path):

    # if there is an underscore, the prefix is everything before it
    if "_" in file:
        prefix, *_ = file.split("_")

    # otherwise strip the extension
    else:
        prefix, _ = splitext(file)

    # only copy if the prefix exists in the set above;
    # join with src_path, because listdir() returns bare file names
    if prefix in filenames:
        copy(join(src_path, file), dest_path)

This results in the following files in dest_path:

99574404682_0.jpg  
99574404682_1.jpg  
99574449752.jpg  
99581722007.gif
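With 600+ numbers, the list would more likely live in a file than in the script itself. The sketch below assumes one number per line in a text file (the name numbers.txt is hypothetical) and uses pathlib instead of os.path. The function names `load_numbers` and `copy_matching` are illustrative only, and the demo runs in a temporary directory so it touches no real files.

```python
import shutil
import tempfile
from pathlib import Path


def load_numbers(list_file):
    """Read one number per line into a set, ignoring blank lines."""
    with open(list_file) as fh:
        return {line.strip() for line in fh if line.strip()}


def copy_matching(src_dir, dest_dir, numbers):
    """Copy files whose name, up to the underscore, is in numbers."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        # stem drops the extension; split drops any "_0", "_1" suffix
        prefix = path.stem.split("_")[0]
        if prefix in numbers:
            shutil.copy(path, dest / path.name)
            copied.append(path.name)
    return copied


# Demo in a throwaway directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    for name in ["99574404682_0.jpg", "99574404682_1.jpg",
                 "99574437307_0.gif", "99574449752.jpg"]:
        (src / name).touch()
    list_file = Path(tmp) / "numbers.txt"  # hypothetical list file
    list_file.write_text("99574404682\n99574449752\n")

    copied = copy_matching(src, Path(tmp) / "dest", load_numbers(list_file))
    print(copied)
    # ['99574404682_0.jpg', '99574404682_1.jpg', '99574449752.jpg']
```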

I'm not really an expert in bash, but you can try something like this:

#!/bin/bash

declare -a arr=("99574404682" "99574449752" "99581722007")

## Example directories, you can change these
src_path="$PWD"
dest_path="$PWD/src"

# create the destination if it doesn't exist
mkdir -p "$dest_path"

for f1 in "$src_path"/*; do
    # skip directories (including dest_path itself)
    [ -f "$f1" ] || continue

    # strip the directory and the extension, then split on underscores
    filename=$(basename "$f1")
    prefix="${filename%.*}"
    IFS='_' read -r -a array <<< "$prefix"

    for f2 in "${arr[@]}"; do
        if [ "${array[0]}" == "$f2" ]; then
            cp "$f1" "$dest_path"
            break
        fi
    done
done

Using the os and shutil modules in Python:

import os
import shutil

You can prepare a list containing the patterns to match:

match_pattern=['99574404682','99574449752','99581722007']

Then use os.listdir() to get a list of the file names in the source directory:

files_in_source_dir=os.listdir(source_directory_path)

Finally, copy the matching files:

for file in files_in_source_dir:
  # take the part before the extension, then before any underscore suffix
  if file.split('.')[0].split('_')[0] in match_pattern:
    shutil.copyfile(os.path.join(source_directory_path, file),
                    os.path.join(target_directory_path, file))


