
Listing the names of files that contain part of a string, from Python

I am trying to list all files within a directory whose names contain a string I specify. I want to vary this string with each iteration of the loop. The code I am using is:

from subprocess import Popen
from subprocess import call

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
run_length = (len(species_array) - 5)
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
for i in range (run_length):
    s = Popen("find", path, "-name", *species_array[i+1]*)
    print s.communicate()[0]

Each file should contain species_array[i+1] as part of its name. Thanks in advance.

If you want to use find, you need to pass the arguments as a list when shell=False (the default). check_output will work for your case. You can slice the list instead of using range, and you need str.format to wrap each species name in *:

from subprocess import check_output

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
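# [1:-4] selects indices 1..12, the same elements the original loop
# reaches with species_array[i+1] for i in range(len(species_array) - 5).
for ele in species_array[1:-4]:
    s = check_output(["find", path, "-name", "*{0}*".format(ele)])
    print(s)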

check_output was added in Python 2.7, so for Python 2.6 use Popen:

from subprocess import Popen, PIPE

species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
for ele in species_array[1:-4]:  # indices 1..12, matching the original loop
    s = Popen(["find", path, "-name", "*{0}*".format(ele)],stdout=PIPE,stderr=PIPE)
    out,err = s.communicate()
    print(out,err)
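
To make the "list of args" point concrete, here is a minimal sketch that separates building the argument list from running the command. The helper names build_find_args and find_matching are made up for illustration and are not part of subprocess. It assumes a find executable is on the PATH. universal_newlines=True is passed so the output comes back as text rather than bytes.

```python
import subprocess


def build_find_args(directory, name_part):
    """Return the argument list for: find <directory> -name '*<name_part>*'."""
    # Each argument is its own list element. With shell=False (the default)
    # no shell is involved, so the asterisks reach find unexpanded and need
    # no extra quoting.
    return ["find", directory, "-name", "*{0}*".format(name_part)]


def find_matching(directory, name_part):
    """Run find and return the matching paths as a list of strings."""
    proc = subprocess.Popen(
        build_find_args(directory, name_part),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to text instead of bytes
    )
    out, _err = proc.communicate()
    return out.splitlines()
```

The call in the question fails because Popen("find", path, ...) passes path as the second positional parameter of Popen (bufsize) rather than as an argument to find; wrapping everything in one list avoids that.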

Your loop is all wrong. Python is much more expressive than that:

1) You can skip the first element by starting the range at 1:

for i in range(1, len(species_arr) - 4):

...then use i instead of i+1 inside your loop.

2) Even easier (and more idiomatic) is to use list slicing:

for species in species_arr[1:-4]:

3) You can format strings in Python using the format() method.

Here is an example employing those concepts:

species_arr = [
    "homo_sapiens", 
    "pan_troglodytes", 
    "pongo_abelii", 
    "gorilla_gorilla", 
    "macaca_mulatta", 
    "callithrix_jacchus", 
    "bos_taurus", 
    "canis_familiaris", 
    "equus_caballus", 
    "felis_catus", 
    "ovis_aries", 
    "sus_scrofa", 
    "oryctolagus_cuniculus", 
    "rattus_norvegicus", 
    "mus_caroli", 
    "mus_pahari", 
    "mus_musculus"
]

chop_from_end = 4 

for species in species_arr[1:-chop_from_end]:
    fname = "*{0}*".format(species)
    print(fname)

--output:--
*pan_troglodytes*
*pongo_abelii*
*gorilla_gorilla*
*macaca_mulatta*
*callithrix_jacchus*
*bos_taurus*
*canis_familiaris*
*equus_caballus*
*felis_catus*
*ovis_aries*
*sus_scrofa*
*oryctolagus_cuniculus*
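
As a quick sanity check, the sketch below compares the three ways of walking the list: the question's range(len - 5) with i+1, the range starting at 1, and the slice. It only reuses the species list from the question; the variable names from_question, by_index and by_slice are illustrative.

```python
species_arr = [
    "homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla",
    "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris",
    "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa",
    "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari",
    "mus_musculus",
]

# The question's loop: i runs over range(len - 5) and indexes with i + 1.
from_question = [species_arr[i + 1] for i in range(len(species_arr) - 5)]

# Starting the range at 1 removes the need for i + 1.
by_index = [species_arr[i] for i in range(1, len(species_arr) - 4)]

# The slice skips the first element and the last four.
by_slice = species_arr[1:-4]

print(from_question == by_index == by_slice)  # True
print(len(by_slice))                          # 12
```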

The format() method was introduced in Python 3.0 and backported to Python 2.6 in a more limited form (for example, 2.6 requires explicit field numbers such as {0} rather than a bare {}). If for some reason your install does not have the format() method, you can use the old way:

 fname = "*%s*" % species

See additional format() examples here:

https://docs.python.org/3/library/string.html#format-examples
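
For comparison, a small sketch showing that the two formatting styles build the same pattern string. The species value is just one entry taken from the list in the question.

```python
species = "bos_taurus"

# str.format with an explicit field number works on Python 2.6+ and 3.x.
with_format = "*{0}*".format(species)

# The older %-style formatting gives the same result.
with_percent = "*%s*" % species

print(with_format)                  # *bos_taurus*
print(with_format == with_percent)  # True
```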

4) Here's what you can do with the glob module:

import glob
import os.path
import pprint

base_dir = '/Users/7stud/python_programs/dir1'

names = ['a', 'b', 'c']

for name in names: 
    fname = "*{0}*".format(name)
    path = os.path.join(base_dir, fname)
    pprint.pprint(glob.glob(path))
    print('-' * 20)

--output:--
['/Users/7stud/python_programs/dir1/__pycache__',
 '/Users/7stud/python_programs/dir1/a.txt',
 '/Users/7stud/python_programs/dir1/aa.txt',
 '/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/b.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/__pycache__']
--------------------
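
One difference worth keeping in mind: the glob pattern above only looks at entries directly inside base_dir, whereas find also descends into subdirectories. If the recursive behaviour matters, something along these lines might work as a pure-Python substitute. This is a sketch using os.walk and fnmatch rather than a drop-in replacement for find; the function name find_by_name_part is made up for illustration, and unlike find it never reports base_dir itself.

```python
import fnmatch
import os


def find_by_name_part(base_dir, name_part):
    """Recursively collect paths whose last component contains name_part."""
    pattern = "*{0}*".format(name_part)
    matches = []
    for dirpath, dirnames, filenames in os.walk(base_dir):
        # Test directory names as well as file names, as find -name does.
        for entry in fnmatch.filter(dirnames + filenames, pattern):
            matches.append(os.path.join(dirpath, entry))
    return sorted(matches)
```

Calling find_by_name_part(path, "bos_taurus") would then return a sorted list of every file or directory under path with bos_taurus in its name.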

The glob results can also be collected into a dict of (name, matches) pairs:

results = dict(
    (
      name,
      glob.iglob(os.path.join(base_dir, "*{0}*".format(name)))
    )
    for name in names
)

for name, _iter in results.items():
    print("{0}:".format(name))
    pprint.pprint(list(_iter))

--output:--
a:
['/Users/7stud/python_programs/dir1/__pycache__',
 '/Users/7stud/python_programs/dir1/a.txt',
 '/Users/7stud/python_programs/dir1/aa.txt',
 '/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
c:
['/Users/7stud/python_programs/dir1/__pycache__']
b:
['/Users/7stud/python_programs/dir1/ab.txt',
 '/Users/7stud/python_programs/dir1/b.txt',
 '/Users/7stud/python_programs/dir1/ba.txt']
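
Note that glob.iglob returns iterators, and an iterator can only be consumed once, so the dict above is empty-handed on a second pass. If the results need to be reused, a variant that stores lists may be more convenient. This is a minimal sketch; the function name matches_by_name is hypothetical, and the results are sorted only to make the order predictable.

```python
import glob
import os.path


def matches_by_name(base_dir, names):
    """Map each name to a sorted list of paths in base_dir that contain it."""
    return dict(
        (name, sorted(glob.glob(os.path.join(base_dir, "*{0}*".format(name)))))
        for name in names
    )
```

With lists as values, results = matches_by_name(base_dir, names) can be looped over or indexed as many times as needed, and a name with no matches simply maps to an empty list.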
