Python: can I move a file into a folder named after part of the file name?

I have a directory with a large number of files that I want to move into folders based on part of the file name. My list of files looks like this:

  • ID1_geneabc_species1.fa

  • ID1_genexy_species1.fa

  • ID2_geneabc_species1.fa

  • ID3_geneabc_species2.fa

  • ID3_genexy_species2.fa

  • ID4_genexy_species3.fa

I want to move the files into separate folders based on the last part of the file name (species1, species2, species3). The first two parts of the name vary in length and can contain numbers and/or letters, but every name has exactly 3 parts separated by an underscore '_'.

This is what I have tried from looking online but it does not work:

import os
import glob

dirs = glob.glob('*_*')

files = glob.glob('*.fa')

for file in files:
   name = os.path.splitext(file)[0]
   matchdir = next(x for x in dirs if name == x.rsplit('_')[0])
   os.rename(file, os.path.join(matchdir, file))

I have the names (species1, species2, species3) in a list in the script below; they correspond to the third part of my file names. I am able to create a directory in my current working directory for each of these names. Is there a better way to do this after the following script, such as looping through the list of species, matching each file, then moving it into the correct directory? Thanks.

from Bio import SeqIO
import os

# Collect the organism name of every record in the GenBank file
all_species = []
for seq_record in SeqIO.parse("sequence.gb", "genbank"):
    all_species.append(seq_record.annotations["organism"])

# Reduce to unique names (set) and convert back to a list
Species = list(set(all_species))

# Write the names to a file, one per line
with open('speciesnames.txt', 'w') as f:
    for name in Species:
        f.write(name + '\n')

print('There are ' + str(len(Species)) + ' species.')

# Make a directory for each species next to this script
path = os.path.dirname(os.path.abspath(__file__))
for name in Species:
    os.makedirs(os.path.join(path, name), exist_ok=True)

So you want a function that derives the folder name from the file name. Then you iterate over the files, create any directory that does not exist yet, and move each file into its directory. Something like this should work:

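import glob
import os

def get_dir_name(filename):
    # Text between the last '_' and the last '.',
    # e.g. 'ID1_geneabc_species1.fa' -> 'species1'
    pos1 = filename.rfind('_')
    pos2 = filename.rfind('.')
    return filename[pos1+1:pos2]

cwd = os.getcwd()
for f in glob.glob('*.fa'):
    dir_name = os.path.join(cwd, get_dir_name(f))
    print(dir_name)
    # Create the species folder the first time it is needed
    if not os.path.exists(dir_name):
        os.mkdir(dir_name)
    os.rename(f, os.path.join(dir_name, f))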
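
If you would rather loop with your existing list of species in hand, as you suggested, the same idea can be wrapped in a function. This is only a sketch: the names species_of and move_by_species are made up for illustration, and it assumes the entries in your list match the third part of the file names exactly. It uses shutil.move instead of os.rename, and skips any file whose species part is not in the list.

```python
import shutil
from pathlib import Path


def species_of(path):
    """Return the part after the last underscore, without the extension."""
    return Path(path).stem.rsplit("_", 1)[-1]


def move_by_species(src_dir, species=None, pattern="*.fa"):
    """Move matching files in src_dir into subfolders named after the species.

    If `species` is given, files whose species part is not in it are skipped.
    Returns a dict mapping each species name to the file names moved.
    """
    src_dir = Path(src_dir)
    allowed = set(species) if species is not None else None
    moved = {}
    for path in sorted(src_dir.glob(pattern)):
        if not path.is_file():
            continue
        name = species_of(path)
        if allowed is not None and name not in allowed:
            continue
        target = src_dir / name
        # Create the species folder if it is not there yet
        target.mkdir(exist_ok=True)
        shutil.move(str(path), str(target / path.name))
        moved.setdefault(name, []).append(path.name)
    return moved
```

Calling move_by_species('.', species=Species) after your script would process the current directory; leaving out the species argument moves every .fa file regardless of the list. The returned dict makes it easy to check which files ended up where.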
