How to loop in multiple directories with a for loop using os.walk()

Question

I wrote a script that is supposed to loop over several directories, then over the sub-directories of each one, and find the files to process. The directory tree looks like this:

1. Embeddings_test  # general directory
   > sm  # first level of directory
     > or # second level of directory
         > file 1
         > file 2

     > aug
         > file 1
         > file 2

     > or_et_aug
         > file 1
         > file 2

   > bc  # first level of directory
     > or # second level of directory
         > file 1
         > file 2

     > aug
         > file 1
         > file 2

     > or_et_aug
         > file 1
         > file 2

....

What I want is to loop inside each sub-directory and retrieve the files it contains, but the loop only processes the first pair, "sm" and "or", and never reaches the other sub-directories, "aug" and "or_et_aug".

Results:

----------------
------------------Directory : sm ---- or
------------------
['flaubert-small-cased_emb_corpus_or_test.pkl', 'flaubert-small-cased_ylabels_corpus_or_test.pkl']
---file for emb--- :  flaubert-small-cased_emb_corpus_or_test.pkl
---file for labels--- :  flaubert-small-cased_ylabels_corpus_or_test.pkl
10 fold cross validation in processed ----------

Model name: Model_SVC_ovr
------------cross val predict used----------------
---------------cross val score used -----------------------
[1.   1.   1.   1.   1.   1.   1.   1.   0.75 1.  ]
0.97 accuracy with a standard deviation of 0.07
-------------------------------
-------------------------------
------------------
------------------Directory : bc ---- aug
------------------
------------------
------------------Directory : unc ---- or_et_aug
------------------

Code:

# Loading features and classes

mod = ['sm', 'bc', 'unc', 'lg']

corpus = ['or','aug','or_et_aug']

for m, c in zip(mod, corpus):
    print("------------------")
    print("------------------Directory :", m, "----", c)
    print("------------------")
    for root, subdirs, files in os.walk("/ho/get/kelo/eXP/Test/embeddings_test" + "/" + m + "/" + c):
        #print(root)
        print(files)
        print("---file for emb--- : ", files[0])
        with open(os.path.join(root, files[0]), 'rb') as f1:
            features = pickle.load(f1)
        print("---file for labels--- : ", files[1])
        with open(os.path.join(root, files[1]), 'rb') as f2:
            ylabels = pickle.load(f2)
        # cross validation training and testing

        print("10 fold cross validation in processed ----------", "\n")
        models_list = classifiers()
        for model_name, model in models_list.items():
            print("Model name: {}".format(model_name))

            print("------------cross val predict used----------------")
            # cross val score : Run cross-validation for single metric evaluation

            print("---------------cross val score used -----------------------")
            scores = cross_val_score(model, features, ylabels, scoring='accuracy', cv=cv_splitter)
            print(scores)
            print("%0.2f accuracy with a standard deviation of %0.2f" % (scores.mean(), scores.std()))
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")

I expected the results to be printed for every directory and each of its sub-directories.

Answer 1

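Are you sure zip is the function you need? It pairs the two lists element by element and stops at the end of the shorter one, so your loop only visits sm/or, bc/aug and unc/or_et_aug. itertools.product yields every combination of mod and corpus instead.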
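To see the difference, here is a small sketch using the same two lists as in the question. The variable names zipped and combos are only for illustration.

```python
import itertools

mod = ['sm', 'bc', 'unc', 'lg']
corpus = ['or', 'aug', 'or_et_aug']

# zip pairs items by position and stops at the shorter list,
# so 'lg' is dropped and each model is matched with only one corpus.
zipped = list(zip(mod, corpus))
print(zipped)

# product yields every (model, corpus) combination: 4 * 3 pairs.
combos = list(itertools.product(mod, corpus))
print(len(combos))
print(combos[:3])
```

The three zipped pairs match the three "Directory :" headers in the output posted in the question.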

import pathlib as pth
import itertools

ROOTDIR = pth.Path("/ho/get/kelo/eXP/Test/embeddings_test")

mod = ['sm', 'bc', 'unc', 'lg']
corpus = ['or','aug','or_et_aug']

for m, c in itertools.product(mod, corpus):
    root = ROOTDIR / m / c
    files = [file for file in root.iterdir() if file.is_file()]
    print(root)
    print(files)

I will describe my code later.
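If you would rather keep os.walk(), another option is to walk the top directory once and let it discover the sub-directories, instead of building each path by hand. The sketch below is only that, a sketch: the helper name find_pickle_pairs is hypothetical, and it assumes the embedding and label files can be told apart by "_emb_" and "_ylabels_" in their names, as in the file names printed in the question. It also avoids relying on files[0] and files[1], because os.walk does not guarantee the order of files. The demo builds a throwaway tree with empty placeholder files so that it runs anywhere.

```python
import os
import tempfile


def find_pickle_pairs(root_dir):
    """Walk root_dir once and return {(model, corpus): (emb_path, labels_path)}."""
    pairs = {}
    for root, subdirs, files in os.walk(root_dir):
        # Assumption: file roles are identified by these name fragments.
        emb = [f for f in files if "_emb_" in f]
        labels = [f for f in files if "_ylabels_" in f]
        if not (emb and labels):
            continue  # e.g. the top-level and model-level directories
        parts = os.path.relpath(root, root_dir).split(os.sep)
        if len(parts) != 2:
            continue  # only keep <model>/<corpus> directories
        model, corpus = parts
        pairs[(model, corpus)] = (
            os.path.join(root, emb[0]),
            os.path.join(root, labels[0]),
        )
    return pairs


# Demo on a temporary tree with empty placeholder files (not real pickles).
with tempfile.TemporaryDirectory() as tmp:
    for m in ("sm", "bc"):
        for c in ("or", "aug"):
            d = os.path.join(tmp, m, c)
            os.makedirs(d)
            for kind in ("emb", "ylabels"):
                open(os.path.join(d, f"x_{kind}_corpus_{c}_test.pkl"), "wb").close()
    found = find_pickle_pairs(tmp)
    found_keys = sorted(found)
    print(found_keys)
```

In your script you would then loop over the returned dictionary and call pickle.load on each pair of paths. Note also that os.walk yields nothing at all for a path that does not exist (errors are ignored by default), so a wrong or missing directory fails silently rather than raising.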
