I wrote a script that is supposed to loop over several directories, then over the sub-directories of each one, and find the files to process. The layout is like this:
1. Embeddings_test # general directory
    > sm # first level of directory
        > or # second level of directory
            > file 1
            > file 2
        > aug
            > file 1
            > file 2
        > or_et_aug
            > file 1
            > file 2
    > bc # first level of directory
        > or # second level of directory
            > file 1
            > file 2
        > aug
            > file 1
            > file 2
        > or_et_aug
            > file 1
            > file 2
    ....
What I want to do is loop inside each sub-directory and retrieve the files in it, but somehow the loop stops after the first pair, "sm" and "or", and does not go on to the other sub-directories of "sm", that is "aug" and "or_et_aug".
Results:
----------------
------------------Directory : sm ---- or
------------------
['flaubert-small-cased_emb_corpus_or_test.pkl', 'flaubert-small-cased_ylabels_corpus_or_test.pkl']
---file for emb--- : flaubert-small-cased_emb_corpus_or_test.pkl
---file for labels--- : flaubert-small-cased_ylabels_corpus_or_test.pkl
10 fold cross validation in processed ----------
Model name: Model_SVC_ovr
------------cross val predict used----------------
---------------cross val score used -----------------------
[1. 1. 1. 1. 1. 1. 1. 1. 0.75 1. ]
0.97 accuracy with a standard deviation of 0.07
-------------------------------
-------------------------------
------------------
------------------Directory : bc ---- aug
------------------
------------------
------------------Directory : unc ---- or_et_aug
------------------
Code lines
import os
import pickle
from sklearn.model_selection import cross_val_score

# classifiers() and cv_splitter are defined elsewhere in the script

# Loading features and classes
mod = ['sm', 'bc', 'unc', 'lg']
corpus = ['or','aug','or_et_aug']
for m, c in zip(mod, corpus):
    print("------------------")
    print("------------------Directory :", m, "----", c)
    print("------------------")
    for root, subdirs, files in os.walk("/ho/get/kelo/eXP/Test/embeddings_test" + "/" + m + "/" + c):
        #print(root)
        print(files)
        print("---file for emb--- : ", files[0])
        with open(os.path.join(root, files[0]), 'rb') as f1:
            features = pickle.load(f1)
        print("---file for labels--- : ", files[1])
        with open(os.path.join(root, files[1]), 'rb') as f2:
            ylabels = pickle.load(f2)
        # cross validation training and testing
        print("10 fold cross validation in processed ----------", "\n")
        models_list = classifiers()
        for model_name, model in models_list.items():
            print("Model name: {}".format(model_name))
            print("------------cross val predict used----------------")
            # cross val score : Run cross-validation for single metric evaluation
            print("---------------cross val score used -----------------------")
            scores = cross_val_score(model, features, ylabels, scoring='accuracy', cv=cv_splitter)
            print(scores)
            print("%0.2f accuracy with a standard deviation of %0.2f" % (scores.mean(), scores.std()))
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")
            print("-------------------------------")
I expected it to print results for every directory and each of its sub-directories.
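Are you sure zip is the function you need? zip pairs the two lists element by element, so your loop only visits (sm, or), (bc, aug) and (unc, or_et_aug). itertools.product gives every combination of mod and corpus: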
import pathlib as pth
import itertools

ROOTDIR = pth.Path("/ho/get/kelo/eXP/Test/embeddings_test")
mod = ['sm', 'bc', 'unc', 'lg']
corpus = ['or','aug','or_et_aug']

# product yields every (m, c) combination, not just positional pairs
for m, c in itertools.product(mod, corpus):
    root = ROOTDIR / m / c
    # keep only regular files directly inside this sub-directory
    files = [file for file in root.iterdir() if file.is_file()]
    print(root)
    print(files)
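To see the difference without touching the file system, here is a small sketch that only compares what zip and itertools.product produce for the two lists from the question. The variable names zipped and combos are introduced here for illustration.

```python
import itertools

mod = ['sm', 'bc', 'unc', 'lg']
corpus = ['or', 'aug', 'or_et_aug']

# zip pairs items by position and stops when the shorter list runs out,
# so 'lg' is never visited and each model sees only one corpus
zipped = list(zip(mod, corpus))

# product yields every (model, corpus) combination: 4 * 3 = 12 pairs
combos = list(itertools.product(mod, corpus))

print(zipped)       # [('sm', 'or'), ('bc', 'aug'), ('unc', 'or_et_aug')]
print(len(combos))  # 12
print(combos[:3])   # [('sm', 'or'), ('sm', 'aug'), ('sm', 'or_et_aug')]
```

The three pairs from zip match the three "Directory :" headers in the output shown in the question.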
I will describe my code later.
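One more thing that may be worth guarding against: the question's code takes files[0] as the embeddings and files[1] as the labels, but neither os.walk nor iterdir guarantees any particular order. A minimal sketch, assuming the file names always contain "_emb_" or "_ylabels_" as in the output shown in the question, could select the files by name instead. The helpers find_pair and load_pair are hypothetical names, and the demo files are made-up data written to a temporary directory.

```python
import pathlib
import pickle
import tempfile


def find_pair(directory):
    """Return (embeddings_path, labels_path) in `directory`, matched by file name."""
    files = sorted(p for p in pathlib.Path(directory).iterdir() if p.is_file())
    # Assumption: names follow the "<model>_emb_..." / "<model>_ylabels_..." pattern
    emb = [p for p in files if "_emb_" in p.name]
    lab = [p for p in files if "_ylabels_" in p.name]
    if len(emb) != 1 or len(lab) != 1:
        raise ValueError(f"expected one emb and one ylabels file in {directory}")
    return emb[0], lab[0]


def load_pair(directory):
    """Unpickle the embeddings and labels found in `directory`."""
    emb_path, lab_path = find_pair(directory)
    with open(emb_path, "rb") as f1:
        features = pickle.load(f1)
    with open(lab_path, "rb") as f2:
        ylabels = pickle.load(f2)
    return features, ylabels


# Demo on a throwaway directory with made-up data
with tempfile.TemporaryDirectory() as tmp:
    d = pathlib.Path(tmp)
    with open(d / "demo_ylabels_corpus_or_test.pkl", "wb") as f:
        pickle.dump([0, 1], f)
    with open(d / "demo_emb_corpus_or_test.pkl", "wb") as f:
        pickle.dump([[0.1, 0.2], [0.3, 0.4]], f)
    features, ylabels = load_pair(d)

print(features, ylabels)
```

In the loop above, load_pair(root) would then replace the two index-based open calls.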