How to create a txt file with descending sub-directories as columns in Linux?

Question

My data follow the structure:

../data/study_ID/FF_Number/Exam_Number/date

where the data dir contains 176 participants' sub-directories. The ID number represents the participant's ID, and each of the following sub-directories represents an experimental number. I want to create a txt file with one line per participant and the following columns: study ID, FF_Number, Exam_Number and date.

However, it gets a bit more complicated, as I want to divide the participants into chunks of roughly 15-20 participants each for the subsequent analysis.

Any suggestions? Cheers.

Answer 1

Hmm, nobody?

You should redirect the output of the "find" command (consider the -type d and -maxdepth switches) and probably parse it with sed, replacing "/" with spaces. Piping through the "cut" and "column -t" commands, and through "sort" and "uniq", may also be useful. Do the names, apart from FF and ID, contain spaces or special characters, e.g. related to the names of participants?

It should be possible to get the TXT with a "one-liner" and a few pipes.
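As a rough illustration of that idea, here is a minimal sketch. It assumes the date directories sit exactly four levels below data/ and that no name contains spaces; the miniature tree it builds is hypothetical and only there so the pipeline has something to run on. Note that -mindepth, like -maxdepth, is a GNU/BSD find extension rather than POSIX.

```shell
# Hypothetical miniature tree, laid out as data/study_ID/FF_Number/Exam_Number/date
root=$(mktemp -d)
mkdir -p "$root/data/study_005/05_Num/45_Exam/2012_01_00" \
         "$root/data/study_005/05_Num/45_Exam/2012_01_01" \
         "$root/data/study_006/05_Num/45_Exam/2012_01_00"

# List only the directories exactly four levels below data/ (the date level),
# strip the leading "./", turn each "/" into a space, and sort the result.
(cd "$root/data" && find . -mindepth 4 -maxdepth 4 -type d) \
  | sed -e 's|^\./||' -e 's|/| |g' \
  | sort > "$root/out.txt"

cat "$root/out.txt"
```

This gives one line per date directory, not one line per participant; collapsing the dates onto a single line is a further step.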

You should try, and post the first results of your work on this :)

EDIT: Alright, I created a structure for myself with several thousand directories and subdirectories numbered by participant, by exam number etc., which looks like this (maybe it's not identical to what you have, but don't worry). Studies are numbered from 5 to 150, FF from 45 to 75, and dates from 2012_01_00 to 2012_01_30, which makes a really huge number of directories in total.

/Users/pwadas/bzz/data
/Users/pwadas/bzz/data/study_005
/Users/pwadas/bzz/data/study_005/05_Num
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_00
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_01
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_02
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_03
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_04
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_05
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_06
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_07
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_08
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_09
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_10
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_11
/Users/pwadas/bzz/data/study_005/05_Num/45_Exam/2012_01_12

Now, I want (quote) "txt file with one line per participants and the following columns: study ID, FF_number, Exam_Number and date."

So I use the following one-liner:

find /Users/pwadas/bzz/data -type d | head -n 5000 |cut -d'/' -f5-7  | uniq |while read line; do echo -n "$line: " && ls -d /Users/pwadas/bzz/$line/*Exam/* | perl -0pe 's/.*2012/2012/g;s/\n/ /g' && echo ; done  > out.txt

and here is the output (the first few lines of out.txt). The lines are very long, so I cut the output to the first 90 characters:

dtpwmbp:data pwadas$ cat out.txt |cut -c1-90
data: 
data/study_005: 
data/study_005/05_Num: 2012_01_00 2012_01_01 2012_01_02 2012_01_03 2012_01_04 2012_01_05 2
data/study_005/06_Num: 2012_01_00 2012_01_01 2012_01_02 2012_01_03 2012_01_04 2012_01_05 2
data/study_005/07_Num: 2012_01_00 2012_01_01 2012_01_02 2012_01_03 2012_01_04 2012_01_05 2
data/study_005/08_Num: 2012_01_00 2012_01_01 2012_01_02 2012_01_03 2012_01_04 2012_01_05 2
dtpwmbp:data pwadas$

I hope this will help you a little and that you'll be able to modify it according to your needs and patterns; that seems to be all I can do :) You should analyze the one-liner, especially the "cut" command and the perl regex part, which removes newlines and the full directory name from the "ls" output. This is probably far from optimal, but beautifying is not the point here, I guess :) So, good luck :)

PS. The "head" command limits the output to the first N lines; you'll probably want to leave out the | head .. | part.
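The one-liner does not cover the chunking part of the question. One possible approach, assuming a file with one line per participant already exists, is the standard "split" command. The sketch below uses a generated stand-in file (participants.txt and the chunk_ prefix are made-up names) with 176 lines and 16 lines per chunk, which falls inside the requested 15-20 range.

```shell
work=$(mktemp -d)

# Stand-in for the real listing: 176 hypothetical participant lines
awk 'BEGIN { for (i = 1; i <= 176; i++) printf "study_%03d\n", i }' \
  > "$work/participants.txt"

# 16 lines per file: 176 / 16 = 11 chunks, named chunk_aa, chunk_ab, ...
split -l 16 "$work/participants.txt" "$work/chunk_"

n_chunks=$(ls "$work"/chunk_* | wc -l | tr -d ' ')
echo "$n_chunks chunks"
```

Each chunk file can then be fed to the analysis separately. If the participant count is not a multiple of the chunk size, the last file simply ends up shorter.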

Answer 1: accepted, 2012-09-19 17:43:11
