Use the folder name as a column in a text file
Being lazy, I am looking for a way to add a column to some text files.
The text files are in directories, and I would like to add the directory name to each line of the text file.
For example, the text file text.txt in the folder the_peasant:
has a wart
was dressed up like a witch
has a false nose
would become:
the_peasant has a wart
the_peasant was dressed up like a witch
the_peasant has a false nose
I also have similar text files in other folders called "the_king" etc.
I would think this is a combination of the find command, bash scripting and sed, but I can't see how to put it together. Any ideas?
This might work for you:
find . -name text.txt | sed 's|.*/\(.*\)/.*|sed -i "s@^@\1 @" & |' | sh
or if you have GNU sed:
find . -name text.txt | sed 's|.*/\(.*\)/.*|sed -i "s@^@\1 @" & |e'
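To see which part of each path the sed expression above captures as \1, here is a minimal Python sketch of the same pattern. Python's re module stands in for sed, and the helper names parent_dir and sed_command are hypothetical, introduced only for illustration.

```python
import re

# Same shape as the sed pattern: the greedy leading ".*" runs up to the
# second-to-last slash, so the group holds the parent directory name.
PATTERN = re.compile(r".*/(.*)/.*")


def parent_dir(path):
    """Return the name of the directory directly containing `path`.

    Hypothetical helper; returns None when the path has fewer than two
    slashes and the pattern cannot match.
    """
    match = PATTERN.fullmatch(path)
    return match.group(1) if match else None


def sed_command(path):
    """Build the command text the outer sed would generate for one path
    (without the trailing space the original replacement leaves)."""
    name = parent_dir(path)
    if name is None:
        return None
    return 'sed -i "s@^@{0} @" {1}'.format(name, path)
```

For ./the_peasant/text.txt the captured name is the_peasant, so the generated command prefixes every line of that file with "the_peasant ".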
A simple Python script for this (it should work from any folder, as long as you pass the full path to the target file):
#!/usr/bin/python
import os
import sys
if __name__ == '__main__':
    # Get full filepath and directory name
    filename = os.path.abspath(sys.argv[1])
    dirname = os.path.split(os.path.dirname(filename))[1]
    # Read current file contents
    with open(filename, 'r') as my_file:
        lines = my_file.readlines()
    # Rewrite lines, adding folder name to the start.
    # Each line already ends with '\n', so write the lines back as they are
    # rather than joining them with extra newlines.
    output_lines = [dirname + ' ' + line for line in lines]
    with open(filename, 'w') as my_file:
        my_file.writelines(output_lines)
Here is what I came up with:
find /path/to/dir -type f | sed -r 'p;s:.*/(.*)/.*:\1:' | xargs -n 2 sh -c 'sed -i "s/^/$1 /" $0'
Here is an example of how the commands would be constructed, assuming the following files exist:
/home/the_peasant/a.txt
/home/the_peasant/b.txt
/home/the_peasant/farmer/c.txt
First, find /home/the_peasant -type f would output those files exactly as above.
Next, the sed command would output each file name, followed by its directory name, like this:
/home/the_peasant/a.txt
the_peasant
/home/the_peasant/b.txt
the_peasant
/home/the_peasant/farmer/c.txt
farmer
xargs would then group every two lines and pass them to the sh command, so you would end up with the following three commands:
$ sh -c 'sed -i "s/^/$1 /" $0' /home/the_peasant/a.txt the_peasant
$ sh -c 'sed -i "s/^/$1 /" $0' /home/the_peasant/b.txt the_peasant
$ sh -c 'sed -i "s/^/$1 /" $0' /home/the_peasant/farmer/c.txt farmer
Finally, this results in the following sed commands, which add the folder name to the beginning of each line:
$ sed -i "s/^/the_peasant /" /home/the_peasant/a.txt
$ sed -i "s/^/the_peasant /" /home/the_peasant/b.txt
$ sed -i "s/^/farmer /" /home/the_peasant/farmer/c.txt
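The pairing step walked through above can also be sketched in Python. This is only an illustration under assumptions: the function names pair_with_parent and build_sed_commands are hypothetical, and shlex.quote is used because xargs splits its input on whitespace by default, so the pipeline above assumes paths without spaces.

```python
import os
import shlex


def pair_with_parent(paths):
    """Yield (file_path, parent_directory_name) for each path."""
    for path in paths:
        yield path, os.path.basename(os.path.dirname(path))


def build_sed_commands(paths):
    """Return one sed command string per file, quoted for the shell."""
    commands = []
    for path, parent in pair_with_parent(paths):
        # Assumes the directory name contains no "/" or "&", both of which
        # are special inside this sed replacement.
        expression = "s/^/{0} /".format(parent)
        commands.append(
            "sed -i {0} {1}".format(shlex.quote(expression), shlex.quote(path))
        )
    return commands
```

Called with the three example paths, build_sed_commands returns the same three sed invocations, except that the expression is wrapped in single quotes.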
Obligatory one-liner using find and perl:
find . -maxdepth 1 -mindepth 1 -type d | perl -MFile::Basename -ne 'chomp; my $dir = basename($_); for my $file (glob "$dir/*") { print qq{sed -i "s/^/$dir /" $file\n} }' | tee rename_commands.sh
sh rename_commands.sh
This assumes perl and sed are in your $PATH. It generates a file of sed commands to do the actual change, so you can review what is to be done.
In my test, that command file looks like so:
sed -i "s/^/foo /" foo/text1
sed -i "s/^/foo /" foo/text2
sed -i "s/^/bar /" bar/belvedere
sed -i "s/^/bar /" bar/robin
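The same review-before-running idea can be sketched in Python as a dry run that only lists what would change. The names plan_changes and format_plan are hypothetical. Unlike the glob in the one-liner, this sketch skips nested directories; like the glob, it ignores files whose names start with a dot.

```python
import os


def plan_changes(root="."):
    """Return (directory_name, file_path) pairs for files one level down.

    Only immediate subdirectories of `root` are considered, similar to
    the -maxdepth 1 -mindepth 1 -type d selection.
    """
    plan = []
    for entry in sorted(os.listdir(root)):
        directory = os.path.join(root, entry)
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            # Skip dotfiles and anything that is not a regular file.
            if name.startswith(".") or not os.path.isfile(path):
                continue
            plan.append((entry, path))
    return plan


def format_plan(plan):
    """Render the plan as text to review; nothing is modified."""
    return "\n".join(
        "{0}: prefix each line with '{1} '".format(path, name)
        for name, path in plan
    )
```

Printing format_plan(plan_changes(".")) shows one line per file that would be touched, without changing anything on disk.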
The directory tree:
% tree .
.
├── the_king
│   └── text.txt
├── the_knight
│   └── text.txt
├── the_peasant
│   └── text.txt
└── wart.py
3 directories, 4 files
Directories and contents before:
% find . -name 'text.txt' -print -exec cat {} \;
./the_king/text.txt
has a wart
was dressed up like a witch
has a false nose
./the_knight/text.txt
has a wart
was dressed up like a witch
has a false nose
./the_peasant/text.txt
has a wart
was dressed up like a witch
has a false nose
Code (wart.py):
#!/usr/bin/env python3
import os
text_file = 'text.txt'
cwd = os.path.curdir # '.'
# Walk through each directory starting at '.' and, if the directory contains
# 'text.txt', print each line of the file prefixed by the name of the
# containing directory.
for root, dirs, files in os.walk(cwd):
    if text_file in files: # We only care IF the file is in this directory.
        print('Found %s!' % root)
        filepath = os.path.join(root, text_file) # './the_peasant/text.txt'
        root_base = os.path.basename(root) # './the_peasant' => 'the_peasant'
        output = ''
        with open(filepath, 'r') as reader: # Open file for reading
            for line in reader: # Iterate the lines of the file
                new_line = "%s %s" % (root_base, line)
                print(new_line, end='') # line already ends with a newline
                output += new_line # Append to the output
        with open(filepath, 'w') as writer: # Reopen the file for writing
            writer.write(output) # Write to the file
        print()
Which outputs:
Found ./the_king!
the_king has a wart
the_king was dressed up like a witch
the_king has a false nose
Found ./the_knight!
the_knight has a wart
the_knight was dressed up like a witch
the_knight has a false nose
Found ./the_peasant!
the_peasant has a wart
the_peasant was dressed up like a witch
the_peasant has a false nose
Directories and contents after:
% find . -name 'text.txt' -print -exec cat {} \;
./the_king/text.txt
the_king has a wart
the_king was dressed up like a witch
the_king has a false nose
./the_knight/text.txt
the_knight has a wart
the_knight was dressed up like a witch
the_knight has a false nose
./the_peasant/text.txt
the_peasant has a wart
the_peasant was dressed up like a witch
the_peasant has a false nose
This was fun! Thanks for the challenge!
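Opening a file with mode 'w' truncates it immediately, so an interruption during the write could lose the original content. A more cautious variant, sketched here under assumptions, writes the new content to a temporary file in the same directory and then moves it over the original. The function name prefix_lines_atomically is hypothetical and not part of wart.py.

```python
import os
import shutil
import tempfile


def prefix_lines_atomically(filepath, prefix):
    """Prefix every line of `filepath` with `prefix` and a space.

    The result is written to a temporary file next to the original and
    then moved into place with os.replace.
    """
    directory = os.path.dirname(os.path.abspath(filepath))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as writer, open(filepath, "r") as reader:
            for line in reader:
                writer.write("%s %s" % (prefix, line))
        # mkstemp creates a private file; copy the original permissions.
        shutil.copymode(filepath, tmp_path)
        os.replace(tmp_path, filepath)
    except Exception:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Inside the os.walk loop this could be called as prefix_lines_atomically(filepath, root_base) in place of the read-then-write pair.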
I would.
Accessing the directory can be done by using
import os
fpath = "example.txt"
dir_name = os.path.dirname(fpath)
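One caveat worth illustrating: os.path.dirname works purely on the string it is given, so for a bare file name such as "example.txt" it returns an empty string. A small sketch, with the hypothetical helper name containing_dir_name, resolves the path first and then takes the last directory component.

```python
import os.path


def containing_dir_name(fpath):
    """Return the name of the directory that holds `fpath`."""
    # abspath resolves a relative path against the current working
    # directory, so even a bare file name gets a directory part.
    return os.path.basename(os.path.dirname(os.path.abspath(fpath)))


# dirname alone only splits the string it is given:
bare = os.path.dirname("example.txt")                # ''
nested = os.path.dirname("the_peasant/text.txt")     # 'the_peasant'
```

With this helper, containing_dir_name("the_peasant/text.txt") gives "the_peasant", and a bare file name gives the name of the current working directory.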
Are you running the script in the appropriate folder? Then you can use the os module to find the current folder. Say you want just the last component of the directory path; you could use os.path, like:
import os
curDirectory = os.getcwd()
# basename needs the path as its argument
baseDir = os.path.basename(curDirectory)
# Iterate over the input file directly and let "with" close both files
with open("filename.txt") as inFile, open("filename.out", "w") as outFile:
    for line in inFile:
        outFile.write("%s %s" % (baseDir, line))
Edit: I noticed something wasn't correct. I removed the dir loop; it walks recursively now. Sorry for the mix-up.
Using os.walk
import os
directory = os.path.curdir
pattern = ".py"
for (path, dirs, files) in os.walk(directory):
    for file in files:
        if not file.endswith(pattern):
            continue
        filename = os.path.join(path, file)
        #print("file: ", filename)
        #continue
        with open(filename, "r") as f:
            for line in f:
                # line keeps its trailing newline, so print adds none
                print("{0} {1}".format(filename, line), end="")
Output:
list1.py # LAB(replace solution)
list1.py # return
list1.py # LAB(end solution)
Here's a one-ish-liner in bash and awk:
find . -type f -print0 |
while IFS= read -r -d "" path; do
    mv "$path" "$path.bak"
    awk -v dir="$(basename "$(dirname "$path")")" '{print dir, $0}' "$path.bak" > "$path"
done
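The same keep-a-backup-then-rewrite approach can be sketched in Python with the standard fileinput module, which moves the original to a backup file and redirects print output into the rewritten file. The function name prefix_in_place is hypothetical, and the ".bak" suffix simply mirrors the loop above.

```python
import fileinput
import os


def prefix_in_place(path, backup_suffix=".bak"):
    """Prefix each line of `path` with its parent directory name.

    The original content is kept as path + backup_suffix, and print()
    output goes into the rewritten file while the loop runs.
    """
    parent = os.path.basename(os.path.dirname(os.path.abspath(path)))
    with fileinput.input(files=[path], inplace=True,
                         backup=backup_suffix) as lines:
        for line in lines:
            # `line` keeps its trailing newline, so suppress print's own.
            print(parent, line, end="")
```

As with the shell loop, a .bak copy is left next to every processed file and has to be removed once the result looks right.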