I have the following test.txt file:
# Example:
# Comments
# Comments
MC
Attribute 1
Attribute 2
Attribute 3
---
MC
Attribute 1
Attribute 2
Attribute 3
---
MC
Attribute 1
Attribute 2
Attribute 3
I want to replace every \n with \t, and then turn every \t---\t into a \n, so that I get the following:
MC <TAB> Attribute 1 <TAB> Attribute 2 <TAB> Attribute 3
MC <TAB> Attribute 1 <TAB> Attribute 2 <TAB> Attribute 3
MC <TAB> Attribute 1 <TAB> Attribute 2 <TAB> Attribute 3
For some reason the following doesn't work
perl -pe "s/#.*//g; s/^\n//g; s/\n/\t/g; s/\t---\t/\n/g" test.txt
Producing the output
MC Description --- MC Description --- MC Description
If I just run the following instead
perl -pe "s/#.*//g; s/^\n//g; s/\n/\t/g;" test.txt
I also have
MC Description --- MC Description --- MC Description
It appears that the last command in s/#.*//g; s/^\n//g; s/\n/\t/g; s/\t---\t/\n/g is not working.
You say you're removing \t---\t, but that doesn't appear to be in the input.
If you want to match a line that has only whitespace and --- on it, use ^\s*---\s*$.
perl -pe "s/#.*//g; s/^\n//g; s/\n/\t/g; s/^\s*---\s*$/\n/g" test.txt
Note that this will leave you with no newline at the end of the file if there is no final ---.
If you want to process the whole file at once, use -0. -0 controls the "input record separator", which Perl uses to decide what counts as a line. -0 alone sets it to the null character, so, assuming the file contains no null bytes, Perl reads the whole file as a single record.
Then your original almost works. You need to add a /m so that ^ matches the beginning of a line as well as the beginning of a string.
perl -0pe "s/#.*//g; s/^\n//mg; s/\n/\t/g; s/\t---\t/\n/g" test.txt
But we can make this simpler! The input record separator separates records. Your record separator is ---\n, so we can set it to that and process each record individually.
To set the input record separator to a string, we use $/. And to do this in a one-liner, we put it in a BEGIN block so it is run only once when the program starts, not for every line.
Finally, we use -l to both automatically strip the record separator, which is ---\n, and to add a newline to the end of each line. That is, it adds a chomp at the start and a $_ .= "\n" at the end.
# Set the input record separator to ---\n.
# -l turns on autochomp to strip the separator.
# -l also adds a newline to each line.
# Strip comments.
# Strip blank lines (again, using /m so ^ works)
# Turn newlines into tabs.
perl -lpe 'BEGIN { $/ = "---\n" } s/#.*//mg; s/^\s*\n//mg; s/\n/\t/g;' test.txt
As a bonus, we get newlines on every line, including the last.
Finally, we can instead handle this using arrays. Same basic idea as before, but we split them back into lines and use grep to filter out unwanted lines. Then we're left with a simple join.
I'll write this one out long-hand so it's easier to read.
#!/usr/bin/env perl -lp
BEGIN { $/ = "---\n" }
# Split into lines.
# Strip comment lines.
# Strip blank lines.
# Join back together with tabs.
$_ = join "\t",
    grep /\S/,
    grep !/^#.*/,
    split /\n/, $_;
I find this approach more maintainable; it's easier to deal with an array of lines than everything mashed together in a multi-line string.