I have over a hundred text files formatted like this:
<TITLE> This is the title
<SUBJECT> This is the subject
<XTITLE>
I want to use a Windows batch file to extract the title values (e.g. "This is the title") from each of these text files into a single output file, and also include the name of the text file where each title was found. Each text file can have multiple title tags. Example output below:
This is the title textfile1.txt
This is the second title textfile1.txt
This is the third title textfile2.txt
This is the fourth title textfile3.txt
Anyone?
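@ECHO OFF
SETLOCAL
REM Folder holding the text files and folder for the report
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "outfile=%destdir%\outfile.txt"
(
REM Outer loop lists the text files by name only
FOR /f "delims=" %%i IN ('dir /b /a-d "%sourcedir%\*.txt"') DO (
 REM Inner loop splits each line so the tag name becomes its own token
 FOR /f "usebackq tokens=1-3 delims=<=>" %%a IN ("%sourcedir%\%%i") DO (
  IF "%%b"=="TITLE" ECHO(%%i %%c
  IF "%%a"=="TITLE" ECHO(%%i %%b
 )
)
)>"%outfile%"
GOTO :EOF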
You would need to change the settings of sourcedir and destdir to suit your circumstances.
Produces the file defined as %outfile%.
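The if...%%a line will be invoked if there are no leading spaces before the tag, the if...%%b line if there are leading spaces (the spaces then become the first token, pushing the tag name into %%b and the title into %%c).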
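To see why both IF lines are needed, the tokenization can be tried on its own. This is only a small sketch: the sample file name title_demo.txt and its two lines are made up for the demonstration, and the FOR /F options are the same ones used in the routine above.

```batch
@ECHO OFF
SETLOCAL
REM Hypothetical sample file created only for this demonstration
SET "sample=%TEMP%\title_demo.txt"
>"%sample%" ECHO ^<TITLE^> No leading space
>>"%sample%" ECHO    ^<TITLE^> With leading spaces
REM Show which token receives the tag name on each line
FOR /f "usebackq tokens=1-3 delims=<=>" %%a IN ("%sample%") DO (
 ECHO a=[%%a] b=[%%b] c=[%%c]
)
DEL "%sample%"
```

On the first sample line the tag name should land in %%a and the title in %%b. On the second, the leading spaces take %%a, so the tag name moves to %%b and the title to %%c.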
I changed the order of the report fields as that seemed to make more sense to me. If you truly want the report in the opposite order, simply reverse the %%i and %%c/%%b in the echo statements.
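For anyone who does want the order shown in the question (title first, then filename), a reversed version might look like the sketch below. It assumes the same sourcedir and destdir settings as the routine above and changes only the two ECHO lines. The title keeps the space that follows the tag in the source file.

```batch
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "outfile=%destdir%\outfile.txt"
(
FOR /f "delims=" %%i IN ('dir /b /a-d "%sourcedir%\*.txt"') DO (
 FOR /f "usebackq tokens=1-3 delims=<=>" %%a IN ("%sourcedir%\%%i") DO (
  REM Title first and then the filename
  IF "%%b"=="TITLE" ECHO(%%c %%i
  IF "%%a"=="TITLE" ECHO(%%b %%i
 )
)
)>"%outfile%"
GOTO :EOF
```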
This routine produces one line per title found, each prefixed with the name of the file it came from.
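@ECHO OFF
SETLOCAL
REM Folder holding the text files and folder for the report
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "outfile=%destdir%\outfile.txt"
(
REM Outer loop lists the text files with full paths including subdirectories
FOR /f "delims=" %%i IN ('dir /s /b /a-d "%sourcedir%\*.txt"') DO (
 REM Inner loop splits each line so the tag name becomes its own token
 FOR /f "usebackq tokens=1-3 delims=<=>" %%a IN ("%%i") DO (
  IF "%%b"=="TITLE" ECHO(%%i %%c
  IF "%%a"=="TITLE" ECHO(%%i %%b
 )
)
)>"%outfile%"
GOTO :EOF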
Same routine adjusted to include a scan of subdirectories. Note that in this case, dir /s /b includes the path in the listing.
You may wish to put the echoed %%i in quotes in case of separators in path/filenames.
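@echo off
rem Change to the folder that holds the text files
pushd "c:\folder_with_files"
for %%# in (textfile*.txt) do (
  rem Header lines printed by find have no text after the delimiter and are skipped
  for /f "tokens=1* delims=>" %%a in ('find "<TITLE>" "%%#"') do (
    if "%%b" neq "" (
      echo %%b : file %%#
    )
  )
)>>"c:\output.txt"
popd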
You might need to change the file mask in the first for loop, and you need to change the PUSHD location.
This method should run faster, especially if the files are large:
@echo off
setlocal EnableDelayedExpansion
rem Group titles of same files in same array elements
for /F "tokens=1,3 delims=:>" %%a in ('findstr /L "<TITLE>" *.txt') do (
set "t[%%a]=!t[%%a]! %%b"
)
rem Show the titles
(for /F "tokens=2,3 delims=[]=" %%a in ('set t[') do echo %%~Fa: %%b) > output.txt
For example, with these input files:
textfile1.txt
<TITLE> This is the title
<SUBJECT> This is the subject
<XTITLE>
<TITLE> This is the second title
<SUBJECT> This is the subject
<XTITLE>
textfile2.txt
<TITLE> This is the third title
<SUBJECT> This is the subject
<XTITLE>
textfile3.txt
<TITLE> Fourth title
<SUBJECT> This is the subject
<XTITLE>
<TITLE> Fifth title
<SUBJECT> This is the subject
<XTITLE>
<TITLE> Sixth title
<SUBJECT> This is the subject
<XTITLE>
This is the output:
C:\Folder\textfile1.txt: This is the title This is the second title
C:\Folder\textfile2.txt: This is the third title
C:\Folder\textfile3.txt: Fourth title Fifth title Sixth title