
Loop through folders in subdirectories and combine text files

I want to loop through the folders within a directory and combine all text files into one file. I found some answers online, but none of them seems to work. Any help is much appreciated. I have provided what I've found below. In the example below, DummyFolder has multiple subdirectories containing .txt files that need to be merged into one file. I got Code 3 to work yesterday, but I changed something and it no longer works.

Code 1:

@echo off
set "header=C:\Users\user\Desktop\DummyFolder\Headings.txt"
set "folder=C:\Users\user\Desktop\DummyFolder\"
set "tempFile=%folder%\temp.txt"
for %%F in ("%folder%\*.txt") do (
   type "%header%" >"%tempFile%"
   type "%%F" >>"%tempFile%"
   move /y "%tempFile%" "%%F" >nul
)

Also found this code (Code 2):

$startingDir = 'C:\Users\user\Desktop\DummyFolder\'
$combinedDir = 'C:\Users\user\Desktop\DummyFolder\CombinedTextFiles'

Get-ChildItem $startingDir -Recurse | Where-Object {
   $txtfiles = Join-Path $_.FullName '*.txt'
   $_.PSIsContainer -and (Test-Path $txtfiles)
} | ForEach-Object {
   $merged = Join-Path $combinedDir ($_.Name + '_Merged.txt')
   Get-Content $txtfiles | Set-Content $merged
}

Also found this code (Code 3):

@echo on
set folder="C:\Users\user\Desktop\DummyFolder\"
for /F %%a in ('dir /b /s %folder%') do (
   if "%%~xa" == ".txt" (
      (echo/------------------------------
      type %%~a
      echo/)>>"%~dp0list.txt"
   )
)

In CMD you'd do something like this:

@echo off

set "basedir=C:\some\folder"
set "outfile=C:\path\to\output.txt"

(for /r "%basedir%" %f in (*.txt) do type "%~ff") > "%outfile%"

For use in batch files you need to change %f to %%f and %~ff to %%~ff.


In PowerShell you'd do something like this:

$basedir = 'C:\some\folder'
$outfile = 'C:\path\to\output.txt'

Get-ChildItem $basedir -Include *.txt -Recurse | Get-Content |
    Set-Content $outfile

Code 3 is not bad, but it won't work with spaces in a path: because you don't specify any delimiters, for /F falls back to the default ones (space and tab) and splits each line there. There are also several other errors related to spaces in paths.

The following code works and combines all .txt files in all subdirectories. It creates a new file list.txt in the folder where the batch file is located. If list.txt already exists, it will be overwritten. Note that it's a batch file:

@echo off
set "folder=C:\Users\user\Desktop\DummyFolder\"
rem create new empty file: list.txt in directory of batch file: %~dp0
break>"%~dp0list.txt"
rem loop through all output lines of the dir command, clear delims
rem so that spaces will not split a line
for /F "delims=" %%a in ('dir /b /s "%folder%"') do (
   rem just look for txt files
   if "%%~xa" == ".txt" (
      rem don't use the list.txt
      if not "%%a" == "%~dp0list.txt" (
         rem append the output of the whole block into the file
         (echo/------------------------------
         type "%%a"
         echo/)>>"%~dp0list.txt"
      )
   )
)

If you don't understand something, it's quite easy to find good material online because there are several great batch scripting sites. You can also always use echo This is a message visible on the command prompt to display something that might be useful, e.g. variables. With that you can "debug" and see what happens. Some explanations beyond the comments ( rem This is a comment ) in the code:

1. break command:

To clear a file I use the break command, which produces no output at all. I redirect that empty output to a file; read about it here: https://stackoverflow.com/a/19633987/8051589 .

2. General variables:

You set variables via set varname=Content . I prefer the quoted form, set "varname=Content" , as it also works with redirection characters. Use the variable with one leading % and one trailing %, e.g. echo %varname% . You can read a lot more about it at https://ss64.com/nt/set.html . I think ss64 is probably the best site for batch scripting out there.

3. Redirection > and >> :

You can redirect the output of a command with > or >> , where > creates a new file or overwrites an existing one, and >> appends to a file or creates it if it does not exist. Much more is possible: https://ss64.com/nt/syntax-redirection.html .

4. for /f loop:

In a batch file you loop through the lines of a command's output by using a for /f loop. The loop variable is written with two % signs in front of it, here %%a . I also set the delimiters ( delims ) to nothing so that each output line is not split into several tokens.
You can read a lot of details about the for /f loop at: https://ss64.com/nt/for_cmd.html .

5. Special variable syntax %%~xa and %~dp0 :

The variable %%a , which holds one line of the dir command's output, can be expanded to just the file extension via %%~xa , as explained here: https://stackoverflow.com/a/5034119/8051589 . The %~dp0 variable contains the path where the batch file is located; see here: https://stackoverflow.com/a/10290765/8051589 .

6. Block redirection ( ... )>> :

To redirect multiple commands at once you can open a block with ( , execute the commands, close the block with ) and apply a single redirection to it. You could also redirect each command individually, which would have the same effect.

There are so many ways to do this. For example, using the Wolfram Language you can:

StringJoin @@ 
    FileSystemMap[
        If[FileExtension[#] == "txt", Import[#, "Text"]] &, 
        "C:\\Users\\user\\Desktop\\DummyFolder\\", Infinity, 1]

And then write the result using

Export["C:\\Users\\user\\Desktop\\", %, "Text"]

You can also do this with Python, Perl, etc. Use PowerShell only if you need to share your solution and want to avoid installers. I would not spend too much time learning 1981 technology (CMD).
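Since Python is mentioned as an option, here is a minimal sketch of the same merge in Python. The function name merge_text_files and its parameters are illustrative assumptions, not something from the question. It walks the tree with os.walk, appends every .txt file to a single output file, and skips the output file itself in case it lives inside the tree being scanned.

```python
import os


def merge_text_files(base_dir, out_file):
    """Append every .txt file found under base_dir to out_file.

    Hypothetical helper for illustration; returns the list of merged paths.
    """
    out_abs = os.path.abspath(out_file)
    merged = []
    with open(out_abs, "wb") as out:
        for root, dirs, files in os.walk(base_dir):
            dirs.sort()  # visit subdirectories in a predictable order
            for name in sorted(files):
                path = os.path.abspath(os.path.join(root, name))
                # Skip non-text files and the output file itself.
                if not name.lower().endswith(".txt"):
                    continue
                if os.path.normcase(path) == os.path.normcase(out_abs):
                    continue
                with open(path, "rb") as src:
                    out.write(src.read())
                merged.append(path)
    return merged
```

You would call it as merge_text_files(r"C:\Users\user\Desktop\DummyFolder", r"C:\path\to\output.txt"). Files are copied byte for byte, so a source file that does not end with a newline runs directly into the next one.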

This may be a simple answer for what you are looking for. The usebackq option is important to allow "" around paths, and tokens=* includes the whole line. To use it in a console instead of a batch file, change %% to %.

    for /f "tokens=*" %%a in ('dir /s /b C:\testpath\*.txt') do (for /f "usebackq tokens=*"  %%b in ("%%a") do (echo %%b >> C:\test.txt))

Assuming that your source files are located in immediate sub-directories of the root directory DummyFolder and that you want the content of Headings.txt to appear only once, at the top of the resulting file, you could accomplish your task using the following script:

@echo off

rem // Define constants here:
set "folder=C:\Users\user\Desktop\DummyFolder"
set "header=%folder%\Headings.txt"
set "result=%folder%\merged.txt"

rem // Prepare result file, copy content of header file:
copy "%header%" "%result%" > nul
rem // Enumerate immediate sub-directories of the given root directory:
for /D %%D in ("%folder%\*") do (
    rem // Enumerate matching files per sub-directory:
    for %%F in ("%%~D\*.txt") do (
        rem // Append content of current file to result file:
        copy /Y "%result%" + "%%~F" "%result%" /B > nul
    )
)

In case your source files are located anywhere in the directory tree DummyFolder , you need to make sure that the header file Headings.txt and the result file merged.txt are not iterated:

@echo off

rem // Define constants here:
set "folder=C:\Users\user\Desktop\DummyFolder"
set "header=Headings.txt"
set "result=merged.txt"

rem // Prepare result file, copy content of header file:
copy "%folder%\%header%" "%folder%\%result%" > nul
rem // Enumerate matching files in the whole given directory tree:
for /R "%folder%" %%F in ("*.txt") do (
    rem // Exclude the header file to be re-processed:
    if /I not "%%~nxF"=="%header%" (
        rem // Exclude the result file to be processed:
        if /I not "%%~nxF"=="%result%" (
            rem // Append content of current file to result file:
            copy /Y "%folder%\%result%" + "%%~F" "%folder%\%result%" /B > nul
        )
    )
)
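For comparison, the same idea (header written once, header and result files excluded from the loop) can be sketched in Python. This is only an illustration under assumptions: the function name merge_with_header is hypothetical, and the default file names simply mirror the constants used in the batch script above.

```python
from pathlib import Path


def merge_with_header(folder, header="Headings.txt", result="merged.txt"):
    """Write the header once, then append every other .txt file in the tree.

    Hypothetical helper for illustration only.
    """
    folder = Path(folder)
    # Compare names case-insensitively, like "if /I" in the batch script.
    skip = {header.lower(), result.lower()}
    with open(folder / result, "wb") as out:
        out.write((folder / header).read_bytes())
        # sorted() collects all matches first, so the order is deterministic.
        for path in sorted(folder.rglob("*.txt")):
            if path.name.lower() in skip:
                continue
            out.write(path.read_bytes())
```

As in the batch version, any file named Headings.txt or merged.txt is skipped wherever it appears in the tree, not only in the root folder.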
