
Replace string with unicode in text file via Windows batch file

I have a file with these simple contents:

test.txt (ASCII encoded)

Baby, you can drive my :car:

Via a Windows batch file, I need to change :car: to 🚗 ( https://unicode-table.com/en/1F697/ )

I'd like to avoid installing new software on the client's server, so I'm trying to do it using PowerShell or something native.

So far I've tried a ton of suggestions (https://www.generacodice.com/en/articolo/30745/How-can-you-find-and-replace-text-in-a-file-using-the-Windows-command-line-environment?), but nothing works for me. Either the text doesn't get replaced, or \u1F697 shows up literally. I've also tried changing the input file's encoding to Unicode, and that doesn't work either.

Non-working example:

powershell -Command "(gc test.txt) -replace ':car:', '🚗' | Out-File -encoding Unicode test.txt"

Does anyone have any tips?

Edit: I've determined how to reproduce it.

If I run this line via command line, it works:

powershell -Command "(gc test.txt) -replace ':car:', '🚗' | Out-File -encoding utf8 test-out.txt"

If I put the same line of code inside replace.bat and then execute it, test-out.txt is corrupt.

The batch file is set to UTF-8 encoding. Should something be different?

I don't think a .bat file can have a non-ASCII encoding. If you're willing to use a separate file.ps1:

(gc test.txt) -replace ':car:', '🚗' | Out-File -encoding utf8 test-out.txt

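The file has to be saved as UTF-8 with BOM in Notepad, not just UTF-8. Without the BOM, Windows PowerShell reads the script using the system's ANSI code page and the emoji literal gets mangled.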

Then your .bat file would be:

powershell -file file.ps1

The PowerShell ISE is a nice way to test this.

cmd /c file.bat
type test-out.txt

🚗
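If the output still comes out corrupt, it may be worth checking that file.ps1 really starts with the UTF-8 byte order mark (bytes EF BB BF). The following is only a minimal sketch; Test-Utf8Bom is a made-up helper name, not a built-in cmdlet.

```powershell
# Sketch: report whether a file begins with the UTF-8 byte order mark (EF BB BF).
# Test-Utf8Bom is a hypothetical helper name, not a built-in cmdlet.
function Test-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path first: .NET file APIs do not follow
    # PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    # A UTF-8 BOM is exactly these three leading bytes.
    return ($bytes.Length -ge 3 -and
            $bytes[0] -eq 0xEF -and
            $bytes[1] -eq 0xBB -and
            $bytes[2] -eq 0xBF)
}
```

Calling Test-Utf8Bom -Path .\file.ps1 should return True for a script saved as "UTF-8 with BOM" and False for plain UTF-8 or ASCII.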

The Windows .bat script interpreter does not understand any Unicode encoding (e.g. UTF-8 or UTF-16); the simplest principle is:

You have to save the batch file with OEM encoding. How to do this varies depending on your text editor. The encoding used in that case varies as well. For Western cultures it's usually CP850.
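To illustrate why a batch file saved as UTF-8 goes wrong: in UTF-8 the emoji is four bytes, and under a single-byte OEM code page each of those bytes would be read as a separate character. A small sketch that just prints the bytes (the variable names are only for illustration):

```powershell
# Build the emoji from its code point so this snippet itself stays ASCII.
$car = [char]::ConvertFromUtf32(0x1F697)

# Encode it as UTF-8 and show each byte in hex.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($car)
$hex = ($utf8Bytes | ForEach-Object { '{0:X2}' -f $_ }) -join ' '
$hex   # F0 9F 9A 97
```

None of those four bytes is in the ASCII range, so none of them survives an OEM interpretation unchanged.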

To use any Unicode character above the ASCII range as part of a string passed to a PowerShell command, replace the literal '🚗' with a call to the .NET method Char.ConvertFromUtf32(Int32); in PowerShell syntax, that is [char]::ConvertFromUtf32(0x1F697)

Since this is pure ASCII, it does not conflict with the .bat encoding rule above, and PowerShell evaluates it to the 🚗 character.

Then, your line could be as follows:

powershell -Command "(gc test.txt) -replace ':car:', [char]::ConvertFromUtf32(0x1F697) | Out-File -encoding Unicode test.txt"
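As a rough check of what that expression produces: U+1F697 lies outside the Basic Multilingual Plane, so ConvertFromUtf32 returns a string of two UTF-16 code units (a surrogate pair) rather than a single [char], which holds only one 16-bit code unit. The sketch below runs the replacement on an in-memory copy of the sample line instead of test.txt; the variable names are illustrative only.

```powershell
# Build the emoji from its code point so the source stays pure ASCII.
$car = [char]::ConvertFromUtf32(0x1F697)

# The result is a [string] holding two UTF-16 code units (a surrogate pair).
$units = $car.ToCharArray() | ForEach-Object { '{0:X4}' -f [int]$_ }
$summary = '{0} code units: {1}' -f $car.Length, ($units -join ' ')
$summary   # 2 code units: D83D DE97

# Same replacement as in the one-liner, applied to the sample text from the question.
$line = 'Baby, you can drive my :car:'
$replaced = $line -replace ':car:', $car
```

Writing $replaced with Out-File -Encoding Unicode stores it as UTF-16 LE, while -Encoding utf8 stores it as UTF-8, so pick whichever the consumer of the file expects.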
