
Having trouble with a batch script

I have a batch script which calls sqlcmd to pull the results of a SELECT statement into a file called temp.txt. There are some foreign characters in the data that require us to use Unicode, so temp.txt is written as UTF-8 (code page 65001).

Once the data is in temp.txt, the script counts the number of rows and adds some headers at the top. In order to do this, it must create a new file (let's call it newfile.txt), write the headers and row count to it, and then copy each line from temp.txt into newfile.txt.
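As a rough illustration of the count-then-copy step described above, here is a minimal C sketch (it is not the batch script itself). The function name copy_with_row_count is hypothetical, the header text mirrors the "!total rows=" line that the script writes, and the two extra header lines are left out. Unlike "for /f", which skips blank lines, this sketch counts every line.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: counts the lines in `in`, writes a header line
 * containing that count to `out`, then copies `in` to `out` unchanged.
 * `in` must be seekable (a regular file, not a pipe).
 * Returns the line count, or -1 on an I/O error.
 */
long copy_with_row_count(FILE *in, FILE *out)
{
    long rows = 0;
    int c, last = '\n';

    assert(in != NULL && out != NULL);

    /* First pass: count newline-terminated lines. */
    while ((c = getc(in)) != EOF) {
        if (c == '\n')
            rows++;
        last = c;
    }
    /* A final line without a trailing newline still counts as a row. */
    if (last != '\n')
        rows++;
    if (ferror(in) || fseek(in, 0L, SEEK_SET) != 0)
        return -1;

    /* Write the header first, then copy the data in a second pass. */
    if (fprintf(out, "!total rows=%ld\n", rows) < 0)
        return -1;
    while ((c = getc(in)) != EOF)
        if (putc(c, out) == EOF)
            return -1;
    return ferror(in) ? -1 : rows;
}
```

Both streams should be opened in binary mode ("rb" and "wb") on Windows so that the data is copied byte for byte.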

All of this works fine, except that the first line copied in from temp.txt has a Unicode byte order mark in it; this means that the first line, instead of looking like this:

1, Custom Page

looks like this instead:

ï»¿1, Custom Page
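The byte order mark mentioned above is the three-byte sequence EF BB BF (U+FEFF encoded as UTF-8); a viewer that assumes a single-byte code page such as Windows-1252 typically renders it as ï»¿. A minimal C sketch of detecting it at the start of a buffer follows. The helper name utf8_bom_length is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+FEFF encoded as UTF-8 is the byte sequence EF BB BF. */
static const unsigned char UTF8_BOM[] = { 0xEF, 0xBB, 0xBF };

/*
 * Hypothetical helper: returns the number of leading bytes to skip,
 * which is 3 if buf starts with a UTF-8 BOM and 0 otherwise.
 */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    /* The length check runs first, so memcmp never reads past the buffer. */
    if (len >= sizeof UTF8_BOM && memcmp(buf, UTF8_BOM, sizeof UTF8_BOM) == 0)
        return sizeof UTF8_BOM;
    return 0;
}
```

A caller would read the first bytes of the file, pass them to this function, and start copying from the returned offset, so only the very first line is affected.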

I cannot figure out the best way to handle this.

If I could tell sqlcmd to give me Unicode without a BOM, that would be perfect. I tried googling around but couldn't figure it out.

If I could figure out how to write a batch file FOR loop that removes the first three characters of only the first line when copying in temp.txt, I'd try that, but after some googling and experimentation I'm frustrated there.

For the record, the relevant code looks like this:

::%1 = sql file to call; %2 = filename to be created; %3 = header for file; %4 = data type row for file
sqlcmd -I -f 65001 -W -k 1 -h -1 -s "," -S servername -d dbname -i %1 -o temp.txt
set counter=0
for /f %%a in (temp.txt) do set /a counter+=1
echo ^^!total rows=%counter% >> %2
echo !str1! >> %2
echo !str2! >> %2
for /F "delims=¶" %%i in (temp.txt) do ( echo %%i >> %2 )

Please help me, I'm going insane over this ridiculous little problem.

You might try

chcp 65001

in your batch script prior to invoking sqlcmd. It wouldn't be entirely intuitive, but perhaps it plays a role.

If all else fails, get yourself a version of bomstrip, and you should be in the clear.

HTH

Update

I have a 'fixed' version for Windows that reopens stdin/stdout in binary mode, so that line endings are not converted for you automatically:

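#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <fcntl.h>   /* _O_BINARY */
#include <io.h>      /* _setmode, _fileno */
#endif

static void
usage(const char *prog)
{
    fprintf(stderr, "usage: %s < infile > outfile\n", prog);
    exit(1);
}

int
main(int argc, char *argv[])
{
    size_t nread;
    char buf[65536];
    static const char utf8bom[] = "\xef\xbb\xbf";
    const size_t bomlen = sizeof utf8bom - 1;

    if (argc > 1)
        usage(argv[0]);

    /*
     * On Windows, stdin/stdout default to text mode, which translates
     * line endings.  Force both into binary mode so that the data passes
     * through byte for byte.
     */
#ifdef _WIN32
    _setmode(_fileno(stdout), _O_BINARY);
    _setmode(_fileno(stdin), _O_BINARY);
#endif

    /* Read the first three bytes and drop them only if they are the BOM. */
    nread = fread(buf, 1, bomlen, stdin);
    if (nread == 0)
        return ferror(stdin) ? 1 : 0;
    /* buf is not NUL-terminated, so compare with memcmp, not strcmp. */
    if (nread != bomlen || memcmp(buf, utf8bom, bomlen) != 0)
        fwrite(buf, 1, nread, stdout);

    /* Copy the rest of the input unchanged. */
    for (;;) {
        nread = fread(buf, 1, sizeof buf, stdin);
        if (nread == 0)
            break;
        if (fwrite(buf, 1, nread, stdout) != nread)
            return 1;
    }
    /* fread returns size_t, so read errors are detected with ferror. */
    return ferror(stdin) ? 1 : 0;
}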

Now you can do:

> .\bomstrip.exe < withoutbom > test
> md5sum.exe withoutbom test
f9f2e33bb16636f990180fa3fcbc93cb *withoutbom
f9f2e33bb16636f990180fa3fcbc93cb *test
