How to store UTF-8 encoded data in sqlite3 using Visual C++

I've created an SQLite database with the default encoding (UTF-8).

Then I use the following statement to insert data:

strcpy(sql,"insert into blog(title) values('呵呵')");
sqlite3_exec(db,sql,0,0,0);

Then I opened the database with a tool called SQLite Developer. With Data encoding set to UNICODE, the value of the title field shows up as garbage (ºǺ). When I changed Data encoding to ANSI, the title displayed correctly.

As far as I know, the sqlite3_exec prototype is:

int sqlite3_exec(
  sqlite3*,                                  /* An open database */
  const char *sql,                           /* SQL to be evaluated */
  int (*callback)(void*,int,char**,char**),  /* Callback function */
  void *,                                    /* 1st argument to callback */
  char **errmsg                              /* Error msg written here */
);

I also tried passing a wchar_t string as sql, but that did not work either.

My Visual C++ project already defines UNICODE and _UNICODE. So my question is: how do I store UTF-8 encoded data in sqlite3 using Visual C++?


Update (question solved)

Inspired by msandiford, I used iconv to convert the GBK-encoded statement to UTF-8. Thanks so much, msandiford.

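// The literal is compiled in the source file's encoding (GBK here).
strcpy(sql,"insert into blog(title) values('呵呵')");
char* pIn = sql;
size_t inLen = strlen(sql);
// Output buffer, zero-filled; outLen leaves one byte for the terminating NUL.
size_t outLen = 1999;
char* sql2 = (char*)malloc(2000);
memset(sql2,0,2000);
char* pOut = sql2;
// iconv_open takes the target encoding first, then the source encoding.
iconv_t g2u8 = iconv_open("UTF-8","GBK");
if (g2u8 != (iconv_t)-1) {
    // Some iconv builds declare the second parameter as char**; drop the const in the cast there.
    if (iconv(g2u8,(const char**)&pIn,&inLen,&pOut,&outLen) != (size_t)-1)
        sqlite3_exec(db,sql2,0,0,0);   // sql2 now holds the UTF-8 statement
    iconv_close(g2u8);
}
free(sql2);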

Collecting comments into answer form:

From the question comments, apparently the source files are not encoded in UTF-8. Converting to UTF-8 or using the UTF-8 encoding directly seems to work.
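One way to confirm this diagnosis is to check whether the bytes handed to sqlite3_exec are well-formed UTF-8. The following is a minimal sketch and not part of the original answer: `is_valid_utf8` and `check_blog_title_literal` are hypothetical helper names, and the GBK bytes BA C7 BA C7 assume the source file was saved as GBK, as the question's update suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8. Rejects stray continuation
// bytes, truncated sequences, overlong forms, surrogates, and code points
// above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) { ++i; continue; }                    // ASCII byte
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                                  // invalid lead byte
        if (i + len > n) return false;                      // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80) return false;          // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) ||
            (len == 4 && cp < 0x10000)) return false;       // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     // surrogate
        if (cp > 0x10FFFF) return false;                    // out of Unicode range
        i += len;
    }
    return true;
}

// Usage sketch: the UTF-8 bytes for 呵呵 pass, the GBK bytes do not.
void check_blog_title_literal() {
    assert(is_valid_utf8("\xE5\x91\xB5\xE5\x91\xB5"));      // UTF-8 encoding
    assert(!is_valid_utf8("\xBA\xC7\xBA\xC7"));             // GBK encoding (assumed)
}
```

Calling `is_valid_utf8(sql)` just before sqlite3_exec in a debug build would flag a statement whose literal was compiled from a non-UTF-8 source file.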

Using UTF-8 encoding directly:

    strcpy(sql,"insert into blog (title) values ('\xE5\x91\xB5\xE5\x91\xB5')");
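The escape sequence above is the UTF-8 encoding of U+5475 (呵) written twice. As a rough illustration of where those bytes come from, here is a small sketch that encodes a code point by hand; `encode_utf8` and `blog_insert_sql` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point as UTF-8. Returns an empty string for
// surrogates and for values above U+10FFFF.
std::string encode_utf8(unsigned long cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;       // surrogates are not encodable
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Builds the same statement as the hex-escaped literal, independent of
// how the source file itself is encoded.
std::string blog_insert_sql() {
    const std::string he = encode_utf8(0x5475);             // U+5475 is 呵
    assert(he.size() == 3);                                 // U+0800..U+FFFF take three bytes
    return "insert into blog (title) values ('" + he + he + "')";
}
```

Because only ASCII appears in the string literals, the result does not depend on the encoding Visual Studio used when saving the file.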

You could avoid having to convert all your source files to UTF-8 by doing something like this:

    sprintf(sql, "insert into blog (title) values('%s')", AnsiToUtf8("呵呵"));

Unfortunately the AnsiToUtf8() function is going to be pretty platform specific.
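On Windows, one common approach is to go through UTF-16 using the Win32 functions MultiByteToWideChar and WideCharToMultiByte. The following is only a sketch under stated assumptions: `AnsiToUtf8Sketch` is a hypothetical name, it returns a std::string (so the sprintf call above would need `.c_str()`), and the `codePage` parameter is my addition. CP_ACP means the ANSI code page of the machine running the code, while passing 936 pins the conversion to GBK.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <string>

// Converts `in` from the given Windows code page to UTF-8.
// Returns an empty string if the input is empty or cannot be converted.
std::string AnsiToUtf8Sketch(const std::string& in, UINT codePage = CP_ACP) {
    if (in.empty()) return std::string();

    // Step 1: code page -> UTF-16. First call asks for the required length.
    const int wlen = MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                                         in.data(), static_cast<int>(in.size()),
                                         NULL, 0);
    if (wlen <= 0) return std::string();
    std::wstring wide(wlen, L'\0');
    MultiByteToWideChar(codePage, MB_ERR_INVALID_CHARS,
                        in.data(), static_cast<int>(in.size()),
                        &wide[0], wlen);

    // Step 2: UTF-16 -> UTF-8. The last two arguments must be NULL for CP_UTF8.
    const int ulen = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen,
                                         NULL, 0, NULL, NULL);
    if (ulen <= 0) return std::string();
    std::string out(ulen, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen,
                        &out[0], ulen, NULL, NULL);
    return out;
}
#endif
```

This only gives the right result when `codePage` matches the encoding the source file was actually saved in, which is exactly the assumption that can fail across differently configured machines.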


Looking further into this, it appears that Visual Studio saves source files in the default encoding for your Windows locale settings. Based on this, there could potentially be an assortment of encodings if your dev team's computers are set up for different locales.

I think it would be quite difficult, if not impossible, to implement an AnsiToUtf8() function that would cope in all the possible cases, especially given that the locale settings for the computer that the code is developed on may not be the same as the computer that ultimately runs the code.

I think the cleanest way to resolve this would be to use UTF-8 encoding uniformly in source files, assuming you want to use code points in string literals outside the areas where the default encoding and Unicode overlap.

Another way would be to internationalise the code so that the source files did not contain extended characters, and use something like GNU gettext or similar to handle translations.

