Questions tagged [multibyte-functions]

PHP regexp to search and replace string functions with mb string functions

The solution was to look into look-aheads and look-behinds - the concept of lookarounds in regex helped me solve my issue since replacements were eaten fro ...

Why does `std::mbrlen` on mingw-w64 always return one (`1`)

When I compile the following source code in mingw-w64, I am always getting 1 (one) byte from the std::mbrlen: the sample code is based on code from ...

How to sort strings in Unicode using a predefined alphabet?

I have a MySQL table with words in Unicode using signs like ḥ, ḫ, š, etc. The columns in the table are defined as utf8mb4_general_ci and recognize the ...

Dealing with binary data and mb_function overloading?

I have a piece of code here about which I need either assurance or a "no no no!" as to whether I'm thinking about this in the right or entirely wron ...

Display-width of multibyte character in C standard library – how accurate is the database?

The wcwidth call of Standard C Library returns 2 for Asian characters. Then there are Unicode symbols, like arrows. For those it returns 1. It is ofte ...

Reliably rotating any string

I was experimenting with multibyte strings and how to handle them. Using the code that you can see here https://gist.github.com/charlydagos/89f67808e ...

PHP - find first two characters of input from MySQL database using mb_ function

Currently, I'm using the mb_strrichr function to search characters from a database table row, but I'm running into one issue. For this input word helloworld I wan ...

PHP and UTF-8 String functions WITHOUT MB-Functions?

I am trying to use UTF-8 with PHP. The output seems okay on my site (it displays äöüß etc. correctly when testing), but there is a simple problem... When I use ec ...

multi-byte characters in libc regcomp and regexec

Is there any way to get libc6's regexp functions regcomp and regexec to work properly with multi-byte characters? For instance, if my pattern is the u ...

Is it safe to use `strstr` to search for multibyte UTF-8 characters in a string?

Following my previous question: Why `strchr` seems to work with multibyte characters, despite man page disclaimer?, I figured out that strchr was a ba ...

How to properly use MultiByteToWideChar

I am using MultiByteToWideChar to convert my string to a wstring. I am first trying to get the required size for my wstring. According to the document ...

How can I get the correct position of a word in a UTF-8 text?

I have some simple PHP code to get a sentence of a text and bold a specific word. First of all I get an array with the words that I want and their pos ...

PHP: Arabic characters as array keys

I want to implement a simple Arabic to English transliteration. I have defined a mapping array like the following: I expect the following code to c ...

how to tell if a wchar_t has a surrogate (UTF-16)?

I've seen a few other posts on this issue but was unable to find any details on how to determine programmatically if a codepoint uses more than one 2-b ...

php sprintf() with foreign characters?

It seems like sprintf has a problem with foreign characters? Or is it me doing something wrong? It looks like it works when removing chars like åäö fr ...

Combine two Bytes to WideChar

Is it possible to combine two Bytes to WideChar and if yes, then how? For example, letter "ē" in binary is 00010011 = 19 and 00000001 = 1, or 275 tog ...

PHP multi-byte alternatives UTF8

I've been searching for UTF8-safe alternatives for string manipulation functions. I've found many different opinions and suggestions. I would like to ...

String-Conversion: MBCS <-> UNICODE with multiple \0 within

I am trying to convert a std::string Buffer - containing data from a bitmap file - to std::wstring. I am using MultiByteToWideChar, but that does not ...

Combine several mb_ereg_replace()-calls

How can I combine these replacements into one regular expression? The expressions work as expected but I would like to combine them into less than th ...

How to get correct list position in multi-byte string using preg_match

I am currently matching HTML using this code: It matches everything perfectly, however if I have a multibyte character, it counts it as 2 characters ...