
UTF-8 character set

I have a form field that should allow up to 120 characters and accept the full Unicode character set (encoded as UTF-8), including special characters, digits, and letters, to support internationalization (i18n). It should ignore leading and trailing spaces.

As I have mostly used limited ASCII set, I am not sure what UTF-8 would include.

Could you please explain the basic differences between ASCII and UTF-8, and which characters should be allowed given the above requirement?

Thank you.

ASCII contains only 128 characters, whereas Unicode at the time of writing (version 6.0) contains more than 109,000 characters covering 93 scripts, and later versions add more.

http://en.wikipedia.org/wiki/ASCII - the full description about ASCII

http://en.wikipedia.org/wiki/Unicode - the wiki article about Unicode

http://unicode.org/charts/ - list of Unicode charts

Simply put, UTF-8 is a superset of US-ASCII: every ASCII character is represented in UTF-8 by the same single byte value. UTF-8 is one encoding of Unicode, and it can represent every character Unicode defines, using one to four bytes per character.
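To make this concrete, here is a minimal Python sketch. The names `utf8_length`, `clean_field`, and `MAX_LENGTH` are hypothetical, and it assumes that "120 characters" means 120 Unicode code points (not bytes, and not user-perceived glyphs), which the question does not specify.

```python
MAX_LENGTH = 120  # limit from the question; assumed to count code points


def utf8_length(text: str) -> int:
    """Return how many bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def clean_field(value: str) -> str:
    """Strip leading/trailing whitespace, then enforce the length limit."""
    trimmed = value.strip()  # also removes Unicode whitespace such as U+3000
    if len(trimmed) > MAX_LENGTH:
        raise ValueError(f"field exceeds {MAX_LENGTH} characters")
    return trimmed


# ASCII text produces identical bytes under both encodings.
assert "Hello".encode("utf-8") == "Hello".encode("ascii")

# Characters outside ASCII take two to four bytes in UTF-8.
print(utf8_length("A"))           # Latin capital letter A
print(utf8_length("\u00e9"))      # e with acute accent
print(utf8_length("\u4e2d"))      # a CJK ideograph
print(utf8_length("\U0001F600"))  # an emoji outside the Basic Multilingual Plane
```

Because the character count and the byte count differ, a field limited to 120 characters may need up to 480 bytes of storage if the backing column is sized in bytes.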
