
J unicode index accessor

In J, I can do the following:

r=:'0123456'
m=:3 } r
echo m

and it prints 3, as it should.

However, unicode does not seem to work:

r=:'▁▂▃▄▅▆▇'
m=: 3 } r
echo m

prints nothing. My guess is that this is due to } indexing by byte. What is the proper way to index by char position?

You are correct that the list is indexed by byte. That is because its datatype is literal. If you want it to be interpreted as unicode, the list needs to be converted to unicode first:

   datatype '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'       NB. check datatype of list
literal
   # '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'              NB. count items in list
60
   ucp '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'            NB. convert to unicode point chars
①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳
   datatype ucp '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'   NB. check datatype
unicode
   # ucp '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'          NB. count items in unicode list
20
   3} ucp '①②③④⑤⑥⑦⑧⑨⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳'         NB. index into the list
④
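
Applied to the string from the question, the same conversion could look like the sketch below. It assumes the standard library is loaded (it provides ucp and datatype) and uses { (From), the usual verb for selecting an item by index. The name u is made up for this example.

   r =: '▁▂▃▄▅▆▇'
   # r                 NB. 7 characters, 3 UTF-8 bytes each
21
   a. i. 3 { r         NB. byte 3 is the lead byte of the second character
226
   u =: ucp r          NB. convert once, then index by character
   # u
7
   3 { u               NB. character at position 3
▄

A lone lead byte such as 226 is not valid UTF-8 on its own, which would explain why echo showed nothing visible in the question.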
