How to convert command line output from Freeling to consumable array

I am using Ruby for this. Freeling (an NLP tool) has a shallow parser. When I run a shallow parsing command on the text "I just read the book, the grasshopper lies heavy", it returns a string like this:

a = <<EOT
S_[
  sn-chunk_[
    +(I i PRP -)
  ]
  adv_[
    +(just just RB -)
  ]
  vb-chunk_[
    +(read read VB -)
  ]
  sn-chunk_[
    (the the DT -)
    +n-chunk_[
      (book book NN -)
      +n-chunk_[
        +(The_Grasshopper_Lies_Heavy the_grasshopper_lies_heavy NP -)
      ]
    ]
  ]
  st-brk_[
    +(. . Fp -)
  ]
]

EOT

I want to get the following array from this:

["I", "just", "read", "the book The Grasshopper Lies Heavy","."]

(I want to merge the words that fall under the same subtree into a single array element.)

So far, I have written this much:

b = a.gsub(/.*\[/,'[').gsub(/.*\+?\((\w+|.) .*/,'\1').gsub(/\n| /,"").gsub("_"," ")

which returns

[[I][just][read][the[book[The Grasshopper Lies Heavy]]][.]]

So, how can I get the desired array?

Building on your solution so far:

# Reduce the tree to bracketed words, split at chunk boundaries, then strip leftover brackets
result = a.gsub(/.*\[/,'[').gsub(/.*\+?\((\w+|.) .*/,'\1').gsub(/\n| /,"").gsub("_"," ")
result.split('][').map { |s| s.gsub(/\[|\]/, ' ').strip }
# => ["I", "just", "read", "the book The Grasshopper Lies Heavy", "."]
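
The regex chain depends on the exact bracket layout of the output. As an alternative, here is a minimal sketch that walks the output line by line and tracks nesting depth, collecting all words under each direct child of the root node. The helper name `chunk_words` is hypothetical (it is not part of FreeLing), and the sketch assumes every node opens with a line ending in `[` and closes with a line containing only `]`, as in the sample above.

```ruby
# Sample shallow-parse output, copied from the question.
sample = <<EOT
S_[
  sn-chunk_[
    +(I i PRP -)
  ]
  adv_[
    +(just just RB -)
  ]
  vb-chunk_[
    +(read read VB -)
  ]
  sn-chunk_[
    (the the DT -)
    +n-chunk_[
      (book book NN -)
      +n-chunk_[
        +(The_Grasshopper_Lies_Heavy the_grasshopper_lies_heavy NP -)
      ]
    ]
  ]
  st-brk_[
    +(. . Fp -)
  ]
]
EOT

# Hypothetical helper: returns one string per top-level chunk.
def chunk_words(parse)
  chunks = []   # finished chunks, one string each
  current = []  # words collected for the chunk being read
  depth = 0     # current nesting level; the root node is depth 1

  parse.each_line do |line|
    line = line.strip
    if line.end_with?('[')
      depth += 1                      # a node opens
    elsif line == ']'
      depth -= 1                      # a node closes
      if depth == 1 && !current.empty?
        chunks << current.join(' ')   # a direct child of the root just closed
        current = []
      end
    elsif (m = line.match(/\((\S+) /))
      current << m[1].tr('_', ' ')    # leaf: first field is the word form
    end
  end
  chunks
end

words = chunk_words(sample)
# => ["I", "just", "read", "the book The Grasshopper Lies Heavy", "."]
```

Because grouping is driven by depth rather than by the `][` separator, arbitrarily deep subtrees still collapse into a single element. A token whose form is itself a bracket would need extra handling.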

If you call FreeLing from Ruby via the API, you can get the tree and traverse it at will.

If you are instead using the output of the command-line program and loading it into Ruby as a string, it may be easier to call it with the option "--output conll", which produces a tabular format that is easier to deal with.
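
To illustrate why a tabular format is easier to consume, here is a rough sketch of reading column-based text in Ruby. The sample rows and the column order (index, form, lemma, tag) are assumptions made up for illustration; they are not FreeLing's documented output, so check the real output and adjust `FORM_COLUMN` before relying on this. The only convention assumed from CoNLL-style formats is one token per line with a blank line between sentences.

```ruby
# Hypothetical sample only: the column layout (index, form, lemma, tag) is an
# assumption for illustration, not FreeLing's documented output.
conll = <<EOT
1 I i PRP
2 just just RB
3 read read VB

1 It it PRP
EOT

FORM_COLUMN = 1 # assumed zero-based position of the word form

# Split into sentences on blank lines, then take the form column of each row.
sentences = conll.split(/\n\s*\n/).map do |block|
  block.lines.map { |row| row.split[FORM_COLUMN] }.compact
end
# sentences is an array of arrays, one inner array of word forms per sentence
```

With one token per row, no bracket matching is needed. Grouping words into chunks would rely on whichever column carries the chunk information, if the real output provides one.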
