Vim: Technical approach

I have a 2,500-line document exported from an Excel spreadsheet. Its structure is consistent and repeats about 100 times, though not identically line for line, because the data in each cycle varies slightly. From each 25-line cycle I could gather at least 30 pieces of information to feed into a custom-built uploader that populates the databases behind a website.

My first thought is to search/replace using submatch() to structure the captured data correctly for uploading into my own db, like:

data1:data2,data3a|data3b|data3c,data4:data5

At first glance there are enough Vim registers to begin, append, and then replace and dump (overwrite) the registers before the next cycle, and then repeat. But in the future I might want to extend this data capture, which could max out my registers (a-z, 0-9) and make it hard to keep track of what's what (counting delimiters). So I am contemplating a function that takes the captured text along with a name for it (bypassing the replace/submatch idea), so that each value is set (let) for retrieval and proper formatting at the end of each cycle. I see a function like:

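" Keep captured values in a dictionary keyed by a descriptive name,
" since register names are limited to a single character.
let g:captured = {}
function! SetVar(varname, varval)
    let g:captured[a:varname] = a:varval
endfunction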

I would capture the data like:

:/sectionHeader/,/sectionFooter/g/pieceOfInfo/call SetVar('varname1', matchstr(getline('.'), 'pieceOfInfo'))

where sectionHeader and sectionFooter define the cycling portion (range) within the document. I would probably use regular expressions to capture these section names and use part of the name to label the variable (instead of varname1), or maybe an incrementing variable like "i".

Then I would format the final output like:

varname1:varname2,varname3|varname4|varname5,varname6:varname7 

I think this would be much easier to maintain, since the variable names could be made meaningful, which makes it easier to keep track of the data through the upload process (and any future expansion).

Questions:

  1. Does this make sense, and is it a reasonable architectural approach to the problem?

  2. Can you suggest a better approach?

Wouldn't an awk script be a better choice for this? You could do the same sort of search and replace and write the result to a separate output file, and awk's line-by-line operation should avoid some of the issues you might run into trying to do this in Vim.

Of course, if you'll never repeat this process, Vim might not be a bad call.
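
As a rough sketch of the awk route (not a drop-in solution): the script below assumes each cycle opens with a line starting with sectionHeader and closes with one starting with sectionFooter, and that the data lines look like "key: value". The field names name, price and tag, the sample data, and the helper names sample_input and extract_records are all made up for illustration; the real patterns would have to match your export.

```shell
# Hypothetical sample input: two cycles in an assumed "key: value" layout.
sample_input() {
  cat <<'EOF'
sectionHeader A
name: widget
price: 10
tag: red
tag: blue
sectionFooter
sectionHeader B
name: gadget
price: 25
tag: green
sectionFooter
EOF
}

# Read cycles on stdin and print one formatted record per cycle.
extract_records() {
  awk '
    # A header starts a new cycle: clear the previous values.
    /^sectionHeader/ { split("", rec); tags = ""; in_section = 1; next }

    # A footer ends the cycle: emit name:price,tag|tag|...
    /^sectionFooter/ {
      if (in_section) print rec["name"] ":" rec["price"] "," tags
      in_section = 0
      next
    }

    # Repeated fields are joined with "|".
    in_section && /^tag: / {
      value = substr($0, 6)
      tags = (tags == "" ? value : tags "|" value)
      next
    }

    # Any other "key: value" line is stored under its own name.
    in_section && /^[a-z]+: / {
      sep = index($0, ": ")
      rec[substr($0, 1, sep - 1)] = substr($0, sep + 2)
    }
  '
}

sample_input | extract_records
```

With the sample above this prints widget:10,red|blue and gadget:25,green. The named array keys play the role of the descriptive variable names in the question, so capturing another field later means one more key rather than another register. For the real file you would run something like extract_records < export.txt > upload.txt, where both file names are placeholders.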
