How do I split a file with Ruby using grep?

Question

I have a file that contains many pieces of code, and I'd like to move each of them into its own file. The file in question has some 30k lines, so I don't want to do it by hand.

Each of the sections starts with:

module MyModule

(I've changed that name.)

Is there a function to split the file at each marker? When I use File.readlines I can't find a nice way to split the array.

I don't mind how the output files are named.

Answer 1

I refactored your code.

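# Split on the marker, then write each chunk to a file named after its first class.
File.read('lib/odin.rb').split(/module Odin/).each do |mod|
  name = mod[/class (\w+)/, 1]
  next unless name # skips the empty chunk before the first marker

  File.open("#{name}.rb", 'w') do |f|
    f.write('module Odin') # restore the marker that split removed
    f.write(mod)
  end
end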

Answer 2

I found the answer while writing out the question in detail.

I'm posting it as an answer, but I'll award the accepted answer to someone else who has a better solution:

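require 'active_support/core_ext/string/inflections' # provides String#underscore

# Each chunk is assumed to contain at least one class definition.
File.read('lib/odin.rb').
  split(/module Odin/).
  reject { |chunk| chunk.strip.empty? }.
  map { |chunk| "module Odin#{chunk}" }.
  each do |source|
    name = "#{source[/class ([a-zA-Z]+)/, 1].underscore}.rb"
    File.write(name, source)
  end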

I also thought of a nice way to name the output files based on their content, but I don't mind how you'd name them.
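The snippet above calls underscore, which is provided by ActiveSupport rather than core Ruby. If ActiveSupport is not available, a simplified stand-in could look like the sketch below. The snake_case helper and the ThunderHammer class name are hypothetical, introduced only for illustration, and the helper does not cover every case the ActiveSupport method handles.

```ruby
# Hypothetical helper: a simplified stand-in for ActiveSupport's String#underscore.
def snake_case(name)
  name.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split an acronym from the word after it
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split a lowercase/digit from an uppercase
      .downcase
end

# Made-up chunk standing in for one section of the big file.
source = "module Odin\n  class ThunderHammer\n  end\nend\n"

# Take the first class name in the chunk and derive a file name from it.
class_name = source[/class ([a-zA-Z]+)/, 1]
file_name = "#{snake_case(class_name)}.rb"
p file_name
```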

Answer 3

Ruby has a great method called slice_before, which is part of Enumerable:

require 'pp'

modules = DATA.readlines.map(&:chomp).slice_before(/^module MyModule/).map{ |a| a.join("\n") }
pp modules

__END__
module MyModule
  # 1 stuff
end

module MyModule
  # 2 stuff
end

module MyModule
  # 3 stuff
end

This is the output showing what modules contains:

["module MyModule\n  # 1 stuff\nend\n",
 "module MyModule\n  # 2 stuff\nend\n",
 "module MyModule\n  # 3 stuff\nend"]

DATA is Ruby sleight-of-hand inherited from Perl. Everything in the source file after __END__ is considered part of a "data" block, which the interpreter makes available to the running code through the DATA file handle, and which acts like a data file. That means we can use IO methods on it, such as readlines, similarly to how we'd use IO.readlines. I'm using __END__ and DATA here because they're convenient for simple tests and short scripts.

readlines doesn't remove the trailing line ending when it reads a line, so map(&:chomp) strips it. DATA.read.split("\n") would have accomplished the same thing.
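As a small illustration of the line-ending behaviour, here is a sketch that uses a StringIO in place of DATA. That substitution is an assumption made so the example runs without a __END__ block; both objects respond to readlines and read.

```ruby
require 'stringio'

# StringIO stands in for DATA or an open File here.
io = StringIO.new("module MyModule\n  # 1 stuff\nend\n")

raw = io.readlines         # line endings are kept
chomped = raw.map(&:chomp) # line endings are stripped

p raw
p chomped

# Reading everything at once and splitting on "\n" gives the same array here.
io.rewind
split_lines = io.read.split("\n")
p split_lines == chomped
```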

slice_before is the magic that makes this work. It iterates over the array and starts a new sub-array each time the pattern matches an element. After that it's just a case of joining each sub-array back into a single string before writing it to a file.
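To see the intermediate sub-arrays, here is a minimal sketch on a hand-written array of lines. The sample lines are made up for illustration.

```ruby
lines = [
  'module MyModule', '  # 1 stuff', 'end',
  'module MyModule', '  # 2 stuff', 'end'
]

# slice_before returns an Enumerator; to_a materialises the sub-arrays.
chunks = lines.slice_before(/^module MyModule/).to_a
p chunks

# Join each sub-array back into one string per module.
sources = chunks.map { |chunk| chunk.join("\n") }
p sources
```

Note that any lines appearing before the first match end up in a chunk of their own, so a file with a preamble would likely need that first chunk handled separately.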

After that you just have to loop over modules, saving each one to a different file:

modules.each.with_index(1) do |m, i|
  File.write("module_#{ i }.rb", m)
end

with_index is a nice little method in Enumerator that is useful when we need to know which item in an array we're processing. It's similar to each_with_index, except we can specify the starting offset value, 1 in this case.
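The difference between the two methods can be sketched as follows. Writing into a temporary directory via Dir.mktmpdir is my own choice, made so the example leaves no files behind; the sample modules array is made up.

```ruby
require 'tmpdir'

modules = [
  "module MyModule\n  # 1 stuff\nend",
  "module MyModule\n  # 2 stuff\nend"
]

# each_with_index always counts from 0.
zero_based = modules.each_with_index.map { |_m, i| i }

# each.with_index(1) counts from the offset passed in.
one_based = modules.each.with_index(1).map { |_m, i| i }

p zero_based
p one_based

# Write the files into a throwaway directory and collect their names.
written = Dir.mktmpdir do |dir|
  modules.each.with_index(1) do |m, i|
    File.write(File.join(dir, "module_#{i}.rb"), m)
  end
  Dir.children(dir).sort
end
p written
```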
