Ruby counting duplicates in array and print into new file
I need to count the duplicates in an array, find out how many times each one appears, and then write the result to a document. This is what I've done so far, and I'm now clueless about how to proceed. The data comes from another txt file. I apologize if it's a bit messy, but I am so confused right now.
class Ticket
  attr_accessor :ticknum
  attr_accessor :serialnum
  def initialize(ticknum,serialnum)
    @ticknum = ticknum
    @serialnum = serialnum
  end
end
class Ticketbook
  def initialize
    @ticketbook = Array.new
  end
  def newticket(ticket)
    @ticketbook << ticket
    @ticketbook.sort! {|x,y| x.ticknum*1000 + x.serialnum <=> y.ticknum*1000 + y.serialnum}
  end
  def soldnumber(tickenum2,serialnum2)
    @ticknum2 = ticknum2
    @serialnumb2 = serialnum2
    @antal = 0
    for i in 0..@ticketbook.length-1
      if @ticknum2 == @ticketbook[i].ticknum && @serialnum2 == @ticketbook[i].serialnum
        @antal +=1
      end
    end
    return @antal
  end
end
ticketfile = File.open("tickets.txt", "r")
book = Ticketbook.new
ticketfile.each {|line|
  a = line.split(",")
  newdoc = Ticket.new(a[0].to_i,a[1].to_i)
  book.newticket(newdoc)
}
registernums = File.new("registernums.txt", "w")
for i in (0..@ticketbook.length-1)
  registernums.print book[i].@ticketnum.to_i + ", "
  registernums.print book[i].@serialnumber.to_i + ", "
  registernums.puts book[i].soldnumber(i)
end
print registernums
This gives me the following error:
rb 56 unexpected tIVAR, expecting "(" registernums.print book[i].@ticketnum.to_i
rb 57 unexpected tIVAR, expecting "(" registernums.print book[i].@serialnum.to_i
Your for loop has no body, so your last few lines refer to i outside the loop, where it is not defined.
The problem is with these lines.
registernums.print book[i].@ticketnum.to_i + ", "
registernums.print book[i].@serialnumber.to_i + ", "
To access an object's attributes from outside the object, you do not put an @ in front of the name; you call the accessor method instead. So the correct code should be:
# The accessors defined on Ticket are ticknum and serialnum; to_s is needed before appending a String.
registernums.print book[i].ticknum.to_s + ", "
registernums.print book[i].serialnum.to_s + ", "
Also, as @Jonah pointed out, there should be an end to close the last for loop.
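A minimal sketch of why the @ form fails, assuming the Ticket class from the question: attr_accessor defines reader and writer methods named after the attribute, and those methods are what code outside the object calls. The values 22 and 55 are made up for illustration.

```ruby
class Ticket
  attr_accessor :ticknum, :serialnum

  def initialize(ticknum, serialnum)
    @ticknum = ticknum       # @ is only written inside the object's own methods
    @serialnum = serialnum
  end
end

ticket = Ticket.new(22, 55)

# Outside the object, call the reader methods generated by attr_accessor.
# to_s converts each Integer so it can be joined with the ", " separator.
line = ticket.ticknum.to_s + ", " + ticket.serialnum.to_s
puts line
```

Writing ticket.@ticknum is a syntax error because the parser expects a method name after the dot.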
There are a few problems here:
for i in (0..@ticketbook.length-1)
registernums.print book[i].@ticketnum.to_i + ", "
registernums.print book[i].@serialnumber.to_i + ", "
registernums.puts book[i].soldnumber(i)
print registernums
This code is outside the Ticketbook class, so none of that class's instance variables (those beginning with @) are available here.
If you want to access the array of tickets from outside the Ticketbook, add an attr_reader :ticketbook to the Ticketbook class.
You might want to replace your code with something like:
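book.ticketbook.each do |tb|
  # to_s is needed because an Integer cannot be added to a String
  registernums.print tb.ticknum.to_s + ", "
  registernums.print tb.serialnum.to_s + ", "
  # soldnumber is defined on the book and takes both numbers
  registernums.puts book.soldnumber(tb.ticknum, tb.serialnum)
end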
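A self-contained sketch of this approach, assuming the Ticket and Ticketbook classes from the question with an attr_reader added. StringIO stands in for the output file, the three tickets are invented sample data, and soldnumber is rewritten with Array#count purely for brevity.

```ruby
require "stringio"

class Ticket
  attr_reader :ticknum, :serialnum

  def initialize(ticknum, serialnum)
    @ticknum = ticknum
    @serialnum = serialnum
  end
end

class Ticketbook
  attr_reader :ticketbook   # exposes the array of tickets to outside code

  def initialize
    @ticketbook = []
  end

  def newticket(ticket)
    @ticketbook << ticket
  end

  # Number of stored tickets with this ticket number and serial number.
  def soldnumber(ticknum, serialnum)
    @ticketbook.count { |t| t.ticknum == ticknum && t.serialnum == serialnum }
  end
end

book = Ticketbook.new
[[22, 55], [27, 65], [22, 55]].each { |tn, sn| book.newticket(Ticket.new(tn, sn)) }

out = StringIO.new   # stand-in for File.new("registernums.txt", "w")
book.ticketbook.each do |tb|
  out.puts "#{tb.ticknum}, #{tb.serialnum}, #{book.soldnumber(tb.ticknum, tb.serialnum)}"
end
puts out.string
```

String interpolation sidesteps the Integer-plus-String problem, and each line ends with how many times that ticket/serial pair occurs in the book.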
Oh, boy!
Before I start, a few important points:
- You are overusing instance variables. The Ticket class is all right, but Ticketbook (which should be named TicketBook) should have only one instance variable (the one set in the initialize method); the rest should be local variables within each method.
- The Ruby naming convention is to separate words with _ (new_doc, ticket_file and so on).
- You should almost never use a for loop. The only reason to use it is to write your own iterator, but you are acting on arrays here, so use the each method.
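The points about local variables and each can be sketched as follows, assuming a plain array of [ticknum, serialnum] pairs as stand-in data (the variable names and sample values are hypothetical):

```ruby
tickets = [[22, 55], [27, 65], [22, 55]]

# Index-based for loop, in the style of the question.
count_with_for = 0
for i in 0..tickets.length - 1
  count_with_for += 1 if tickets[i] == [22, 55]
end

# The same count with each: the block receives the element, no index needed.
count_with_each = 0
tickets.each do |ticket|
  count_with_each += 1 if ticket == [22, 55]
end

# Array#count with a block does the whole job in one call.
count_direct = tickets.count { |ticket| ticket == [22, 55] }

puts count_with_for, count_with_each, count_direct
```

All three counters are plain local variables; nothing here needs to be stored on an object with @.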
Now about the errors:
ticketfile = File.open("tickets.txt", "r")
book = Ticketbook.new
ticketfile.each {|line|
  a = line.split(",")
  newdoc = Ticket.new(a[0].to_i,a[1].to_i)
  book.newticket(newdoc)
}
registernums = File.new("registernums.txt", "w")
for i in (0..@ticketbook.length-1) # @ticketbook belongs to Ticketbook; out here it is nil, so you'll get "undefined method length for nil:NilClass"
  registernums.print book[i].@ticketnum.to_i + ", " # book is an instance of Ticketbook, and [] is not defined on that class!
  registernums.print book[i].@serialnumber.to_i + ", "
  registernums.puts book[i].soldnumber(i)
print registernums
Your Ticketbook class:
class Ticketbook
  def initialize
    @ticketbook = Array.new # personally I would prefer []
  end
  def newticket(ticket)
    @ticketbook << ticket
    @ticketbook.sort! {|x,y| x.ticknum*1000 + x.serialnum <=> y.ticknum*1000 + y.serialnum}
  end
  def soldnumber(tickenum2,serialnum2) # typo: the body uses ticknum2, not tickenum2
    @ticknum2 = ticknum2 # unnecessary
    @serialnumb2 = serialnum2 # unnecessary (and misspelled: @serialnum2 is what is read below)
    @antal = 0
    for i in 0..@ticketbook.length-1 # Should be @ticketbook.each do |ticket|
      if @ticknum2 == @ticketbook[i].ticknum && @serialnum2 == @ticketbook[i].serialnum
        @antal +=1
      end
    end
    @antal
    # much better would be:
    # def soldnum(ticknum2, serialnum2)
    #   @ticketbook.select {|ticket| ticket.ticknum == ticknum2 && ticket.serialnum == serialnum2 }.count
    # end
  end
end
I would also introduce you to the group_by method. Run on an array, it converts it into a really nice hash whose keys are the results of the executed block:
[1,2,3,4,5,6].group_by {|e| e.odd?} #=> {true => [1,3,5], false => [2,4,6]}
You can use it to get the repetition count in one go:
# inside ticket book
def count_repetitions
  Hash[@ticketbook.group_by {|e| [e.ticknum, e.serialnum]}.map {|key, value| [key, value.count]}]
end
This should return a hash whose keys are two-element arrays containing ticknum and serialnum, and whose values are the number of occurrences.
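As a runnable sketch of that idea outside the class, assuming a Struct as a stand-in for the question's Ticket class and three invented tickets:

```ruby
Ticket = Struct.new(:ticknum, :serialnum)   # stand-in for the question's Ticket class

tickets = [Ticket.new(22, 55), Ticket.new(27, 65), Ticket.new(22, 55)]

# group_by builds {key => [matching tickets]}; map turns each group into a count.
pairs = tickets.group_by { |t| [t.ticknum, t.serialnum] }
               .map { |key, group| [key, group.count] }
counts = Hash[pairs]
p counts

# On Ruby 2.7 or newer, Enumerable#tally gives the same result directly.
tally = tickets.map { |t| [t.ticknum, t.serialnum] }.tally
p tally
```

Here the pair [22, 55] maps to 2 and [27, 65] maps to 1, so any key with a value above 1 is a duplicate.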
tIVAR refers to an instance variable, so the error message unexpected tIVAR means that Ruby found an instance variable where it wasn't expecting one, and it points to this line (and the one after):
registernums.print book[i].@ticketnum.to_i + ", "
Accessing an attribute of an object from outside doesn't use the @ character (it is not part of the accessor method's name). A correct way to access your ticknum attribute is:
# to_s rather than to_i, because an Integer cannot be added to the ", " String
registernums.print book[i].ticknum.to_s + ", "
As your question has been answered, I would like to suggest a more "Ruby-like" way of dealing with your problem. First, a few points:
- A ticket could be represented as an instance of a Ticket class, as you have done, as a two-element array, with the understanding that the first and second elements correspond to the ticket and serial numbers, respectively, or as a hash, with one key for the ticket number and another for the serial number. I favor a hash.
- With the methods available in the Enumerable "mixin" module, there is no need to loop over indices. Avoiding such loops will make your code more compact, easier to read and less likely to contain errors.
We begin by adding some tickets to ticketbook:
ticketbook = []
ticketbook << {tnbr: 22, snbr: 55}
ticketbook << {tnbr: 27, snbr: 65}
ticketbook << {tnbr: 22, snbr: 56}
ticketbook << {tnbr: 27, snbr: 66}
# => [{:tnbr=>22, :snbr=>55}, {:tnbr=>27, :snbr=>65},
#     {:tnbr=>22, :snbr=>56}, {:tnbr=>27, :snbr=>66}]
Now to find duplicates (tickets having the same ticket number but different serial numbers). Once you gain more experience with Ruby, you will think of the Enumerable#group_by method (or possibly Enumerable#chunk) whenever you want to group elements of an array by some characteristic:
g0 = ticketbook.group_by {|t| t[:tnbr]}
# => {22=>[{:tnbr=>22, :snbr=>55}, {:tnbr=>22, :snbr=>56}], \
# 27=>[{:tnbr=>27, :snbr=>65}, {:tnbr=>27, :snbr=>66}]}
As you see, when we group_by ticket number, we obtain a hash with elements (k,v), where the key k is a ticket number and the value v is an array of tickets (hashes) having that ticket number.
This may be all you need. If you want a count of the number of tickets having the same ticket number, you could use Enumerable#map to convert each value in the g0 hash (an array of tickets having the same ticket number) to the number of such tickets:
g1 = g0.map {|k,v| {k => v.size}} # => [{22=>2}, {27=>2}]
You might stop here, but it would be more convenient if this were a hash ({22=>2, 27=>2}) rather than an array of single-pair hashes. There are several ways you can convert this array to a hash. One is to use map to convert the hashes to arrays:
g2 = g1.map(&:to_a) # => [[[22, 2]], [[27, 2]]]
(where map(&:to_a) is shorthand for map {|h| h.to_a}) then use Array#flatten to convert this to:
g3 = g2.flatten # => [22, 2, 27, 2]
One way to create a hash (in general) is like this:
Hash[1,2,3,4] # => {1=>2, 3=>4}
To do this with the array g3 we need to prefix the array with the "splat" operator:
Hash[*g3] # => {22=>2, 27=>2}
This gives us the desired hash of counts by ticket number. I said that was one way to convert an array of single-pair hashes to a hash. Here is a more direct way:
g1.pop.merge(*g1) # => {27=>2, 22=>2}
Here g1.pop returns {27=>2} and leaves g1 as [{22=>2}]. The above expression is therefore equivalent to:
{27=>2}.merge(*[{22=>2}]) # => {27=>2, 22=>2}
which merges the hashes in the splatted array (here just one) into the hash that precedes merge. Passing several hashes to merge in one call requires Ruby 2.6 or newer.
Rather than introducing the local variables g0 and g1, you would normally "chain" these three operations:
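# g1 does not exist in a chain, so reduce(:merge) combines the single-pair hashes instead of pop.merge(*g1)
ticketbook.group_by {|t| t[:tnbr]}.map {|k,v| {k => v.size}}.reduce(:merge)
# => {22=>2, 27=>2}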
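As a variation on the chained form, on Ruby 2.4 or newer the grouping hash can be mapped straight to counts with Hash#transform_values, which avoids building the intermediate array of single-pair hashes. This is a sketch using the same four sample tickets, not part of the original answer:

```ruby
ticketbook = [
  {tnbr: 22, snbr: 55},
  {tnbr: 27, snbr: 65},
  {tnbr: 22, snbr: 56},
  {tnbr: 27, snbr: 66}
]

# group_by gives {ticket number => [tickets]}; transform_values keeps the keys
# and replaces each array of tickets with its size.
counts = ticketbook.group_by { |t| t[:tnbr] }.transform_values(&:size)
p counts

# Ticket numbers that occur more than once.
duplicates = counts.select { |_tnbr, n| n > 1 }.keys
p duplicates
```

Both ticket numbers appear twice in this sample, so both end up in duplicates.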
Lastly, while your version of sort is fine, you could also do it like this:
# compare ticket numbers first; fall back to serial numbers on a tie
ticketbook.sort! {|x,y| (x[:tnbr] <=> y[:tnbr]) == 0 ? x[:snbr] <=> y[:snbr] :
                                                       x[:tnbr] <=> y[:tnbr]}
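An alternative sketch for the same ordering: Enumerable#sort_by with an array key compares ticket numbers first and serial numbers second, because arrays compare element by element. The sample data is four hypothetical tickets in shuffled order.

```ruby
ticketbook = [
  {tnbr: 27, snbr: 66},
  {tnbr: 22, snbr: 56},
  {tnbr: 27, snbr: 65},
  {tnbr: 22, snbr: 55}
]

# The block returns the sort key; [tnbr, snbr] sorts by tnbr, then by snbr.
sorted = ticketbook.sort_by { |t| [t[:tnbr], t[:snbr]] }
sorted.each { |t| puts "#{t[:tnbr]}, #{t[:snbr]}" }
```

sort_by returns a new array; sort_by! would reorder ticketbook in place, as sort! does.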