Split line in data file
I'm struggling with splitting lines in my data file. Here is a sample of a few lines:
1:0 2:120
1:1 2:131
1:2 2:26
1:3 2:568
1:4 2:176
1:5 2:28 3:549
1:6 2:17
1:7 2:6 3:217 4:401 5:636
1:8 2:139
I want to split each line to extract every value, maybe in the form of an array:
((1, 2) , (0, 120))
((1, 2) , (1, 131))
...
((1, 2, 3, 4, 5) , (7, 6, 217, 401, 636))
meaning that the array could have different dimensions for each line. I was trying to split it in two steps, but it doesn't work.
inf = open("datafile.txt", 'r')
for line in inf:
    line.split()
    for x in line.split():
        x.split(':',1)
You can group the elements of two lists using the zip function.
with open("Input.txt") as inf:
    for line in inf:
        # list() and print() keep this working on Python 3 as well as Python 2
        print(list(zip(*map(lambda x: map(int, x.split(":")), line.split()))))
Output
[(1, 2), (0, 120)]
[(1, 2), (1, 131)]
[(1, 2), (2, 26)]
[(1, 2), (3, 568)]
[(1, 2), (4, 176)]
[(1, 2, 3), (5, 28, 549)]
[(1, 2), (6, 17)]
[(1, 2, 3, 4, 5), (7, 6, 217, 401, 636)]
[(1, 2), (8, 139)]
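For readers on Python 3, where map and zip return iterators rather than lists, the same result can be sketched with a list comprehension. The helper name parse_line is hypothetical and used only for illustration; the sketch assumes every token has the form key:value with integer parts.

```python
def parse_line(line):
    """Split a line such as '1:5 2:28 3:549' into (keys, values) tuples."""
    # Each whitespace-separated token looks like "key:value".
    pairs = [tuple(int(part) for part in token.split(":")) for token in line.split()]
    # zip(*pairs) regroups the pairs into one tuple of keys and one of values.
    return list(zip(*pairs))


print(parse_line("1:0 2:120"))
print(parse_line("1:7 2:6 3:217 4:401 5:636"))
```

An empty line yields an empty list, so blank lines in the data file do not raise an error.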
Suggestion: It is always a good idea to open files with the with keyword, as shown in the code above, because it takes care of closing/releasing the resources even if the program fails with an exception.
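A small sketch of that behaviour, assuming a temporary file stands in for the real data file: an exception is raised inside the with block, and the file object is still closed afterwards. The variable name file_was_closed is illustrative.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    with open(path) as inf:
        raise ValueError("simulated failure while reading")
except ValueError:
    pass

# The with block closed the file although an exception was raised inside it.
file_was_closed = inf.closed
print(file_was_closed)

os.remove(path)
```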
Explanation:
Since zip is a function call, its arguments are evaluated first. Let's come back to the * later. In map(lambda x: map(int, x.split(":")), line.split()), we apply the lambda function lambda x: map(int, x.split(":")) to every element of the list of strings returned by line.split() (which splits the line at whitespace characters and returns the pieces as a list).
Each split word is passed to the lambda function as its argument, one at a time. Taking the first line as an example, "1:0" is passed to the lambda function as x, where we split on : to get the list ["1", "0"], and then apply int to each item, which gives [1, 0]. So, after every line is split and the lambda is applied, the results look like this:
[[1, 0], [2, 120]]
[[1, 1], [2, 131]]
[[1, 2], [2, 26]]
[[1, 3], [2, 568]]
[[1, 4], [2, 176]]
[[1, 5], [2, 28], [3, 549]]
[[1, 6], [2, 17]]
[[1, 7], [2, 6], [3, 217], [4, 401], [5, 636]]
[[1, 8], [2, 139]]
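The intermediate step can be checked on a single line. This sketch uses list comprehensions in place of the nested map calls so that it prints real lists on Python 3 as well; the variable names tokens and pairs are illustrative.

```python
line = "1:5 2:28 3:549"

# Step 1: split the line at whitespace into "key:value" tokens.
tokens = line.split()

# Step 2: split each token at ":" and convert both parts to int.
pairs = [[int(part) for part in token.split(":")] for token in tokens]

print(tokens)  # ['1:5', '2:28', '3:549']
print(pairs)   # [[1, 5], [2, 28], [3, 549]]
```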
Now each inner list has two elements. Remember the * that we decided to discuss later: it unpacks the outer list and passes all of its elements as separate arguments to the zip function, like this
zip([1, 0], [2, 120])
Now zip picks the first element of every list and puts them together in a tuple, then picks the second element of every list and puts them in another tuple, and so on.
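To see that the * really just spreads the inner lists out as separate arguments, the unpacked call can be compared with the call written out by hand. The names unpacked, explicit, keys and values are illustrative, and list() is added so the result prints as a list on Python 3.

```python
pairs = [[1, 5], [2, 28], [3, 549]]

# zip(*pairs) is the same call as zip([1, 5], [2, 28], [3, 549]).
unpacked = list(zip(*pairs))
explicit = list(zip([1, 5], [2, 28], [3, 549]))

print(unpacked)              # [(1, 2, 3), (5, 28, 549)]
print(unpacked == explicit)  # True

# The two tuples can be unpacked into separate names if that is more convenient.
keys, values = unpacked
```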
This is how we get the answer you expected. 这就是我们如何得到您期望的答案。