Python: Parsing a specific column (from scratch, no "import csv") in a tab-separated file
I've written some code that can parse a string into [number, letter] pairs, like so:
s = '30M3I5X'
l = []
num = ""
for c in s:
    if c in '0123456789':
        num = num + c
        print(num)
    else:
        l.append([int(num), c])
        num = ""
print(l)
I.e.,
'30M3I5X'
becomes
[[30, 'M'], [3, 'I'], [5, 'X']]
That part works just fine. I'm struggling now, however, with figuring out how to get the values from the first column of a tab-separated-value file to become my new 's'.
I.e., for a file that looks like:
# File Example #
30M3I45M2I20M I:AAC-I:TC
50M3X35M2I20M X:TCC-I:AG
A loop would somehow take only the first column of each row, producing
[[30, 'M'],[3, 'I'],[45, 'M'],[2, 'I'],[20, 'M']]
[[50, 'M'],[3, 'X'],[35, 'M'],[2, 'I'],[20, 'M']]
without having to use
import csv
or any other module.
Thanks so much!
The following code should serve your purpose:
rows = ['30M3I45M2I20M I:AAC-I:TC', '30M3I45M2I20M I:AAC-I:TC']
for row in rows:
    words = row.split(' ')
    print(words[0])
    l = []
    num = ""
    for c in words[0]:
        if c in '0123456789':
            num = num + c
        else:
            l.append([int(num), c])
            num = ""  # reset so the next number starts fresh
    print(l)
Change row.split(' ') to row.split('\t') or any other separator as needed.
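To illustrate the separator point, here is a small sketch of how the different split calls behave on one line of a tab-separated file. The sample line is copied from the question, and it is an assumption that the columns in the real file are separated by a single tab.

```python
# One line as it would be read from a tab-separated file
# (assumption: the columns are separated by a single tab, written here as \t).
row = "30M3I45M2I20M\tI:AAC-I:TC\n"

# Splitting on a space finds no space in this line, so it stays in one piece.
print(row.split(' '))

# Splitting on the tab separates the columns; the newline stays on the last one.
print(row.split('\t'))

# split() with no argument splits on any run of whitespace and drops the newline.
print(row.split())

# The first column, with the trailing newline stripped before splitting.
first_column = row.rstrip('\n').split('\t')[0]
print(first_column)
```

If a column could itself contain spaces, splitting on '\t' explicitly is the safer choice, since split() with no argument would break such a column apart.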
Something like this should do what you're looking for:
filename = r'\path\to\your\file.txt'
with open(filename, 'r') as infile:  # named infile to avoid shadowing the built-in input()
    for row in infile:
        elements = row.split()
        # processing goes here
elements[0] contains the string that is the first column of data in the file.
Edit:
To end up with a list of the lists of processed data:
result = []
filename = r'\path\to\your\file.txt'
with open(filename, 'r') as infile:  # named infile to avoid shadowing the built-in input()
    for row in infile:
        elements = row.split()
        # processing goes here
        result.append(l)  # l is the result of your processing
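As a rough sketch of how the "processing goes here" step might be filled in, the question's parsing loop can be wrapped in a function and called on elements[0] for each row. The names parse_pairs and parse_first_column are made up for this example, and a list of strings stands in for the open file so the snippet runs on its own; an open file object can be passed in the same way, since iterating over it also yields one line at a time.

```python
def parse_pairs(s):
    """Turn a string like '30M3I5X' into [[30, 'M'], [3, 'I'], [5, 'X']]."""
    pairs = []
    num = ""
    for c in s:
        if c in '0123456789':
            num = num + c                 # keep collecting digits
        else:
            pairs.append([int(num), c])   # a letter closes the current number
            num = ""                      # reset for the next number
    return pairs


def parse_first_column(lines):
    """Parse the first whitespace-separated column of every line."""
    result = []
    for row in lines:
        elements = row.split()
        result.append(parse_pairs(elements[0]))
    return result


# Stand-in for the file contents (the two data rows from the question).
sample = [
    "30M3I45M2I20M\tI:AAC-I:TC\n",
    "50M3X35M2I20M\tX:TCC-I:AG\n",
]
result = parse_first_column(sample)
for parsed_row in result:
    print(parsed_row)
```

With a real file, the call would look like parse_first_column(infile) inside the with block.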
So this is what ended up working for me. I took bits and pieces from everyone, thank you all!
Note: I know it's a bit verbose, but since I'm new, it helps me keep track of everything :)
# Define the parser function
def col1parser(col1):
    l = []
    num = ""
    for c in col1:
        if c in '0123456789':
            num = num + c
        else:
            l.append([int(num), c])
            num = ""
    return l  # return the list so the caller can use it

# Open the file and run the function on column 1 of each row
filename = r'filepath.txt'
with open(filename, 'r') as infile:
    for row in infile:
        elements = row.split()
        col1 = elements[0]
        l = col1parser(col1)
        print(l)
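One thing to watch for: a blank line in the file makes elements[0] raise an IndexError, and a line such as "# File Example #" makes int("") raise a ValueError. Below is a possible variant that skips both kinds of line. It assumes that lines starting with '#' are comments rather than data, which the question does not actually state, and the helper name parse_lines is invented for this sketch. A list of strings stands in for the file.

```python
def col1parser(col1):
    """Parse a string like '30M3I5X' into [[30, 'M'], [3, 'I'], [5, 'X']]."""
    l = []
    num = ""
    for c in col1:
        if c in '0123456789':
            num = num + c
        else:
            l.append([int(num), c])
            num = ""
    return l


def parse_lines(lines):
    """Parse column 1 of each line, skipping blank lines and '#' lines.

    Assumption: a line starting with '#' is a comment, not data.
    """
    parsed = []
    for row in lines:
        elements = row.split()
        if not elements or elements[0].startswith('#'):
            continue  # nothing to parse on this line
        parsed.append(col1parser(elements[0]))
    return parsed


# Stand-in for the file, including a header line and a blank line.
lines = [
    "# File Example #\n",
    "30M3I45M2I20M\tI:AAC-I:TC\n",
    "\n",
    "50M3X35M2I20M\tX:TCC-I:AG\n",
]
parsed = parse_lines(lines)
for entry in parsed:
    print(entry)
```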