Python: Parsing a specific column of a tab-delimited file (from scratch, no "import csv")

Question

I've written some code that can parse a string into number/letter pairs, like so:

s = '30M3I5X'
l = []
num = ""
for c in s:
    if c in '0123456789':
        num = num + c            # accumulate the digits of the count
    else:
        l.append([int(num), c])  # a letter closes the current pair
        num = ""                 # reset only after a pair is stored

print(l)

I.e.,

'30M3I5X'

becomes

[[30, 'M'], [3, 'I'], [5, 'X']]

That part works just fine. However, I'm now struggling to figure out how to get the values from the first column of a tab-separated-value file to become my new 's'. I.e., for a file that looks like:

# File Example #
30M3I45M2I20M   I:AAC-I:TC
50M3X35M2I20M   X:TCC-I:AG

There would somehow be a loop that takes only the first column, producing

[[30, 'M'],[3, 'I'],[45, 'M'],[2, 'I'],[20, 'M']]
[[50, 'M'],[3, 'X'],[35, 'M'],[2, 'I'],[20, 'M']]

without having to use

import csv

or any other module.

Thanks so much!

Answer 1

Just open the file and iterate through its records:

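def fx(s):
    l = []
    num = ""
    for c in s:
        if c in '0123456789':
            num = num + c            # accumulate the digits of the count
        else:
            l.append([int(num), c])
            num = ""                 # reset for the next pair
    return l

with open(fp) as f:                  # fp: path to your file
    for record in f:
        s = record.split('\t')[0]    # first tab-separated column
        l = fx(s)
        # process l here ...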
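
As a minimal sketch of the same idea, the version below takes any iterable of lines (an open file works, since iterating a file yields its lines) and yields the parsed first column of each row. The names `parse_pairs` and `parse_first_column` are made up for illustration, and skipping blank lines and lines starting with '#' is an assumption about the file, not something the question states.

```python
def parse_pairs(s):
    """Turn a string such as '30M3I5X' into [[30, 'M'], [3, 'I'], [5, 'X']]."""
    pairs = []
    num = ""
    for c in s:
        if c in "0123456789":
            num += c                      # still reading the count
        else:
            pairs.append([int(num), c])   # letter closes the pair
            num = ""
    return pairs


def parse_first_column(lines):
    """Yield the parsed first tab-separated column of each data line."""
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and '#' header lines carry no data.
        if not line or line.startswith("#"):
            continue
        yield parse_pairs(line.split("\t")[0])


# In-memory stand-in for the file in the question (columns joined by a tab).
sample = [
    "# File Example #\n",
    "30M3I45M2I20M\tI:AAC-I:TC\n",
    "50M3X35M2I20M\tX:TCC-I:AG\n",
]

for pairs in parse_first_column(sample):
    print(pairs)
# [[30, 'M'], [3, 'I'], [45, 'M'], [2, 'I'], [20, 'M']]
# [[50, 'M'], [3, 'X'], [35, 'M'], [2, 'I'], [20, 'M']]
```

With a real file, the same loop would read `with open(fp) as f:` followed by `for pairs in parse_first_column(f):`.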

Answer 2

The following code should serve your purpose:

rows = ['30M3I45M2I20M   I:AAC-I:TC', '30M3I45M2I20M   I:AAC-I:TC']

for row in rows:
    words = row.split('  ')   # the sample rows are separated by spaces
    print(words[0])
    l = []
    num = ""
    for c in words[0]:
        if c in '0123456789':
            num = num + c
        else:
            l.append([int(num), c])
            num = ""          # reset, otherwise the counts run together

    print(l)

Change row.split('  ') to row.split('\t'), or to any other separator, as needed.
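
To make the choice of separator concrete, here is a small sketch comparing `split('\t')` with a bare `split()` on a tab-separated row and on a space-separated row. The helper `first_column` and the two sample rows are made up for illustration.

```python
def first_column(row, sep=None):
    """Return the first field of row.

    sep=None falls back to str.split() with no argument, which splits on
    any run of whitespace and ignores the trailing newline.
    """
    return row.split(sep)[0]


row_tabs = "30M3I45M2I20M\tI:AAC-I:TC\n"
row_spaces = "30M3I45M2I20M   I:AAC-I:TC\n"

print(row_tabs.split("\t"))     # ['30M3I45M2I20M', 'I:AAC-I:TC\n']
print(row_spaces.split("\t"))   # ['30M3I45M2I20M   I:AAC-I:TC\n']
print(row_tabs.split())         # ['30M3I45M2I20M', 'I:AAC-I:TC']
print(row_spaces.split())       # ['30M3I45M2I20M', 'I:AAC-I:TC']

print(first_column(row_tabs, "\t"))   # 30M3I45M2I20M
print(first_column(row_spaces))       # 30M3I45M2I20M
```

Splitting on '\t' keeps fields that contain spaces intact but does nothing if the file actually uses spaces; a bare `split()` handles both, at the cost of also breaking up any field that contains a space.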

Answer 3

Something like this should do what you're looking for:

filename = r'\path\to\your\file.txt'
with open(filename,'r') as input:
    for row in input:
        elements = row.split()
        # processing goes here

elements[0] contains the string that is the first column of data in the file.

Edit:

To end up with a list of the lists of processed data:

result = []
filename = r'\path\to\your\file.txt'
with open(filename,'r') as input:
    for row in input:
        elements = row.split()
        # processing goes here
        result.append(l) # l is the result of your processing

Answer 4

So this is what ended up working for me. I took bits and pieces from everyone, thank you all!

Note: I know it's a bit verbose, but since I'm new, it helps me keep track of everything :)

# Defining the parser function

def col1parser(col1):
    l = []
    num = ""
    for c in col1:
        if c in '0123456789':
            num = num + c
        else:
            l.append([int(num), c])
            num = ""
    print(l)
    return l  # return the list so the caller's l is not None


#Open file, run function on column1
filename = r'filepath.txt'
with open(filename,'r') as input:
    for row in input:
        elements = row.split()
        col1 = elements[0]
        l = col1parser(col1)
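
One thing to watch for with this parser: a letter with no count in front of it makes int("") raise a ValueError with an unhelpful message, and digits at the very end of the string are silently dropped. Below is a hedged sketch of a stricter variant that reports both cases; `parse_pairs_strict` is a made-up name, and whether such input should be an error at all is an assumption about the data.

```python
def parse_pairs_strict(s):
    """Like the parser above, but reject malformed input with a clear error."""
    pairs = []
    num = ""
    for c in s:
        if c in "0123456789":
            num += c
        elif num:
            pairs.append([int(num), c])
            num = ""
        else:
            # A letter arrived with no digits before it.
            raise ValueError("no count before %r in %r" % (c, s))
    if num:
        # Digits at the end were never closed by a letter.
        raise ValueError("trailing digits %r in %r" % (num, s))
    return pairs


print(parse_pairs_strict("30M3I5X"))   # [[30, 'M'], [3, 'I'], [5, 'X']]

for bad in ("M3I", "30M5"):
    try:
        parse_pairs_strict(bad)
    except ValueError as err:
        print("rejected:", err)
```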

4 answers

Answer 1
0 2018-01-25 02:52:52

Answer 2
0 2018-01-25 02:54:17

Answer 3
0 2018-01-25 02:55:44

Answer 4
0 Accepted 2018-01-25 16:55:04
