Python loading txt file and split lines by position in line
I am new here and a Python beginner. I received a text file containing 100k lines, each containing 120 characters. Every line represents data for 14 columns, but because some values are shorter than others, they are padded with blanks. That means I don't have a separator like ",". If I chose blank as the separator, the values would not go to the correct columns.
Lines are like:
O2020august Opel .
L2015may BMW .
L2016april Mercedes.
O2021january Opel .
L2023februaryAudi .
I am stuck with:
import pandas as pd
df = pd.read_csv('text.txt', index_col=0, header=None)
print(df)
I am happy for any approach suggested. It doesn't need to be pandas.
Cheers, Jeanny
Or you can use a simple helper function that does the job for you.
def split_by_pos(string_to_split, *args):
    """
    Splits a string at the given positions
    :param string_to_split: the string to be split
    :param args: the positions where the function will split the string.
    :return: the split string as a tuple
    """
    return_value = list()
    args = sorted(args)
    previous = 0
    for position in args:
        return_value.append(string_to_split[previous:position])
        previous = position
    return_value.append(string_to_split[previous:])
    return tuple(return_value)


with open("a_random_file.txt", "r", encoding="utf-8") as fp:
    lines = fp.readlines()
for line in lines:
    # Drop the trailing newline; 13 keeps an 8-character month such as "february" whole.
    print(split_by_pos(line.rstrip("\n"), 1, 5, 13))
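Since the question starts from pandas, it may be worth noting that pandas has a reader for fixed-width files, `pd.read_fwf`. Below is a minimal sketch, not a definitive solution. The column widths (1, 4, 8, 8), the column names and the sample lines are my assumptions based on the lines shown in the question, so the real 14-column layout has to be filled in by hand.

```python
import io

import pandas as pd

# Assumed layout: 1-char code, 4-char year, 8-char month, 8-char brand.
# The real file has 14 columns; extend widths and names accordingly.
sample = (
    "O2020august  Opel    \n"
    "L2015may     BMW     \n"
    "L2023februaryAudi    \n"
)

df = pd.read_fwf(
    io.StringIO(sample),   # replace with the path to the real file
    widths=[1, 4, 8, 8],   # one width per column, in characters
    header=None,           # the file has no header row
    names=["code", "year", "month", "brand"],  # hypothetical column names
    dtype=str,             # keep every field as text
)
print(df)
```

`read_fwf` strips the padding blanks from each field, so the values arrive without trailing spaces.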
I believe something like this can solve your problem.
# txt is the opened file object (or a list of its lines)
for line in txt:
    # line should look something like "O2020august Opel"
    print(line)
    s1 = line[:1]
    s2 = line[1:5]
    s3 = line[5:13]
    # ... continue the same way for the remaining columns
    print(s1, s2, s3)
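With 14 columns, the individual `s1`, `s2`, ... variables get unwieldy, so the same slicing idea can be driven by a table of named positions. This is only a sketch: the field names, the positions in `FIELDS` and the two sample lines are assumptions for illustration, not the real layout of the file.

```python
# Hypothetical column layout; adjust the slices to the real file.
FIELDS = {
    "code": slice(0, 1),
    "year": slice(1, 5),
    "month": slice(5, 13),
    "brand": slice(13, 21),
}


def parse_line(line):
    """Cut one fixed-width line into a dict of stripped field values."""
    return {name: line[pos].strip() for name, pos in FIELDS.items()}


# Two made-up lines in the assumed layout.
sample_lines = [
    "O2020august  Opel    ",
    "L2023februaryAudi    ",
]
rows = [parse_line(line) for line in sample_lines]
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to `csv.DictWriter` or `pandas.DataFrame` if a table is needed afterwards.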
You can use the readline and readlines methods of the Python file read API.
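To show the difference between the two methods, here is a small sketch. It writes a throwaway two-line file first so that it runs on its own; the file content is made up. For a 100k-line file, iterating over the file object directly is usually enough and avoids holding every line in memory at once.

```python
import os
import tempfile

# Write a tiny sample file so the snippet is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("O2020august  Opel\nL2015may     BMW\n")
    path = tmp.name

with open(path, encoding="utf-8") as fp:
    first = fp.readline()   # one line, trailing newline included
    rest = fp.readlines()   # list of all remaining lines

with open(path, encoding="utf-8") as fp:
    # Iterating the file object yields one line at a time.
    codes = [line[:1] for line in fp]

os.remove(path)
print(repr(first), rest, codes)
```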