
Append line to previous line when it does not start with '200'

My problem is the following:
I have a CSV file whose lines normally start with '200'. The file contains unwanted line breaks.

E.g.:

200 Peter Pan 
200 John Smith 
200 Susan Murray 
200 Harald  
Williams
200 Liam Noah

This is how the file should look at the end:

200 Peter Pan
200 John Smith
200 Susan Murray
200 Harald Williams
200 Liam Noah

So whenever a line does not start with '200', it should be appended to the previous line. I expected this to be quite easy with Python, but I have not got it right so far.

with open(<file_name>, 'r+') as file:
    text = str();
    for line in file:
        if line[0:3] == "200":
            text = "{}\n{}".format(text, line.strip());
        else:
            text = "{} {}".format(text, line.strip());
    file.seek(0);
    file.write(text[1:]);

The following code will do the job.

Given a file called file.csv with these contents:

200 Peter Pan 
200 John Smith 
200 Susan Murray 
200 Harald  
Williams
200 Liam Noah

after we run the following script:

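# Read the file and split it into lines (no trailing empty entry).
with open("file.csv", "r") as f:
    lines = f.read().splitlines()

# Walk backwards so that popping an index never shifts a line still to be visited.
for i, line in reversed(list(enumerate(lines))):
    if i > 0 and not line.startswith("200"):
        # Join this line onto the previous one; re-read lines[i] in case it grew.
        lines[i-1] = lines[i-1].strip() + " " + lines[i].strip()
        lines.pop(i)

with open("file.csv", "w") as f:
    f.write("\n".join(lines) + "\n")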

the file is updated as you wanted:

200 Peter Pan
200 John Smith
200 Susan Murray
200 Harald Williams
200 Liam Noah

How does it work?

The steps:

  • read the .csv file in as a string and convert it to a list of lines by splitting the string at the new-line ( '\n' ) characters.
  • iterate through the enumerated list of lines so we have two variables to work with: the index and the line.
  • check whether the line starts with "200" .
  • if it does not, append the line to the line one index before it (stripping the previous line and adding a space in between), and then remove the line from the list by popping its index.
  • finally, open the same .csv file for writing and write the new lines to it. The output string is built by joining the list with a new-line character between each line and adding one more at the end.
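
The steps above can also be sketched as a small function that builds a new list instead of popping from the one being iterated. This is only an illustration, not the answer's own code: the name merge_continuations and the prefix parameter are made up for this example.

```python
def merge_continuations(lines, prefix="200"):
    """Return a new list in which every line lacking the prefix is joined
    onto the line before it, separated by a single space."""
    merged = []
    for line in lines:
        line = line.strip()
        if line.startswith(prefix) or not merged:
            # A new record (or the very first line): keep it as its own entry.
            merged.append(line)
        elif line:
            # A non-empty continuation: attach it to the previous record.
            merged[-1] = merged[-1] + " " + line
    return merged


sample = ["200 Peter Pan ", "200 Harald  ", "Williams", "200 Liam Noah", ""]
fixed = merge_continuations(sample)
```

Because nothing is removed from the list while it is being walked, several continuation lines in a row are handled the same way as a single one.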

Hope this helps you out!

Read the CSV file and iterate over its rows:

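import csv

with open('test.csv', 'r', newline='') as f:
    f_csv = csv.reader(f)
    # header = next(f_csv)
    rows = []
    for row in f_csv:
        # csv.reader yields each row as a list of strings
        if rows and not (row and row[0].startswith('200')):
            rows[-1].extend(row)  # append to the previous row
        else:
            rows.append(row)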
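
To round this off, here is a minimal sketch of merging the rows and writing them back with csv.writer. It assumes comma-separated data, which the question does not confirm; the helper merge_rows and the in-memory sample are hypothetical and only stand in for a real file.

```python
import csv
import io


def merge_rows(rows, prefix="200"):
    """Merge rows whose first field lacks the prefix into the previous row."""
    merged = []
    for row in rows:
        is_record = bool(row) and row[0].startswith(prefix)
        if is_record or not merged:
            merged.append(list(row))
        else:
            # Continuation row: add its fields to the previous record.
            merged[-1].extend(row)
    return merged


# Hypothetical comma-separated sample with one broken record.
sample = "200,Peter,Pan\n200,Harald\nWilliams\n200,Liam,Noah\n"
rows = merge_rows(csv.reader(io.StringIO(sample)))

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(rows)
result = out.getvalue()
```

For a real file, the io.StringIO objects would be replaced by files opened with newline='', as the csv module documentation recommends.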

The code that works fine for me is the following:

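with open('testing2.CSV', 'r+', encoding="utf-8") as file:
    text = ""
    for line in file:
        if text:
            # a new record starts a new line; a continuation is joined with a space
            text += '\n' if line[0:3] == "200" else ' '
        text += line.strip()
    file.seek(0)
    file.write(text + '\n')
    # cut off leftover bytes, since the merged text can be shorter than the original
    file.truncate()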

It even keeps the first line, which is nice because my CSV file has headers. Thanks to everyone who helped here, especially Benjamin James Drury and Joe Iddon.
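
One detail worth checking when rewriting a file opened in 'r+' mode: after seek(0) and write(), any bytes beyond the new content remain unless the file is truncated. The sketch below shows the pattern on a temporary file with a header line. The function name rewrite_in_place, the file name demo.csv, and the sample contents are assumptions made for this example.

```python
import os
import tempfile


def rewrite_in_place(path, prefix="200"):
    """Merge continuation lines and overwrite the same file."""
    with open(path, "r+", encoding="utf-8") as f:
        merged = []
        for line in f:
            line = line.strip()
            if line.startswith(prefix) or not merged:
                merged.append(line)  # header or new record
            elif line:
                merged[-1] += " " + line  # continuation line
        f.seek(0)
        f.write("\n".join(merged) + "\n")
        f.truncate()  # discard whatever is left of the old, longer content


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    with open(path, "w", encoding="utf-8") as f:
        f.write("id name\n200 Harald  \nWilliams\n200 Liam Noah\n")
    rewrite_in_place(path)
    with open(path, encoding="utf-8") as f:
        content = f.read()
```

Without the truncate() call, the tail of the original file would still follow the rewritten text whenever stripping whitespace makes the result shorter.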


 