
Read file from line 2 or skip header row

How can I skip the header row and start reading a file from line 2?

with open(fname) as f:
    next(f)  # skip the header line
    for line in f:
        pass  # do something with line

f = open(fname, 'r')
lines = f.readlines()[1:]  # read every line, then drop the first
f.close()
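One caveat: calling next(f) with no default raises StopIteration when the file is empty. A minimal sketch of a guarded version follows. The helper name read_body_lines and the sample file contents are made up for illustration and are not part of the original answers.

```python
import os
import tempfile


def read_body_lines(path):
    """Return every line after the first; an empty file yields []."""
    with open(path) as f:
        next(f, None)  # the default prevents StopIteration on an empty file
        return [line.rstrip('\n') for line in f]


# Throwaway files, used only to demonstrate the helper
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    with open(sample, 'w') as out:
        out.write('header\nrow 1\nrow 2\n')
    empty = os.path.join(tmp, 'empty.txt')
    open(empty, 'w').close()

    body = read_body_lines(sample)
    empty_body = read_body_lines(empty)

print(body)
print(empty_body)
```

Unlike readlines()[1:], this reads the file lazily until the list is built, and the same next(f, None) call works unchanged inside a plain for loop.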

If you want to keep the first line and then perform some operations on the rest of the file, this code is helpful.

with open(filename, 'r') as f:
    first_line = f.readline()  # keep the header for later use
    for line in f:
        pass  # perform some operations on line

If only slicing worked on iterators...

from itertools import islice
with open(fname) as f:
    for line in islice(f, 1, None):  # start at the second line
        pass

f = open(fname).readlines()
firstLine = f.pop(0)  # removes the first line
for line in f:
    ...
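The islice approach extends naturally to skipping more than one line. Below is a small sketch, assuming a hypothetical helper named skip_lines; io.StringIO stands in for an open file, and the sample text is invented.

```python
import io
from itertools import islice


def skip_lines(lines, n=1):
    """Return an iterator over `lines` that omits the first n items."""
    return islice(lines, n, None)


# io.StringIO behaves like an open text file for iteration purposes
fake_file = io.StringIO('header 1\nheader 2\nrow 1\nrow 2\n')
rows = [line.strip() for line in skip_lines(fake_file, n=2)]
print(rows)
```

Because islice is lazy, nothing beyond the current line is held in memory, whereas readlines() loads the whole file before the first line is dropped.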

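To generalize the task of reading multiple header lines and to improve readability, I would use method extraction. Suppose you want to tokenize the first two lines of coordinates.txt to use as header information.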

Example

coordinates.txt
---------------
Name,Longitude,Latitude,Elevation, Comments
String, Decimal Deg., Decimal Deg., Meters, String
Euler's Town,7.58857,47.559537,0, "Blah"
Faneuil Hall,-71.054773,42.360217,0
Yellowstone National Park,-110.588455,44.427963,0

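The extracted method then lets you specify what to do with the header information. In this example we simply split each header line on commas and return the tokens as a list, but there is room to do much more.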

def __readheader(filehandle, numberheaderlines=1):
    """Reads the specified number of lines and yields the comma-delimited
    strings on each line as a list"""
    for _ in range(numberheaderlines):
        # A list comprehension (rather than map) returns a real list on Python 3
        yield [token.strip() for token in filehandle.readline().strip().split(',')]

with open('coordinates.txt', 'r') as rh:
    # Single header line
    # print(next(__readheader(rh)))

    # Multiple header lines
    for headerline in __readheader(rh, numberheaderlines=2):
        print(headerline)  # Or do other stuff with headerline tokens

Output

['Name', 'Longitude', 'Latitude', 'Elevation', 'Comments']
['String', 'Decimal Deg.', 'Decimal Deg.', 'Meters', 'String']

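If coordinates.txt contains another header line, simply change numberheaderlines. Best of all, it is clear what __readheader(rh, numberheaderlines=2) is doing, and we avoid the ambiguity of having to figure out, or comment on, why the author of the accepted answer uses next() in their code.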

If you want to read multiple CSV files starting from line 2, this works like a charm.

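import csv

# csv_file_list is assumed to be a list of CSV file paths defined elsewhere
for fname in csv_file_list:
    with open(fname, 'r') as r:
        next(r)  # skip headers
        rr = csv.reader(r)
        for row in rr:
            pass  # do something with row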
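A possible variation is to call next() on the csv reader instead of on the file object. The header then comes back already parsed, and a header containing a quoted line break is still consumed as one row. The sketch below uses invented sample data, with io.StringIO standing in for open(path, newline='').

```python
import csv
import io

# Made-up sample; in real use this would be open(path, newline='')
data = io.StringIO(
    'Name,Longitude,Latitude\n'
    'Faneuil Hall,-71.054773,42.360217\n',
    newline='',
)

reader = csv.reader(data)
header = next(reader, None)  # parsed header row, or None if the input is empty
rows = list(reader)          # every remaining row as a list of strings

print(header)
print(rows)
```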

(This is part of Parfait's answer to another question.)

# Open a connection to the file
with open('world_dev_ind.csv') as file:

    # Skip the column names
    file.readline()

    # Initialize an empty dictionary: counts_dict
    counts_dict = {}

    # Process only the first 1000 rows
    for j in range(0, 1000):

        # Split the current line into a list: line
        line = file.readline().split(',')

        # Get the value for the first column: first_col
        first_col = line[0]

        # If the column value is in the dict, increment its value
        if first_col in counts_dict.keys():
            counts_dict[first_col] += 1

        # Else, add to the dict and set value to 1
        else:
            counts_dict[first_col] = 1

# Print the resulting dictionary
print(counts_dict)
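The loop above calls readline() exactly 1000 times, so a shorter file would produce empty strings that get counted under an empty key. One way to sidestep that, sketched here under the assumption that a helper named count_first_column is acceptable (it is not part of the quoted answer), is to combine islice with collections.Counter. The sample rows are invented.

```python
import io
from collections import Counter
from itertools import islice


def count_first_column(lines, limit=1000):
    """Count first-column values in at most `limit` rows after the header."""
    next(lines, None)  # skip the column names; safe on empty input
    return Counter(line.split(',')[0] for line in islice(lines, limit))


# io.StringIO stands in for an open file such as world_dev_ind.csv
sample = io.StringIO('Country,Year\nAruba,1960\nAruba,1961\nChad,1960\n')
counts = count_first_column(sample)
print(dict(counts))
```

islice stops at the end of the input, so files with fewer than `limit` rows are handled without any extra checks.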

