
Extract specific data (within brackets) from a column of .txt file?

I have the following text. I added a (lineN) label to each line; these labels are not part of the text and must not be considered.

(line1)The following table of hex bolt head dimensions was adapted from ASME B18.2.1, Table 2, "Dimensions of Hex Bolts."

(line2)
(line3)Size Nominal (Major)
(line4)Diameter [in]            Width Across Flats          Head Height
(line5)        Nominal [in] Minimum [in]    Nominal [in]        Minimum [in]
(line6)1/4" 0.2500          7/16"(0.438)    0.425       11/64"  0.150

I am trying to extract the data from some of the columns, but I am having a problem extracting from column 2, which includes a float within brackets.

From a .txt file that contains columns and rows of information, I tried to organize the data into lists. One of the columns has a float within brackets, like this: 7/16"(0.438). It is in column 2, and I need to store 0.438 in a list.

I also want to skip the first 5 rows, given that those are strings, and start reading from the 6th row.

def Main():

    filename = 'BoltSizes.txt' #file name
    f1 = open(filename, 'r')  # open the file for reading
    data = f1.readlines()  # read the entire file as a list of strings
    f1.close()  # close    the file  ... very important

    #creating empty arrays
    Diameter = []
    Width_Max = []
    Width_Min = []
    Head_Height = []

    for line in data: #loop over all the lines
        cells = line.strip().split(",") #creates a list of words

        val = float(cells[1])
        Diameter.append(val)

        #Here I need to get only the float from the brackets 7/16"(0.438)
        val = float(cells[2])
        Width_Max.append(val)

        val = float(cells[3])
        Width_Min.append(val)

        val = float(cells[5])
        Head_Height.append(val)

Main()

I am getting this error:

line 16, in Main
    val = float(cells[1])
ValueError: could not convert string to float: ' Table 2'

Since data is a classic Python list, you can use list indices to get a parsing range. So, to skip the first 5 rows, pass data[5:] to the for loop.
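As a small illustration of the slicing, here is a sketch that assumes a hypothetical six-element list standing in for the result of readlines() (the header strings are placeholders, not the real file contents):

```python
# Stand-in for f1.readlines(): five header lines, then one data row.
# The header text is a placeholder; only the last row mimics the question.
data = [
    "header 1\n",
    "header 2\n",
    "header 3\n",
    "header 4\n",
    "header 5\n",
    '1/4" 0.2500 7/16"(0.438) 0.425 11/64" 0.150\n',
]

# data[5:] is a new list that starts at index 5, i.e. at the 6th line.
rows = data[5:]
print(len(rows))
```

The slice does not modify data, so the header lines remain available if they are needed later.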

Fixing the second column is a bit more complicated; the best way to extract the data from column 2 would be to use re.search().
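To show the idea in isolation, here is a minimal sketch applied to the sample cell from the question. The pattern is an assumption: it is stricter than a catch-all group and only accepts a plain decimal number between the brackets.

```python
import re

cell = '7/16"(0.438)'  # sample value copied from the question

# \( and \) match literal brackets; the inner group captures digits
# with an optional decimal part.
match = re.search(r'\((\d+(?:\.\d+)?)\)', cell)

# re.search() returns None when nothing matches, so guard before using it.
width = float(match.group(1)) if match else None
print(width)
```

If the bracketed values could also be written without a leading digit (for example ".438"), the pattern would need to be loosened accordingly.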

So, you can change your code to something like this:

# we'll use a regexp to extract the value for column 2
import re

# skip the first five rows (the header lines)
for line in data[5:]:
    # split on any run of whitespace; split() with no argument also
    # drops leading/trailing whitespace, including the newline
    cells = line.split()
    if len(cells) < 3:
        continue  # blank or incomplete line

    # parsing column 0, 1..
    ...
    # column 2 is critical
    tmp = re.search(r'\((.*?)\)', cells[2])
    # re.search() returns None when there is no match
    if tmp:
        # take group 1; group 0 includes the brackets
        val = tmp.group(1)
        # str.isnumeric() is False for "0.438" (the dot is not a
        # numeric character), so attempt the conversion instead
        try:
            Width_Max.append(float(val))
        except ValueError:
            pass  # bracket content was not a number

    # continue your parsing

The problem with this code is that it will probably break the first time your data format changes, but since you've posted only one row I can't provide more detailed help.
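Putting the pieces of this answer together, a self-contained sketch might look like the following. The function name parse_bolt_rows, its skip parameter, and the placeholder header lines are all hypothetical; the column positions (1, 2, 3 and 5) follow the question's code, and the only real data is the single row the question shows.

```python
import re

# Text between the first "(" and the next ")".
BRACKETED = re.compile(r'\(([^)]*)\)')


def parse_bolt_rows(lines, skip=5):
    """Hypothetical helper: parse whitespace-separated rows into four lists."""
    diameter, width_max, width_min, head_height = [], [], [], []
    for line in lines[skip:]:
        cells = line.split()  # split on any run of whitespace
        if len(cells) < 6:
            continue  # blank or incomplete row
        match = BRACKETED.search(cells[2])
        if match is None:
            continue  # no bracketed value in column 2
        try:
            # Convert everything first so a bad row adds nothing to any list.
            values = (float(cells[1]), float(match.group(1)),
                      float(cells[3]), float(cells[5]))
        except ValueError:
            continue  # skip rows with non-numeric cells
        diameter.append(values[0])
        width_max.append(values[1])
        width_min.append(values[2])
        head_height.append(values[3])
    return diameter, width_max, width_min, head_height


# Five placeholder header lines followed by the row from the question.
sample = ["header\n"] * 5 + [
    '1/4" 0.2500          7/16"(0.438)    0.425       11/64"  0.150\n'
]
diameter, width_max, width_min, head_height = parse_bolt_rows(sample)
print(diameter, width_max, width_min, head_height)
```

This still relies on every row having the same six whitespace-separated cells. A size written with an embedded space would shift the columns, so a row like that would be parsed incorrectly or skipped.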
