Extract specific data (inside brackets) from a column of a .txt file?

Question

I have the following text. I added a row number to each line; the numbers are not part of the file and must be ignored.

(line1)The following table of hex bolt head dimensions was adapted from ASME B18.2.1, Table 2, "Dimensions of Hex Bolts."

(line2)
(line3)Size Nominal (Major)
(line4)Diameter [in]            Width Across Flats          Head Height
(line5)        Nominal [in] Minimum [in]    Nominal [in]        Minimum [in]
(line6)1/4" 0.2500          7/16"(0.438)    0.425       11/64"  0.150

I am trying to extract the data from some of the columns, but I am having trouble with column 2, which includes a float within brackets.

From a txt file that contains columns and rows of information, I tried to organize the data into lists. One of the columns has a float within brackets, like 7/16"(0.438), in column 2, and I need to store 0.438 in a list.

I also want to skip the first 5 rows, since those are strings, and start reading from the 6th row.

def Main():

    filename = 'BoltSizes.txt' #file name
    f1 = open(filename, 'r')  # open the file for reading
    data = f1.readlines()  # read the entire file as a list of strings
    f1.close()  # close    the file  ... very important

    #creating empty arrays
    Diameter = []
    Width_Max = []
    Width_Min = []
    Head_Height = []

    for line in data: #loop over all the lines
        cells = line.strip().split(",") #creates a list of words

        val = float(cells[1])
        Diameter.append(val)

        #Here I need to get only the float from the brackets 7/16"(0.438)
        val = float(cells[2])
        Width_Max.append(val)

        val = float(cells[3])
        Width_Min.append(val)

        val = float(cells[5])
        Head_Height.append(val)

Main()

I am getting this error:

line 16, in Main
    val = float(cells[1])
ValueError: could not convert string to float: ' Table 2'

Answer 1

Since data is a plain Python list, you can slice it to choose which lines to parse. So, to skip the first 5 rows, you should pass data[5:] to the for loop.
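As a minimal sketch of the slicing step, assuming the file layout shown in the question (five lines before the first data row). The list below is a stand-in for what f1.readlines() would return, and the name rows is only illustrative:

```python
# Stand-in for f1.readlines(): five non-data lines followed by one data row.
data = [
    'The following table of hex bolt head dimensions was adapted from ASME B18.2.1, Table 2, "Dimensions of Hex Bolts."\n',
    '\n',
    'Size Nominal (Major)\n',
    'Diameter [in]            Width Across Flats          Head Height\n',
    '        Nominal [in] Minimum [in]    Nominal [in]        Minimum [in]\n',
    '1/4" 0.2500          7/16"(0.438)    0.425       11/64"  0.150\n',
]

# data[5:] is a new list that starts at index 5, i.e. at the sixth line.
rows = data[5:]

for line in rows:
    # split() with no argument splits on any run of whitespace,
    # which suits this file better than split(",").
    cells = line.split()
    print(cells)
```

Note that the sample file separates columns with spaces, not commas, which is why the original split(",") cut the first line at its commas and produced ' Table 2'.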

Fixing the second column is a bit more complicated; a good way to extract the value from column 2 is to use re.search().

So, you can change your code to something like this:
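In isolation, the bracket extraction could look like the sketch below. The helper name bracketed_float is hypothetical, and it assumes each cell holds at most one pair of round brackets:

```python
import re

def bracketed_float(cell):
    """Return the number inside the first (...) in cell as a float, or None."""
    # Capture everything between "(" and the next ")".
    match = re.search(r'\(([^)]*)\)', cell)
    if match is None:
        return None  # no brackets in this cell
    try:
        return float(match.group(1))
    except ValueError:
        return None  # brackets found, but the content is not a number

print(bracketed_float('7/16"(0.438)'))
```

Returning None instead of raising lets the caller decide whether to skip the row or report it.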

# we'll use a regexp to extract the value for column 2
import re
# skips the first five rows
for line in data[5:]:
   # trims the line and replaces each run of whitespace with a single comma
   strip = re.sub(r"\s+", ",", line.strip())
   cells = strip.split(",") # creates a list of words

   # parsing column 0, 1..
   ...
   # column 2 is critical
   tmp = re.search(r'\((.*?)\)', cells[2])
   # we have to check if re.search() returned something
   if tmp:
      # we're taking group 1; group 0 includes the brackets.
      val = tmp.group(1)
      # one more check: val must parse as a float.
      # (str.isnumeric() is False for "0.438" because of the dot.)
      try:
         Width_Max.append(float(val))
      except ValueError:
         pass

   # continue your parsing

The problem with this code is that it will probably break the first time your data changes, but since you've posted only one row I can't provide more detailed help.
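One way to make the parsing less fragile is to skip any line that does not look like a data row, instead of counting header lines. The sketch below is only an assumption-laden illustration: the function name parse_bolt_rows is hypothetical, and it assumes every data row has the same six whitespace-separated tokens as the single row in the question, with the column indices taken from the question's code:

```python
import re

def parse_bolt_rows(lines):
    """Collect numeric columns from data rows, skipping anything else."""
    diameter, width_max, width_min, head_height = [], [], [], []
    for line in lines:
        cells = line.split()  # split on any run of whitespace
        if len(cells) < 6:
            continue  # blank line or short header line
        # Column 2 looks like 7/16"(0.438); take the part in brackets.
        match = re.search(r'\(([^)]*)\)', cells[2])
        if match is None:
            continue  # header text, no bracketed value
        try:
            values = (float(cells[1]), float(match.group(1)),
                      float(cells[3]), float(cells[5]))
        except ValueError:
            continue  # some column is not numeric, so not a data row
        diameter.append(values[0])
        width_max.append(values[1])
        width_min.append(values[2])
        head_height.append(values[3])
    return diameter, width_max, width_min, head_height

sample = [
    'Size Nominal (Major)\n',
    'Diameter [in]            Width Across Flats          Head Height\n',
    '1/4" 0.2500          7/16"(0.438)    0.425       11/64"  0.150\n',
]
print(parse_bolt_rows(sample))
```

Rows that deviate from the assumed layout are silently skipped here; logging them instead would make format changes easier to notice.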

Answer 1 · 0 · 2019-05-10 00:35:53
