Retrieving data from a .txt file using Python?

Below is the data from an ephemeris.txt file. I want to retrieve several columns (for example, the columns starting with 00:00, 27.69 and 44.1) and store them in arrays named x, y and z. What do I have to do?

I tried this:

x, y, z = numpy.loadtxt("ephemeris.txt", unpack=True)

and got this error:

"ValueError: could not convert string to float: Date__(UT)__HR:MN"

Could you also help me convert the HR:MN values into minutes only?

Date__(UT)__HR:MN     R.A.__(a-apparent)__DEC\
**********************************************\
 2013-Jan-01 00:00 *   14 31 27.69 -12 29 44.1\
 2013-Jan-01 00:01 *   14 31 27.71 -12 29 44.1\
 2013-Jan-01 00:02 *   14 31 27.72 -12 29 44.2\
 2013-Jan-01 00:03 *   14 31 27.73 -12 29 44.2\
 2013-Jan-01 00:04 *   14 31 27.75 -12 29 44.3\
 2013-Jan-01 00:05 *   14 31 27.76 -12 29 44.3\
 2013-Jan-01 00:06 *   14 31 27.77 -12 29 44.4\
 2013-Jan-01 00:07 *   14 31 27.78 -12 29 44.4\
 2013-Jan-01 00:08 *   14 31 27.80 -12 29 44.4\
 2013-Jan-01 00:09 *   14 31 27.81 -12 29 44.5\

Thanks in advance.

You can use some more arguments of the loadtxt function.

The error is most probably caused by the first two header lines, so skip them with the skiprows=2 argument.

Also, each row contains values of different types, separated by whitespace. The default delimiter already splits on any run of whitespace; passing delimiter=' ' would split on every single space and can produce empty fields where the file has several spaces in a row. Read everything as text first with dtype=str (dtype=object is an alternative) and convert the columns afterwards.

import numpy
a = numpy.loadtxt("ephemeris.txt", dtype=str, skiprows=2)

This should give you a single array on which you can perform many kinds of "conversions": split it into one array per column, create a list of rows, etc.

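from datetime import datetime

# One array per column; each piece has shape (n_rows, 1).
columns = numpy.hsplit(a, a.shape[1])

# or index the columns directly and convert each one as needed
# (column numbers assume the whitespace-split layout of the rows above)
dates = [datetime.strptime(d, "%Y-%b-%d") for d in a[:, 0]]
y = a[:, 5].astype(float)  # the column that starts with 27.69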

or something along these lines...

Hope this helps, and please elaborate in the comments if needed.
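To tie this back to the question, here is a minimal sketch of one way to get x, y and z directly, including the HR:MN to minutes conversion. It assumes the rows split on whitespace into nine fields (date, HR:MN, `*`, three R.A. fields, three DEC fields) and that each line may end with a backslash as in the listing. The helper names `hhmm_to_minutes` and `load_columns` are made up for illustration, and the in-memory sample stands in for the real file.

```python
import io
import numpy as np

# Sample rows copied from the question; a real script would pass the path
# "ephemeris.txt" to load_columns instead of this in-memory file.
SAMPLE = (
    "Date__(UT)__HR:MN     R.A.__(a-apparent)__DEC\\\n"
    "**********************************************\\\n"
    " 2013-Jan-01 00:00 *   14 31 27.69 -12 29 44.1\\\n"
    " 2013-Jan-01 00:01 *   14 31 27.71 -12 29 44.1\\\n"
    " 2013-Jan-01 00:02 *   14 31 27.72 -12 29 44.2\\\n"
)

def hhmm_to_minutes(text):
    """Convert an 'HR:MN' string such as '01:30' to minutes since midnight."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

def load_columns(source):
    # Whitespace-split fields: 0 date, 1 HR:MN, 2 '*', 3-5 R.A., 6-8 DEC.
    raw = np.loadtxt(source, dtype=str, skiprows=2, usecols=(1, 5, 8))
    x = np.array([hhmm_to_minutes(t) for t in raw[:, 0]])
    y = raw[:, 1].astype(float)
    # Drop the trailing backslash shown in the listing, if present.
    z = np.array([float(t.rstrip("\\")) for t in raw[:, 2]])
    return x, y, z

x, y, z = load_columns(io.StringIO(SAMPLE))
print(x, y, z)
```

Reading the three columns as text first keeps loadtxt from trying to turn "00:00" into a float, which is what triggered the original ValueError.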

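import re

# Raw string keeps the escapes readable; the trailing "\\?" tolerates the
# backslash that ends each line in the listing, so it is left out of "Data".
pattern = re.compile(r"(\d{4})-(\w{3})-(\d{2}) (\d{2}):(\d{2}) \*\s+(.*?)\\?$")

with open("ephemeris.txt") as f:
    for line in f:
        r = pattern.search(line)
        if r:
            print("Year: " + r.group(1))
            print("Month: " + r.group(2))
            print("Day: " + r.group(3))
            print("Hour: " + r.group(4))
            print("Minute: " + r.group(5))
            print("Data: " + r.group(6))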

This reads every line of the file, checks whether it matches the pattern and, if it does, prints all the data it could retrieve.

You can also split each line on a separator character. You can then access each (string) token by index:

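def prova():
    with open('/home/frenk/Desktop/ephemeris.txt') as f:
        for line in f:
            # split() with no argument splits on any run of whitespace
            tokens = line.split()
            # skip the two header lines, which do not start with a year
            if not tokens or not tokens[0][:4].isdigit():
                continue
            print("date: " + tokens[0])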

Second, if you want to convert a string like "31" to the integer 31, you can simply write:

x = int('31')

Note that you can select part of a string using slice notation:

string = "This is a slice of string"
print(string[10:15])  # prints "slice"
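Putting splitting, int() and slicing together, the HR:MN conversion asked about in the question could look roughly like this. It is only a sketch: the function names `hhmm_to_minutes` and `parse_row` are invented here, and the token positions assume the row layout shown in the question, with a possible trailing backslash on each line.

```python
def hhmm_to_minutes(token):
    """Turn an 'HR:MN' token such as '13:45' into minutes since midnight."""
    hours = int(token[0:2])    # characters 0-1 hold the hour
    minutes = int(token[3:5])  # characters 3-4 hold the minute
    return hours * 60 + minutes

def parse_row(line):
    """Return (minutes, value from the 27.69 column, value from the 44.1 column)."""
    # Strip the trailing backslash/newline, then split on runs of whitespace.
    tokens = line.rstrip("\\\n").split()
    return hhmm_to_minutes(tokens[1]), float(tokens[5]), float(tokens[8])

row = " 2013-Jan-01 00:09 *   14 31 27.81 -12 29 44.5\\\n"
print(parse_row(row))  # (9, 27.81, 44.5)
```

Calling parse_row on every data line and appending the three results to lists gives the x, y and z sequences without needing numpy.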
