How to open a .data file extension

I am working on a side project where the data is provided in a .data file. How do I open a .data file to see what the data looks like, and how do I read from a .data file programmatically in Python? I am on Mac OS X.

NOTE: The data I am working with is for one of the KDD Cup challenges.

Try opening the file in a plain-text editor such as Notepad or Gedit to check which delimiter it uses (many .data files are plain text). Once you have confirmed this, you can use the read_csv function from the pandas library in Python.

import pandas as pd
file_path = "~/AI/datasets/wine/wine.data"
# the .data file above is comma-delimited
wine_data = pd.read_csv(file_path, delimiter=",")
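If you would rather not inspect the delimiter by eye, the standard library's csv.Sniffer can make a guess from a sample of the file. The following is a minimal sketch, not a definitive method: the function name guess_delimiter, the sample size, and the list of candidate delimiters are assumptions chosen for illustration, and the sniffer can guess wrong on irregular files.

```python
import csv


def guess_delimiter(path, sample_size=4096, candidates=",\t;|"):
    """Guess the field delimiter from the first few KB of a text file.

    Hypothetical helper: raises csv.Error if no candidate fits the sample.
    """
    with open(path, "r", newline="") as f:
        sample = f.read(sample_size)
    # Restrict the guess to the candidate characters we consider plausible.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter
```

The result can then be passed on, for example as pd.read_csv(file_path, sep=guess_delimiter(file_path)), provided file_path is a fully expanded path.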

It depends heavily on what is in it. It could be a binary file or it could be a text file.

If it is a text file, you can open it the same way you open any file (f = open(filename, "r")).

If it is a binary file, just add a "b" to the mode in the open call (open(filename, "rb")). There is an example here:

Reading binary file in Python and looping over each byte
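If you do not know which of the two cases applies, a rough check is to read the first few kilobytes in binary mode and see whether they look like text. This is only a heuristic sketch: the function name looks_binary and the rule it uses (a NUL byte, or bytes that are not valid UTF-8, mean "binary") are assumptions, and text files in other encodings would be misclassified.

```python
import codecs


def looks_binary(path, sample_size=8192):
    """Heuristic: True if the start of the file does not look like UTF-8 text."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    # NUL bytes almost never appear in plain-text data files.
    if b"\x00" in sample:
        return True
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

A file for which this returns False can usually be opened with mode "r"; otherwise "rb" is the safer choice.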

Depending on the type of data in there, you might want to try passing it through a CSV reader (the csv Python module) or an XML parsing library (lxml, for example).

After looking further into the above and at the challenge page, the format is:

Data Format
The datasets use a format similar as that of the text export format from relational databases:

One header lines with the variables names
One line per instance
Separator tabulation between the values
There are missing values (consecutive tabulations)

Therefore, see this answer:

parsing a tab-separated file in Python
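For the format quoted above (one header line, tab separators, consecutive tabs for missing values), the standard csv module is enough. The following is a small sketch under those assumptions; the function name read_tab_separated and the choice to represent missing values as None are illustrative, not part of the dataset description.

```python
import csv


def read_tab_separated(path):
    """Return (header, rows) from a tab-separated text file.

    Empty fields, which is what consecutive tabs produce, become None.
    """
    with open(path, "r", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)  # first line holds the variable names
        rows = [
            [value if value != "" else None for value in row]
            for row in reader
        ]
    return header, rows
```

Note that this loads every row into memory, so it suits small and medium files best.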

I would advise processing one line at a time rather than loading the whole file, but if you have the RAM, why not...
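As a sketch of the line-at-a-time approach, the function below streams a delimited file and keeps only running totals in memory. The name count_missing_per_column is hypothetical, and it assumes a header line and that missing values appear as empty fields.

```python
import csv


def count_missing_per_column(path, delimiter="\t"):
    """Stream a delimited file and count empty fields per column.

    Returns (number_of_data_rows, {column_name: missing_count}).
    """
    with open(path, "r", newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        header = next(reader)
        missing = {name: 0 for name in header}
        n_rows = 0
        # The reader yields one row at a time, so memory use stays flat.
        for row in reader:
            n_rows += 1
            for name, value in zip(header, row):
                if value == "":
                    missing[name] += 1
    return n_rows, missing
```

Only one row is held at a time, so this also works on files that are too large to open in an editor.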

I suspect it doesn't open in Sublime because the file is huge, but that is just a guess.

To get a quick overview of what the file may contain, you can do this in a terminal using strings or cat, for example:

$ strings file.data

or

$ cat -v file.data

If you forget to pass the -v option to cat and the file is binary, you could mess up your terminal and need to reset it:

$ reset
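If you prefer to stay in Python, a rough approximation of what strings does can be written with a regular expression over the raw bytes. This is only a sketch: the function name printable_runs, the minimum run length of 4, and the 1 MB read limit are assumptions, and the real strings tool has more options and handles more encodings.

```python
import re


def printable_runs(path, min_length=4, max_bytes=1_000_000):
    """Yield runs of printable ASCII found in the first max_bytes of a file."""
    with open(path, "rb") as f:
        data = f.read(max_bytes)
    # Printable ASCII (space through tilde) plus tab, at least min_length long.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")
```

Printing the first few results, for example with itertools.islice, gives a quick impression of the contents without sending raw bytes to the terminal.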

I was just dealing with this issue myself, so I thought I would share my answer. I have a .data file and was unable to open it by simply right-clicking it. macOS recommended I open it using Xcode, so I tried that, but it did not work.

Next I tried opening it using a program named "Brackets". It is a text editor primarily used for HTML and CSS. Brackets did work.

I also tried PyCharm, as I am a Python programmer. PyCharm worked as well, and I was also able to read from the file using the following lines of code:

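# open the file in text mode; the with block closes it automatically
with open("processed-1.cleveland.data", "r") as inf:
    # each line keeps its trailing newline, so suppress print's own
    for line in inf:
        print(line, end="")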

It works for me.

import pandas as pd
# define your file path here
your_data = pd.read_csv(file_path, sep=',')
your_data.head()

I mean, just treat it as a CSV file if it is separated with ','. Solution from @mustious.

I have a file named "MacBook Pro.data", 1 GB in size, and I can't open it. I have tried Terminal, Python, Text and more. Does anyone have an idea how to open it?
