
Strange CSV output when a CSV file is read from a GitHub repo using pandas on Debian OS

I have the following data in a CSV file:

XG,612.0
YG,-1924.0500000000002
ZG,-959.085
A_mod,6.889112523645457
I1_mod,0.478595694542785
I2_mod,32.64258822366686

If I open it in Excel or Atom, everything looks normal. The file is in a folder of my GitHub repo. I don't know whether that matters (it shouldn't), but when I read it with the pd.read_csv() function in Python, I get the following result:

[screenshot of the pd.read_csv() output; image not included]

It seems like pandas is reading some kind of metadata about the file rather than the file itself. I'm running Python 3.6 from JupyterLab on a Debian Google Cloud VM instance. I don't think any of this should be a problem, but this is the first time I've seen it happen and I have no clue what's going on.

Could someone please tell me how to fix this problem and explain what causes it?

Thank you very much in advance.

EDIT

The files are in a local folder cloned via URL from the GitHub website, so running git clone on your own machine should reproduce the same setup.

In Python, I'm using pd.read_csv('my_file.csv').

Another curious thing is that on my personal machine running Windows 10 I have no problem at all reading the files. The strange issue only appears on the Google Cloud VM instance, using exactly the same procedure.

You're looking at the Git LFS pointer file instead of the actual file. version, oid and size are parts of the Git LFS spec. Git LFS keeps these pointer files in the repository in lieu of the actual large data files. They are supposed to be seamlessly replaced with the real content on checkout.

Check the output of git clone on the box where you get the wrong result. There seems to be a configuration problem with Git LFS on that machine.
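To confirm the diagnosis before touching the Git setup, a minimal sketch like the following can tell a pointer file from real data. It relies on the pointer format described in the Git LFS spec, where the file begins with a `version https://git-lfs.github.com/spec/v1` line followed by `oid` and `size` lines. The helper names `is_lfs_pointer` and `parse_lfs_pointer` are hypothetical, made up for this illustration.

```python
# First line of a pointer file according to the Git LFS spec (v1).
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_HEADER))
    return head == LFS_HEADER


def parse_lfs_pointer(text):
    """Split pointer text into a dict of its space-separated key/value lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" ")
        if sep:  # skip blank or malformed lines
            fields[key] = value
    return fields
```

If `is_lfs_pointer('my_file.csv')` returns True on the VM, the clone contains pointers rather than data. Assuming Git LFS is installed there, running `git lfs install` and then `git lfs pull` inside the clone would normally fetch the real file contents.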
