
Statistical analysis of .h5 files (SPSS?)

I have two sets of data in separate .h5 files (Hierarchical Data Format 5, HDF5), obtained with Python scripts, and I would like to perform statistical analysis to find correlations between them. My experience here is limited; I don't know any R.

I would like to load the data into SPSS, but SPSS doesn't seem to support .h5. What would be the best way to go here? I can write everything to a .csv file, but I would lose the names of the variables. Is there a way to convert the data without losing any information? And why doesn't SPSS support h5 anyway?

I am aware of the existence of the Rpy module. Do you think it is worthwhile to learn programming in R? Would this give me the same arsenal of methods as I have in SPSS?

Thank you for your input!

Is there a way to convert the data without losing any information?

If the HDF5 data is regular enough, you can just load it in Python or R and save it out again as CSV with the variable names as the header row (or even in SPSS .sav format, if you're a bit more adventurous and/or care about performance).
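As a rough illustration of the CSV route, here is a minimal Python sketch. It assumes each .h5 file holds flat, equal-length 1-D datasets at the top level, which may not match your files; the helper names `read_h5_columns` and `write_columns_csv` are made up for this example, and `h5py` is a third-party package that has to be installed separately. The dataset names become the CSV header row, so the variable names survive the conversion.

```python
import csv


def read_h5_columns(h5_path):
    """Read every top-level 1-D dataset of an HDF5 file into a dict of lists."""
    import h5py  # third-party; imported here so the CSV helper works without it

    columns = {}
    with h5py.File(h5_path, "r") as f:
        for name, item in f.items():
            # Assumption: only flat columns are wanted; groups and
            # multi-dimensional datasets are skipped.
            if isinstance(item, h5py.Dataset) and item.ndim == 1:
                columns[name] = item[()].tolist()
    return columns


def write_columns_csv(columns, csv_path):
    """Write equal-length columns to CSV, using their names as the header row."""
    names = list(columns)
    if len({len(columns[n]) for n in names}) > 1:
        raise ValueError("columns differ in length; export them separately")
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(names)  # variable names are kept here
        writer.writerows(zip(*(columns[n] for n in names)))
```

Usage would look like `write_columns_csv(read_h5_columns("set_a.h5"), "set_a.csv")`, where both file names are placeholders. SPSS can then read the first row of the CSV as variable names on import. For writing .sav directly, the third-party `pyreadstat` package is one option worth checking.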

Why doesn't SPSS support h5 anyway?

Who knows. It probably should. Oh well.

Do you think it is worthwhile to learn programming in R?

If you find SPSS useful, you may also find R useful. Since you mentioned Python, you may find that useful too, but it's more of a general-purpose language: more flexible, but less focused on math and stats.

Would R give me the same arsenal of methods as I have in SPSS?

Probably, depending on exactly what you're doing. R has most of what you'd want for math and stats, including some fairly esoteric and/or new algorithms in installable packages. It has a few things Python doesn't have (yet), but Python also covers most of the bases for many users.
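Since the stated goal is finding correlations between two data sets, here is a small sketch of a Pearson correlation coefficient in plain Python, just to show that the basic case needs no special tooling. The function name `pearson_r` and the sample values are illustrative only; for real work a statistics library (or SPSS or R) would also give you p-values and handle missing data, which this sketch does not.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Made-up values standing in for one variable from each .h5 file.
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```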


 