Read Matlab Data File into Python, Need to Export to CSV

I have read a Matlab file containing a large number of arrays into Python, storing the resulting dictionary in the variable mat with the command:

mat = loadmat('Sample Matlab Extract.mat')

Is there a way to use Python's csv module to save this dictionary as a comma-separated file?

import csv

# Python 3: open in text mode with newline=''; the with block closes the file
with open('mycsvfile.csv', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerows(mat.items())

This creates a CSV file with one column containing the array names from the dictionary and a second column containing only the first element of each corresponding array. Is there a similar command that writes all the elements of the arrays inside the 'mat' dictionary?

The function scipy.io.loadmat returns a dictionary that looks something like this:

{'__globals__': [],
 '__header__': 'MATLAB 5.0 MAT-file, Platform: MACI, Created on: Wed Sep 24 16:11:51 2014',
 '__version__': '1.0',
 'a': array([[1, 2, 3]], dtype=uint8),
 'b': array([[4, 5, 6]], dtype=uint8)}
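
Note that each MATLAB variable comes back as a 2-D array, even when it is a single row. A minimal sketch for inspecting this, using a hand-built dictionary as a stand-in for the loadmat output above (the stand-in and the helper name data_variables are assumptions made for illustration):

```python
import numpy as np

# Hand-built stand-in for the dictionary shown above (not read from a real file).
mat = {
    "__globals__": [],
    "__header__": "MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "a": np.array([[1, 2, 3]], dtype=np.uint8),
    "b": np.array([[4, 5, 6]], dtype=np.uint8),
}

def data_variables(mat_dict):
    """Return only the real variables, dropping the '__...__' metadata keys."""
    return {k: v for k, v in mat_dict.items() if not k.startswith("__")}

variables = data_variables(mat)
shapes = {name: arr.shape for name, arr in variables.items()}
print(shapes)  # {'a': (1, 3), 'b': (1, 3)}
```

Each row vector has shape (1, N), so indexing with [0] gives the N values themselves.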

It sounds like you want a .csv file with the keys "a", "b", etc. as the column names and their corresponding arrays as the data in each column. If so, I would recommend using pandas to build a nicely formatted dataset that can be exported to a .csv file. First, remove the metadata entries from the dictionary (all the keys beginning with "__"). Then turn each value in the dictionary into a pandas.Series object. The dictionary can then be turned into a pandas.DataFrame object, which can be saved as a .csv file. Your code would look like this:

import scipy.io
import pandas as pd

mat = scipy.io.loadmat('matex.mat')
mat = {k:v for k, v in mat.items() if k[0] != '_'}
data = pd.DataFrame({k: pd.Series(v[0]) for k, v in mat.items()}) # compatible for both python 2.x and python 3.x

data.to_csv("example.csv")
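
The code above takes v[0], which assumes each variable is a single row. Here is a hedged sketch of a variant that flattens every variable with ravel() and lets pandas pad shorter columns with empty cells. The file names, the sample arrays, and the helper mat_to_dataframe are made up for illustration, and the .mat file is created with scipy.io.savemat only so that the example runs on its own:

```python
import os
import tempfile

import numpy as np
import pandas as pd
import scipy.io

def mat_to_dataframe(mat_path):
    """Load a .mat file and return one DataFrame column per numeric variable."""
    mat = scipy.io.loadmat(mat_path)
    # Drop loadmat's metadata entries (__header__, __version__, __globals__).
    variables = {k: v for k, v in mat.items() if not k.startswith("__")}
    # ravel() flattens any shape to 1-D; Series of unequal length are padded with NaN.
    return pd.DataFrame({k: pd.Series(v.ravel()) for k, v in variables.items()})

# Build a small sample file so the sketch is self-contained.
workdir = tempfile.mkdtemp()
mat_path = os.path.join(workdir, "sample.mat")
csv_path = os.path.join(workdir, "sample.csv")
scipy.io.savemat(mat_path, {"a": np.array([1, 2, 3]), "b": np.array([4, 5])})

data = mat_to_dataframe(mat_path)
data.to_csv(csv_path, index=False)  # index=False omits the row-number column
```

Keep in mind that ravel() flattens a matrix row by row, so a genuinely two-dimensional variable may be better written to a file of its own.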

This is a simple way to convert the variables in a .mat file to .csv using np.savetxt. Try it:

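import scipy.io
import numpy as np

data = scipy.io.loadmat("file.mat")

for i in data:
    # skip metadata keys such as __header__ and any 'readme' entry
    if '__' not in i and 'readme' not in i:
        # write one CSV per variable so later variables do not overwrite earlier ones
        np.savetxt(i + ".csv", data[i], delimiter=',')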

import scipy.io
import pandas as pd

class MatDataToCSV():

    def __init__(self):
        pass

    def convert_mat_tocsv(self):

        mat = scipy.io.loadmat('wiki.mat')

        instances = mat['wiki'][0][0][0].shape[1]
        columns = ["dob", "photo_taken", "full_path", "gender",\
                "name", "face_location", "face_score", "second_face_score"]
        df = pd.DataFrame(index = range(0,instances), columns = columns)

        for i in mat:
            if i == "wiki":
                current_array = mat[i][0][0]
                for j in range(len(current_array)):
                    df[columns[j]] = pd.DataFrame(current_array[j][0])
        return df
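
The chained [0][0] indexing works because loadmat returns a MATLAB struct as a 1x1 structured NumPy array by default. As a sketch of the same idea that reads the field names from dtype.names instead of hard-coding them, assuming every field holds a vector of equal length (the struct name people, its fields, and the helper struct_to_dataframe are invented for this example, not taken from wiki.mat):

```python
import os
import tempfile

import numpy as np
import pandas as pd
import scipy.io

def struct_to_dataframe(mat_path, struct_name):
    """Turn a 1x1 MATLAB struct of equal-length vectors into a DataFrame."""
    mat = scipy.io.loadmat(mat_path)
    record = mat[struct_name][0, 0]          # the single struct element
    fields = mat[struct_name].dtype.names    # field names stored in the file
    # Each field is a 2-D array such as (1, N); flatten it into one column.
    return pd.DataFrame({name: record[name].ravel() for name in fields})

# Create a tiny struct on disk so the sketch is self-contained.
mat_path = os.path.join(tempfile.mkdtemp(), "people.mat")
scipy.io.savemat(mat_path, {"people": {"dob": np.array([1980, 1991, 2002]),
                                       "gender": np.array([0.0, 1.0, 1.0])}})

df = struct_to_dataframe(mat_path, "people")
```

From there, df.to_csv("people.csv", index=False) would write the table to disk.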

Reading a .mat file with data = scipy.io.loadmat(files[0]) gives a dictionary of keys and values. The keys '__header__', '__version__' and '__globals__' are default metadata entries that we need to remove:

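import pandas as pd

# collect the variable names, skipping loadmat's metadata keys
cols = []
for i in data:
    if '__' not in i:
        cols.append(i)
temp_df = pd.DataFrame(columns=cols)
# flatten each array to 1-D and assign it to its column
for i in data:
    if '__' not in i:
        temp_df[i] = data[i].ravel()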

We skip the unwanted metadata keys using "if '__' not in i:", build a dataframe whose columns are the remaining keys, and finally assign each array's values to its column.
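
The snippet above stops at the dataframe and needs all arrays to have the same number of elements. For readers who prefer the standard csv module that the question started with, here is a sketch that writes one column per variable and pads shorter columns with empty cells. The helper write_columns and the sample dictionary are assumptions for illustration, with the dictionary standing in for loadmat output whose metadata keys are already removed:

```python
import csv
import io
from itertools import zip_longest

import numpy as np

def write_columns(variables, fileobj):
    """Write each array as one CSV column, padding short columns with ''."""
    names = list(variables)
    # Flatten every array to a plain Python list of values.
    columns = [np.ravel(variables[name]).tolist() for name in names]
    writer = csv.writer(fileobj)
    writer.writerow(names)                                  # header row
    writer.writerows(zip_longest(*columns, fillvalue=""))   # one row per element index

# Stand-in for loadmat output with the metadata keys already removed.
data = {"a": np.array([[1, 2, 3]], dtype=np.uint8),
        "b": np.array([[4, 5]], dtype=np.uint8)}

buffer = io.StringIO()   # with a real file: open("out.csv", "w", newline="")
write_columns(data, buffer)
print(buffer.getvalue())
```

zip_longest transposes the columns into rows, which is what csv.writer expects.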
