How to create a .dat file with multiple spaces as the delimiter from an .xlsx file

Question

I have an xlsx file in which each row corresponds to a sample and each column holds one of its associated features.

I am trying to convert this xlsx file into a dat file, with multiple spaces separating the columns, as in the example below:

samples      property  feature1  feature2  feature3
sample1       3.0862    0.8626    0.7043    0.6312
sample2       2.8854    0.7260    0.7818    0.6119
sample3       0.6907    0.4943    0.0044    0.4420
sample4       0.9902    0.0106    0.0399    0.9877
sample5       0.7242    0.0970    0.3199    0.5504

I have tried doing this by creating a DataFrame in pandas and using DataFrame.to_csv to save the file as a .dat file, but to_csv only allows a single character as the delimiter. Does anyone know how I might go about creating a file like the one above?

Answer 1

You can load the Excel file with pandas and write out the DataFrame's string representation, to_string:

import pandas as pd

df = pd.read_excel('input.xlsx')
with open('output.dat', 'w') as f:
    # to_string pads each column with spaces so the columns line up
    f.write(df.to_string(index=False))
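
If the default number formatting is not what you want, to_string also accepts a float_format callable. The following is only a minimal sketch: it assumes a small in-memory DataFrame (values taken from the question) in place of the Excel file, and frame_to_text is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


def frame_to_text(df):
    """Render df as space-aligned text with four decimals per float."""
    # float_format takes a one-argument function applied to each float
    return df.to_string(index=False, float_format=lambda v: f"{v:.4f}")


# Stand-in for pd.read_excel('input.xlsx'); values are from the question
df = pd.DataFrame({
    "samples": ["sample1", "sample2"],
    "property": [3.0862, 2.8854],
    "feature1": [0.8626, 0.7260],
})

text = frame_to_text(df)
print(text)
```

The returned string can be written to a file with f.write(text) exactly as in the snippet above.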

Answer 2

Here is another approach that does not use a DataFrame. It gives more flexibility because we build the layout ourselves from the ground up.

Suppose you have read the xlsx file and stored it as a 2-D list as follows:

lines = [['sample1', 3.0862, 0.8626, 0.7043, 0.6312],
        ['sample2', 2.8854, 0.7260, 0.7818, 0.6119],
        ['sample3', 0.6907, 0.4943, 0.0044, 0.4420],
        ['sample4', 0.9902, 0.0106, 0.0399, 0.9877],
        ['sample5', 0.7242, 0.0970, 0.3199, 0.5504]]

We can use string methods such as ljust, rjust, or center. Here I only show ljust, which takes the total width as its first argument: the string is left-justified and padded with spaces up to that width.
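
As a quick illustration of the three methods (a small sketch; the width of 12 and the variable names are chosen here only for demonstration):

```python
name = "sample1"
width = 12  # total width of the padded string, not the number of spaces added

left = name.ljust(width)     # text first, spaces after
right = name.rjust(width)    # spaces first, text after
middle = name.center(width)  # spaces on both sides

# The bars make the padding visible
for label, padded in (("ljust", left), ("rjust", right), ("center", middle)):
    print(f"{label:<6} |{padded}|")
```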

One could also use an f-string to do the padding, with a format such as f'{var:^10.4f}'. The meaning of each component is:

^ means centering (use < for left justification or > for right justification)
10 is the total field width, padding included
.4 is the number of decimal places
f means fixed-point float formatting

So, here is the final script.
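
To see the components of the format specification in action, here is a small sketch (the value 0.726 and the variable names are only for illustration):

```python
value = 0.726

centered = f"{value:^10.4f}"  # centered in a field of width 10
left = f"{value:<10.4f}"      # left-justified
right = f"{value:>10.4f}"     # right-justified

# The width can also come from a variable by nesting braces
width = 10
nested = f"{value:^{width}.4f}"

for cell in (centered, left, right, nested):
    print(f"|{cell}|")
```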

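padding1 = 12  # width of the sample-name column
padding2 = 10  # width of each numeric column

# Header row; the first column is one character wider so the names line up with the centered values
print('samples'.ljust(padding1 + 1) + 'property  ' + 'feature1  ' + 'feature2  ' + 'feature3')
for line in lines:
    # Left-justify the sample name
    text = line[0].ljust(padding1)
    for i in range(1, len(line)):
        # Center each value in its column with four decimal places
        text += f'{line[i]:^{padding2}.4f}'
    print(text)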
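
The script above prints to the console. To get an actual .dat file, the same formatting can be collected into lines and written out. This is a minimal sketch under a few assumptions: format_table is a hypothetical helper, trailing spaces are stripped from each line, and the file goes to a temporary directory only so the example runs anywhere.

```python
import os
import tempfile


def format_table(header, rows, name_width=12, value_width=10):
    """Build aligned text lines using the same widths as the script above."""
    first = header[0].ljust(name_width + 1)
    rest = "".join(h.ljust(value_width) for h in header[1:])
    out = [(first + rest).rstrip()]
    for name, *values in rows:
        cells = "".join(f"{v:^{value_width}.4f}" for v in values)
        out.append((name.ljust(name_width) + cells).rstrip())
    return out


header = ["samples", "property", "feature1", "feature2", "feature3"]
rows = [["sample1", 3.0862, 0.8626, 0.7043, 0.6312],
        ["sample2", 2.8854, 0.7260, 0.7818, 0.6119]]

table = format_table(header, rows)

# Temporary location for the demo; replace with any path, e.g. 'output.dat'
path = os.path.join(tempfile.mkdtemp(), "output.dat")
with open(path, "w") as f:
    f.write("\n".join(table) + "\n")

with open(path) as f:
    written = f.read()
print(written, end="")
```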

Answer 1
0 2021-12-16 14:58:21

Answer 2
0 2021-12-16 20:01:34