How to create .dat file with multiple spaces as delimiter from .xlsx file
I have an xlsx file in which each row corresponds to a sample and each column holds one of its associated features, as shown here (image: xlsx file example).
I am trying to convert this xlsx file into a dat file, with multiple spaces separating the columns, as displayed in the example below:
samples     property    feature1    feature2    feature3
sample1     3.0862      0.8626      0.7043      0.6312
sample2     2.8854      0.7260      0.7818      0.6119
sample3     0.6907      0.4943      0.0044      0.4420
sample4     0.9902      0.0106      0.0399      0.9877
sample5     0.7242      0.0970      0.3199      0.5504
I have tried doing this by creating a dataframe in pandas and using dataframe.to_csv to save the file as a .dat file, but it only allows me to use one character as a delimiter. Does anyone know how I might go about creating a file like the one above?
This is another approach that does not use a DataFrame. It gives us more flexibility, since we build the whole structure ourselves from the ground up.
Suppose you have read the xlsx file and stored it as a 2-D list, as follows:
lines = [['sample1', 3.0862, 0.8626, 0.7043, 0.6312],
         ['sample2', 2.8854, 0.7260, 0.7818, 0.6119],
         ['sample3', 0.6907, 0.4943, 0.0044, 0.4420],
         ['sample4', 0.9902, 0.0106, 0.0399, 0.9877],
         ['sample5', 0.7242, 0.0970, 0.3199, 0.5504]]
We can make use of string methods like ljust, rjust, or center. Here I only show ljust, which takes the length as its first argument. The length is the total width of the left-justified result.
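To compare the three string methods side by side, here is a minimal sketch. The label and the width of 12 are arbitrary values chosen for illustration.

```python
# Pad the same label to a total width of 12 characters in three ways.
label = 'sample1'
width = 12  # assumed column width, for illustration only

left = label.ljust(width)     # text first, spaces after:  'sample1     '
right = label.rjust(width)    # spaces first, text after:  '     sample1'
middle = label.center(width)  # spaces split across both sides

# repr() makes the padding spaces visible in the output.
for padded in (left, right, middle):
    print(repr(padded), len(padded))
```

Each result is exactly 12 characters long, so strings padded this way line up in columns when they are concatenated.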
One could also use an f-string to do the padding, in the format f'{var:^10.4f}'. The meaning of each component is:
^ represents centering (can be changed to < for left justification or > for right justification)
10 is the padding length, i.e. the total field width
.4 is the number of decimal places
f means float (fixed-point notation)
So, here is the final script.
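padding1 = 12  # width of the sample-name column
padding2 = 10  # width of each numeric column

# Header row, padded with the same widths as the data rows
header = ['samples', 'property', 'feature1', 'feature2', 'feature3']
print(header[0].ljust(padding1) + ''.join(f'{name:^{padding2}}' for name in header[1:]))

for line in lines:
    text = line[0].ljust(padding1)  # left-justify the sample name
    for i in range(1, len(line)):
        text += f'{line[i]:^{padding2}.4f}'  # center each value, 4 decimal places
    print(text)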
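The script above prints the table, while the question asks for a .dat file. One possible way to write the same kind of padded lines to disk is sketched below. The helper names format_table and write_dat, the left-justified number format, and the output location in the system temporary directory are all assumptions made for illustration; they are not part of the original answer.

```python
import tempfile
from pathlib import Path


def format_table(header, rows, name_width=12, value_width=10):
    """Return the table as a list of space-padded text lines."""
    lines = [header[0].ljust(name_width)
             + ''.join(name.ljust(value_width) for name in header[1:])]
    for sample, *values in rows:
        # Left-justify each value in a fixed-width field, 4 decimal places.
        lines.append(sample.ljust(name_width)
                     + ''.join(f'{v:<{value_width}.4f}' for v in values))
    # Drop the padding after the last column.
    return [line.rstrip() for line in lines]


def write_dat(path, header, rows):
    """Write the formatted table to a text file, one row per line."""
    Path(path).write_text('\n'.join(format_table(header, rows)) + '\n',
                          encoding='utf-8')


header = ['samples', 'property', 'feature1', 'feature2', 'feature3']
rows = [['sample1', 3.0862, 0.8626, 0.7043, 0.6312],
        ['sample2', 2.8854, 0.7260, 0.7818, 0.6119]]

table = format_table(header, rows)
print('\n'.join(table))

# Assumed output location: the system temporary directory.
out_path = Path(tempfile.gettempdir()) / 'samples.dat'
write_dat(out_path, header, rows)
```

Because every field is padded to a fixed width, the columns end up separated by a varying number of spaces, so most whitespace-splitting readers should be able to parse the file.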