How to expand tab characters into spaces and line up the columns in tabular format?

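My question is how to expand tab characters into spaces so that the columns line up in a tabular format.

The data arrives as a single string: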

'Patient_ID\tAge\tGender\tTumor_Size\tNearby_Cancer_Lymphnodes\tCancer_Spread\tHistological_Type\tLymph_Nodes\tTreatment\ntcga.5l.aat0\t42\tfemale\tt2\tn0\tm0\th_t_1\t0\tplan_1\ntcga.aq.a54o\t51\tmale\tt2\tn0\tm0\th_t_2\t0\tplan_2\ntcga.aq.a7u7\t55\tfemale\tt2\tn2a\tm0\th_t_1\t4\tplan_4\n'

I want the output to look like this:

Patient_ID     Age    Gender    Tumor_Size    Nearby_Cancer_Lymphnodes
tcga.5l.aat0   42     female    t2            n0

import pandas as pd

string = 'Patient_ID\tAge\tGender\tTumor_Size\tNearby_Cancer_Lymphnodes\tCancer_Spread\tHistological_Type\tLymph_Nodes\tTreatment\ntcga.5l.aat0\t42\tfemale\tt2\tn0\tm0\th_t_1\t0\tplan_1\ntcga.aq.a54o\t51\tmale\tt2\tn0\tm0\th_t_2\t0\tplan_2\ntcga.aq.a7u7\t55\tfemale\tt2\tn2a\tm0\th_t_1\t4\tplan_4\n'

lines = string.split('\n')[0:-1] # Split string by '\n', ignoring the last one
data = [line.split('\t') for line in lines] # Split strings by '\t'
df = pd.DataFrame(data) # Create a pandas data frame from the data
print(df)

              0    1       2           3                         4              5                  6            7          8
0    Patient_ID  Age  Gender  Tumor_Size  Nearby_Cancer_Lymphnodes  Cancer_Spread  Histological_Type  Lymph_Nodes  Treatment
1  tcga.5l.aat0   42  female          t2                        n0             m0              h_t_1            0     plan_1
2  tcga.aq.a54o   51    male          t2                        n0             m0              h_t_2            0     plan_2
3  tcga.aq.a7u7   55  female          t2                       n2a             m0              h_t_1            4     plan_4
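
If only the aligned text is needed, without the numeric index and column labels that the DataFrame printout adds, the same idea can be sketched in plain Python: split on tabs, measure the widest field in each column, and pad with str.ljust. The helper name align_columns and its gap parameter are made up for this illustration, and the sample string is a shortened version of the data in the question.

```python
def align_columns(text, gap=3):
    """Return tab-separated text as left-aligned, space-padded columns.

    `gap` is the number of spaces left after the widest field of a column.
    """
    # One list of fields per non-empty line
    rows = [line.split('\t') for line in text.splitlines() if line]
    if not rows:
        return ''
    n_cols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of fields
    rows = [row + [''] * (n_cols - len(row)) for row in rows]
    # Width of each column = length of its longest field
    widths = [max(len(row[i]) for row in rows) for i in range(n_cols)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width + gap) for cell, width in zip(row, widths)]
        lines.append(''.join(cells).rstrip())  # drop padding after the last field
    return '\n'.join(lines)


# Shortened sample based on the string in the question
sample = 'Patient_ID\tAge\tGender\ntcga.5l.aat0\t42\tfemale\n'
print(align_columns(sample))
# Patient_ID     Age   Gender
# tcga.5l.aat0   42    female
```

The built-in str.expandtabs(tabsize) can also replace tabs with spaces, but it only lines columns up when tabsize is larger than every field, so computing a width per column is usually the safer choice for data like this.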

If you want to process the data, I would recommend the pandas package, reading the string as if it were a file by wrapping it in StringIO:

from io import StringIO
import pandas as pd

s = """Patient_ID\tAge\tGender\tTumor_Size\tNearby_Cancer_Lymphnodes\tCancer_Spread\tHistological_Type\tLymph_Nodes\tTreatment\ntcga.5l.aat0\t42\tfemale\tt2\tn0\tm0\th_t_1\t0\tplan_1\ntcga.aq.a54o\t51\tmale\tt2\tn0\tm0\th_t_2\t0\tplan_2\ntcga.aq.a7u7\t55\tfemale\tt2\tn2a\tm0\th_t_1\t4\tplan_4\n"""
df = pd.read_csv(StringIO(s), sep="\t")
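
To actually see the aligned table from this approach, the frame still has to be printed. A minimal sketch, assuming a shortened version of the question's data: read_csv treats the first line as the header, and DataFrame.to_string(index=False) renders the columns padded with spaces and without the row index. Note that pandas right-aligns the values by default, so the layout differs slightly from the left-aligned output shown in the question.

```python
from io import StringIO

import pandas as pd

# Shortened sample based on the string in the question
s = "Patient_ID\tAge\tGender\ntcga.5l.aat0\t42\tfemale\ntcga.aq.a54o\t51\tmale\n"

# The first line becomes the column names; Age is parsed as an integer column
df = pd.read_csv(StringIO(s), sep="\t")

# Render as space-padded text without the 0, 1, 2, ... row index
table = df.to_string(index=False)
print(table)
```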
