Reading a txt file (dictionary-like format) into a pandas dataframe

Question

I have a txt file that looks like this:

('GTCC', 'ACTB'): 1
('GTCC', 'GAPDH'): 2
('CGAG', 'ACTB'): 1
('CGAG', 'GAPDH'): 4

where the first string is a gRNA name, the second string is a gene name, and the number is a count of how often the two occur together.

I want to read this into a pandas dataframe and reshape it so that it looks like this:

      ACTB GAPDH
GTCC   1     2
CGAG   1     4

How might I do this?

The file will not always be this size. It will often be much larger (200 gRNA names x 20 gene names), and the size will vary. There will always be exactly one gRNA name and one gene name per count. The row and column labels shown here are representative of the real file (some string of letters for the rows and some gene name for the columns).

Answer 1

This is certainly not the cleanest way to do it, but I figured out a way to get what I wanted:

import pandas as pd

# Split each line on "," or ":" to get three raw fields
df = pd.read_csv('test.txt', sep=",|:", engine='python', names=['gRNA', 'gene', 'count'])
# Strip the parentheses, quotes, and spaces left around the names
df["gRNA"] = df["gRNA"].str.strip("()' ")
df["gene"] = df["gene"].str.strip("()' ")
# One row per gRNA, one column per gene
df = df.pivot(index='gRNA', columns='gene', values='count')
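
Since each key in the file is written like a Python tuple literal, another option is to parse the lines directly instead of cleaning up the columns afterwards. The following is only a sketch, assuming every non-empty line has the form ('gRNA', 'gene'): count as in the question; the function name read_pair_counts is made up for illustration.

```python
import ast

import pandas as pd


def read_pair_counts(path):
    """Parse lines like ('GTCC', 'ACTB'): 1 into a gRNA x gene table.

    Hypothetical helper; assumes one (gRNA, gene) pair and one integer
    count per line.
    """
    records = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the last colon: tuple text on the left, count on the right
            key_text, count_text = line.rsplit(":", 1)
            # literal_eval safely turns "('GTCC', 'ACTB')" into a Python tuple
            grna, gene = ast.literal_eval(key_text)
            records.append((grna, gene, int(count_text)))
    df = pd.DataFrame(records, columns=["gRNA", "gene", "count"])
    # One row per gRNA, one column per gene
    return df.pivot(index="gRNA", columns="gene", values="count")
```

Calling read_pair_counts('test.txt') should give the same kind of table as the pivot above. Note that pivot sorts the row and column labels, so the rows may not appear in file order.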
