Python: storing and printing data from a text file

Question

I'm trying to write a Python script that takes a sequence of letters (A, C, G, T) from the user and prints the percentage of A's, C's, G's, and T's.

For example, if the user types AGGTGACCCT, the output should be A: 20 C: 30 G: 30 T: 20

I'm fairly experienced with Java, but new to Python. I don't know how to read input the way I would with a Scanner in Java. I tried searching through a reference library but couldn't really figure anything out.

Answer 1

collections.Counter is a very handy tool and worth learning about when you start using Python.

from collections import Counter

inp = input("Enter letters: ")  # use raw_input() instead on Python 2

seq = inp.strip()  # drop leading/trailing whitespace; use inp itself if you want to count it
l = len(seq)  # length of the cleaned input string

c = Counter(seq)  # count the same string that was measured

for char in c:
    c[char] = c[char] * 100 / l  # true division on Python 3; use 100.0 on Python 2
print(c)
Counter({'G': 30.0, 'C': 30.0, 'A': 20.0, 'T': 20.0})
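
To get the exact output format from the question, the Counter approach could be wrapped roughly as below. This is only a sketch: the names base_percentages, format_percentages and BASES are made up for illustration, and ignoring every character other than A, C, G and T is an assumption about what the asker wants.

```python
from collections import Counter

BASES = "ACGT"  # the four letters named in the question


def base_percentages(sequence):
    """Return {base: percentage} for A, C, G and T in `sequence`."""
    # Keep only the expected letters so spaces or a trailing newline are not counted.
    letters = [ch for ch in sequence.upper() if ch in BASES]
    total = len(letters)
    if total == 0:
        return {base: 0.0 for base in BASES}
    counts = Counter(letters)
    return {base: counts[base] * 100 / total for base in BASES}


def format_percentages(percentages):
    """Format the result as 'A: 20 C: 30 G: 30 T: 20'."""
    return " ".join("{}: {:.0f}".format(base, percentages[base]) for base in BASES)


print(format_percentages(base_percentages("AGGTGACCCT")))
# A: 20 C: 30 G: 30 T: 20
```

In an interactive script, the literal string would be replaced by input("Enter letters: ").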

The csv module has a DictWriter class that can write the data to a file.
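
A minimal sketch of the DictWriter idea, assuming the percentages are already in a plain dict. The helper name write_percentages and the file name in the trailing comment are hypothetical; an in-memory buffer stands in for a real file so the example runs anywhere.

```python
import csv
import io


def write_percentages(percentages, out_file):
    """Write one header row and one data row to an open text file object."""
    writer = csv.DictWriter(out_file, fieldnames=sorted(percentages))
    writer.writeheader()
    writer.writerow(percentages)


percentages = {"A": 20.0, "C": 30.0, "G": 30.0, "T": 20.0}

buffer = io.StringIO()  # stands in for a file opened for writing
write_percentages(percentages, buffer)
print(buffer.getvalue())

# To write to disk instead (hypothetical file name):
# with open("percentages.csv", "w", newline="") as f:
#     write_percentages(percentages, f)
```

Opening the file with newline="" is what the csv documentation recommends, so the writer controls line endings itself.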

Answer 2

You can read directly from the standard input stream, sys.stdin, like so:

$ cat read.py
import sys

userin = sys.stdin.read()  # reads until end of input (Ctrl-D on Unix)
print(list(userin))  # one list element per character, newline included

$ python read.py
HELLO
['H', 'E', 'L', 'L', 'O', '\n']

And then you can redirect a text file to stdin, like:

$ cat input.txt 
HELLO
$ python read.py < input.txt 
['H', 'E', 'L', 'L', 'O', '\n']

Or, if you want to read a file directly:

>>> with open('input.txt') as f:  # text mode, so read() returns a str
...     print(list(f.read()))
... 
['H', 'E', 'L', 'L', 'O', '\n']
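
Putting the file-reading step together with the counting the question asks for might look like the sketch below. The function name percentages_from_file is hypothetical, and the temporary file only exists so the example is self-contained; skipping anything that is not A, C, G or T (such as the trailing newline) is an assumption.

```python
import os
import tempfile
from collections import Counter


def percentages_from_file(path):
    """Read a text file and return {base: percentage} for A, C, G and T."""
    with open(path) as f:
        text = f.read()
    # Skip the trailing newline and any other unexpected characters.
    letters = [ch for ch in text if ch in "ACGT"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {base: counts[base] * 100 / total for base in "ACGT"}


# Create a throwaway input file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("AGGTGACCCT\n")
    demo_path = tmp.name

result = percentages_from_file(demo_path)
os.remove(demo_path)
print(result)
# {'A': 20.0, 'C': 30.0, 'G': 30.0, 'T': 20.0}
```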

Answer 3

If you can save the sequence in a comma-separated values (CSV) file, then you could do something along the lines of:

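import pandas as pd

file_name = 'sequence.csv'  # placeholder path: one letter per comma-separated field
# header=None keeps the first row as data; ravel() flattens the table to a 1-D array
sequence = pd.read_csv(file_name, header=None).values.ravel()
As = 0
Cs = 0
Gs = 0
Ts = 0
total = len(sequence)

# tally each letter
for letter in sequence:
    if letter == 'A':
        As += 1
    elif letter == 'C':
        Cs += 1
    elif letter == 'G':
        Gs += 1
    elif letter == 'T':
        Ts += 1

# each percentage uses its own counter; 100.0 forces float division on Python 2
percent_A = 100.0 * As / total
percent_C = 100.0 * Cs / total
percent_T = 100.0 * Ts / total
percent_G = 100.0 * Gs / total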

Or:

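import pandas as pd

file_name = 'sequence.csv'  # placeholder path: one letter per comma-separated field
# header=None keeps the first row as data; ravel() flattens the table to a 1-D array
sequence = pd.read_csv(file_name, header=None).values.ravel()
sequence_list = list(sequence)

As = sequence_list.count('A')
Cs = sequence_list.count('C')
Gs = sequence_list.count('G')
Ts = sequence_list.count('T')

total = len(sequence_list)

# each percentage uses its own counter; 100.0 forces float division on Python 2
percent_A = 100.0 * As / total
percent_C = 100.0 * Cs / total
percent_T = 100.0 * Ts / total
percent_G = 100.0 * Gs / total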

This general structure holds for TSVs as well; pass sep='\t' to pd.read_csv.
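
For comparison, the same count-and-divide structure can be sketched without pandas, using the standard-library csv module and switching only the delimiter for tab-separated data. The helper name read_letters and the sample data are made up for illustration, and io.StringIO stands in for an opened file.

```python
import csv
import io


def read_letters(text_stream, delimiter=","):
    """Flatten every non-empty cell of a delimited stream into one list."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    return [cell.strip() for row in reader for cell in row if cell.strip()]


# Two tab-separated rows that together spell AGGTGACCCT.
tsv_data = io.StringIO("A\tG\tG\tT\tG\nA\tC\tC\tC\tT\n")
letters = read_letters(tsv_data, delimiter="\t")

total = len(letters)
for base in "ACGT":
    print(base, 100 * letters.count(base) / total)
```

Passing delimiter="," (the default here) would handle the comma-separated case with the same code.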

3 answers

Answer 1
0 accepted 2014-08-06 18:15:00

Answer 2
0 2014-08-06 18:18:48

Answer 3
0 2014-08-06 18:24:38
