Converting a string into 3 character substrings
I have an assignment where we read from a text file of Covid-19 sequences. I have read in the first line as a string and now have to use a substring method to break this line down into groups of 3 characters, each of which forms a codon. I am having trouble visualizing how to break this down. This is the first line of the file; every 3 letters make up a codon:
AGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAG
What I have now is:
testLine = scan.nextLine();
for (int i = 0; i < testLine.length(); i += 3)
{
String codon = testLine.substring(0,3);
codonList.add(codon);
}
System.out.println(codonList);
I know I am close; the code above prints the first codon, AGA, 20 times. Here is the output:
[AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA, AGA]
Edit: I was able to get it working with everyone's help. The issue I am having now is replicating this for the whole file. I added a hasNext check and it doesn't seem to work the same way.
while(scan.hasNext())
testLine = scan.nextLine();
for (int i = 0; i < testLine.length(); i += 3)
{
String codon = testLine.substring(i, i + 3);
codonList.add(codon);
}
System.out.println(codonList);
}
Here is my output with the hasNext check added:
[ATT, AAT, TTT, AGT, AGT, GCT, ATC]
Just use the loop index in the call to substring:
String codon = testLine.substring(i, Math.min(i + 3, testLine.length()));
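For context, here is a minimal, self-contained sketch of that line inside the full loop. The class name CodonSplitter and the method name toCodons are made up for illustration and do not come from the question; the Math.min call is only there so that a line whose length is not a multiple of 3 yields a shorter final chunk instead of throwing StringIndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
class CodonSplitter {

    // Splits a sequence into consecutive chunks of up to 3 characters.
    static List<String> toCodons(String line) {
        List<String> codons = new ArrayList<>();
        for (int i = 0; i < line.length(); i += 3) {
            // Clamp the end index so a short trailing chunk does not overrun the string.
            int end = Math.min(i + 3, line.length());
            codons.add(line.substring(i, end));
        }
        return codons;
    }

    public static void main(String[] args) {
        String testLine = "AGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAG";
        // The list starts with AGA, TCT, GTT and holds 20 codons in total.
        System.out.println(toCodons(testLine));
    }
}
```

Whether a trailing chunk of one or two letters should be kept or discarded depends on the assignment; this sketch keeps it.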
String#split can also be used:
System.out.println(Arrays.toString(testLine.split("(?<=\\G.{3})")));
It seems you were very close. You need to use i instead of 0 in the loop.
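The same fix should carry over to the whole-file case raised in the question's edit. Assuming the code there is exactly as posted, the while has no opening brace, so only the nextLine() call repeats and the for loop runs once, on the last line read. That would explain the short output. The sketch below wraps the whole body in braces and pairs hasNextLine() with nextLine(). The class name CodonFileReader, the method name readCodons, and the two-line in-memory input are placeholders for illustration; in the assignment the Scanner would be opened on the sequence file instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class, named here only for illustration.
class CodonFileReader {

    // Reads every remaining line from the scanner and splits each one into codons.
    static List<String> readCodons(Scanner scan) {
        List<String> codonList = new ArrayList<>();
        while (scan.hasNextLine()) {          // braces keep the for loop inside the while
            String line = scan.nextLine();
            for (int i = 0; i < line.length(); i += 3) {
                codonList.add(line.substring(i, Math.min(i + 3, line.length())));
            }
        }
        return codonList;
    }

    public static void main(String[] args) {
        // Made-up two-line input standing in for the sequence file.
        Scanner scan = new Scanner("AGATCT\nGTTCTC\n");
        System.out.println(readCodons(scan)); // prints [AGA, TCT, GTT, CTC]
    }
}
```

Note that this treats each line independently, so a codon never spans a line break. If the file wraps one long sequence across lines, the lines would need to be joined before splitting.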
Here is my solution in C#. I know you asked about Java, but I had a C# IDE open...
List<string> codonList = new List<string>();
string testLine = "AGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAG";
for (int i = 0; i < testLine.Length; i += 3)
{
String codon = testLine.Substring(i, 3);
codonList.Add(codon);
}
int cnt = 0;
foreach (string s in codonList)
{
cnt++;
if (cnt != codonList.Count)
{
Console.Write(s + ", ");
}
else
{
Console.WriteLine(s);
}
}
Console.ReadLine();
This will work:
String testLine = "AGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAG";
List<String> codonList = new ArrayList<String>();
String newTestLine = testLine;
for (int i = 0; i < testLine.length(); i += 3) {
newTestLine = testLine.substring(i);
String codon = newTestLine.substring(0, 3);
codonList.add(codon);
}
System.out.println(codonList);
Here's a one-liner:
String[] parts = testLine.split("(?<=\\G...)");
This works by splitting at points in the input that are 3 characters after the end of the last match (denoted by \G, which is initialized to the start of the input).
If you really need a List:
List<String> parts = Arrays.asList(testLine.split("(?<=\\G...)"));
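As a rough, runnable illustration of the regex approach: the class name RegexCodonSplit and the method name splitEveryThree are invented for this sketch. Arrays.asList returns a fixed-size list backed by the array, so the sketch copies it into an ArrayList in case the result needs to grow later. That copy is an optional design choice and not part of the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
class RegexCodonSplit {

    // Splits after every 3 characters using a lookbehind anchored at \G.
    static List<String> splitEveryThree(String line) {
        String[] parts = line.split("(?<=\\G...)");
        // Copy into an ArrayList so callers can add or remove elements.
        return new ArrayList<>(Arrays.asList(parts));
    }

    public static void main(String[] args) {
        System.out.println(splitEveryThree("ABCDEFG")); // prints [ABC, DEF, G]
    }
}
```

A trailing group shorter than 3 characters ends up as the last element, much like the Math.min variant of the loop.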