I have the following file containing some information about an audio file.
Language = es-ES
Duration = 00:00:00.9100000
Unos amigos.
*
Language = es-ES
Duration = 00:00:03.5600000
Yo sé vamos a la fiesta en English with.
*
Language = en-US
Duration = 00:00:05.0200000
Hey, let us go to the party and Spanish. We say bye Marcella.
*
Language = es-ES
Duration = 00:00:02.2700000
Fiesta que yo use.
*
Language = es-ES
Duration = 00:00:00.8300000
La fiesta.
I want to combine the durations and the sentences of all entries that share the same language. I was thinking of first splitting the file into an array of strings using * as a delimiter, but I don't know how to combine the durations or the sentences. Any help? I'm using C#. Is it better to create an object for each paragraph?
string[] subs = textFile.Split('*');
The wanted output:
Language = es-ES
Duration = 00:00:07.5700000
Unos amigos. Yo sé vamos a la fiesta en English with. Fiesta que yo use. La fiesta.
Language = en-US
Duration = 00:00:05.0200000
Hey, let us go to the party and Spanish. We say bye Marcella.
var source = @"Language = es-ES
Duration = 00:00:00.9100000
Unos amigos.
*
Language = es-ES
Duration = 00:00:03.5600000
Yo sé vamos a la fiesta en English with.
*
Language = en-US
Duration = 00:00:05.0200000
Hey, let's go to the party and Spanish. We say bye Marcella.
*
Language = es-ES
Duration = 00:00:02.2700000
Fiesta que yo use.
*
Language = es-ES
Duration = 00:00:00.8300000
La fiesta.";
var results =
    from section in source.Split(new string[] { $"*{Environment.NewLine}" }, StringSplitOptions.None)
    let parts = section.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
    let language = parts[0].Split('=', StringSplitOptions.TrimEntries)[1]
    let duration = TimeSpan.Parse(parts[1].Split('=', StringSplitOptions.TrimEntries)[1])
    let text = parts[2]
    group new { duration, text } by language into languages
    select new
    {
        language = languages.Key,
        duration = languages.Select(x => x.duration).Aggregate((x, y) => x.Add(y)),
        text = String.Join(" ", languages.Select(x => x.text)),
    };
Given this source data, the query yields one result per language: the durations are summed and the sentences are joined with spaces.
I would do something like this. It is quick, messy code, so take from it what you need. It would probably be best practice to make a class for each language.
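using System;
using System.Collections.Generic;

class Program
{
    // Parallel lists: index n holds the language, total duration and text of group n.
    static readonly List<string> language = new List<string>();
    static readonly List<TimeSpan> duration = new List<TimeSpan>();
    static readonly List<string> text = new List<string>();

    static void Main(string[] args)
    {
        string file = System.IO.File.ReadAllText(@"path\file.txt");
        string[] lines = file.Split('\n');
        // Each block is four lines: Language, Duration, text, "*".
        for (int i = 0; i + 2 < lines.Length; i += 4)
        {
            // Trim() also removes the '\r' left over from Windows line endings.
            string lang = lines[i].Trim();
            TimeSpan d = TimeSpan.Parse(lines[i + 1].Split('=')[1].Trim());
            string sentence = lines[i + 2].Trim();
            int pos = language.IndexOf(lang);
            if (pos != -1)
            {
                // TimeSpan is immutable: Add returns a new value, so assign it back.
                duration[pos] = duration[pos].Add(d);
                text[pos] += " " + sentence;
            }
            else
            {
                language.Add(lang);
                duration.Add(d);
                text.Add(sentence);
            }
        }
    }
}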
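On the "create an object for each paragraph" idea mentioned in both the question and this answer, here is a minimal sketch of what that could look like. The names Segment and SegmentCombiner are made up for illustration, and the code assumes every block has the shape "Language = ..., Duration = ..., text" and that the text itself never contains a * character.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical type: one "Language / Duration / text" block from the file.
public sealed class Segment
{
    public string Language { get; }
    public TimeSpan Duration { get; }
    public string Text { get; }

    public Segment(string language, TimeSpan duration, string text)
    {
        Language = language;
        Duration = duration;
        Text = text;
    }

    // Parses a single block; assumes line 1 is the language and line 2 the duration.
    public static Segment Parse(string block)
    {
        string[] lines = block.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        string language = lines[0].Split('=')[1].Trim();
        TimeSpan duration = TimeSpan.Parse(lines[1].Split('=')[1].Trim(), CultureInfo.InvariantCulture);
        // Everything after the two header lines is treated as the sentence text.
        string text = string.Join(" ", lines.Skip(2).Select(l => l.Trim()));
        return new Segment(language, duration, text);
    }
}

public static class SegmentCombiner
{
    // Splits the file on '*', parses each block, then merges blocks per language.
    public static List<Segment> Combine(string source)
    {
        return source
            .Split('*')
            .Where(block => !string.IsNullOrWhiteSpace(block))
            .Select(Segment.Parse)
            .GroupBy(s => s.Language)
            .Select(g => new Segment(
                g.Key,
                g.Aggregate(TimeSpan.Zero, (sum, s) => sum + s.Duration),
                string.Join(" ", g.Select(s => s.Text))))
            .ToList();
    }

    public static void Main()
    {
        string source =
            "Language = es-ES\nDuration = 00:00:00.9100000\nUnos amigos.\n*\n" +
            "Language = en-US\nDuration = 00:00:05.0200000\nHey, let us go to the party.\n*\n" +
            "Language = es-ES\nDuration = 00:00:00.8300000\nLa fiesta.";

        foreach (Segment s in Combine(source))
        {
            Console.WriteLine($"Language = {s.Language}");
            Console.WriteLine($"Duration = {s.Duration}");
            Console.WriteLine(s.Text);
        }
    }
}
```

GroupBy keeps the groups in the order their keys first appear, so es-ES is listed before en-US here. Keeping the parsing inside the class means the combining step only deals with typed values instead of string indexes.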