How can I change the letters of a sequence in a text file?

I need to improve and extend this code. In detail, I have a text file containing genotype codes (e.g. AGGGGCCCTATTCGCCC.....) and I want to change these codes like this:

A -> T

G -> C

C -> G

T -> A

I mean that A changes to T, and so on, as listed above. Then I want to save the new code back to my file.

I would be grateful if you could guide me through this.

#include <iostream>
#include <fstream>
#include <vector>
#include <string>

int readFile (std::string Genotype, std::vector<std::string>& fileContent)
{

    // Opening the Genotype file
    std::ifstream CGenotype("AT.txt");

    // Checking if object is valid
    if (CGenotype.fail())
    {
        std::cout << "Cannot open the Genotype File : " << Genotype << std::endl;
        return EXIT_FAILURE;
    }

    if (CGenotype.peek() == std::ifstream::traits_type::eof())
    {
        std::cout << "The file is empty: " << Genotype << std::endl;
        return EXIT_FAILURE;
    }
    std::string str;
    // Reading the next line from the genotype file until it reaches the end.
    while (std::getline(CGenotype, str))
    {
        // Line contains string of length > 0 then save it in vector
        if (str.size() > 0)
        {
            fileContent.push_back(str);
        }
    }
    //Closing the genotype file
    CGenotype.close();
    return EXIT_SUCCESS;
}

int writeFile (std::string Genotype, std::vector<std::string>& fileContent)
{
    std::string str;
    while (std::getline(CGenotype, str))
    {
    if (str== 'A';
    cout << 'T';
    else if (str== 'T';
    cout << 'A';
    else if (str== 'C';
    cout << 'G';
    else if (str== 'G';
    cout << 'C';
    }
    CGenotype.close();
} 
int main()
{
    std::vector<std::string> fileContent;

    // Getting the contents of genotype file in a vector
    int fileCheck = readFile("AT.txt", fileContent);

    if (!fileCheck)
    {
        // Printing the vector contents
        for (std::string& line : fileContent)
            std::cout << line << std::endl;
    }
}

I'm thinking something like this (Explanation embedded where appropriate):

#include <algorithm>
#include <iterator>
#include <fstream>
#include <filesystem>
int main()
{
    {
        // open input and disposable temporary output file
        std::ifstream in("in.txt");
        std::ofstream out("out.txt");
        
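        // read each character from the input file and write the transformed
        // character to the output file; the streambuf iterators copy every
        // character, whereas std::istream_iterator<char> would skip whitespace
        // and so drop the newlines from the output
        std::transform(std::istreambuf_iterator<char>(in),
                         std::istreambuf_iterator<char>(),
                         std::ostreambuf_iterator<char>(out),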
                         [](char val)
                         {
                             switch(val)
                             {
                                 case 'A': return 'T';
                                 case 'G': return 'C';
                                 case 'C': return 'G';
                                 case 'T': return 'A';
                                 default: return val;
                             }
                         });
    } // RAII closes open files here
    
    // replace input file
    std::filesystem::remove("in.txt"); 
    std::filesystem::rename("out.txt", "in.txt");
}

Rationale for not transforming the file in place, as the other answers do: if anything goes wrong before the input file is replaced by the output file, the input file is undamaged. The window in which a failure could leave a half-transformed file is minimal.
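To make that rationale concrete, here is a minimal sketch of the same write-then-rename idea with the failure paths checked. The names `complementBase` and `complementFileSafely` and the `.tmp` suffix are my own assumptions for illustration, not part of the answer above. It relies on `std::filesystem::rename` replacing an existing regular file, which is the behaviour the C++17 standard describes.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>

// Hypothetical helper: maps one base to its complement, leaving any other
// character (newlines, unknown letters) unchanged.
char complementBase(char base)
{
    switch (base)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:  return base;
    }
}

// Hypothetical helper: writes the complemented copy to a temporary file next
// to `path`, then renames it over the original. Returns false if any step
// fails; the original file is not modified in that case.
bool complementFileSafely(const std::filesystem::path& path)
{
    std::filesystem::path tmp = path;
    tmp += ".tmp"; // assumed name for the temporary file

    bool written = false;
    {
        std::ifstream in(path, std::ios::binary);
        if (!in)
            return false; // nothing has been created or changed yet

        std::ofstream out(tmp, std::ios::binary);
        if (out)
        {
            // streambuf iterators copy every byte, including newlines
            auto end = std::transform(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>(),
                                      std::ostreambuf_iterator<char>(out),
                                      complementBase);
            out.flush();
            written = !end.failed() && out.good();
        }
    } // both files are closed here

    std::error_code ec;
    if (written)
        std::filesystem::rename(tmp, path, ec); // replaces the original

    if (!written || ec)
    {
        std::error_code ignored;
        std::filesystem::remove(tmp, ignored); // discard the partial output
        return false;
    }
    return true;
}
```

Calling `complementFileSafely("AT.txt")` from `main` and checking the returned `bool` would be the intended use. Until the rename succeeds, the original file is only ever read.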

How about something like this? This version processes each character, not each line. And to keep things short, I didn't include any domain-specific error handling.

I'm assuming you wanted to process each individual character... and each character is either replaced inline or left as is.

#include <cstdlib>
#include <fstream>

int main() {
    // ... Open the file (will default to both read and write)
    std::fstream s("AT.txt");
    if ( !s ) return EXIT_FAILURE ; // ... could not open the file

    // ... Start at the initial position (i.e., 0)
    std::streamoff pos = 0 ;
    char c ;

    // ... Repeat: read a character until you can't
    while ( s.seekg(pos) && s.get(c) ) {
        char r = c ;
        // ... Parse the current character
        switch( c ) {
        case 'A': r = 'T'; break ; // ... replace inline
        case 'G': r = 'C'; break ; // ... replace inline
        case 'C': r = 'G'; break ; // ... replace inline
        case 'T': r = 'A'; break ; // ... replace inline
        default:           break ; // ... nothing to translate
        }
        // ... Seek back before switching from reading to writing
        if ( r != c ) { s.seekp(pos) ; s.put(r) ; }
        ++pos ;
    }
    // .... File will close automagically
    return EXIT_SUCCESS ;
}
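If you would rather stay close to the line-based `readFile` layout from the question, a minimal sketch could look like the following. The names `complement`, `complementLines` and `writeLines` are hypothetical and introduced only for illustration. The sketch assumes the whole file fits in memory as a `std::vector<std::string>`.

```cpp
#include <algorithm>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the complement of one base; any other
// character is passed through unchanged.
char complement(char base)
{
    switch (base)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'G': return 'C';
        case 'C': return 'G';
        default:  return base;
    }
}

// Hypothetical helper: complements every character of every line in place.
void complementLines(std::vector<std::string>& lines)
{
    for (std::string& line : lines)
        std::transform(line.begin(), line.end(), line.begin(), complement);
}

// Hypothetical helper: overwrites fileName with the given lines, one per
// line. Returns true if the file could be opened and every write succeeded.
bool writeLines(const std::string& fileName, const std::vector<std::string>& lines)
{
    std::ofstream out(fileName);
    if (!out)
        return false;

    for (const std::string& line : lines)
        out << line << '\n';

    return static_cast<bool>(out);
}
```

In the question's `main`, once `readFile` has succeeded, calling `complementLines(fileContent)` followed by `writeLines("AT.txt", fileContent)` would store the converted sequence. Because the question's `readFile` skips empty lines, any blank lines in the original file would not be written back.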
