Reading integers from a text file that also contains words

Question

I am trying to read just the integers from a text file structured like this...

ALS 46000
BZK 39850
CAR 38000
//....

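using ifstream.

I have considered two options.

1) Regex using Boost

2) Creating a throwaway string (i.e. I read in a word, do nothing with it, then read in the score). However, this is a last resort.

Is there any way to express in C++ that I want the ifstream to read only the integers from the text? I would rather not use regular expressions if there turns out to be a much simpler way to accomplish this.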

Answer 1

Why make simple things complicated?

What is wrong with this:

ifstream ss("C:\\test.txt");

int score;
string name;
while( ss >> name >> score )
{
    // do something with score
}
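
For reference, the same loop can be wrapped into a small self-contained function. This is only a sketch: the helper name read_scores and the choice of taking a std::istream& (so that it accepts a std::ifstream as well as a std::istringstream) are assumptions made for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "NAME SCORE" pairs and keeps only the scores.
std::vector<int> read_scores(std::istream& in)
{
    std::vector<int> scores;
    std::string name;  // throwaway: the word in front of each number
    int score;

    // operator>> skips whitespace, so each pass consumes one word and one integer.
    // The loop ends at end of input or at the first pair that fails to parse.
    while (in >> name >> score)
        scores.push_back(score);

    return scores;
}
```

Passing a std::ifstream opened on the data file should behave the same way as passing a std::istringstream that holds the three sample lines.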

Answer 2

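Edit: it is in fact possible to have Spirit work on the stream directly, rather than the way I suggested previously, with this parser: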

+(omit[+(alpha|blank)] >> int_)

and one line of code (apart from the variable definitions):

void extract_file()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");    
    boost::spirit::istream_iterator it_begin(f), it_end;

    // extract all numbers into a vector
    std::vector<int> vi;
    parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);

    // print them to verify
    std::copy(vi.begin(), vi.end(), 
        std::ostream_iterator<int>(std::cout, ", " ));

}

This gets all the numbers into a vector at once with a single line; it could hardly be simpler.

If you don't mind using Boost.Spirit2, the parser that gets only the number from a line is

omit[+(alpha|blank)] >> int_

To extract everything (the word and the number), it is

+(alpha|blank) >> int_

See the whole program below (tested with VC10 Beta 2):

#include <boost/spirit/include/qi.hpp>  
#include <iostream>  
#include <string>  
#include <cstring> 
#include <vector>  

#include <fstream>
#include <algorithm>
#include <iterator>

using std::cout; 

using namespace boost::spirit;  
using namespace boost::spirit::qi;    

void extract_everything(std::string& line) 
{
    std::string::iterator it_begin = line.begin();
    std::string::iterator it_end   = line.end();    

    std::string s;
    int i;

    parse(it_begin, it_end, +(alpha|blank)>>int_, s, i);

    cout << "string " << s  
         << "followed by nubmer " << i 
         << std::endl;

}

void extract_number(std::string& line) 
{
    std::string::iterator it_begin = line.begin();
    std::string::iterator it_end   = line.end();    

    int i;

    parse(it_begin, it_end, omit[+(alpha|blank)] >> int_, i);

    cout << "number only: " << i << std::endl;

} 

void extract_line()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
    std::string s;
    int i; 

    // iterated file line by line
    while(getline(f, s))
    {
        cout << "parsing " << s << " yields:\n";
        extract_number(s);  // 
        extract_everything(s);
    }

}

void extract_file()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");    
    boost::spirit::istream_iterator it_begin(f), it_end;

    // extract all numbers into a vector
    std::vector<int> vi;
    parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);

    // print them to verify
    std::copy(vi.begin(), vi.end(), 
        std::ostream_iterator<int>(std::cout, ", " ));

}

int main(int argc, char * argv[])  
{    
    extract_line();
    extract_file();

    return 0;  
}

Output:

parsing ALS 46000 yields:
number only: 46000
string ALS followed by nubmer 46000
parsing BZK 39850 yields:
number only: 39850
string BZK followed by nubmer 39850
parsing CAR 38000 yields:
number only: 38000
string CAR followed by nubmer 38000
46000, 39850, 38000,

Answer 3

You can call ignore to skip over a specified number of characters.

istr.ignore(4);

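You can also tell it to stop at a delimiter. You still need to know the maximum number of characters the leading string could have, but this also works for shorter leading strings: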

istr.ignore(10, ' ');

You can also write a loop that just reads characters until it sees the first digit:

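char c;
// get(c) extracts one character at a time, including whitespace.
while (istr.get(c) && !isdigit(static_cast<unsigned char>(c)))
{
    // do nothing
}
// If a digit was found, put it back so the next extraction sees the whole number.
if (istr && isdigit(static_cast<unsigned char>(c)))
    istr.putback(c);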
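
The delimiter form of ignore can be combined with an ordinary extraction. The following is a minimal sketch, assuming each record is one word, a single space, then the number; the helper name next_number is hypothetical. Passing std::numeric_limits<std::streamsize>::max() as the count removes the need to know the longest possible leading string.

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: skips the leading word, then reads the integer after it.
// Returns false once no further number can be read.
bool next_number(std::istream& istr, int& value)
{
    // Drop leading whitespace first (e.g. the newline left over from the
    // previous record), so that ignore() starts at the word itself.
    istr >> std::ws;
    // Discard everything up to and including the first space.
    istr.ignore(std::numeric_limits<std::streamsize>::max(), ' ');
    return static_cast<bool>(istr >> value);
}
```

Calling next_number in a loop until it returns false should yield one score per record. If the separator could be a tab rather than a space, this sketch would need a different delimiter.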

Answer 4

Here you go :P

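import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Scanner;

// Placeholder class name (assumption): the methods need an enclosing class to compile.
public class ReadFile {

    private static void readFile(String fileName) {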

        try {
            HashMap<String, Integer> map = new HashMap<String, Integer>();
            File file = new File(fileName);

            Scanner scanner = new Scanner(file).useDelimiter(";");
            while (scanner.hasNext()) {
                String token = scanner.next();
                String[] split = token.split(":");
                if (split.length == 2) {
                    Integer count = map.get(split[0]);
                    map.put(split[0], count == null ? 1 : count + 1);
                    System.out.println(split[0] + ":" + split[1]);
                } else {
                    split = token.split("=");
                    if (split.length == 2) {
                        Integer count = map.get(split[0]);
                        map.put(split[0], count == null ? 1 : count + 1);
                        System.out.println(split[0] + ":" + split[1]);
                    }
                }
            }
            scanner.close();
            System.out.println("Counts:" + map);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        readFile("test.txt");
    }
}

Answer 5

fscanf(file, "%*s %d", &num);

Or %05d, if you have leading zeros and a fixed width of 5...

Sometimes the fastest way to do things in C++ is to use C. :)
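
To read every record rather than a single one, the fscanf call can be placed in a loop. This is a sketch under the assumption that the file holds whitespace-separated "WORD NUMBER" pairs; the helper name read_scores_c is hypothetical.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper: collects the integer from every "WORD NUMBER" pair.
std::vector<int> read_scores_c(std::FILE* file)
{
    std::vector<int> scores;
    int num;

    // %*s reads a word and discards it. Suppressed fields are not counted in
    // the return value, so a fully matched pair makes fscanf return 1.
    while (std::fscanf(file, "%*s %d", &num) == 1)
        scores.push_back(num);

    return scores;
}
```

The loop stops at end of file, or at the first record whose second field is not an integer.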

Answer 6

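You can create a ctype facet that classifies letters as whitespace. Create a locale that uses this facet, then imbue the stream with that locale. With this in place, you can extract numbers from the stream, but all letters will be treated as whitespace (i.e. when you extract numbers, the letters are skipped just as a space or a tab would be).

Such a locale could look like this: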

#include <iostream>
#include <iterator>
#include <locale>
#include <vector>
#include <algorithm>

struct digits_only: std::ctype<char> 
{
    digits_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::space);

        if (rc['0'] == std::ctype_base::space)
            // Clear the whitespace classification for all ten digits, '0' through '9'.
            std::fill_n(&rc['0'], 10, std::ctype_base::mask());
        return &rc[0];
    }
};

Sample code that uses it could look like this:

int main() {
    std::cin.imbue(std::locale(std::locale(), new digits_only()));

    std::copy(std::istream_iterator<int>(std::cin), 
        std::istream_iterator<int>(),
        std::ostream_iterator<int>(std::cout, "\n"));
}

With your sample data, the output I get from this looks like:

46000
39850
38000

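Note that as it stands, I have written this to accept only digits. If (for example) you were reading floating-point numbers, you would also want to keep '.' (or the locale-specific equivalent) as the decimal point. One way to handle this is to start from a copy of the normal ctype table, and then set only the things you want to ignore to space.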
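
That last suggestion might be sketched as follows. The names letters_as_space and extract_numbers are hypothetical, and the sketch assumes that only characters classified as alphabetic in the classic "C" table should be ignored; digits, signs and '.' keep their usual classification.

```cpp
#include <cstddef>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Facet that starts from the classic "C" table and reclassifies only letters
// as whitespace.
struct letters_as_space : std::ctype<char>
{
    letters_as_space() : std::ctype<char>(get_table()) {}

    static const mask* get_table()
    {
        // Built once on first use and kept alive for the whole program.
        static std::vector<mask> table = make_table();
        return &table[0];
    }

    static std::vector<mask> make_table()
    {
        // Copy the classic table, then turn every alphabetic entry into space.
        std::vector<mask> t(classic_table(), classic_table() + table_size);
        for (std::size_t i = 0; i < t.size(); ++i)
            if (t[i] & alpha)
                t[i] = space;
        return t;
    }
};

// Hypothetical helper: extracts every floating-point number from the text.
std::vector<double> extract_numbers(const std::string& text)
{
    std::istringstream in(text);
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(std::locale::classic(), new letters_as_space));
    return std::vector<double>(std::istream_iterator<double>(in),
                               std::istream_iterator<double>());
}
```

Because the table starts out as a copy rather than as all-space, anything that is not a letter still stops an extraction as it normally would, which may or may not be what a given input format needs.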

6 answers

Answer 1: score 9, accepted, 2010-01-18 06:54:01
Answer 2: score 4, 2010-01-18 09:36:36
Answer 3: score 1, 2010-01-18 06:37:54
Answer 4: score 0, 2010-01-18 06:52:21
Answer 5: score 0
Answer 6: score 0, 2010-01-18 07:00:32
