I have written some code that loads some files containing a list of words (one word pr line). each word is added to a multiset. later I try to search the multiset with multiset.find("aWord"). where I look for the word and substrings of the word in the multiset.
This code works fine if I compile it with qt on a windows system.
But don't work if i compile it in qt on my mac !
my goal is to make it work from qt on my mac.
I am woking on macbook Air (13" early 2018) with a
macOS Majave version 10.14.4 instalation
Buil version 18E226
local 18.5.0 Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT
2019; root:xnu-4903.251.3~3/RELEASE_X86_64 x86_64
Using a qt installation: QTKit:
Version: 7.7.3
Obtained from: Apple
Last Modified: 13/04/2019 12.11
Kind: Intel
64-Bit (Intel): Yes
Get Info String: QTKit 7.7.3, Copyright 2003-2012, Apple Inc.
Location: /System/Library/Frameworks/QTKit.framework
Private: No
And xcode installation:
Xcode 10.2
Build version 10E125
I have tried to print out: every strings that i am searching for and every string i should find in the multiset as hex format and concluded that some of the letters do not match. in there hex value. despite i think my whole system run utf-8 and the file also is utf-8 encoded.
Dictionary.h
#ifndef DICTIONARY_H
#define DICTIONARY_H
#include <iostream>
#include <vector>
#include <set>
class Dictionary
{
public:
Dictionary();
void SearchForAllPossibleWordsIn(std::string searchString);
private:
std::multiset<std::string, std::less<std::string>> mDictionary;
void Initialize(std::string folder);
void InitializeLanguage(std::string folder, std::string languageFileName);
};
#endif // DICTIONARY_H
Dictionary.cpp
#include "Dictionary.h"
#include <vector>
#include <set>
#include <iostream>
#include <fstream>
#include <exception>
Dictionary::Dictionary()
{
Initialize("../Lektion10Projekt15-1/");
}
void Dictionary::Initialize(std::string folder)
{
InitializeLanguage(folder,"da-utf8.wl");
}
void Dictionary::InitializeLanguage(std::string folder, std::string languageFileName)
{
std::ifstream ifs;
ifs.open(folder+languageFileName,std::ios_base::in);
if (ifs.fail()) {
std::cerr <<"Error! Class: Dictionary. Function: InitializeLanguage(...). return: ifs.fail to load file '" + languageFileName + "'" << std::endl;
}else {
std::string word;
while (!ifs.eof()) {
std::getline(ifs,word);
mDictionary.insert(word);
}
}
ifs.close();
}
void Dictionary::SearchForAllPossibleWordsIn(std::string searchString)
{
std::vector<std::string> result;
for (unsigned int a = 0 ; a <= searchString.length(); ++a) {
for (unsigned int b = 1; b <= searchString.length()-a; ++b) {
std::string substring = searchString.substr(a,b);
if (mDictionary.find(substring) != mDictionary.end())
{
result.push_back(substring);
}
}
}
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i) {
std::cout << result[i] << std::endl;
}
}
}
main.cpp
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
I have tried to change the following line in main.cpp
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
to (OBS: the first word in the word list is byggearbejderen)
std::ifstream ifs;
ifs.open("../Lektion10Projekt15-1/da-utf8.wl",std::ios::in);
if (ifs.fail()) {
std::cerr <<"Error!" << std::endl;
}else {
std::getline(ifs,searchword);
}
ifs.close();
myDictionary.SearchForAllPossibleWordsIn(searchword);
And then in the main.cpp add som print out with the expected string and substring in hex value.
std::cout << " cout as hex test:" << std::endl;
myDictionary.SearchForAllPossibleWordsIn(searchword);
std::cout << "Suposet search resul for ''bygearbejderen''" << std::endl;
for (char const elt: "byggearbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "byggearbejderen" << std::endl;
for (char const elt: "arbejderen")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "arbejderen" << std::endl;
for (char const elt: "ren")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "ren" << std::endl;
for (char const elt: "en")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "en" << std::endl;
for (char const elt: "n")
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
std::cout << "n" << std::endl;
And also added the same print in the outprint of result in Dictonary.cpp
std::cout << "result of seartchword as hex" << std::endl;
if (!result.empty()) {
for (unsigned int i = 0; i < result.size() ;++i)
{
for (char const elt: result[i] )
{
std::cout << std::hex << std::setw(2) << std::setfill('0') << static_cast<int>(elt) << " ";
}
std::cout << result[i] << std::endl;
}
}
which gave the following output:
result of seartchword as hex
ffffffef ffffffbb ffffffbf 62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 0d byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 0d arbejderen
72 65 6e 0d ren
65 6e 0d en
6e 0d n
Suposet search resul for ''bygearbejderen''
62 79 67 67 65 61 72 62 65 6a 64 65 72 65 6e 00 byggearbejderen
61 72 62 65 6a 64 65 72 65 6e 00 arbejderen
72 65 6e 00 ren
65 6e 00 en
6e 00 n
where I notice that some values were different. I don't know why this is the case when i am on a macOS but not the case on windows. I do not know if there are any settings of encoding in my environment I need to change or set correct.
I would like i my main.cpp looked liked this:
#include <iostream>
#include "Dictionary.h"
int main()
{
Dictionary myDictionary;
myDictionary.SearchForAllPossibleWordsIn("byggearbejderen");
return 0;
}
resulting in the following output:
byggearbejderen
arbejderen
ren
en
n
Line endings for text files are different on Windows than they are on a Mac. Windows uses both CR/LF characters (ASCII codes 13 and 10, respectively). Old Macs used the CR character alone, Linux systems use just the LF. If you create a text file on Windows, then copy it to your Mac, the line endings might not be handled correctly.
If you look at the last character in your output, you'll see it is a 0d
, which would be the CR character. I don't know how you generated that output, but it is possible that the getline
on the Mac is treating that as a normal character, and including it in the string that has been read in.
The simplest solution is to either process that text file beforehand to get the line endings correct, or strip the CR off the end of the words after they are read in.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.