How can I get all possibilities using a Markov chain?
I am trying to implement a linguistic passphrase cracker using Markov chains.
The idea is to choose n-grams from a text, select a starting n-gram (usually a word at the beginning of a sentence), and represent it as a state using its first n-1 characters. For example, for "the" the state is "th". Each state has a list of letters with their occurrence counts, stored in a dictionary:
dict["th"] = [('e', 120), ('a', 79)]
and so on. For each of these values I try to build a Markov chain that matches either my password or its length. That is, when the chain reaches the same length as the password I am trying to find, I stop and check whether the chain equals the password. I am trying to implement this with a recursive function, but for some reason I am getting a stack overflow.
import copy

def ceva(myTry, good_all, pwd, guess, level):
    save = myTry
    if len(pwd) == len(guess):
        if pwd == guess:
            return 1
    else:
        if myTry in good_all.keys():
            values = good_all[myTry]
            for i in range(0,len(values)):
                #print(i, len(values))
                letter = values[i][0]
                #print("First",myTry, letter)
                pwd += letter
                if i != len(values)-1:
                    if len(pwd) == len(guess):
                        #print("In if", pwd, myTry)
                        if pwd == guess:
                            print("I found:", pwd)
                            return 1
                        else:
                            pwd = pwd[0:len(pwd)-1]
                    else:
                        myTry += letter
                        myTry = myTry[1:]
                        #print("In else: ",pwd, myTry)
                        return ceva(myTry, good_all, pwd, guess, level)
                else:
                    if len(pwd) == len(guess):
                        #print("In if", pwd, myTry)
                        if pwd == guess:
                            print("I found:", pwd)
                            return 1
                    pwd = pwd[0:len(pwd)-1]

for key, letterList in starter_follows.items():
    myTry = key.replace("_", "")
    # I will not treat the case when the starting phrase
    # is a single character
    if myTry == "i":
        pass
    else:
        for letter in letterList:
            if letter[0] not in "_.-\"!":
                myTry += letter[0]
                pwd = copy.copy(myTry)
                #print("Starter:", pwd)
                res=ceva(myTry, good_all, pwd, toGuess, 1)
                myTry = myTry[0:len(myTry)-1]
With this algorithm I am reaching the maximum recursion depth, but I am trying to obtain all the Markov chains until the passphrase is found.
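For readers who want to reproduce the setup, the transition dictionary described above (a state mapped to next letters with their counts) could be built roughly as follows. This is only a sketch under assumptions: `build_transitions` and its `n` parameter are hypothetical names, and it splits the text on whitespace, which is simpler than whatever preprocessing produced `good_all` and `starter_follows` in the question.

```python
from collections import Counter, defaultdict


def build_transitions(text, n=3):
    """Map each (n-1)-character state to [(next_letter, count), ...]."""
    counts = defaultdict(Counter)
    for word in text.lower().split():
        # Slide an n-character window over the word: the first n-1
        # characters are the state, the last character is what follows it.
        for i in range(len(word) - n + 1):
            state = word[i:i + n - 1]
            next_letter = word[i + n - 1]
            counts[state][next_letter] += 1
    # most_common() sorts by descending count, which gives the same
    # list-of-tuples layout as dict["th"] = [('e', 120), ('a', 79)].
    return {state: c.most_common() for state, c in counts.items()}


table = build_transitions("the then than the that")
print(table["th"])
```

With this toy text, "th" is followed by "e" three times and by "a" twice, so the most frequent letter comes first in the list.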
EDIT 1: Now, with the updated code, the password is found, but only because I am looping through all the possible last letters.
E.g. "indeed": "ind" is already in my list of starters, and all the trigrams I am finding have "e" as their most common next letter. So "e" is added, then the next "e", then another "e", and now the password is "indeee". But I slice off the last letter and go through the for loop again, and it ultimately finds "indeed", which is okay. The problem is that if I give "indedd", it will not find my password, because the second "d" is never looped through. How can I go back in my iteration and loop through all possible letters at all levels?
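One way to loop through all letters at all levels is plain backtracking: recurse inside the loop without returning early, and pass new strings to the recursive call so the caller's values are untouched when it moves on to the next letter. The sketch below is an illustration, not the code from the question: `chains`, `crack`, the `order` parameter, and the toy `table` are hypothetical, and states are assumed to be the last 3 characters.

```python
def chains(state, prefix, table, length):
    """Yield every string of `length` characters that extends `prefix`."""
    if len(prefix) >= length:
        if len(prefix) == length:
            yield prefix
        return  # never grow past the target length
    # Try every candidate letter at this level, not only the first one.
    for letter, _count in table.get(state, []):
        # New strings are created for the call, so `state` and `prefix`
        # are unchanged when the loop continues with the next letter.
        yield from chains((state + letter)[1:], prefix + letter, table, length)


def crack(starter, table, guess, order=3):
    """Return True if `guess` can be generated starting from `starter`."""
    state = starter[-order:]
    return any(c == guess for c in chains(state, starter, table, len(guess)))


# Toy transition table (made-up counts), most frequent letter first.
table = {
    "ind": [("e", 5), ("i", 1)],
    "nde": [("e", 4), ("d", 2)],
    "dee": [("e", 3), ("d", 1)],
    "ded": [("d", 1)],
}
print(crack("ind", table, "indeed"))
print(crack("ind", table, "indedd"))
```

Because the loop at each depth runs to completion unless a match is found, the second "d" in "indedd" is reached after the "e" branch at that level is exhausted. The recursion depth is bounded by the password length minus the starter length.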
I managed to pull it off eventually thanks to the answers given. I am posting the working algorithm in the hope that it will help someone someday.
import copy

def ceva(myTry, good_all, pwd, guess, flag):
    if len(pwd) > len(guess):
        return 0
    if len(pwd) == len(guess):
        if pwd == guess:
            flag = 1
            return 1
        else:
            return 0
    save = copy.copy(myTry)
    #print("Start of function:", "[", pwd, ",", myTry, "]")
    if flag == 1:
        return 1
    else:
        if myTry in good_all.keys():
            # get the list of letters for this specific trigram
            values = good_all[myTry]
            if len(pwd) <= len(guess):
                for i in range(0,len(values)):
                    #myTry = copy.copy(save)
                    # get the letter
                    letter = values[i][0]
                    # add the letter to the password
                    pwd += letter
                    # if I found the password, set flag to 1 and return
                    if pwd == guess:
                        flag = 1
                        print("I found:", pwd)
                        return 1
                    # add the new letter to the trigram, then get rid of the
                    # first letter, in order to create the new trigram
                    myTry += letter
                    myTry = myTry[1:]
                    #print("Pwd before cutting: [", pwd, "]")
                    res = ceva(myTry, good_all, pwd, guess, flag)
                    #print(res)
                    if res == 0:
                        # dead end: drop the last letter and restore the trigram
                        pwd = pwd[0:len(pwd)-1]
                        myTry = pwd[-3:]
                        # print("This is after stop: [", pwd,",", myTry, "]")
                    else:
                        return 1
    if flag == 0:
        return 0
    else:
        return 1

flag = 0
for key, letterList in starter_follows.items():
    myTry = key.replace("_", "")
    # I will not treat the case when the starting phrase
    # is a single character
    if myTry == "i":
        pass
    else:
        #print("Aaaa")
        for letter in letterList:
            if letter[0] not in "_.-\"!":
                # add the letter to create a (n-1)-gram
                myTry += letter[0]
                # create a copy of the starting password
                pwd = copy.copy(myTry)
                # call the recursive function
                res=ceva(myTry, good_all, pwd, toGuess, flag)
                # go back to the beginning, to try a new start
                # e.g.: "the" first, "tha" second
                myTry = myTry[0:len(myTry)-1]
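If the recursion limit is still a concern for long passphrases, the same depth-first search can be written with an explicit stack. This is a hedged alternative sketch, not part of the posted solution: `crack_iterative`, the `order` parameter, and the toy `table` are hypothetical, and states are again assumed to be the last 3 characters of the candidate.

```python
def crack_iterative(starters, table, guess, order=3):
    """Depth-first search over candidate strings using an explicit stack."""
    # Starters longer than the target can never match, so skip them.
    stack = [s for s in starters if len(s) <= len(guess)]
    while stack:
        prefix = stack.pop()
        if len(prefix) == len(guess):
            if prefix == guess:
                return True
            continue  # full length but wrong: backtrack
        # Push in reverse so the most frequent letter is popped first.
        for letter, _count in reversed(table.get(prefix[-order:], [])):
            stack.append(prefix + letter)
    return False


# Toy transition table (made-up counts), most frequent letter first.
table = {
    "ind": [("e", 5), ("i", 1)],
    "nde": [("e", 4), ("d", 2)],
    "dee": [("e", 3), ("d", 1)],
    "ded": [("d", 1)],
}
print(crack_iterative(["ind"], table, "indedd"))
```

Each stack entry is a complete candidate string, so backtracking needs no slicing: a dead end is simply discarded when it is popped.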