[英]Python: Expanding Complicated Tree DataStructure
我正在探索一个数据结构,该数据结构将扩展为子元素并解析为最终元素。 但是我只想存储前两个级别。
例子:假设我从纽约开始,该郡分为布朗克斯,金斯,纽约,皇后区和里士满这两个县,但最终还是以某种方式定居美国。
我不确定这是否是一个很好的例子,但只是在这里明确说明是对该问题的更清楚的解释。
A (expands to) B,C,D -> B (expands to) K,L,M -> K resolves to Z
我最初是在一系列for循环中编写它,然后使用递归,但在递归中,我失去了一些可扩展的元素,因此,我没有深入研究每个扩展元素。 我把递归版本和非递归版本都放了。 我正在寻找有关构建此数据结构的建议,以及最佳方法是什么。
我为扩展版本中的每个元素调用一个数据库查询,该查询返回一个项目列表。 直到解析为单个元素。 在没有递归的情况下,我不会一直放松直到其他人解决的最后一个要素。 但是,与递归不同。 我也是python的新手,所以希望在这样的网站上问这个问题不是一个坏问题。
returnCategoryQuery
是一种通过调用数据库查询返回项目列表的方法。
#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];
# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);
for key, value in idTitleDictionary.iteritems():
for startCategory in value[0]:
#print startCategory + "End of Query";
categoryResults = [];
try:
categoryRow = "";
baseCategoryTree[startCategory] = [];
print categoryQuery + startCategory + "'";
cursor.execute(categoryQuery + startCategory + "'");
done = False;
while not done:
categoryRow = cursor.fetchone();
if not categoryRow:
done = True;
continue;
categoryResults.append(categoryRow['cl_to']);
for subCategoryResult in categoryResults:
print startCategory.encode('ascii') + " - " + subCategoryResult;
for item in returnCategoryQuery(categoryQuery + subCategoryResult + "'"):
print startCategory.encode('ascii') + " - " + subCategoryResult + " - " + item;
for subItem in returnCategoryQuery(categoryQuery + item + "'"):
print startCategory.encode('ascii') + " - " + subCategoryResult + " - " + item + " - " + subItem;
for subOfSubItem in returnCategoryQuery(categoryQuery + subItem + "'"):
print startCategory.encode('ascii') + " - " + subCategoryResult + " - " + item + " - " + subItem + " - " + subOfSubItem;
for sub_1_subOfSubItem in returnCategoryQuery(categoryQuery + subOfSubItem + "'"):
print startCategory.encode('ascii') + " - " + subCategoryResult + " - " + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem;
for sub_2_subOfSubItem in returnCategoryQuery(categoryQuery + sub_1_subOfSubItem + "'"):
print startCategory.encode('ascii') + " - " + subCategoryResult + " - " + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem + " - " + sub_2_subOfSubItem;
except Exception, e:
traceback.print_exc();
def crawlSubCategory(subCategoryList):
level = 1;
expandedList = [];
for eachCategory in subCategoryList:
level = level + 1
print "Level " + str(level) + " " + eachCategory;
#crawlSubCategory(returnCategoryQuery(categoryQuery + eachCategory + "'"));
for subOfEachCategory in returnCategoryQuery(categoryQuery + eachCategory + "'"):
level = level + 1
print "Level " + str(level) + " " + subOfEachCategory;
expandedList.append(crawlSubCategory(returnCategoryQuery(categoryQuery + subOfEachCategory + "'")));
return expandedList;
#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];
# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);
for key, value in idTitleDictionary.iteritems():
for startCategory in value[0]:
#print startCategory + "End of Query";
categoryResults = [];
try:
categoryRow = "";
baseCategoryTree[startCategory] = [];
print categoryQuery + startCategory + "'";
cursor.execute(categoryQuery + startCategory + "'");
done = False;
while not done:
categoryRow = cursor.fetchone();
if not categoryRow:
done = True;
continue;
categoryResults.append(categoryRow['cl_to']);
#crawlSubCategory(categoryResults);
except Exception, e:
traceback.print_exc();
#baseCategoryTree[startCategory].append(categoryResults);
baseCategoryTree[startCategory].append(crawlSubCategory(categoryResults));
您是否要查找“女王”并得知它在美国? 您是否尝试过用XML对树进行编码,并使用lxml.etree
查找元素,然后使用getpath
以XPath格式返回路径?
这意味着在树上添加第四个顶层,即World,然后您将搜索Queens并了解到Queens的路径是World/USA/NewYork/Queens
。 问题的答案始终是XPath
的第二项。
当然,您总是可以仅从XML构建树并使用树搜索算法。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.