I want to set max_chapter and run my functions until the chapter number reaches max_chapter. Then the book number should increase by 1, and the same functions should run again until the chapter number reaches max_chapter.
For example, after Book 1, Chapters 1~20, the book becomes Book 2 and the functions run for Book 2, Chapters 1~20, and so on.
Here is the part of my code that my question is about:
import requests
from bs4 import BeautifulSoup
import operator

def start(max_book):
    word_list = []
    book = 1
    chapter = 1
    while book <= max_book:
        url = ('http://www.holybible.or.kr/B_NIV/cgi/bibleftxt.php?VR=NIV&VL='
               + str(book) + '&CN=' + str(chapter) + '&CV=99')
        while chapter <= 1:
            source_code = requests.get(url).text
            soup = BeautifulSoup(source_code, "html.parser")
            for bible_text in soup.findAll('font', {'class': 'tk4l'}):
                content = bible_text.get_text()
                words = content.lower().split()
                for each_word in words:
                    word_list.append(each_word)
            chapter += 1
        else:
            book += 1
    print(word_list)

start(1)
If I understand correctly, you need to read the first 20 chapters of each book.
import requests
from bs4 import BeautifulSoup

def Readchapters(max_books, max_chapters):
    book = 1
    chapter = 1
    while book <= max_books:
        while chapter <= max_chapters:
            print("reading book :", book, "Chapter :", chapter)
            url = 'http://www.holybible.or.kr/B_NIV/cgi/bibleftxt.php?VR=NIV&VL={}&CN={}&CV=99'.format(book, chapter)
            source_code = requests.get(url).text
            soup = BeautifulSoup(source_code, "html.parser")
            '''
            #do your scraping here
            ................................
            ................................
            '''
            chapter += 1  # move to the next chapter
        book += 1  # move to the next book
        chapter = 1  # reset the chapter for the new book

Readchapters(2, 20)
Output:
reading book : 1 Chapter : 1
reading book : 1 Chapter : 2
reading book : 1 Chapter : 3
reading book : 1 Chapter : 4
reading book : 1 Chapter : 5
reading book : 1 Chapter : 6
reading book : 1 Chapter : 7
reading book : 1 Chapter : 8
reading book : 1 Chapter : 9
reading book : 1 Chapter : 10
reading book : 1 Chapter : 11
reading book : 1 Chapter : 12
reading book : 1 Chapter : 13
reading book : 1 Chapter : 14
reading book : 1 Chapter : 15
reading book : 1 Chapter : 16
reading book : 1 Chapter : 17
reading book : 1 Chapter : 18
reading book : 1 Chapter : 19
reading book : 1 Chapter : 20
reading book : 2 Chapter : 1
reading book : 2 Chapter : 2
reading book : 2 Chapter : 3
reading book : 2 Chapter : 4
reading book : 2 Chapter : 5
reading book : 2 Chapter : 6
reading book : 2 Chapter : 7
reading book : 2 Chapter : 8
reading book : 2 Chapter : 9
reading book : 2 Chapter : 10
reading book : 2 Chapter : 11
reading book : 2 Chapter : 12
reading book : 2 Chapter : 13
reading book : 2 Chapter : 14
reading book : 2 Chapter : 15
reading book : 2 Chapter : 16
reading book : 2 Chapter : 17
reading book : 2 Chapter : 18
reading book : 2 Chapter : 19
reading book : 2 Chapter : 20
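The same traversal can also be written with nested for loops over range(), which restarts the chapter count for each book without a manual reset. This is only a minimal sketch, assuming the URL pattern from the question; the names BASE_URL and chapter_urls are made up for illustration, and no request is actually sent:

```python
# Assumed URL pattern, copied from the question; VL is the book, CN the chapter.
BASE_URL = ('http://www.holybible.or.kr/B_NIV/cgi/bibleftxt.php'
            '?VR=NIV&VL={}&CN={}&CV=99')


def chapter_urls(max_books, max_chapters):
    """Yield (book, chapter, url) for every chapter of every book, in order."""
    for book in range(1, max_books + 1):
        # range() starts again at 1 for each book, so no reset is needed
        for chapter in range(1, max_chapters + 1):
            yield book, chapter, BASE_URL.format(book, chapter)


# Print the URLs for the first 3 chapters of the first 2 books.
for book, chapter, url in chapter_urls(2, 3):
    print(book, chapter, url)
```

Each yielded URL could then be passed to requests.get() and BeautifulSoup exactly as in the loop body above.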
So I think both str(chapter) and str(book) should increase in order, right?
You just need to build the url inside the inner while loop, so that it is updated with each new chapter number.
while book <= max_book:
    while chapter <= 1:
        url = 'http://www.holybible.or.kr/B_NIV/cgi/bibleftxt.php?VR=NIV&VL={}&CN={}&CV=99'.format(book, chapter)
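To see why the position of that line matters, here is a small sketch that records which URLs each version would request, without touching the network. The function names and URL_TEMPLATE are hypothetical, and max_chapter stands for the limit the question wants to set:

```python
# Assumed URL pattern, copied from the question.
URL_TEMPLATE = ('http://www.holybible.or.kr/B_NIV/cgi/bibleftxt.php'
                '?VR=NIV&VL={}&CN={}&CV=99')


def urls_built_outside(book, max_chapter):
    """URL built once before the inner loop, as in the question."""
    chapter = 1
    url = URL_TEMPLATE.format(book, chapter)  # never refreshed afterwards
    requested = []
    while chapter <= max_chapter:
        requested.append(url)  # chapter grows, but url still says CN=1
        chapter += 1
    return requested


def urls_built_inside(book, max_chapter):
    """URL rebuilt on every pass through the inner loop."""
    chapter = 1
    requested = []
    while chapter <= max_chapter:
        url = URL_TEMPLATE.format(book, chapter)  # uses the current chapter
        requested.append(url)
        chapter += 1
    return requested


print(len(set(urls_built_outside(1, 3))))  # 1 distinct URL
print(len(set(urls_built_inside(1, 3))))   # 3 distinct URLs
```

Note that chapter also has to be set back to 1 when moving on to the next book; otherwise the inner loop is skipped for every book after the first.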