
Using my Python Web Crawler in my site

I created a web crawler in Python 3.7 that pulls different pieces of information and stores them in four arrays. I have now come across an issue that I am not sure how to fix: I want to use the data from those four arrays in my site and place it into a table built with JS and HTML/CSS. How do I go about accessing the data from my Python file in my JavaScript file? I searched elsewhere before creating an account and came across some posts that talk about using JSON, but I am not too familiar with it and would appreciate some help if that is the way to do it. I will post my code below; it is stored in the same directory as my site's other files. Thanks in advance!

from requests import get
from bs4 import BeautifulSoup
from flask import Flask
app = Flask(__name__)


@app.route("/")
def main():
    # lists to store data
    names = []
    gp = []
    collectionScore = []
    arenaRank = []

    url = 'https://swgoh.gg/g/21284/gid-1-800-druidia/'
    response = get(url)

    soup = BeautifulSoup(response.content, 'html.parser')

    # username of the guild members:
    # skip empty cells and the table-header labels
    skip = {'', 'Note', 'GP', 'CS'}
    for user in soup.find_all('strong'):
        # .text is already a str in Python 3, so no .encode() is needed
        name = user.text.strip()
        if name in skip:
            continue
        # list this one member under the name 'Deniz'
        names.append('Deniz' if name == '邓海' else name)

    print(names)

    # The 'text-center' cells come in groups of four per member:
    # GP, collection score, arena rank, and a fourth column this script does not use.
    for cell in soup.find_all('td', class_='text-center'):
        gp.append(cell.text.strip())

    # GP of the guild members (every 4th cell, starting at index 0):
    finGP = gp[0::4]
    print(finGP)

    # CS of the guild members (every 4th cell, starting at index 1):
    collectionScore = gp[1::4]
    print(collectionScore)

    # Arena rank of guild member (every 4th cell, starting at index 2):
    arenaRank = gp[2::4]
    print(arenaRank)

if __name__ == "__main__":
    app.run()

TLDR: I want to use the four lists (finGP, names, collectionScore, and arenaRank) in a JavaScript or HTML file. How do I go about doing this?

Ok, this will be somewhat long but I'm going to try breaking it down into simple steps. The goal of this answer is to:

  1. Get a basic webpage generated and served from Python.
  2. Insert the results of your script into the page as JavaScript.
  3. Do some basic rendering with the data.

What this answer is not:

  1. An in-depth JavaScript and Python tutorial. We don't want to overload you with too many concepts at one time. You should eventually learn about databases and caching, but that's further down the road.

Ok, here's what I want you to do first. Read and implement this tutorial up until the "Creating a Signup Page" section. That section starts to deal with MySQL, which isn't something you need to worry about right now.

Next, you need to execute your scraping script whenever a request comes in to the server. When you get the results back, output them into the HTML page template inside a script tag that looks like:

<script>
  const data = [];
  console.log(data);
</script>

Replace the empty [] in data = [] with the output of json.dumps ( https://docs.python.org/2/library/json.html ), which formats your Python list data as JSON. JSON is essentially a subset of JavaScript literal syntax, so you can write the string straight into the script tag and the browser will load it as a JavaScript value.
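
As a rough illustration of that step, here is a minimal sketch using only the standard library. The helper name build_script_tag, the key names in the payload, and the sample values are all made up for this example; your scraper would supply the real lists. The extra replace call is a precaution I'm assuming you'd want, so that a scraped value containing "</script>" cannot close the tag early.

```python
import json


def build_script_tag(names, fin_gp, collection_score, arena_rank):
    """Return a <script> element that defines `data` from four Python lists.

    Hypothetical helper for illustration; the key names are arbitrary.
    """
    payload = {
        "names": names,
        "gp": fin_gp,
        "collectionScore": collection_score,
        "arenaRank": arena_rank,
    }
    # json.dumps turns the dict of lists into a JSON string, which the
    # browser can read directly as a JavaScript object literal.
    literal = json.dumps(payload)
    # Escape "</" so a value such as "</script>" cannot end the tag early.
    literal = literal.replace("</", "<\\/")
    return (
        "<script>\n"
        "  const data = " + literal + ";\n"
        "  console.log(data);\n"
        "</script>"
    )


# Made-up sample values standing in for the scraped lists.
print(build_script_tag(["Deniz", "Ana"], ["1,200,000", "950,000"],
                       ["310", "275"], ["12", "48"]))
```

Bundling the four lists into one object means the page only has to deal with a single data variable.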

The console.log statement in the script tag will show the data in the dev tools in your browser.
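
Putting the pieces together in Flask could look roughly like the sketch below. It is one possible arrangement, not the only one. scrape_guild is a hypothetical stand-in that returns made-up values where your real scraping code would go, and the PAGE template is invented for the example. It relies on Flask's built-in tojson template filter, which serializes the value as JSON that is safe to place inside a script tag, instead of calling json.dumps by hand.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Minimal page template (made up for this sketch). The tojson filter
# converts the Python object to JSON that is safe inside <script>.
PAGE = """<!doctype html>
<title>Guild</title>
<script>
  const data = {{ data|tojson }};
  console.log(data);
</script>
"""


def scrape_guild():
    """Hypothetical stand-in for the scraper; returns made-up sample data."""
    return {
        "names": ["Deniz", "Ana"],
        "gp": ["1,200,000", "950,000"],
        "collectionScore": ["310", "275"],
        "arenaRank": ["12", "48"],
    }


@app.route("/")
def main():
    # Run the scraper on each request and hand the result to the template.
    return render_template_string(PAGE, data=scrape_guild())


# Exercise the route without starting a server, just to inspect the HTML.
with app.test_client() as client:
    print(client.get("/").get_data(as_text=True))
```

To serve the page for real, drop the test-client lines and keep the app.run() block from your original script. Note that the view now returns the rendered page, which Flask requires of every route.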

For now, let's pause here. Get all of this working first (probably a few hours to a day's work). Rendering HTML with JavaScript is a different topic and, again, I don't want to overload you with too much information right now.

Leave comments on this answer if you need extra help.


 