How to replace dataframe append with concat?

I have been doing an online course which includes the following snippet of code for web scraping. When I run it in the course's Jupyter notebook environment it doesn't come up with any errors.

But when I run it in my own environment, I get a warning telling me to use concat instead of append for the dataframe.

What do I need to do to modify this snippet below to use dataframe concat? I've looked up a few other examples of this problem and tried various ways to modify the code but I just can't seem to get it to work.

population_data = pd.DataFrame(columns=["Rank", "Country", "Population", "Area", "Density"])

for row in tables[table_index].tbody.find_all("tr"):
    col = row.find_all("td")
    if (col != []):
        rank = col[0].text
        country = col[1].text
        population = col[2].text.strip()
        area = col[3].text.strip()
        density = col[4].text.strip()
        population_data = population_data.append({"Rank":rank, "Country":country, "Population":population, "Area":area, "Density":density}, ignore_index=True)

population_data

The warning looks like this:

C:\Users\My Name\AppData\Local\Temp\ipykernel_22060\394869253.py:11: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
population_data = population_data.append({"Rank":rank, "Country":country, "Population":population, "Area":area, "Density":density}, ignore_index=True)

Here is my solution for replacing it. I've also tidied up the variable initialisation. Each scraped row becomes a one-row DataFrame that is concatenated onto the existing one, and ignore_index=True keeps the index sequential, as it did with append.

cols = ["Rank", "Country", "Population", "Area", "Density"]

population_data = pd.DataFrame(columns=cols)

for row in tables[table_index].tbody.find_all("tr"):
    col = row.find_all("td")
    if col:  # skip rows that have no <td> cells
        rank, country = col[0].text, col[1].text
        population, area, density = (c.text.strip() for c in col[2:5])
        # The nested list makes this a single row with five columns
        new_entry_df = pd.DataFrame([[rank, country, population, area, density]],
                                    columns=cols)
        population_data = pd.concat([population_data, new_entry_df], ignore_index=True)

population_data.tail(3)
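
Calling pd.concat inside the loop copies the whole frame on every iteration, which gets slow for large tables. A common alternative is to collect the rows in a plain Python list and build the DataFrame once at the end. The sketch below is only an illustration of that idea: scraped_rows is a hypothetical stand-in for the text of the td cells in each row, so that the example runs without the scraped page.

```python
import pandas as pd

cols = ["Rank", "Country", "Population", "Area", "Density"]

# Hypothetical stand-in for the <td> text of each scraped row;
# the empty list mimics a row that has no <td> cells.
scraped_rows = [
    [],
    ["1", "Country A", " 1,000 ", " 10 ", " 100 "],
    ["2", "Country B", " 500 ", " 25 ", " 20 "],
]

records = []
for cells in scraped_rows:
    if cells:  # skip rows without data cells
        rank, country = cells[0], cells[1]
        population, area, density = (c.strip() for c in cells[2:5])
        records.append({
            "Rank": rank,
            "Country": country,
            "Population": population,
            "Area": area,
            "Density": density,
        })

# Build the frame in one step instead of growing it row by row
population_data = pd.DataFrame(records, columns=cols)
print(population_data)
```

In the original loop, the same pattern would mean appending a dict to records where population_data.append was called, then constructing the DataFrame after the loop.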
