Geopandas - plot chart with continent data

I am trying to plot per-continent data with GeoPandas.

My pivot table gives the following number of tickets logged per region:

    Number of Tickets
region
Africa            370
Americas         1130
Asia              873
Europe            671
Oceania           445

In my ticket list dataframe, I have the cases logged from each country. Each country is mapped to a region and a continent, so every ticket logged has a country, a region and a continent assigned.

To be able to plot the data, I merge the GeoPandas dataframe (country geometries) with my ticket dataframe on the 3-letter country codes and check that the resulting dataframe is a GeoDataFrame:

tickets_region = pd.merge(world, tickets, left_on='ISO_A3', right_on='code-3')

type(tickets_region)
geopandas.geodataframe.GeoDataFrame
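
If the ticket list has one row per ticket, the merge above repeats each country's geometry once per ticket. One way to avoid that is to count the tickets per country before merging. The following is only a minimal sketch under that assumption: the sample rows are made up, and only the column names 'code-3', 'continent' and 'Number of Tickets' come from this thread.

```python
import pandas as pd

# Hypothetical ticket list: one row per ticket (values are made up).
tickets = pd.DataFrame(
    {
        "code-3": ["DEU", "DEU", "FRA", "JPN"],
        "continent": ["Europe", "Europe", "Europe", "Asia"],
    }
)

# Count tickets per country so that a later merge with the country
# geometries yields one row (and one geometry) per country.
counts = (
    tickets.groupby(["code-3", "continent"])
    .size()
    .rename("Number of Tickets")
    .reset_index()
)

print(counts)
```

The resulting `counts` frame can then take the place of `tickets` in the `pd.merge` call.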

I try to plot the data with the following code:

fig, ax = plt.subplots()
ax = tickets_region.plot('continent', cmap='Reds',scheme='headtailbreaks')
ax.tick_params(left=False, labelleft=False, bottom=False, labelbottom=False)
plt.title('Number of Tickets by Continent')
plt.box(False)
plt.show()

However, this code block never finishes; it eats up memory and CPU cycles until I press Ctrl-C to interrupt it. The same code works with 'code-3' (the 3-letter country codes).

I assume this is because the 'continent' geography is not defined in the GeoJSON file, but I expected Python to fill it in by adding up the number of tickets. My expectation clearly rests on broken logic somewhere, but I cannot see where.

Any ideas on how I can make the continent plot work?

Thank you.

Edit: the "world" dataframe is the GeoJSON file downloaded from https://datahub.io/core/geo-countries

You can use the dissolve() method of the GeoPandas GeoDataFrame, which merges the geometries of all rows that share a value in the given column and aggregates the remaining columns. See the GeoPandas documentation for details. Your code can be modified like this:

tickets_region = tickets_region.dissolve(by='continent', aggfunc='sum')

fig, ax = plt.subplots()
# Draw on the axes created above instead of letting plot() open a second figure
tickets_region.plot(column='Number of Tickets', cmap='Reds', scheme='headtailbreaks', ax=ax)
ax.tick_params(left=False, labelleft=False, bottom=False, labelbottom=False)
plt.title('Number of Tickets by Continent')
plt.box(False)
plt.show()
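
To see what dissolve() does on its own, here is a minimal sketch on toy data. The rectangles and ticket counts are made up for illustration; only the column names 'continent' and 'Number of Tickets' are taken from the question. Selecting just the needed columns before dissolving is a choice made here so that aggfunc='sum' only has to handle the numeric column.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-in for the merged country-level frame (hypothetical values):
# one row per country, with its continent and a ticket count.
countries = gpd.GeoDataFrame(
    {
        "continent": ["Europe", "Europe", "Asia"],
        "Number of Tickets": [10, 5, 7],
    },
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 0, 6, 1)],
    crs="EPSG:4326",
)

# Keep only the grouping column, the numeric column and the geometry,
# so that aggfunc="sum" never has to add up text columns.
subset = countries[["continent", "Number of Tickets", "geometry"]]

# One row per continent: geometries are unioned, ticket counts are summed.
by_continent = subset.dissolve(by="continent", aggfunc="sum")

print(by_continent["Number of Tickets"])
```

The dissolved frame is indexed by continent, so it can be passed to plot(column='Number of Tickets', ...) in the same way as in the code above.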

I recently used this thread for an analysis of my own. My data lacked continent geometries to plot against, so I imported an existing dataset and merged the two together. Here is the import and dissolve code:

import geopandas as gpd

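# Natural Earth low-resolution countries, with the gdp_md_est column dropped.
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0.
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres")).drop(['gdp_md_est'], axis=1)
# Merge all country geometries of a continent into one shape per continent
world = world.dissolve(by='continent', aggfunc='sum')
# d (not shown here) holds the values to plot, indexed by continent name
world = world.merge(d, how='inner', left_on='continent', right_index=True)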

The Kaggle notebook is available at https://www.kaggle.com/code/pavfedotov/gtc-map
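
The merge step can be tried without downloading any data. The sketch below is an illustration under stated assumptions: the rectangles are made-up placeholders for dissolved continent shapes, and `d` is a hypothetical per-continent table that reuses two of the counts from the question's pivot table. It joins on the index of both frames, which is a slight variation of the left_on='continent' call above.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Placeholder continent shapes, indexed by continent name
# (stand-ins for the result of dissolve(by="continent")).
continents = gpd.GeoDataFrame(
    {"continent": ["Asia", "Europe", "Antarctica"]},
    geometry=[box(5, 0, 6, 1), box(0, 0, 2, 1), box(0, -5, 6, -4)],
    crs="EPSG:4326",
).set_index("continent")

# Hypothetical per-continent table standing in for `d`.
d = pd.DataFrame(
    {"Number of Tickets": [873, 671]},
    index=pd.Index(["Asia", "Europe"], name="region"),
)

# The inner join keeps only continents present in both frames;
# merging from the GeoDataFrame side keeps the geometry column.
merged = continents.merge(d, how="inner", left_index=True, right_index=True)

print(merged[["Number of Tickets"]])
```

With how='inner', continents that have no row in `d` (Antarctica here) drop out of the map; how='left' would keep them with missing values instead.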

