I'm trying to extract data from a URL in Python and am getting this error when trying to find the most common data source.
This is my code:
import pandas as pd
# Read the data from the Wikipedia page into a Pandas DataFrame
url = "https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population"
df = pd.read_html(url, attrs={"class": "wikitable"})[0]
# Visualize the DataFrame
print(df)
# Print the number of records in the DataFrame
print(f"There are {len(df)} records in the DataFrame.")
# Find the most common data source
most_common_source = df["Source"].value_counts().index[0]
print(f"The most common data source is {most_common_source}.")
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:

8 frames
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Source'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
-> 3363                 raise KeyError(key) from err
   3364
   3365         if is_scalar(key) and isna(key) and not self.hasnans:

KeyError: 'Source'
The table you are trying to extract from does not have a column named Source. The header that holds the source is longer and contains a non-breaking space (\xa0), so the name has to match exactly. Maybe you meant something like this:
most_common_source = \
df["Source (official or from the\xa0United Nations)"].value_counts().index[0]
You can always print your columns with list(df.columns).
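If you would rather not hard-code the full header, one option is to normalize the column names and look the column up by prefix. The following is only a minimal sketch: the small DataFrame is a made-up stand-in for the scraped table (its rows and the "Location" and "Population" columns are assumptions, not the real Wikipedia data), and the real page's headers may change over time.

```python
import pandas as pd

# Hypothetical stand-in for the table returned by pd.read_html;
# the values below are invented for illustration only.
df = pd.DataFrame({
    "Location": ["A", "B", "C"],
    "Population": [1, 2, 3],
    "Source (official or from the\xa0United Nations)": [
        "Official estimate",
        "UN projection",
        "Official estimate",
    ],
})

# Replace non-breaking spaces so the headers can be typed normally.
df.columns = [str(c).replace("\xa0", " ").strip() for c in df.columns]

# Look the column up by prefix instead of spelling out the whole header.
source_cols = [c for c in df.columns if c.startswith("Source")]
if not source_cols:
    raise KeyError(f"No column starting with 'Source'; available: {list(df.columns)}")

# value_counts() sorts by frequency, so the first index entry is the most common value.
most_common_source = df[source_cols[0]].value_counts().index[0]
print(f"The most common data source is {most_common_source}.")
```

With the DataFrame from the question, you would skip the stand-in and apply the same three steps to the result of pd.read_html. The explicit KeyError lists the available columns, which makes it easier to see what changed if the page's header is renamed.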