
How to produce a correct bar plot using Pandas and Matplotlib.pyplot, from a list of dictionaries

My problem is that I'm trying to create a bar plot, but it is not outputting correctly.

I have a list of dictionaries.

Each dictionary holds the data and attributes of a single tweet from Twitter, and the list contains thousands of them. The attributes are stored as key:value pairs, including the tweet content, the screen name of the person tweeting, the language of the tweet, the country of origin of the tweet, and more.

To create my bar plot for the language attribute, I use map() with a lambda to read the language of each tweet into a Pandas DataFrame, then plot the result as a bar plot with one frequency bar for each of the 5 most used languages in my list of tweets.

Here is my code for the language bar plot (note that my list of dictionaries containing each tweet is called tweets_data) :

import pandas as pd
import matplotlib.pyplot as plt

tweets_df = pd.DataFrame()

tweets_df['lang'] = map(lambda tweet: tweet['lang'], tweets_data)

tweets_by_lang = tweets_df['lang'].value_counts()

fig, ax = plt.subplots()
ax.tick_params(axis='x', labelsize=15)
ax.tick_params(axis='y', labelsize=10)
ax.set_xlabel('Languages', fontsize=15)
ax.set_ylabel('Number of tweets' , fontsize=15)
ax.set_title('Top 5 languages', fontsize=15, fontweight='bold')
tweets_by_lang[:5].plot(ax=ax, kind='bar', color='red')

As I said, I should be getting 5 bars, one for each of the top five languages in my data. Instead, I am getting the graph shown below.

[Image: the resulting bar plot]

Your problem is here:

tweets_df['lang'] = map(lambda tweet: tweet['lang'], tweets_data)

The issue, as your comment suggests, is down to changes from Python 2 to 3. In Python 2, map() returns a list, but in Python 3, map() returns an iterator. The hint is that tweets_df['lang'].value_counts() has only one value, and it's the <map ... > iterator object itself.
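To see the difference without pandas, here is a minimal sketch in plain Python 3. The three-item tweets_data list is made-up sample data standing in for the real one:

```python
# Hypothetical sample data standing in for the real tweets_data.
tweets_data = [{'lang': 'en'}, {'lang': 'es'}, {'lang': 'en'}]

langs = map(lambda tweet: tweet['lang'], tweets_data)

# In Python 3 this is a lazy iterator, not a list, and it has no length.
is_list = isinstance(langs, list)
has_len = hasattr(langs, '__len__')

# Wrapping it in list() consumes the iterator and yields the actual values.
as_list = list(langs)

# The iterator is now exhausted, so a second pass produces nothing.
leftover = list(langs)

print(is_list, has_len, as_list, leftover)
```

Since the map object is neither a list nor sized, it should be turned into a list before being handed to pandas.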

In either Python 2 or 3, you can use a list comprehension instead:

tweets_by_lang = pd.Series([tweet['lang'] for tweet in tweets_data]).value_counts()

Or in Python 3, you can follow @Triptych's advice from the answer linked above and wrap map() in list():

tweets_df['lang'] = list(map(lambda tweet: tweet['lang'], tweets_data))
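Putting it together, a corrected version of the whole script might look like the sketch below. The tweets_data list here is made-up sample data (six languages with distinct counts), the Agg backend is chosen only so the script runs without a display, and the output filename is arbitrary:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data: one dictionary per tweet, only 'lang' filled in.
sample_counts = {'en': 6, 'es': 5, 'fr': 4, 'de': 3, 'ja': 2, 'pt': 1}
tweets_data = [{'lang': lang} for lang, n in sample_counts.items() for _ in range(n)]

# Build a real list of languages, then count how often each one occurs.
tweets_by_lang = pd.Series([tweet['lang'] for tweet in tweets_data]).value_counts()

# value_counts() sorts by frequency, so the first 5 rows are the top 5 languages.
top5 = tweets_by_lang.head(5)

fig, ax = plt.subplots()
top5.plot(ax=ax, kind='bar', color='red')

# Labels are set after plotting so that any labels pandas applies are replaced.
ax.tick_params(axis='x', labelsize=15)
ax.tick_params(axis='y', labelsize=10)
ax.set_xlabel('Languages', fontsize=15)
ax.set_ylabel('Number of tweets', fontsize=15)
ax.set_title('Top 5 languages', fontsize=15, fontweight='bold')

fig.savefig('top5_languages.png')  # arbitrary output filename
```

With this sample data the plot should contain five bars, one per language, rather than a single bar for the map object.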
