Plotting arrays with different lengths in seaborn

Question

I have a dataframe that I would like to make a strip plot from. It looks like this:

   Symbol  Avg.Sentiment  Weighted  Mentions                                          Sentiment
0     AMC           0.14      0.80       557  [-0.38, -0.48, -0.27, -0.42, 0.8, -0.8, 0.13, ...
2     GME           0.15      0.26       175  [-0.27, 0.13, -0.53, 0.65, -0.91, 0.66, 0.67, ...
1      BB           0.23      0.29       126  [-0.27, 0.34, 0.8, -0.14, -0.39, 0.4, 0.34, -0...
11    SPY          -0.06     -0.03        43  [0.32, -0.38, -0.54, 0.36, -0.18, 0.18, -0.33,...
4    SPCE           0.26      0.09        35  [0.65, 0.57, 0.74, 0.48, -0.54, -0.15, -0.3, -...
13     AH           0.06      0.02        33  [0.62, 0.66, -0.18, -0.62, 0.12, -0.42, -0.59,...
12   PLTR           0.16      0.05        29  [0.66, 0.36, 0.64, 0.59, -0.42, 0.65, 0.15, -0...
15   TSLA           0.13      0.03        24  [0.1, 0.38, 0.64, 0.42, -0.32, 0.32, 0.44, -0....

and so on. The number of elements in each 'Sentiment' list is the same as the number of mentions. I would like to make a strip plot with Symbol on the x axis and sentiment on the y axis. I believe the problem I'm encountering is caused by the lists having different lengths. The actual error I'm getting is:

ValueError: setting an array element with a sequence.

The code I'm using to create the strip plot is:

def symbolSentimentVisualization(dataset):
    sns.stripplot(x='Symbol',y='Sentiment',data=dataset.loc[:9])
    plt.show()

My guess is that the other part of the issue has to do with numpy trying to build a multidimensional array from lists of different lengths before the data reaches seaborn, but I'm not 100% sure about that. If the solution is to plot one row at a time and then merge the plots, that would definitely work, but I'm not sure what exactly to call to do that, because the following doesn't seem to work either:

def symbolSentimentVisualization(dataset):
    sns.stripplot(x=dataset['Symbol'][0],y=dataset['Sentiment'][0],data=dataset.loc[:9])
    plt.show()

Answer 1

IIUC, explode 'Sentiment' first so that each list element gets its own row, then plot:

df = df.explode('Sentiment')
ax = sns.stripplot(x="Symbol", y="Sentiment", data=df)
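To see what explode does to the shape of the data, here is a minimal sketch on a made-up three-row frame (the values are illustrative, not the asker's real data). Each cell of 'Sentiment' holds a whole list, which is presumably why seaborn cannot treat the column as a numeric vector. After exploding, the column is still object dtype, so the cast to float below is an extra precaution of mine and not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: one list per row,
# with a different length in each row.
df = pd.DataFrame({
    "Symbol": ["AMC", "GME", "BB"],
    "Sentiment": [[-0.38, -0.48, 0.8], [-0.27, 0.13], [0.34]],
})

# explode() repeats 'Symbol' once per list element (long form).
# The exploded column keeps object dtype, so cast it to float explicitly.
long_df = df.explode("Sentiment").astype({"Sentiment": float})

print(long_df.shape)               # rows = total number of list elements
print(long_df["Sentiment"].dtype)  # numeric after the cast
```

The long-form frame has one row per sentiment value, which is the layout stripplot expects for x="Symbol", y="Sentiment".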

Sample Data:

np.random.seed(5)
df = pd.DataFrame({
    'Symbol': ['AMC', 'GME', 'BB', 'SPY', 'SPCE'],
    'Mentions': [557, 175, 126, 43, 35]
})

df['Sentiment'] = df['Mentions'].apply(lambda x: (np.random.random(x) * 2) - 1)

  Symbol  Mentions                                          Sentiment
0    AMC       557  [-0.556013657820521, 0.7414646123547528, -0.58...
1    GME       175  [-0.5673003921341209, -0.6504850189478857, 0.1...
2     BB       126  [0.7771316020052821, 0.26579994709269994, -0.4...
3    SPY        43  [-0.5966607678089173, -0.4473484233894889, 0.7...
4   SPCE        35  [0.7934741289205556, 0.17613102678923398, 0.58...

Resulting Graph: [image: strip plot of Sentiment by Symbol]

Complete Working Example with Sample Data:

import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

np.random.seed(5)
df = pd.DataFrame({
    'Symbol': ['AMC', 'GME', 'BB', 'SPY', 'SPCE'],
    'Mentions': [557, 175, 126, 43, 35]
})

df['Sentiment'] = df['Mentions'].apply(lambda x: (np.random.random(x) * 2) - 1)

df = df.explode('Sentiment')
ax = sns.stripplot(x="Symbol", y="Sentiment", data=df)
plt.show()
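The same idea can be folded back into the function from the question. The sketch below is one possible way to do it, not the answerer's code: the helper name to_long_form and the parameter n are hypothetical. One thing to note is that dataset.loc[:9] slices by index label, and the frame in the question has an unsorted index (0, 2, 1, 11, ...), so it does not necessarily return the first ten rows; head(n) selects by position instead.

```python
import pandas as pd


def to_long_form(dataset, n=10):
    """Return one row per (Symbol, Sentiment) pair for the first n rows.

    Hypothetical helper: the name and the `n` parameter are illustrative.
    """
    top = dataset.head(n)  # positional, unlike the label-based dataset.loc[:9]
    return (
        top[["Symbol", "Sentiment"]]
        .explode("Sentiment")           # one row per list element
        .astype({"Sentiment": float})   # exploded column is object dtype
        .reset_index(drop=True)
    )


def symbol_sentiment_visualization(dataset, n=10):
    # Plotting libraries are imported here so the reshaping helper above
    # can be used (and tested) without them.
    import seaborn as sns
    from matplotlib import pyplot as plt

    sns.stripplot(x="Symbol", y="Sentiment", data=to_long_form(dataset, n))
    plt.show()
```

Calling symbol_sentiment_visualization(df) on the unexploded frame should then draw one strip per symbol for the first ten rows.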

solution1 1 ACCEPTED 2021-05-29 02:43:34