
Filtering a dataframe in Bokeh using a Select widget and performing a group operation on the filtered dataframe

I am using Bokeh to create a standalone HTML report. My primary data source is a dataframe. I already know how to update a table or a plot using CustomJS callbacks. However, I would like to filter the original dataframe using a Select widget and then perform a grouping operation on the filtered dataframe. So far I have not been able to figure out how. For example, suppose my dataframe df looks like the table below:

ColA ColB ColC
A B 1
A B 1
C C 1

Now I would like to first select all the rows where ColB == 'B' and then group by ColA:

df[df['ColB']=='B'].groupby('ColA').agg({'ColC':'sum'})

Then I would use the grouped dataframe as the source for plots or tables. Thank you in advance.

You can't use real Pandas operations in a standalone HTML output, because that kind of output is just HTML and JavaScript in the browser, and the browser does not know anything about Python or Pandas. You have two options:

  • Use CustomJS callbacks, and do any grouping manually with JavaScript code, or

  • Deploy a Bokeh Server application, which would allow you to use real Python callbacks (e.g. callbacks that call Pandas functions)
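The first option (grouping manually in JavaScript) could look roughly like the sketch below. It assumes the data is column-oriented, i.e. an object mapping column names to arrays, which is the shape of a ColumnDataSource's data in the browser. The helper name filterGroupSum is hypothetical and not part of Bokeh; the sample data is the table from the question.

```javascript
// Hypothetical helper (not part of Bokeh): keep the rows where
// data[filterCol] equals filterValue, then group by groupCol and sum sumCol.
// `data` is column-oriented: { columnName: [values, ...] }.
function filterGroupSum(data, filterCol, filterValue, groupCol, sumCol) {
  const sums = new Map(); // a Map keeps groups in first-seen order
  for (let i = 0; i < data[filterCol].length; i++) {
    if (data[filterCol][i] !== filterValue) continue; // skip non-matching rows
    const key = data[groupCol][i];
    sums.set(key, (sums.get(key) || 0) + data[sumCol][i]);
  }
  // Return column-oriented output so it can serve as data for another source.
  return {
    [groupCol]: Array.from(sums.keys()),
    [sumCol]: Array.from(sums.values()),
  };
}

// Sample data from the question.
const data = { ColA: ['A', 'A', 'C'], ColB: ['B', 'B', 'C'], ColC: [1, 1, 1] };

// Equivalent of df[df['ColB']=='B'].groupby('ColA').agg({'ColC':'sum'})
const grouped = filterGroupSum(data, 'ColB', 'B', 'ColA', 'ColC');
console.log(grouped);
```

Inside a CustomJS callback attached to the Select widget, data would presumably be the data of the original source passed in through args, the filter value would be cb_obj.value, and the returned object would be assigned to the data of a second source that drives the plot or table.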

For standalone HTML you could use a third-party JS library such as dataframe-js. This way you can pass your entire Python dataframe to JS and perform filter and group operations in JS more or less the way you are used to in Python. See the example below (tested with Bokeh v2.1.1):

import os
import pandas as pd
from bokeh.io import save
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.util.browser import view

df = pd.DataFrame({'ColA': ['A', 'A', 'A'], 'ColB': ['B', 'B', 'C'], 'ColC': [1, 2, 3], 'ColD': [5, 2, 5]})

template = """
{% block postamble %}
    <script src="https://code.jquery.com/jquery-3.4.1.min.js"></script>
    <script src="https://gmousse.github.io/dataframe-js/dist/dataframe.min.js"></script>
    
    <script>
        $(document).ready(function() {
            var DataFrame = dfjs.DataFrame
            var line = Bokeh.documents[0].get_model_by_name('my_line')
            var df = new DataFrame(line.data_source.data)
            
            console.log('printing original dataframe')
            df.show()
            
            // Declare loop variables explicitly to avoid implicit globals
            var groups = df.groupBy('ColB').toCollection()
            
            for (var i = 0; i < groups.length; i++) {
                var group_tuple = groups[i]
        
                var name = group_tuple['groupKey']['ColB']
                var group = group_tuple['group']
                
                console.log('printing group ' + name)
                group.show()
            }
        });
    </script>
    
{% endblock %} """

source = ColumnDataSource(df)

p = figure()
p.line('ColC', 'ColD', source = source, name="my_line")
save(p, template=template)
# Open the generated HTML file; save() names it after this script by default
view(os.path.splitext(os.path.abspath(__file__))[0] + '.html')

JS Console output:

printing original dataframe
| ColA      | ColB      | ColC      | ColD      | index     |
------------------------------------------------------------
| A         | B         | 1         | 5         | 0         |
| A         | B         | 2         | 2         | 1         |
| A         | C         | 3         | 5         | 2         |
printing group B
| ColA      | ColB      | ColC      | ColD      | index     |
------------------------------------------------------------
| A         | B         | 1         | 5         | 0         |
| A         | B         | 2         | 2         | 1         |
printing group C
| ColA      | ColB      | ColC      | ColD      | index     |
------------------------------------------------------------
| A         | C         | 3         | 5         | 2         |
