
Combine histograms in gnuplot (weighted average of bin counts)

Let us say I have two measurements of the same quantity, each with its own error. Specifically, I have two histograms in gnuplot files, roughly in the format "xA yA dyA" for histogram A and "xB yB dyB" for histogram B (the xA and xB values are the same).

To increase the precision of the histograms I can "merge" them to get a better estimate of y. In practice I want to obtain a histogram whose y values are the weighted average of the yA and yB values, with weights given by the inverse of their errors.

This is a pretty standard operation in data manipulation, and I expected some utility to exist for doing it on gnuplot histograms. However, I failed to find one.

So I would like to ask whether any program already does this. If none exists, I'd like a suggestion for which language to write it in. I already have something that does it in Wolfram Mathematica, but I now want to perform the operation from the Unix shell, so I was wondering whether Python would be a good choice for manipulating gnuplot files or whether something else is better suited.

Thanks, Roberto

To be more exact, I have histograms in a .gnu file that somebody gives me, in the format

   # comments 
     set title "sqrt(p^2(5)) distribution" font "Helvetica, 20" 
     set xlabel "sqrt(p^2(5))" font "Helvetica, 20" 
     set ylabel "d{/Symbol s}/dsqrt(p^2(5))" font "Helvetica, 20" 
     set xrange [    0.00000:  40.00000] 
     plot "-" with histeps 
        4.50000        3986.18        1.27863 
        5.50000        3986.18        1.27863 
        6.50000        3986.18        1.27863 
    e 

     set title "m(5) distribution" font "Helvetica, 20" 
     set xlabel "m(5)" font "Helvetica, 20" 
     set ylabel "d{/Symbol s}/dm(5)" font "Helvetica, 20" 
     set xrange [    0.00000:  40.00000] 
     plot "-" with histeps 
        4.50000        3986.18        1.27863 
        5.50000        3986.18        1.27863 
        6.50000        3986.18        1.27863 
     e 

I would like to extract all the data from this file in order to combine, for instance, the m(5) histogram that I have in several files (combining means taking a weighted average, as stated above). Is there a quick way to read this data into Python and manipulate the histograms to combine them?

Yes, Python and NumPy are a good choice for this. If your files contain only a fixed number of values per line, you can use the numpy.loadtxt function to read them and numpy.savetxt to write them. Otherwise you will have to use the general Python I/O routines.
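For the .gnu files shown in the question, the data sits between a plot "-" line and an e line, so a plain column reader will not work directly. A minimal sketch using only the standard library might look like the following. The names read_gnu_histograms and combine and the power parameter are illustrative assumptions, not part of any existing tool. With power=1 the weights are 1/dy, as stated in the question; power=2 would give the more common inverse-variance weights. The combined error assumes the input histograms are statistically independent and that no dy is zero.

```python
import math
import re


def read_gnu_histograms(text):
    """Return {title: [(x, y, dy), ...]} for each inline 'plot "-"' block."""
    hists = {}
    title = None
    rows = None  # None while we are outside a data block
    for line in text.splitlines():
        s = line.strip()
        m = re.match(r'set\s+title\s+"([^"]*)"', s)
        if m:
            title = m.group(1)
        elif rows is None and s.startswith("plot") and ('"-"' in s or "'-'" in s):
            rows = []  # the data lines start after this line
        elif rows is not None:
            if s == "e":  # a lone 'e' ends the inline data
                hists[title] = rows
                rows = None
            elif s and not s.startswith("#"):
                x, y, dy = (float(v) for v in s.split()[:3])
                rows.append((x, y, dy))
    return hists


def combine(hists, power=1):
    """Bin-by-bin weighted average with weights 1/dy**power.

    power=1 follows the question; power=2 is inverse-variance weighting.
    """
    if len({len(h) for h in hists}) != 1:
        raise ValueError("histograms have different numbers of bins")
    combined = []
    for bins in zip(*hists):
        if len({b[0] for b in bins}) != 1:
            raise ValueError("bin centres differ between histograms")
        w = [1.0 / b[2] ** power for b in bins]
        wsum = sum(w)
        y = sum(wi * b[1] for wi, b in zip(w, bins)) / wsum
        # propagate independent errors through the weighted mean
        dy = math.sqrt(sum((wi * b[2]) ** 2 for wi, b in zip(w, bins))) / wsum
        combined.append((bins[0][0], y, dy))
    return combined
```

To use it, one would read each file's text, pick the same title (for example "m(5) distribution") from each result, pass the list of histograms to combine, and write the rows back out in three columns.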

The simplest way is to use paste to merge the two files (assuming that the x-values are the same and in the same order in both files) and do the calculations in gnuplot.

Consider the two test files, A.txt:

1 5 1
2 1 2

and B.txt:

1 3 1
2 4 1

Using the script

set style fill solid noborder
set boxwidth 0.8 relative

set yrange [0:*]
weighted_avg(yA, dyA, yB, dyB) = ((yA/dyA + yB/dyB)/(1.0/dyA + 1.0/dyB))

plot '< paste A.txt B.txt' using 1:(weighted_avg($2, $3, $5, $6)) with boxes notitle

You get the following histogram:

[image: box plot of the weighted averages of A.txt and B.txt]
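As a quick cross-check of the box heights, the same formula can be evaluated by hand on the two test files. The short Python sketch below mirrors the gnuplot weighted_avg function; the rows are copied from A.txt and B.txt above.

```python
def weighted_avg(yA, dyA, yB, dyB):
    # same expression as the gnuplot function: weights are 1/dy
    return (yA / dyA + yB / dyB) / (1.0 / dyA + 1.0 / dyB)


# rows of A.txt and B.txt as (x, y, dy)
A = [(1, 5, 1), (2, 1, 2)]
B = [(1, 3, 1), (2, 4, 1)]

for (x, yA, dyA), (_, yB, dyB) in zip(A, B):
    print(x, weighted_avg(yA, dyA, yB, dyB))
# prints:
# 1 4.0
# 2 3.0
```

So the two boxes should have heights 4 and 3.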

