I am writing code that builds a histogram from the Excel files in a directory, bins the data by a given bin width, and exports a new Excel file with the count for each bin. For example, with a bin width of 5, the value 5.16 increases the count for bin (5,10] by one. I would like to be able to pass in a value that changes the bins: if I choose n=3, the code should use bins (0,3], (3,6], etc., with the same rules as before. The original code is:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import openpyxl
from pandas import ExcelWriter
import os
datadir = '/Users/user/Desktop/Newfolder/'
for file in os.listdir(datadir):
    if file.endswith('.xlsx'):
        data = pd.read_excel(os.path.join(datadir, file))
        counts, bins, patches = plt.hist(data.values, bins=range(0,
            int(max(data.values)+5), 5))
        df = pd.DataFrame({'bin_leftedge': bins[:-1], 'count': counts})
        plt.title('Data')
        plt.xlabel('Neuron')
        plt.ylabel('# of Spikes')
        plt.show()
        outfile = os.path.join(datadir, file.replace('.xlsx', '_bins.xlsx'))
        writer = pd.ExcelWriter(outfile)
        df.to_excel(writer)
        writer.save()
My thought was to use argparse to take the bin width as a command-line argument, so I wrote:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import openpyxl
from pandas import ExcelWriter
import os
import argparse
datadir = '/Users/user/Desktop/Newfolder/'
parser = argparse.ArgumentParser(description = 'Calculating the bin width')
parser.add_argument('n', type=int, help='Changing of the width')
args = parser.parse_args()
def vary(n):
    wid = n
    return wid
if __name__ == '__main__':
    print(vary(args.n))
for file in os.listdir(datadir):
    if file.endswith('.xlsx'):
        data = pd.read_excel(os.path.join(datadir, file))
        counts, bins, patches = plt.hist(data.values, bins=range(0,
            int(max(data.values)+vary(n)), vary(n)))
        df = pd.DataFrame({'bin_leftedge': bins[:-1], 'count': counts})
        plt.title('Data')
        plt.xlabel('Neuron')
        plt.ylabel('# of Spikes')
        plt.show()
        outfile = os.path.join(datadir, file.replace('.xlsx', '_bins.xlsx'))
        writer = pd.ExcelWriter(outfile)
        df.to_excel(writer)
        writer.save()
I apologize in advance if this looks idiotic; I am pretty new to coding and still don't know much. Anyway, I get this error:
Traceback (most recent call last):
File "\Users\user\Desktop\Bins.py", line 25, in <module>
counts, bins, patches = plt.hist(data.values, bins=range(0, int(max(data.values)+vary(n)), vary(n)))
NameError: name 'n' is not defined
Could I get some help with this? How would I go about using this command-line argument (argparse) in the histogram, so that I can enter the bin width every time I need to change it? Any help would be greatly appreciated, thank you.
Without going too much into your code: command-line arguments are easily processed using the sys module:
import sys
print(sys.argv)  # first element is the script name, followed by the parameters as strings
So if I name this script sysArgs.py and call it with some parameters in the console, it prints:
python sysArgs.py lala 5
['sysArgs.py', 'lala', '5']
If you only want to pass one parameter n, then change it to:
import sys
n = int(sys.argv[1])  # argv entries are strings, so convert to int
# do stuff with n other than printing it
print(n)
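One caveat with reading sys.argv directly is that a missing argument raises IndexError and a non-numeric one raises ValueError. A minimal sketch of a guarded version follows; the helper name read_bin_width and the default width of 5 are assumptions made for illustration, not part of the original scripts.

```python
import sys

def read_bin_width(argv, default=5):
    """Return the bin width from argv[1], or `default` if no argument was given.

    `read_bin_width` and `default` are hypothetical names for this sketch.
    """
    if len(argv) < 2:
        # only the script name is present: fall back to the default width
        return default
    try:
        n = int(argv[1])
    except ValueError:
        # exit with a readable message instead of a traceback
        raise SystemExit("bin width must be an integer, got %r" % argv[1])
    if n <= 0:
        raise SystemExit("bin width must be positive")
    return n

# In a script this would be called as: n = read_bin_width(sys.argv)
```

Taking argv as a parameter rather than reading sys.argv inside the function keeps the helper easy to try out from an interactive session.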
Here's how I'd organize your code. main can be reorganized, but my focus is on where the command-line parsing happens.
import os
import argparse
import pandas as pd
import matplotlib.pyplot as plt

datadir = '/Users/user/Desktop/Newfolder/'
n = 3  # a default if not used with argparse

def main(datadir, n):
    # might split the load and the plot functions
    # or put the action for one file in a separate function
    for file in os.listdir(datadir):
        if file.endswith('.xlsx'):
            data = pd.read_excel(os.path.join(datadir, file))
            counts, bins, patches = plt.hist(data.values, bins=range(0,
                int(max(data.values)+n), n))
            df = pd.DataFrame({'bin_leftedge': bins[:-1], 'count': counts})
            plt.title('Data')
            plt.xlabel('Neuron')
            plt.ylabel('# of Spikes')
            plt.show()
            outfile = os.path.join(datadir, file.replace('.xlsx', '_bins.xlsx'))
            writer = pd.ExcelWriter(outfile)
            df.to_excel(writer)
            writer.save()

if __name__ == '__main__':
    # run only as a script; not on import
    # if more complicated define this parser in a function
    parser = argparse.ArgumentParser(description = 'Calculating the bin width')
    parser.add_argument('n', type=int, help='Changing of the width')
    args = parser.parse_args()
    print(args)  # to see what argparse does
    main(datadir, args.n)
    # main(args.datadir, args.n)  # if parser has a datadir argument
(I haven't tested this.)
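Since the code above is untested, it may help to check the argument parsing and the bin edges separately from the Excel and plotting steps. The sketch below is one way to do that. The helpers build_parser and bin_edges and the optional --datadir flag are assumptions for illustration and do not appear in the original code. Passing an explicit list to parse_args lets it run without a real command line. The upper edge uses math.ceil, which is a deliberate change from the int(max + n) expression: with a maximum of 5.16 and n=5, range(0, int(5.16 + 5), 5) yields only the edges 0 and 5, which leaves 5.16 outside the last bin.

```python
import argparse
import math

def build_parser():
    parser = argparse.ArgumentParser(description='Calculating the bin width')
    parser.add_argument('n', type=int, help='width of each bin')
    # --datadir is an assumed extra option, not defined in the original code
    parser.add_argument('--datadir', default='/Users/user/Desktop/Newfolder/',
                        help='directory holding the .xlsx files')
    return parser

def bin_edges(max_value, n):
    """Return edges 0, n, 2n, ... so that the last edge is >= max_value."""
    # round the maximum up to the next multiple of n (at least one full bin)
    top = max(n, math.ceil(max_value / n) * n)
    return list(range(0, top + n, n))

# Equivalent to running the script with the single argument 3
args = build_parser().parse_args(['3'])
edges = bin_edges(10.4, args.n)
print(args.n, edges)  # prints: 3 [0, 3, 6, 9, 12]
```

The resulting list could be passed as the bins argument of plt.hist in place of the range(...) expression inside main.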