I'm trying to make a histogram in Python. I am starting with the following snippet:
def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d
I understand it uses a dictionary to solve the problem, but I'm confused about the 4th line: if x in d:
d is still being constructed and there's nothing in it yet, so how can x be in d?
Keep in mind that the if is inside a for loop. So, when you're looking at the very first item in L, there is nothing in d, but when you get to the next item in L, there is something in d, so you need to check whether to make a new bin on the histogram (d[x] = 1) or add the item to an existing bin (d[x] += 1).
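As a small illustration of why the check is safe (a sketch, not part of the original snippet): testing membership in an empty dictionary raises no error. It simply evaluates to False, so the else branch runs for the first item.

```python
d = {}

# Membership test on an empty dict is allowed and is simply False.
first_check = 3 in d

# The else branch of the snippet then creates the bin.
d[3] = 1

# The same test now succeeds, so the next 3 would take the += 1 branch.
second_check = 3 in d
```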
In Python, we actually have some shortcuts for this:
from collections import defaultdict
def histogram(L):
    d = defaultdict(int)
    for x in L:
        d[x] += 1
    return d
This automatically starts each bin in d at zero (what int() returns), so you don't have to check whether the bin exists. On Python 2.7 or higher:
from collections import Counter
d = Counter(L)
This automatically builds a mapping of the frequencies of each item in L. No other code is required.
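A minimal sketch of what the resulting Counter looks like, using a made-up sample list. A Counter is a dict subclass, so it can be converted to a plain dictionary holding the same counts, and most_common() lists the bins from most to least frequent.

```python
from collections import Counter

L = ['a', 'b', 'a', 'c', 'a', 'b']  # hypothetical sample data
d = Counter(L)

as_plain_dict = dict(d)          # same counts as an ordinary dict
top_two = d.most_common(2)       # the two most frequent items with their counts
missing = d['z']                 # a missing key counts as 0 instead of raising KeyError
```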
You can create a histogram with a dict comprehension:
histogram = {key: L.count(key) for key in set(L)}
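A quick check, on made-up data, that the comprehension produces the same bins as the loop version. One thing to keep in mind: L.count(key) rescans the whole list for every distinct key, so this form does more work than a single pass when L is long and has many distinct values.

```python
L = [1, 2, 2, 3, 3, 3]  # hypothetical sample data

# One full scan of L per distinct value.
by_comprehension = {key: L.count(key) for key in set(L)}

# Single pass over L, as in the loop-based snippet.
by_loop = {}
for x in L:
    if x in by_loop:
        by_loop[x] += 1
    else:
        by_loop[x] = 1
```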
The code inside the for loop will be executed once for each element in L, with x being the value of the current element.
Let's look at the simple case where L is the list [3, 3]. The first time through the loop d will be empty, x will be 3, and 3 in d will be false, so d[3] will be set to 1. The next time through the loop x will be 3 again, and 3 in d will be true, so d[3] will be incremented by 1.
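The walk-through above can be reproduced with a small tracing variant of the function. The name histogram_trace and the steps list are illustrative additions, not part of the original code.

```python
def histogram_trace(L):
    """Build the histogram and record what happened at each step."""
    d = {}
    steps = []
    for x in L:
        found = x in d  # evaluated before d is modified
        if found:
            d[x] += 1
        else:
            d[x] = 1
        # Store a copy of d so later updates don't alter the record.
        steps.append((x, found, dict(d)))
    return d, steps


d, steps = histogram_trace([3, 3])
for x, found, snapshot in steps:
    print(x, found, snapshot)
```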
You can use a Counter, available from Python 2.7 and Python 3.1+.
>>> # init empty counter
>>> from collections import Counter
>>> c = Counter()
>>> # add a single sample to the histogram
>>> c.update([4])
>>> # add several samples at once
>>> c.update([4, 2, 2, 5])
>>> # print content (the order of equal counts can vary between Python versions)
>>> print(c)
Counter({2: 2, 4: 2, 5: 1})
The class brings several nice features, like addition, subtraction, intersection and union of counters. A Counter can count anything that can be used as a dictionary key.
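A short sketch of those operations on two made-up counters. The variable names are only for illustration.

```python
from collections import Counter

a = Counter({'x': 3, 'y': 1})
b = Counter({'x': 1, 'y': 2})

total = a + b        # addition: counts are summed
difference = a - b   # subtraction: only positive counts are kept
common = a & b       # intersection: minimum of each count
either = a | b       # union: maximum of each count

# Any hashable object can be counted, for example tuples.
pairs = Counter([(0, 1), (0, 1), (2, 3)])
```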
I think the others have already explained why if x in d is there. But here is a hint on how this code could be written following "don't ask permission, ask forgiveness":
...
try:
    d[x] += 1
except KeyError:
    d[x] = 1
The reason for this is that you expect the error to be raised only once per distinct value, the first time that value is seen. Thus, there is no need to check if x in d on every iteration.
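Put together as a complete function, the idea might look like the sketch below. The name histogram_eafp is made up for illustration. The KeyError is raised the first time each distinct value is seen and never again for that value.

```python
def histogram_eafp(L):
    """Count items by trying the increment first and handling the miss."""
    d = {}
    for x in L:
        try:
            d[x] += 1
        except KeyError:
            # First time this value is seen: create its bin.
            d[x] = 1
    return d


result = histogram_eafp(['a', 'b', 'a'])
```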
You can create your own histogram in Python using, for example, matplotlib. If you want to see an example of how this could be implemented, you can refer to this answer.
In this specific case, assuming the hist3d_bubble function from that answer is available, you can do:
temperature = [4, 3, 1, 4, 6, 7, 8, 3, 1]
radius = [0, 2, 3, 4, 0, 1, 2, 10, 7]
density = [1, 10, 2, 24, 7, 10, 21, 102, 203]
points, sub = hist3d_bubble(temperature, density, radius, bins=4)
sub.axes.set_xlabel('temperature')
sub.axes.set_ylabel('density')
sub.axes.set_zlabel('radius')
If x isn't in d, then it gets put into d with d[x] = 1. Basically, if x shows up in L more than once, the count stored for x is increased each time.
Try using this to step through the code: http://people.csail.mit.edu/pgbovine/python/