I have this function:
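import numpy as np
from collections import Counter

def find_nearest(array, value):
    # Return the element of array that is closest to value
    idx = (np.abs(array - value)).argmin()
    return array[idx]

def df_to_count_dict(df):
    count_dict = Counter(df.values)
    # Collect the integers from 1 up to the largest key that are not keys yet
    holder = []
    for i in range(1, max(list(count_dict.keys()))):
        if i in count_dict.keys(): continue
        holder.append(i)
    # Give each missing key the count of its nearest existing key
    for i in holder:
        j = find_nearest(np.array(list(count_dict.keys())), i)
        count_dict.update({i: count_dict[j]})
    return count_dict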
It takes a data series and uses the Counter class from collections to return a dictionary of counts. It also adds the keys that are missing from the dictionary, giving each one the count of the closest existing key.
Now I want to amend this function so that it returns the same object, count_dict, but gives each key missing from the dictionary the average of the counts of the two existing keys it lies between.
This is best explained by an example:
Take
test = pd.Series([1,2,3,3,7,7,7,8])
Without the function above we get:
Counter(test.values)
Out[459]: Counter({1: 1, 2: 1, 3: 2, 7: 3, 8: 1})
Using the function we get
df_to_count_dict(test)
Out[458]: Counter({1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2, 7: 3, 8: 1})
As you can see it has added keys 4,5,6 with values 2 as 2 is the value of the closest key (the closest key is 3).
What I want it to return is the AVERAGE of the value of the closest lower key and the value of the closest upper key. Here the closest lower key is 3, which has value 2, and the closest upper key is 7, which has value 3, so I want the final result to look something like:
df_to_count_dict(test)
Out[458]: Counter({1: 1, 2: 1, 3: 2, 4: 2.5, 5: 2.5, 6: 2.5, 7: 3, 8: 1})
I hope someone can help
This looks a lot like schoolwork, so you should figure it out yourself. But here is a hint: the query you are being asked to develop finds the mean of the predecessor's count and the successor's count. The predecessor is the largest key smaller than or equal to the input, and the successor is the smallest key larger than the input.
If you need O(log(n)) complexity, you might look at binary search trees; bintrees is a good package: https://pypi.python.org/pypi/bintrees/2.0.4
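For what it's worth, here is a minimal sketch of the predecessor/successor idea. It is not the only way to do it, and it swaps the binary search tree for the standard library's bisect module on a sorted list of keys, which also gives O(log(n)) lookups as long as the set of keys does not change. The function name fill_gaps_with_neighbor_mean is made up for this example. It assumes integer keys, only fills gaps between the smallest and largest existing key (a key below the smallest one has no predecessor to average with), and returns a plain dict rather than a Counter.

```python
from bisect import bisect_left
from collections import Counter


def fill_gaps_with_neighbor_mean(values):
    """Count the values, then give every missing integer key the mean of
    the counts of its closest existing keys below and above.

    Hypothetical helper for illustration; assumes integer values.
    """
    counts = Counter(values)
    if not counts:
        return {}
    keys = sorted(counts)  # existing keys in ascending order
    filled = dict(counts)
    # Only gaps strictly between the smallest and largest key are filled.
    for missing in range(keys[0] + 1, keys[-1]):
        if missing in counts:
            continue
        pos = bisect_left(keys, missing)  # index of the successor key
        lower, upper = keys[pos - 1], keys[pos]  # predecessor, successor
        filled[missing] = (counts[lower] + counts[upper]) / 2
    return filled
```

With the series from the question you could call it as fill_gaps_with_neighbor_mean(test.tolist()); keys 4, 5 and 6 should then each get (2 + 3) / 2 = 2.5. Because the lookups go against the original sorted keys, a filled-in key never influences the value of its neighbours.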