
How to conditionally sort X-axis values in Matplotlib plot?

I have the following DataFrame:

file     size
abc1.txt  2.1 MB
abc2.txt  1.0 MB
abc3.txt  1.5 MB
abc4.txt  767.9 KB

When I plot this data (plt.plot(df['file'], df['size'])), the KB and MB values come out in the wrong order. How can I sort them so that the KB values come first, followed by the MB values?

767.9 KB  1.0 MB  1.5 MB  2.1 MB
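import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'file': [1,2,3,4], 'size': ['2.1 MB', '1.0 MB', '1.5 MB', '767.9 KB']})
# Multiplier for each unit suffix, in bytes (decimal units: 1 KB = 1000 bytes).
cv = {'': 1, 'KB': 1e3, 'MB': 1e6, 'GB': 1e9, 'TB': 1e12}
# "<number> <unit>" is scaled by its unit; a bare number is taken as bytes.
df['size_bytes'] = df['size'].apply(lambda x: float(x.split()[0])*cv[x.split()[1]]
                                    if len(x.split())==2 else float(x))
fig, ax = plt.subplots()
ax.plot(df['file'], df['size_bytes'])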

And if you want the y-axis in human-readable form:

def to_human_readable(size):
    power = 1000
    n = 0
    mem = {0 : '', 1: 'KB', 2: 'MB', 3: 'GB', 4: 'TB'}
    # Stop at TB so that n never runs past the keys of mem.
    while size >= power and n < max(mem):
        size /= power
        n += 1
    return "{0:g} {1}".format(size, mem[n])

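from matplotlib.ticker import FuncFormatter

# Label each y tick with a human-readable size; negative ticks get a blank label.
ax.yaxis.set_major_formatter(
    FuncFormatter(lambda v, pos: to_human_readable(v) if v >= 0 else ' '))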


(In digital storage, 1 KB = 1000 bytes.)

First, it's reading your numbers as strings, so no ordering of them would really make sense, and furthermore the spacing between the points is not representative of the actual values.
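To see why the string values can't be ordered sensibly, here is a small sketch using only the sizes from the question. Strings compare character by character, so the unit is never taken into account:

```python
sizes = ["2.1 MB", "1.0 MB", "1.5 MB", "767.9 KB"]

# Lexicographic sort: '7' > '2', so the KB value lands last, not first.
as_strings = sorted(sizes)
print(as_strings)  # ['1.0 MB', '1.5 MB', '2.1 MB', '767.9 KB']

# The same effect within a single unit: '1' < '9', so "10.0 MB" sorts before "9.0 MB".
print("10.0 MB" < "9.0 MB")  # True
```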

Also in general I'd say it's poor practice to have different units on the same axis. Better to convert to the same unit:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([['abc1.txt',  '2.1 MB'],
                   ['abc2.txt',  '1.0 MB'],
                   ['abc3.txt',  '1.5 MB'],
                   ['abc4.txt',  '767.9 KB']], columns=["file", 'size'])

# This is a list comprehension that splits the number out of the string, converts it to a float, 
# and divides it by 1000 if the other part of the string is 'KB'.
df['size_float'] = [float(x[0])/1000 if x[1]=='KB' else float(x[0]) for x in df['size'].str.split()]
plt.plot(df['file'],df['size_float'])
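If you also want the points ordered from smallest to largest along the x-axis, as the question asks, one option is to sort by the numeric size before plotting. The following is only a rough sketch in plain Python: parse_size and UNIT_BYTES are hypothetical names made up for this example, and it assumes decimal units (1 KB = 1000 bytes) and sizes written as "<number> <unit>".

```python
import re

# Assumed decimal multipliers (1 KB = 1000 bytes).
UNIT_BYTES = {"B": 1, "KB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12}

def parse_size(text):
    """Convert a string such as '767.9 KB' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    # A bare number with no unit is treated as bytes.
    return float(number) * UNIT_BYTES[unit.upper() or "B"]

files = ["abc1.txt", "abc2.txt", "abc3.txt", "abc4.txt"]
sizes = ["2.1 MB", "1.0 MB", "1.5 MB", "767.9 KB"]

# Sort (file, size) pairs by the numeric size, smallest first.
pairs = sorted(zip(files, sizes), key=lambda pair: parse_size(pair[1]))
files_sorted = [f for f, _ in pairs]
sizes_sorted = [s for _, s in pairs]

# One common unit (MB) for the y-axis values.
sizes_mb = [parse_size(s) / 1e6 for s in sizes_sorted]

print(files_sorted)  # ['abc4.txt', 'abc2.txt', 'abc3.txt', 'abc1.txt']
print(sizes_sorted)  # ['767.9 KB', '1.0 MB', '1.5 MB', '2.1 MB']
```

The sorted lists can then be passed to plt.plot(files_sorted, sizes_mb), which puts the files on the x-axis in order of increasing size with a single unit on the y-axis.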
