
Java - Convert Human Readable Size to Bytes

I've found lots of information about converting raw byte counts into a human-readable format, but I need to do the opposite, i.e. convert the String "1.6 GB" into the long value 1717990000. Is there a built-in/well-defined way to do this, or will I pretty much have to roll my own?

[Edit]: Here is my first stab...

static class ByteFormat extends NumberFormat {
    @Override
    public StringBuffer format(double arg0, StringBuffer arg1, FieldPosition arg2) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public StringBuffer format(long arg0, StringBuffer arg1, FieldPosition arg2) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public Number parse(String arg0, ParsePosition arg1) {
        return parse (arg0);
    }

    @Override
    public Number parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unit = arg0.substring(spaceNdx + 1);
        int factor = 0;
        if (unit.equals("GB")) {
            factor = 1073741824;
        }
        else if (unit.equals("MB")) {
            factor = 1048576;
        }
        else if (unit.equals("KB")) {
            factor = 1024;
        }

        return ret * factor;
    }
}

Spring Framework 5.1 added a DataSize class that parses human-readable data sizes into bytes and formats them back to their human-readable form. It lives in the org.springframework.util.unit package.

If you use Spring Framework, you can upgrade to >=5.1 and use this class. Otherwise you can copy it and the related classes into your project (while complying with the license).

Then you can use it:

DataSize dataSize = DataSize.parse("16GB");
System.out.println(dataSize.toBytes());

will give the output:

17179869184

However, the pattern used to parse your input:

  • Does not support decimals (so you can use 1GB, 2GB, 1638MB, but not 1.6GB)
  • Does not support spaces (so you can use 1GB but not 1 GB)

I would recommend sticking to that convention for compatibility and easier maintenance. But if it does not suit your needs, you will need to copy and edit the file; it is a good place to start.
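If you do need decimals and spaces, a standalone parser is small enough to sketch. The following is only an illustration and not Spring's code: the class name LenientDataSize and its pattern are made up for this example, and it assumes the same binary multipliers as the output above (1 GB = 1024^3 bytes).

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spring: accepts "16GB", "1.6GB" and "1.6 GB".
public class LenientDataSize {

    // A number with an optional fraction, optional whitespace, then an optional unit.
    private static final Pattern PATTERN = Pattern.compile(
            "^\\s*(\\d+(?:\\.\\d+)?)\\s*(B|KB|MB|GB|TB)?\\s*$",
            Pattern.CASE_INSENSITIVE);

    public static long toBytes(String text) {
        Matcher m = PATTERN.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid data size: '" + text + "'");
        }
        // A missing unit is treated as plain bytes.
        String unit = m.group(2) == null ? "B" : m.group(2).toUpperCase(Locale.ROOT);
        long multiplier;
        switch (unit) {
            case "KB": multiplier = 1L << 10; break;
            case "MB": multiplier = 1L << 20; break;
            case "GB": multiplier = 1L << 30; break;
            case "TB": multiplier = 1L << 40; break;
            default:   multiplier = 1L;       break;
        }
        // Fractions of a byte are truncated by the cast.
        return (long) (Double.parseDouble(m.group(1)) * multiplier);
    }
}
```

With this sketch, LenientDataSize.toBytes("1.6 GB") yields 1717986918, and "16GB" still yields 17179869184.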

I've never heard of a well-known library that implements such text-parsing utility methods. But your solution seems close to a correct implementation.

The only two things I'd like to correct in your code are:

  1. Define the method Number parse(String arg0) as static, due to its utility nature.

  2. Define the factors for each size unit as final static fields.

I.e. it would look like this:

private final static long KB_FACTOR = 1024;
private final static long MB_FACTOR = 1024 * KB_FACTOR;
private final static long GB_FACTOR = 1024 * MB_FACTOR;

public static double parse(String arg0) {
    int spaceNdx = arg0.indexOf(" ");
    double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
    switch (arg0.substring(spaceNdx + 1)) {
        case "GB":
            return ret * GB_FACTOR;
        case "MB":
            return ret * MB_FACTOR;
        case "KB":
            return ret * KB_FACTOR;
    }
    return -1;
}

A revised version of Andremoniy's answer that properly distinguishes between kilo and kibi, etc.

private final static long KB_FACTOR = 1000;
private final static long KIB_FACTOR = 1024;
private final static long MB_FACTOR = 1000 * KB_FACTOR;
private final static long MIB_FACTOR = 1024 * KIB_FACTOR;
private final static long GB_FACTOR = 1000 * MB_FACTOR;
private final static long GIB_FACTOR = 1024 * MIB_FACTOR;

public static double parse(String arg0) {
    int spaceNdx = arg0.indexOf(" ");
    double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
    switch (arg0.substring(spaceNdx + 1)) {
        case "GB":
            return ret * GB_FACTOR;
        case "GiB":
            return ret * GIB_FACTOR;
        case "MB":
            return ret * MB_FACTOR;
        case "MiB":
            return ret * MIB_FACTOR;
        case "KB":
            return ret * KB_FACTOR;
        case "KiB":
            return ret * KIB_FACTOR;
    }
    return -1;
}
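To get a feel for how much the distinction matters, here is a small sketch that runs the same value through both kinds of prefix. The class name UnitComparison and the lookup table are my own illustration, not part of the answer above; unlike the switch, it throws on an unknown unit instead of returning -1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: same factors as above, kept in a lookup table.
public class UnitComparison {

    private static final Map<String, Long> FACTORS = new HashMap<>();
    static {
        FACTORS.put("KB", 1000L);
        FACTORS.put("KiB", 1024L);
        FACTORS.put("MB", 1000L * 1000L);
        FACTORS.put("MiB", 1024L * 1024L);
        FACTORS.put("GB", 1000L * 1000L * 1000L);
        FACTORS.put("GiB", 1024L * 1024L * 1024L);
    }

    public static long toBytes(double value, String unit) {
        Long factor = FACTORS.get(unit);
        if (factor == null) {
            throw new IllegalArgumentException("Unknown unit: " + unit);
        }
        // Fractions of a byte are truncated.
        return (long) (value * factor);
    }

    public static void main(String[] args) {
        System.out.println(toBytes(1.6, "GB"));  // 1600000000
        System.out.println(toBytes(1.6, "GiB")); // 1717986918
    }
}
```

The two readings of "1.6 G" differ by roughly 118 million bytes, so it is worth deciding up front which convention your input uses.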

All in one answer, parses to long:

public class SizeUtil {

    // The index of a letter is the power of the base: B=0, K=1, M=2, ...
    public static String units = "BKMGTPEZY";

    public static long parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unitString = arg0.substring(spaceNdx + 1);
        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        // An 'i' (KiB, MiB, GiB, ...) marks a binary prefix (base 1024);
        // without it the prefix is SI (base 1000).
        boolean isBinary = unitString.indexOf('i') != -1;
        int factor = isBinary ? 1024 : 1000;

        // The cast saturates at Long.MAX_VALUE if the value does not fit in a long.
        return (long) (ret * Math.pow(factor, power));
    }

    public static void main(String[] args) {
        System.out.println(parse("300.00 GiB")); // requires a space
        System.out.println(parse("300.00 GB"));
        System.out.println(parse("300.00 B"));
        System.out.println(parse("300 EB"));
    }
}
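One thing to watch with a long result: "300 EB" is about 3 × 10^20 bytes, which is well beyond Long.MAX_VALUE (about 9.22 × 10^18), so a double-to-long conversion silently clamps it. If you would rather fail loudly, a BigDecimal-based variant is one option. This is only a sketch; the class name ExactSizeParser is hypothetical, and it follows the usual convention that an 'i' in the unit means base 1024 and its absence means base 1000.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical variant that reports overflow instead of clamping.
public class ExactSizeParser {

    private static final String UNITS = "BKMGTPEZY";

    public static long parseExact(String text) {
        int spaceNdx = text.indexOf(' ');
        if (spaceNdx < 0) {
            throw new IllegalArgumentException("Expected '<number> <unit>': " + text);
        }
        BigDecimal value = new BigDecimal(text.substring(0, spaceNdx).trim());
        String unit = text.substring(spaceNdx + 1).trim();
        int power = unit.isEmpty() ? -1 : UNITS.indexOf(unit.charAt(0));
        if (power < 0) {
            throw new IllegalArgumentException("Unknown unit: '" + unit + "'");
        }
        int base = unit.indexOf('i') != -1 ? 1024 : 1000;
        // longValueExact throws ArithmeticException when the result exceeds the long range.
        return value.multiply(BigDecimal.valueOf(base).pow(power))
                .setScale(0, RoundingMode.DOWN)
                .longValueExact();
    }
}
```

Here parseExact("300.00 GiB") returns 322122547200, while parseExact("300 EB") throws an ArithmeticException.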

I know this is much later, but I was looking for a similar function that takes the SI prefix into account as well. So I ended up creating one myself, and I thought it might be useful for other people.

public static String units = "KMGTPE";

/**
 * Converts from human readable to byte format
 * @param number The number value of the amount to convert
 * @param unit The unit: B, KB, MB, GB, TB, PB, EB
 * @param si Si prefix
 * @return byte value
 */
public static double parse(double number, String unit, boolean si)
{
    String identifier = unit.substring(0, 1);
    int index = units.indexOf(identifier);
    //not already in bytes
    if (index!=-1)
    {
        for (int i = 0; i <= index; i++)
            number = number * (si ? 1000 : 1024);
    }
    return number;
}

I'm sure this could be done with recursion as well, but it was too simple to bother...
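For what it's worth, a recursive variant could look roughly like this. It is just a sketch of the same idea under the same assumptions as the loop version; the class name RecursiveSizeParser and the helper scale are made up for the example.

```java
// Hypothetical recursive take on the loop-based parse method.
public class RecursiveSizeParser {

    private static final String UNITS = "KMGTPE";

    public static double parse(double number, String unit, boolean si) {
        // The prefix letter's position tells us how many times to multiply.
        int index = UNITS.indexOf(unit.charAt(0));
        return scale(number, index, si ? 1000 : 1024);
    }

    // Each call multiplies by the base once and steps down one prefix.
    // An index of -1 means the value is already in bytes.
    private static double scale(double number, int index, int base) {
        if (index < 0) {
            return number;
        }
        return scale(number * base, index - 1, base);
    }
}
```

The recursion depth is at most six (for "EB"), so there is no practical difference from the loop; it mostly comes down to taste.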

The following approach can also be used. It is more generic and does not depend on a space character to parse the input.

Thanks to @RobAu for the hint above. I added a new method to get the index of the first letter in the string, and changed the parse method to find the unit based on that index. I have kept the original parse method and added a new parseAny method, so the results can be compared. Hope it helps someone.

Also, thanks to this answer for the indexOf method - https://stackoverflow.com/a/11214786/6385674 .

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConversionUtil {

    public static String units = "BKMGTPEZY";

    public static long parse(String arg0) {
        int spaceNdx = arg0.indexOf(" ");
        double ret = Double.parseDouble(arg0.substring(0, spaceNdx));
        String unitString = arg0.substring(spaceNdx + 1);
        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        // An 'i' (KiB, MiB, ...) marks a binary prefix (base 1024); otherwise SI (base 1000).
        boolean isBinary = unitString.indexOf('i') != -1;
        int factor = isBinary ? 1024 : 1000;

        return (long) (ret * Math.pow(factor, power));
    }

    /** @return index of pattern in s or -1, if not found */
    public static int indexOf(Pattern pattern, String s) {
        Matcher matcher = pattern.matcher(s);
        return matcher.find() ? matcher.start() : -1;
    }

    public static long parseAny(String arg0) {
        // Split at the first letter instead of at a space.
        int index = indexOf(Pattern.compile("[A-Za-z]"), arg0);
        // Double.parseDouble ignores the trailing space left in "300.00 ".
        double ret = Double.parseDouble(arg0.substring(0, index));
        String unitString = arg0.substring(index);
        int unitChar = unitString.charAt(0);
        int power = units.indexOf(unitChar);
        boolean isBinary = unitString.indexOf('i') != -1;
        int factor = isBinary ? 1024 : 1000;

        return (long) (ret * Math.pow(factor, power));
    }

    public static void main(String[] args) {
        System.out.println(parse("300.00 GiB")); // requires a space
        System.out.println(parse("300.00 GB"));
        System.out.println(parse("300.00 B"));
        System.out.println(parse("300 EB"));
        System.out.println(parseAny("300.00 GiB"));
        System.out.println(parseAny("300M"));
    }
}

I wrote a utility enum class for human-readable file sizes. Hope it helps!

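import java.text.DecimalFormat;
import java.text.ParseException;
import java.util.Arrays;
import java.util.regex.Pattern;

/**
 * Human-readable file size utility class;
 * provides conversions between human-readable sizes and byte counts.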
 * 
 * The similar function in stackoverflow, linked:
 *  https://stackoverflow.com/questions/3758606/how-to-convert-byte-size-into-human-readable-format-in-java?r=SearchResults
 * 
 * Apache also provide similar function
 * @see org.apache.commons.io.FileUtils#byteCountToDisplaySize(long)
 * 
 * @author Ponfee
 */
public enum HumanReadables {

    SI    (1000, "B", "KB",  "MB",  "GB",  "TB",  "PB",  "EB" /*, "ZB",  "YB" */), // 

    BINARY(1024, "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"/*, "ZiB", "YiB"*/), // 
    ;

    private static final String FORMAT = "#,##0.##";
    private static final Pattern PATTERN = Pattern.compile(".*[0-9]+.*");

    private final int      base;
    private final String[] units;
    private final long[]   sizes;

    HumanReadables(int base, String... units) {
        this.base  = base;
        this.units = units;
        this.sizes = new long[this.units.length];

        this.sizes[0] = 1;
        for (int i = 1; i < this.sizes.length; i++) {
            this.sizes[i] = this.sizes[i - 1] * this.base; // Maths.pow(this.base, i);
        }
    }

    /**
     * Returns a string of bytes count human readable size
     * 
     * @param size the size
     * @return human readable size
     */
    public strictfp String human(long size) {
        if (size == 0) {
            return "0" + this.units[0];
        }

        String signed = "";
        if (size < 0) {
            signed = "-";
            size = size == Long.MIN_VALUE ? Long.MAX_VALUE : -size;
        }

        /*int unit = (int) Maths.log(size, this.base);
        return signed + format(size / Math.pow(this.base, unit)) + " " + this.units[unit];*/

        int unit = find(size);
        return new StringBuilder(13) // 13 max length like as "-1,023.45 GiB"
            .append(signed)
            .append(formatter().format(size / (double) this.sizes[unit]))
            .append(" ")
            .append(this.units[unit])
            .toString();
    }

    public strictfp long parse(String size) {
        return parse(size, false);
    }

    /**
     * Parse the readable byte count, allowed suffix units: "1", "1B", "1MB", "1MiB", "1M"
     * 
     * @param size   the size
     * @param strict the strict, if BINARY then verify whether contains "i"
     * @return a long value bytes count
     */
    public strictfp long parse(String size, boolean strict) {
        if (size == null || size.isEmpty()) {
            return 0L;
        }
        if (!PATTERN.matcher(size).matches()) {
            throw new IllegalArgumentException("Invalid format [" + size + "]");
        }

        String str = size = size.trim();
        long factor = this.sizes[0];
        switch (str.charAt(0)) {
            case '+': str = str.substring(1);               break;
            case '-': str = str.substring(1); factor = -1L; break;
        }

        int end = 0, lastPos = str.length() - 1;
        // last character isn't a digit
        char c = str.charAt(lastPos - end);
        if (c == 'i') {
            // last pos cannot end with "i"
            throw new IllegalArgumentException("Invalid format [" + size + "], cannot end with \"i\".");
        }

        if (c == 'B') {
            end++;
            c = str.charAt(lastPos - end);

            boolean flag = isBlank(c);
            while (isBlank(c) && end < lastPos) {
                end++;
                c = str.charAt(lastPos - end);
            }
            // if "B" head has space char, then the first head non space char must be a digit
            if (flag && !Character.isDigit(c)) {
                throw new IllegalArgumentException("Invalid format [" + size + "]: \"" + c + "\".");
            }
        }

        if (!Character.isDigit(c)) {
            // if not a digit character, then assume is a unit character
            if (c == 'i') {
                if (this == SI) {
                    // SI cannot contains "i"
                    throw new IllegalArgumentException("Invalid SI format [" + size + "], cannot contains \"i\".");
                }
                end++;
                c = str.charAt(lastPos - end);
            } else {
                if (this == BINARY && strict) {
                    // if strict, then BINARY must contains "i"
                    throw new IllegalArgumentException("Invalid BINARY format [" + size + "], miss character \"i\".");
                }
            }

            switch (c) {
                case 'K': factor *= this.sizes[1]; break;
                case 'M': factor *= this.sizes[2]; break;
                case 'G': factor *= this.sizes[3]; break;
                case 'T': factor *= this.sizes[4]; break;
                case 'P': factor *= this.sizes[5]; break;
                case 'E': factor *= this.sizes[6]; break;
                /*
                case 'Z': factor *= this.bytes[7]; break;
                case 'Y': factor *= this.bytes[8]; break;
                */
                default: throw new IllegalArgumentException("Invalid format [" + size + "]: \"" + c + "\".");
            }

            do {
                end++;
                c = str.charAt(lastPos - end);
            } while (isBlank(c) && end < lastPos);
        }

        str = str.substring(0, str.length() - end);
        try {
            return (long) (factor * formatter().parse(str).doubleValue());
        } catch (NumberFormatException | ParseException e) {
            throw new IllegalArgumentException("Failed to parse [" + size + "]: \"" + str + "\".");
        }
    }

    public int base() {
        return this.base;
    }

    public String[] units() {
        return Arrays.copyOf(this.units, this.units.length);
    }

    public long[] sizes() {
        return Arrays.copyOf(this.sizes, this.sizes.length);
    }

    private int find(long bytes) {
        int n = this.sizes.length;
        for (int i = 1; i < n; i++) {
            if (bytes < this.sizes[i]) {
                return i - 1;
            }
        }
        return n - 1;
    }

    private DecimalFormat formatter() {
        return new DecimalFormat(FORMAT);
    }

    private boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }

}

Another option based on @gilbertpilz's code. In this case a regex is used to get the value and the factor. It is also case insensitive.

    private final static long KB_FACTOR = 1000;
    private final static long KIB_FACTOR = 1024;
    private final static long MB_FACTOR = 1000 * KB_FACTOR;
    private final static long MIB_FACTOR = 1024 * KIB_FACTOR;
    private final static long GB_FACTOR = 1000 * MB_FACTOR;
    private final static long GIB_FACTOR = 1024 * MIB_FACTOR;

    private long parse(String arg0) throws ParseException {
        // CASE_INSENSITIVE lets "16gb", "16GiB" and "16GIB" all match
        Pattern pattern = Pattern.compile("([0-9]+)(([KMG])I?B)", Pattern.CASE_INSENSITIVE);
        Matcher match = pattern.matcher(arg0);

        if( !match.matches() || match.groupCount()!=3)
            throw new ParseException("Wrong format", 0);

        // group(0) is the whole match; the digits are in group(1)
        long ret = Long.parseLong(match.group(1));
        switch (match.group(2).toUpperCase()) {
            case "GB":
                return ret * GB_FACTOR;
            case "GIB":
                return ret * GIB_FACTOR;
            case "MB":
                return ret * MB_FACTOR;
            case "MIB":
                return ret * MIB_FACTOR;
            case "KB":
                return ret * KB_FACTOR;
            case "KIB":
                return ret * KIB_FACTOR;
        }

        throw new ParseException("Wrong format", 0);
    }
