What is the smallest number of bytes that can store a timestamp?

I want to create my own timestamp data structure in C.

DAY ( 0 - 31 ), HOUR ( 0 - 23 ), MINUTE ( 0 - 59 )

What is the smallest data structure possible?

Well, you could pack it all into an unsigned short (that's 2 bytes: 5 bits for the day, 5 bits for the hour, 6 bits for the minute) and use some shifts and masks to get the values.

unsigned short timestamp = <some value>; // Bits: DDDDDHHHHHMMMMMM

int day = (timestamp >> 11) & 0x1F;
int hour = (timestamp >> 6) & 0x1F;
int min = (timestamp) & 0x3F;

unsigned short dup_timestamp = (unsigned short)((day << 11) | (hour << 6) | min);

or using macros:

#define DAY(x)    (((x) >> 11) & 0x1F)
#define HOUR(x)   (((x) >> 6)  & 0x1F)
#define MINUTE(x) ((x)         & 0x3F)
#define TIMESTAMP(d, h, m) ((((d) & 0x1F) << 11) | (((h) & 0x1F) << 6) | ((m) & 0x3F))
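As a self-contained sketch of the same shift-and-mask layout, here it is with small functions instead of macros. The names ts_pack, ts_day, ts_hour and ts_minute are made up for illustration, and the asserts assume the ranges given in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, as in the answer above: DDDDDHHHHHMMMMMM */
static uint16_t ts_pack(unsigned day, unsigned hour, unsigned minute)
{
    /* Ranges taken from the question: day 0-31, hour 0-23, minute 0-59. */
    assert(day <= 31 && hour <= 23 && minute <= 59);
    return (uint16_t)((day << 11) | (hour << 6) | minute);
}

/* Extract each field by shifting it down and masking off the rest. */
static unsigned ts_day(uint16_t ts)    { return (ts >> 11) & 0x1Fu; }
static unsigned ts_hour(uint16_t ts)   { return (ts >> 6) & 0x1Fu; }
static unsigned ts_minute(uint16_t ts) { return ts & 0x3Fu; }
```

Calling ts_pack(31, 23, 59) gives 0xFDFB, and the three accessors return 31, 23 and 59 from it. Functions cost nothing over the macros with any optimizing compiler and avoid evaluating arguments twice.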

(You didn't mention month/year in your current version of the question, so I've omitted them.)

[Edit: use unsigned short, not signed short.]

Do you mean HOUR 0-23 and MINUTE 0-59? I've heard of leap seconds but not leap minutes or hours.

(log (* 31 60 24) 2)
=> 15.446

So you can fit these values in 16 bits, or 2 bytes. Whether this is a good idea or not is a completely different question.

  • Month: range 1 - 12 => 4 bits
  • Date: range 1 - 31 => 5 bits
  • Hour: range 0 - 23 => 5 bits
  • Minute: range 0 - 59 => 6 bits

  • Total: 20 bits

You can use a bitfield and a compiler/platform-specific pragma to keep it tight:

typedef struct packed_time_t {
    unsigned int month  : 4;
    unsigned int date   : 5;
    unsigned int hour   : 5;
    unsigned int minute : 6;
} packed_time_t; 

But do you really need this? Wouldn't the standard time functions be enough? Bitfields vary depending on architecture, padding and so on; they are not a portable construct.

Note: The original question has been edited, and the month is no longer necessary. The original calculations were below:

It's simply a matter of how much computation you want to do. The tightest way to pack it is to make your own type and use the following math to convert to and from its corresponding integer:

Valid ranges are:

Month: 1-12 -> (0-11)+1
Day: 1-31 -> (0-30)+1
Hour: 0-23
Minute: 0-59

You can choose the order in which to store the values (I'll keep the order above).

Month-1 Day-1  Hour   Minute
(0-11)  (0-30) (0-23) (0-59)

Do a bit of multiplication/division to convert the values using the following formula as a guide:

value = (((Month - 1) * 31 + (Day - 1)) * 24 + Hour) * 60 + Minute

So, you have the minimum value 0 and the maximum value ((11*31+30)*24+23)*60+59, which is 535,679. So you need 20 bits minimum to store this value as an unsigned integer (2^20-1 = 1,048,575; 2^19-1 = 524,287).
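A minimal sketch of that formula and its inverse, assuming month 1-12, day 1-31, hour 0-23 and minute 0-59. The struct and function names are hypothetical, not from the original answer.

```c
#include <assert.h>
#include <stdint.h>

struct mdhm { unsigned month, day, hour, minute; };

/* value = (((Month - 1) * 31 + (Day - 1)) * 24 + Hour) * 60 + Minute */
static uint32_t mdhm_encode(struct mdhm t)
{
    assert(t.month >= 1 && t.month <= 12 && t.day >= 1 && t.day <= 31);
    assert(t.hour <= 23 && t.minute <= 59);
    return (((uint32_t)(t.month - 1) * 31 + (t.day - 1)) * 24 + t.hour) * 60
           + t.minute;
}

/* Peel the fields off again, innermost (minute) first. */
static struct mdhm mdhm_decode(uint32_t v)
{
    struct mdhm t;
    t.minute = v % 60;     v /= 60;
    t.hour   = v % 24;     v /= 24;
    t.day    = v % 31 + 1; v /= 31;
    t.month  = v + 1;
    return t;
}
```

Encoding December 31st, 23:59 yields 535,679, the maximum quoted above, and January 1st, 00:00 yields 0.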

If you want to make things difficult but save a byte, you can use 3 bytes and manipulate them yourself. Or you can use an int (32-bit) and work with it normally using simple math operators.
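For the 3-byte route, "manipulating them yourself" could look roughly like this. It is only a sketch: the helper names are invented, and the byte order (most significant byte first) is an arbitrary choice.

```c
#include <stdint.h>

/* Store the low 24 bits of v in three bytes, most significant byte first. */
static void put_u24(uint8_t out[3], uint32_t v)
{
    out[0] = (uint8_t)((v >> 16) & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)(v & 0xFF);
}

/* Reassemble the value written by put_u24. */
static uint32_t get_u24(const uint8_t in[3])
{
    return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

The maximum value 535,679 is 0x082C7F, so it is stored as the bytes 0x08, 0x2C, 0x7F.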

But there's some room to play with, so let's see if we can make this easier:

Valid ranges are, again:

Month: 1-12 -> (0-11)+1 --- 4 bits (you don't even need the -1)
Day: 1-31 -> (0-30)+1   --- 5 bits (you again don't need the -1) 
Hour: 0-23              --- 5 bits
Minute: 0-59            --- 6 bits

That's a total of 20 bits, and really easy to manipulate. So you don't gain anything by compacting any further than using simple bit-shifting, and you can store the value like this:

19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---Month--- ----Day------- ---Hour--- --Minute---

If you don't care about the month, the tightest you can get is:

value = ((Day - 1) * 24 + Hour) * 60 + Minute

leaving you with a range of 0 to 44,639, which fits neatly in an unsigned 16-bit short (a signed one tops out at 32,767).

There's some room to play with there though, so let's see if we can make this easier:

Valid ranges are, again:

Day: 1-31 -> (0-30)+1 --- 5 bits (you don't even need the -1) 
Hour: 0-23            --- 5 bits
Minute: 0-59          --- 6 bits

That's a total of 16 bits, and again really easy to manipulate. So store the value like this:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
----Day------- ---Hour--- --Minute---

Why not just use the (4-byte?) output of the C time() function with NULL as an argument? It's just the Unix epoch time (i.e. the number of seconds since January 1st, 1970). Like Joe's answer, it gives you much more room to grow than any answer that tries to pack months, days and years into bits. It's standard. Converting the time_t variable to an actual time is trivial in standard C (on Unix, at least), and most of the time, if you have a data structure intended to hold a 3-byte variable, it may be rounded up to 4 bytes anyway.

I know you're trying to optimize heavily for size, but 4 bytes is pretty damn small. Even if you truncate off the top byte, you still get 194 days of distinct times out of it.

You can get even more out of this by taking the time from time(NULL) and dividing it by 60 before storing it, which truncates it to minute resolution. 3 bytes of that gives you, as another answer works out, about 388 months, and with 2 bytes you can store 45 days.
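A rough sketch of the minute-truncation idea, assuming time_t counts non-negative seconds since the epoch (true on POSIX systems, not guaranteed by ISO C). The function names are illustrative only.

```c
#include <stdint.h>
#include <time.h>

/* Minutes since the epoch, keeping only the low 24 bits (3 bytes).
   Assumes `seconds` is a non-negative count of seconds. */
static uint32_t minutes24_from(time_t seconds)
{
    return (uint32_t)(((uint64_t)seconds / 60) & 0xFFFFFFu);
}

/* Same thing for the current time. */
static uint32_t minutes24_now(void)
{
    return minutes24_from(time(NULL));
}
```

Because only the low 24 bits survive, the stored value wraps every 16,777,216 minutes, so this is only unambiguous if all timestamps are known to fall within one such window.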

I would go with the 4-byte version, simply because I don't see the difference between 2, 3 and 4 bytes as being at all significant or vital to any program (unless it's a bootloader). It's simpler to get and simpler to handle, and will probably save you many headaches in the end.

EDIT: The code I posted didn't work. I've had 3 hours of sleep and I'll figure out how to do the bit-twiddling correctly eventually. Until then, you can implement this yourself.

For the use case you describe (minute resolution of times in a 31-day range), I'd just use a 16-bit minute counter. If you're serializing this data (to disk or the network), then you can use some variable-length integer encoding to save bytes for small values.
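The answer doesn't say which variable-length encoding it has in mind. One common choice is a LEB128-style varint, sketched below with invented function names: 7 payload bits per byte, with the high bit flagging that more bytes follow.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode v low bits first, 7 bits per byte; a set high bit means
   "another byte follows". Writes 1 to 3 bytes and returns the count. */
static size_t varint_put(uint8_t *out, uint16_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)((v & 0x7F) | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Decode one value and return the number of bytes consumed.
   No bounds checking: this trusts the input to be well formed. */
static size_t varint_get(const uint8_t *in, uint16_t *v)
{
    uint32_t acc = 0;
    unsigned shift = 0;
    size_t n = 0;
    do {
        acc |= (uint32_t)(in[n] & 0x7F) << shift;
        shift += 7;
    } while (in[n++] & 0x80);
    *v = (uint16_t)acc;
    return n;
}
```

Counters below 128 take one byte and those below 16,384 take two, but anything larger takes three, so this only pays off if small values dominate.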

In general you can compute this answer as follows (where log2 is the base-2 logarithm, i.e. the number of bits):

  • If you want to use shifts and masks to get the data in and out, take log2() of the number of possible values for each field, round up (to get bits), add the results (to get total bits), divide by eight (total bytes, with fractional bytes), and round up again (total bytes).

    ceil(log2(60)) + ceil(log2(60)) + ceil(log2(24)) + ceil(log2(31)) + ceil(log2(12)) = 6+6+5+5+4 = 26 bits = 4 bytes

  • If you want to get the fields in and out by multiplying & adding / dividing & modulo, multiply together the number of possible values for each field and take log2() of that, divide by eight, and round up.

    log2(60*60*24*31*12) = 24.9379 bits = 4 bytes

  • You can save a tiny additional amount of space by combining non-isoformal fields (e.g. storing day of year rather than month and day of month) but it is seldom worth it.

    log2(60*60*24*366) = 24.91444 bits = 4 bytes
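The two recipes above can be checked without floating point. This is a sketch with hypothetical helper names, where bits_for(n) computes ceil(log2(n)) by counting up powers of two.

```c
#include <stdint.h>

/* Smallest number of bits that can distinguish `count` values,
   i.e. ceil(log2(count)) for count >= 1. */
static unsigned bits_for(uint64_t count)
{
    unsigned bits = 0;
    while (bits < 64 && ((uint64_t)1 << bits) < count)
        bits++;
    return bits;
}

/* Round a bit count up to whole bytes. */
static unsigned bytes_for_bits(unsigned bits)
{
    return (bits + 7) / 8;
}
```

Summing bits_for over the fields 60, 60, 24, 31 and 12 gives 26 bits for the shift-and-mask layout, while bits_for(60*60*24*31*12) gives 25 for the multiply-and-divide layout. Both round up to 4 bytes.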

-- MarkusQ "teach a man to fish"

Just to offer an alternative:

  • if you only need minute-level resolution,
  • and you don't cross date boundaries (month/year)
  • and your messages are sequential with guaranteed delivery

then you can store the timestamp as an offset from the timestamp of the last message.

In this case, you only need enough bits to hold the maximum number of minutes between messages. For example, if you emit messages at most 255 minutes apart, then one byte will suffice.

Note, however, that the very first message may need to include an absolute timestamp in its payload, for synchronization.

[I'm not saying this is a good solution - it's fairly fragile and makes a lot of assumptions - just an alternative one]
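A sketch of that offset scheme, under the assumptions listed above: timestamps are minute counts, arrive in order, and are never more than 255 minutes apart. The function names are made up, and the first timestamp is carried separately as the absolute reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Write n-1 one-byte gaps between n in-order minute counts.
   Returns 0 on success, -1 if a gap is negative or exceeds 255. */
static int delta_encode(const uint32_t *minutes, size_t n, uint8_t *deltas)
{
    for (size_t i = 1; i < n; i++) {
        if (minutes[i] < minutes[i - 1])
            return -1;
        uint32_t gap = minutes[i] - minutes[i - 1];
        if (gap > 255)
            return -1;
        deltas[i - 1] = (uint8_t)gap;
    }
    return 0;
}

/* Rebuild n minute counts from the absolute first one plus n-1 gaps. */
static void delta_decode(uint32_t first, const uint8_t *deltas, size_t n,
                         uint32_t *minutes)
{
    if (n == 0)
        return;
    minutes[0] = first;
    for (size_t i = 1; i < n; i++)
        minutes[i] = minutes[i - 1] + deltas[i - 1];
}
```

The fragility mentioned above shows up directly: one lost or reordered message shifts every later timestamp, which is why the scheme depends on guaranteed, sequential delivery.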

60 Minutes/Hour means you'd need at least 6 bits to store the minute (since 59th minute == 111011b), while 24 Hours/Day means another 5 bits (23rd hour == 10111b). If you want to account for any of the (possibly) 366 Days/Year, you'd need 9 more bits (366th day (365 when day 1 == 0) == 101101101b). So if you wanted to store everything in a directly accessible format, you'd need 20 bits, which rounds up to 3 bytes. Alternatively, adding a Month field would shrink the Day field's range from 366 to 31, i.e. down to 5 bits, with 4 more bits for the month. This would also give you 20 bits, or 3 bytes with 4 bits to spare.

Conversely, if you kept track of the date just by minutes from some start date, 3 bytes would give you a range of 16,777,215 minutes before you rolled over to 0 again: that's about 279,620 hours, 11,650 days, or roughly 388 months, using all 24 bits. That's probably a better way to go, if you don't care about seconds, and if you don't mind taking a little bit of execution time to interpret the hour, day and month. And this would be much easier to increment!
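The "little bit of execution time" amounts to a few divisions. A minimal sketch, with a hypothetical struct and function name, that splits a minute counter into whole days, hours and minutes since the start date (turning the day count into a calendar month is left out):

```c
#include <stdint.h>

struct dhm { uint32_t day; unsigned hour, minute; };

/* Split a count of minutes since some start date into
   whole days elapsed, hour of day and minute of hour. */
static struct dhm split_minutes(uint32_t total)
{
    struct dhm r;
    r.minute = total % 60;
    r.hour   = (total / 60) % 24;
    r.day    = total / (60 * 24);
    return r;
}
```

For the largest 24-bit value, 16,777,215, this gives day 11,650, hour 20, minute 15, matching the figure of about 11,650 days above.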

5 bits for the day plus 5 bits for the hour plus 6 bits for the minute equals 16 bits, which fits in an unsigned short. Any further packing would not reduce the storage space required and would increase code complexity and CPU usage.

Well, disregarding the superfluous HOUR 24 and MINUTE 60, we have 31 x 24 x 60 = 44,640 possible unique time values. 2^15 = 32,768 < 44,640 < 65,536 = 2^16, so we'll need at least 16 bits (2 bytes) to represent these values.

If we don't want to be doing modulo arithmetic to access the values each time, we need to be sure to store each in its own bit field. We need 5 bits to store the DAY, 5 bits to store the HOUR, and 6 bits to store the MINUTE, which still fits in 2 bytes:

struct day_hour_minute {
  unsigned int DAY:5;     /* 0-31 */
  unsigned int HOUR:5;    /* 0-23 */
  unsigned int MINUTE:6;  /* 0-59 */
};

Including the MONTH would increase our unique time values by a factor of 12, giving 535,680 unique values, which would require at least 20 bits to store (2^19 = 524,288 < 535,680 < 1,048,576 = 2^20), which requires at least 3 bytes.

Again, to avoid modulo arithmetic, we need a separate bit field for MONTH, which should only require 4 bits:

struct month_day_hour_minute {
  unsigned int MONTH:4;   /* 1-12 */
  unsigned int DAY:5;     /* 0-31 */
  unsigned int HOUR:5;    /* 0-23 */
  unsigned int MINUTE:6;  /* 0-59 */
  unsigned int unused:4;  /* pads the 20 used bits out to 24 (3 bytes) */
};

In both of these examples however, be aware that C compilers prefer data structures to be aligned, that is, sized in multiples of 4 or 8 bytes (usually), so the compiler may pad your data structures beyond what is minimally necessary.

For example, on my machine,

#include <stdio.h>

struct day_hour_minute {
  unsigned int DAY:5;
  unsigned int HOUR:5;
  unsigned int MINUTE:6;
};
struct month_day_hour_minute {
  unsigned int MONTH:4;
  unsigned int DAY:5;
  unsigned int HOUR:5;
  unsigned int MINUTE:6;
  unsigned int unused: 4;
};

#define DI( i ) printf( #i " = %zu\n", (size_t)(i) ) /* sizeof yields size_t, so use %zu rather than %d */
int main(void) {
  DI( sizeof(struct day_hour_minute) );
  DI( sizeof(struct month_day_hour_minute) );
  return 0;
}

prints:

sizeof(struct day_hour_minute) = 4
sizeof(struct month_day_hour_minute) = 4

To simplify this without loss of generality,

Day (0 - 30), Hour (0 - 23), Minute (0 - 59)

encoding = Day + (Hour + (Minute)*24)*31

Day = encoding %31
Hour = (encoding / 31) % 24
Minute = (encoding / 31) / 24

The maximum value of encoding is 44639, which needs slightly less than 16 bits.
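Written out in C, the formulas above might look like the sketch below. The function names are invented, and the ranges are the zero-based ones just stated.

```c
#include <assert.h>
#include <stdint.h>

/* encoding = Day + (Hour + Minute * 24) * 31, with zero-based fields. */
static uint16_t dhm_encode(unsigned day, unsigned hour, unsigned minute)
{
    assert(day <= 30 && hour <= 23 && minute <= 59);
    return (uint16_t)(day + (hour + minute * 24u) * 31u);
}

/* Invert the encoding with the modulo/divide steps given above. */
static void dhm_decode(uint16_t e, unsigned *day, unsigned *hour,
                       unsigned *minute)
{
    *day    = e % 31u;
    *hour   = (e / 31u) % 24u;
    *minute = (e / 31u) / 24u;
}
```

dhm_encode(30, 23, 59) returns 44639, the maximum quoted above, so every value fits in a uint16_t.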

Edit: rampion said basically the same thing. This gets you the minimal representation, which is smaller than the bitwise interleaving representation.
