
How to use integer RowKeys in Azure Table Storage?

I have consecutively numbered entities that I want to persist with the Azure Table Service, but the type of the RowKey column is problematic. The number of the entity should be stored in the RowKey column, so I can query entities fast ( PK = '..' && RowKey = 5 ), get the newest entities ( RowKey > 10 ) and query a certain set of entities ( RowKey > 5 && RowKey < 10 ).

Since RowKey must be a string, less-than comparisons are problematic ( "100" < "11" ). I thought about prepending zeros to the numbers (so that "100" > "011" ), but I can't predict the number of entities (and thus the number of zeros).
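To make the comparison problem concrete, here is a minimal sketch. It assumes keys are compared ordinally, character by character, and that the entity number is a non-negative `int`; the `PaddedKeyDemo` class and its method names are hypothetical. Under that assumption the padding width does not need to be guessed, because `int.MaxValue` has exactly 10 digits.

```csharp
using System;
using System.Globalization;

public static class PaddedKeyDemo
{
    // Hypothetical helper: formats a non-negative int as a 10-digit, zero-padded key.
    // int.MaxValue (2147483647) has 10 digits, so the width never has to grow.
    public static string ToRowKey(int id)
    {
        if (id < 0)
            throw new ArgumentOutOfRangeException(nameof(id), "Negative ids would not sort correctly.");
        return id.ToString("D10", CultureInfo.InvariantCulture);
    }

    // True when a comes before b in an ordinal (character code) comparison.
    public static bool SortsBefore(string a, string b)
    {
        return string.CompareOrdinal(a, b) < 0;
    }
}
```

With unpadded strings, `SortsBefore("100", "11")` is true, which is the wrong numeric order; with `ToRowKey(11)` and `ToRowKey(100)` the string order agrees with the numeric order.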

I know I could just create an integer column, but I would lose the performance advantage of the indexed RowKey column (plus I don't have any other information suitable for RowKey). Did anyone have this problem before?

I had a similar problem, with the added requirement that I also wanted to support sorting the RowKey in descending order. In my case I did not care about supporting trillions of possible values, because I was correctly using the PartitionKey and also using scoping prefixes when needed to further segment the RowKey (like "scope-id" -> "12-8374").

In the end I settled on a specific implementation of the general approach suggested by enzi. I used a modified version of Base64 encoding that produces a four-character string, which supports over 16 million values and can be sorted in ascending or descending order. Here is the code, which has been unit tested but lacks range checking/validation.

/// <summary>
/// Gets the four character string representation of the specified integer id.
/// </summary>
/// <param name="number">The number to convert</param>
/// <param name="ascending">Indicates whether the encoded number will be sorted ascending or descending</param>
/// <returns>The encoded string representation of the number</returns>
public static string NumberToId(int number, bool ascending = true)
{
    if (!ascending)
        number = 16777215 - number;

    return new string(new[] { 
        SixBitToChar((byte)((number & 16515072) >> 18)), 
        SixBitToChar((byte)((number & 258048) >> 12)), 
        SixBitToChar((byte)((number & 4032) >> 6)), 
        SixBitToChar((byte)(number & 63)) });
}

/// <summary>
/// Gets the numeric identifier represented by the encoded string.
/// </summary>
/// <param name="id">The encoded string to convert</param>
/// <param name="ascending">Indicates whether the encoded number is sorted ascending or descending</param>
/// <returns>The decoded integer id</returns>
public static int IdToNumber(string id, bool ascending = true)
{
    var number = ((int)CharToSixBit(id[0]) << 18) | ((int)CharToSixBit(id[1]) << 12) | ((int)CharToSixBit(id[2]) << 6) | (int)CharToSixBit(id[3]);

    return ascending ? number : -1 * (number - 16777215);
}

/// <summary>
/// Converts the specified byte (representing 6 bits) to the correct character representation.
/// </summary>
/// <param name="b">The bits to convert</param>
/// <returns>The encoded character value</returns>
[MethodImplAttribute(MethodImplOptions.AggressiveInlining)] 
static char SixBitToChar(byte b)
{
    if (b == 0)
        return '!';
    if (b == 1)
        return '$';
    if (b < 12)
        return (char)((int)b - 2 + (int)'0');
    if (b < 38)
        return (char)((int)b - 12 + (int)'A');
    return (char)((int)b - 38 + (int)'a');
}

/// <summary>
/// Converts the specified encoded character into the corresponding bit representation.
/// </summary>
/// <param name="c">The encoded character to convert</param>
/// <returns>The bit representation of the character</returns>
[MethodImplAttribute(MethodImplOptions.AggressiveInlining)] 
static byte CharToSixBit(char c)
{
    if (c == '!')
        return 0;
    if (c == '$')
        return 1;
    if (c <= '9')
        return (byte)((int)c - (int)'0' + 2);
    if (c <= 'Z')
        return (byte)((int)c - (int)'A' + 12);
    return (byte)((int)c - (int)'a' + 38);
}

You can just pass false to the ascending parameter to ensure the encoded value sorts in the opposite direction. I selected ! and $ to complete the Base64 set since they are valid for RowKey values. This algorithm can easily be amended to support additional characters, though I firmly believe that larger numbers do not make sense for RowKey values, as table storage keys must be efficiently segmented. Here are some examples of output:

0 -> !!!! asc & zzzz desc

1000 -> !!Dc asc & zzkL desc

2000 -> !!TE asc & zzUj desc

3000 -> !!is asc & zzF5 desc

4000 -> !!yU asc & zz$T desc

5000 -> !$C6 asc & zylr desc

6000 -> !$Rk asc & zyWD desc

7000 -> !$hM asc & zyGb desc

8000 -> !$x! asc & zy0z desc

9000 -> !0Ac asc & zxnL desc
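Since the code above lacks range checking, here is one possible way to add it. This is only a sketch, not the author's implementation: it uses a lookup string instead of the arithmetic character mapping, and the `SortableId` class and its member names are hypothetical. The 64 characters are listed in the same order as the mapping above ( ! , $ , 0-9 , A-Z , a-z ).

```csharp
using System;

public static class SortableId
{
    // 64 characters in ascending ordinal order:
    // '!' = 0, '$' = 1, '0'-'9' = 2-11, 'A'-'Z' = 12-37, 'a'-'z' = 38-63.
    private const string Alphabet =
        "!$0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Largest value that fits in four 6-bit characters (16777215).
    public const int MaxValue = (1 << 24) - 1;

    public static string Encode(int number, bool ascending = true)
    {
        if (number < 0 || number > MaxValue)
            throw new ArgumentOutOfRangeException(nameof(number));
        if (!ascending)
            number = MaxValue - number;

        // Fill the four characters from the least significant 6 bits upwards.
        var chars = new char[4];
        for (int i = 3; i >= 0; i--)
        {
            chars[i] = Alphabet[number & 63];
            number >>= 6;
        }
        return new string(chars);
    }

    public static int Decode(string id, bool ascending = true)
    {
        if (id == null || id.Length != 4)
            throw new ArgumentException("Expected a four character id.", nameof(id));

        int number = 0;
        foreach (char c in id)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0)
                throw new ArgumentException("Unexpected character '" + c + "'.", nameof(id));
            number = (number << 6) | digit;
        }
        return ascending ? number : MaxValue - number;
    }
}
```

For instance, `SortableId.Encode(1000)` gives "!!Dc" and `SortableId.Encode(1000, false)` gives "zzkL", matching the examples listed above.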

I found an easy way, but the previous solution is more efficient (regarding key length). Instead of using the whole alphabet we can use just digits; the key is to make the length fixed (0000, 0001, 0002, ...):

public class ReadingEntity : TableEntity
{
    public static string KeyLength = "000000000000000000000";
    public ReadingEntity(string partitionId, int keyId)
    {
        this.PartitionKey = partitionId;
        this.RowKey = keyId.ToString(KeyLength);
    }
    public ReadingEntity()
    {
    }
}


public IList<ReadingEntity> Get(string partitionName, int date, int enddate)
{
    // storageAccount is assumed to be a CloudStorageAccount field of the enclosing class.
    CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

    // Create the CloudTable object that represents the "Record" table.
    CloudTable table = tableClient.GetTableReference("Record");

    // Query entities in the given partition where date <= RowKey < enddate,
    // comparing the zero-padded string keys.
    TableQuery<ReadingEntity> query = new TableQuery<ReadingEntity>().Where(TableQuery.CombineFilters(
        TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, partitionName),
        TableOperators.And,
        TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, enddate.ToString(ReadingEntity.KeyLength)),
            TableOperators.And,
            TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThanOrEqual, date.ToString(ReadingEntity.KeyLength)))));
    return table.ExecuteQuery(query).ToList();
}

Hope this helps.

I solved this problem by creating a custom RowKey class that wraps a String and provides an Increment method.

I can now define a range of valid characters (e.g. 0-9 + a-z + A-Z ) and "count" within this range (e.g. az9 + 1 = aza , azZ + 1 = aA0 ). The advantage of this compared to using only numbers is that I have a far greater range of possible keys ( 62^n instead of 10^n ).

I still have to define the length of the string beforehand and mustn't change it, but now I can store pretty much any number of entities while keeping the string itself much shorter. For example, with 10 digits I can store ~8*10^17 keys and with 20 digits ~7*10^35 .

The number of valid characters can of course be increased further to use the number of digits even more effectively, but in my case the above range was sufficient and is still readable enough for debugging purposes.

I hope this answer helps others who run into the same problem.

EDIT: Just as a side note in case anyone wants to implement something similar: you will have to create custom character ranges and can't just count from 0 upwards, because there are illegal characters (e.g. / , \ ) between the numbers ( 0-9 ) and the lowercase letters.
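The answer does not include the class itself, so here is a rough sketch of what such an Increment method could look like. The `RowKeyCounter` name and the alphabet are assumptions. In particular, this sketch lists the characters in ordinal order ( 0-9 , A-Z , a-z ) rather than the 0-9 , a-z , A-Z order used in the examples above, because uppercase letters have lower character codes than lowercase ones and an ordinal string comparison would otherwise disagree with the counting order.

```csharp
using System;

public static class RowKeyCounter
{
    // Assumed alphabet: 62 characters in ascending ordinal order, so that
    // incrementing a key also moves it forward in an ordinal string sort.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Returns the next key of the same length, e.g. "az9" -> "azA".
    public static string Increment(string key)
    {
        if (string.IsNullOrEmpty(key))
            throw new ArgumentException("Key must not be empty.", nameof(key));

        var chars = key.ToCharArray();
        for (int i = chars.Length - 1; i >= 0; i--)
        {
            int digit = Alphabet.IndexOf(chars[i]);
            if (digit < 0)
                throw new ArgumentException("Unexpected character '" + chars[i] + "'.", nameof(key));

            if (digit < Alphabet.Length - 1)
            {
                // No carry needed: bump this position and stop.
                chars[i] = Alphabet[digit + 1];
                return new string(chars);
            }

            // Last character of the alphabet: wrap to the first and carry left.
            chars[i] = Alphabet[0];
        }

        // Every position wrapped, so the fixed key length is exhausted.
        throw new OverflowException("No more keys of this length.");
    }
}
```

Because the length is fixed, the method throws once every position has wrapped instead of growing the string, which would break the sort order.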

I found a potential solution if you're using Linq to query against Azure Table Storage.

You add something like this to your model for the table...

public int ID 
{
    get
    {
        return int.Parse(RowKey);
    }
}

And then you can do this in your Linq query...

.Where(e => e.ID > 1 && e.ID < 10);

With this technique you're not actually adding the "ID" column to the table, since it has no "set" operation in it.

The one thing I'm unsure about is what exactly is happening behind the scenes. I want to know what the query to Azure Table Storage looks like in its final form, but I'm not sure how to find that out. I haven't been able to find that information when debugging and using quickwatch.

UPDATE

I still haven't figured out what's happening, but I have a strong feeling that this isn't very efficient. I'm thinking the way to go is to create a sortable string as the OP did. Then you can use the RowKey.CompareTo() function in your Linq where clause to filter by a range.
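As an illustration of that last idea, here is an in-memory sketch of a range filter using CompareTo on zero-padded keys. It runs against a plain list rather than Azure Table Storage, so it says nothing about how the table Linq provider translates the expression; the `Row` class, the `RangeFilterDemo` name and the 10-digit padding are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RangeFilterDemo
{
    // Hypothetical stand-in for a table entity; only RowKey matters here.
    public sealed class Row
    {
        public string RowKey { get; set; }
    }

    // Returns the ids with low <= id < high by comparing the padded string keys.
    public static List<int> IdsInRange(IEnumerable<Row> rows, int low, int high)
    {
        string lowKey = low.ToString("D10", CultureInfo.InvariantCulture);
        string highKey = high.ToString("D10", CultureInfo.InvariantCulture);

        return rows
            .Where(e => e.RowKey.CompareTo(lowKey) >= 0 && e.RowKey.CompareTo(highKey) < 0)
            .Select(e => int.Parse(e.RowKey, CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

The filter only works because every key has the same length; with unpadded keys the string comparison would again put "100" before "11".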
