Cassandra updates not working consistently

I run the following code on my local (Mac) machine and on a remote Unix server:

public void deleteValue(final String id, final String value) {
    log.info("Removing value " + value);
    final Collection<String> valuesBeforeRemoval = getValues(id);
    final MutationBatch m = keyspace.prepareMutationBatch();
    m.withRow(VALUES_CF, id).deleteColumn(value);
    try {
      m.execute();
    } catch (final ConnectionException e) {
      log.error("Unable to delete  location " + value, e);
    }
    final Collection<String> valuesAfterRemoval = getValues(id);
    if (valuesAfterRemoval.size()!=(valuesBeforeRemoval.size()-1)) {
      log.error("value " + value + " was supposed to be removed from list "  + valuesBeforeRemoval + " but it wasn't: " + valuesAfterRemoval);
    }
...
  }

protected Collection<String> getValues(final String id) {
  try {
    final OperationResult<ColumnList<String>> operationResult = keyspace
            .prepareQuery(VALUES_CF).getKey(id).execute();
    final ColumnList<String> result = operationResult.getResult();
    if (result.isEmpty()) {
      log.info("No  value found for id: " + id);
      return new ArrayList<String>();
    }
    return result.getColumnNames();
  } catch (final ConnectionException e) {
    log.error("Unable to retrieve session " + id, e);
  }
  return new ArrayList<String>();
}

Locally, the following line is never executed, which makes sense:

log.error("value " + value + " was supposed to be removed from list "  + valuesBeforeRemoval + " but it wasn't: " + valuesAfterRemoval);

But that line is executed on my dev server:

[ERROR] [main] [nowsdSessionDaoCassandraImpl] [2013-03-08 13:12:24,801] [] - value 3 was supposed to be removed from list [3, 2, 1, 0, 7, 6, 5, 4, 9, 8] but it wasn't: [3, 2, 1, 0, 7, 6, 5, 4, 9, 8]

  • I am using com.netflix.astyanax.
  • Both my local machine and the remote dev server connect to the very same Cassandra instance.
  • Both my local machine and the remote dev server run the very same test, creating a new row family and adding 10 records before one is deleted.
  • When the error occurs on dev, log.error("Unable to delete location " + value, e); was not executed (i.e. running the deletion command didn't produce any exception).
  • I am 100% positive that no other code is affecting the content of the database while I am running the test on dev, so this isn't some strange concurrency issue.

What could possibly explain why the deleteColumn(value) request runs without producing any error but still does not remove the column from the database?

ADDITIONAL INFO

Here is how I created the keyspace:

create keyspace sessiondata
    with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
    and strategy_options = {replication_factor:1};

Here is how I created the column family values, referenced as VALUES_CF in the code above:

create column family values
    with comparator = UTF8Type
;

Here is how the keyspace referenced in the Java code above is defined:

final AstyanaxContext.Builder contextBuilder = getBuilder();
final AstyanaxContext<Keyspace> keyspaceContext = contextBuilder
        .forKeyspace(keyspaceName).buildKeyspace(
                ThriftFamilyFactory.getInstance());
keyspaceContext.start();
keyspace = keyspaceContext.getEntity();

where getBuilder is:

  private Builder getBuilder() {
    final AstyanaxConfigurationImpl conf = new AstyanaxConfigurationImpl()
    .setDiscoveryType(NodeDiscoveryType.NONE)
    .setRetryPolicy(new RunOnce());

    final ConnectionPoolConfigurationImpl poolConf = new ConnectionPoolConfigurationImpl("MyPool")
    .setPort(port)
    .setMaxConnsPerHost(1)
    .setSeeds(value);

    return new AstyanaxContext.Builder()
    .forCluster(cluster)
    .withAstyanaxConfiguration(conf)
    .withConnectionPoolConfiguration(poolConf)
    .withConnectionPoolMonitor(new CountingConnectionPoolMonitor());
  }

SECOND UPDATE

  • First, the issues are not solely related to deletes. I observe similar problems when updating records in the database: I read them back and cannot see the updates I just wrote.

  • Second, I created a test that performs the following operations 100 times:

    • write a row into Cassandra
    • update that row in Cassandra
    • read that row back from Cassandra and check whether it was indeed updated; if it wasn't, check again regularly after delays

    What I observe from that test is that:

    • again, when I run that code locally, all 100 iterations pass right away (no retry is ever needed)
    • when I run that code on the remote server, some of the iterations pass and some fail. When an iteration fails, it keeps failing no matter how long the delay is (I wait up to 10 seconds).

At this point, I am really not sure how any Cassandra setup could explain this behavior, since I connect to the very same server for my tests, and since the delays I insert are much larger than any additional latency I might incur when running the test from my local machine.

The only relevant difference seems to be which machine the code is running on.

THIRD UPDATE

If, in the test mentioned in the previous update, I insert a delay between the two writes, the code starts passing once the delay is >= 1,000 ms. A delay of, say, 100 ms doesn't help. I also modified the builder to set the default read and write consistency levels to the most demanding setting, ALL, and that had no impact on the results of the test (it still fails about half of the time unless the delay between writes is > 1 s):

final AstyanaxConfigurationImpl conf = new AstyanaxConfigurationImpl()
    .setDiscoveryType(NodeDiscoveryType.NONE)
    .setRetryPolicy(new RunOnce())
    .setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ALL)
    .setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_ALL);

To debug, try printing the full row instead of just the column names. By the full row I mean the column name, the column value, and the timestamp. A long shot: the clock is wrong on one of your test machines, and this is throwing off your tests on the other.
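As a rough illustration of the clock check: Cassandra clients conventionally stamp each column with microseconds since the Unix epoch, so a column timestamp can be compared against the clock of the machine running the test. The sketch below assumes that convention. `ClockSkewCheck`, `toInstant`, `skewMillis`, and the sample values are hypothetical names made up for this example; the real timestamp would come from whatever accessor your client exposes for a column.

```java
import java.time.Instant;

public class ClockSkewCheck {

    // Assumption: the column timestamp is in microseconds since the Unix epoch.
    public static Instant toInstant(long columnTimestampMicros) {
        long seconds = Math.floorDiv(columnTimestampMicros, 1_000_000L);
        long nanos = Math.floorMod(columnTimestampMicros, 1_000_000L) * 1_000L;
        return Instant.ofEpochSecond(seconds, nanos);
    }

    // Positive result: the column was stamped ahead of the given local clock reading.
    public static long skewMillis(long columnTimestampMicros, long localClockMillis) {
        return columnTimestampMicros / 1_000L - localClockMillis;
    }

    public static void main(String[] args) {
        long localNowMillis = System.currentTimeMillis();
        // Hypothetical sample: a column stamped by a machine whose clock runs 1.5 s ahead.
        long sampleColumnMicros = (localNowMillis + 1_500L) * 1_000L;

        System.out.println("column written at: " + toInstant(sampleColumnMicros));
        System.out.println("local clock now:   " + Instant.ofEpochMilli(localNowMillis));
        System.out.println("skew (ms):         " + skewMillis(sampleColumnMicros, localNowMillis));
    }
}
```

If the skew printed for freshly written columns is consistently far from zero on one machine, that machine's clock is a likely suspect.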

Another thing to double-check is that ip is indeed what you think it is, in both your application and Cassandra. When you retrieve it, print it between delimiters, like println("-" + ip + "-"). Before and after the try block around the execute in deleteSecureLocation, do a get for only that column, not the entire row. I'm not too sure how to do that in Astyanax; on the CLI it would be roughly get values[id][ip].
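The delimiter trick can be wrapped in a tiny helper. This is only a sketch: `VisibleString` and `describe` are hypothetical names, and the sample addresses are made up.

```java
public class VisibleString {

    // Wraps the value in brackets and appends its length,
    // so leading or trailing whitespace becomes visible in a log line.
    public static String describe(String s) {
        if (s == null) {
            return "<null>";
        }
        return "[" + s + "] length=" + s.length();
    }

    public static void main(String[] args) {
        System.out.println(describe("10.0.0.1"));
        // Same address with a trailing space: the bracket and the length give it away.
        System.out.println(describe("10.0.0.1 "));
    }
}
```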

Something to keep in mind is that a delete won't fail even if there's nothing to delete. To Cassandra it's a write; the only thing that makes it take effect as a delete is that, on read, it is the entry with the latest timestamp for that row/column name.
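To make the timestamp point concrete, here is a minimal in-memory sketch of last-write-wins resolution. It is a toy model, not Cassandra's or Astyanax's actual code: `LastWriteWins`, `Cell`, and the timestamps are hypothetical, and for simplicity an incoming write wins only when its timestamp is strictly greater (real tie-breaking is not modeled).

```java
import java.util.HashMap;
import java.util.Map;

public class LastWriteWins {

    // One stored cell: a value, or a tombstone when value is null,
    // together with the timestamp supplied by the writer.
    private static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> columns = new HashMap<>();

    public void insert(String name, String value, long timestamp) {
        apply(name, new Cell(value, timestamp));
    }

    // A delete is just another write: it records a tombstone and never fails,
    // even when the column does not exist.
    public void delete(String name, long timestamp) {
        apply(name, new Cell(null, timestamp));
    }

    private void apply(String name, Cell incoming) {
        Cell current = columns.get(name);
        // Simplification: only a strictly newer timestamp replaces the stored cell.
        if (current == null || incoming.timestamp > current.timestamp) {
            columns.put(name, incoming);
        }
    }

    public boolean isVisible(String name) {
        Cell cell = columns.get(name);
        return cell != null && cell.value != null;
    }

    public static void main(String[] args) {
        LastWriteWins row = new LastWriteWins();
        row.insert("3", "some value", 2_000L);
        row.delete("3", 1_000L); // delete stamped earlier than the insert
        System.out.println("visible after stale delete: " + row.isVisible("3"));
        row.delete("3", 3_000L); // delete stamped later than the insert
        System.out.println("visible after newer delete: " + row.isVisible("3"));
    }
}
```

In this model, a delete carrying an older timestamp than the insert it targets is accepted without error and then has no visible effect, which matches the symptom described in the question.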
