
Galera Cluster concerns

I want to use Galera Cluster in our production environment, but I have some concerns:

  1. Each table must have at least one explicit primary key defined.

  2. Each table must run under InnoDB or XtraDB storage engine.

  3. Chunk up your big transactions into batches. For example, rather than having one transaction insert 100,000 rows, break it up into smaller chunks, e.g. 1,000 rows per transaction.

  4. Your application can tolerate non-sequential auto-increment values.

  5. Schema changes are handled differently.

  6. Handle hotspots/Galera deadlocks by sending writes to a single node.

I would like some clarification on all of the points above. Also, we have over 600 databases in production. Can Galera work in this environment?

Thanks

That is a LOT to handle in one shot. There are two issues: table creation (which involves the schema, see point 5) and the applications that use those tables. I'll try:

1) Each table must have at least one explicit primary key defined.

Every table you create MUST have a primary key. Tables are created with columns and indexes, and one of those indexes must be declared as the PRIMARY KEY.

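2) Each table must run under InnoDB or XtraDB storage engine.

When tables are created, they must use the InnoDB storage engine (XtraDB is a drop-in variant of InnoDB and is also selected with ENGINE=InnoDB). Galera does not properly handle MyISAM tables, which were the default in older MySQL versions.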
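With 600 databases, checking points 1 and 2 by hand is not realistic, so here is a rough audit sketch. It assumes a DB-API cursor from a MySQL driver that uses `%s` placeholders (PyMySQL and mysqlclient do); the `audit` function and the constant names are made up for this example, and the queries only read `information_schema`.

```python
# Schemas that belong to the server itself and should be skipped.
SYSTEM_SCHEMAS = ("mysql", "information_schema", "performance_schema", "sys")

_PLACEHOLDERS = ", ".join(["%s"] * len(SYSTEM_SCHEMAS))

# Base tables that have no PRIMARY KEY constraint.
NO_PRIMARY_KEY_SQL = f"""
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ({_PLACEHOLDERS})
  AND c.constraint_name IS NULL
"""

# Base tables stored in any engine other than InnoDB.
NON_INNODB_SQL = f"""
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND engine <> 'InnoDB'
  AND table_schema NOT IN ({_PLACEHOLDERS})
"""


def audit(cursor):
    """Return the tables that would need fixing before a move to Galera."""
    cursor.execute(NO_PRIMARY_KEY_SQL, SYSTEM_SCHEMAS)
    missing_pk = list(cursor.fetchall())
    cursor.execute(NON_INNODB_SQL, SYSTEM_SCHEMAS)
    non_innodb = list(cursor.fetchall())
    return {"missing_primary_key": missing_pk, "non_innodb": non_innodb}
```

Run it once against the current production server; an empty result for both keys means points 1 and 2 are already satisfied.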

3) Chunk up your big transactions into batches. For example, rather than having one transaction insert 100,000 rows, break it up into smaller chunks, e.g. 1,000 rows per transaction.

This is related to your application, not your schema. Try not to have an application that INSERTs a lot of data in one transaction. A large transaction will work, but it is risky. This is NOT a requirement, just a suggestion.
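As a minimal sketch of what "chunking" means on the application side, the helper below commits after every batch instead of once at the end. It assumes any DB-API connection; `chunked` and `insert_in_batches` are names invented for this example, and the INSERT statement is passed in because the placeholder style depends on your driver.

```python
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn, sql, rows, batch_size=1000):
    """Run `sql` for all rows, committing once per batch.

    Each batch is its own transaction, so the cluster only has to
    replicate and certify `batch_size` rows at a time.
    """
    total = 0
    for batch in chunked(rows, batch_size):
        cur = conn.cursor()
        cur.executemany(sql, batch)
        conn.commit()
        cur.close()
        total += len(batch)
    return total
```

The trade-off is that a failure halfway through leaves the earlier batches committed, so the application has to be able to resume or clean up.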

4) Your application can tolerate non-sequential auto-increment values.

With a cluster, you can have multiple servers being updated. If a column is auto-incremented, each cluster member could be trying to increment the same column at the same time, so the IDs handed out can have gaps. Your application should NEVER EVER assume that the next ID is related to the previous ID. For auto-increment columns, do not IMPOSE a value; let the DB handle it.
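To picture where the gaps come from: Galera can manage MySQL's `auto_increment_increment` and `auto_increment_offset` settings per node (the `wsrep_auto_increment_control` option), so each node draws from its own interleaved series. The toy simulation below only illustrates that arithmetic; `ids_for_node` is a made-up helper and the 3-node cluster is an assumption.

```python
def ids_for_node(offset, increment, count):
    """IDs a node hands out when it follows offset + k * increment."""
    return [offset + k * increment for k in range(count)]


cluster_size = 3  # assumed: increment equals the number of nodes
series = {
    node: ids_for_node(node, cluster_size, 4)
    for node in range(1, cluster_size + 1)
}

for node, ids in series.items():
    print(f"node {node}: {ids}")
# node 1: [1, 4, 7, 10]
# node 2: [2, 5, 8, 11]
# node 3: [3, 6, 9, 12]
```

If only node 2 takes writes, the table ends up with 2, 5, 8, 11: unique and increasing, but never consecutive.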

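5) Schema changes are handled differently.

The schema is the description of the tables and indexes, not the transactions that add, delete or retrieve information. You have multiple servers, so a schema change has to be handled with care so that all servers catch up. By default Galera applies DDL in Total Order Isolation (TOI): the statement runs on every node at the same point in the transaction order, and other writes wait while it runs. The alternative, Rolling Schema Upgrade (RSU), applies the change one node at a time.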

6) Handle hotspots/Galera deadlocks by sending writes to a single node.

This is both application and DB related. A deadlock is a condition where two different parts of an app each hold a lock that the other one needs. One part reads a value (ValueA), asks the DB to lock it so it can be changed, and then tries to lock another value (ValueB) for the same operation. If another part locks ValueB first and then tries to lock ValueA, we have a deadlock, because each part has locked the value the other one needs next. To avoid this, it's best to write to only one server in the cluster and use the other servers for reading. Do note that you can still have deadlocks in your applications, but you can avoid Galera creating the situation.
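Since deadlocks can still happen, the application should be ready to retry. The sketch below is one possible way to do that, not something Galera requires. It assumes the driver reports MySQL error 1213 (ER_LOCK_DEADLOCK) either as `exc.errno` or as the first element of `exc.args`, which is how the common Python MySQL drivers behave; `run_with_retry` and `is_deadlock` are hypothetical names.

```python
import time

DEADLOCK_ERRNO = 1213  # MySQL ER_LOCK_DEADLOCK


def is_deadlock(exc):
    """Best-effort check for a deadlock error raised by the driver."""
    code = getattr(exc, "errno", None)
    if code is None and exc.args:
        code = exc.args[0]
    return code == DEADLOCK_ERRNO


def run_with_retry(txn, attempts=3, backoff=0.05):
    """Call `txn()` and retry it when it fails with a deadlock.

    `txn` must run one complete transaction (including its own
    rollback on error) so that calling it again is safe.
    """
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except Exception as exc:
            if not is_deadlock(exc) or attempt == attempts:
                raise
            time.sleep(backoff * attempt)  # short, growing pause
```

Any other error, or a deadlock on the last attempt, is re-raised unchanged so the caller still sees the real failure.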

