How to insert a space before the 2nd capital letter of a string in MySQL?

I have some data from a third party, and one column is a concatenation of first and last name, with no space in between. My goal is to insert a space before the second capital letter, e.g.:

some_name
-------------
AdamPeterson
JohnSmith
StevenMulroy

Would become:

some_name
-------------
Adam Peterson
John Smith
Steven Mulroy

I know that this isn't foolproof, but it's the best I'm going to get with the source data that I have.

I need to do this in SQL rather than Excel, etc., because the data is refreshed regularly at the database level and is then handled by another system without first being exported.

Any help is greatly appreciated!

For MySQL 8:

SELECT REGEXP_REPLACE(CAST('JohnLexxxanon' as BINARY), '^([A-Z][a-z]+)([A-Z][a-z]+)$', '$1 $2');

For MariaDB 10+:

SELECT REGEXP_REPLACE(CAST('JohnLexxxanon' as BINARY), '^([A-Z][a-z]+)([A-Z][a-z]+)$', '\\1 \\2');

The data is cast to binary to make the match case-sensitive.

This works for MySQL 8 and MariaDB 10+.
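One caveat worth checking against your server version: as far as I know, MySQL 8.0.22 and later reject binary string arguments to the regular expression functions, so the CAST(... AS BINARY) form may raise a character-set error there. Below is a minimal sketch of an alternative, assuming MySQL 8.0.4+ and using the optional match_type argument 'c' (case-sensitive) of REGEXP_REPLACE. The inline sample rows and the alias spaced_name are made up for illustration.

```sql
SELECT
    t.some_name,
    REGEXP_REPLACE(
        t.some_name,
        '^([A-Z][a-z]+)([A-Z][a-z]+)$',  -- exactly two capitalized words
        '$1 $2',                         -- put a space between the two groups
        1,                               -- pos: start matching at the first character
        0,                               -- occurrence: 0 replaces every match
        'c'                              -- match_type: case-sensitive matching
    ) AS spaced_name
FROM (
    -- Hypothetical sample rows standing in for the real table
    SELECT 'AdamPeterson' AS some_name
    UNION ALL SELECT 'JohnSmith'
    UNION ALL SELECT 'StevenMulroy'
) AS t;
```

Names that do not consist of exactly two capitalized words (for example 'JohnMcDonald') are returned unchanged, because the anchored pattern does not match them.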

Here is a general query for all MySQL 5.1+ versions; I'm pretty sure it will also run on MariaDB.
The general idea is to use a MySQL number generator to split the string into "tokens" (single characters) and check whether each token's ASCII code falls in the capital-letter range.

Query

SELECT 
   names.name   
 ,  INSERT (
       names.name 
     , LOCATE(
           SUBSTRING(names.name, number_generator.number, 1)  
         , names.name   
       )
     , 1
     , CONCAT(' ', SUBSTRING(names.name, number_generator.number, 1))
   ) AS changed_name
FROM (
  SELECT 
    @row := @row + 1 AS number 
  FROM (
    SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
  ) row1
  CROSS JOIN (
    SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
  ) row2
  CROSS JOIN (
    SELECT @row := 0 
  ) init_user_params 
) AS number_generator
CROSS JOIN 
  names 
WHERE
   number_generator.number > 1
 AND
   ASCII(SUBSTRING(names.name, number_generator.number, 1)) BETWEEN 65 AND 90

Result

| name         | changed_name  |
| ------------ | ------------- |
| AdamPeterson | Adam Peterson |
| JohnSmith    | John Smith    |
| StevenMulroy | Steven Mulroy |

see demo

Note

This query will not scale well on (very) large tables with millions or even billions of records because of the CROSS JOIN.

Or you can use a tableless approach with:

CROSS JOIN ( SELECT 'AdamPeterson' AS name UNION SELECT 'JohnSmith' UNION SELECT 'StevenMulroy' ) AS names

see that demo

Or use batches when you have big tables:

CROSS JOIN ( SELECT name FROM names WHERE id >= 1 AND id <= 2 ORDER BY names.id ASC
) AS names

Why no LIMIT? LIMIT is slow when used with large offsets such as LIMIT 1000000, 1000. MySQL needs to fetch 1001000 records and then discard the first 1000000 of them, in the worst case from an on-disk temporary table.

see that demo

Edited

It all looks like black magic to me! It is almost perfect - try the names 'AlexLafferty' or 'LaurenAnderson'. Maybe an off-by-one bug or something on the A? Thanks for all your help!

After a review I noticed that using LOCATE(..) in the INSERT(..) is pretty much redundant and can be removed to get it working properly.
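The reason LOCATE(..) misbehaves here, as I understand it, is that it compares nonbinary strings using their collation. Under a case-insensitive collation it therefore finds the first 'l' or 'a' of either case rather than the capital letter. A small sketch to check this on your own server follows; the column aliases are made up, and the results depend on the connection collation.

```sql
SELECT
    -- Nonbinary arguments: the search follows the collation, so with a
    -- case-insensitive collation this can match the lowercase 'l' in "Alex".
    LOCATE('L', 'AlexLafferty') AS collation_position,
    -- A binary argument forces a byte-wise, case-sensitive search.
    LOCATE(CAST('L' AS BINARY), 'AlexLafferty') AS binary_position;
```

With a case-insensitive collation the first column should be 2 and the second 5, which matches the misplaced space reported for 'AlexLafferty'.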

So the patch is:

SELECT 
    names.name   
 ,  INSERT (
       names.name 
     , number_generator.number
     , 1
     , CONCAT(' ', SUBSTRING(names.name, number_generator.number, 1))
   ) AS changed_name

see demo
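For reference, the patched SELECT list can be combined with the unchanged FROM and WHERE clauses of the original query. The sketch below does that with the tableless names list, using the two problem names from the comment. It assumes names no longer than 100 characters, since the generator yields the numbers 1 to 100.

```sql
SELECT
    names.name,
    INSERT(
        names.name,
        number_generator.number,  -- position of the capital letter
        1,                        -- replace exactly that one character
        CONCAT(' ', SUBSTRING(names.name, number_generator.number, 1))
    ) AS changed_name
FROM (
    -- 10 x 10 rows, numbered 1..100 with a user variable
    SELECT @row := @row + 1 AS number
    FROM (
        SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
        UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
    ) row1
    CROSS JOIN (
        SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
        UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
    ) row2
    CROSS JOIN (
        SELECT @row := 0
    ) init_user_params
) AS number_generator
CROSS JOIN (
    -- Sample names taken from the comment above
    SELECT 'AlexLafferty' AS name
    UNION SELECT 'LaurenAnderson'
) AS names
WHERE
    number_generator.number > 1  -- skip the leading capital
    AND ASCII(SUBSTRING(names.name, number_generator.number, 1)) BETWEEN 65 AND 90;
```

This should return 'Alex Lafferty' and 'Lauren Anderson'. A name with more than two capital letters would produce one row per additional capital, each with a single space inserted.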
