Split text after the second occurrence of a character

Question

I need to split text before the second occurrence of the '-' character. What I have now produces inconsistent results. I've tried various combinations of rsplit, and read through and tried other solutions on SO, with no results.

Sample file name to split: 'some-sample-filename-to-split', returned in data.filename. In this case, I would like only 'some-sample' returned.

fname, extname = os.path.splitext(data.filename)
file_label = fname.rsplit('/',1)[-1]
file_label2 = file_label.rsplit('-',maxsplit=3)
print(file_label2,'\n','---------------','\n')

Answer 1

You can do something like this:

>>> a = "some-sample-filename-to-split"
>>> "-".join(a.split("-", 2)[:2])
'some-sample'

a.split("-", 2) splits the string only at the first two occurrences of -, leaving the rest as a single final element.

a.split("-", 2)[:2] gives the first 2 elements of that list. Then simply join those 2 elements.

OR

You could use a regular expression: ^([\w]+-[\w]+)

>>> import re
>>> reg = r'^([\w]+-[\w]+)'
>>> re.match(reg, a).group()
'some-sample'

EDIT: As discussed in the comments, here is what you need:

def hyphen_split(a):
    if a.count("-") == 1:
        return a.split("-")[0]
    return "-".join(a.split("-", 2)[:2])

>>> hyphen_split("some-sample-filename-to-split")
'some-sample'
>>> hyphen_split("some-sample")
'some'
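To tie this back to the question's file name handling, here is a minimal sketch. The helper name label_from_path and the sample paths are made up for illustration, and it assumes the value in data.filename is a path with an extension:

```python
import os

def hyphen_split(a):
    # Same function as in the EDIT above.
    if a.count("-") == 1:
        return a.split("-")[0]
    return "-".join(a.split("-", 2)[:2])

def label_from_path(path):
    # Hypothetical helper: drop the directory and the extension,
    # then keep everything before the second hyphen.
    fname, _ext = os.path.splitext(os.path.basename(path))
    return hyphen_split(fname)

print(label_from_path("uploads/some-sample-filename-to-split.txt"))  # some-sample
```

A name with no hyphen at all comes back unchanged, because split() then returns a single-element list.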

Answer 2

A generic form to split a string into two halves at the nth occurrence of the separator would be:

def split(strng, sep, pos):
    strng = strng.split(sep)
    return sep.join(strng[:pos]), sep.join(strng[pos:])

If pos is negative, it counts the occurrences from the end of the string.

>>> strng = 'some-sample-filename-to-split'
>>> split(strng, '-', 3)
('some-sample-filename', 'to-split')
>>> split(strng, '-', -4)
('some', 'sample-filename-to-split')
>>> split(strng, '-', 1000)
('some-sample-filename-to-split', '')
>>> split(strng, '-', -1000)
('', 'some-sample-filename-to-split')
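For the case in the question (everything before the second '-'), pos would be 2. A short sketch, with the function repeated so the snippet runs on its own:

```python
def split(strng, sep, pos):
    # Split on every separator, then glue the two halves back together.
    parts = strng.split(sep)
    return sep.join(parts[:pos]), sep.join(parts[pos:])

head, tail = split("some-sample-filename-to-split", "-", 2)
print(head)  # some-sample
print(tail)  # filename-to-split

# With fewer than two separators the whole string lands in the first half.
print(split("some-sample", "-", 2))  # ('some-sample', '')
```

Note that this differs from the hyphen_split in Answer 1, which returns 'some' for 'some-sample'.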

Answer 3

You can use str.index():

def hyphen_split(s):
    pos = s.index('-')
    try:
        return s[:s.index('-', pos + 1)]
    except ValueError:
        return s[:pos]

Test:

>>> hyphen_split("some-sample-filename-to-split")
'some-sample'
>>> hyphen_split("some-sample")
'some'
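One thing to watch: the first s.index('-') sits outside the try block, so a string with no hyphen at all raises ValueError. If that can happen in your data, a variant using str.find() might look like the sketch below. The name hyphen_split_safe and the choice to return the input unchanged are my assumptions, not part of the answer above:

```python
def hyphen_split_safe(s, sep="-"):
    # str.find() returns -1 instead of raising when sep is missing.
    first = s.find(sep)
    if first == -1:
        return s  # assumption: no separator means "return as-is"
    second = s.find(sep, first + len(sep))
    return s[:first] if second == -1 else s[:second]

print(hyphen_split_safe("some-sample-filename-to-split"))  # some-sample
print(hyphen_split_safe("some-sample"))                    # some
print(hyphen_split_safe("some"))                           # some
```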

Answer 4

You could use regular expressions:

import re

file_label = re.search('(.*?-.*?)-', fname).group(1)
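Be aware that re.search() returns None when the name has fewer than two hyphens, so calling .group(1) directly would raise AttributeError. A guarded sketch follows. The function name and the fallback of returning the name unchanged are assumptions for illustration:

```python
import re

def label_before_second_hyphen(fname):
    # Lazy groups stop at the earliest possible second hyphen.
    match = re.search(r'(.*?-.*?)-', fname)
    # Assumption: with fewer than two hyphens, keep the name as it is.
    return match.group(1) if match else fname

print(label_before_second_hyphen("some-sample-filename-to-split"))  # some-sample
print(label_before_second_hyphen("some-sample"))                    # some-sample
```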

Answer 5

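When working with a DataFrame and the split is needed for every value in a column, a lambda function works better than a regular expression.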

df['column_name'].apply(lambda x: "-".join(x.split('-',2)[:2]))

Answer 6

Here's a somewhat cryptic implementation that avoids join():

from functools import reduce  # reduce is not a builtin in Python 3

def split(string, sep, n):
    """Split `string` at occurrence number `n` of `sep`, counting from 0"""
    pos = reduce(lambda x, _: string.index(sep, x + 1), range(n + 1), -1)
    return string[:pos], string[pos + len(sep):]
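A quick usage sketch, with the function repeated so it runs standalone. As far as I can tell n counts from zero here, so the second occurrence is n=1, and a missing occurrence surfaces as the ValueError raised by str.index():

```python
from functools import reduce

def split(string, sep, n):
    # Each reduce step jumps to the next occurrence of sep after position x.
    pos = reduce(lambda x, _: string.index(sep, x + 1), range(n + 1), -1)
    return string[:pos], string[pos + len(sep):]

print(split("some-sample-filename-to-split", "-", 1))
# ('some-sample', 'filename-to-split')

try:
    split("some-sample", "-", 1)
except ValueError:
    print("fewer than two separators")
```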
