PySpark / Python slicing and indexing question

Question

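Can someone tell me how to extract certain values from Python output?

I would like to use indexing or slicing to retrieve the value "ocweeklyreports" from the following output: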

'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}'

This should be relatively easy, but I am having trouble defining the slice/index.

The following successfully gives me "ocweeklyreports":

myslice = config['hiveView'][12:30]

However, I need to change the index or slice so that I get whatever value comes after "ocweeklycur".

Answer 1

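I'm not sure what output you are working with or how robust you need this to be, but if it is just a string, you could do something like this (a quick and dirty solution).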

text = '{"hiveView":"ocweeklycur.ocweeklyreports"}'  # the string to search (taken from the question)
index_start = text.index('.') + 1  # position just after the '.', where the wanted value starts
final_response = text[index_start:-2]  # stop before the trailing "}
print(final_response)  # Prints ocweeklyreports

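Again, this is not the most elegant solution, but hopefully it helps or at least gives you a starting point. A more robust solution would be to use regular expressions, but I'm not very proficient with regex yet. If you have any questions or concerns, let me know and I can try to address them.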
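If the 'config' value really is a JSON-encoded string, as the output in the question suggests, one alternative to counting characters is to parse it first and then take everything after the first dot. This is only a minimal sketch under that assumption; the names `record` and `table_name` are made up for illustration.

```python
import json

# Assumption: 'config' holds a JSON string, as shown in the question's output.
record = {'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}'}

# Parse the JSON string into a dict so "hiveView" can be looked up by key.
config = json.loads(record['config'])

# partition() splits at the first '.', returning (before, separator, after).
_, _, table_name = config['hiveView'].partition('.')

print(table_name)  # ocweeklyreports
```

Because nothing here depends on fixed positions, the same lines keep working if the text after "ocweeklycur." changes length.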

Answer 2

You can do almost anything with regex. See if this helps:

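import re

def search_word(di):
  st = di["config"]["hiveView"]
  # Escape the dot so it matches a literal '.', not any character
  p = re.compile(r'^ocweeklycur\.(?P<word>\w+)')
  m = p.search(st)
  # Return None when the string does not start with "ocweeklycur."
  return m.group('word') if m else None

if __name__=="__main__":
  d = {'config': {"hiveView":"ocweeklycur.ocweeklyreports"}}
  print(search_word(d))  # Prints ocweeklyreports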
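If the prefix is not always "ocweeklycur", the same regex idea can be loosened to accept any dot-free prefix. The sketch below is an assumption about what might be wanted, not part of the original answer; `PATTERN` and `after_first_dot` are hypothetical names.

```python
import re

# Hypothetical pattern: "<prefix without dots>.<word>", capturing the word.
PATTERN = re.compile(r'^[^.]+\.(?P<word>\w+)$')

def after_first_dot(value):
    """Return the word after the dot, or None if the value has no such shape."""
    m = PATTERN.match(value)
    return m.group('word') if m else None

print(after_first_dot("ocweeklycur.ocweeklyreports"))  # ocweeklyreports
print(after_first_dot("no_dot_here"))                  # None
```

Returning None for a non-matching value lets the caller decide how to handle unexpected input instead of hitting an AttributeError on a failed match.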

Answer 3

The following worked best for me:

# Extract the value of the "hiveView" key
hive_view = config['hiveView']

# Split the string on the '.' character
parts = hive_view.split('.')

# The value you want is the second part of the split string
desired_value = parts[1]

print(desired_value)  # Output: "ocweeklyreports"
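One detail worth checking with the split approach is what happens when the value contains more than one dot. The value below is hypothetical (the question only shows a single dot); the sketch just contrasts a plain split with one limited to a single split.

```python
# Hypothetical value with two dots, used only to show the difference.
hive_view = "ocweeklycur.ocweeklyreports.v2"

# Plain split: index 1 is only the piece between the first and second dot.
second_part = hive_view.split('.')[1]

# maxsplit=1: index 1 is everything after the first dot.
remainder = hive_view.split('.', 1)[1]

print(second_part)  # ocweeklyreports
print(remainder)    # ocweeklyreports.v2
```

For the single-dot value in the question both forms give the same result, so the choice only matters if longer names can appear.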

Answer 1
0 2022-12-31 21:57:46

Answer 2
0 2022-12-31 22:04:27

Answer 3
0 2023-01-02 20:45:35
