PySpark / Python slicing and indexing question

Question

Can anyone tell me how to extract certain values from Python output?

I want to use indexing or slicing to retrieve the value "ocweeklyreports" from the following output:

'config': '{"hiveView":"ocweeklycur.ocweeklyreports"}

This should be relatively easy, but I'm having trouble defining the slice/index.

The following successfully gives me "ocweeklyreports":

myslice = config['hiveView'][12:30]

However, I need to modify the index or slice so that I get whatever value comes after "ocweeklycur".

Answer 1

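I'm not sure what output you're working with or how robust you need this to be, but if it's just a string, you could do something like this (a quick and dirty solution).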

text = '{"hiveView":"ocweeklycur.ocweeklyreports"}'  # the raw string from the question
index_start = text.index('.') + 1  # position just after the '.', where the wanted value starts
final_response = text[index_start:-2]  # stop before the trailing '"}'
print(final_response)  # Prints ocweeklyreports

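Again, this isn't the most elegant solution, but hopefully it helps or at least gives you a starting point. Another, more robust option would be a regular expression, but I'm not very proficient with regex yet. If you have any questions or concerns, let me know and I'll try to address them.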
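If the config value really is a JSON string, as the quoting in the question suggests, a somewhat sturdier variant of the same idea might parse it first and then slice from the dot. This is only a sketch under that assumption; the names `raw_config` and `table_name` are made up for illustration.

```python
import json

# Assumption: the config value is a JSON string, as the quoting in the question suggests.
raw_config = '{"hiveView":"ocweeklycur.ocweeklyreports"}'

config = json.loads(raw_config)   # now a dict with a "hiveView" key
hive_view = config["hiveView"]    # the full "prefix.value" string

# Slice from just after the first '.', so the length of the prefix does not matter.
table_name = hive_view[hive_view.index(".") + 1:]
print(table_name)
```

Parsing first means there is no trailing `"}` to trim, so the `-2` end index is no longer needed. Note that `str.index` raises `ValueError` when the string contains no dot.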

Answer 2

You can do almost anything with regular expressions. See if this helps:

import re

def search_word(di):
    st = di["config"]["hiveView"]
    # Escape the dot so it matches a literal '.', then capture the word after it
    p = re.compile(r'^ocweeklycur\.(?P<word>\w+)')
    m = p.search(st)
    return m.group('word')

if __name__ == "__main__":
    d = {'config': {"hiveView": "ocweeklycur.ocweeklyreports"}}
    print(search_word(d))  # Prints ocweeklyreports
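Since the question asks for whatever follows "ocweeklycur", it may help to pass the prefix in as a parameter and to handle the case where nothing matches. A rough sketch along those lines; `value_after_prefix` and its `prefix` parameter are hypothetical names, not part of the code above:

```python
import re

def value_after_prefix(hive_view, prefix="ocweeklycur"):
    """Return the word following '<prefix>.', or None if the pattern does not match."""
    # re.escape keeps any special characters in the prefix literal.
    pattern = re.compile(r"^" + re.escape(prefix) + r"\.(?P<word>\w+)")
    match = pattern.search(hive_view)
    return match.group("word") if match else None

print(value_after_prefix("ocweeklycur.ocweeklyreports"))
print(value_after_prefix("no_dot_here"))
```

Returning `None` on a failed match avoids the `AttributeError` you would otherwise get from calling `.group()` on `None`.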

Answer 3

The following worked best for me:

# Extract the value of the "hiveView" key
hive_view = config['hiveView']

# Split the string on the '.' character
parts = hive_view.split('.')

# The value you want is the second part of the split string
desired_value = parts[1]

print(desired_value)  # Output: "ocweeklyreports"
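One caveat with `parts[1]`: it raises `IndexError` if the string has no dot, and drops anything after a second dot. If that matters, `str.partition` might be a reasonable alternative. A small sketch, with `after_first_dot` being a name invented here:

```python
def after_first_dot(hive_view):
    """Return everything after the first '.', or None when there is no '.'."""
    _prefix, sep, rest = hive_view.partition(".")
    # partition returns an empty separator when '.' is not found.
    return rest if sep else None

print(after_first_dot("ocweeklycur.ocweeklyreports"))
print(after_first_dot("ocweeklycur.reports.v2"))
```

Whether the part after a second dot should be kept depends on your data; `split('.')[1]` and `partition` behave differently there.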
