Split Pandas Column of type String using fixed width (similar to Excel text-to-columns functionality with fixed width)
I have a dataframe of CCYPair and corresponding spot values similar to the below:
Current DataFrame:
import pandas as pd
d = {'CCYPair': ['EURUSD', 'USDJPY'], 'Spot': [1.2, 109]}
df = pd.DataFrame(data=d)
I am looking to split the CCYPair column into CCY1 and CCY2. This would be easily achieved in Excel using Text-to-columns or through the LEFT and RIGHT functions. However, even after searching for a while, I am finding it quite tricky to achieve the same result in a pandas dataframe.
I could only find pandas.read_fwf, but that is for reading from a file. I already have a dataframe and am looking to split one of the columns based on fixed width.
I am sure I am missing something basic here - just can't figure out what.
I have tried
df['CCY1'] = df['CCYPair'][0:3]
But that applies the [0:3] to the column and not to each entry within the column. So I end up getting the first three CCYPair values and then NaNs.
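For illustration, here is a minimal sketch of the difference between slicing the column and slicing each string inside it, assuming the two-row sample frame from above. The names row_slice and char_slice are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'CCYPair': ['EURUSD', 'USDJPY'], 'Spot': [1.2, 109]})

# Plain [0:3] on a Series selects rows at positions 0, 1 and 2,
# so it returns whole CCYPair values rather than substrings.
row_slice = df['CCYPair'][0:3]

# The .str accessor applies the same slice to every string in the column.
char_slice = df['CCYPair'].str[0:3]

print(row_slice.tolist())
print(char_slice.tolist())
```

Assigning row_slice back to a longer dataframe aligns on the index, which would explain why every row after the third ends up as NaN.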
Expected outcome:
d = {'CCY1': ['EUR', 'USD'], 'CCY2': ['USD', 'JPY'], 'Spot': [1.2, 109]}
df = pd.DataFrame(data=d)
You can try str.extract:
df[['CCY1','CCY2']] = df.CCYPair.str.extract('(.{3})(.*)')
Output:
  CCYPair   Spot CCY1 CCY2
0  EURUSD    1.2  EUR  USD
1  USDJPY  109.0  USD  JPY
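As a possible variation, str.extract also accepts named groups, in which case the group names become the column names of the returned frame. The sketch below assumes every pair is exactly six characters long; the pattern and the variable names parts and result are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'CCYPair': ['EURUSD', 'USDJPY'], 'Spot': [1.2, 109]})

# Named groups (?P<name>...) give the extracted columns their names.
# Assumption: each CCYPair is two three-character codes.
parts = df['CCYPair'].str.extract(r'(?P<CCY1>.{3})(?P<CCY2>.{3})')

# Attach the new columns to the original frame by index.
result = df.join(parts)
print(result)
```

Rows that do not match the pattern (for example, strings shorter than six characters) come back as NaN in both columns, which can be a useful check on the input.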
You can also use the str.slice method:
df['CCY1'] = df['CCYPair'].str.slice(stop=3)
df['CCY2'] = df['CCYPair'].str.slice(start=3)
Output:
  CCYPair   Spot CCY1 CCY2
0  EURUSD    1.2  EUR  USD
1  USDJPY  109.0  USD  JPY
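If more than two fixed-width fields are needed, closer to what Excel's text-to-columns does, the str.slice approach could be wrapped in a small helper. This is only a sketch: split_fixed_width is a hypothetical function written for this example, not a pandas API, and the widths and names are supplied by the caller.

```python
import pandas as pd

def split_fixed_width(series, widths, names):
    """Hypothetical helper: cut each string into consecutive fixed-width fields."""
    columns = {}
    start = 0
    for width, name in zip(widths, names):
        # str.slice(start, stop) takes the same character range from every row.
        columns[name] = series.str.slice(start, start + width)
        start += width
    return pd.DataFrame(columns, index=series.index)

df = pd.DataFrame({'CCYPair': ['EURUSD', 'USDJPY'], 'Spot': [1.2, 109]})
parts = split_fixed_width(df['CCYPair'], widths=[3, 3], names=['CCY1', 'CCY2'])

# Combine the new columns with Spot to match the expected outcome.
result = pd.concat([parts, df[['Spot']]], axis=1)
print(result)
```

With widths=[3, 3] this reproduces the CCY1, CCY2, Spot layout from the question. Other width lists should work the same way, as long as the strings are at least as long as the widths add up to.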