
Reading a data frame with missing values

I am trying to read a small data frame (a few columns and a few rows) in which some rows have missing values. For example, the data looks like this; note that the elements are sometimes separated by an uneven number of spaces:

0.5 0.03   
0.1  0.2  0.3  2 
0.2  0.1   0.1  0.3
0.5 0.03  
0.1  0.2   0.3  2

Is there a way to extract only the complete rows, like this:

0.1  0.2  0.3  2 
0.2  0.1   0.1  0.3
0.1  0.2   0.3  2

Any suggestions would be appreciated.

Thanks.

You can parse the file manually:

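import re
import pandas as pd

with open('data.txt') as fp:
    # Split each line on runs of whitespace. Short rows are padded with None,
    # so dropna removes them. Note that the values stay as strings.
    df = pd.DataFrame([re.split(r'\s+', l.strip()) for l in fp]).dropna(axis=0)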

Output:

>>> df
     0    1    2    3
1  0.1  0.2  0.3    2
2  0.2  0.1  0.1  0.3
4  0.1  0.2  0.3    2
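
If you also want numeric values rather than strings, a variation on the manual parse is to filter on the number of fields and convert as you go. This is only a sketch: `read_complete_rows`, `n_cols` and `SAMPLE` are names made up for illustration, and `io.StringIO` stands in for the real `data.txt`.

```python
import io

import pandas as pd

# Sample text copied from the question; io.StringIO stands in for the real file.
SAMPLE = (
    "0.5 0.03   \n"
    "0.1  0.2  0.3  2 \n"
    "0.2  0.1   0.1  0.3\n"
    "0.5 0.03  \n"
    "0.1  0.2   0.3  2\n"
)


def read_complete_rows(fp, n_cols=4):
    """Keep only lines with exactly n_cols whitespace-separated fields."""
    rows = {}
    for line_no, line in enumerate(fp):
        # str.split() with no argument splits on any run of whitespace
        # and ignores leading/trailing whitespace.
        fields = line.split()
        if len(fields) == n_cols:
            rows[line_no] = [float(x) for x in fields]
    # Dictionary keys (original line positions) become the index.
    return pd.DataFrame.from_dict(rows, orient="index")


complete = read_complete_rows(io.StringIO(SAMPLE))
print(complete)
```

The kept rows retain their original positions (1, 2 and 4) as the index, and every column is float. With a real file, pass the open file object instead of the `io.StringIO` wrapper.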

You can try this:

import pandas as pd
import numpy as np

df = {
    'col1': [0.5, 0.1, 0.2, 0.5, 0.1],
    'col2': [0.03, 0.2, 0.1, 0.03, 0.2],
    'col3': [np.nan, 0.3, 0.1, np.nan, 0.3],
    'col4': [np.nan, 2, 0.3, np.nan, 2]
}

data = pd.DataFrame(df)

print(data.dropna(axis=0))

Output:

   col1  col2  col3  col4
1   0.1   0.2   0.3   2.0
2   0.2   0.1   0.1   0.3
4   0.1   0.2   0.3   2.0
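
The example above types the values in by hand. To get the same frame straight from the whitespace-separated text, one option is `pd.read_csv` with a regex separator and explicit column names, so that short rows are padded with NaN. A minimal sketch, assuming four columns; the names `col1` to `col4` and `SAMPLE` are illustrative, and `io.StringIO` stands in for the real file:

```python
import io

import pandas as pd

# Sample text copied from the question; replace io.StringIO(SAMPLE) with a file path.
SAMPLE = (
    "0.5 0.03   \n"
    "0.1  0.2  0.3  2 \n"
    "0.2  0.1   0.1  0.3\n"
    "0.5 0.03  \n"
    "0.1  0.2   0.3  2\n"
)

data = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=r"\s+",          # any run of whitespace separates fields
    header=None,         # the text has no header row
    names=["col1", "col2", "col3", "col4"],  # fixes the width at four columns
)

# Rows with fewer than four fields were filled with NaN; drop them.
complete = data.dropna(axis=0)
print(complete)
```

Passing `names` matters here: without it, pandas infers the number of columns from the first line, which has only two fields in this data.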
