
Python: filter a large text file by quantile

Original: 2023-01-14 07:51:00 | Tags: python, algorithm, optimization, filter, quantile

Suppose I am processing a very large text file and have the following pseudocode:

xx_valueList = []
lines = []
for line in file:
    xx_value = calc_xxValue(line)
    xx_valueList.append(xx_value)
    lines.append(line)

# get_quantile_value is a function that returns the cutoff value for a given quantile percent
cut_offvalue = get_quantile_value(xx_valueList, precent=0.05)
for line in lines:
    if calc_xxValue(line) > cut_offvalue:
        # do something here
        pass

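Note that the file is very large and may come from a pipe, so I do not want to read it twice.

The whole file has to be read before the cutoff value used for filtering is known.

The approach above works, but it uses too much memory. Is there an algorithmic optimization that improves efficiency and reduces memory consumption?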

1 Answer

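xx_value_list = []
cut_offvalue = 0
with open(file, 'r') as f:
    for line in f:
        xx_value = calc_xxValue(line)
        xx_value_list.append(xx_value)
        # Refresh the running cutoff estimate every 100 lines
        if len(xx_value_list) % 100 == 0:
            cut_offvalue = get_quantile_value(xx_value_list, precent=0.05)
        # The cutoff reflects only the lines seen so far, so this filter is approximate
        if xx_value > cut_offvalue:  # same comparison as in the question
            # do something here
            pass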
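
The running-cutoff loop above avoids storing the lines, but it filters against an estimate that changes as more data arrives. If an exact cutoff is needed and the input can be read only once, one possible alternative is to spool the lines to a temporary file on disk while keeping only the numeric values in memory. The sketch below is not from the original answer. `filter_above_quantile` is a hypothetical name, and `get_quantile_value` is a nearest-rank stand-in for the helper that the question leaves undefined. It also assumes every input line except possibly the last ends with a newline, as lines read from a file or pipe do.

```python
import math
import tempfile
from array import array


def get_quantile_value(values, precent):
    """Nearest-rank quantile; a stand-in for the question's undefined helper."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a quantile of no values")
    rank = max(1, math.ceil(precent * len(ordered)))
    return ordered[rank - 1]


def filter_above_quantile(stream, calc_xxValue, precent=0.05):
    """Yield lines from `stream` whose value exceeds the `precent` quantile.

    The input is consumed exactly once. Lines are spooled to a temporary
    file on disk, and only the numeric values are kept in memory.
    """
    values = array("d")  # compact storage: one 8-byte double per line
    with tempfile.TemporaryFile(mode="w+", encoding="utf-8") as spool:
        for line in stream:
            values.append(calc_xxValue(line))
            spool.write(line)
        if not values:
            return
        cut_offvalue = get_quantile_value(values, precent)
        spool.seek(0)
        # Second pass reads the disk copy, not the original pipe.
        for line, value in zip(spool, values):
            if value > cut_offvalue:
                yield line
```

For piped input, `sys.stdin` could be passed as `stream`. The trade-off is extra disk I/O in exchange for memory that grows with the number of values rather than with the size of the text. Note that `sorted` still makes a temporary copy of the values while the cutoff is computed.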


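Notice: The technical posts on this site follow the CC BY-SA 4.0 license. If you republish, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.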
