Analyzing data from multiple .txt files in Pandas

Question

I have 1000+ text files. Each has dates (which I have made the index) and stock prices (in column 0). I have written code that finds a single file's moving average of the price and the rolling difference between the price and that moving average. I would like to do this for every file. I have to load the files in groups, because loading them all at once uses too much memory.

I imagine I would have to use a for loop to iterate through the files and compute the metrics for each one. But how would I do that? How can I load a group of files into one variable, then loop over them to find the moving average and the difference from the price for each one?

Edit: I am using numpy, pandas, and matplotlib. I'd also like to be able to find the stocks whose difference from the moving average is the greatest.

Any help would be greatly appreciated.

Answer 1

If you just want to iterate over all of the input files in a given folder, you might want to try os.listdir() to get a list of filenames, which you can then process sequentially. If your files are spread over several layers of folders, you could use os.walk() to traverse the directories. You can find info on these methods here: https://docs.python.org/3/library/os.html
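As a rough illustration of the two approaches, here is a minimal sketch. The helper names list_txt_files and walk_txt_files are hypothetical, and the .txt extension filter is an assumption based on the question.

```python
import os


def list_txt_files(folder):
    """Return sorted paths of the .txt files directly inside `folder`."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".txt")
    )


def walk_txt_files(root):
    """Return sorted paths of the .txt files anywhere under `root`."""
    paths = []
    # os.walk yields (directory path, subdirectory names, file names)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".txt"):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)
```

Either function returns a plain list of paths, so the files can then be read one at a time in a for loop instead of all at once.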

Answer 2

How large are these 1000 files? If they are only a couple of MB each (just guessing), merge them all into one single file and you can do whatever you want with it.

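import glob

import pandas as pd

# Collect every CSV file in the folder (change the pattern to *.txt for text files)
filelist = glob.glob("C:\\your_path\\*.csv")

frames = []
for filename in filelist:
    print(filename)
    # The first column of each file becomes the index
    frames.append(pd.read_csv(filename, index_col=0))

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
results = pd.concat(frames)

# Note: this output also matches *.csv, so move it before re-running the script
results.to_csv("C:\\your_path\\CombinedFile.csv")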
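If merging everything uses too much memory, an alternative is to process one file at a time and keep only a single number per file. The sketch below is not the code from the answer above. It assumes each file is comma-separated with no header row, a date in the first column and a price in the second. The function names, the "date" and "price" column labels, and the 20-row window are all hypothetical choices.

```python
import glob
import os

import pandas as pd


def summarize_file(path, window=20):
    """Return the most recent price-minus-moving-average gap for one file."""
    # Assumed layout: no header, date in the first column, price in the second
    df = pd.read_csv(path, header=None, names=["date", "price"],
                     index_col="date", parse_dates=True)
    moving_avg = df["price"].rolling(window).mean()
    diff = (df["price"] - moving_avg).dropna()
    # Files shorter than the window have no valid moving average
    return diff.iloc[-1] if not diff.empty else float("nan")


def rank_by_deviation(pattern, window=20):
    """Read files one by one and rank them by absolute gap, largest first."""
    latest = {}
    for path in glob.glob(pattern):
        # Use the file name without its extension as the stock label
        ticker = os.path.splitext(os.path.basename(path))[0]
        latest[ticker] = summarize_file(path, window)
    gaps = pd.Series(latest, dtype="float64").dropna()
    order = gaps.abs().sort_values(ascending=False).index
    return gaps.reindex(order)
```

Something like rank_by_deviation("C:\\your_path\\*.txt").head(10) would then list the ten stocks whose latest price is furthest from its moving average. Only one DataFrame is held in memory at a time, so the files do not have to be loaded in groups.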

Answer 1
1 2020-02-12 21:51:55

Answer 2
0 2020-02-13 16:46:46
