Plot n graphs and save them in n different files

First and foremost, I'd like to point out that I am no expert in Python and am still learning how to use pandas. I dug through older posts but couldn't find a suitable answer.

I've been trying to code a data analysis of 92 contracts. For each of them, I'd like to plot a specific analysis (taking some columns of the same dataframe each time) and save each analysis in a different folder (Analysis 1, Analysis 2, ...).

So far, I am facing many difficulties. Thus, before focusing on WHAT to plot, I'd like to understand how to save each plot to a different .png file. The code I tried does not seem to save anything: when I open the folder, it is empty.

Thanks to waykiki's help, I've been able to update my code. Now I know how to create as many folders as the analyses I produce. Yet I still don't understand how to code the plotting of 92 graphs per analysis. My code now looks like this:

import pandas as pd
import matplotlib.pyplot as plt
import os

# Folder in which I want the analyses to be saved
URL5 = r"C:\Users\A\AppData\Local\Programs\Python\Python39"
# 1 graph per ID_Contrat (thus, 92 graphs)
groups = outer_merged_df.groupby("ID_Contrat") #where outer_merged_df is my dataframe
# How to name each plot.
List_ID_Contrat = outer_merged_df["ID_Contrat"].tolist()

def create_plot(file_name, x, y):
    # Create your plot. It is my understanding that here I should just give the x and the y I want to plot.
    fig = plt.figure()
    plt.plot(x, y, color = "red", kind = "line", legend = "true", linewidth = 2)
    plt.savefig(file_name)
    plt.show()

def main():
    # must be full-path. 
    parent_folder = URL5
    # move to parent directory
    os.chdir(parent_folder)
    # I want file_name to be different for each graph
    extension = ".png"
    # 5 = how many analyses I want to do
    for i in range(5):
        for name in List_ID_Contrat :
            file_name = "Analyse" + str[i+1] "{}" + extension.format(name) # I want file_name to be different for each graph and looking like "Analyse i Contrat XX"
        # Create a new folder
        folder_name = 'Analysis ' + str(i+1)
        os.mkdir(folder_name)
        full_file_name = folder_name + '/' + file_name
        x = np.linspace(1,100,100)
        y = np.random.random(100)
        create_plot(full_file_name, x, y)
        print("plot "+ savefile +" finished".format(name))
        
if __name__ == "__main__":
    main()

Yet, when I run my code, it neither plots the 92 graphs nor creates the folders anymore (though it did with Waykiki's method). The for loop breaks during the first iteration (I only get the folder "Analysis 1"), and I get this error message:

AttributeError: 'Line2D' object has no property 'kind'

Could you please explain how I can save the graphs? I am getting lost.

Thanks

I think your approach is right, in the sense that you've divided your problem into 2 steps:

1.) Get the technical details done (create, organise and navigate through the folders and data).

2.) Do the actual creation/drawing of plots.

Here is a simple prototype script. It creates N subfolders in the main directory '/home/user/my_analysis/'. All subfolders are named "AnalysisX", where X is the number of the folder.

Every folder contains a different plot.

Note: my folder paths are for a Linux machine, so keep in mind that '/home/user/some_folder/' isn't a valid path on Windows. (I see you've already got that part right, but it might be useful for other users.)

import os
import numpy as np
import matplotlib.pyplot as plt


def create_plot(file_name, x, y):
    # Create your plot
    fig = plt.figure()
    plt.plot(x, y, color='red', linewidth=2)
    plt.savefig(file_name)
    plt.show()


def main():
    # must be full-path
    parent_folder = '/home/user/my_analysis/'

    # move to parent directory
    os.chdir(parent_folder)

    file_name = 'plot'
    extension = '.png'
    for i in range(5):
        # Create a new folder
        folder_name = 'Analysis' + str(i+1)
        os.mkdir(folder_name)

        full_file_name = folder_name + '/' + file_name + extension
        x = np.linspace(1, 100, 100)
        y = np.random.random(100)
        create_plot(full_file_name, x, y)


if __name__ == '__main__':
    main()

For clarity, this is what the folder structure looks like. I've only censored my real username:

[image: screenshot of the resulting folder structure]
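
One thing to watch with the script above: os.mkdir() raises FileExistsError if you run it a second time, because the folders are already there. Below is a rough sketch of a rerun-safe variant that also sidesteps the Linux/Windows path question by using pathlib. The function name save_random_plots and the temporary demo folder are my own placeholders, not part of the original script, and the "Agg" backend is only there so that it runs without opening windows.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are only written to disk
import matplotlib.pyplot as plt
import numpy as np


def save_random_plots(parent_folder, n_analyses=5):
    """Create Analysis1..AnalysisN under parent_folder and save one plot in each."""
    parent = Path(parent_folder)
    saved = []
    for i in range(1, n_analyses + 1):
        folder = parent / f"Analysis{i}"
        # exist_ok=True: no error if the folder is already there from an earlier run
        folder.mkdir(parents=True, exist_ok=True)

        x = np.linspace(1, 100, 100)
        y = np.random.random(100)

        fig, ax = plt.subplots()
        ax.plot(x, y, color="red", linewidth=2)
        target = folder / "plot.png"
        fig.savefig(target)
        plt.close(fig)  # release the figure once it has been written
        saved.append(target)
    return saved


# Demo in a throwaway directory; pass your own parent folder instead.
demo_parent = tempfile.mkdtemp()
paths = save_random_plots(demo_parent)
print(len(paths), "plots saved under", demo_parent)
```

Because the paths are built with the / operator of pathlib, the same code should work with a Windows parent folder such as the one in the question.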

You still haven't provided a DataFrame as an example, and I have no access to your local folder. I assume you have a pandas DataFrame anyway, so I wrote the code for random data. Before giving you the code, I'll try to clear up some misunderstandings:

1. Quoting your comment:

# Create your plot. It is my understanding that here I should just give the x and the y I want to plot.

Yes, this is correct. However, you mixed up pandas plotting and matplotlib:

plt.plot(x, y, color = "red", kind = "line", legend = "true", linewidth = 2)

Stick to one. kind='line' and legend='true' are pandas plotting arguments, while plt.plot() is matplotlib plotting. Mixing them won't work ;) matplotlib does not know the keyword kind, which is why you get the AttributeError.
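
To make the difference concrete, here is a small sketch that draws the same line once with matplotlib only and once with pandas plotting only. The tiny DataFrame and the output file names are made up for the illustration, and the "Agg" backend is just so nothing pops up on screen.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # save to disk only, no windows
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data, only for the illustration
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, 3.5, 3.0, 5.0]})
out_dir = tempfile.mkdtemp()

# Option A: matplotlib only. There is no `kind`; the legend entry comes from `label`.
fig, ax = plt.subplots()
ax.plot(df["x"], df["y"], color="red", linewidth=2, label="y")
ax.legend()
mpl_path = os.path.join(out_dir, "matplotlib_version.png")
fig.savefig(mpl_path)
plt.close(fig)

# Option B: pandas plotting only. `kind` and `legend` belong to DataFrame.plot,
# which returns the matplotlib Axes it drew on.
ax = df.plot(x="x", y="y", kind="line", legend=True, color="red", linewidth=2)
pandas_path = os.path.join(out_dir, "pandas_version.png")
ax.get_figure().savefig(pandas_path)
plt.close(ax.get_figure())

print("saved:", mpl_path, pandas_path)
```

Both options end up calling matplotlib to draw, so either one is fine as long as the keyword arguments match the function you are calling.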

2. extension = '.png' is not necessary (at least in this case)

plt.savefig() saves as .png by default when the file name has no extension, so you can leave it out. Keeping .png in the file name does no harm either.
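
If you want to check this yourself, the sketch below saves the same figure once without and once with an extension and then lists the folder. It assumes the default matplotlib settings (savefig.format left at 'png'); the file names are arbitrary.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # save to disk only, no windows
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], "r", linewidth=2)

# No extension: matplotlib falls back to its default format and appends it
fig.savefig(os.path.join(out_dir, "no_extension"))
# Explicit extension: the format is taken from the file name
fig.savefig(os.path.join(out_dir, "with_extension.png"))
plt.close(fig)

saved_files = sorted(os.listdir(out_dir))
print(saved_files)
```

One caveat: the format is guessed from whatever follows the last dot, so a name that contains a dot for another reason is read as already having an extension. In that case, spelling out .png is the safer choice.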

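So this is my code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def create_plot(file_name, x, y):
    # Draw one red line, save it, then close the figure so figures don't pile up in memory
    fig, ax = plt.subplots()
    ax.plot(x, y, 'r', linewidth=2)
    fig.savefig(file_name)
    plt.close(fig)

def createalotofdata(n, df):
    # Add n columns of random data to df (df is modified in place)
    for i in range(n):
        df[f'data number{i}'] = np.random.rand(10)

x = np.arange(10)
df = pd.DataFrame({'x0': x})

createalotofdata(5, df)

# One plot per data column, i.e. every column except 'x0'
for i in range(len(df.columns) - 1):
    create_plot(f'Plot number {i}', df['x0'], df[f'data number{i}'])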

The script prints nothing; only the plots are saved:

[image: screenshot of the saved plot files]

Hope you understand and can adapt it to your needs. Do ask again if something is still unclear.

So yesterday I posted this question: how can I plot n graphs, for different analyses, and save them in different .png files? Thanks to Karina and Waykiki (and somehow, myself) I made it. Below is the code I now have, which actually works, with an example.

I created a simple example with a simple dataframe:

import os
import matplotlib.pyplot as plt
import pandas as pd

# Simple example dataframe
df = pd.DataFrame({'ID':['A','B','B','A','C','C'], 'X': [5,3,4,2,5,3], 'Y':[1,2,6,4,5,2]})

def create_plot(file_name, x, y, label):
    # As I was advised, I stopped using "group.plot" (pandas plotting): stick to one library!
    plt.figure()
    plt.plot(x, y, color='red', linewidth=2, label=label)
    plt.legend()  # show the ID passed in as label
    plt.savefig(file_name)
    plt.show()
    plt.close()  # make sure the next plot starts from an empty figure

def main():
    # must be full-path
    parent_folder = r"C:\Users\A\AppData\Local\Programs\Python\Python39\Test"

    # move to parent directory
    os.chdir(parent_folder)

    extension = '.png'
    for i in range(5):
        # Create a new folder (no error if it already exists from an earlier run)
        folder_name = 'Analysis' + str(i+1)
        os.makedirs(folder_name, exist_ok=True)
        for ID in df.ID.unique():
            # Keep only the rows of the current ID
            df1 = df[df.ID == ID]
            file_name = "Analysis " + str(i+1) + " - {}".format(ID)
            print(file_name)
            full_file_name = folder_name + '/' + file_name + extension
            x = df1.X
            y = df1.Y
            create_plot(full_file_name, x, y, ID)

if __name__ == '__main__':
    main()

This code works. I can now:

  1. Plot figures using the create_plot() function
  2. Create 1 folder per analysis (here 5 analyses)
  3. Save each plot to a .png file whose name is defined in "file_name" (namely "Analysis 1 - C" in folder Analysis1, "Analysis 2 - A" in folder Analysis2, ...)
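
For what it's worth, the filtering by ID can also be written with groupby, which hands you each ID together with its rows. The sketch below is only a variant of the loop above under my own assumptions: the helper name save_group_plots, the temporary demo folder and the sorting by X are additions for the illustration, and the "Agg" backend keeps it from opening windows.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # save to disk only, no windows
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "B", "A", "C", "C"],
                   "X": [5, 3, 4, 2, 5, 3],
                   "Y": [1, 2, 6, 4, 5, 2]})


def save_group_plots(data, folder, prefix):
    """Save one plot per ID in `folder`, named '<prefix> - <ID>.png'."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    # groupby yields (ID, rows for that ID), with the IDs in sorted order
    for group_id, group in data.groupby("ID"):
        group = group.sort_values("X")  # optional: draw the line from left to right
        fig, ax = plt.subplots()
        ax.plot(group["X"], group["Y"], color="red", linewidth=2, label=str(group_id))
        ax.legend()
        path = os.path.join(folder, f"{prefix} - {group_id}.png")
        fig.savefig(path)
        plt.close(fig)
        saved.append(path)
    return saved


# Demo in a throwaway directory
demo_dir = tempfile.mkdtemp()
saved = save_group_plots(df, os.path.join(demo_dir, "Analysis1"), "Analysis 1")
print([os.path.basename(p) for p in saved])
```

With 92 contracts the loop is the same; only the number of groups changes.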

Now what I need to code is:

  1. Tell my code that analysis 1 should use some columns of my df, analysis 2 some other columns, and so on
  2. Change the x-axis labels so that they show dates that I have defined.
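
One possible way to approach these two points, as a rough sketch rather than a finished solution: keep a dictionary that maps each analysis to the columns it should plot, and put a datetime column on the x-axis so that matplotlib labels the ticks with dates. Everything specific here is hypothetical: the column names Date, Amount and Volume, the analyses dictionary and the function run_analyses are placeholders to adapt to the real dataframe.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # save to disk only, no windows
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: the column names are placeholders
df = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B", "B"],
    "Date": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"] * 2),
    "Amount": [10, 12, 9, 20, 18, 25],
    "Volume": [1, 3, 2, 5, 4, 6],
})

# One entry per analysis: (column for the x-axis, column for the y-axis)
analyses = {
    "Analysis 1": ("Date", "Amount"),
    "Analysis 2": ("Date", "Volume"),
}


def run_analyses(data, parent_folder):
    """Save one plot per analysis and per ID under parent_folder."""
    saved = []
    for name, (x_col, y_col) in analyses.items():
        folder = os.path.join(parent_folder, name)
        os.makedirs(folder, exist_ok=True)
        for group_id, group in data.groupby("ID"):
            fig, ax = plt.subplots()
            ax.plot(group[x_col], group[y_col], color="red", linewidth=2)
            ax.set_xlabel(x_col)
            ax.set_ylabel(y_col)
            fig.autofmt_xdate()  # rotate the date tick labels so they don't overlap
            path = os.path.join(folder, f"{name} - {group_id}.png")
            fig.savefig(path)
            plt.close(fig)
            saved.append(path)
    return saved


# Demo in a throwaway directory
demo_parent = tempfile.mkdtemp()
saved = run_analyses(df, demo_parent)
print(len(saved), "plots saved")
```

Adding a new analysis then only means adding one line to the dictionary, while the plotting loop stays untouched.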

Hope this will help the community!
