
Plot n graphs and save them in n different files

First and foremost, I'd like to make clear that I am no expert in Python and am still learning how to use pandas. I dug through older posts but didn't find a suitable answer.

I've been trying to code a data analysis of 92 contracts. For each of them, I'd like to plot a specific analysis (using some columns of the same dataframe each time) and save each analysis in a different folder (Analysis 1, Analysis 2, ...).

So far, I am facing many difficulties. Thus, before focusing on WHAT to plot, I'd like to understand how to save each plot in a different .png file. The code I've tried does not seem to save anything: when I go to the folder, it's empty.

Thanks to waykiki's help, I've been able to update my code. Now I know how to create one folder per analysis. Yet I still don't understand how to code the plotting of 92 graphs per analysis. My code now looks like this:

import pandas as pd
import matplotlib.pyplot as plt
import os

# Folder in which I want the analyses to be saved
URL5 = r"C:\Users\A\AppData\Local\Programs\Python\Python39"
# 1 graph per ID_Contrat (thus, 92 graphs)
groups = outer_merged_df.groupby("ID_Contrat") #where outer_merged_df is my dataframe
# How to name each plot.
List_ID_Contrat = outer_merged_df["ID_Contrat"].tolist()

def create_plot(file_name, x, y):
    # Create your plot. It is my understanding that here I should just give the x and the y I want to plot.
    fig = plt.figure()
    plt.plot(x, y, color = "red", kind = "line", legend = "true", linewidth = 2)
    plt.savefig(file_name)
    plt.show()

def main():
    # must be full-path. 
    parent_folder = URL5
    # move to parent directory
    os.chdir(parent_folder)
    # I want file_name to be different for each graph
    extension = ".png"
    # 5 = how many analyses I want to do
    for i in range(5):
        for name in List_ID_Contrat :
            file_name = "Analyse" + str[i+1] "{}" + extension.format(name) # I want file_name to be different for each graph and looking like "Analyse i Contrat XX"
        # Create a new folder
        folder_name = 'Analysis ' + str(i+1)
        os.mkdir(folder_name)
        full_file_name = folder_name + '/' + file_name
        x = np.linspace(1,100,100)
        y = np.random.random(100)
        create_plot(full_file_name, x, y)
        print("plot "+ savefile +" finished".format(name))
        
if __name__ == "__main__":
    main()

Yet when I run my code, it neither plots the 92 graphs nor creates the folders anymore (though it did with waykiki's method). The for loop breaks during the first iteration (I only get the folder "Analysis 1") and I get this error message:

AttributeError: 'Line2D' object has no property 'kind'

Could you please explain to me how I can save the graphs? I am getting lost..

Thanks

I think your approach is right, in the sense that you've divided your problem into 2 steps:

1.) Get the technical details done (create, organise and navigate through the folders and data).

2.) Do the actual creation/drawing of plots.

Here is a simple prototype script. It creates N subfolders in the main directory '/home/user/my_analysis/'. All subfolders are named "AnalysisX", where X is the number of the folder.

Every folder contains a different plot.

Note: my folder paths are for a Linux machine, so keep in mind that '/home/user/some_folder/' isn't a valid path on Windows. (I see you've already got that part right, but it might be useful for other users.)

import os
import numpy as np
import matplotlib.pyplot as plt


def create_plot(file_name, x, y):
    # Create your plot
    fig = plt.figure()
    plt.plot(x, y, color='red', linewidth=2)
    plt.savefig(file_name)
    plt.show()


def main():
    # must be full-path
    parent_folder = '/home/user/my_analysis/'

    # move to parent directory
    os.chdir(parent_folder)

    file_name = 'plot'
    extension = '.png'
    for i in range(5):
        # Create a new folder
        folder_name = 'Analysis' + str(i+1)
        os.mkdir(folder_name)

        full_file_name = folder_name + '/' + file_name + extension
        x = np.linspace(1, 100, 100)
        y = np.random.random(100)
        create_plot(full_file_name, x, y)


if __name__ == '__main__':
    main()

For clarity, this is what the folder structure looks like. I've only censored my real username:

[image: screenshot of the folder structure]
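One caveat with the script above: os.mkdir raises FileExistsError when the folder already exists, so a second run stops at the first folder. Below is a minimal sketch of a variant that can be re-run, assuming you are free to use os.makedirs(..., exist_ok=True) and os.path.join instead of os.chdir. The helper name save_analysis_plots and the temporary demo folder are made up for illustration, and the Agg backend is selected only so the sketch runs without opening windows.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: save files without opening windows
import matplotlib.pyplot as plt
import numpy as np


def save_analysis_plots(parent_folder, n_analyses):
    """Create Analysis1..AnalysisN under parent_folder and save one plot in each.

    Hypothetical helper, not part of the original answer.
    """
    saved = []
    for i in range(n_analyses):
        folder = os.path.join(parent_folder, f"Analysis{i + 1}")
        # exist_ok=True: no error if the folder is left over from a previous run
        os.makedirs(folder, exist_ok=True)

        x = np.linspace(1, 100, 100)
        y = np.random.random(100)

        fig, ax = plt.subplots()
        ax.plot(x, y, color="red", linewidth=2)
        path = os.path.join(folder, "plot.png")
        fig.savefig(path)
        plt.close(fig)  # release the figure so memory does not grow in the loop
        saved.append(path)
    return saved


demo_root = tempfile.mkdtemp()  # stand-in for your real parent folder
paths = save_analysis_plots(demo_root, 5)
paths_again = save_analysis_plots(demo_root, 5)  # second run does not raise
```

Building paths with os.path.join also avoids hard-coding '/' as the separator, which keeps the same script usable on Windows and Linux.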

You still haven't provided the DataFrame as an example, and I have no access to your local folder. I assume you have a pandas DataFrame anyway, so I wrote the code with random data. Before giving you the code, I'll try to clear up some misunderstandings:

1. Quoting your comment:

# Create your plot. It is my understanding that here I should just give the x and the y I want to plot.

Yes, this is correct. However, you mixed up pandas plotting and matplotlib:

plt.plot(x, y, color = "red", kind = "line", legend = "true", linewidth = 2)

Stick to one. kind='line' and legend='true' belong to pandas plotting, while plt.plot() is matplotlib plotting. Mixing them won't work ;)
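To make the difference concrete, here is a small sketch that draws the same line both ways and then reproduces the mix-up. The DataFrame, its column names (x, y) and the output file names are invented for the example. kind and legend are arguments of pandas' DataFrame.plot, which returns a matplotlib Axes, whereas plt.plot / ax.plot only accept line properties such as color, linewidth and label.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch only writes files
import matplotlib.pyplot as plt
import pandas as pd

# Invented demo data
df_demo = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, 1.0, 4.0, 3.0]})
out_dir = tempfile.mkdtemp()

# Option 1: matplotlib only (no 'kind', legend requested separately)
fig, ax = plt.subplots()
ax.plot(df_demo["x"], df_demo["y"], color="red", linewidth=2, label="y")
ax.legend()
mpl_path = os.path.join(out_dir, "matplotlib_version.png")
fig.savefig(mpl_path)
plt.close(fig)

# Option 2: pandas plotting ('kind' and 'legend' are valid here)
ax2 = df_demo.plot(x="x", y="y", kind="line", legend=True, color="red", linewidth=2)
pandas_path = os.path.join(out_dir, "pandas_version.png")
ax2.get_figure().savefig(pandas_path)
plt.close(ax2.get_figure())

# The mix-up from the question: a pandas-only keyword passed to matplotlib.
# The exact message depends on the matplotlib version.
mixing_error = None
try:
    fig3, ax3 = plt.subplots()
    ax3.plot(df_demo["x"], df_demo["y"], kind="line")
except (AttributeError, TypeError) as exc:
    mixing_error = exc
finally:
    plt.close("all")
```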

2. extension = '.png' is not necessary (at least in this case)

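When the file name has no extension, plt.savefig() falls back to its default format, which is PNG unless you changed rcParams["savefig.format"], and appends the extension for you. Otherwise the format is taken from the extension, so keeping '.png' in the file name is harmless and simply makes the format explicit.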
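For reference, savefig chooses the output format from the file name's extension and only falls back to its default when there is none. The sketch below is a quick way to check this on your own machine; it assumes a default matplotlib configuration, and the file names are arbitrary.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch only writes files
import matplotlib.pyplot as plt

out_dir = tempfile.mkdtemp()
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# No extension: matplotlib uses rcParams["savefig.format"] and appends it
fig.savefig(os.path.join(out_dir, "no_extension"))
# Explicit extensions decide the format
fig.savefig(os.path.join(out_dir, "explicit.png"))
fig.savefig(os.path.join(out_dir, "vector.pdf"))
plt.close(fig)

default_format = plt.rcParams["savefig.format"]
created = sorted(os.listdir(out_dir))
print(default_format, created)
```

One thing to watch for: a dot inside the name (for example "Analysis 1.5") is read as an extension, so adding '.png' yourself is the safer habit when file names are built from data.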

So this is my code:

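import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def create_plot(file_name, x, y):
    # matplotlib only: draw a red line, save it, then close the figure
    fig, ax = plt.subplots()
    ax.plot(x, y, 'r', linewidth = 2)
    plt.savefig(file_name)
    plt.close()

def createalotofdata(n, df):
    # Add n columns of random data to df (df is modified in place)
    for i in range(n):
        df[f'data number{i}'] = np.random.rand(10)
#     print(df)

x = np.arange(10)
df = pd.DataFrame({'x0': x})

createalotofdata(5, df)

# One plot per data column (every column except 'x0')
for i in range(len(df.columns)-1):
    create_plot(f'Plot number {i}', df['x0'], df[f'data number{i}'])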

Running it displays nothing; the plots are only saved to disk:

[image: screenshot of the saved plot files]

Hope you understand and can adapt according to your need. Do ask again if something is still unclear.

So yesterday I posted this question: how can I plot n graphs, for different analyses, and save them in different .png files? Thanks to Karina and Waykiki (and somehow, myself) I made it. Below is the code I now have, which actually works, with an example.

I created a simple example with a simple dataframe:

import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'ID':['A','B','B','A','C','C'], 'X': [5,3,4,2,5,3], 'Y':[1,2,6,4,5,2]}) #simple dataframe

def create_plot(file_name, x, y, label):
    # Create your plot. As I was advised, I stopped using "group.plot", which
    # comes from pandas plotting: stick to one library!
    plt.figure()  # start from an empty figure for each plot
    plt.plot(x, y, color='red', linewidth=2, label=label)
    plt.savefig(file_name)
    plt.show()
    plt.close()

def main():
    # must be full-path
    parent_folder = r"C:\Users\A\AppData\Local\Programs\Python\Python39\Test"

    # move to parent directory
    os.chdir(parent_folder)

    extension = '.png'
    for i in range(5):
        # Create a new folder (exist_ok avoids an error when re-running the script)
        folder_name = 'Analysis' + str(i+1)
        os.makedirs(folder_name, exist_ok=True)
        for ID in df.ID.unique():
            # Keep only the rows of the current ID
            df1 = df[df.ID == ID]
            file_name = "Analysis " + str(i+1) + " - {}".format(ID)
            print(file_name)
            full_file_name = folder_name + '/' + file_name + extension
            x = df1.X
            y = df1.Y
            # ID is local to main(), so it is passed to create_plot explicitly
            create_plot(full_file_name, x, y, ID)

if __name__ == '__main__':
    main()

This code works. I can now:

  1. Plot figures using the create_plot() function
  2. Create 1 folder per analysis (here 5 analyses)
  3. Save each plot to a .png file whose name is as defined in "file_name" (namely Analysis 1 - C in folder Analysis1, Analysis 2 - A in folder Analysis2, ...)
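The loop over df.ID.unique() can also be written with groupby, which hands you each ID together with its rows in one step. This is only a sketch reusing the toy dataframe from this post; the function name plot_per_id and the temporary folder are placeholders, and sorting by X is my own addition so the line is drawn from left to right.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch only writes files
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'ID': ['A', 'B', 'B', 'A', 'C', 'C'],
                   'X': [5, 3, 4, 2, 5, 3],
                   'Y': [1, 2, 6, 4, 5, 2]})


def plot_per_id(frame, folder, prefix):
    """Save one PNG per ID in `folder` (hypothetical helper)."""
    os.makedirs(folder, exist_ok=True)
    saved = []
    # groupby yields (key, sub-dataframe) pairs, keys in sorted order by default
    for id_value, group in frame.groupby('ID'):
        group = group.sort_values('X')  # draw the line from left to right
        fig, ax = plt.subplots()
        ax.plot(group['X'], group['Y'], color='red', linewidth=2, label=str(id_value))
        ax.legend()
        path = os.path.join(folder, f"{prefix} - {id_value}.png")
        fig.savefig(path)
        plt.close(fig)
        saved.append(path)
    return saved


root = tempfile.mkdtemp()  # stand-in for the real parent folder
saved_files = plot_per_id(df, os.path.join(root, 'Analysis1'), 'Analysis 1')
```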

Now what I need to code is:

  1. How to tell my code that for analysis 1 I want some columns of my df, for analysis 2 some other columns, and so on
  2. Change x_axis label so it presents dates that I have defined.
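One possible direction for these two points, as a rough sketch rather than a finished solution: keep a dictionary that maps each analysis folder to the column to plot, and let matplotlib.dates format the x axis. Everything in the example data (the Date, Price and Volume columns, the two analyses) is invented, since the real dataframe is not shown here.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch only writes files
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Invented data: two IDs, three monthly observations each
df_demo = pd.DataFrame({
    "ID": ["A", "A", "A", "B", "B", "B"],
    "Date": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"] * 2),
    "Price": [10.0, 12.0, 11.0, 20.0, 19.0, 23.0],
    "Volume": [100, 150, 120, 80, 90, 60],
})

# One entry per analysis: folder name -> column plotted on the y axis (assumed layout)
analyses = {
    "Analysis1": "Price",
    "Analysis2": "Volume",
}


def run_analyses(frame, config, parent_folder):
    """Save one plot per (analysis, ID) pair; hypothetical helper."""
    saved = []
    for folder_name, y_col in config.items():
        folder = os.path.join(parent_folder, folder_name)
        os.makedirs(folder, exist_ok=True)
        for id_value, group in frame.groupby("ID"):
            fig, ax = plt.subplots()
            ax.plot(group["Date"], group[y_col], color="red", linewidth=2)
            ax.set_xlabel("Date")
            ax.set_ylabel(y_col)
            # Show the x ticks as year-month
            ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
            fig.autofmt_xdate()  # rotate the date labels so they do not overlap
            path = os.path.join(folder, f"{folder_name} - {id_value}.png")
            fig.savefig(path)
            plt.close(fig)
            saved.append(path)
    return saved


saved_files = run_analyses(df_demo, analyses, tempfile.mkdtemp())
```

The dictionary could just as well hold a tuple of x and y columns per analysis if the x axis also changes from one analysis to the next.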

Hope this will help the community !

