I call one C function from Python over more than 100k iterations, and the profiler shows about +25 MB for each call of that function. The results themselves are not cumulative (I write output to a file after each call), so the memory allocation should not be cumulative either; otherwise it is a memory leak.
I have already seen how one can free objects in C with Py_DECREF before returning. But in my case, given the way the algorithm was built, there is no return value; instead, pointers passed as parameters carry the results back to the Python code on each iteration. With my basic knowledge of C, I tried turning the parameters into return values, without success.
Here is the declaration of the parameters and the call. My C library function is Illumination:

Illumination.restype = None
Illumination.argtypes = [ctypes.c_double, ctypes.c_double,
                         ndpointer(ctypes.c_float),
                         ndpointer(ctypes.c_int)]
All intermediate variables in Illumination are freed with free().
The function in C is declared as:

void Illumination(double Longitude, double Latitude, float *fact_sun, int *nb_nodes)
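For context, a minimal sketch (with a tiny stand-in array size, not the real code) of the key property of this calling convention: Python preallocates the output buffers once and C only writes into them, so ctypes hands C a pointer to the same buffer on every call. If that pattern is followed, nothing new is allocated per iteration on the Python side, and any +25 MB per call must come from allocations inside the C function that are never freed.

```python
import numpy as np

# Stand-in sizes; the real code uses nb_face = 124938.
nb_face = 8
cos_illu = np.empty(nb_face, dtype=np.float32)
nb_node_illu = np.empty(nb_face, dtype=np.int32)

# The buffer address ctypes would pass to C is stable across iterations.
addr_before = cos_illu.ctypes.data
for _ in range(3):
    # lib.Illumination(lon, lat, cos_illu, nb_node_illu)  # real call here
    cos_illu[:] = 0.0  # emulates C writing into the existing buffer
assert cos_illu.ctypes.data == addr_before  # same buffer every iteration
```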
I already tried changing types, such as double to float, hoping to keep the results more compact later, but it did not help with the memory leak.
Calling ctypes._reset_cache() and gc.collect() did not help, whether inside the loop or elsewhere in the script.
To my understanding I need to deal with this on the Python side; sadly, I don't know how.
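One way to confirm where the leak lives: gc.collect() can only reclaim Python objects, and memory malloc'd inside the shared library is invisible to it. Comparing tracemalloc snapshots around the call can show this, since tracemalloc tracks only allocations made through Python's allocator; if the snapshot diff stays near zero while the process RSS climbs by ~25 MB per call, the leak is inside the C library. A sketch (with a Python-side stand-in allocation in place of the real Illumination call):

```python
import gc
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# ... the Illumination(...) call would go here ...
scratch = [bytes(1000) for _ in range(100)]  # stand-in Python allocation

after = tracemalloc.take_snapshot()
growth = sum(s.size_diff for s in after.compare_to(before, "lineno"))
print("Python-heap growth: %d bytes" % growth)
# The stand-in allocation IS visible to tracemalloc; a C-side malloc leak
# would NOT be, even while RSS grows.
assert growth > 0
del scratch
gc.collect()
tracemalloc.stop()
```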
Profiling result is
Line # Mem usage Increment Occurences Line Contents
============================================================
30 106.109 MiB 106.109 MiB 1 @profile
31 def my_func():
32
33 # C Illumination function from shared file Illumination.so
34
35 106.242 MiB 0.133 MiB 1 lib = ctypes.CDLL('Illumination.so')
36 106.242 MiB 0.000 MiB 1 Illumination = lib.Illumination
37 106.242 MiB 0.000 MiB 1 Illumination.restype = None
38 106.242 MiB 0.000 MiB 1 Illumination.argtypes = [ctypes.c_double, ctypes.c_double,
39 106.242 MiB 0.000 MiB 1 ndpointer(ctypes.c_float),
40 106.242 MiB 0.000 MiB 1 ndpointer(ctypes.c_int)]
41
42 106.242 MiB 0.000 MiB 1 start_time = time.time()
43
44 # @profile
45 # #############################################################################################################
46 # variable initialization and declaration
47 # #############################################################################################################
48 # PATHNAMES
49
50 106.242 MiB 0.000 MiB 1 SHAPE_MODEL_PATHNAME = 'fac_ets.obj'
51 106.242 MiB 0.000 MiB 1 BARYCENTER_PATHNAME = 'bar_ycen.txt'
52 SS_COORD_PATHNAME = \
53 106.242 MiB 0.000 MiB 1 'SS_coord/sb_lonlat_ort.txt'
54
55 # 20201001174159.txt: Dt 3h old kernel | 20201014151855_1styear.txt: 25min
56 # 25_11_20.txt new kernel, dt=24,min
57 # lonlat_27_11_20_1ort_3h : Dt 3h new kernel
58
59 106.242 MiB 0.000 MiB 1 SH_FACET_PATHNAME = 'self_illu_facets/'
60
61 # facets_SH_geometry old folder
62
63 # facets_SH = pd.DataFrame(genfromtxt("facets_SH.txt"))[0].astype(np.int)
64
65 # SHAPE MODEL AND DX RESOLUTION
66
67 106.242 MiB 0.000 MiB 1 nb_face = 124938 # number of facets in shape model
68 106.242 MiB 0.000 MiB 1 nb_node = 62471 # number of nodes in shape model
69 106.242 MiB 0.000 MiB 1 nb_pts = 1 # & 18817 old kernel | 136346 & 18796 new kernel
70
71 # # LONGITUDE AND LATITUDE
72
73 115.289 MiB 9.047 MiB 1 lon_lat_dH = genfromtxt(SS_COORD_PATHNAME)
74 115.289 MiB 0.000 MiB 1 lon_lat_dH = pd.DataFrame(lon_lat_dH)
75
76 # if we read it from spice data file (ignore date and distance)
77
78 115.289 MiB 0.000 MiB 1 Longitude = lon_lat_dH[3]
79 115.883 MiB 0.594 MiB 1 Latitude = 90 - lon_lat_dH[4]
80 115.883 MiB 0.000 MiB 1 dH = lon_lat_dH[6] * 6.684587e-9 # 1.496e-8
81
82 # constants for energy equations
83
84 115.883 MiB 0.000 MiB 1 F = 1368 # constant
85 115.883 MiB 0.000 MiB 1 sigma = 5.67E-8 # stefan-boltzman constant
86 115.883 MiB 0.000 MiB 1 albedo = 0.06 # geometric albedo
87 115.883 MiB 0.000 MiB 1 albedo_bond = 0.04 # bond albedo
88 115.883 MiB 0.000 MiB 1 emiss = 0.95 # emissivity
89 115.883 MiB 0.000 MiB 1 inv_pi = 1 / pi
90 115.883 MiB 0.000 MiB 1 fact_SH_VIS = F * albedo * inv_pi
91 115.883 MiB 0.000 MiB 1 fact_T = (1 - albedo_bond) * F / (emiss * sigma)
92 115.883 MiB 0.000 MiB 1 fact_SH_IR = emiss * sigma * inv_pi
93
94 115.883 MiB 0.000 MiB 1 facets_selection = [44248] # 44247,44248
95
96 # 25821,119166-119167,35730-35731,43638,119147
97
98 179.820 MiB 0.000 MiB 2 for facet in facets_selection:
99
100 115.883 MiB 0.000 MiB 1 facet_i = facet
101 OUTPUT_PATHNAME = \
102 'flux_outputs/flux_new_kernel/region8' \
103 115.883 MiB 0.000 MiB 1 + str(facet_i) + 'testProfiler.txt'
104
105 # variables for illumination geometry
106
107 115.883 MiB 0.000 MiB 1 cos_Illumination = np.empty(nb_face, dtype=np.float32) # # float32 ## added
108 115.883 MiB 0.000 MiB 1 nb_node_Illumination = np.empty(nb_face, dtype=np.int32) # # int8 ## added
109
110 # variables for geometry of self-heating (cosz2)
111
112 116.387 MiB 0.504 MiB 1 cos_alpha = [None] * nb_face
113 117.418 MiB 1.031 MiB 1 angle_solid = [None] * nb_face
114 118.449 MiB 1.031 MiB 1 factor = [None] * nb_face
115
116 # #############################################################################################################
117 # SELF-HEATING GEOMETRY
118 # #############################################################################################################
119
120 # READING DATA OF SELF-HEATING GEOMETRY
121
122 127.465 MiB 9.016 MiB 1 Nod_coord = genfromtxt(SHAPE_MODEL_PATHNAME)
123 127.465 MiB 0.000 MiB 1 Nod_coord = pd.DataFrame(Nod_coord)
124 127.465 MiB 0.000 MiB 1 x_node = (Nod_coord[1])[0:nb_node].tolist()
125 127.465 MiB 0.000 MiB 1 y_node = (Nod_coord[2])[0:nb_node].tolist()
126 127.773 MiB 0.309 MiB 1 z_node = (Nod_coord[3])[0:nb_node].tolist()
127 127.773 MiB 0.000 MiB 1 i_f = np.asarray((Nod_coord[1])[nb_node:nb_face
128 128.801 MiB 1.027 MiB 1 + nb_node]).astype(int) - 1
129 128.801 MiB 0.000 MiB 1 j_f = np.asarray((Nod_coord[2])[nb_node:nb_face
130 129.840 MiB 1.039 MiB 1 + nb_node]).astype(int) - 1
131 129.840 MiB 0.000 MiB 1 k_f = np.asarray((Nod_coord[3])[nb_node:nb_face
132 130.871 MiB 1.031 MiB 1 + nb_node]).astype(int) - 1
133 130.871 MiB 0.000 MiB 1 m = [i_f[facet_i], j_f[facet_i], k_f[facet_i]]
134 130.871 MiB 0.000 MiB 1 node1 = np.array([x_node[m[0]], y_node[m[0]], z_node[m[0]]])
135 130.871 MiB 0.000 MiB 1 node2 = np.array([x_node[m[1]], y_node[m[1]], z_node[m[1]]])
136 130.871 MiB 0.000 MiB 1 node3 = np.array([x_node[m[2]], y_node[m[2]], z_node[m[2]]])
137 130.871 MiB 0.000 MiB 1 node1_node2 = np.array([node2[0] - node1[0], node2[1]
138 130.871 MiB 0.000 MiB 1 - node1[1], node2[2] - node1[2]])
139 130.871 MiB 0.000 MiB 1 node1_node3 = np.array([node3[0] - node1[0], node3[1]
140 130.871 MiB 0.000 MiB 1 - node1[1], node3[2] - node1[2]])
141
142 # READING FACETS BARYCENTER DATA
143
144 138.703 MiB 7.832 MiB 1 barycenter = genfromtxt(BARYCENTER_PATHNAME)
145 138.703 MiB 0.000 MiB 1 barycenter = pd.DataFrame(barycenter)
146 138.703 MiB 0.000 MiB 1 x_bary = barycenter[1]
147 138.703 MiB 0.000 MiB 1 y_bary = barycenter[2]
148 138.703 MiB 0.000 MiB 1 z_bary = barycenter[3]
149
150 # NORMAL VECTOR OF the FACET
151
152 138.703 MiB 0.000 MiB 1 vect_a = np.cross(node1_node2, node1_node3)
153
154 # SH = genfromtxt('output.txt')
155
156 138.703 MiB 0.000 MiB 1 SH = genfromtxt(glob.glob(SH_FACET_PATHNAME + '*'
157 143.492 MiB 4.789 MiB 1 + str(facet_i) + '.txt')[0])
158 143.492 MiB 0.000 MiB 1 SH = pd.DataFrame(SH)
159 143.492 MiB 0.000 MiB 1 S_facet = SH[0]
160 143.492 MiB 0.000 MiB 1 cos_SH = SH[1]
161 143.492 MiB 0.000 MiB 1 node_SH = SH[2]
162 143.492 MiB 0.000 MiB 1 x = SH[3]
163
164 # SOLID ANGLE
165
166 153.730 MiB 0.000 MiB 124939 for l in range(nb_face):
167 153.730 MiB 4.891 MiB 124938 if node_SH[l] != 3 or x[l] == 0:
168 153.730 MiB 0.000 MiB 124520 factor[l] = 0
169 else:
170
171 # vector facet->facet_x
172
173 153.730 MiB 4.867 MiB 418 vect_b = np.array([x_bary[l] - x_bary[facet_i],
174 153.730 MiB 0.000 MiB 418 y_bary[l] - y_bary[facet_i],
175 153.730 MiB 0.000 MiB 418 z_bary[l] - z_bary[facet_i]])
176 153.730 MiB 0.480 MiB 418 if np.linalg.norm(vect_b) == 0:
177 153.730 MiB 0.000 MiB 1 factor[l] = 0
178 else:
179 153.730 MiB 0.000 MiB 417 cos_alpha[l] = np.vdot(vect_a, vect_b) \
180 153.730 MiB 0.000 MiB 417 / (np.linalg.norm(vect_a)
181 153.730 MiB 0.000 MiB 417 * np.linalg.norm(vect_b))
182 153.730 MiB 0.000 MiB 417 if cos_alpha[l] <= 0:
183 factor[l] = 0
184 else:
185 factor[l] = cos_SH[l] * cos_alpha[l] \
186 153.730 MiB 0.000 MiB 417 * S_facet[l] / (np.linalg.norm(vect_b)
187 153.730 MiB 0.000 MiB 417 * np.linalg.norm(vect_b))
188 153.730 MiB 0.000 MiB 418 del vect_b # ###added
189 153.730 MiB 0.000 MiB 124938 ctypes._reset_cache()
190 # #############################################################################################################
191 # energy flux calculations
192 # #############################################################################################################
193
194 153.730 MiB 0.000 MiB 1 SH = 0
195 153.730 MiB 0.000 MiB 1 flux_neighbor_VIS = 0
196 153.730 MiB 0.000 MiB 1 flux_neighbor_IR = 0
197
198 # opening output file to write
199
200 153.730 MiB 0.000 MiB 1 file = open(OUTPUT_PATHNAME, 'w')
201
202 # file2 = open(OUTPUT_PATHNAME, 'w')
203
204 179.820 MiB 0.000 MiB 2 for i in range(nb_pts):
205
206 153.730 MiB 0.000 MiB 1 VIS = 0.0
207 153.730 MiB 0.000 MiB 1 IR = 0.0
208
209 153.730 MiB 0.000 MiB 1 inv_dH_sqr = 1 / (dH[i] * dH[i])
210
211 153.730 MiB 0.000 MiB 1 Illumination(Longitude[i], Latitude[i], cos_Illumination,
212 180.023 MiB 26.293 MiB 1 nb_node_Illumination)
213 180.023 MiB 0.000 MiB 1 if i % 1000 == 0:
214 179.820 MiB -0.203 MiB 1 gc.collect() # ##added
215 179.820 MiB 0.000 MiB 124939 for l in range(nb_face):
216 179.820 MiB 0.000 MiB 124938 if nb_node_Illumination[l] != 3 or cos_Illumination[l] \
217 179.820 MiB 0.000 MiB 42745 < 0:
218 179.820 MiB 0.000 MiB 82193 cos_Illumination[l] = 0
219
220 179.820 MiB 0.000 MiB 124938 if l == facet_i or factor[l] == 0:
221 179.820 MiB 0.000 MiB 124521 flux_neighbor_VIS = 0
222 179.820 MiB 0.000 MiB 124521 flux_neighbor_IR = 0
223 else:
224 flux_neighbor_VIS = cos_Illumination[l] * factor[l] \
225 179.820 MiB 0.000 MiB 417 * fact_SH_VIS * inv_dH_sqr
226 179.820 MiB 0.000 MiB 417 T_neighbor_4 = max(fact_T * cos_Illumination[l]
227 179.820 MiB 0.000 MiB 417 * inv_dH_sqr, 160000) # 30**4, 40k
228 flux_neighbor_IR = fact_SH_IR * factor[l] \
229 179.820 MiB 0.000 MiB 417 * T_neighbor_4
230
231 179.820 MiB 0.000 MiB 124938 VIS = VIS + flux_neighbor_VIS
232 179.820 MiB 0.000 MiB 124938 IR = IR + flux_neighbor_IR
233
234 179.820 MiB 0.000 MiB 1 Sol = cos_Illumination[facet_i] * F * inv_dH_sqr * (1
235 179.820 MiB 0.000 MiB 1 - albedo)
236 179.820 MiB 0.000 MiB 1 SH = (VIS + IR) * (1 - albedo)
237 179.820 MiB 0.000 MiB 1 Flux = Sol + SH
238 179.820 MiB 0.000 MiB 1 file.write(' %5.5f %5.5f %5.5f %5.5f %5.5f \n' % (Flux,
239 179.820 MiB 0.000 MiB 1 Sol, SH, VIS, IR))
240
241 # # geometry of illumination of facets contributing to selfheating of our facet
242 # for y in range(nb_facets_SH):
243 # file2.write(" %2.5f %2.5f \n" % (cos_Illumination[facets_SH[y]], nb_node_Illumination[facets_SH[y]]))
244
245 # file2.close()
246
247 179.820 MiB 0.000 MiB 1 file.close()
248 179.820 MiB 0.000 MiB 1 gc.collect()
249
250 179.820 MiB 0.000 MiB 1 cos_Illumination = None
251 179.820 MiB 0.000 MiB 1 del cos_Illumination
252 179.820 MiB 0.000 MiB 1 nb_node_Illumination = None
253 179.820 MiB 0.000 MiB 1 del nb_node_Illumination
254 179.820 MiB 0.000 MiB 1 interval = time.time() - start_time
255 179.820 MiB 0.000 MiB 1 print ('Total time in min:', interval / 60)
256 179.820 MiB 0.000 MiB 1 ctypes._reset_cache()
257 179.820 MiB 0.000 MiB 1 gc.collect()
Any hints?
Edit: Code pastebin.com/N76eXu2w & pastebin.com/m2dT4CyQ
A tool that is particularly helpful when you are using both Python and C/C++ is https://github.com/vmware/chap, because it understands both allocations in Python arenas and allocations done using libc malloc. The tool is open source and runs on Linux.
In your case, what I would do is the following:
1. From the shell prompt, run: echo 0x37 > /proc/<pid-of-your-process>/coredump_filter
2. Gather a core for your process using gcore.
3. Wait one minute or so.
4. Gather another core for your process using gcore.
5. Open each core in chap and run the following at the chap command prompt:

   redirect on
   describe used
   describe leaked
   summarize used
If you have an actual leak, where allocations become unreachable, the output of "describe leaked" will reflect that.
If you simply have container growth, where some container keeps growing, you should see it quickly by comparing the "describe used" output for the two cores: identify a kind of object that appears much more often in the second core, pick one instance, and use chap on the second core to understand why it is still held.
Thanks for posting the pastebin links. I see there that you have already attempted to change the Illumination() function to return the fact_sun pointer instead of passing it as an argument. I also see the free_mem() function you may have added. I will suggest an answer based on this version.
It looks like the free_mem() function is being used correctly for the cos_Illumination pointer in your Python code. However, there is a small mistake in your C implementation of free_mem() that makes a big difference. On line 576 of your second pastebin link, it should be if (fact_sun) and not if (!fact_sun). In other words, you want to free the memory when the pointer has a value (i.e., it is not NULL), not when it doesn't. (I assume you were not seeing any output from the printf statement inside that if block, which is incidentally also incorrect: it should be printf("%p\n", (void *)fact_sun); without the &.)
Note that, strictly speaking, the NULL check in free_mem() shouldn't be necessary at all, since freeing a NULL pointer is a no-op according to the C standard. (In fact, your code already relies on that elsewhere: it only allocates and assigns slots 1 through N_Node and N_Face of several of the vectors, but frees slots 0 through N_Node and N_Face.) Still, I think it is a good idea to keep the check, as I have seen older C compilers whose runtimes had issues freeing a NULL pointer.
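That no-op behavior is easy to verify from Python, assuming CPython on Linux or macOS where CDLL(None) exposes the C runtime of the current process; the C standard defines free(NULL) to do nothing:

```python
import ctypes

libc = ctypes.CDLL(None)  # C runtime of the current process (Linux/macOS)
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None
libc.free(None)  # free(NULL): defined by the C standard to do nothing
survived = True  # reached only because free(NULL) did not crash
print("free(NULL) returned safely")
```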
Setting fact_sun to NULL there isn't necessary either (fact_sun is a local variable, so it goes out of scope when the function returns), but that practice would be a good idea for each of the global variables freed in the other free_memory() function. However, it does not appear to be a source of problems for the code as it currently stands.
So overall, it becomes:

// free memory of the buffer returned to Python
void free_mem(double *fact_sun) {
    if (fact_sun) {
        printf("%p\n", (void *)fact_sun);
        free(fact_sun);
    }
}
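The ownership rule behind free_mem() can be sketched from the Python side using libc's malloc/free (via CDLL(None), assuming CPython on Linux or macOS) as a stand-in for the Illumination library's allocator: memory malloc'd on the C side and handed to Python must be copied out and then returned to a C free function, because Python's garbage collector can never reclaim it.

```python
import ctypes

libc = ctypes.CDLL(None)  # stand-in for the real shared library
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

n = 4
raw = libc.malloc(n * ctypes.sizeof(ctypes.c_double))   # "C" allocates
buf = ctypes.cast(raw, ctypes.POINTER(ctypes.c_double))
for i in range(n):
    buf[i] = float(i)                                   # "C" fills the buffer
values = [buf[i] for i in range(n)]                     # Python copies it out
libc.free(raw)           # hand it back to C; never touch buf after this
assert values == [0.0, 1.0, 2.0, 3.0]
```

The same discipline applies to the returned fact_sun pointer: copy it into a NumPy array if you need it to outlive the call, then call free_mem() exactly once.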
Finally, although it's probably not a big deal (as long as the pointer is NULL), it looks like you do not need the free(nb_nodes) on line 380 of your second pastebin link, since you never allocate it (unless there is more code you didn't include).