In the ring buffer, can the callchain be retrieved only for PERF_RECORD_SAMPLE, or can it be done for other record types as well?
The man page of perf_event_open only explicitly states that the callchain is available for PERF_RECORD_SAMPLE. I am particularly interested in getting the callchain for PERF_RECORD_SWITCH, to get the stack trace at the points where my program is context-switched in and out. I've tried reading the callchain directly from the buffer, but the addresses returned look incorrect.
size_t index = mapping->data_tail; // mapping points to the mmap'ed ring buffer (metadata page first)
uintptr_t base = reinterpret_cast<uintptr_t>(mapping) + PageSize; // data area starts one page in
size_t start_index = index % DataSize;
size_t end_index = start_index + sizeof(struct perf_event_header);
memcpy(buf, reinterpret_cast<void*>(base + start_index), sizeof(struct perf_event_header));
struct perf_event_header* header = reinterpret_cast<struct perf_event_header*>(buf);
uintptr_t p = reinterpret_cast<uintptr_t>(header) + sizeof(struct perf_event_header);
// Only sampling PERF_SAMPLE_CALLCHAIN
uint64_t* ips = reinterpret_cast<uint64_t*>(p);
uint64_t size = *ips; // Should be callchain size
ips++;
for (uint64_t i = 0; i < size; i++) {
    cout << ips[i] << endl; // prints the addresses in the callchain stack
}
There are two main issues with the output of this snippet:
1. All PERF_RECORD_SWITCH records have the same callchain, which should be extremely unlikely.
2. The output is not consistent across runs. The callchain size varies from 0 (mostly) to 4, 6, 16, and sometimes a very large (seemingly garbage) number.
The callchain is only available for PERF_RECORD_SAMPLE events.
When reading the different record types, you should follow the struct definitions from the perf_event_open man page rather than accessing individual fields through raw pointers, i.e.:
struct perf_record_switch {
struct perf_event_header header;
struct sample_id sample_id;
};
Then cast the whole event: reinterpret_cast<struct perf_record_switch*>(header).
The layout of the PERF_RECORD_SAMPLE record in particular depends heavily on the configuration (sample_type) and may include multiple dynamically sized arrays, which can prevent using a static struct.
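As a rough illustration of reading a dynamically sized record, here is a minimal sketch that copies a whole record out of the circular data area (handling wrap-around) and only then walks the callchain. It assumes the event was opened with sample_type set to exactly PERF_SAMPLE_CALLCHAIN, so the payload is just a count followed by that many addresses. RecordHeader, kRecordSample, copy_from_ring, read_record and callchain_of are hypothetical names used for illustration; RecordHeader mirrors struct perf_event_header so that the sketch builds without kernel headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical mirror of struct perf_event_header (illustration only).
struct RecordHeader {
    uint32_t type;
    uint16_t misc;
    uint16_t size;  // total record size in bytes, header included
};

constexpr uint32_t kRecordSample = 9;  // value of PERF_RECORD_SAMPLE

// Copy `len` bytes starting at logical position `pos` out of a circular
// data area of `data_size` bytes, splitting the copy if it wraps around.
void copy_from_ring(const uint8_t* data, size_t data_size, uint64_t pos,
                    void* out, size_t len) {
    assert(data_size > 0 && len <= data_size);
    if (len == 0) return;
    size_t start = static_cast<size_t>(pos % data_size);
    size_t first = std::min(len, data_size - start);
    std::memcpy(out, data + start, first);
    std::memcpy(static_cast<uint8_t*>(out) + first, data, len - first);
}

// Read the header first to learn the record size, then copy the full record.
std::vector<uint8_t> read_record(const uint8_t* data, size_t data_size,
                                 uint64_t tail) {
    RecordHeader h;
    copy_from_ring(data, data_size, tail, &h, sizeof h);
    std::vector<uint8_t> rec(h.size);
    copy_from_ring(data, data_size, tail, rec.data(), rec.size());
    return rec;
}

// Assumes sample_type == PERF_SAMPLE_CALLCHAIN only: payload is nr, ips[nr].
// Returns an empty vector for any other record type or a malformed record.
std::vector<uint64_t> callchain_of(const std::vector<uint8_t>& rec) {
    std::vector<uint64_t> ips;
    RecordHeader h;
    if (rec.size() < sizeof h) return ips;
    std::memcpy(&h, rec.data(), sizeof h);
    if (h.type != kRecordSample) return ips;  // no callchain in other records

    size_t off = sizeof h;
    uint64_t nr = 0;
    if (rec.size() - off < sizeof nr) return ips;
    std::memcpy(&nr, rec.data() + off, sizeof nr);
    off += sizeof nr;

    // Reject counts that would run past the end of the record.
    if (nr == 0 || nr > (rec.size() - off) / sizeof(uint64_t)) return ips;
    ips.resize(static_cast<size_t>(nr));
    std::memcpy(ips.data(), rec.data() + off, ips.size() * sizeof(uint64_t));
    return ips;
}
```

In a real reader, data would be the mapping base plus one page, tail would be data_tail, and data_tail would be advanced by the record size once the record has been consumed. With more bits set in sample_type, the fields before the callchain have to be skipped in the order given in the man page.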
Technically, you can collect callchains at context switches by using the sched:sched_switch tracepoint as a sampling event. This results in PERF_RECORD_SAMPLE events. However, you might not always see useful information, but mainly in-kernel scheduling details.
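A minimal sketch of how such a tracepoint event might be configured is shown below. It assumes tracefs is mounted at /sys/kernel/tracing (on some systems the path is /sys/kernel/debug/tracing instead) and that the process has permission to read the id file and open tracepoint events. The helper names read_tracepoint_id, make_sched_switch_attr and open_event are hypothetical; the attribute fields and the raw syscall are the ones documented in the perf_event_open man page.

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Read a tracepoint id from a tracefs "id" file; returns -1 if unreadable.
long read_tracepoint_id(const std::string& path) {
    std::ifstream in(path);
    long id = -1;
    if (!(in >> id)) return -1;
    return id;
}

// Build attributes for a tracepoint event that emits one PERF_RECORD_SAMPLE,
// carrying only a callchain, every time the tracepoint fires.
perf_event_attr make_sched_switch_attr(uint64_t tracepoint_id) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_TRACEPOINT;
    attr.config = tracepoint_id;               // id of sched:sched_switch
    attr.sample_period = 1;                    // sample every occurrence
    attr.sample_type = PERF_SAMPLE_CALLCHAIN;  // payload: nr, ips[nr]
    attr.disabled = 1;                         // enable later via ioctl
    return attr;
}

// Thin wrapper around the raw syscall (glibc provides no wrapper function).
int open_event(perf_event_attr& attr, pid_t pid, int cpu) {
    return static_cast<int>(
        syscall(SYS_perf_event_open, &attr, pid, cpu, -1, 0UL));
}
```

A caller would pass the value read from /sys/kernel/tracing/events/sched/sched_switch/id to make_sched_switch_attr, open the event with open_event, mmap the returned file descriptor, and then read PERF_RECORD_SAMPLE records from the ring buffer as usual.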