crash attempting to parse perf sched recording #8

Open
jasonk000 opened this issue Mar 24, 2023 · 1 comment · May be fixed by #9

perfdump crashes when reading a perf sched recording, apparently because of a header parsing problem.

Specifically, during parseSample it appears to call skip() with a length far larger than the remaining buffer.

To reproduce:

$ sudo perf sched record -- sleep 3
$ sudo chmod 644 perf.data
$ ./perfdump -i ./perf.data 
events:
  0xc0000ca000={Event:317 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca080={Event:305 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca100={Event:311 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca180={Event:318 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca200={Event:316 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca280={Event:320 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca300={Event:309 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca380={Event:308 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca400={Event:307 SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Period|Raw|TID|Time ReadFormat:ID Flags:Disabled|ExcludeGuest|Inherit|SampleIDAll Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
  0xc0000ca480={Event:EventSoftwareDummy SamplePeriod:1 SampleFreq:0 SampleFormat:CPU|IP|Identifier|Raw|TID|Time ReadFormat:ID Flags:AuxOutput|Comm|CommExec|Inherit|Ksymbol|Mmap|MmapInodeData|SampleIDAll|Task Precise:EventPrecisionArbitrarySkid WakeupEvents:0 WakeupWatermark:0 BranchSampleType:0 SampleRegsUser:0 SampleStackUser:0 SampleRegsIntr:0 AuxWatermark:0 SampleMaxStack:0}
build IDs:
  {CPUModeKernel -1 77ab8542b6540fae3367bbb18a7aeede3c3b6f7b [kernel.kallsyms]}
  {CPUModeUser -1 f33b8f26bed889157b75129711599f9ca5ec9d0e [vdso]}
hostname: jkoch
OS release: 5.19.0-35-generic
version: 
arch: x86_64
CPUs online: 32
CPUs available: 32
CPU desc: AMD Ryzen 9 5950X 16-Core Processor
CPUID: AuthenticAMD,25,33,0
total memory: 134984663040
cmdline: [/usr/lib/linux-hwe-5.19-tools-5.19.0-35/perf sched record -- sleep 3]
core groups: [0-31]
thread groups: [0,16 1,17 2,18 3,19 4,20 5,21 6,22 7,23 8,24 9,25 10,26 11,27 12,28 13,29 14,30 15,31]
NUMA nodes: [{0 134984663040 118821437440 0-31}]
PMU mappings: map[1:software 2:tracepoint 4:cpu 5:breakpoint 6:kprobe 7:uprobe 8:ibs_fetch 9:ibs_op 10:amd_iommu_0 11:msr 12:power]
groups: []

panic: runtime error: slice bounds out of range [38803:60]

goroutine 1 [running]:
github.com/aclements/go-perf/perffile.(*bufDecoder).skip(...)
	/home/jkoch/code/go-perf/perffile/bufdecoder.go:15
github.com/aclements/go-perf/perffile.(*Records).parseSample(0xc0000ca500, 0xc0000939b0, 0xc000340168, 0x68?)
	/home/jkoch/code/go-perf/perffile/records.go:538 +0x164b
github.com/aclements/go-perf/perffile.(*Records).Next(0xc0000ca500)
	/home/jkoch/code/go-perf/perffile/records.go:148 +0x525
github.com/aclements/go-perf/perffile.(*File).Records(0xc0000c8000, 0xc00000e018?)
	/home/jkoch/code/go-perf/perffile/reader.go:350 +0x2f5
main.main()
	/home/jkoch/code/go-perf/cmd/perfdump/main.go:77 +0x84b

jasonk000 commented Mar 24, 2023

I haven't had time to chase this down completely, but it looks like the record has the Period set while parseSample sees an attribute/type without the period. As a result, parseSample does not read a period value even though one is present, and when it reads the _RAW size it actually picks up the period (38803 in the example), which is far larger than the actual RAW sample data. perf sched script does show the period being read, so the problem is probably in attribute parsing/mapping.

https://github.com/aclements/go-perf/blob/master/perffile/records.go#L519-L538
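
To illustrate the suspected misalignment, here is a minimal standalone sketch. It does not use go-perf's code: buildSample, rawSize, and fixedPrefix are hypothetical names, and the layout is a simplified assumption (five 8-byte fields, then the period, then a 4-byte raw size and the raw payload, all little-endian). It only shows how a decoder that does not expect a period would read the period's low 32 bits as the raw size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fixedPrefix is the size of the fields that precede the period in this
// simplified layout: identifier, IP, PID+TID, time, CPU+reserved.
const fixedPrefix = 5 * 8

// buildSample encodes a simplified little-endian sample body that always
// contains a period, followed by a 4-byte raw size and the raw payload.
func buildSample(period uint64, raw []byte) []byte {
	le := binary.LittleEndian
	// The prefix fields are left zeroed; only their size matters here.
	buf := make([]byte, fixedPrefix, fixedPrefix+12+len(raw))
	buf = le.AppendUint64(buf, period)
	buf = le.AppendUint32(buf, uint32(len(raw)))
	return append(buf, raw...)
}

// rawSize returns the raw size a decoder would read and how many bytes
// remain after it, depending on whether the decoder expects a period field.
func rawSize(buf []byte, expectPeriod bool) (size uint32, remaining int) {
	off := fixedPrefix
	if expectPeriod {
		off += 8 // step over the period
	}
	size = binary.LittleEndian.Uint32(buf[off:])
	return size, len(buf) - (off + 4)
}

func main() {
	buf := buildSample(38803, make([]byte, 16))
	for _, expect := range []bool{true, false} {
		size, remaining := rawSize(buf, expect)
		fmt.Printf("expectPeriod=%v rawSize=%d remaining=%d overrun=%v\n",
			expect, size, remaining, int(size) > remaining)
	}
}
```

When the decoder does not expect a period, the size it reads is the period value and exceeds the bytes remaining, which matches the shape of the slice-bounds panic above. Comparing the size against the remaining length before skipping would turn that panic into a reportable error, though it would not fix the underlying attribute mismatch.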
