Sudden PM2 core dump. Asking for assistance in finding the root cause, if possible. #5787

Open
JuriStefanovski opened this issue Mar 18, 2024 · 4 comments

@JuriStefanovski

JuriStefanovski commented Mar 18, 2024

What's going wrong?

host syslog extract:
Mar 15 03:30:10 Center1 systemd[1]: Started Process Core Dump (PID 3079673/UID 0).
Mar 15 03:30:12 Center1 systemd-coredump[3079674]: Core file was truncated to 2147483648 bytes.
Mar 15 03:30:17 Center1 systemd-coredump[3079674]: Process 391241 (PM2 v5.3.0: God) of user 1000 dumped core.
Stack trace of thread 391241:
#0 0x00007f204686a00b n/a (n/a + 0x0)
Mar 15 03:30:17 Center1 systemd[1]: [email protected]: Succeeded.
Mar 15 03:30:17 Center1 systemd[1]: pm2-user.service: New main PID 391249 does not belong to service, and PID file is not owned by root. Refusing.
Mar 15 03:30:17 Center1 systemd[1]: pm2-user.service: Main process exited, code=dumped, status=6/ABRT
Mar 15 03:30:17 Center1 systemd[1]: pm2-user.service: Failed with result 'core-dump'.

How could we reproduce this issue?

I am not sure this can be reproduced easily. This PM2 instance had worked well for a long time before the crash, but perhaps the information in the core dump would be of interest to the developers.
The core dump file is available; its size is around 299 MB.
I am ready to upload or send it wherever you say, if necessary.
Please close this issue if it is not relevant to you. I apologize for any inconvenience.

Supporting information

Linux Mint 20.3 Una
Linux Center1 5.4.0-113-generic #127-Ubuntu SMP Wed May 18 14:30:56 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

PM2 report
Date : Mon Mar 18 2024 14:16:49 GMT+0200 (Eastern European Standard Time)

--- Daemon -------------------------------------------------
pm2d version : 5.3.0
node version : 10.19.0
node path : not found
argv : /usr/bin/node,/usr/local/lib/node_modules/pm2/lib/Daemon.js
argv0 : node
user : user
uid : 1000
gid : 1000
uptime : 4747min

--- CLI ----------------------------------------------------
local pm2 : 5.3.0
node version : 10.19.0
node path : /usr/local/bin/pm2
argv : /usr/bin/node,/usr/local/bin/pm2,report
argv0 : node
user : user
uid : 1000
gid : 1000

--- System info --------------------------------------------
arch : x64
platform : linux
type : Linux
cpus : Intel(R) Xeon(R) E-2246G CPU @ 3.60GHz
cpus nb : 12
freemem : 20717719552
totalmem : 33243213824
home : /home/user

--- PM2 list -----------------------------------------------
┌────┬────────────────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name │ namespace │ version │ mode │ pid │ uptime │ ↺ │ status │ cpu │ mem │ user │ watching │
├────┼────────────────────┼─────────────┼─────────┼─────────┼──────────┼────────┼──────┼───────────┼──────────┼──────────┼──────────┼──────────┤
│ 2 │ db │ default │ N/A │ fork │ 10845 │ 3D │ 0 │ online │ 0.7% │ 68.8mb │ user │ disabled │
│ 3 │ line │ default │ N/A │ fork │ 10901 │ 3D │ 0 │ online │ 38.1% │ 88.2mb │ user │ disabled │
│ 4 │ linecleanup │ default │ N/A │ fork │ N/A │ 0 │ 0 │ stopped │ 0% │ 0b │ user │ disabled │
│ 1 │ registry_server │ default │ N/A │ fork │ 10821 │ 3D │ 5 │ online │ 1.1% │ 20.2mb │ user │ disabled │
└────┴────────────────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘
Module
┌────┬──────────────────────────────┬───────────────┬──────────┬──────────┬──────┬──────────┬──────────┬──────────┐
│ id │ module │ version │ pid │ status │ ↺ │ cpu │ mem │ user │
├────┼──────────────────────────────┼───────────────┼──────────┼──────────┼──────┼──────────┼──────────┼──────────┤
│ 0 │ pm2-logrotate │ 2.7.0 │ 10694 │ online │ 0 │ 0.3% │ 100.0mb │ user │
└────┴──────────────────────────────┴───────────────┴──────────┴──────────┴──────┴──────────┴──────────┴──────────┘

--- Daemon logs --------------------------------------------
/home/user/.pm2/pm2.log last 20 lines:
PM2 | 2024-03-15T07:09:07: PM2 log: Application log path : /home/user/.pm2/logs
PM2 | 2024-03-15T07:09:07: PM2 log: Worker Interval : 30000
PM2 | 2024-03-15T07:09:07: PM2 log: Process dump file : /home/user/.pm2/dump.pm2
PM2 | 2024-03-15T07:09:07: PM2 log: Concurrent actions : 2
PM2 | 2024-03-15T07:09:07: PM2 log: SIGTERM timeout : 1600
PM2 | 2024-03-15T07:09:07: PM2 log:

PM2 | 2024-03-15T07:09:08: PM2 log: App [pm2-logrotate:0] starting in -fork mode-
PM2 | 2024-03-15T07:09:08: PM2 log: App [pm2-logrotate:0] online
PM2 | 2024-03-15T07:09:12: PM2 log: App [registry_server:1] starting in -fork mode-
PM2 | 2024-03-15T07:09:12: PM2 log: App [registry_server:1] online
PM2 | 2024-03-15T07:09:13: PM2 log: App [db:2] starting in -fork mode-
PM2 | 2024-03-15T07:09:13: PM2 log: App [db:2] online
PM2 | 2024-03-15T07:09:14: PM2 log: App [line:3] starting in -fork mode-
PM2 | 2024-03-15T07:09:14: PM2 log: App [line:3] online
PM2 | 2024-03-16T07:09:08: PM2 log: [PM2] This PM2 is not UP TO DATE
PM2 | 2024-03-16T07:09:08: PM2 log: [PM2] Upgrade to version 5.3.1
PM2 | 2024-03-17T07:09:08: PM2 log: [PM2] This PM2 is not UP TO DATE
PM2 | 2024-03-17T07:09:08: PM2 log: [PM2] Upgrade to version 5.3.1
PM2 | 2024-03-18T07:09:08: PM2 log: [PM2] This PM2 is not UP TO DATE
PM2 | 2024-03-18T07:09:08: PM2 log: [PM2] Upgrade to version 5.3.1

@ultimate-tester
Contributor

The first thing that comes to mind is that your system ran out of memory. Could that have been the case?

@JuriStefanovski
Author

JuriStefanovski commented Mar 21, 2024

I looked at the statistics from my monitoring system: at the time of the crash, about 40% of the 32 GB of RAM was still free.

Update: having studied the statistics more carefully, I noticed that about two days before the failure, memory consumption began to increase slowly but steadily, from the usual ~48% to 58%.
It looks like a memory leak, but I cannot identify the culprit process; my monitoring, unfortunately, is very basic.
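
When system-wide monitoring cannot name the process behind slow memory growth, one option is to compare per-process resident memory between two points in time. The following is only a minimal sketch, assuming a Linux host with `/proc` mounted. The function names (`parse_vmrss_kb`, `snapshot_rss`, `top_growth`, `report_growth`) are hypothetical helpers written for this illustration and are not part of PM2. Growth in resident memory is a hint, not proof, that a given process is the one leaking.

```python
import os
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like: "VmRSS:\t   70452 kB"
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def snapshot_rss(proc_root="/proc"):
    """Map pid -> (process name, RSS in kB) for every readable process."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # the process exited or is not readable
        rss = parse_vmrss_kb(text)
        if rss is None:
            continue
        # The first line of the status file is "Name:\t<command name>"
        name = text.splitlines()[0].split(":", 1)[1].strip()
        result[int(entry)] = (name, rss)
    return result


def top_growth(before, after, limit=5):
    """Return (pid, name, delta_kb) for the processes whose RSS grew most."""
    deltas = []
    for pid, (name, rss_after) in after.items():
        if pid in before:  # ignore processes started after the first snapshot
            deltas.append((pid, name, rss_after - before[pid][1]))
    deltas.sort(key=lambda item: item[2], reverse=True)
    return deltas[:limit]


def report_growth(interval_s=3600):
    """Take two snapshots interval_s seconds apart and print the top growers."""
    first = snapshot_rss()
    time.sleep(interval_s)
    second = snapshot_rss()
    for pid, name, delta in top_growth(first, second):
        print(pid, name, delta, "kB")
```

Calling `report_growth()` from a cron job or a long-running shell and keeping the output over a day or two should show whether one PID, for example the PM2 daemon or one of the managed apps, accounts for the increase.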

@arasmussen

I am running into the same or a similar issue. I also posted an issue here: #5797

@Potherca

Potherca commented Dec 3, 2024

Could this be related to #5694 / #5721 ?
