Bash script for monitoring ulimit usage across multiple processes
Use the script I posted here: https://gist.github.com/blak3r/d470a4254059acca926a
Then pipe it to a file so you can analyze the sockets used by your app over time.
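In case you'd rather not follow the link, here's a rough sketch of the kind of loop I mean (the process name, log file, and 10-second interval are placeholders of my own, not taken from the gist):

#!/bin/bash
# Sketch: append a timestamped open-handle count for every matching process,
# so you can watch the number grow (or not) over time.
PROCESS_NAME="nodejs"   # placeholder: process name to watch
LOG_FILE="handles.log"  # placeholder: where to append samples
while true; do
  for PID in $(pgrep "$PROCESS_NAME"); do
    COUNT=$(lsof -p "$PID" 2>/dev/null | wc -l)
    echo "$(date '+%Y-%m-%d %H:%M:%S') pid=$PID handles=$COUNT" >> "$LOG_FILE"
  done
  sleep 10   # arbitrary sampling interval
done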
I came across a bunch of posts on Stack Overflow suggesting I use something like graceful-js to solve my problem, or that I increase ulimit. Essentially, those are workarounds; I wanted to know what the root problem was. Here's the process I came up with. If anyone knows a better way, please let me know in the comments.
There is a limit to the number of file handles a process can have open (note that sockets also consume a file handle). Once you hit the limit you cannot open any more, and a cryptic error such as "Error: connect EMFILE" will end up in your log file (hopefully). The default limit (at least on my Ubuntu system) is 1024.
This command lists the open network handles for nodejs processes:
lsof -i -n -P | grep nodejs
COMMAND PID   USER FD    TYPE DEVICE    SIZE/OFF NODE NAME
...
nodejs  12211 root 1012u IPv4 151317015 0t0      TCP  10.101.42.209:40371->54.236.3.170:80 (ESTABLISHED)
nodejs  12211 root 1013u IPv4 151279902 0t0      TCP  10.101.42.209:43656->54.236.3.172:80 (ESTABLISHED)
nodejs  12211 root 1014u IPv4 151317016 0t0      TCP  10.101.42.209:34450->54.236.3.168:80 (ESTABLISHED)
nodejs  12211 root 1015u IPv4 151289728 0t0      TCP  10.101.42.209:52691->54.236.3.173:80 (ESTABLISHED)
nodejs  12211 root 1016u IPv4 151305607 0t0      TCP  10.101.42.209:47707->54.236.3.172:80 (ESTABLISHED)
nodejs  12211 root 1017u IPv4 151289730 0t0      TCP  10.101.42.209:45423->54.236.3.171:80 (ESTABLISHED)
nodejs  12211 root 1018u IPv4 151289731 0t0      TCP  10.101.42.209:36090->54.236.3.170:80 (ESTABLISHED)
nodejs  12211 root 1019u IPv4 151314874 0t0      TCP  10.101.42.209:49176->54.236.3.172:80 (ESTABLISHED)
nodejs  12211 root 1020u IPv4 151289768 0t0      TCP  10.101.42.209:45427->54.236.3.171:80 (ESTABLISHED)
nodejs  12211 root 1021u IPv4 151289769 0t0      TCP  10.101.42.209:36094->54.236.3.170:80 (ESTABLISHED)
nodejs  12211 root 1022u IPv4 151279903 0t0      TCP  10.101.42.209:43836->54.236.3.171:80 (ESTABLISHED)
nodejs  12211 root 1023u IPv4 151281403 0t0      TCP  10.101.42.209:43930->54.236.3.172:80 (ESTABLISHED)
....
Notice the 1023u on the last line: that's the 1024th file handle, which is the default maximum.
Now, look at the last column. It indicates which resource is open. You'll probably see a number of lines with the same resource name; hopefully that tells you where to look in your code for the leak.
If you have multiple node processes running, you can isolate the culprit using the PID in the second column.
In my case above, I noticed a bunch of very similar IP addresses. They were all 54.236.3.###
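A quick way to see which remote hosts account for most of the handles is to group them by address. This one-liner is my own addition, not part of the original diagnosis, and it only looks at ESTABLISHED TCP connections:

# Count established connections per remote host, busiest first
lsof -i -n -P | grep nodejs | grep ESTABLISHED | awk '{print $(NF-1)}' | cut -d'>' -f2 | cut -d: -f1 | sort | uniq -c | sort -rn | head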
So, I started doing IP lookups on all the different third-party services I used… loggly, newrelic, pubnub… until I ultimately determined it was pubnub. It turned out we were creating a new socket each time we published an event instead of reusing one.
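For the lookups themselves, a reverse DNS or whois query against one of the offending addresses is usually enough to identify the owner. The address below is just one taken from the output above, and whois field names vary by registry:

host 54.236.3.170
whois 54.236.3.170 | grep -iE 'orgname|org-name|netname'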
Use this syntax to determine how many open handles a particular process has…
I used this command to check the number of open files after triggering various events in my app.
lsof -i -n -P | grep "8465" | wc -l
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
28
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
31
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
34
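Rather than re-running the command by hand, you can let watch repeat it for you (the 2-second interval is arbitrary, and 8465 is the PID from the run above):

watch -n 2 'lsof -i -n -P | grep "nodejs.*8465" | wc -l'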
To see what your current file-handle limit actually is, run:
ulimit -a
The line you want will look like this:
open files (-n) 1024
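One caveat (my addition): ulimit -a reports the limit for your current shell, and a long-running process may have been started under a different limit. On Linux you can read a specific process's limit straight from /proc; 12211 here is the PID from the lsof output above:

cat /proc/12211/limits | grep "Max open files"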