How to determine what's causing: "Error: connect EMFILE" (Node.js)
I came across a bunch of posts on Stack Overflow suggesting I use something like graceful-fs to solve my problem, or to increase the ulimit. Essentially, those are workarounds. I wanted to know what the root problem was. Here's the process I came up with. If anyone knows a better way, please let me know in the comments.
What This Error Means
There is a limit to the number of file handles a process can have open. Note that sockets also consume file handles. Once you reach the limit you cannot open any more, and a cryptic error message such as "Error: connect EMFILE" will end up in your log file (hopefully). The default limit (at least on my Ubuntu system) is 1024.
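If you want to see the error for yourself, a minimal sketch like the one below (not from my app; example.com is just a placeholder host) will hit the limit by opening outgoing sockets without pooling:

const http = require('http');

// Open far more outgoing sockets than the default 1024-handle limit allows.
// agent: false disables connection pooling, so every request gets its own socket.
for (let i = 0; i < 2000; i++) {
  http.get({ host: 'example.com', port: 80, agent: false }, (res) => res.resume())
    .on('error', (err) => console.error(err.code)); // eventually logs EMFILE once handles run out
}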
How To Isolate
This command will list the open handles for nodejs processes (depending on your setup, the process may show up as node instead of nodejs, so adjust the grep accordingly):
lsof -i -n -P | grep nodejs
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
nodejs 12211 root 1012u IPv4 151317015 0t0 TCP 10.101.42.209:40371->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1013u IPv4 151279902 0t0 TCP 10.101.42.209:43656->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1014u IPv4 151317016 0t0 TCP 10.101.42.209:34450->54.236.3.168:80 (ESTABLISHED)
nodejs 12211 root 1015u IPv4 151289728 0t0 TCP 10.101.42.209:52691->54.236.3.173:80 (ESTABLISHED)
nodejs 12211 root 1016u IPv4 151305607 0t0 TCP 10.101.42.209:47707->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1017u IPv4 151289730 0t0 TCP 10.101.42.209:45423->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1018u IPv4 151289731 0t0 TCP 10.101.42.209:36090->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1019u IPv4 151314874 0t0 TCP 10.101.42.209:49176->54.236.3.172:80 (ESTABLISHED)
nodejs 12211 root 1020u IPv4 151289768 0t0 TCP 10.101.42.209:45427->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1021u IPv4 151289769 0t0 TCP 10.101.42.209:36094->54.236.3.170:80 (ESTABLISHED)
nodejs 12211 root 1022u IPv4 151279903 0t0 TCP 10.101.42.209:43836->54.236.3.171:80 (ESTABLISHED)
nodejs 12211 root 1023u IPv4 151281403 0t0 TCP 10.101.42.209:43930->54.236.3.172:80 (ESTABLISHED)
....
Notice the 1023u on the last line: that's the 1024th file handle, which is the default maximum.
Now, look at the last column. That indicates which resource is open. You'll probably see a number of lines all with the same resource name. Hopefully, that tells you where to look in your code for the leak.
If you have multiple node processes, you can isolate the one you care about by its PID in the second column.
In my case above, I noticed a bunch of very similar IP addresses. They were all 54.236.3.###
So I started doing IP lookups on all the different third-party services I used… loggly, newrelic, pubnub… until I ultimately determined it was pubnub. It turned out we were creating a new socket each time we published an event instead of reusing one.
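I can't paste the actual pubnub code here, but the shape of the bug looked roughly like this hypothetical sketch (publishEvent, api.example.com, and the agent setup are illustrative, not the real code):

const https = require('https');

// Leaky version: a fresh keep-alive agent (and therefore a fresh socket) for every publish.
// Each agent keeps its connection open and is never reused, so handles pile up over time.
function publishEventLeaky(payload) {
  const req = https.request(
    { host: 'api.example.com', path: '/publish', method: 'POST', agent: new https.Agent({ keepAlive: true }) },
    (res) => res.resume()
  );
  req.end(JSON.stringify(payload));
}

// Fixed version: one shared agent, so connections are pooled and reused across publishes.
const sharedAgent = new https.Agent({ keepAlive: true, maxSockets: 10 });
function publishEvent(payload) {
  const req = https.request(
    { host: 'api.example.com', path: '/publish', method: 'POST', agent: sharedAgent },
    (res) => res.resume()
  );
  req.end(JSON.stringify(payload));
}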
Command Reference
Use this syntax to determine how many handles a process has open.
To get a count of open files for a certain PID
I used this command to check how many files were open after triggering various events in my app.
lsof -i -n -P | grep "8465" | wc -l
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
28
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
31
root@ip-10-101-42-209:/var/www# lsof -i -n -P | grep "nodejs.*8465" | wc -l
34
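If you'd rather watch the count from inside the process, here's a small sketch of my own (not something I used in the original debugging session) that counts the entries under /proc/<pid>/fd on Linux; on other platforms, stick with the lsof command above:

const fs = require('fs');

// Log how many file descriptors this process has open, every 2 seconds.
// /proc/<pid>/fd is Linux-specific; each entry is one open handle (file, socket, pipe, ...).
setInterval(() => {
  fs.readdir(`/proc/${process.pid}/fd`, (err, fds) => {
    if (err) return console.error('could not read fd dir:', err.code);
    console.log(`open file descriptors: ${fds.length}`);
  });
}, 2000);

Run it alongside your app's normal traffic and watch whether the number keeps climbing instead of leveling off.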
What is your process limit?
ulimit -a
The line you want will look like this:
open files (-n) 1024
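You can also read the same soft limit from inside Node if that's more convenient; a quick sketch (ulimit is a shell builtin, so execSync runs it through a shell):

const { execSync } = require('child_process');

// 'ulimit -n' prints only the max-open-files soft limit; the child shell inherits
// this process's limits, so the value applies to the Node process as well.
const openFileLimit = parseInt(execSync('ulimit -n').toString().trim(), 10);
console.log(`open file limit: ${openFileLimit}`); // e.g. 1024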