Using GREP + SORT + UNIQ to find occurrences of a repeated event by an id field
Tuesday, November 3, 2015 at 08:03PM
Use Case: You have a log file full of entries that start with a timestamp. A process kept crashing, and each time it restarted it reprinted the same occurrence. We want just one match per event, for the first time it occurred. Luckily I had an identifier in my log file to key off of.
My Log File
grep "Description Mismatch" logfile.log
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.073] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm

The first 3 and the last are the same occurrence but at different times. First we sort using the ```-k``` option. ```8,8``` limits the sort key to the 8th whitespace-separated field, which matches "562b955c8c01d13309889115".

```
grep "Description Mismatch" logfile.log | sort -k 8,8
```
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.073] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:15:01.815] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
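If you want to double check the field number and the resulting order before trusting the pipeline, something like the sketch below may help. The two sample lines are shortened copies of the entries above (the description text is trimmed for readability), and the variable names are just placeholders.

```shell
# Two entries for the same id, deliberately listed newest first.
sample='[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: vm
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: vm'

# awk splits on runs of whitespace by default, so $8 should be the id.
id_field=$(printf '%s\n' "$sample" | awk 'NR == 1 { print $8 }')

# When the keys are equal, sort falls back to comparing the whole line,
# so the entry with the earlier timestamp ends up first.
first_line=$(printf '%s\n' "$sample" | sort -k 8,8 | head -n 1)

echo "$id_field"     # prints 562b955c8c01d13309889115
echo "$first_line"   # prints the 16:30:01.655 entry
```

That tie-breaking on the whole line is what makes the earliest entry for each id come first, which matters because uniq keeps the first line of each run.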
Next, let's use uniq with -f to get the unique lines. Note that -f N skips the first N fields rather than naming the field to start at, so to start the comparison at the id (field 8) we skip 7. With -f 8 the id itself is skipped and only the text after it is compared; in the output shown here that happens to give the same result, because each id has a different description. Now we have:
grep "Description Mismatch" logfile.log | sort -k 8,8 | uniq -f 7
[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch 562b955c8c01d13309889115 CRM Description: -- Auto Created by Callinize Callinize Description: vm
[2015-10-24 17:00:02.146] [WARN] scheduler - Description Mismatch 562b997fb0e2bbb208f1f7dd CRM Description: -- Auto Created by Callinize Callinize Description: gq
[2015-10-24 18:00:04.052] [WARN] scheduler - Description Mismatch 562ba7a41cce2fd10843296f CRM Description: -- Auto Created by Callinize Callinize Description: sold
Final Command
Tack on a wc -l to get a count output.
grep "Description Mismatch" logfile.log | sort -k 8,8 | uniq -f 7 | wc -l
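Since uniq's -f N skips the first N fields, whether the id takes part in the comparison depends on the number you pick. Here is a small sketch of the difference between skipping 7 and 8 fields. The two lines are made up for illustration (the ids are placeholders, not from the real log), chosen so that they differ only in the id and timestamp.

```shell
# Two made-up log lines: different ids (field 8), identical text after the id.
sample='[2015-10-24 16:30:01.655] [WARN] scheduler - Description Mismatch aaaaaaaaaaaaaaaaaaaaaaaa CRM Description: vm
[2015-10-24 16:45:01.672] [WARN] scheduler - Description Mismatch bbbbbbbbbbbbbbbbbbbbbbbb CRM Description: vm'

# -f 8 skips the id as well, so only " CRM Description: vm" is compared
# and the two lines collapse into one.
skip8=$(printf '%s\n' "$sample" | sort -k 8,8 | uniq -f 8 | wc -l | tr -d ' ')

# -f 7 skips everything before the id, so the id is part of the comparison
# and both lines survive.
skip7=$(printf '%s\n' "$sample" | sort -k 8,8 | uniq -f 7 | wc -l | tr -d ' ')

echo "uniq -f 8: $skip8 line(s), uniq -f 7: $skip7 line(s)"
# prints: uniq -f 8: 1 line(s), uniq -f 7: 2 line(s)
```

The `tr -d ' '` is only there because some wc implementations pad the count with spaces.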