Friday, March 19, 2010

Parsing pcap files with Perl

Recently I was reading the blog post on the BreakingPoint Labs blog about parsing pcap files with Perl and I immediately said to myself: it is impossible that there isn’t a module on CPAN, because Perl is great. Turns out I was right: there is Net::TcpDumpLog, which can be combined with the NetPacket family of modules to parse the higher-level protocols. Because example code is rather sparse on the POD pages of the respective modules, here is a small example to illustrate their use:

use strict;
use warnings;
use Net::TcpDumpLog;
use NetPacket::Ethernet;
use NetPacket::IP;
use NetPacket::TCP;

# Load the capture file; change the path to suit
my $log = Net::TcpDumpLog->new();
$log->read("foo.pcap");

foreach my $index ($log->indexes) {
  # Per-packet header. The module's docs call the last field $msecs, but the
  # classic pcap format stores the sub-second part in microseconds.
  my ($length_orig, $length_incl, $drops, $secs, $msecs) = $log->header($index);
  my $data = $log->data($index);

  # Skip frames that do not carry IPv4
  my $eth_obj = NetPacket::Ethernet->decode($data);
  next unless $eth_obj->{type} == NetPacket::Ethernet::ETH_TYPE_IP;

  # Skip IP packets that do not carry TCP
  my $ip_obj = NetPacket::IP->decode($eth_obj->{data});
  next unless $ip_obj->{proto} == NetPacket::IP::IP_PROTO_TCP;

  my $tcp_obj = NetPacket::TCP->decode($ip_obj->{data});

  # localtime() only needs the whole seconds; $mon is zero-based, hence the + 1
  my ($sec, $min, $hour, $mday, $mon) = localtime($secs);
  print sprintf("%02d-%02d %02d:%02d:%02d.%06d",
    $mon + 1, $mday, $hour, $min, $sec, $msecs),
    " ", $eth_obj->{src_mac}, " -> ",
    $eth_obj->{dest_mac}, "\n";
  print "\t", $ip_obj->{src_ip}, ":", $tcp_obj->{src_port},
    " -> ",
    $ip_obj->{dest_ip}, ":", $tcp_obj->{dest_port}, "\n";
}

The code does the following: it opens the pcap file named “foo.pcap”, iterates over all the packets (assuming that they are all Ethernet frames) and skips everything that is not TCP over IP. For each remaining packet it prints some information: the capture time, the source/destination MAC addresses and the source/destination ip:port pairs. You can customize it to fit your needs.
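If you prefer not to juggle the fields returned by localtime by hand, the capture time can also be formatted with strftime from the core POSIX module. This is only a sketch: the helper name format_capture_time and its $utc flag are made up for illustration, and it assumes the sub-second field is in microseconds, as in classic pcap files.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format a pcap capture time as "MM-DD HH:MM:SS.uuuuuu".
#   $secs  - whole seconds since the epoch (4th value returned by header())
#   $usecs - sub-second part (5th value); assumed to be microseconds
#   $utc   - hypothetical flag: true formats in UTC instead of local time
sub format_capture_time {
    my ($secs, $usecs, $utc) = @_;
    my @t = $utc ? gmtime($secs) : localtime($secs);
    # strftime's %m already accounts for the zero-based month
    return strftime("%m-%d %H:%M:%S", @t) . sprintf(".%06d", $usecs);
}

print format_capture_time(0, 42, 1), "\n";   # prints "01-01 00:00:00.000042"
```

Inside the loop you would call it as format_capture_time($secs, $msecs) to get local time.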

Small, somewhat off-topic rant: one should always think at least twice before publishing code which does such elementary things. Find a library and use it. If it doesn’t work, try patching it so that it works and send the patch back to the original author. Only if this fails should you start from scratch.

Reusing existing code has many advantages. From your point of view, you get code which has already worked for at least a couple of people. This is especially true for Perl modules, which have a strong culture of testing. Also, even “simple” problems like parsing a TCP packet have many corner cases which you will almost certainly miss on the first go. As a result, half of your time will be spent hunting them down and only half will be dedicated to solving the actual problem. And that is if you are lucky; if you are unlucky, your code will silently skip over the special cases, which may make your entire analysis irrelevant.

Looking at it from the other side: the more people converge on a single way to do “X”, the more that code gets used, and the more it is used, the better tested it becomes, creating a positive feedback loop. Also, if you believe in the open-source ethos (and presumably you do, since you published your code in the first place), you should consider maximizing the return while minimizing the effort needed.

Picture taken from greyloch's photostream with permission.

Update: updated NetPacket link - thank you Anonymous.


  1. Anonymous 1:00 AM

    Have you tried It's a Ruby API for reading, searching, slicing pcaps.

  2. @Anonymous: thanks, I was peripherally aware of Xtractr, mainly because I've read about it on the Mu Dynamics Labs blog. However, I'm more of a Perl guy than a Ruby guy, but I have to admit that the API examples look very nice and expressive.

  3. Anonymous 7:14 AM

    Hmm, a Perl binding for the REST API that xtractr provides sounds like an interesting project...

  4. Anonymous 6:04 AM

    Thanks, this saved me some slogging and experimenting :)

    Do you mean to use localtime($secs) instead of localtime(time)?

  5. @Anonymous: thanks, good catch. Sorry for the late reply. I've updated the post. The actual formula is (as far as I can tell) "$secs + $msecs/1000".

  6. Anonymous 1:45 AM

    Thanks for this snippet. I am not able to run this module. I am getting a "not found" error. Can you please let me know why this error occurs?

  7. @Anonymous: this is not a module, rather a perl script (snippet). You should run it directly (ie. perl

  8. Anonymous 10:07 AM

    Hello guys, I get an error that it cannot read the "test.pcap" file: no such file or directory. What could be the issue? Could you please tell me where the pcap file should be placed to run this program?

  9. @Anonymous: the test.pcap file should be in the same directory where you start your script from. You can change where the file is read from by using a different path on the '$log->read("foo.pcap");' line (for example: $log->read("/home/cdman/test/bar.pcap");)

  10. Anonymous 11:01 AM

    only FYI: NetPack link has changed:

  11. @Anonymous - thank you, I updated the link in the article.

  12. Nice example - thanx.

    I had to add "use Exporter;" to the top in order for it to work. Otherwise you get this error:

    $ perl -wc
    Can't locate package Exporter for @NetPacket::Ethernet::ISA at line 5.
    Can't locate package Exporter for @NetPacket::ISA at /usr/lib/perl5/site_perl/5.10/NetPacket/ line 17.
    Can't locate package Exporter for @NetPacket::IP::ISA at line 6.
    Can't locate package Exporter for @NetPacket::TCP::ISA at line 7. syntax OK

    Also, I changed the "foo.pcap" to be $ARGV[0] so I could just pass it an argument.

    my $log = Net::TcpDumpLog->new();

    Now you can do " ", put in your path and use it anywhere.

    I was wondering if anyone knew of any scripts that already existed that used these modules and exported useful information about the pcap, such as

    - total number of flows
    - total number of unique IP's seen (src vs dst)
    - total number of unique ports seen (src vs dst)
    - breakdown of traffic types (percentages based on dst port)
    - top n flows, IP's, dst ports, packet sizes, etc

    Basically I'm looking to get via text / perl the same stuff I might get from Wireshark's analysis. I just want it in text instead of graphical format or to have to launch wireshark.
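As a starting point for the kind of text summary asked about in the last comment, here is a minimal sketch built on the same modules. The helper names (summarize, top_n, read_tcp_records) are made up for illustration, and a "flow" is assumed to mean a directional source ip:port -> destination ip:port tuple; it only covers TCP over IPv4 on Ethernet, like the script in the post. The capture path is taken from the first command-line argument.

```perl
use strict;
use warnings;

# Tally simple statistics over decoded TCP packets.
# Each record is a hashref with src_ip, dest_ip, src_port and dest_port
# (the same field names the NetPacket objects in the post expose).
sub summarize {
    my ($records) = @_;
    my (%flows, %src_ips, %dst_ips, %dst_ports);
    for my $r (@$records) {
        # Assumption: a flow is a directional 4-tuple
        $flows{"$r->{src_ip}:$r->{src_port} -> $r->{dest_ip}:$r->{dest_port}"}++;
        $src_ips{ $r->{src_ip} }++;
        $dst_ips{ $r->{dest_ip} }++;
        $dst_ports{ $r->{dest_port} }++;
    }
    return {
        packets   => scalar @$records,
        flows     => \%flows,
        src_ips   => \%src_ips,
        dst_ips   => \%dst_ips,
        dst_ports => \%dst_ports,
    };
}

# Return up to $n keys of a count hash, most frequent first (ties by key).
sub top_n {
    my ($counts, $n) = @_;
    my @keys = sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b } keys %$counts;
    $n = @keys if $n > @keys;
    return @keys[0 .. $n - 1];
}

# Read a capture and collect one record per TCP-over-IPv4 packet.
# The modules are loaded at run time so the helpers above work without them.
sub read_tcp_records {
    my ($path) = @_;
    require Net::TcpDumpLog;
    require NetPacket::Ethernet;
    require NetPacket::IP;
    require NetPacket::TCP;

    my $log = Net::TcpDumpLog->new();
    $log->read($path);
    my @records;
    for my $index ($log->indexes) {
        my $eth = NetPacket::Ethernet->decode($log->data($index));
        next unless $eth->{type} == NetPacket::Ethernet::ETH_TYPE_IP();
        my $ip = NetPacket::IP->decode($eth->{data});
        next unless $ip->{proto} == NetPacket::IP::IP_PROTO_TCP();
        my $tcp = NetPacket::TCP->decode($ip->{data});
        push @records, {
            src_ip   => $ip->{src_ip},    dest_ip   => $ip->{dest_ip},
            src_port => $tcp->{src_port}, dest_port => $tcp->{dest_port},
        };
    }
    return \@records;
}

if (@ARGV) {
    my $s = summarize(read_tcp_records($ARGV[0]));
    printf "TCP packets:     %d\n", $s->{packets};
    printf "flows:           %d\n", scalar keys %{ $s->{flows} };
    printf "unique src IPs:  %d\n", scalar keys %{ $s->{src_ips} };
    printf "unique dst IPs:  %d\n", scalar keys %{ $s->{dst_ips} };
    print  "top destination ports:\n";
    for my $port (top_n($s->{dst_ports}, 5)) {
        printf "  %-5d %5.1f%%\n", $port,
            100 * $s->{dst_ports}{$port} / $s->{packets};
    }
}
```

Keeping the counting separate from the file reading means the tallies can be tried on hand-written records first, and extended (packet sizes, source ports) without touching the decoding loop.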