Wednesday, February 18, 2009

Generating Web Graphs with RMAT and Eigen template c++ library

R-MAT: is a recursive model for graph mining. It has some useful properties:
  1. On-line property. The number of nodes and edges can change with times;
  2. Power law degree distribution property;
  3. Small world property: diameter is much smaller than the order of the graph;
  4. Many dense bipartite subgraphs. The number of distinct bipartite cliques is large when compared to a random graph.
RMAT recursively subdivides the adjacency matrix in four equal quadrants and selects one of the four quadrants with unequal probability (a, b, c or d). When the selected portion is a 1x1 cell, the corresponding direct edge is set.

Here you have the RMAT C++ code implemented with Eigen template library. As an example, I report the timings taken on my VMware virtual machine (guest: Linux Ubuntu with 1GB memory, host: Windows Vista Professional, core duo, 4Gb)
  • 2^16 nodes, 8*2^16 edges; real 0m6.453s
  • 2^18 nodes, 8*2^18 edges; real 0m53.253s
  • 2^20 nodes, 8*2^20 edges; real 2m17.383s
  • 2^21 nodes, 8*2^21 edges; real 7m26.611s
I believe that these performance can be further improved by adopting fixed-size matrices, with size defined at compile time. Well, running on a dedicated server and not a virtual machine may help as well. But this is not bad, the library is pretty good for rapid coding!

2 comments:

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. I could not find the code in the zip file. Can you re-upload this?

    ReplyDelete