Spatio-temporal Cluster Detection For Pocket Switched Networks

Introduction


  • Many data-sets exists which contain human encounters and movement patterns.
  • Lots of research on data delivery possibilities in these data-sets in order to create Pocket Switched Networks.
  • Clustering of similar devices or people within data-sets has been shown to improve data delivery efficiency.

What is a cluster?

A group of vertices that have more edges among each other than with other vertices of the graph [Newman 2004].

How are clusters used?

Used in efficient data dissemination where exact destination location is unknown:
  1. Data is transferred between and within clusters...
  2. Which can limit data duplication.

How are clusters formed?

Clusters are made up of transient, pair-wise connections.
Clusters can be detected:
  • Centrally on data-sets.
  • Or distributively by the devices themselves over time.

How are clusters formed?

In either case cluster size usually increases monotonically.

Example clustering from the Simple [Hui 2007] cluster detection method.
Lambda governs cluster growth.

What are the issues?

Monotonically increasing clusters can lead to obsolete cluster membership which causes efficiency loss.
Example clustering from the Simple [Hui 2007] cluster detection method.
Lambda governs cluster growth.
Why?

Contact Graphs

Activity within the data-sets is "bursty".

Contact Graphs Data Delivery


Shortest Path ≠ Fastest Path
Could clusters better reflect current network conditions instead of increasing in size monotonically over time?
Introducing the Expectation-based Spatio-temporal Clustering approach.

Expectation-based detection approach

Combines identifying time frames where connection times are high, with strongly connected components.

  • A directed graph is strongly connected if there is a path from each vertex in the graph to every other vertex.
  • Idea is to increase data dissemination efficiency by identifying clusters comprised of these strongly connected components.
Image credit: http://en.wikipedia.org/wiki/Strongly_connected_component

Resulting Strongly Connected Components

Examble using the Haggle Cambridge data-set.

Spatial
vs.
Spatio-temporal
Data dissemination results.

Results

TTL of 1 hour (Cambridge) and 1 day (Reality).
Data delivery

  • Blue Line - Expectation-based approach (SEBS)
  • Red Line - BUBBLE (K-Clique)
  • Green Line - BUBBLE (Simple)

Results

TTL of 1 hour (Cambridge) and 1 day (Reality).
Efficiency

Results

Combing all end results (Including Haggle Infocom5 and Infocom6 data-sets).



MethodData DeliveryEfficiency
(Delivered/Relayed)
BUBBLE K-Clique0.6480.072
BUBBLE Simple0.6480.069
SEBS0.6370.188

1% fewer packets delivered for a massive efficiency saving.

Conclusions

  1. Generally more packets are delivered as experiments progress than when using aggregate clusters with BUBBLE.
  2. The strongly connected components (SCCs) are too exclusive for high data delivery rates. Routing based on this excludes many devices.
  3. Long time frames are needed to create large SCCs. Reality 6 hours and Infocom6 12 hours.
  4. If frames are large, metric differences between frames is usually small and there is loss of temporal relevance.
  5. Automatic frame size detection (or getting rid of frames completely) would be really useful.

Use a spacebar or arrow keys to navigate