Satellite Workshop Report

Graph Data Formats

Ulrik Brandes, M. Scott Marshall, and Stephen C. North

Abstract: Prompted by the increasing demand for a standard exchange format for graph data, an informal workshop was held in conjunction with Graph Drawing 2000. The participants identified requirements for such a standard and formed a group to work out a proposal. The current status of this effort is publicly available at http://www.graphdrawing.org/data/format/.

This report appeared in the Proceedings of GD 2000, Lecture Notes in Computer Science 1984, pages 407-409. © Springer-Verlag, 2001.


Introduction

Graph drawing tools need to store and exchange graph data. Despite several earlier attempts to define a standard, no agreed-upon format is widely accepted, and indeed, many packages support only their own custom format.

Motivated by the goals of tool interoperability, access to benchmark data sets, and, most importantly, data exchange over the Web, the Steering Committee of the Graph Drawing Symposium started a new initiative. In a concerted effort, a standard format will be defined and proposed to developers of graph drawing and related software.

As a first step towards this end, an informal workshop was announced on the GD 2000 Web page and held the day before the start of the symposium. The workshop featured two invited presentations and plenty of discussion.

Date and location:  20 September 2000, 14:00--17:30, Colonial Williamsburg
Local organization:  Joe Marks and Kathy Ryall
Minutes:  Helen Purchase
Chair:  Ulrik Brandes
 
14:00-14:20  Stephen North: Introduction to XML
14:20-14:40  Scott Marshall: Survey of Graph Data Formats
15:00-17:30  Discussion blocks on
  - general requirements
  - specific goals
  - future activities

The widespread interest in a standard was underscored by the unexpectedly large number of participants. With several others noting that they were, unfortunately, unable to attend, the following persons were present at the meeting:

Summary of Discussion

The discussion was held in an open and constructive atmosphere. None of the topics required vote-taking, since each was discussed until either there was no noticeable dissent or a topic was identified as requiring more detailed investigation.

In a first round of discussion, the scope of the project was delineated by gathering potentially relevant features of a common exchange format. Among such requirements were the support of hypergraphs, clustering, hierachical graphs, graph properties and corresponding certificates, dynamic graphs, 3D graphics, animation, and even layout algorithms as part of the description of a graph drawing. In addition, issues such as scalability and human readability were addressed briefly.

Given this vast array of options, the unanimous decision was to focus on a simple format that should be sufficient for the most common cases, yet extensible to incorporate more advanced features. However, it became obvious that even for a simple format, several difficult decisions need to be made. At this point, Susan Sim was asked to report on the current state of these decisions in a similiar project called GXL (Graph Exchange Language), aimed at defining an exchange format for graphs arising in software reengineering and graph transformation. Subsequently, several of these issues were discussed in some depth, for instance, wether a file should contain unique identifiers for nodes (yes) and edges (optionally), or whether a graph should be allowed to have both directed and undirected edges (yes), or parallel edges (yes) or loops (yes).

While opinions on some of these details differ from the path currently taken in GXL, it was agreed that the proposal should be defined in cooperation with the GXL consortium and other relevant groups.

Results

The consensus among the participants was that a graph data format should have a layered architecture that conceptually separates the following five aspects (in order of increasing level of detail):

  1. structure (vertices, edges, incidences, attributes)
  2. topology (ordering of edges, crossings, etc.)
  3. shape (bends, feature shapes, etc.)
  4. geometry (positions, feature sizes, etc.)
  5. rendering (colors, styles, icons, textures, etc.)
The number of layers of the format need not necessarily match this list, but it should be possible to omit higher layers from a graph description. The design should be open to allow formats for applications other than graph drawing to specify other characteristics on top of the lower layers. In particular, the goal is to collaborate with groups working on a data exchange format for software reengineering and graph transformations. See http://www.cs.toronto.edu/~simsuz/wosef/ for more information.

While extensions to incorporate layout algorithms, dynamic graphs, clustering, and so on may not be part of the initial proposal, the need for such extensions should be taken into account.

As a result of the discussion, a working group was formed, and several others have joined since, or expressed interest in contributing. The group will focus on defining the layer structure with extension points. It is understood that a draft proposal will be made available for public review before the 2001 Symposium on Graph Drawing (GD 2001), and that the proposal will be published in the GD 2001 proceedings. The current status of the project is maintained at http://www.graphdrawing.org/data/format/.


Background Information

XML (eXtensible Markup Language)
Universal format for structured documents and data on the Web
 
RDF (Resource Description Framework)
Metadata format
 
HDF (Hierarchical Data Format)
Scientific data format
 
GraphXML
XML-based graph format proposal
 
GXL (Graph Exchange Language)
XML-based graph format proposal
 
GRXL
XML-based format used with Grrr (Graph Rewriting Programming Language)
 
GML (Graph Modelling Language)
 
SVG (Scalable Vector Graphics format)
XML-based format for graphics
 

Related Efforts


last modified 20 Nov 2000