KangurOS Solution

The graph I have in mind has directed edges that always connect three vertices. In such a connection, each vertex takes the role of the subject, the predicate, or the object (as in the Semantic Web). In another relation, the same vertex can take a different role. This means we have a universe U of objects and a relation R ⊆ U × U × U, i.e. a set of triples (x, y, z) with x, y, z ∈ U. We decided to use the following layout, which adds a fourth side for a way id:

       w

       P
       |
       |
x S---------O z
       |
       |
       W

       y

  • S: subject side

  • P: predicate side

  • O: object side

  • W: way id side

  • w, x, y, z are vertices, each containing a file
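
To make the roles concrete, here is a minimal Python sketch of such an edge. The names `Edge`, `universe`, `relation` and `roles_of`, as well as the sample vertex ids, are assumptions made for illustration and are not part of the design above. It only shows that one vertex can be predicate in one edge and subject in another.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One edge connecting four vertices by role (illustrative names)."""
    subject: str    # S side (x in the diagram)
    predicate: str  # P side (w in the diagram)
    object: str     # O side (z in the diagram)
    way: str        # W side (y in the diagram), the way id


# The universe U: hypothetical vertex ids mapped to the file each vertex holds.
universe = {
    "alice": b"file for alice",
    "knows": b"file for knows",
    "bob": b"file for bob",
    "likes": b"file for likes",
    "way1": b"file for way1",
    "way2": b"file for way2",
}

# The relation R: a set of edges over U.
relation = {
    Edge("alice", "knows", "bob", "way1"),
    # "knows" was the predicate above; here the same vertex is the subject.
    Edge("knows", "likes", "alice", "way2"),
}


def roles_of(vertex, edges):
    """Return the set of role names a vertex plays across all edges."""
    return {
        role
        for edge in edges
        for role, value in edge._asdict().items()
        if value == vertex
    }
```

Calling `roles_of("knows", relation)` on this sample data yields both `"predicate"` and `"subject"`.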

The following is obsolete:

x S---------O z
       |
       |
       P

       y

  • S: subject side

  • P: predicate side

  • O: object side

  • x, y, z are vertices, each containing a file

In a first implementation, a vertex will be stored in four files:

  1. the file itself with its content

  2. a tuple-list with links to the (predicate, object) pairs where this vertex is the subject

  3. a tuple-list with links to the (subject, object) pairs where this vertex is the predicate

  4. a tuple-list with links to the (subject, predicate) pairs where this vertex is the object
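
The four parts listed above can be sketched in memory as follows. This is only an illustration under assumptions: the class `Vertex`, the function `add_triple` and the sample ids are hypothetical, and plain Python lists stand in for the four files.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    content: bytes = b""                              # 1. the file itself
    as_subject: list = field(default_factory=list)    # 2. (predicate, object) pairs
    as_predicate: list = field(default_factory=list)  # 3. (subject, object) pairs
    as_object: list = field(default_factory=list)     # 4. (subject, predicate) pairs


def add_triple(store, s, p, o):
    """Record the triple (s, p, o) in the tuple-lists of all three vertices."""
    for vertex_id in (s, p, o):
        store.setdefault(vertex_id, Vertex())
    store[s].as_subject.append((p, o))
    store[p].as_predicate.append((s, o))
    store[o].as_object.append((s, p))


store = {}
add_triple(store, "alice", "knows", "bob")
add_triple(store, "alice", "likes", "knows")
```

Every triple is written three times, once per participating vertex, so a lookup by any role only needs to read the lists of a single vertex.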

In a database, we would have the following two tables:

Objects database

id: ID

value: BLOB

id is indexed, value is not (id might be the primary key).

Link database

subject: ID

predicate: ID

object: ID
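
As a rough sketch, the two tables could be set up with SQLite through Python's `sqlite3` module. The table names `objects` and `links`, the index names and the sample ids are assumptions for illustration. The ID type is modelled as a BLOB column here, and short byte strings stand in for real IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (
        id    BLOB PRIMARY KEY,   -- the primary key gives id an index
        value BLOB                -- not indexed
    );
    CREATE TABLE links (
        subject   BLOB NOT NULL REFERENCES objects(id),
        predicate BLOB NOT NULL REFERENCES objects(id),
        object    BLOB NOT NULL REFERENCES objects(id)
    );
    -- one index per link column
    CREATE INDEX links_subject   ON links(subject);
    CREATE INDEX links_predicate ON links(predicate);
    CREATE INDEX links_object    ON links(object);
""")

# Hypothetical sample data: three objects and one link between them.
ids = {name: name.encode() for name in ("alice", "knows", "bob")}
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [(i, b"content of " + i) for i in ids.values()],
)
conn.execute(
    "INSERT INTO links VALUES (?, ?, ?)",
    (ids["alice"], ids["knows"], ids["bob"]),
)

# Look up all (predicate, object) pairs where "alice" is the subject.
rows = conn.execute(
    "SELECT predicate, object FROM links WHERE subject = ?",
    (ids["alice"],),
).fetchall()
```

The query at the end corresponds to the lookup "where this vertex is the subject"; the other two roles work the same way with the `predicate` and `object` columns.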

All three columns are indexed.

The type ID needs to be something that can uniquely address every object we could possibly have. To begin with, let's assume that an ID is a 256-bit integer. Then we could even assign random numbers as IDs and would most likely never get a double hit (it is far more likely that our computer is destroyed by a meteor), even if we go up to 1000 Peta = 10**18 objects (the whole Internet Archive is 2 petabytes nowadays). By the birthday bound, the collision probability for n = 10**18 random 256-bit IDs is roughly n**2 / 2**257, which is about 4 * 10**-42. If an object needs at least 256 bits (32 bytes) plus its data to store, this would mean at least 32000 petabytes of storage.

Ideas about clustering objects: To accelerate access, it makes sense to keep data that belongs together close to each other. This gets especially interesting if the data is spread across different servers.

Graph visualizers:

Pure 3D engines for python: