
Gremlin Interface for Filesystem

A filesystem has a tree or graph structure (depending on whether you allow hard or symbolic links).

I am looking for a way to traverse a filesystem with Gremlin queries. I tried to wrap the filesystem with a bit of software, see

https://github.com/BITPlan/com.bitplan.simplegraph

The JUnit test in https://github.com/BITPlan/com.bitplan.simplegraph/blob/master/src/test/java/com/bitplan/simplegraph/TestFileSystem.java

shows that things work in principle. For example, a traversal like

GraphTraversal<Vertex, Vertex> javaFiles = start.g().V().has("ext", "java");
long javaFileCount = javaFiles.count().next().longValue();

works.

What I do not like about the implementation is that it only looks like Gremlin in some parts; e.g. there is a recursiveOut function as a workaround instead of a proper repeat() step. The recursion is also flawed since it handles the intermediate ArrayLists inefficiently.
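
One way to avoid the intermediate ArrayLists is to replace the recursion with an explicit stack and to hand each reached node to a consumer. The following is a minimal sketch in plain Java, not the SimpleGraph implementation; LazyOut, its recursiveOut method and the children function are hypothetical names chosen for illustration, and the sketch assumes a tree without cycles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper: depth-limited "out" traversal without per-level result lists. */
class LazyOut {
  /** A node together with its distance from the start node. */
  private static final class Entry<T> {
    final T node;
    final int depth;

    Entry(T node, int depth) {
      this.node = node;
      this.depth = depth;
    }
  }

  /**
   * Reports every node reachable from start via children, at most maxDepth levels deep.
   * The start node itself is not reported. Assumes there are no cycles.
   */
  static <T> void recursiveOut(T start, Function<T, List<T>> children,
      int maxDepth, Consumer<T> visitor) {
    Deque<Entry<T>> stack = new ArrayDeque<>();
    stack.push(new Entry<>(start, 0));
    while (!stack.isEmpty()) {
      Entry<T> current = stack.pop();
      if (current.depth >= maxDepth)
        continue; // depth limit reached, do not expand this node
      for (T child : children.apply(current.node)) {
        visitor.accept(child);
        stack.push(new Entry<>(child, current.depth + 1));
      }
    }
  }

  public static void main(String[] args) {
    // small in-memory stand-in for a directory tree
    Map<String, List<String>> tree = new HashMap<>();
    tree.put("src", Arrays.asList("main", "test"));
    tree.put("main", Arrays.asList("App.java"));
    tree.put("test", Arrays.asList("AppTest.java"));
    List<String> visited = new ArrayList<>();
    recursiveOut("src", n -> tree.getOrDefault(n, Collections.emptyList()),
        Integer.MAX_VALUE, visited::add);
    System.out.println(visited);
  }
}
```

The only collection that grows during the walk is the stack of nodes still to be expanded; the caller decides whether the visited nodes are collected, counted or just printed.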

What's worse is that the wrapping has to visit all files to build a complete graph before a Gremlin traversal can start. I'd rather have an implementation where the traversal steps visit the corresponding file or directory in the filesystem while the traversal is running.
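
For comparison, the JDK already walks a directory tree on demand: Files.walk returns a stream that is populated lazily as it is consumed. The sketch below uses it to count files by extension without building a graph first. It is plain java.nio and not Gremlin; LazyFileWalk and countByExtension are hypothetical names used only for illustration. A TinkerPop-based solution would presumably need the same on-demand behavior inside its vertex and edge lookups.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper: count files by extension while walking the tree lazily. */
class LazyFileWalk {
  /** Counts the regular files below start whose name ends with "." + ext. */
  static long countByExtension(Path start, String ext) throws IOException {
    String suffix = "." + ext;
    // Files.walk reads directories as the stream is consumed;
    // try-with-resources closes the directory handles it opened
    try (Stream<Path> paths = Files.walk(start)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(suffix))
          .count();
    }
  }

  public static void main(String[] args) throws IOException {
    // the directory to walk is an assumption; pass another one as first argument
    Path start = Paths.get(args.length > 0 ? args[0] : ".");
    System.out.println(countByExtension(start, "java"));
  }
}
```

By default Files.walk does not follow symbolic links, so the walk sees the tree view of the filesystem rather than the graph view.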

How could the code/approach be improved to get closer to the above goals?

Alternatively, the improvement might not be worthwhile if there is a better or comparable implementation out there that can already do what I am describing.

Which filesystem traversal APIs based on Apache TinkerPop/Gremlin do you know of?

JUnit Test to show the principle

package com.bitplan.simplegraph;

import static org.junit.Assert.assertEquals;

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.junit.Test;

import com.bitplan.filesystem.FileNode;
import com.bitplan.filesystem.FileSystem;
import com.bitplan.simplegraph.SimpleSystem;

/**
 * test navigating the Filesystem with SimpleGraph approaches
 * @author wf
 *
 */
public class TestFileSystem {
  boolean debug=true;
  @Test
  public void testFileSystem() throws Exception {
    SimpleSystem fs=new FileSystem();
    FileNode start = (FileNode) fs.moveTo("src");
    if (debug)
      start.printNameValues(System.out);
    start.recursiveOut("files",Integer.MAX_VALUE).forEach(childFile->{
      if (debug)
        childFile.printNameValues(System.out);
    });
    long filecount = start.g().V().count().next().longValue();
    if (debug)
      System.out.println(filecount);
    assertEquals(25,filecount);
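    long javaFileCount = start.g().V().has("ext", "java").count().next().longValue();
    assertEquals(10,javaFileCount);
    // count() appends a step to the traversal it is called on and next() consumes it,
    // so the vertices are iterated with a fresh traversal
    GraphTraversal<Vertex, Vertex> javaFiles = start.g().V().has("ext", "java");
    javaFiles.forEachRemaining(javaFile-> {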
      for (String key:javaFile.keys()) {
        if (debug)
          System.out.println(String.format("%s = %s", key, javaFile.property(key).value()));
      }
      //Map<String, Object> javaFileMap =javaFile.valueMap().next();
      //javaFileMap.forEach((k, v) -> System.out.println(String.format("%s = %s", k, v)));
    });
  }

}

The SimpleGraph FileSystem module

The module has the needed capability:

 // create a new FileSystem access, supplying the result as a SimpleSystem API
 SimpleSystem fs=new FileSystem();
 // connect to this system with no extra information (e.g. no credentials) and move to the "src" node
 SimpleNode start = fs.connect("").moveTo("src");
 // do Gremlin-style out traversals recursively to any depth
 start.recursiveOut("files",Integer.MAX_VALUE);
