Get UNIX directory tree structure into JSON object

I'm trying to build a browser application that visualizes file structures, so I want to print the file structure into a JSON object.

I've tried using many variations of 'ls' piped to sed, but it seems like find works best.

Right now I'm just trying to use the command

find ~ -maxdepth ? -name ? -type d -print

And tokenize the path variables

I've tried just doing simple AJAX with PHP exec()'ing this, but the array-walking is really slow. I was thinking of doing it straight from a bash script, but I can't figure out how to get pass-by-reference for associative arrays so I can recursively add all the tokenized path components to the tree.
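In PHP terms, the approach looks something like this sketch (the -maxdepth value is a placeholder, and exec()'ing find is just one way to get the list):

<?php
// Run find, split each path on '/', and fold the tokens into a nested
// associative array using a reference walk -- the part that's awkward in bash.
$paths = [];
exec('find ' . escapeshellarg(getenv('HOME') ?: '/tmp') . ' -maxdepth 3 -type d -print', $paths);

$tree = [];
foreach ($paths as $path) {
    $node = &$tree;                  // walk down the tree by reference
    foreach (explode('/', trim($path, '/')) as $token) {
        if ($token === '') {
            continue;
        }
        if (!isset($node[$token])) {
            $node[$token] = [];      // create the branch on first sight
        }
        $node = &$node[$token];
    }
    unset($node);                    // break the reference before the next path
}
echo json_encode($tree, JSON_PRETTY_PRINT);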

Is there a better or established way to do this?

Thanks!

I don't know what your application's requirements are, but one solution that solves your problem (and a number of other problems) is to hide the actual file system layout behind an abstraction layer.

Essentially, you write two threads. The first scrapes the file structures and creates a database representation of their contents. The second responds to browser requests, queries the database created by the first thread, and generates your JSON (i.e. a normal web request handler thread).

By abstracting the underlying storage structure (the file system), you create a layer that can add concurrency, deal with IO errors, etc. When someone changes a file within the structure, it's not visible to web clients until the "scraper" thread detects the change and updates the database. However, because web requests are not tied to reading the underlying file structure and merely query a database, response time should be fast.
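For illustration, a minimal sketch of the scraper half in PHP with SQLite (the scanned path, database file, and schema are all assumptions, not part of this answer's original design):

<?php
// Hypothetical scraper pass: walk the tree into a SQLite table so web
// requests only ever query the database and never touch the disk.
$db = new PDO('sqlite:' . __DIR__ . '/tree.db');
$db->exec('CREATE TABLE IF NOT EXISTS dirs (path TEXT PRIMARY KEY, parent TEXT)');
$db->exec('DELETE FROM dirs');       // naive full refresh on every pass
$insert = $db->prepare('INSERT INTO dirs (path, parent) VALUES (?, ?)');

$it = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator('/srv/files', FilesystemIterator::SKIP_DOTS),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach ($it as $entry) {
    if ($entry->isDir()) {
        $insert->execute([$entry->getPathname(), dirname($entry->getPathname())]);
    }
}
// The web-facing handler would then SELECT from dirs and json_encode the rows.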

HTH, nate.

9 years later... Using tree should do the job.

tree ~ -J -L ? -P '?' -d --noreport

where:

  • -J output as JSON
  • -L max level depth (equiv. to find's -maxdepth)
  • -P pattern to include (equiv. to find's -name)
  • -d directories only (equiv. to find's -type d)
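For example (depth chosen arbitrarily; output abridged, and the exact JSON shape varies between tree versions):

tree ~ -J -L 2 -d --noreport
[
  {"type":"directory","name":"/home/user","contents":[
    {"type":"directory","name":"Documents","contents":[
      {"type":"directory","name":"projects"}
    ]},
    {"type":"directory","name":"Music"}
  ]}
]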

Walking the disk is always going to be slower than ideal, simply because of all the seeking that needs to be done. If that's not a problem for you, my advice would be to work to eliminate overhead... starting with minimizing the number of fork() calls. Then you can just cache the result for however long you feel is appropriate.

Since you've already mentioned PHP, my suggestion is to write your entire server-side system in PHP and use the DirectoryIterator or RecursiveDirectoryIterator classes. Here's an SO answer for something similar to what you're asking for, implemented using the former.
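As a rough sketch of that idea (not the linked answer's exact code; the depth limit and starting directory are assumptions):

<?php
// Recursively build a nested array of directories with DirectoryIterator,
// mirroring `find -type d`, then emit it as JSON.
function dirTree(string $path, int $maxDepth, int $depth = 0): array {
    $node = ['name' => basename($path), 'children' => []];
    if ($depth >= $maxDepth) {
        return $node;
    }
    foreach (new DirectoryIterator($path) as $entry) {
        if ($entry->isDot() || !$entry->isDir()) {
            continue;                // directories only
        }
        $node['children'][] = dirTree($entry->getPathname(), $maxDepth, $depth + 1);
    }
    return $node;
}

echo json_encode(dirTree(getenv('HOME') ?: '/tmp', 3), JSON_PRETTY_PRINT);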

If disk I/O overhead is a problem, my advice is to implement a system along the lines of mlocate, which caches each directory listing along with the directory's ctime and uses stat() to compare ctimes, re-reading only those directories whose contents have changed.
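A rough sketch of that caching idea in PHP (the cache file location and serialization format are assumptions; mlocate itself is considerably more careful):

<?php
// Cache each directory's subdirectory listing keyed by its ctime; a scan
// pass then costs one stat() per unchanged directory instead of a re-read.
$cacheFile = '/tmp/dirtree.cache';
$cache = is_file($cacheFile) ? unserialize(file_get_contents($cacheFile)) : [];

function listSubdirs(string $dir, array &$cache): array {
    $ctime = filectime($dir);        // bumped when entries are added/removed
    if (!isset($cache[$dir]) || $cache[$dir]['ctime'] !== $ctime) {
        // Changed (or never seen): re-read the directory from disk.
        $entries = array_values(array_filter(
            scandir($dir),
            fn ($e) => $e !== '.' && $e !== '..' && is_dir("$dir/$e")
        ));
        $cache[$dir] = ['ctime' => $ctime, 'entries' => $entries];
    }
    return $cache[$dir]['entries'];
}

$subdirs = listSubdirs(getenv('HOME') ?: '/tmp', $cache);
file_put_contents($cacheFile, serialize($cache));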

I don't do much filesystem work in PHP, but, if it'd help, I can offer you a Python implementation of the basic mlocate-style updatedb process. (I use it to index files which have to be restored from DVD+R manually if my drive ever fails, because they're too big to fit comfortably on my rdiff-backup target drive.)
