简体   繁体   English

在Python中存储testbin / pastebin的文件

[英]Storing files for testbin/pastebin in Python

I'm basically trying to setup my own private pastebin where I can save html files on my private server to test and fool around - have some sort of textarea for the initial input, save the file, and after saving I'd like to be able to view all the files I saved. 我基本上是在尝试设置自己的私有pastebin,在其中我可以将html文件保存在私有服务器上进行测试和四处走动-为初始输入提供某种文本区域,保存文件,保存后我希望能够查看我保存的所有文件。

I'm trying to write this in python, just wondering what the most practical way would be of storing the file(s) or the code? 我试图用python编写此代码,只是想知道最实际的存储方式是存储文件或代码吗? SQLite? SQLite的? Straight up flat files? 整理平面文件?

One other thing I'm worried about is the uniqueness of the files, obviously I don't want conflicting filenames ( maybe save using 'title' and timestamp? ) - how should I structure it? 我担心的另一件事是文件的唯一性,显然我不想使用冲突的文件名(也许使用“ title”和时间戳保存?)-我应该如何构造它?

I wrote something similar a while back in Django to test jQuery snippets. 我在Django上写过类似的文章来测试jQuery代码段。 See: 看到:

http://jquery.nodnod.net/ http://jquery.nodnod.net/

I have the code available on GitHub at http://github.com/dz/jquerytester/tree/master if you're curious. 如果您感到好奇,我可以在GitHub上的代码http://github.com/dz/jquerytester/tree/master获得

If you're using straight Python, there are a couple ways to approach naming: 如果您使用的是纯Python,则有几种方法可以命名:

  1. If storing as files, ask for a name, salt with current time, and generate a hash for the filename. 如果存储为文件,则要求输入名称,当前时间加盐,并为文件名生成一个哈希。

  2. If using mysqlite or some other database, just use a numerical unique ID. 如果使用mysqlite或其他数据库,则只需使用数字唯一ID。

Personally, I'd go for #2. 就个人而言,我会排名第二。 It's easy, ensures uniqueness, and allows you to easily fetch various sets of 'files'. 这很容易,可确保唯一性,并允许您轻松获取各种“文件”。

Have you considered trying lodgeit . 您是否考虑过尝试寄宿 Its a free pastbin which you can host yourself. 它是一个免费的pastbin,您可以自己托管。 I do not know how hard it is to set up. 我不知道设置有多困难。

Looking at their code they have gone with a database for storage (sqllite will do). 查看他们的代码,他们将数据库与存储一起使用(sqllite会这样做)。 They have structured there paste table like, (this is sqlalchemy table declaration style). 他们在那里构造了粘贴表之类的(这是sqlalchemy表的声明样式)。 The code is just a text field. 该代码只是一个文本字段。

pastes = Table('pastes', metadata,
        Column('paste_id', Integer, primary_key=True),
        Column('code', Text),
        Column('parent_id', Integer, ForeignKey('pastes.paste_id'),
               nullable=True),
        Column('pub_date', DateTime),
        Column('language', String(30)),
        Column('user_hash', String(40), nullable=True),
        Column('handled', Boolean, nullable=False),
        Column('private_id', String(40), unique=True, nullable=True)
    )

They have also made a hierarchy (see the self join) which is used for versioning. 他们还建立了用于版本控制的层次结构(请参阅自我连接)。

Plain files are definitely more effective. 纯文件肯定更有效。 Save your database for more complex queries. 保存数据库以进行更复杂的查询。

If you need some formatting to be done on files, such as highlighting the code properly, it is better to do it before you save the file with that code. 如果您需要对文件进行某种格式设置,例如正确突出显示代码,则最好使用该代码保存文件之前进行格式化。 That way you don't need to apply formatting every time the file is shown. 这样,您无需在每次显示文件时都应用格式设置。

You definitely would need somehow ensure all file names are unique, but this task is trivial, since you can just check, if the file already exists on the disk and if it does, add some number to its name and check again and so on. 您肯定需要以某种方式确保所有文件名都是唯一的,但是此任务很简单,因为您可以检查磁盘上是否已存在该文件,如果存在,则在其名称上添加一些编号并再次检查,依此类推。

Don't store them all in one directory either, since filesystem can perform much worse if there are A LOT (~ 1 million) files in the single directory, so you can structure your storage like this: 也不要将它们全部存储在一个目录中,因为如果单个目录中有很多文件(〜100万个),文件系统的性能可能会更差,所以您可以按以下方式构建存储结构:

FILE_DIR/YEAR/MONTH/FileID.html and store the "YEAR/MONTH/FileID" Part in the database as a unique ID for the file. FILE_DIR / YEAR / MONTH / FileID.html并将“ YEAR / MONTH / FileID”部分作为文件的唯一ID存储在数据库中。

Of course, if you don't worry about performance (not many users, for example) you can just go with storing everything in the database, which is much easier to manage. 当然,如果您不担心性能(例如,用户数量不多),则可以将所有内容存储在数据库中,这更易于管理。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM