How do I efficiently transfer a huge number of files from my Python clients to a server and back?

I have around 100 clients (mostly Windows machines, with one or two Macs/Ubuntu boxes) and I need to sync a huge number of files between the clients by means of a central server, which does almost no work on the synced files (mostly managing access rights).

For now I see two solutions available:

  1. Use XML-RPC. It looks great, but I'm not sure about the performance; from what I've googled, the performance of this approach is subpar.

  2. Use paramiko and copy files via FTP or SCP. I don't like that solution because I'm storing the files in Riak, so it would double the I/O work on the server side: first write the file to disk, then read it back from disk, and finally write it to Riak.

Is there a third approach, like using sockets and writing the file-transfer code myself? Is there an asynchronous XML-RPC server, and do I need one for my task?
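To make the third option concrete, here is a minimal sketch of the kind of hand-rolled socket transfer I have in mind (the length-prefixed framing and the chunk size are just placeholders, not a finished protocol):

    import pathlib
    import socket
    import struct

    CHUNK = 64 * 1024  # arbitrary chunk size

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        """Read exactly n bytes, or raise if the peer closes early."""
        buf = b""
        while len(buf) < n:
            part = sock.recv(n - len(buf))
            if not part:
                raise ConnectionError("peer closed mid-transfer")
            buf += part
        return buf

    def send_file(sock: socket.socket, path: pathlib.Path) -> None:
        """Frame one file as an 8-byte big-endian length, then raw bytes."""
        sock.sendall(struct.pack(">Q", path.stat().st_size))
        with path.open("rb") as f:
            while chunk := f.read(CHUNK):
                sock.sendall(chunk)

    def recv_file(sock: socket.socket, dest: pathlib.Path) -> None:
        """Receive one file framed by send_file()."""
        (size,) = struct.unpack(">Q", recv_exact(sock, 8))
        with dest.open("wb") as f:
            while size > 0:
                part = recv_exact(sock, min(CHUNK, size))
                f.write(part)
                size -= len(part)

Even this toy version already needs framing and partial-read handling, which is part of why I'm hesitant to roll my own.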

Operations during file transfer (see the sketch after this list):

  1. Authentication of the uploading user.

  2. Checking the user's disk quota.

  3. Rule-based access rights management (who can read/write each file/directory).

  4. Placing files in Riak, because a certain level of fault tolerance is needed.
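To illustrate, here is a rough sketch of how those four steps could be chained on the server, assuming the official riak Python client; authenticate, check_quota and may_write are hypothetical placeholders for our own auth/quota/ACL logic:

    import riak  # official Riak Python client

    client = riak.RiakClient(pb_port=8087)  # Protocol Buffers port is an assumption
    files = client.bucket("files")

    def authenticate(user):            # placeholder: real authentication goes here
        return True

    def check_quota(user, nbytes):     # placeholder: real quota check goes here
        return True

    def may_write(user, path):         # placeholder: real rule-based ACL check goes here
        return True

    def handle_upload(user, path, data):
        """Hypothetical pipeline covering steps 1-4 above."""
        if not authenticate(user):                 # 1. authenticate uploader
            raise PermissionError("bad credentials")
        if not check_quota(user, len(data)):       # 2. enforce disk quota
            raise OSError("quota exceeded")
        if not may_write(user, path):              # 3. check access rights
            raise PermissionError("no write access to " + path)
        obj = files.new(path, encoded_data=data,   # 4. store in Riak, which
                        content_type="application/octet-stream")
        obj.store()                                #    replicates for fault tolerance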

As I see it, this application is actually closer to Dropbox than to rsync. We would actually use the Dropbox API, but this storage needs to be integrated deeply with our other systems, so we wanted more control over it.

The first thing that comes to mind when you say "sync a huge number of files" is rsync. In case you don't know that tool: it lets you sync directories efficiently, both locally and remotely. It can be configured to skip unchanged files, which makes it very efficient.
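For example, the whole client-side sync can be a single rsync invocation; from Python you could just shell out to it (the host, user and paths below are placeholders):

    import subprocess

    # -a: archive mode (recurse, preserve permissions and times)
    # -z: compress data in transit
    # --delete: drop server-side files the client no longer has
    subprocess.run(
        ["rsync", "-az", "--delete",
         "/local/project/",  # trailing slash: sync the directory's contents
         "syncuser@server.example.com:/srv/project/"],
        check=True,
    )

The skipping of unchanged files is what makes repeated runs cheap: rsync compares sizes and modification times (or checksums) and only transfers the differences.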

Now, when you say that the server "does almost no work on synced files", what is "almost"? If there is nothing to do on the files, you can use rsync. If there actually is some heavy computation on the files, its cost will probably dwarf the cost of transferring them, so I/O is not your bottleneck and you can use any tool for it without degrading performance.

Now, if you can mirror the files on the server and apply the various modifications there, you could use rsync to transfer them efficiently. That would let you build on proven infrastructure instead of reinventing the file-transfer wheel. I must stress, though, that I don't understand from your description what exactly you are doing; if you described the requirements in a bit more detail, there might be a better or different answer.

Edit, in response to the updated question:

There are Python rsync bindings that should let you sync even from MS Windows systems. They don't mention OS X, but since that is rather close to POSIX, chances are high that it works without too much hassle. On the server side, you just monitor the local filesystem for changes (check out e.g. iwatch) and then commit the differences to your DB. Using these two should get you started; if the performance later turns out to be insufficient, you could hook into the rsync server (it is open source) and trigger the DB updates from there without going through the filesystem.
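On the monitoring side, iwatch is a Perl tool; if you prefer to stay in Python, the watchdog package (my suggestion, not something the bindings require) can do the same job. A minimal sketch, with commit_to_db standing in for whatever writes the change into your database:

    import time
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    def commit_to_db(path):
        """Placeholder: write the changed file into the database here."""
        print("would commit", path)

    class CommitChanges(FileSystemEventHandler):
        """Forward every file change under the rsync target to the DB."""
        def on_modified(self, event):
            if not event.is_directory:
                commit_to_db(event.src_path)

        on_created = on_modified  # treat newly created files the same way

    observer = Observer()
    observer.schedule(CommitChanges(), "/srv/project", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)  # Observer runs in its own thread; keep the main thread alive
    finally:
        observer.stop()
        observer.join()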
