
How can I download specific files from Google Drive programmatically using Python?

I have around 100k files in different folders in my Google Drive. I want to download specific files from it. The paths of those files in Google Drive are listed in a CSV.

But how can I get the IDs of the files? I tried the following.

import io

import pandas as pd
from apiclient import errors
from googleapiclient.http import MediaIoBaseDownload
#from pygdrive3 import service


def retrieve_all_files(service):
  """Retrieve a list of File resources.

  Args:
    service: Drive API service instance.
  Returns:
    List of File resources.
  """
  result = []
  page_token = None
  while True:
    try:
      param = {}
      if page_token:
        param['pageToken'] = page_token
      files = service.files().list(**param).execute()

      #result.extend(files['items'])
      idval = files.get('id')
      if not idval:
        break
    except errors.HttpError as error:
      print('An error occurred: %s' % error)
      break
  return idval


df = pd.read_csv("/home/ram/Downloads/Data_Science/Kaggle Competition/BBox_List_2017_path_colab.csv",header=None)
print(df.head())
for i in df[0]:
    request = drive_service.files()
    result = retrieve_all_files(request)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print ("Download %d%%." % int(status.progress() * 100))

But I get an error saying that drive_service is not defined. Below is my CSV:

                                                   0           1  ...           4            5
0  /content/drive/My Drive/nihxray/images_001/ima...  225.084746  ...   79.186441  Atelectasis
1  /content/drive/My Drive/nihxray/images_001/ima...  686.101695  ...  313.491525  Atelectasis
2  /content/drive/My Drive/nihxray/images_001/ima...  221.830508  ...  216.949153  Atelectasis
3  /content/drive/My Drive/nihxray/images_001/ima...  726.237288  ...   55.322034  Atelectasis
4  /content/drive/My Drive/nihxray/images_001/ima...  660.067797  ...   78.101695  Atelectasis

I have to download only the files listed in the CSV above. How can I do it in Python? Any help is appreciated.
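The CSV holds Colab mount paths rather than Drive file IDs, so each path would first have to be resolved to an ID. Below is a minimal sketch of one way to do that with the Drive v3 API, walking the path one folder at a time. The helper names (`split_drive_path`, `escape_query_value`, `resolve_path_to_id`, `download_file`) and the `COLAB_PREFIX` constant are illustrative assumptions, not part of any library, and the sketch assumes every path starts with `/content/drive/My Drive/` as in the sample rows.

```python
COLAB_PREFIX = "/content/drive/My Drive/"  # assumed mount prefix, taken from the sample CSV rows


def split_drive_path(path, prefix=COLAB_PREFIX):
    """Return the folder and file names of a Colab path, relative to My Drive."""
    if not path.startswith(prefix):
        raise ValueError("path is not under %r: %r" % (prefix, path))
    return [part for part in path[len(prefix):].split("/") if part]


def escape_query_value(name):
    """Escape backslashes and single quotes for use inside a Drive `q` string."""
    return name.replace("\\", "\\\\").replace("'", "\\'")


def resolve_path_to_id(service, path, cache=None):
    """Walk the path one component at a time; return the file ID or None."""
    parts = split_drive_path(path)
    if not parts:
        return None
    cache = cache if cache is not None else {}
    parent_id = "root"  # Drive alias for the top level of My Drive
    for name in parts:
        key = (parent_id, name)
        if key not in cache:
            query = "name = '%s' and '%s' in parents and trashed = false" % (
                escape_query_value(name), parent_id)
            response = service.files().list(
                q=query, fields="files(id, name)", pageSize=1).execute()
            matches = response.get("files", [])
            if not matches:
                return None
            # Drive permits duplicate names; this sketch takes the first match.
            cache[key] = matches[0]["id"]
        parent_id = cache[key]
    return parent_id


def download_file(service, file_id, destination):
    """Download one file's content to a local path."""
    # Imported lazily so the path helpers work without the client installed.
    from googleapiclient.http import MediaIoBaseDownload

    request = service.files().get_media(fileId=file_id)
    with open(destination, "wb") as handle:
        downloader = MediaIoBaseDownload(handle, request)
        done = False
        while not done:
            _status, done = downloader.next_chunk()
```

The `drive_service is not defined` error arises because the service object is never created. With google-api-python-client it would typically come from `googleapiclient.discovery.build("drive", "v3", credentials=creds)`, where `creds` is your own authorized credentials object. Each path in `df[0]` could then be passed to `resolve_path_to_id` and the resulting ID to `download_file`. Passing the same `cache` dict to every call avoids looking up shared folders such as `nihxray/images_001` more than once.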

Here are two snippets from an async Google API client. This might suit you better because it will allow you to download multiple files at the same time:

List files (by ID): https://github.com/omarryhan/aiogoogle/blob/master/examples/list_drive_files.py

Download files: https://github.com/omarryhan/aiogoogle/blob/master/examples/download_drive_file.py
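As a rough illustration of what downloading several files at the same time looks like, here is a minimal asyncio sketch. It does not use aiogoogle's API: `download_many` is an illustrative helper, and `fake_download` is a hypothetical stand-in for whatever coroutine actually fetches one file (for instance one adapted from the linked download example). The semaphore limits how many downloads run concurrently.

```python
import asyncio


async def download_many(file_ids, download_one, max_concurrent=5):
    """Run download_one(file_id) for every ID, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(file_id):
        async with semaphore:
            try:
                return file_id, await download_one(file_id), None
            except Exception as error:  # record the failure and keep going
                return file_id, None, error

    # gather preserves the order of the input IDs in its result list
    return await asyncio.gather(*(guarded(file_id) for file_id in file_ids))


async def fake_download(file_id):
    """Hypothetical placeholder for a real async download call."""
    await asyncio.sleep(0)
    return "%s.bin" % file_id


if __name__ == "__main__":
    results = asyncio.run(download_many(["id1", "id2", "id3"], fake_download))
    for file_id, saved_as, error in results:
        print(file_id, saved_as, error)
```

Each result is a `(file_id, return_value, error)` tuple, so a failed file does not stop the rest of the batch and can be retried afterwards.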

There is a much easier way that makes more sense. After installing Python and GAM, you can run a script that reads Google Drive file IDs from a CSV file and exports every document in the list. Once Python and GAM are installed, you will need to install some modules for the script to work; if the script fails, you can google the error messages to see what needs to be installed in Python. You will also need to create an API credential (service account) and replace the account name in the script in both places. Save the script as script.py, then run cmd as administrator and use the following command:

C:\Users\dcahoon\AppData\Local\Programs\Python\Python38\python.exe C:\GAM\SCRIPT.PY

Script start:

import subprocess

from csv import writer
from csv import reader

# Path to googleidlist.csv (one Google Drive file ID in the first column of each row)
csvfile = 'c:\\GAM\\googleidlist.csv'
destination = 'c:\\GAM\\OUTPUT\\'      # Destination for downloaded documents


def first_line(text):
    """Return the first line of a command's output, or '' if there was none."""
    lines = text.splitlines()
    return lines[0] if lines else ''


# Open the input file in read mode and the output file in write mode
with open(csvfile, 'r', newline='') as read_obj, \
        open('output_1.txt', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = writer(write_obj)
    # Read each row of the input CSV file as a list
    for row in csv_reader:
        if not row:
            continue  # skip blank lines
        file_id = row[0]
        # Download the file into the destination folder
        outcome = subprocess.run(['gam', 'user', 'googleserviceaccountname', 'get', 'drivefile', 'id', file_id, 'targetfolder', destination], stdout=subprocess.PIPE, text=True)
        # Look up the file's name
        filename = subprocess.run(['gam', 'user', 'googleserviceaccountname', 'show', 'fileinfo', file_id, 'name'], stdout=subprocess.PIPE, text=True)
        output = first_line(outcome.stdout)
        file_name = first_line(filename.stdout)
        print(output)
        # Append the download result, the file name and the file ID to the row
        row.append(output)
        row.append(file_name)
        row.append(file_id)

        # Add the updated row / list to the output file
        csv_writer.writerow(row)
