Listing all Google Drive files and folders in Python and saving their IDs

I'm trying to write a program to copy a folder and all contents (including sub-folders etc.) to another folder.

I might be over-complicating it, but I feel the first step is to get all the file names and the IDs associated with them and save them to two lists: one for files, one for folders.

I'm having trouble getting my program to go through all the sub-folders recursively. I thought a for loop would do the trick, using i to index into the list as it is being populated.

As you can see from the output below, my program walks the directory passed to the function and the first sub-folder fine, but then it exits cleanly.

Apologies for the mass of code but context is important I guess.

Input:

listofdictFolders = []
listofdictFiles = []


def mapFolderContents(folderid):
    # retrieves parent name from folderid
    parentfolder = service.files().get(fileId=folderid).execute()
    parentname = 'Title: %s' % parentfolder['name']
    # sets query as argument passed to function, searches for mimeType matching folders and saves to variable
    folderquery = "'" + folderid + "'" + " in parents and mimeType='application/vnd.google-apps.folder'"
    childrenFoldersDict = service.files().list(q=folderquery,
                                               spaces='drive',
                                               fields='files(id, name)').execute()
    # sets query as argument passed to function, searches for mimeType matching NOT folders and saves to variable
    notfolderquery = "'" + folderid + "'" + \
                     " in parents and not mimeType='application/vnd.google-apps.folder'"
    childrenFilesDict = service.files().list(q=notfolderquery,
                                             spaces='drive',
                                             fields='files(name, id)').execute()
    # takes value pair of 'files' which is a list of dictionaries containing ID's and names.
    childrenFolders = (childrenFoldersDict['files'])
    # takes value pair of 'files' which is a list of dictionaries containing ID's and names.
    childrenFiles = (childrenFilesDict['files'])
    # if no files found, doesn't append to list
    if len(childrenFiles) > 0:
        listofdictFiles.append(['Parent Folder ' + parentname, childrenFiles])
    # if no folders found, doesn't append to list 
    if len(childrenFolders) > 0:
        listofdictFolders.append(['Parent Folder ' + parentname, childrenFolders])
    # finds length of list for use in for loop later to avoid index out of range error
    maxIndex = len(listofdictFolders)
    # for loop to find ID's and names of folders returned above and append name and ID's to dictionary
    for i in range(0, maxIndex):
        # strip variables are to access ID values contained in dictionary
        strip1 = listofdictFolders[0]
        strip2 = strip1[1]
        print('Now indexing ' + str(strip2[i]['name']) + '...')
        # saves query to variable using strip2 variable, index and 'id' key
        loopquery = "'" + str(strip2[i]['id']) + "'" \
                    + " in parents and mimeType='application/vnd.google-apps.folder'"
        loopquery2 = "'" + str(strip2[i]['id']) + "'" \
                    + " in parents and not mimeType='application/vnd.google-apps.folder'"
        # saves return value (dictionary) to variable
        loopreturn = service.files().list(q=loopquery,
                                          spaces='drive',
                                          fields='files(id, name)').execute()
        loopreturn2 = service.files().list(q=loopquery2,
                                          spaces='drive',
                                          fields='files(id, name)').execute()
        loopappend = (loopreturn['files'])
        loopappend2 = (loopreturn2['files'])
        # appends list of dictionaries to listofdictFolders
        listofdictFolders.append(['Parent Folder Title: ' + str(strip2[i]['name']), loopappend])
        listofdictFiles.append(['Parent Folder Title: ' + str(strip2[i]['name']), loopappend2])

mapFolderContents(blankJobFolderID)
pprint.pprint(listofdictFiles)
print('')
pprint.pprint(listofdictFolders)

Output:

Now indexing subfolder 1...
[['Parent Folder Title: Root',
  [{'id': 'subfolder 1 ID', 'name': 'subfolder 1'},
   {'id': 'subfolder 2 ID', 'name': 'subfolder 2'},
   {'id': 'subfolder 3 ID', 'name': 'subfolder 3'}]],
 ['Parent Folder Title: subfolder 1',
  [{'id': 'sub-subfolder1 ID', 'name': 'sub-subfolder 1'},
   {'id': 'sub-subfolder2 ID', 'name': 'sub-subfolder 2'}]]]

[['Parent Folder Title: Venue',
  [{'id': 'sub-file 1 ID',
    'name': 'sub-file 1'}]]]

Process finished with exit code 0

You can retrieve all the files and folders with a recursive traversal that keeps the folders still to visit in a queue.

Here's my approach:

def getChildrenFoldersByFolderId(folderid):
  folderquery = "'" + folderid + "'" + " in parents and mimeType='application/vnd.google-apps.folder'"
  childrenFoldersDict = service.files().list(q=folderquery,
                                              spaces='drive',
                                              fields='files(id, name)').execute()

  return childrenFoldersDict['files']

def getChildrenFilesById(folderid):
  notfolderquery = "'" + folderid + "'" + \
                    " in parents and not mimeType='application/vnd.google-apps.folder'"
  childrenFilesDict = service.files().list(q=notfolderquery,
                                            spaces='drive',
                                            fields='files(name, id)').execute()

  return childrenFilesDict['files']

def getParentName(folderid):
  # retrieves parent name from folderid
  parentfolder = service.files().get(fileId=folderid).execute()
  parentname = 'Title: %s' % parentfolder['name']

  return parentname

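def bfsFolders(queue=None):
  # avoid a mutable default argument, which would be shared between calls
  if queue is None:
    queue = []
  listFilesFolders = {}
  while len(queue) > 0:
    # pop() takes the most recently added folder from the end of the list
    currentFolder = queue.pop()
    childrenFolders = getChildrenFoldersByFolderId(currentFolder['id'])
    childrenFiles = getChildrenFilesById(currentFolder['id'])
    # note: this looks up the name of the current folder itself
    parentName = getParentName(currentFolder['id'])
    listFilesFolders['folderName'] = currentFolder['name']
    listFilesFolders['folderId'] = currentFolder['id']
    listFilesFolders['parentName'] = parentName
    if len(childrenFiles) > 0:
      listFilesFolders['childrenFiles'] = childrenFiles

    # a folder without sub-folders is a leaf: return its node as is
    if len(childrenFolders) <= 0:
      return listFilesFolders

    # otherwise expand each sub-folder and nest its node under this one
    listFilesFolders['childrenFolders'] = []
    for child in childrenFolders:
      queue.append(child)
      listFilesFolders['childrenFolders'].append(bfsFolders(queue))

  return listFilesFolders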



filesAndFolders = bfsFolders([{'id': "ASDASDASDASDVppeC1zVVlWdDhkASDASDQ", 'name': 'folderRoot'}])

pprint.pprint(filesAndFolders)

First of all, separate the functions in order to simplify the script. Once you've done this, start the traversal from a root node, passed as a parameter, which contains the ID and name of the folder.
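As a rough sketch of that separation, the two list queries could also be folded into one helper that follows every result page. The name listChildren, the FOLDER_MIME constant and passing service as an argument are assumptions made for illustration, not part of the original answer. It relies on the Drive v3 files().list pageToken parameter and the nextPageToken field of the response, since a single call only returns one page of results.

```python
FOLDER_MIME = 'application/vnd.google-apps.folder'


def listChildren(service, folderid):
    """Return (folders, files) found directly under folderid.

    Hypothetical helper: 'service' is assumed to be a Drive v3 client
    such as the one used in the question.
    """
    query = "'" + folderid + "' in parents"
    folders, files = [], []
    pageToken = None
    while True:
        # ask for nextPageToken so we know whether more pages exist
        response = service.files().list(
            q=query,
            spaces='drive',
            fields='nextPageToken, files(id, name, mimeType)',
            pageToken=pageToken).execute()
        # split the page into folders and non-folders by MIME type
        for item in response.get('files', []):
            if item['mimeType'] == FOLDER_MIME:
                folders.append(item)
            else:
                files.append(item)
        pageToken = response.get('nextPageToken')
        if pageToken is None:
            break
    return folders, files
```

With one query per folder instead of two, the number of API calls is roughly halved, at the cost of also requesting the mimeType field.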

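The function keeps the folders still to visit in a list called queue and collects the result for the current folder in a dictionary called listFilesFolders. Because queue.pop() takes the most recently added folder and the function calls itself for every sub-folder, the traversal is effectively depth-first rather than a strict breadth-first search. Once the dictionary is filled in, it is returned as the node (the dictionary itself), unless there are more folders to "expand".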
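For comparison, a strict breadth-first walk could use collections.deque as a real FIFO queue, without recursion. This is only a minimal sketch: walkBreadthFirst and the getChildren callback are hypothetical names. getChildren(folderid) is assumed to return a (folders, files) pair of lists of dictionaries with 'id' and 'name' keys, for example by wrapping the two helper functions from the answer. It fills two flat lists, one for folders and one for files, which is what the question asks for.

```python
from collections import deque


def walkBreadthFirst(rootFolder, getChildren):
    """Visit rootFolder and every sub-folder level by level.

    getChildren is an assumed callback: folder ID -> (folders, files).
    Returns (allFolders, allFiles) as two flat lists.
    """
    allFolders, allFiles = [], []
    seen = {rootFolder['id']}      # guards against visiting a folder twice
    queue = deque([rootFolder])
    while queue:
        current = queue.popleft()  # FIFO: oldest folder first
        folders, files = getChildren(current['id'])
        for folder in folders:
            if folder['id'] in seen:
                continue
            seen.add(folder['id'])
            allFolders.append({'id': folder['id'],
                               'name': folder['name'],
                               'parentId': current['id']})
            queue.append(folder)
        for item in files:
            allFiles.append({'id': item['id'],
                             'name': item['name'],
                             'parentId': current['id']})
    return allFolders, allFiles
```

With the answer's helpers, the callback could be written as lambda fid: (getChildrenFoldersByFolderId(fid), getChildrenFilesById(fid)). Recording parentId for each entry keeps enough information to rebuild the hierarchy when copying.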
