
How do I automatically generate files in the same Google Drive folder as my Colab notebook?

I am performing LDA on a Simple Wikipedia dump file, and the code I am following needs to write the articles out to files. Python and Colab are both broad topics, and I can't find an answer to this specific problem. Here's my code for connecting to Google Drive and downloading the dump:

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate the user
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Get your file
fileId ='xxxx'
fileName = 'simplewiki-20170820-pages-meta-current-reduced.xml'
downloaded = drive.CreateFile({'id': fileId})
downloaded.GetContentFile(fileName)

And here's the culprit: this code tries to create a file from each article.

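# Runs inside the loop over articles; i is the loop index.
# A truthy check covers both the None and the empty-string case.
if article_txt and len(article_txt) > 150 and is_ascii(article_txt):
    outfile = dir_path + str(i + 1) + "_article.txt"
    # The with block closes the file even if the write fails.
    with codecs.open(outfile, "w", "utf-8") as f:
        f.write(article_txt)
    print(article_txt)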

I have tried many things over several hours and can't recall them all. What I need to know is how to change this code so that it works with Google Drive. One thing I recall trying is replacing the file-writing code with this:

file_obj = drive.CreateFile()
file_obj['title'] = "file name"

But then I got the error 'expected str, bytes or os.PathLike object, not GoogleDriveFile'. This is not a question of how to upload a file and open it in Colab; I already know how to do that with the XML file. What I need to know is how to generate files from my Colab script and place them in the same folder as the script. Any help would be appreciated. Thanks!

I am not sure whether the problem is with generating the files or with copying them to Google Drive. If it is the latter, a simpler approach is to mount your Drive directly on the instance, as follows:

from google.colab import drive

drive.mount('drive')

You can then access any item in your Drive as if it were on a local disk, and copy your files with shell commands:

!cp filename 'drive/My Drive/folder1/'

Another alternative is to use shutil:

import shutil

shutil.copy(filename, 'drive/My Drive/folder1/')
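
Because the mounted Drive behaves like an ordinary directory, the article-writing code from the question could also write straight into it and skip the copy step. The following is only a minimal sketch: save_article and its parameters are hypothetical names introduced here for illustration, and the Drive path in the closing comment assumes the mount point and folder used in the examples above.

```python
import codecs
import os


def save_article(article_txt, index, out_dir):
    """Write one article to <out_dir>/<index>_article.txt and return the path."""
    # Create the target folder if it does not exist yet.
    os.makedirs(out_dir, exist_ok=True)
    outfile = os.path.join(out_dir, str(index) + "_article.txt")
    # The with block closes the file even if the write fails.
    with codecs.open(outfile, "w", "utf-8") as f:
        f.write(article_txt)
    return outfile


# In Colab, after drive.mount('drive'), the call inside the article loop
# might look like this (the folder name is an assumption):
# save_article(article_txt, i + 1, 'drive/My Drive/folder1/')
```

Passing a plain path string this way also avoids the 'expected str, bytes or os.PathLike object' error, which appears when a GoogleDriveFile object is handed to a function that expects a file path.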
