
Writing to files from jupyter notebook

I tried to run this code:

from tqdm.auto import tqdm
import os
from datasets import load_dataset

dataset = load_dataset('oscar', 'unshuffled_deduplicated_ar', split='train[:25%]')

text_data = []
file_count = 0

for sample in tqdm(dataset['train']):
    sample = sample['text'].replace('\n', ' ')
    text_data.append(sample)
    if len(text_data) == 10_000:
        # once we hit the 10K mark, save to file
        filename = f'/data/text/oscar_ar/text_{file_count}.txt'
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(text_data))
        text_data = []
        file_count += 1
# after saving in 10K chunks, we will have ~2082 leftover samples, we save those now too
with open(f'data/text/oscar_ar/text_{file_count}.txt', 'w', encoding='utf-8') as fp:
    fp.write('\n'.join(text_data))

and I get the following PermissionError:

[Screenshot: Permission Error]

I've tried changing the permissions on this directory and running Jupyter with sudo privileges, but it still doesn't work.

You are opening:

with open(f'data/text/oscar_ar/text_{file_count}.txt')

But you are writing:

filename = f'/Dane/text/oscar_ar/text_{file_count}.txt'

And your screenshot says:

filename = f'/date/text/oscar_ar/text_{file_count}.txt'

You have to pick one of data, /date or /Dane :)
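One way to avoid this kind of mismatch is to define the output directory once and derive every filename from it. Below is a minimal sketch, not the asker's code: the helper name `write_in_chunks`, its parameters, and the relative `data/text/oscar_ar` directory in the usage note are all assumptions made for illustration.

```python
import os


def write_in_chunks(samples, out_dir, chunk_size=10_000):
    """Write strings to out_dir/text_<n>.txt, chunk_size lines per file.

    Hypothetical helper: the name and signature are illustrative only.
    Returns the list of file paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)  # create the directory once, up front
    written = []

    def flush(lines):
        # Every filename is built from the same out_dir, so the in-loop
        # write and the leftover write cannot point at different places.
        path = os.path.join(out_dir, f'text_{len(written)}.txt')
        with open(path, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(lines))
        written.append(path)

    chunk = []
    for sample in samples:
        chunk.append(sample.replace('\n', ' '))
        if len(chunk) == chunk_size:
            flush(chunk)
            chunk = []
    if chunk:  # leftover samples that did not fill a whole chunk
        flush(chunk)
    return written
```

Here `samples` can be any iterable of strings, for example `write_in_chunks(texts, 'data/text/oscar_ar')`, where `texts` stands for whatever yields the text of each sample.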


Also, it seems you should remove the leading / in /data/text/oscar_ar/text_{file_count}.txt.

Explanation: A slash (/) at the beginning of a path means the path is resolved from the root of the filesystem, the top level. Without the leading slash, the path is resolved relative to your current working directory.
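The difference can be checked directly from a notebook cell. This is a small sketch, assuming a POSIX system (Linux or macOS); the two example paths are only illustrative. Creating a directory directly under the filesystem root usually requires elevated privileges, which is a likely reason for a PermissionError with the absolute form.

```python
import os

absolute = '/data/text/oscar_ar/text_0.txt'   # starts at the filesystem root
relative = 'data/text/oscar_ar/text_0.txt'    # starts at the current directory

# On POSIX systems a leading slash makes the path absolute.
print(os.path.isabs(absolute))
print(os.path.isabs(relative))

# A relative path is resolved against the current working directory,
# so this shows where the file would actually end up.
resolved = os.path.abspath(relative)
print(os.getcwd())
print(resolved)
```

Printing `os.getcwd()` in the notebook is a quick way to see which directory relative paths are resolved against.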
