Writing to files from jupyter notebook

Question

I tried to run this code:

from tqdm.auto import tqdm
import os
from datasets import load_dataset

dataset = load_dataset('oscar', 'unshuffled_deduplicated_ar', split='train[:25%]')

text_data = []
file_count = 0

for sample in tqdm(dataset['train']):
    sample = sample['text'].replace('\n', ' ')
    text_data.append(sample)
    if len(text_data) == 10_000:
        # once we hit the 10K mark, save to file
        filename = f'/data/text/oscar_ar/text_{file_count}.txt'
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(text_data))
        text_data = []
        file_count += 1
# after saving in 10K chunks, we will have ~2082 leftover samples, we save those now too
with open(f'data/text/oscar_ar/text_{file_count}.txt', 'w', encoding='utf-8') as fp:
    fp.write('\n'.join(text_data))

and I get the following PermissionError:

Permission Error

I've tried changing the permissions on this directory and running Jupyter with sudo privileges, but it still doesn't work.

Answer 1

After the loop, you open:

with open(f'data/text/oscar_ar/text_{file_count}.txt')

But inside the loop, you write to:

filename = f'/Dane/text/oscar_ar/text_{file_count}.txt'

And your screenshot says:

filename = f'/date/text/oscar_ar/text_{file_count}.txt'

You have to choose one of data, /date, or /Dane :)
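
One way to avoid this kind of mismatch is to build the path in a single place and reuse it for both the 10K-chunk writes and the final leftover write. This is only a minimal sketch: the helper name write_chunk and the constant OUT_DIR are made up for illustration, and the relative directory is an assumption about where you want the files to go.

```python
import os

# Assumption: one relative output directory, defined once and reused everywhere.
OUT_DIR = os.path.join('data', 'text', 'oscar_ar')


def write_chunk(lines, file_count, out_dir=OUT_DIR):
    """Write one chunk of lines to text_{file_count}.txt and return its path."""
    # Create the directory (and any missing parents) before opening the file.
    os.makedirs(out_dir, exist_ok=True)
    filename = os.path.join(out_dir, f'text_{file_count}.txt')
    with open(filename, 'w', encoding='utf-8') as fp:
        fp.write('\n'.join(lines))
    return filename
```

Calling the same helper inside the loop and once more after it means the two writes can no longer point at different directories.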

Also, it seems you should remove the leading / in /data/text/oscar_ar/text_{file_count}.txt.

Explanation: When you put a slash (/) at the beginning of a path, the path is resolved from the root of the filesystem, the top level. Without the leading slash, it is resolved relative to your current working directory.
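
To see the difference for yourself, you can ask Python how it interprets each form. This is a small sketch that assumes a POSIX system (Linux or macOS); the two path strings are just the forms from the question with file_count set to 0.

```python
import os

absolute = '/data/text/oscar_ar/text_0.txt'  # starts at the filesystem root
relative = 'data/text/oscar_ar/text_0.txt'   # starts at the current working directory

print(os.path.isabs(absolute))  # True on POSIX systems
print(os.path.isabs(relative))  # False

# A relative path is resolved against os.getcwd(). In a notebook this is
# usually the directory the notebook server or kernel was started from.
resolved = os.path.abspath(relative)
print(resolved)
```

Running os.getcwd() in a notebook cell shows where a relative path will actually end up.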

solution1
1 ACCEPTED 2022-01-25 19:07:13
