I am using Faker, a library for generating values for mock datasets, in a Jupyter Notebook.
The goal of this code is to generate fake data that is consistent with a randomly chosen gender, e.g. so that "Mrs." and "Peter" don't get mixed together.
The error relates to how I am assigning data to the DataFrame.
Cell 1:
import random

import numpy as np
import pandas as pd
from faker import Faker

fake = Faker()
np.random.seed(42)  # seeds NumPy only; the random module and Faker are not affected
Cell 2:
def example_dataset_simulation(samples, cols):
    df = pd.DataFrame(index=np.arange(samples), columns=np.arange(cols))
    #for col in range(cols):
    for row in range(samples):
        gender = random.randint(0, 1)
        df['Prefix'] = [fake.prefix_male() if gender == 0 else fake.prefix_female()]
        df['Forename'] = [fake.first_name_male() if gender == 0 else fake.prefix_female()]
        df['Surname'] = fake.first_name() # unconditional
        df['Suffix'] = [fake.suffix_male() if gender == 0 else fake.suffix_female()]
    return df
Cell 3:
df = example_dataset_simulation(2, 2)
df
Error:
ValueError: Length of values (1) does not match length of index (2)
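The same error can be reproduced without Faker, which may help isolate the cause. The following is a minimal sketch using only pandas; the two-row frame and the "Mr." value are placeholders chosen for illustration, not part of the original notebook.

```python
import pandas as pd

# Empty frame with a 2-row index, mirroring samples=2 in the question.
df = pd.DataFrame(index=range(2))

# A one-element list is treated as a whole column, so pandas compares
# its length (1) with the length of the index (2) and raises.
try:
    df["Prefix"] = ["Mr."]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# A bare scalar is broadcast to every row instead.
df["Prefix"] = "Mr."
print(df["Prefix"].tolist())  # ['Mr.', 'Mr.']
```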
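The fix: I must not wrap the conditional expression in list notation. The brackets turn the single value into a one-element list, and pandas rejects it because its length (1) does not match the length of the index (2).
I.e. remove the '[]':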
Original:
df['Prefix'] = [fake.prefix_male() if gender == 0 else fake.prefix_female()]
Now:
df['Prefix'] = fake.prefix_male() if gender == 0 else fake.prefix_female()