1. Creating Fake User Data
Using the Faker library, I'll generate fake data for 20 users. The key calls are:
random.randint(a, b) # Generates a random integer between 'a' and 'b', both inclusive
random.choice(["A", "B"]) # Randomly picks one value from a list
fake.name() # Generates a random fake name
fake.country() # Generates a random country

import pandas as pd
from faker import Faker
import random
#Initialize Faker
fake = Faker()
#set a random seed for reproducibility
random.seed(42)
fake.seed_instance(42) #Faker keeps its own random generator, so it needs its own seed to produce the same values on every run
#Generate 20 fake users
#The loop variable i runs from 1 to 20 and is used to build sequential user IDs (PWR101 to PWR120).
users = []
for i in range(1, 21):
    user = {
        "user_id": "PWR" + str(100 + i),
        "name": fake.name(),
        "age": random.randint(18, 60),
        "country": fake.country(),
        "group": random.choice(["A", "B"]),
        "clicks": random.randint(0, 20),
        "converted": random.choice([0, 1])
    }
    users.append(user)
#Create DataFrame
df = pd.DataFrame(users)
#Shift the default pandas index so it starts at 1 instead of 0
df.index = df.index + 1
print(df)
#Save the data to a CSV file (index=False leaves the index column out)
df.to_csv("ab_testing_data.csv", index=False)
print("Woohoo! Data saved successfully, 1/3 of your project is complete!")
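
To see why the seeds matter, here is a minimal sketch of the same idea using only the standard library. The names make_users and rng are hypothetical and not part of the script above, and the name and country fields are left out so the example runs without Faker installed. It assumes a dedicated random.Random(seed) generator, which plays a role similar to seeding Faker's own generator with fake.seed_instance().

```python
import random


def make_users(seed, n=20):
    """Build n fake user records from a dedicated, seeded generator (hypothetical helper)."""
    rng = random.Random(seed)  # local generator; the global random state is left untouched
    users = []
    for i in range(1, n + 1):
        users.append({
            "user_id": "PWR" + str(100 + i),   # sequential IDs: PWR101, PWR102, ...
            "age": rng.randint(18, 60),        # both ends inclusive
            "group": rng.choice(["A", "B"]),
            "clicks": rng.randint(0, 20),
            "converted": rng.choice([0, 1]),
        })
    return users


first = make_users(seed=42)
second = make_users(seed=42)

# The same seed replays the same sequence of random draws, so the data is identical.
print(first == second)
print(first[0]["user_id"], "to", first[-1]["user_id"])
```

Changing the seed (for example make_users(seed=7)) should give a different but equally repeatable dataset, which is handy when you want to rerun the A/B test on fresh fake users.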