I ran into a problem.
I have a dataset like this:
dataset = [['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]
I initially wanted to run Apriori on it, but the problem is that the individual items in each inner list are not separate quoted strings: each inner list holds a single comma-separated string.
Desired output:
dataset = [['9874', '209384', '20938'], ['9874', '209384', '20938'], ['9874', '209384', '20938']]
How should I do it?
You can use str.split(). Note that splitting on a bare comma keeps the space that follows each comma:
x =[['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]
x = [i[0].split(",") for i in x]
print(x)
# [['9874', ' 209384', ' 20938'], ['9874', '209384', ' 20938'], ['9874', ' 209384', ' 20938']]
Try:
res = [ i[0].split(", ") for i in dataset]
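res is then (note that the second row is not fully split, because it has no space after its first comma):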
[['9874', '209384', '20938'],
['9874,209384', '20938'],
['9874', '209384', '20938']]
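The second row above stays partly joined because ", " only matches a comma followed by a space. One possible way around that, sketched here with the standard re module, is to split on a comma with optional whitespace on either side:

```python
import re

dataset = [['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]

# Split on a comma surrounded by any amount of whitespace (including none).
res = [re.split(r"\s*,\s*", row[0].strip()) for row in dataset]
print(res)
# [['9874', '209384', '20938'], ['9874', '209384', '20938'], ['9874', '209384', '20938']]
```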
As far as I can see, each inner list, e.g. ['9874, 209384, 20938'], contains a single string: the whole value '9874, 209384, 20938' is one string. So you can try this:
dataset = [['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]
# Create an empty list
emp_list = []
for i in range(len(dataset)):
    # Take the single string in each inner list and split it on whitespace
    emp_list.append(dataset[i][0].split())
What I did was take the string from each inner list, split it, and append the result to the empty list. Because split() with no arguments splits only on whitespace, the commas stay attached to the items. Your dataset now looks like this:
emp_list = [['9874,', '209384,', '20938'], ['9874,209384,', '20938'], ['9874,', '209384,', '20938']]
Hope this helps.
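The emp_list output above still has commas attached to the items. A small variation on the same loop, shown here only as a sketch, is to replace the commas with spaces before splitting on whitespace:

```python
dataset = [['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]

emp_list = []
for row in dataset:
    # Turn every comma into a space, then split on runs of whitespace.
    emp_list.append(row[0].replace(",", " ").split())

print(emp_list)
# [['9874', '209384', '20938'], ['9874', '209384', '20938'], ['9874', '209384', '20938']]
```

This only works when the items themselves never contain spaces, which holds for the numeric IDs in the question.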
Because your example has inconsistent spacing, this will be more tolerant:
>>> [[y.strip() for y in x[0].split(',')] for x in dataset]
[['9874', '209384', '20938'], ['9874', '209384', '20938'], ['9874', '209384', '20938']]
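If the same cleanup is needed in more than one place, the comprehension can be wrapped in a small helper. The name parse_transactions is made up for this sketch, and dropping empty items (for example after a trailing comma) is an extra assumption that the question does not ask for:

```python
def parse_transactions(rows):
    """Split each single-string row on commas, trim spaces, drop empty items."""
    return [
        [item.strip() for item in row[0].split(",") if item.strip()]
        for row in rows
    ]

dataset = [['9874, 209384, 20938'], ['9874,209384, 20938'], ['9874, 209384, 20938']]
transactions = parse_transactions(dataset)
print(transactions)
# [['9874', '209384', '20938'], ['9874', '209384', '20938'], ['9874', '209384', '20938']]
```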