I have a row that looks like this:
Alain,David,43,"['Cinema:ABC', 'Cafe:Evasion', 'Hotel:Hotel Du Parc', 'Cafe:Casa del gelato']","['Notebook', 'Cigarette électronique', 'Livre:Roman']","['Matin:8h-10h', 'Apres-midi:12h-15h']","['Politique']"
I tried to remove the delimiters ([, ], ", ') to get something like the following, so that I can compute the similarity between rows later:
Alain,David,43,Cinema:ABC, Cafe:Evasion, Hotel:Hotel Du Parc, Cafe:Casa del gelato,Notebook, Cigarette électronique, Livre:Roman,Matin:8h-10h, Apres-midi:12h-15h,Politique
But it failed. Any ideas?
I assume you have a list, not a string:
row = ['Alain','David',43,"['Cinema:ABC', 'Cafe:Evasion', 'Hotel:Hotel Du Parc', 'Cafe:Casa del gelato']","['Notebook', 'Cigarette électronique', 'Livre:Roman']","['Matin:8h-10h', 'Apres-midi:12h-15h']","['Politique']"]
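Some columns contain a string that holds the text of a list, so you have to convert that string back into a list. You can use eval() to convert the string to a Python list. Only do this with data you trust, because eval() evaluates whatever expression the string contains.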
result = []

for item in row:
    if isinstance(item, str) and item.startswith('['):
        result += eval(item)
    else:
        result.append(item)

print(result)
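If the data does not come from a trusted source, the same loop can be written with ast.literal_eval() from the standard library, which only accepts Python literals (strings, numbers, lists and so on) instead of arbitrary expressions. This is a minimal sketch, not the original answer's code: the helper name flatten_row and the shortened sample row are assumptions made for illustration.

```python
import ast


def flatten_row(row):
    """Flatten a row whose list columns were stored as strings.

    flatten_row is a hypothetical helper name, not part of the original answer.
    """
    result = []
    for item in row:
        if isinstance(item, str) and item.startswith('['):
            # literal_eval only parses literals, so it cannot run arbitrary code
            result.extend(ast.literal_eval(item))
        else:
            result.append(item)
    return result


# Shortened sample row, based on the row from the question
row = ['Alain', 'David', 43, "['Cinema:ABC', 'Cafe:Evasion']", "['Politique']"]
print(flatten_row(row))
```

Non-string values such as the age 43 are passed through unchanged, exactly as in the eval() version.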
EDIT:
You generate each row with:
file.writerow([
    random.choice(Prenoms),
    random.choice(Noms),
    random.randint(17,65),
    random.sample(Lfreq,4)
])
But random.sample(Lfreq,4) returns a list, so the whole list ends up in a single column as its string representation. You have to write its items as separate columns:
data = random.sample(Lfreq,4)

file.writerow([
    random.choice(Prenoms),
    random.choice(Noms),
    random.randint(17,65),
    data[0],
    data[1],
    data[2],
    data[3]
])
Or extend the list using extend() or +=:
data = [random.choice(Prenoms), random.choice(Noms), random.randint(17,65)]
#data.extend(random.sample(Lfreq,4))
data += random.sample(Lfreq,4)
file.writerow(data)
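To check that this really produces one value per column, here is a self-contained sketch of the += variant. The contents of Prenoms, Noms and Lfreq are made-up sample values (the original lists are not shown in the post), and an in-memory io.StringIO buffer stands in for the real output file.

```python
import csv
import io
import random

# Hypothetical sample data standing in for the original Prenoms, Noms and Lfreq
Prenoms = ['Alain', 'Marie']
Noms = ['David', 'Martin']
Lfreq = ['Cinema:ABC', 'Cafe:Evasion', 'Hotel:Hotel Du Parc',
         'Cafe:Casa del gelato', 'Cafe:Milano']

buffer = io.StringIO()       # in-memory file, so nothing is written to disk
writer = csv.writer(buffer)

data = [random.choice(Prenoms), random.choice(Noms), random.randint(17, 65)]
data += random.sample(Lfreq, 4)   # each sampled value becomes its own column
writer.writerow(data)

line = buffer.getvalue().strip()
print(line)
```

The printed line has seven comma-separated fields and no brackets or quotes, which is the flat shape the question asks for.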
Here is a function that solves this:
# -*- coding: utf-8 -*-
import re


def plain_array_from_array_with_subarrays_as_strings(array):
    response = []
    for el in array:
        if not isinstance(el, (int, float)):
            sub_els = re.findall(r"'([^']+)'", el)
            if len(sub_els) > 0:
                for sub_el in sub_els:
                    response.append(sub_el)
            else:
                response.append(el)
        else:
            response.append(el)
    return response


r = [
    "Alain",
    "David",
    43,
    "['Cinema:ABC', 'Cafe:Evasion', 'Hotel:Hotel Du Parc', 'Cafe:Casa del gelato']",
    "['Notebook', 'Cigarette électronique', 'Livre:Roman']",
    "['Matin:8h-10h', 'Apres-midi:12h-15h']",
    "['Politique']"
]

print(plain_array_from_array_with_subarrays_as_strings(r))
Output:
['Alain',
'David',
43,
'Cinema:ABC',
'Cafe:Evasion',
'Hotel:Hotel Du Parc',
'Cafe:Casa del gelato',
'Notebook',
'Cigarette électronique',
'Livre:Roman',
'Matin:8h-10h',
'Apres-midi:12h-15h',
'Politique']