I have a string that contains many integers separated by commas. I am trying to convert this string, which looks something like this:
csv_data = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"
into a list of distinct integer values:
csv_values = [1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5]
The first idea that I tried was a for loop, but I know that is not the fastest way to make the conversion:
l = []
for ch in csv_data:
    if ch != ',':
        l.append(int(ch))
Any ideas?
Using the standard library and pandas:
from ast import literal_eval
import pandas as pd
import numpy as np
string = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"
list(literal_eval(string))
[1, 23, 543, 12, 423, 534, 76, 32, 765, 23, 12, 1, 43, 213, 6, 5]
pd.eval(string)
array([1, 23, 543, 12, 423, 534, 76, 32, 765, 23, 12, 1, 43, 213, 6, 5],
      dtype=object)
You can then use np.unique or just set to get the distinct integers.
np.unique(pd.eval(string))
array([1, 5, 6, 12, 23, 32, 43, 76, 213, 423, 534, 543, 765], dtype=object)
or
list(set(literal_eval(string)))
[32, 1, 5, 6, 423, 43, 12, 76, 213, 534, 23, 765, 543]
Note that np.unique will sort your values.
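If the original order of the values matters, neither np.unique (sorted output) nor set (arbitrary order) keeps it. A minimal sketch of one alternative, assuming Python 3.7+ where dicts preserve insertion order; the variable names here are illustrative only:

```python
from ast import literal_eval

string = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"

# literal_eval parses the comma-separated text as a tuple of ints
values = literal_eval(string)

# dict.fromkeys keeps only the first occurrence of each value,
# in the order the values were first seen
distinct_in_order = list(dict.fromkeys(values))
print(distinct_in_order)
# [1, 23, 543, 12, 423, 534, 76, 32, 765, 43, 213, 6, 5]
```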
Some naive timings:
string2 = string * 1000
%%timeit
list(set(literal_eval(string2)))
34 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%%timeit
np.unique(pd.eval(string2))
494 ms ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
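The %%timeit cells above only run inside IPython or Jupyter. As a rough sketch of how a similar comparison could be run as a plain script, the standard timeit module can be used. The helper names via_literal_eval and via_split are made up for this example, and no particular timing result is implied; numbers will vary by machine.

```python
import timeit
from ast import literal_eval

string = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"
# Same naive scaling as above: repeating the string joins the last and
# first numbers of adjacent copies ("5" + "1" -> "51"), which is still valid input
string2 = string * 1000


def via_literal_eval():
    # Parse the whole string as a Python tuple literal, then deduplicate
    return set(literal_eval(string2))


def via_split():
    # Split on commas and convert each piece with int(), then deduplicate
    return {int(x) for x in string2.split(",")}


runs = 5
for fn in (via_literal_eval, via_split):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1000:.2f} ms per call")
```

Both helpers should return the same set, so the comparison measures parsing cost only.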
Use this to first convert it to a list:
def Convert(string):
    # split on the commas that separate the values (returns a list of strings)
    li = string.split(",")
    return li
If you have it in a string, you can use [int(x) for x in csv_data.split(',')]
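Real input is often messier than the sample, for example with stray spaces or a trailing comma. The following is a small sketch of the same split-and-int idea that skips empty fields; the function name parse_int_list is hypothetical, and silently dropping empty fields is an assumption that may not suit every use case.

```python
def parse_int_list(text):
    """Convert a comma-separated string of integers into a list of ints."""
    # int() already tolerates surrounding whitespace, so only empty
    # fields (e.g. from a trailing comma) need to be filtered out
    return [int(piece) for piece in text.split(",") if piece.strip()]


csv_data = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"
csv_values = parse_int_list(csv_data)
print(csv_values)
# [1, 23, 543, 12, 423, 534, 76, 32, 765, 23, 12, 1, 43, 213, 6, 5]

print(parse_int_list("1, 23,,543,"))
# [1, 23, 543]
```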
If the data actually comes from a file, use one of the already-existing functions to read a CSV file, either the built-in csv module or the pandas read_csv function.
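As an illustration of the built-in csv module route, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file so the example is self-contained; the sample contents are invented, and with an actual file you would pass the object returned by open(path, newline="") instead.

```python
import csv
import io

# Stand-in for an open file; the contents are made up for this example
fake_file = io.StringIO("1,23,543\n12,423,534\n")

# csv.reader yields one list of strings per line; flatten and convert to int
values = [int(cell) for row in csv.reader(fake_file) for cell in row]
print(values)
# [1, 23, 543, 12, 423, 534]
```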