
Fastest way to convert a string with many integers into a list of distinct integers

I have a string that contains many integers separated by commas. I am trying to convert this string (something like csv_data = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5")

into a list of distinct integer values: csv_values = [1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5]

The first idea I tried was a for loop, but I know that is not the fastest way to do the conversion:

l = []
for token in csv_data.split(','):
    # convert each comma-separated token, not each character
    l.append(int(token))

Any ideas?

Using the standard library and pandas:

from ast import literal_eval
import pandas as pd
import numpy as np

string = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"

list(literal_eval(string))

[1, 23, 543, 12, 423, 534, 76, 32, 765, 23, 12, 1, 43, 213, 6, 5]

pd.eval(string)

array([1, 23, 543, 12, 423, 534, 76, 32, 765, 23, 12, 1, 43, 213, 6, 5],
      dtype=object)

You can then use np.unique, or just set, to get the distinct integers.

np.unique(pd.eval(string))
array([1, 5, 6, 12, 23, 32, 43, 76, 213, 423, 534, 543, 765], dtype=object)

or

list(set(literal_eval(string)))

[32, 1, 5, 6, 423, 43, 12, 76, 213, 534, 23, 765, 543]

Note that np.unique sorts the values, whereas set does not guarantee any particular order.
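If the order of first appearance matters, neither option above keeps it. A minimal sketch using only the standard library, assuming the same sample string (the names values, distinct_in_order and distinct_sorted are illustrative, not from the answer):

```python
csv_data = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"

# Parse each comma-separated token into an int
values = [int(token) for token in csv_data.split(",")]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order
distinct_in_order = list(dict.fromkeys(values))

# sorted(set(...)) gives the ascending order that np.unique also produces
distinct_sorted = sorted(set(values))

print(distinct_in_order)
print(distinct_sorted)
```

dict.fromkeys is a reasonable choice when the output has to line up with the input; sorted(set(...)) is enough when any deterministic order will do.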

Some naive timings:

string2 = string * 1000

%%timeit

list(set(literal_eval(string2)))

34 ms ± 1.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit

np.unique(pd.eval(string2))

494 ms ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
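The %%timeit cells only work inside IPython or Jupyter. A rough equivalent with the standard timeit module might look like the sketch below. It also times a plain split-and-int variant, which was not part of the measurements above; the helper names via_literal_eval and via_split are made up for illustration. No numbers are quoted here because they depend on the machine and Python version.

```python
import timeit
from ast import literal_eval

string = "1,23,543,12,423,534,76,32,765,23,12,1,43,213,6,5"
# Same construction as above. Adjacent copies merge at the seams
# ("...6,5" + "1,23..." becomes "...6,51,23..."), which is harmless for timing.
string2 = string * 1000


def via_literal_eval(s):
    # Parse the whole string as a Python tuple literal, then deduplicate
    return list(set(literal_eval(s)))


def via_split(s):
    # Split on commas and convert each token with int, then deduplicate
    return list(set(map(int, s.split(","))))


for func in (via_literal_eval, via_split):
    # Total wall time for 5 calls, divided to get a per-call figure
    seconds = timeit.timeit(lambda: func(string2), number=5)
    print(f"{func.__name__}: {seconds / 5 * 1000:.2f} ms per call")
```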

Use this to first convert it to a list:

def Convert(string):
    # split on the commas; the elements are still strings, so apply int() afterwards
    li = string.split(",")
    return li

  • If the data is already in a string, you can use [int(x) for x in csv_data.split(',')]

  • If the data actually comes from a file, use one of the existing functions for reading a CSV file, either the built-in csv module or the pandas read_csv function.
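As a rough illustration of the file case with the built-in csv module: the sketch below uses io.StringIO as a stand-in for an open file, and its contents are made-up sample data. With a real file you would pass the object returned by open(path, newline="") to csv.reader instead.

```python
import csv
import io

# Stand-in for an open file (hypothetical sample contents)
fake_file = io.StringIO("1,23,543,12\n423,534,76,32\n765,23,12,1\n")

distinct = set()
for row in csv.reader(fake_file):
    # each row is a list of strings; convert the cells and collect the unique values
    distinct.update(int(cell) for cell in row)

print(sorted(distinct))
```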
