How can I create a dataset for pandas from SQL in a more performant way, without using setdefault and append?

I create the pandas DataFrame like this. I don't load it directly from the database; instead, I first loop over the rows and build a dictionary with setdefault and append. The reason is that my SQL uses an INNER JOIN, so I have to collect row[5] separately with append.

Next, I use the dataset in pandas.

Is there a more performant way to create the dataset without using setdefault and append? Or is the code I'm using already performant?

import sqlite3
import pandas as pd

newlist = {}

conn = sqlite3.connect('....')
cursor = conn.cursor()

x = cursor.execute('''SQL CODES WITH INNER JOIN''')

# Use the first five columns as the key; collect the sixth column's values in a list
for row in x.fetchall():
    newlist.setdefault((row[0], row[1], row[2], row[3], row[4]), []).append(row[5])

# Transform dataset to DataFrame
df = pd.DataFrame.from_dict(newlist, orient='index')
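To make the shape of this result concrete, here is a minimal sketch of the same pattern run against an in-memory SQLite database. The table name `result`, its column names, and the sample rows are hypothetical stand-ins for the real joined query, which is not shown in the question.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory data standing in for the real INNER JOIN result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (a, b, c, d, e, f)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "x", "p", 10, 0.5, "first"),
        (1, "x", "p", 10, 0.5, "second"),
        (2, "y", "q", 20, 1.5, "third"),
    ],
)

grouped = {}
for row in conn.execute("SELECT a, b, c, d, e, f FROM result ORDER BY rowid"):
    # The first five columns form the key; the sixth is collected in a list.
    grouped.setdefault(row[:5], []).append(row[5])

# One row per distinct key, one column per collected value.
df_grouped = pd.DataFrame.from_dict(grouped, orient="index")
conn.close()

print(df_grouped)
```

Keys with fewer collected values than the longest list are padded with missing values, so the column count equals the size of the largest group.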

I don't know your SQL query (and I have no data to test with), but the easiest way may be to use pd.read_sql.

Something like:

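import sqlite3
import pandas as pd

conn = sqlite3.connect('....')
qs = '''SQL CODES WITH INNER JOIN'''

# Load the query result straight into a DataFrame, one row per result row
df = pd.read_sql(qs, conn)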
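Note that pd.read_sql returns one row per query row, so the sixth column is not grouped the way the setdefault loop groups it. If the grouped shape is needed, one possible follow-up is a groupby that collects the values into lists. The sketch below assumes a hypothetical table `result` with columns `a` to `f` in place of the real query.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory data standing in for the real INNER JOIN result.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (a, b, c, d, e, f)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "x", "p", 10, 0.5, "first"),
        (1, "x", "p", 10, 0.5, "second"),
        (2, "y", "q", 20, 1.5, "third"),
    ],
)

# Flat result: one DataFrame row per row returned by the query.
flat = pd.read_sql("SELECT a, b, c, d, e, f FROM result ORDER BY rowid", conn)
conn.close()

# Collect column f into a list for each distinct combination of the key columns.
key_cols = ["a", "b", "c", "d", "e"]
collected = flat.groupby(key_cols, sort=False)["f"].agg(list)

print(collected)
```

By default groupby drops rows whose key columns contain missing values, which the dictionary approach would keep, so check for NULLs in the key columns before relying on this.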

One way to improve performance is to keep the query result as a list of tuples and pass that list to pandas.DataFrame(). This avoids the overhead of building a dictionary with setdefault() and append().

Here's how you could modify your code to do this:

import sqlite3
import pandas as pd

conn = sqlite3.connect('....')
cursor = conn.cursor()

cursor.execute('''SQL CODES WITH INNER JOIN''')

# fetchall() already returns a list of tuples, one per result row,
# so no explicit loop with append() is needed
data = cursor.fetchall()

# Create DataFrame from list of tuples
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4', 'col5', 'col6'])

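This should be faster than the original code because it skips the dictionary entirely. Keep in mind that the resulting DataFrame has one row per query row: the row[5] values are no longer grouped under their five-column key, so group them in pandas afterwards if you need the original shape.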
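Whether any of these variants is actually faster depends on the data, so it is worth measuring. The following is a rough timing sketch, not a benchmark: the table `result`, the row count, and the number of distinct keys are all made-up values, and the three function names are hypothetical labels for the approaches discussed on this page.

```python
import sqlite3
import timeit

import pandas as pd

N_ROWS = 2_000  # hypothetical number of joined rows
N_KEYS = 100    # hypothetical number of distinct five-column keys
QUERY = "SELECT a, b, c, d, e, f FROM result ORDER BY rowid"


def make_connection():
    """Build an in-memory table standing in for the real joined query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result (a, b, c, d, e, f)")
    conn.executemany(
        "INSERT INTO result VALUES (?, ?, ?, ?, ?, ?)",
        ((i % N_KEYS, "x", "p", 10, 0.5, f"v{i}") for i in range(N_ROWS)),
    )
    return conn


def build_with_setdefault(conn):
    # Approach from the question: group in a dict, then build the DataFrame.
    grouped = {}
    for row in conn.execute(QUERY):
        grouped.setdefault(row[:5], []).append(row[5])
    return pd.DataFrame.from_dict(grouped, orient="index")


def build_with_list_of_tuples(conn):
    # Flat approach: pass the fetched rows directly to the constructor.
    rows = conn.execute(QUERY).fetchall()
    return pd.DataFrame(rows, columns=["a", "b", "c", "d", "e", "f"])


def build_with_read_sql(conn):
    # Flat approach: let pandas run the query.
    return pd.read_sql(QUERY, conn)


conn = make_connection()
for build in (build_with_setdefault, build_with_list_of_tuples, build_with_read_sql):
    seconds = timeit.timeit(lambda: build(conn), number=5)
    print(f"{build.__name__}: {seconds:.4f}s for 5 runs")
```

The first function returns a grouped frame while the other two return flat frames, so the timings compare different end results. If the grouped shape is required, include the grouping step in the timed code for a fair comparison.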
