
Best approach to working with a 4 GB .csv file

I work in data science. I have a .csv file with 5 million records, about 3.9 GB in size. What is the best practice for dealing with it? I normally use VS Code or Jupyter, and even when I set the maximum memory to 10 GB, operations like loading take too long to complete.

What do you recommend to improve my work?

Notebook: Lenovo S145, 20 GB RAM, i7-8565U, Ubuntu

Thanks

If you want to bring a CSV into a database for reporting, one fairly quick and easy option is an external table. Its CREATE TABLE definition uses access parameters similar to SQL*Loader (SQLLDR) control-file syntax. Once the table is defined, the latest saved CSV data is immediately available as a table in the database.
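As a sketch of what such a definition might look like in Oracle (the table name, columns, file name, and directory object below are assumptions for illustration, not from the thread):

```sql
-- Hypothetical example: expose a CSV as an Oracle external table.
-- Assumes a DIRECTORY object (data_dir) pointing at the folder holding the CSV.
CREATE TABLE sales_ext (
  id      NUMBER,
  amount  NUMBER,
  sold_at DATE
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    SKIP 1                           -- skip the header row
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    (id, amount, sold_at DATE 'YYYY-MM-DD')
  )
  LOCATION ('data.csv')
)
REJECT LIMIT UNLIMITED;
```

After this, `SELECT ... FROM sales_ext` reads the current contents of the CSV directly; replacing the file updates what queries see, with no reload step.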

