Convert multiple DBs to CSV
I have thousands of DB files that need to be converted to CSV files. For a single file this can be done with a simple script/batch file, i.e.
.open "Test.db"
.mode csv
.headers on
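I need the script to open all the other database files, each of which has a different name. Is there a way to do this? I do not want to write the script above for every database file.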
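I made a script, called "sqlite2csv", that batch-converts all SQLite database files in the current directory to CSV. It writes each table of each database to its own CSV file, so if you have 10 files with 3 tables each, you get 30 CSV files. Hopefully it at least helps you as a starting point for writing your own script.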
#!/bin/bash
# USAGE EXAMPLES :
# sqlite2csv
# - Will loop over all SQLite files in the current directory, take the tables
#   of each of these SQLite files, and generate a CSV file per table.
#   E.g. if there are 10 SQLite files with 3 tables each, it will generate
#   30 CSV output files, each containing the data of one table.
#   The name of each generated CSV file is taken from the original SQLite
#   file name, prepended with the name of the table.

# check for dependencies
if ! type "sqlite3" > /dev/null; then
    echo "[ERROR] SQLite binary not found."
    exit 1
fi

# define list of string tokens that an SQLite file type should contain
# the footprint for SQLite 3 is "SQLite 3.x database"
declare -a list_sqlite_tok
list_sqlite_tok+=( "SQLite" )
#list_sqlite_tok+=( "3.x" )
list_sqlite_tok+=( "database" )

# get a list of only the regular files in the current path
# (NUL-delimited so that file names containing spaces survive)
list_files=()
while IFS= read -r -d '' f; do
    list_files+=( "$f" )
done < <(find . -maxdepth 1 -type f -print0)

# loop over the list of files
for curr_fname in "${list_files[@]}"; do
    # get file type result
    curr_ftype=$(file -e apptype -e ascii -e encoding -e tokens -e cdf -e compress -e elf -e tar "$curr_fname")

    # loop through the required tokens; if one is not found, skip this file
    curr_isqlite=0
    for curr_tok in "${list_sqlite_tok[@]}"; do
        # check if 'curr_ftype' contains 'curr_tok'
        if [[ $curr_ftype =~ $curr_tok ]]; then
            curr_isqlite=1
        else
            curr_isqlite=0
            break
        fi
    done

    # if the current file is not SQLite, move on to the next file
    if (( ! curr_isqlite )); then
        continue
    fi

    # print sqlite filename
    echo "[INFO] Found SQLite file $curr_fname, exporting tables..."

    # get the tables of the sqlite file (may span several lines)
    curr_tables=$(sqlite3 "$curr_fname" ".tables")

    # split the listing on any whitespace (spaces and newlines) into an array
    read -r -d '' -a list_tables <<< "$curr_tables"

    # loop over the array to export each table
    for curr_table in "${list_tables[@]}"; do
        # strip unsafe characters
        curr_table=$(sed -e 's/[^A-Za-z0-9._-]//g' <<< "$curr_table")

        # build target CSV filename from the file name without its leading './'
        printf -v curr_csvfname "./%s_%s.csv" "$curr_table" "${curr_fname#./}"

        # export current table to target CSV file
        sqlite3 -header -csv "$curr_fname" "select * from \"$curr_table\";" > "$curr_csvfname"

        # log
        echo "[INFO] Exported table $curr_table in file $curr_csvfname"
    done
done
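For comparison, the two discovery steps of the script above (recognising SQLite files and listing their tables) can also be sketched in Python with only the standard library. This is a rough sketch and not part of the original answer: the function names `is_sqlite_file`, `find_sqlite_files` and `list_tables` are made up for illustration. It checks the 16-byte header string that every SQLite 3 file starts with instead of calling `file`, and it reads table names from `sqlite_master` instead of parsing the `.tables` output.

```python
import sqlite3
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def find_sqlite_files(directory="."):
    """Return the SQLite files directly inside `directory` (no recursion)."""
    return [
        p for p in sorted(Path(directory).iterdir())
        if p.is_file() and is_sqlite_file(p)
    ]


def list_tables(db_path):
    """Return the names of the user tables in the database, sorted."""
    con = sqlite3.connect(str(db_path))
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

Because the table names come back as separate rows, there is no whitespace splitting to get wrong, and names containing spaces are preserved.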
The sqlite3 command-line shell accepts some settings as command-line parameters, so you can simply run a plain SELECT * on the table in each database file:
for %%a in (*.db) do sqlite3 -csv -header "%%a" "select * from TableName" > "%%~na.csv"
(If this is not part of a batch file but is run directly from the command line, you must replace %% with %.)
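If a batch file is not an option, the same one-table-per-file export can be sketched in Python. This is only an illustration under a few assumptions: the helper names `export_table` and `convert_directory` are hypothetical, every database is assumed to contain the requested table, and each CSV is written next to its database with the same base name, as in the loop above.

```python
import csv
import sqlite3
from pathlib import Path


def export_table(db_path, table, csv_path):
    """Write all rows of one table to csv_path, with a header row."""
    con = sqlite3.connect(str(db_path))
    try:
        # Identifiers cannot be bound as parameters, so quote the name instead.
        quoted = '"' + table.replace('"', '""') + '"'
        cur = con.execute(f"SELECT * FROM {quoted}")
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            # cursor.description holds one entry per column; item 0 is the name.
            writer.writerow([col[0] for col in cur.description])
            writer.writerows(cur)
    finally:
        con.close()


def convert_directory(directory, table):
    """Export `table` from every *.db file in `directory`; return the CSV paths."""
    written = []
    for db_path in sorted(Path(directory).glob("*.db")):
        csv_path = db_path.with_suffix(".csv")
        export_table(db_path, table, csv_path)
        written.append(csv_path)
    return written
```

Calling `convert_directory(".", "TableName")` would then mirror the loop above; a database that lacks the table raises `sqlite3.OperationalError` rather than producing an empty file.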
I have prepared a short Python script that writes data from multiple SQLite databases into a single CSV file.
python multiple_sqlite_files_tocsv.py -d <inputFolder> -e <extension> -t <tableName>
This writes the data to the file output.csv.
The Jupyter notebook and the Python script are on GitHub.
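The script itself is not reproduced in this answer, so the following is only a guess at how such a tool might work, not the author's code. It assumes that every database holds the named table with the same columns. The `-d`, `-e` and `-t` options come from the command line shown above; the long option names, the function names and the extra `source_file` column are assumptions added for illustration.

```python
import argparse
import csv
import sqlite3
from pathlib import Path


def merge_to_csv(folder, extension, table, out_path="output.csv"):
    """Write `table` from every matching database into one CSV file.

    Returns the number of data rows written.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    pattern = "*." + extension.lstrip(".")
    count = 0
    header_written = False
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for db_path in sorted(Path(folder).glob(pattern)):
            con = sqlite3.connect(str(db_path))
            try:
                cur = con.execute(f"SELECT * FROM {quoted}")
                if not header_written:
                    # Assumption: an extra first column records the source file.
                    names = [col[0] for col in cur.description]
                    writer.writerow(["source_file"] + names)
                    header_written = True
                for row in cur:
                    writer.writerow([db_path.name, *row])
                    count += 1
            finally:
                con.close()
    return count


def main(argv=None):
    """Parse the -d/-e/-t options and write output.csv in the current directory."""
    parser = argparse.ArgumentParser(
        description="Merge one table from many SQLite files into output.csv"
    )
    parser.add_argument("-d", "--directory", required=True, help="input folder")
    parser.add_argument("-e", "--extension", default="db", help="file extension")
    parser.add_argument("-t", "--table", required=True, help="table name")
    args = parser.parse_args(argv)
    rows = merge_to_csv(args.directory, args.extension, args.table)
    print(f"wrote {rows} rows to output.csv")
```

Calling `main()` at the end of the file would make it usable from the command line in the same way as the command above. The header is taken from the first database only, so databases whose columns differ would produce misaligned rows.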