Every value in a row treated as a new column: convert to a loadable data file

I have a data file in which every value that belongs in a row is stored as a separate column. I want to convert this file, or find some logic, so that it can be loaded into a database. The first table below is a sample of how the data appears in the file; the second is the layout I want to end up with.

The file is huge: it has more than 7,000 columns. I have tried loading/importing it into a table, but it exceeds the maximum column limit in every tool I have tried.

+--------+-----------+----------+----------+----------+------------+------------+------------+------------+
| emplid | status_0  | status_1 | status_2 | status_3 | location_0 | location_1 | location_2 | location_3 |
+--------+-----------+----------+----------+----------+------------+------------+------------+------------+
| 1234   | Submitted | Reviewed | Approved | Accepted |            | California | Michigan   |            |
+--------+-----------+----------+----------+----------+------------+------------+------------+------------+
| 4568   | Submitted | Reviewed | Denied   |          | Texas      | Utah       | Illinois   | NewYork    |
+--------+-----------+----------+----------+----------+------------+------------+------------+------------+

+--------+-----------+------------+
| emplid | status    | location   |
+--------+-----------+------------+
| 1234   | Submitted |            |
+--------+-----------+------------+
| 1234   | Reviewed  | California |
+--------+-----------+------------+
| 1234   | Approved  | Michigan   |
+--------+-----------+------------+
| 1234   | Accepted  |            |
+--------+-----------+------------+
| 4568   | Submitted | Texas      |
+--------+-----------+------------+
| 4568   | Reviewed  | Utah       |
+--------+-----------+------------+
| 4568   | Denied    | Illinois   |
+--------+-----------+------------+
| 4568   |           | NewYork    |
+--------+-----------+------------+

What tool can you load your data file into? If you can load it into any SQL-compliant database, you could use SQL queries such as:

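-- original_data is a placeholder name for the wide table (not named in the question)
INSERT INTO master_status_table
SELECT emplid, status_0, location_0 FROM original_data
GO

INSERT INTO master_status_table
SELECT emplid, status_1, location_1 FROM original_data
GO

INSERT INTO master_status_table
SELECT emplid, status_2, location_2 FROM original_data
GO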
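With thousands of columns, writing one such statement per status/location pair by hand is impractical, so the statements could be generated instead. The following is a minimal sketch, not a definitive implementation. It assumes the wide table is called original_data (a hypothetical name), that the target columns are emplid, status and location as in the desired output, and that the number of column pairs is known. It runs the statements against an in-memory SQLite database only to check them; GO is left out because it is a batch separator for client tools, not part of the SQL itself.

```python
import sqlite3


def build_inserts(n_groups, source="original_data", target="master_status_table"):
    """Return one INSERT ... SELECT statement per status_N/location_N pair.

    The table and column names are assumptions for illustration.
    """
    return [
        f"INSERT INTO {target} (emplid, status, location) "
        f"SELECT emplid, status_{i}, location_{i} FROM {source}"
        for i in range(n_groups)
    ]


# Small demo with 4 pairs, mirroring the first sample row from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE original_data (emplid, status_0, status_1, status_2, status_3, "
    "location_0, location_1, location_2, location_3)"
)
conn.execute("CREATE TABLE master_status_table (emplid, status, location)")
conn.execute(
    "INSERT INTO original_data VALUES "
    "(1234, 'Submitted', 'Reviewed', 'Approved', 'Accepted', "
    "'', 'California', 'Michigan', '')"
)

for stmt in build_inserts(4):
    conn.execute(stmt)

rows = conn.execute("SELECT * FROM master_status_table ORDER BY rowid").fetchall()
for row in rows:
    print(row)
```

This only helps if the wide table can be created in the database at all, which is the limitation the rest of this answer works around.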

But it sounds like you can't get it into a database in the first place. So you could try loading it into an Excel spreadsheet instead. Say your original data is in a sheet called 'original data'. Create another sheet called, say, 'status 0', and in it use formulas that display, row for row, column 'A' of 'original data' (the employee ID) together with columns 'B' and 'F' (status and location). Sheet 'status 1' would do the same with columns 'C' and 'G', and so on. Once you have all of your 'status' sheets, you can export each one as a .CSV file and import it directly into master_status_table.

I know this will still be a very manual process, but it should be possible.
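If the file is (or can be saved as) CSV, the same column-pairing idea could be scripted rather than done sheet by sheet. The sketch below is one possible approach under stated assumptions: the header names follow the stub_N pattern shown in the question (status_0, location_0, ...), the ID column is named emplid, and every data row has as many fields as the header. The function name unpivot is hypothetical. It reads one row at a time, so the number of columns is not a constraint.

```python
import csv
import io
import re
from collections import defaultdict


def unpivot(src, dst, id_col="emplid"):
    """Stream a wide CSV from src and write long-format rows to dst."""
    reader = csv.reader(src)
    header = next(reader)
    id_idx = header.index(id_col)

    # Map each numeric suffix to {stub: column index},
    # e.g. 1 -> {"status": 2, "location": 6} for the sample header.
    groups = defaultdict(dict)
    stubs = []
    for idx, name in enumerate(header):
        m = re.fullmatch(r"(.+)_(\d+)", name)
        if m:
            stub, n = m.group(1), int(m.group(2))
            groups[n][stub] = idx
            if stub not in stubs:
                stubs.append(stub)

    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([id_col] + stubs)
    for row in reader:
        for n in sorted(groups):
            values = [row[groups[n][s]] if s in groups[n] else "" for s in stubs]
            if any(v.strip() for v in values):  # skip pairs that are entirely blank
                writer.writerow([row[id_idx]] + values)


# Demo on the two sample rows from the question.
sample = (
    "emplid,status_0,status_1,status_2,status_3,"
    "location_0,location_1,location_2,location_3\n"
    "1234,Submitted,Reviewed,Approved,Accepted,,California,Michigan,\n"
    "4568,Submitted,Reviewed,Denied,,Texas,Utah,Illinois,NewYork\n"
)
out = io.StringIO()
unpivot(io.StringIO(sample), out)
long_rows = out.getvalue().splitlines()
print("\n".join(long_rows))
```

For a real file, the two in-memory buffers would be replaced by file objects opened with newline="", and the resulting three-column CSV could then be imported into master_status_table with the database's normal import tool.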

If the data is in CSV format, you could try normalizing it with un-xtab.py (https://pypi.org/project/un-xtab/). un-xtab imports the data into SQLite, which should accommodate more than 7,000 columns. Documentation is in the doc subdirectory of the Bitbucket repository at https://bitbucket.org/rdnielsen/un-xtab/src/default/.
