
Question about taking sample and saving data from U2 files with headers

I'm relatively new to UniQuery. I've found some helpful documentation and answers from earlier posts here and on other sites. I'm trying to document what we have in our U2 files, as we are exploring options to migrate historical data into a data warehouse running SQL.

I've been able to list how the files are structured with LIST DICT <<FILENAME>>, and to save those results to a file that I can then view in Excel.

Once I had this basic information, I wanted to take some sample data from each of these files. If I use LIST <<FILENAME>> ALL TO DELIM "|" /TSTSAMPLE.TXT SAMPLE 300, I am able to get this sample. However, is there a way to create a tab-delimited file instead of using a pipe as the delimiter?

My other question: does anyone know of a way to get the headers that go with the data being saved?

I've seen some suggestions to use XML, LIST <<FILENAME>> ALL TOXML, which works, but empty elements don't appear to be written to the saved file.

I have also been using UDT.OPTIONS 91 ON to get dates into a readable format in the saved file.

Thanks to any U2 pros who can offer suggestions.

You are definitely on the right track. UDT.OPTIONS 91 ON is essential for dates and money fields. For the specific question of exporting as tab-delimited, I haven't seen it documented anywhere, but this works for me:

LIST <<FILENAME>> ALL TO DELIM 9 /TSTSAMPLE.TXT SAMPLE 300

The 9 represents CHAR(9), the tab character. I'm not sure whether other characters work as well; I always use 9 or "|". I don't use ALL because my dictionaries are a mess, but good for you if yours are well maintained.
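Before loading the export anywhere, it can help to check that every line splits into the same number of fields. The following is only a rough Python sketch, not part of U2: the function name field_count_histogram is made up for illustration, and it assumes the export is plain text with one record per line and no quoting.

```python
import csv
from collections import Counter


def field_count_histogram(lines, delim="\t"):
    """Count how many records have each number of delimited fields.

    `lines` is any iterable of text lines (an open file works).
    Assumption: one record per line and no quoting in the export.
    """
    counts = Counter()
    reader = csv.reader(lines, delimiter=delim, quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip completely blank lines
            counts[len(row)] += 1
    return counts


# Hypothetical usage against the file written by the LIST command:
# with open("TSTSAMPLE.TXT", newline="") as fh:
#     print(field_count_histogram(fh))
```

If the result contains more than one field count, some records probably contain the delimiter or an embedded line break, and those are worth inspecting by hand.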

For the headers, that's a tough thing to do in general. I've tried to solve that one too, and ended up creating a tab-delimited header line by hand for each file. You can start from the XML dump and do some tweaking in your favorite editor so you don't have to build the whole thing from scratch.
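As a starting point for that tweaking, here is a rough Python sketch that pulls field names out of an XML dump and joins them into a tab-delimited header line. It makes assumptions that need checking against your own TOXML output: that each record is a direct child of the root element, and that fields appear either as attributes or as child elements of the record. The function names are made up for illustration. Because empty values are left out of the XML, it takes the union of names across all records.

```python
import xml.etree.ElementTree as ET


def header_from_xml(xml_text):
    """Collect field names in first-seen order across all records.

    Assumption: records are direct children of the root, and fields are
    attributes or child elements of each record. Verify this against the
    actual XML dump before relying on it.
    """
    root = ET.fromstring(xml_text)
    names = []
    seen = set()
    for record in root:
        candidates = list(record.attrib) + [child.tag for child in record]
        for name in candidates:
            if name not in seen:
                seen.add(name)
                names.append(name)
    return names


def header_line(names, delim="\t"):
    """Join the field names into a single delimited header line."""
    return delim.join(names)
```

The order of names found this way is not guaranteed to match the column order of the delimited export, so the resulting line still needs to be compared with a few data rows and adjusted by hand.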

The other very challenging part is a) identifying multivalued (MV) fields and then b) deciding which are controlling and which are dependent. I have a program that does this by counting MV marks in a sample of the data and trying to line up the fields that have the same count in every record. If you want to do that, I can post it on GitHub or somewhere similar. It's complicated and, unless your data is perfectly clean, not 100% correct.
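To give an idea of the counting approach, here is a minimal Python sketch. It is not the program mentioned above, just an illustration of the idea. It assumes each sampled record is already a list of field strings in which the U2 value mark, CHAR(253), is still present; an export may convert that mark to another character, so it is a parameter. The function names are made up.

```python
from collections import defaultdict

VM = chr(253)  # U2 value mark; assumed to survive in the sampled data


def mv_profile(records, mark=VM):
    """Map each field position to a tuple of mark counts, one per record."""
    width = max((len(r) for r in records), default=0)
    profile = {}
    for pos in range(width):
        profile[pos] = tuple(
            r[pos].count(mark) if pos < len(r) else 0 for r in records
        )
    return profile


def candidate_associations(records, mark=VM):
    """Group field positions whose mark counts match in every record.

    Fields that never contain a mark are treated as single-valued and
    skipped. Positions are zero-based indexes into each record list.
    """
    groups = defaultdict(list)
    for pos, counts in mv_profile(records, mark).items():
        if any(counts):
            groups[counts].append(pos)
    return [positions for positions in groups.values() if len(positions) > 1]
```

Each returned group is only a candidate association. It does not say which field is controlling, and a single dirty record is enough to split fields that really belong together, so the output needs a manual review.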
