
Data always gets truncated in SAS Proc Import (CSV)

I want to read in a bunch of CSV files. This one (movies_user.cleaned.csv), for example, contains 2 variables: uid (user id) and movie_name. Because SAS only reads the first 20 rows to guess the length of each string, my data gets truncated. ("Harry Potter" often becomes "Harry Pot" and so on.)

I know I can use guessingrows=32767 (32767 is the maximum that can be used) in my code to let SAS check the first 32767 rows, but I don't think this is safe enough to ensure no truncation. Some of my CSVs are way bigger than this.

Here is the code I use:

proc import datafile="H:\FBDATA_CLEANED\facebookdata2\movies_user.cleaned.csv"
 out=thesis.activities2
 dbms=csv
 replace;
 getnames=yes;

run;
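For reference, here is a sketch of the same step with the guessingrows statement mentioned above added. The value 32767 is the limit quoted above; whether a larger value is accepted depends on the SAS release, so treat it as an assumption about this installation.

```sas
/* Same import as above, but scanning up to 32767 rows before    */
/* PROC IMPORT decides on variable types and lengths.            */
/* Rows beyond that limit can still be truncated.                */
proc import datafile="H:\FBDATA_CLEANED\facebookdata2\movies_user.cleaned.csv"
 out=thesis.activities2
 dbms=csv
 replace;
 getnames=yes;
 guessingrows=32767;
run;
```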

Can you guys help me out a bit? Thanks!

Run PROC IMPORT manually. In the log, you will see the DATA STEP code it generated.

Copy that code.

Replace PROC IMPORT with that DATA STEP.

Edit the widths in the INFORMAT and FORMAT statements so the field is large enough.

Use the DATA STEP code going forward.
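As a rough sketch, the edited DATA STEP might look something like the following. This is not the exact code from the log, which will differ in details. The width $200. is an assumed value, and uid is assumed to be numeric; adjust both to match the actual data.

```sas
/* Hand-edited version of the kind of DATA STEP that PROC IMPORT  */
/* writes to the log. Widths and types here are assumptions.      */
data thesis.activities2;
    infile "H:\FBDATA_CLEANED\facebookdata2\movies_user.cleaned.csv"
        delimiter=',' dsd missover lrecl=32767 firstobs=2;

    /* uid is assumed to be numeric */
    informat uid best32.;
    format   uid best12.;

    /* $200. is an assumed width: pick one at least as long as    */
    /* the longest movie_name so that no value is cut off         */
    informat movie_name $200.;
    format   movie_name $200.;

    /* firstobs=2 above skips the header row */
    input uid movie_name $;
run;
```

Because the lengths are stated explicitly, they no longer depend on how many rows SAS scans, so the result is the same regardless of file size.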
