Exporting data files using loop in Stata

I have a large dataset of 20 cities and I'd like to split it into smaller ones for each city. Each variable in the dataset will be exported into a text file.

foreach i in Denver Blacksburg {
use "D:\Data\All\AggregatedCount.dta", clear

drop if MetroArea != `i'

export delimited lnbike using "D:\Data/`"`i'"'/DV/lnbike.txt", delimiter(tab) replace
export delimited lnped using "D:\Data/`"`i'"'/DV/lnped.txt", delimiter(tab) replace 
}

I tried `i' and `"`i'"' in the export commands, but neither of them worked. The error is

"Denver not found."

I also have cities with a space in the name, such as Los Angeles. I tried

local city `" "Blacksburg" "Los Angeles" "Denver" "'
foreach i of city {
use "D:\Data\All\AggregatedCount.dta", clear

drop if MetroArea != `i'

export delimited lnbike using "D:/Data/`"`i'"'/DV/lnbike.txt", delimiter(tab) replace
export delimited lnped using "D:/Data/`"`i'"'/DV/lnped.txt", delimiter(tab) replace 
}

This didn't work either. Do you have any suggestions?

If you want to continue with Stata, the only thing you would need to change in your first code snippet is

`"`i'"'

to

\\`i'

Note the \\ so that your code looks like:

export delimited lnbike using "D:\Data\\`i'/DV/lnbike.txt", delimiter(tab) replace

(I would personally change all of the forward slashes ( / ) to back slashes ( \\ ) in general anyway.) The extra backslash is there because a backslash before a left single quote in a string evaluates to just the left single quote. Having the second backslash tells Stata that you want the local macro i to be evaluated.

Your second code snippet could work if you also changed

foreach i of city {

to

foreach i of local city {

It might be helpful to read up on local macros: they can definitely be confusing, but are powerful if you know how to use them.
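As a small illustration of how a local macro holding quoted names behaves, the sketch below loops over the list from the question and only displays each element. It uses the `foreach ... of local` form and touches no files or data; the point is that compound double quotes keep "Los Angeles" together as one element.

```stata
* List of city names; the inner double quotes keep names with spaces intact
local city `" "Blacksburg" "Los Angeles" "Denver" "'

* "of local" takes the macro NAME, not its contents
local n = 0
foreach i of local city {
    local ++n
    display "`n': `i'"
}
```

The loop runs three times, once per quoted name, so "Los Angeles" is not split into two words.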

This answer overlaps with the helpful answer by @Eric HB.

Given 20 (or more) cities, you should not want to type those city names, which is tedious, error-prone, and not needed. Nor do you need to read in the dataset again and again, because you can just export the part you want. This should get you closer.

use "D:/Data/All/AggregatedCount.dta", clear

* result is integers 1 up, with names as value labels
egen which = group(MetroArea), label 
* how many cities: r(max), the maximum, is the number  
su which, meanonly 

forval i = 1/`r(max)' { 
     * look up city name for informative filename  
     local where : label (which) `i' 
     export delimited lnbike if which == `i' using "D:/Data/`where'/DV/lnbike.txt", delimiter(tab) replace
     export delimited lnped if which == `i' using "D:/Data/`where'/DV/lnped.txt", delimiter(tab) replace 
}

The principles concerned that have not yet been discussed:

-- When testing for literal strings, you need " " or compound double quotes to delimit such strings. Otherwise Stata thinks you mean a variable or scalar name. This was your first bug, as the given

drop if MetroArea != `i' 

is interpreted as

drop if MetroArea != Denver 

Stata can't find a variable Denver. As you found, you need

drop if MetroArea != "`i'" 

-- Windows uses the backslash as a separator in file and directory names, but Stata also uses the backslash as an escape character. If you use local macro names after such file separators, the result can be quite wrong. This is documented at [U] 18.3.11 in this manual chapter and also in this note. Forward slashes are never a problem, and Stata understands them as you intend, even with Windows.
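Both principles can be tried on a toy dataset without touching any files. The following is only a sketch: the two-row dataset is made up for illustration, with just the variable names MetroArea and lnbike borrowed from the question.

```stata
clear
* Hypothetical two-row dataset standing in for AggregatedCount.dta
input str12 MetroArea lnbike
"Denver"      1.2
"Los Angeles" 2.3
end

local i "Los Angeles"

* Quoted macro: compared as a literal string, spaces included
count if MetroArea == "`i'"

* Backslash directly before the left quote: the macro is not expanded here
display "D:\Data\`i'\DV\lnbike.txt"

* Forward slashes: the macro is expanded as intended
display "D:/Data/`i'/DV/lnbike.txt"
```

Without the double quotes around `i', the comparison would be read as a reference to variables or scalars rather than to a string.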

All that said, it is difficult to believe that you will be better off with lots of little files, but that depends on what you want to do with them.
