
Moving data from MS Word to MS Excel

I have transcripts in MS Word that I want to read into a statistics program called R. The problem is that these documents contain special characters (not plain text). My process for dealing with them has been: substitute them out in MS Word, save as a .txt document, read into MS Excel (using the import wizard to make one column for people and one for dialogue), convert to .csv, and read into R. This process works but is time-consuming. I found out how to read text with special characters straight into R (R generally wants plain text), but this requires the text to be in an Excel document. This is desirable because, if I can read the special characters into R, it's rather simple to substitute them all out at once. The problem arises because I can't get the MS Word document into Excel directly. I have to save it as a text file first (which I don't mind doing) and then read it in, and this turns the special characters into boxes and question marks. I need to get the MS Word doc into Excel as a data frame with 2 columns (person, dialogue) without destroying the special characters (“, ”, —, ', ', …, etc.).

I can do this by substituting them out in Word with Replace, but again, if I could get the document into Excel, doing this in R would be much easier.
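As a rough illustration of substituting all the special characters in one pass, here is a minimal Python sketch (not R, and not necessarily how the asker would do it). The mapping and the function name `to_plain_text` are assumptions chosen for illustration; the list of characters would need to be extended to match the actual transcripts.

```python
# Map typographic characters to plain-ASCII stand-ins.
# This mapping is illustrative only; extend it as needed.
REPLACEMENTS = {
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts single-character keys and string values of any length.
_TABLE = str.maketrans(REPLACEMENTS)


def to_plain_text(text: str) -> str:
    """Return text with every mapped character replaced in a single pass."""
    return text.translate(_TABLE)
```

Calling `to_plain_text` on each dialogue string replaces every mapped character at once, which is the step that is tedious to do by hand with Replace in Word.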

Here is a sample MS Word doc showing what my data looks like (tab-separated columns):

https://dl.dropbox.com/u/61803503/TEST.doc
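For a tab-separated transcript like the sample, the two-column layout could also be built without passing through Excel. The following is a hedged Python sketch, assuming the Word document has first been saved as plain text in UTF-8 so that the special characters survive. The helpers `parse_transcript` and `write_csv` and the file names in the usage comment are hypothetical.

```python
import csv


def parse_transcript(lines):
    """Split tab-separated lines into [person, dialogue] rows.

    Blank lines are skipped. A line without a tab is treated as a
    continuation of the previous speaker's dialogue (an assumption).
    """
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue
        person, sep, dialogue = line.partition("\t")
        if sep:
            rows.append([person.strip(), dialogue.strip()])
        elif rows:
            rows[-1][1] += " " + line.strip()
        else:
            rows.append(["", line.strip()])
    return rows


def write_csv(rows, path):
    """Write the rows to a UTF-8 CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["person", "dialogue"])
        writer.writerows(rows)


# Example usage (file names are placeholders):
# with open("TEST.txt", encoding="utf-8") as handle:
#     write_csv(parse_transcript(handle), "TEST.csv")
```

Opening both files with an explicit `encoding="utf-8"` is the important part: the boxes and question marks described above are what typically appears when text is read with a different encoding than the one it was saved in.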

Excel and Word are both version 2010, on a Win 7 machine.

One way: use Edit->Copy in Word and Edit->Paste in Excel. A simple tabular structure should be preserved if you do that, along with any Unicode characters. I'm not so sure about non-Unicode content such as Wingdings. I haven't tried doing it with VBA, either.
