Reading Excel files that are saved in Jupyter folder

I am trying to read an Excel file with R that I dragged into the JupyterLab folder (...Tabs.xlsx in this case). How do I use R or Python to read in that file?

In Python you can use pandas, which has a built-in function that makes this easy:

import pandas as pd
pd.read_excel("my_excel.xlsx", sheet_name="my_sheet_name")
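
If `read_excel` cannot find the file, the relative path is probably not being resolved against the folder you expect. The sketch below is one way to check this, assuming the notebook's working directory is the folder the file was dragged into. The helper names `list_excel_files` and `read_excel_from_folder` are made up for illustration and are not part of pandas.

```python
from pathlib import Path

import pandas as pd


def list_excel_files(folder="."):
    """Return the .xlsx files directly inside `folder`, sorted by name."""
    return sorted(Path(folder).glob("*.xlsx"))


def read_excel_from_folder(file_name, folder=".", **kwargs):
    """Read one workbook, failing early with a clear message if the path is wrong.

    Hypothetical helper: extra keyword arguments (for example sheet_name)
    are passed straight through to pandas.read_excel.
    """
    path = Path(folder) / file_name
    if not path.is_file():
        raise FileNotFoundError(
            f"{path.resolve()} not found; working directory is {Path.cwd()}"
        )
    # Reading .xlsx files requires an Excel engine such as openpyxl to be installed.
    return pd.read_excel(path, **kwargs)
```

In a notebook cell, `list_excel_files()` shows which workbooks are visible from the current working directory, and `read_excel_from_folder("my_excel.xlsx", sheet_name="my_sheet_name")` then reads one of them.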

In R, one option is the openxlsx package:

library(openxlsx)

# I wrote a function to read in all sheets of an Excel file,
# assuming each sheet holds one simple data frame.
# If your sheets need rows skipped or regions left out,
# modify this function or use plain `read.xlsx` from `openxlsx`.
# The function returns a list of data frames (one per sheet),
# with the list elements named after the sheet titles.
# Note: rowNames = TRUE treats the first column of each sheet as row names.

#############################
# read xlsx files to dfs list
#############################

xlsx2df.list <- function(xlsx.path, rowNames = TRUE, colNames = TRUE, ...) {
  wb <- loadWorkbook(xlsx.path)
  sheetNames <- names(wb)
  res <- lapply(sheetNames, function(sheetName) {
    read.xlsx(wb, sheet = sheetName, rowNames = rowNames, colNames = colNames, ...)
  })
  names(res) <- sheetNames
  res
}

dfs <- xlsx2df.list("path/to/my_excel.xlsx")

first.sheet.df <- dfs[[1]] # or dfs[["sheet1-title"]]
second.sheet.df <- dfs[[2]] # ...

I wrote this so that I don't have to look up the sheet names to decide which sheet to read in. It is one of the functions I use most often at work, since the biologists I do analyses for love Excel sheets.

This function saves you time by calling the `openxlsx` functions for you. (So you don't have to learn them, as long as your sheets are simple and regular enough ...).
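
For the Python side of the question, a rough counterpart of the R function above can be sketched with pandas: passing `sheet_name=None` to `read_excel` returns a dict that maps each sheet title to a DataFrame. The names `read_all_sheets` and `sheet_shapes` are hypothetical helpers chosen for this example, and the same "simple, regular sheets" assumption applies.

```python
import pandas as pd


def read_all_sheets(xlsx_path, **kwargs):
    """Return a dict mapping each sheet title to its DataFrame."""
    # sheet_name=None asks pandas to load every sheet in the workbook.
    # Reading .xlsx files requires an Excel engine such as openpyxl.
    return pd.read_excel(xlsx_path, sheet_name=None, **kwargs)


def sheet_shapes(dfs):
    """Summarise a dict of DataFrames as {sheet title: (rows, columns)}."""
    return {name: df.shape for name, df in dfs.items()}
```

With this, `dfs = read_all_sheets("path/to/my_excel.xlsx")` followed by `sheet_shapes(dfs)` gives a quick overview of what was loaded, and `dfs["sheet1-title"]` picks out a single sheet by its title.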

Note: `openxlsx` is much less error-prone than `xlsx`, since it avoids Java. I ran into Java's memory limits: `xlsx`-dependent functions threw memory errors when the Excel files were huge (GBs). So: use `openxlsx`, avoid `xlsx` (Java dependency)!
