How to count lines of code in a Jupyter notebook

I'm currently using a Jupyter (IPython) notebook, and the file I am working with has a lot of code. I am just curious as to exactly how many lines of code there are in my file. It is hard to count since I have separated my code into many different blocks.

For anyone who is experienced with Jupyter notebooks, how do you count the total number of lines of code in the file?

Thanks!

Edit: I've figured out how to do this, although in a pretty obscure way. Here's how: download the Jupyter notebook as a .py file, then open the .py file in software like Xcode, or whatever IDE you use, and count the lines of code there.
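If you would rather not open an IDE just to read a line count, the exported .py file can be counted with a few lines of Python. This is only a sketch: the function name count_script_lines is made up for illustration, and it assumes the export contains comment-only cell markers (such as "# In[1]:") that you do not want counted. It treats any line starting with "#" as a comment, so a multi-line string containing such lines would be undercounted.

```python
import sys


def count_script_lines(path):
    """Return (total, code) line counts for an exported .py file.

    'code' skips blank lines and comment-only lines, which also drops
    comment-style cell markers if the export added any (an assumption).
    """
    total = 0
    code = 0
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            total += 1
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                code += 1
    return total, code


if __name__ == "__main__":
    # Each command-line argument is treated as the path to an exported script.
    for name in sys.argv[1:]:
        total, code = count_script_lines(name)
        print(f"{name}: {total} lines total, {code} lines of code")
```

Saved under a name of your choosing, it would be run with the exported script's path as its argument.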

This will give you the total number of LOC in one or more notebooks that you pass to the script via the command line:

#!/usr/bin/env python

from json import load
from sys import argv

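def loc(nb):
    # A code cell normally stores its source as a list of lines,
    # so len() gives that cell's line count.
    with open(nb) as f:
        cells = load(f)['cells']
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    # Total over every notebook named on the command line.
    return sum(loc(nb) for nb in ipynb_files)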

if __name__ == '__main__':
    print(run(argv[1:]))

So you could do something like $ ./loc.py nb1.ipynb nb2.ipynb to get results.

The same can be done from the shell if you have the jq utility installed:

jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l

Also, you can use grep to filter lines further. jq prints each source line as a JSON string, so a blank line shows up as "\n"; to leave blank lines out of the count, replace the final | wc -l with: | grep -v '^"\\n"$' | wc -l
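The same blank-line filtering can also be sketched in Python, without jq. The helper names code_lines and count_nonblank are hypothetical. The sketch assumes a version 4 notebook with a top-level "cells" list, and it accepts "source" either as a list of strings or as one string, since the notebook format permits both.

```python
import json
import sys


def code_lines(nb_path):
    """Yield every source line from the code cells of one notebook."""
    with open(nb_path, encoding="utf-8") as fh:
        notebook = json.load(fh)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # 'source' may be a list of line strings or a single string.
        if isinstance(source, list):
            source = "".join(source)
        yield from source.splitlines()


def count_nonblank(nb_path):
    """Count code-cell lines that contain something other than whitespace."""
    return sum(1 for line in code_lines(nb_path) if line.strip())


if __name__ == "__main__":
    # Each command-line argument is treated as the path to a notebook.
    for name in sys.argv[1:]:
        print(f"{name}: {count_nonblank(name)} non-blank lines of code")
```

Joining the list before splitting means both forms of "source" go through the same counting path, and a cell that ends in a newline is not credited with an extra empty line.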

The answer from @Jessime Kirk is really good, but it does not seem to work when the .ipynb file contains Chinese characters. So I changed the code to open the file as UTF-8, as below.

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"This file can count the code lines number in .ipynb files.")
    print(r"usage:python countIpynbLine.py xxx.ipynb")
    print(r"example:python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"it can also count multiple code.ipynb lines.")
    print(r"usage:python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"start to count line number")
    print(run(argv[1:]))
