
Pandas and JSON ValueError: arrays must all be same length

I'm trying to make a simple application that takes the lyrics of a song and saves them. I'm using lyricsgenius to create a JSON file with the lyrics of the songs I request, but I can't figure out how to parse the data from that file. I tried following a tutorial, but I get an error as soon as I start working with Pandas.

Code to create the JSON File

import lyricsgenius as genius
import os

os.getcwd()

geniusCreds = "qlDFcHWqCRpSfq0pVTctt1ZhDc4wHF6lpP5WGODh4iVQB7yTPn7Hw6SjWAFiCdxa"
artist_name = "Steely Dan"

api = genius.Genius(geniusCreds)
artist = api.search_artist(artist_name, max_songs=3)

artist.save_lyrics()

Code to read the Data from the JSON File

import pandas as pd

# This is the line that raises the ValueError shown below
Artist = pd.read_json("Lyrics_SteelyDan.json")

df = pd.DataFrame.from_dict(Artist['songs'])

df.head()

Whenever I run the code above I get the error below. Any help on how to fix it, or a better way to parse the data, would be much appreciated. Thank you.

 "c:/Users/Admin/Desktop/Steely Dan/Data.py"
Traceback (most recent call last):
  File "c:/Users/Admin/Desktop/Steely Dan/Data.py", line 5, in <module>
    Artist = pd.read_json("Lyrics_SteelyDan.json")
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 592, in read_json
    result = json_reader.read()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 717, in read
    obj = self._get_object_parser(self.data)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 739, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 849, in parse
    self._parse_no_numpy()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 411, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 257, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 77, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 368, in extract_index
    raise ValueError("arrays must all be same length")
ValueError: arrays must all be same length

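The top-level values in your JSON file have different lengths, so your original code fails: pd.read_json tries to turn every top-level key into a column, and the columns do not line up.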
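A minimal sketch of the failure, assuming the file's top level mixes list-valued keys of different lengths. The dictionary below is a made-up stand-in (the key "alternate_names" and the song entries are hypothetical), not the real lyricsgenius output:

```python
import pandas as pd

# Hypothetical stand-in for the file's top-level structure:
# two list-valued keys whose lists have different lengths.
data = {
    "alternate_names": ["SD"],
    "songs": [
        {"title": "Song A", "lyrics": "la la"},
        {"title": "Song B", "lyrics": "do re mi"},
        {"title": "Song C", "lyrics": "fa so"},
    ],
}

# Building a frame from the whole dict treats each key as a column,
# and columns of length 1 and 3 cannot share one index.
try:
    pd.DataFrame(data)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Building the frame from the "songs" list alone works:
# each dict in the list becomes one row.
songs_df = pd.DataFrame(data["songs"])
print(songs_df.shape)
```

The exact wording of the error message varies between pandas versions, but the cause is the same.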

Try this:

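import json
import pandas as pd

# Load the file with the standard json module instead of pd.read_json
with open('Lyrics_SteelyDan.json') as json_data:
    data = json.load(json_data)

# Build the frame from the list of songs only: one row per song
df = pd.DataFrame(data['songs'])
df['lyrics']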
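If the song records contain nested objects, pd.json_normalize (available at the top level in pandas 1.0 and later) can flatten them into dotted column names. This is only a sketch: the records and the nested "album" field below are invented for illustration and may not match what lyricsgenius actually writes.

```python
import pandas as pd

# Hypothetical song records with one nested field ("album" is an assumption).
songs = [
    {"title": "Song A", "lyrics": "la la", "album": {"name": "Album X"}},
    {"title": "Song B", "lyrics": "do re mi", "album": {"name": "Album Y"}},
]

# Nested dicts are flattened into columns joined with "." by default.
flat = pd.json_normalize(songs)
print(flat.columns.tolist())
```

With the real file you would pass data['songs'] instead of the sample list.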

See also: https://hackersandslackers.com/json-into-pandas-dataframes/
