I put the code below into Pythontutor.com to see if I could understand how it works. However, despite reading up on flattening, extend, and append, I am a little lost. My question is: why does it evaluate 'b' twice? For example, it goes to extend, then creates a new list, and then takes 'b' to the else branch and appends it. I would appreciate any help that makes this clearer to me.
aList = ['b','a','c',2],[[[3]],'dog',4,5]

def flatten(aList):
    newList = []
    for item in aList:
        if type(item) == type([]):
            newList.extend(flatten(item))
        else:
            newList.append(item)
    return newList

print(flatten(aList))
The function uses recursion: it calls itself. The idea is that you break down a larger problem into smaller parts that you solve independently, then combine the results to solve the larger problem.
Here, flatten() calls itself whenever an element in the current sequence is a list. These recursive calls continue until the smaller part no longer contains any lists.
The thing to remember is that local names such as newList are local to each function call. Even if flatten() calls itself, each call gets a new, local newList value that is independent of the others.
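To make the "one newList per call" point concrete, here is a minimal sketch. The module-level list created is a hypothetical helper added only for this illustration (it is not part of the original code); it records the list object each call builds.

```python
created = []  # hypothetical helper: collects the list object made by each call

def flatten(aList):
    newList = []
    created.append(newList)  # remember which list object this call created
    for item in aList:
        if type(item) == type([]):
            newList.extend(flatten(item))
        else:
            newList.append(item)
    return newList

aList = ['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]
result = flatten(aList)

# One call for the outer tuple plus one call per nested list.
print(len(created))                       # 5
# Every call built its own distinct list object, all named newList.
print(len({id(lst) for lst in created}))  # 5
# The list made by the outermost call is the one returned to the caller.
print(created[0] is result)               # True
```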
For your input, a tuple:
['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]
the first element is a list:
['b', 'a', 'c', 2]
so that's passed to a new flatten() call. There are no more lists in that sub-list, so all the function then does is append each item to its own newList list and return that as the result. Upon returning, the first flatten() call is resumed and the returned list is added to its local newList with an extend() call.
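The reason the outer call uses extend() rather than append() for the returned list can be shown in isolation. This is a small standalone sketch; the names returned, extended and appended are made up for the illustration.

```python
returned = ['b', 'a', 'c', 2]  # what the recursive call hands back

extended = []
extended.extend(returned)      # adds each element of returned individually
print(extended)                # ['b', 'a', 'c', 2]

appended = []
appended.append(returned)      # adds the list itself as one single element
print(appended)                # [['b', 'a', 'c', 2]]
```

With append() the nesting would be preserved, so the result would not be flat.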
If you look at how Pythontutor visualises this, you'll note that there are a lot of pointers to those lists within the original object.
You can see that the first flatten() call references a tuple with two elements, and that the second flatten() call references the first element of that tuple, the contained list. Python values all live in a dedicated area of memory called the 'heap', and names and list elements are all just labels, references, nametags with strings attached to those objects, and you can have any number of such labels. See Ned Batchelder's excellent article on the subject. Both flatten() calls have their own newList reference pointing to a list object, and the currently active flatten() call is busy copying the values from the aList reference it has to newList.
The recursive call to flatten() then returns control to the remaining, still active flatten() call. Once that call's local newList list has been extended with the returned values, the function moves on to the next element, [[[3]], 'dog', 4, 5], which has a few more lists to process: first [[3]], then [3], and then there are no more nested lists to process.
If you write this all out with indentation for new calls, you get:
flatten((['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]))
    newList is set to an empty list
    item is set to ['b', 'a', 'c', 2]
    type(item) is a list, so recurse
    flatten(['b', 'a', 'c', 2])
        newList is set to an empty list
        item is set to 'b', not a list, appended to newList, now ['b']
        item is set to 'a', not a list, appended to newList, now ['b', 'a']
        item is set to 'c', not a list, appended to newList, now ['b', 'a', 'c']
        item is set to 2, not a list, appended to newList, now ['b', 'a', 'c', 2]
        newList is returned: ['b', 'a', 'c', 2]
    newList is extended with ['b', 'a', 'c', 2], so now ['b', 'a', 'c', 2]
    item is set to [[[3]], 'dog', 4, 5]
    type(item) is a list, so recurse
    flatten([[[3]], 'dog', 4, 5])
        newList is set to an empty list
        item is set to [[3]]
        type(item) is a list, so recurse
        flatten([[3]])
            newList is set to an empty list
            item is set to [3]
            type(item) is a list, so recurse
            flatten([3])
                newList is set to an empty list
                item is set to 3, not a list, appended to newList, now [3]
                newList is returned: [3]
            newList is extended with [3], so now [3]
            newList is returned: [3]
        newList is extended with [3], so now [3]
        item is set to 'dog', not a list, appended to newList, now [3, 'dog']
        item is set to 4, not a list, appended to newList, now [3, 'dog', 4]
        item is set to 5, not a list, appended to newList, now [3, 'dog', 4, 5]
        newList is returned: [3, 'dog', 4, 5]
    newList is extended with [3, 'dog', 4, 5], so now ['b', 'a', 'c', 2, 3, 'dog', 4, 5]
    newList is returned: ['b', 'a', 'c', 2, 3, 'dog', 4, 5]
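If you would rather have Python produce an indented trace like this than write it out by hand, one possible approach is to add print() calls. The sketch below is an instrumented variant, not the original function: the depth parameter is a hypothetical addition used only to control the indentation.

```python
def flatten(aList, depth=0):
    # depth is a hypothetical extra parameter, used only to indent the trace
    indent = '    ' * depth
    print(f'{indent}flatten({aList!r})')
    newList = []
    for item in aList:
        if type(item) == type([]):
            # Recurse one level deeper, then merge the returned flat list
            newList.extend(flatten(item, depth + 1))
            print(f'{indent}    newList extended, now {newList!r}')
        else:
            newList.append(item)
            print(f'{indent}    appended {item!r}, now {newList!r}')
    print(f'{indent}    returning {newList!r}')
    return newList

aList = ['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]
result = flatten(aList)
```

Each level of indentation in the printed output corresponds to one active flatten() call, each with its own newList.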
In the default visualisation that Pythontutor uses for Python code, the fact that you see "b" twice is actually an artifact of a simplification. While lists and tuples are shown as separate objects with arrows showing how they are referenced, 'primitive' types such as strings and integers are shown inside the lists or directly inside variables in function frames.
In reality, these objects too are separate, and they too live on the heap and are referenced. That "b" value exists as a single object, with multiple lists referencing it. You can pick a different visualisation, however, one that draws every object separately.
With that option, the visualisation becomes a lot larger.
Here you can see that both newList in the active function frame and the original list object referenced from the input tuple reference a single str object with value "b". But you can perhaps see that with this level of detail things are a bit too verbose to take in in one go.
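You can check the "single object, several references" claim yourself with the is operator, which tests object identity rather than equality. A minimal sketch, reusing the function from the question:

```python
def flatten(aList):
    newList = []
    for item in aList:
        if type(item) == type([]):
            newList.extend(flatten(item))
        else:
            newList.append(item)
    return newList

aList = ['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]
flat = flatten(aList)

# The flattened list references the very same str objects as the input lists;
# append() and extend() store references, they do not copy the values.
print(flat[0] is aList[0][0])   # True
print(flat[5] is aList[1][1])   # True

# The list objects themselves are new: flat is not one of the input lists.
print(flat is aList[0])         # False
```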
Perhaps it would be easier to understand if it were written more simply:
aList = ['b','a','c',2],[[[3]],'dog',4,5]

def flatten(value):
    if not isinstance(value, (list, tuple)): return [value]
    return [item for subItem in value for item in flatten(subItem)]
If the value parameter is a list or tuple, the flattened versions of its elements are concatenated to form the flattened output (2nd line). Because each of these elements could itself be a list or tuple, the function calls itself to flatten the element before concatenating it with the others. The function stops calling itself when its parameter is a scalar value (i.e. not a list or tuple). In that case it returns the value itself as a single-element list (1st line), because it cannot be flattened further and its caller (itself) expects a list.
flatten( aList ) : returns ['b']+['a']+['c']+[2]+[3]+['dog']+[4]+[5]
  --> flatten( ['b','a','c',2] ) : returns ['b']+['a']+['c']+[2]
      --> flatten('b') : returns ['b']
      --> flatten('a') : returns ['a']
      --> flatten('c') : returns ['c']
      --> flatten(2) : returns [2]
  --> flatten( [[[3]],'dog',4,5] ) : returns [3]+['dog']+[4]+[5]
      --> flatten([[3]]) : returns [3]
          --> flatten([3]) : returns [3]
              --> flatten(3) : returns [3]
      --> flatten('dog') : returns ['dog']
      --> flatten(4) : returns [4]
      --> flatten(5) : returns [5]
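One behavioural difference between the two versions is worth noting: the version from the question only recurses into list objects, while this one recurses into tuples as well. The sketch below puts both side by side; the names flatten_lists_only and flatten_lists_and_tuples are made up here so the two can coexist in one file.

```python
def flatten_lists_only(aList):
    # The version from the question: only recurses into list objects
    newList = []
    for item in aList:
        if type(item) == type([]):
            newList.extend(flatten_lists_only(item))
        else:
            newList.append(item)
    return newList

def flatten_lists_and_tuples(value):
    # The simplified version: recurses into lists and tuples
    if not isinstance(value, (list, tuple)):
        return [value]
    return [item for subItem in value for item in flatten_lists_and_tuples(subItem)]

aList = ['b', 'a', 'c', 2], [[[3]], 'dog', 4, 5]

# On the input from the question both give the same result.
print(flatten_lists_only(aList) == flatten_lists_and_tuples(aList))  # True

# With a tuple nested inside, the results differ.
mixed = [1, (2, [3])]
print(flatten_lists_only(mixed))        # [1, (2, [3])]
print(flatten_lists_and_tuples(mixed))  # [1, 2, 3]
```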