Trying to understand 'lambda' and 'map' in this script
I am trying to understand the lambda and map functions in Python, specifically with regard to the code below, which I have been following using the Tweepy API. I have googled lambda and map, but I'm struggling to understand them in the context of this script. As I understand it, a lambda takes an argument and an expression, thereby becoming a shortened function? Could you kindly take a look at the code below and indicate what map and lambda are doing in each line?
import json
import pandas as pd

# Reading the raw data collected from the Twitter Streaming API using Tweepy
tweets_data = []
tweets_data_path = 'output2.txt'
tweets_file = open(tweets_data_path, 'r')
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except json.JSONDecodeError:
        # Skip lines that are not valid JSON (e.g. blank keep-alive lines)
        continue
print('The total number of Tweets is:', len(tweets_data))
#Create a function to see if the tweet is a retweet
def is_RT(tweet):
    if 'retweeted_status' not in tweet:
        return False
    else:
        return True
#Create a function to see if the tweet is a reply to a tweet of another user, if so return that user.
def is_Reply_to(tweet):
    if 'in_reply_to_screen_name' not in tweet:
        return False
    else:
        return tweet['in_reply_to_screen_name']
#Convert the Tweet JSON data to pandas Dataframe, and take the desired fields from the JSON.
tweets = pd.DataFrame()
tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))
tweets['Username'] = list(map(lambda tweet: tweet['user']['screen_name'], tweets_data))
tweets['Timestamp'] = list(map(lambda tweet: tweet['created_at'], tweets_data))
tweets['length'] = list(map(lambda tweet: len(tweet['text']) if 'extended_tweet' not in tweet else len(tweet['extended_tweet']['full_text']), tweets_data))
tweets['location'] = list(map(lambda tweet: tweet['user']['location'], tweets_data))
tweets['device'] = list(map(reckondevice, tweets_data))
tweets['RT'] = list(map(is_RT, tweets_data))
tweets['Reply'] = list(map(is_Reply_to, tweets_data))
I was following the guide fine, but this threw me, as I have never seen map or lambda before. I understand we are building a DataFrame in pandas; I'm just not sure how it is happening.
Thanks!!
Syntactically, a call to map looks like this:
map(callable, <collection>)
In simple words, it iterates over the collection, calls the callable on each item, and gives you the callable's return value in place of that item. Technically, it neither modifies the original list nor creates a new list, but you get the idea. You pass an iterable, and map returns a new iterable (a lazy iterator in Python 3) in which each item has been transformed by the callable.
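To make the "neither modifies nor creates a list" point concrete, here is a minimal sketch. The numbers list is hypothetical sample data, not part of the original script.

```python
# Hypothetical sample data, used only for illustration.
numbers = [1, 2, 3]

# map() returns a lazy iterator; nothing has been computed yet.
doubled = map(lambda n: n * 2, numbers)

# list() pulls every item through the lambda and collects the results.
doubled_list = list(doubled)

print(doubled_list)  # [2, 4, 6]
print(numbers)       # [1, 2, 3], the input list is unchanged

# The iterator is now exhausted, so a second pass yields nothing.
print(list(doubled))  # []
```

This is why the script wraps every map call in list(): a DataFrame column needs the actual values, not a one-shot iterator.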
Now, lambda is a shorthand for creating an unnamed function.
lambda x: str(x)
is similar to:
def transform_to_str(x):
    return str(x)
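As a quick check that the two forms behave the same, the sketch below runs both over some hypothetical values. The name to_str is introduced here only so the lambda can be compared; normally a lambda is passed inline without a name.

```python
# Binding the lambda to a name purely for comparison.
to_str = lambda x: str(x)

def transform_to_str(x):
    return str(x)

# Hypothetical sample values.
values = [1, 2.5, None]

via_lambda = list(map(to_str, values))
via_def = list(map(transform_to_str, values))

print(via_lambda)             # ['1', '2.5', 'None']
print(via_lambda == via_def)  # True
```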
Now, given this code:
tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))
Let's split that up:
callable = lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text']
iterable = tweets_data
tweets['text'] = list(map(callable, iterable))
Let's convert callable to a normal function:
def callable(tweet):
    return tweet['text'] if 'extended_tweet' not in tweet else tweet['extended_tweet']['full_text']
So, what your code does is:
It iterates over all the items in tweets_data (the iterable).
For each tweet in tweets_data, it calls the callable (the lambda), which takes a single argument.
The list() function consumes the lazy iterator that map returns and turns it into a list, thus forcing all tweets to be transformed at once.
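The steps above can be tried on a couple of hand-made tweets. This is only a sketch: sample_tweets and pick_text are hypothetical names, and the dictionaries contain just the keys that the lambda looks at, not the full Tweepy payload.

```python
# Hypothetical, heavily trimmed tweets: one short, one with extended text.
sample_tweets = [
    {"text": "short tweet"},
    {"text": "truncated...",
     "extended_tweet": {"full_text": "the full, untruncated text"}},
]

def pick_text(tweet):
    # Same conditional expression as the lambda in the question.
    return (tweet["text"] if "extended_tweet" not in tweet
            else tweet["extended_tweet"]["full_text"])

# One result per tweet, in the same order as the input.
texts = list(map(pick_text, sample_tweets))
lengths = list(map(lambda tweet: len(pick_text(tweet)), sample_tweets))

print(texts)  # ['short tweet', 'the full, untruncated text']
print(lengths)
```

The 'length' column in the question is built the same way, except that it repeats the conditional inside the lambda instead of reusing a helper.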
Now you can work through the other lambdas in the same way. It is probably worth going through the documentation too, which is quite thorough.
A simple way to understand lambda: it takes its argument before the colon, and whatever expression comes after the colon is returned. For example, in your code above:
tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))
lambda tweet: tweet['text'] simply takes a dictionary tweet and returns the value of the key text.
And map is a function which simply applies a given function to each item of an iterable (list, tuple, etc.) and returns an iterable.
Note: An iterable is something you can loop over with a for loop.
So, if we write a small function for your lambda expression lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet['extended_tweet']['full_text'], it would look like:
def foo(tweet):
    if 'extended_tweet' not in tweet:
        return tweet['text']
    else:
        return tweet['extended_tweet']['full_text']
Let us apply this to our map:
map(foo, tweets_data)
So here the function foo() is applied to each and every element of tweets_data.
And the list function takes the values returned by map one by one and collects them into a list.
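If the map form still feels opaque, it may help to see it next to a list comprehension, which produces the same list. The sketch below uses a shortened version of is_RT from the question and two hypothetical tweets that contain only the key being tested.

```python
# Hypothetical trimmed tweets: an original tweet and a retweet.
sample_tweets = [
    {"text": "an original tweet"},
    {"text": "RT something", "retweeted_status": {"text": "something"}},
]

def is_RT(tweet):
    # Equivalent to the if/else version in the question.
    return "retweeted_status" in tweet

# A named function can be passed to map exactly like a lambda.
with_map = list(map(is_RT, sample_tweets))

# The same result written as a list comprehension.
with_comprehension = [is_RT(tweet) for tweet in sample_tweets]

print(with_map)                        # [False, True]
print(with_map == with_comprehension)  # True
```

Either list can be assigned to a DataFrame column, as in tweets['RT'] = list(map(is_RT, tweets_data)).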
Hope you find the explanation helpful.