
Trying to understand 'lambda' and 'map' in this script

I am trying to understand the lambda and map functions in Python, specifically with regard to the code below, which I have been following while using the Tweepy API. I have googled lambda and map, but I'm struggling to understand them in the context of this script. As I understand it, a lambda takes an argument and an expression, thereby becoming a shortened function? Could you kindly take a look at the code below and indicate what map and lambda are doing in each line?

import json
import pandas as pd

#Reading the raw data collected from the Twitter Streaming API using Tweepy
tweets_data = []
tweets_data_path = 'output2.txt'
tweets_file = open(tweets_data_path, 'r')
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except:
        continue

print('The total number of Tweets is:', len(tweets_data))

#Create a function to see if the tweet is a retweet
def is_RT(tweet):
    if 'retweeted_status' not in tweet:
        return False
    else:
        return True

#Create a function to see if the tweet is a reply to a tweet of another user, if so return that user.
def is_Reply_to(tweet):
    if 'in_reply_to_screen_name' not in tweet:
        return False
    else:
        return tweet['in_reply_to_screen_name']

#Convert the Tweet JSON data to pandas Dataframe, and take the desired fields from the JSON.

tweets = pd.DataFrame()
tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))

tweets['Username'] = list(map(lambda tweet: tweet['user']['screen_name'], tweets_data))

tweets['Timestamp'] = list(map(lambda tweet: tweet['created_at'], tweets_data))

tweets['length'] = list(map(lambda tweet: len(tweet['text']) if 'extended_tweet' not in tweet else len(tweet['extended_tweet']['full_text']), tweets_data))

tweets['location'] = list(map(lambda tweet: tweet['user']['location'], tweets_data))

tweets['device'] = list(map(reckondevice, tweets_data))

tweets['RT'] = list(map(is_RT, tweets_data))

tweets['Reply'] = list(map(is_Reply_to, tweets_data))

I was following the guide fine, but this threw me, as I have never seen map or lambda before. I understand we are building a data frame in pandas; I'm just not sure how it is happening.

Thanks!!

Syntactically, a map call looks like this:

map(callable, <collection>)

In simple terms, it iterates over the collection, calls the callable on each item, and puts the callable's return value in that item's place. Technically, it neither modifies the original list nor creates a new list, but you get the idea. You pass an iterable, and map returns a new iterable (in Python 3, a lazy iterator) in which each item has been transformed by the callable.
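As a minimal sketch of that behaviour (the list words is hypothetical sample data, not part of the original script), the following shows that map leaves its input untouched and only produces values once something such as list() consumes it:

```python
# Hypothetical sample data, used only for illustration.
words = ["alpha", "beta", "gamma"]

# map() returns a lazy iterator (a "map object"), not a list.
upper_iter = map(str.upper, words)

# list() pulls every item through str.upper and collects the results.
upper_words = list(upper_iter)

print(upper_words)  # ['ALPHA', 'BETA', 'GAMMA']
print(words)        # the input list is unchanged: ['alpha', 'beta', 'gamma']
```

Note that the map object can be consumed only once; calling list(upper_iter) a second time gives an empty list.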

Now, lambda is shorthand for creating an unnamed function.

lambda x: str(x)

is similar to:

def transform_to_str(x):
    return str(x)
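To check that the two forms behave the same way, here is a small sketch. The names to_str_lambda and values are made up for this illustration; binding a lambda to a name is done here only so that both versions can be called side by side.

```python
# The lambda form, bound to a name only so we can call it below.
to_str_lambda = lambda x: str(x)

# The equivalent named function.
def transform_to_str(x):
    return str(x)

# Hypothetical sample values.
values = [1, 2.5, None]

from_lambda = list(map(to_str_lambda, values))
from_def = list(map(transform_to_str, values))

print(from_lambda)              # ['1', '2.5', 'None']
print(from_lambda == from_def)  # True
```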

Now, given this code:

tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))

Let's split that up:

callable = lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text']

iterable = tweets_data

tweets['text'] = list(map(callable, iterable))

Let's convert callable to a normal function:

def callable(tweet):
    return tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text']

So, what your code does is:

  • It iterates over tweets_data (the iterable).
  • For each tweet in tweets_data, it calls callable (the lambda), which takes a single argument.
  • It takes the return value and yields it as the next item of the iterator that map returns.

The list() function converts that iterator to a list, thus forcing all the tweets to be transformed at once.
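The sketch below illustrates the "one item at a time" behaviour. The tweets_data list is a hypothetical, heavily trimmed stand-in for real tweet dictionaries, and get_text is an illustrative name for the same conditional expression used in the lambda.

```python
# Hypothetical stand-ins for tweet dictionaries; real tweets have many more keys.
tweets_data = [
    {"text": "short tweet"},
    {"text": "truncated...", "extended_tweet": {"full_text": "the full, untruncated text"}},
    {"text": "another short tweet"},
]

def get_text(tweet):
    # Same logic as the lambda: prefer the extended text when it is present.
    return tweet["text"] if "extended_tweet" not in tweet else tweet["extended_tweet"]["full_text"]

texts_iter = map(get_text, tweets_data)  # no tweet has been processed yet

first = next(texts_iter)  # processes only the first tweet
rest = list(texts_iter)   # processes everything that is left

print(first)  # short tweet
print(rest)   # ['the full, untruncated text', 'another short tweet']
```

In the original script, list() is applied straight away, so every tweet is transformed before the result is assigned to the DataFrame column.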

Now you can try to work through the other lambdas. It is probably worth going through the documentation as well, which is quite thorough.

A simple way to understand lambda: it takes its argument before the colon, and whatever expression comes after the colon is returned. For example, in your code above:

tweets['text'] = list(map(lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'], tweets_data))

The part lambda tweet: tweet['text'] simply takes a dictionary tweet and returns the value of its key text.

And map is a function that simply applies a given function to each item of an iterable (list, tuple, etc.) and returns an iterable.

Note: An iterable is something you can loop over with a for loop.

So, if we write a small function for your lambda expression lambda tweet: tweet['text'] if 'extended_tweet' not in tweet else tweet ['extended_tweet']['full_text'] , it would look like:

def foo(tweet):
    if 'extended_tweet' not in tweet:
        return tweet['text']
    else:
        return tweet ['extended_tweet']['full_text']

Let us apply this to our map:

map(foo, tweets_data)

So here, the function foo() is applied to each and every element of tweets_data.

And the list function takes the values returned by map one by one and collects them into a list.
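To connect this with the note about for loops, here is a rough sketch showing that list(map(foo, tweets_data)) gives the same result as an explicit loop. The contents of tweets_data are hypothetical sample dictionaries, not real Twitter data.

```python
# Hypothetical stand-ins for tweet dictionaries.
tweets_data = [
    {"text": "hello"},
    {"text": "cut off...", "extended_tweet": {"full_text": "the complete text"}},
]

def foo(tweet):
    if 'extended_tweet' not in tweet:
        return tweet['text']
    else:
        return tweet['extended_tweet']['full_text']

# Version 1: map applies foo to every element, list collects the results.
via_map = list(map(foo, tweets_data))

# Version 2: the same work written as an explicit for loop.
via_loop = []
for tweet in tweets_data:
    via_loop.append(foo(tweet))

print(via_map)              # ['hello', 'the complete text']
print(via_map == via_loop)  # True
```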

Hope you find the explanation helpful.

Notice: technical posts on this site follow the CC BY-SA 4.0 license; if you reproduce this post, please credit the site URL or the original address.
