I have a movie database that has the cast field as a comma delimited column as shown below.
What I would like to accomplish is to get a count of how many movies each cast member has in the database. The cast field is of different lengths
As in
'John Wayne' 100
'Tom Cruise' 17
'Ravi Shankar' 5
Sample data:
Title Cast
----------------------------------------------------------------------
BEAUTIFUL MIND Russell Crowe, Ed Harris, Jennifer Connelly, Paul Bettany, Adam Goldberg, Judd Hirsch, Josh Lucas, Anthony Rapp, Christopher Plummer, Clint Howard
ABDUCTION OF EDEN Jamie Chung, Matt O Leary, Beau Bridges, Scott Mechlowicz, Mariana Klaveno, Tantoo Cardinal, Tracey Fairaway, Russell Hodgkinson, Naama Kates, Tony Doupe
ABSENCE OF MALICE Bob Balaban, Wilford Brimley, Melinda Dillon, Sally Field, John Harkins, Paul Newman, Barry Primus, Josef Sommer, Luther Adler, Don Hood
Columbus Short, Mark Derwin, Jonah Hill, Robin Lord Taylor, Adam Herschman
Thank you for your kind consideration.
Tony
Assuming there are two columns, title
and cast
, the solution to your problem requires you to split the comma-delimited list of cast in the cast
column into a new column where each cast member appears on its own row.
If you are running SQL Server 2016 or above and have the database compatibility level set to 130 or above you can use a combination of the built-in string_split function and cross apply to get the result you want like this:
select value as "Cast member", count(title) as "Count of movies"
from movies
cross apply string_split(cast, ',')
group by value
If you're using an older database version you have to split the cast column using some custom function, which there are many around - a quick search here on SO should give you a suitable solution. ( Here is one post with several suitable functions )
Assuming you have two separate columns for movie name and cast.. here is a possible solution. You will need a function to create you a list from a csv.. google it or search here and you will find it..
;with my_cast as (
select
'Jamie Chung,Russell Crowe, Ed Harris, Jennifer Connelly, Paul Bettany, Adam Goldberg, Judd Hirsch, Josh Lucas, Anthony Rapp, Christopher Plummer, Clint Howard, Clint Howard' as names
union all
select
'Jamie Chung, Matt O Leary, Beau Bridges, Scott Mechlowicz, Mariana Klaveno, Tantoo Cardinal, Tracey Fairaway, Russell Hodgkinson, Naama Kates, Tony Doupe'
union all
select
' Adam Goldberg,Bob Balaban, Wilford Brimley, Melinda Dillon, Sally Field, John Harkins, Paul Newman, Barry Primus, Josef Sommer, Luther Adler, Don Hood Columbus Short, Mark Derwin, Jonah Hill, Robin Lord Taylor, Adam Herschman'
)
,my_names_list as (
Select value as names_list from my_cast
CROSS APPLY dbo.F_SplitList(my_cast.names, ',') b
)
Select
names_list
,count(*) as howmany
from my_names_list
group by names_list
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.