Above is a picture of my Excel sheet. I have 2 columns of data that have multiple data points in them (separated by commas). This is how my data is spit out after running an online psychology experiment. I'm hesitant to split text to columns because some lines only have 3 values and other lines have 20+. Essentially, I need to match values in one column to values in the second column. For example, the first value in column G needs to match with the first value in column H. The second value needs to match with the second value, etc. I don't need to match up every value in both columns, however. I only need a (defined) subset of values.
I'm not sure if this is possible to do in Excel (or any Excel add-on) without separating the values into separate columns, but any help is appreciated!
I've seen this before in survey data - the output uses "packed data" where each cell contains many values. You will need Excel 2010+ for Windows (or Excel 365) for this solution. Otherwise, there a solution that is also Mac compatible that does not involve VBA, but it takes time to construct. This approach should take you 10 mins to do - a lot of steps, but it is just clicking.
Let's say that these are your data in two columns in a table.
Click anywhere inside the table. Open the Data tab and click on From Table/Range:
This will convert your data into an Excel Table and ask you if your table has headers - yes it does. Click OK.
This will open the Power Query (PQ) editor (congratulations, you are now a step closer to data scientist, so take a selfy with this screen in the back and share on social media).
You will see in the Applied Steps on the right hand side that PQ has helpfully detected the data type in a step called Changed Type. You need to undo that because it will likely think that your comma separated numbers are just one giant number. So click the X on the left side of that step.
On the right side, you can expand out Queries as shown above. Right click on your table and select Duplicate.
NB: This is not the most efficient way to do this, but I think this is something you just want do one time and you probably don't want to go hacking through the Advanced Editor.
So now you have two tables:
Rename Table1 (2) to Output in the box on the right hand side just to create some clarity.
Right Click on the Response RT column in Output and Remove it. Click on Table1 and do the same thing to the Response column. So now you have Table1 with only the Response RT and Output with only the Responses. Now we will parse these into rows of cleaned data.
Parse Table1
First, in Table1, click on the Response RT column and in the Home tab you will see Split Column. 1) Click on that and choose By Delimiter.
2) It will default to Comma, but you need to click on Advanced options and choose the Rows radio button.
Click OK and it should turn your data into rows of separated numbers and change to the type (this time helpfully) to decimal.
Now you need to add an index. 3) Go to the Add Column tab and click on Add Index, starting from 1.
Parse Ouput Table
Now go back to Output and repeat steps 1), 2) and 3) for it as well. Then you will have to take an extra step to clean up your text column. Right-Click on the Response column and choose Transform > Trim on the data.
That will get rid of those spurious spaces.
Merge Them Back Together
While you still have the Output table selected, go to the Home tab and choose Merge Queries.
It will bring up this window:
Choose Table1 from the bottom dropdown. Click on Index on both tables and click OK. You will get something like this:
Click on the button on the top right of the Table1 column and then unselect Index and Use original column name as prefix .
Click OK. Right click the Index column and Remove it. You now have your answer, but you still need to bring it back to Excel.
Putting it back in Excel
Click on Close and Load to on the left hand of the Home tab. To keep things simple, just click OK.
It will put both Output and Table1 as worksheets into your workbook, (this is where I said it is not the most efficient approach - you can always delete the Table1 worksheet. Excel will complain when you do, but you can ignore it.) Output is your answer.
Congratulations, you just did an ETL (extract transform and load) operation in data analytics. Do another selfy with the answer and share on social media.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.