Public Schedule Face-to-Face & Online Instructor-Led Training - View dates & book

excel data mappingcleansing

Forum home » Delegate support and help forum » Microsoft Excel Training and help » Excel data mapping/cleansing

Excel data mapping/cleansing

ResolvedVersion 2016

Amy has attended:
Excel Advanced course
Excel Advanced - Formulas & Functions course

Excel data mapping/cleansing

Hi,

I'm doing some data categorization for a client, and have a spreadsheet where there are a list of supplier names (nearly 9000), but some of the names are repeated throughout with slight variations e.g.:

A&L Dent
A&L Dent Solutions Ltd
AA Garage Service MB
AA Garage Services MH

someone has done some cleansing to get final "cleansed supplier name" column, so in the case above the adjacent column now reads:

A&L Dent
A&L Dent
AA Garage Service
AA Garage Service

However some of the cleansing has not worked effectively, e.g.

4 D Distributors
4D DISTRIBUTORS

Has been cleansed to:

4 D Distributors
4D Distributors

I'm trying to find these missed ones that haven't been cleansed effectively so that there is only one naming convention for each supplier in the "cleansed column".

I have no idea how the person mapped the supplier names in the first place as all formula's have been removed.

Is there a way for me to ensure that the cleansed supplier names are correct and are mapped properly to the original names?

Thanks
Amy

RE: Excel data mapping/cleansing

Hi Amy,

Thank you for the forum question.

Du you have a list of how the names should be?

Is this a repeating task or something you need to do one time?

Is there a pattern in the mistakes Excel can use to get it right?

If there is a pattern, it could be that some typed UK and others United Kingdom. Then you can tell Excel automatic to replace United Kingdom with UK.

If you have a list of the client's names spelled correct, you can merge the data in Power Query and if you use a anti left/right join Power Query can create a list of all misspelled client names.

You can also download the fuzzy lookup Add-in to Excel. Then you can set a threshold for accuracy. You can set a percentage. If it 90% the same map them.


Kind regards

Jens Bonde
Microsoft Office Specialist Trainer

Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us

London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector

RE: Excel data mapping/cleansing

Hi,

Thanks for your response, in response to your questions:

I don't have a list of how the names should be.

It's a one time task.

I can't identify a pattern in the mistakes. (In the example above it was the spacing between the number and letter)

I just have one list/column of names with numerous variations throughout that column and I want to end up with a new "cleansed" column/list with no variations, just one naming format for each name. So if I filter that supplier I just have one name to select from not a number of variations.

I'm not sure what was done to get the current cleansed list, but maybe I could start fresh and see how it works out? I'd appreciate any advise on how to do that please?


Many Thanks
Amy

RE: Excel data mapping/cleansing

Hi Amy,


If there is no pattern in the errors it can be very difficult to do. You may be able to do it with a combination of different methods.

If you create a column and use the Left function. If you have have the client names in column A you can in the new column write =Left(A2, 8). This will extract the first 8 characters from the client name.

A&L Dent = A&L Dent
A&L Dent Solutions Ltd = A&L Dent
AA Garage Service MB = AA Garag
AA Garage Services MH = AA Garag

But this method will not help you if you have the problem within the first 8 characters.


There is no functions in Excel which can do cleaning if there is absolutely no pattern. I am almost sure that who ever did the cleaning created a list of how he wanted the suppliers name to be and in the "cleaned" column the person used a Vlookup approximately match.

Kind regards

Jens Bonde
Microsoft Office Specialist Trainer

Tel: 0207 987 3777
STL - https://www.stl-training.co.uk
98%+ recommend us

London's leader with UK wide delivery in Microsoft Office training and management training to global brands, FTSE 100, SME's and the public sector

Tue 29 Sep 2020: Automatically marked as resolved.

Excel tip:

Switch between workbooks ni EXCEL.

use Alt-TAB to switch between workbooks.

View all Excel hints and tips

Connect with us:

0207 987 3777

Call for assistance

Request Callback

We will call you back

Server loaded in 0.13 secs.