I have sometimes needed to create a working data sheet from a large section of text. As an analyst and data organiser it is often your responsibility to take unformatted information and turn it into useful, meaningful, organised and workable data. When might it be required? Some databases extract data in the form of text files. Not all users are experienced and it is entirely possible that clients will send data in this format. Rather than making them re-do it to your required format, it is much better if you can transform it yourself. Another reason may be that data is on a worksheet, but needs to be formatted properly to be of use. An example would be a column containing the full name of clients. In order to put the data from the sheet into a database it needs to be in two columns called First_name and Surname. Sometimes when copying and pasting into Excel, Excel places the data into one column rather than separating it. This would also require separation.

The Text to Column functionality is extremely useful in this case and there is even a wizard to help. In version 2010, go to the Data tab of the ribbon and choose Text to Column. This will present a wizard with a step by step guide on how to separate the text. In previous versions, go to Data and select Text to Columns. Many rows of data can be converted in one go, but if you attempt to select more than one column then Excel will give a warning message since the functionality is for the conversion of one column at a time.

The first step in the wizard is to decide on what exactly separates the text in the column. In our names example there is a space between the first name and the surname. If spaces separate the columns then select Fixed Width. If another character such as a comma or full stop separates them then choose delimited. In this example we will assume that the separator is a comma. Click next and some simple options are given for the separation character or select other and type your own choice. More than one separator can be selected. In the wizard viewer a vertical line will appear to show where Excel plans to separate the text.

The next tab allows the user to choose the data type for each column. Highlight each column in the viewer, one at a time and select the appropriate data type. It is important to get this right at this stage since if you need to complete further calculations on the data then this may be a problem if it is in an incorrect format. Sometimes there will be information which is considered irrelevant. For a data sheet that has customer names and addresses for example, there may also be an id field containing an id per row from a database you do not use yourself. If this is the case then highlight the column and select the checkbox labelled 'do not import column (skip).' When Excel performs the conversion it will ignore this column.

Make sure that you choose the correct destination cells. If you are converting column A, but there is already data in columns B and C which are required, select some new destination cells in the box provided. If you do not then you are in danger of overwriting the data in columns B and C since Excel will default to the next available column.

When using the Fixed width separator option in the wizard, the second tab will show a similar display with vertical lines as the delimited wizard. The difference is that the user can manually add a column break by clicking on the viewer or move an existing column break by clicking and dragging it. The wizard then has the same options to set the format of the columns.

The success of this functionality will depend on the quality of your starting data. The more data included in one cell, the more likely it is that you will have to go back and manually format some of the results. Our first name and surname example is very simple, but in reality data is often presented in a much more complicate way. An example of things getting more complicated is when an entire address is included in one cell. For a database this information would need to be split into Address1, Address2, Town, City and Post Code. Unfortunately the first address contains four words and the second address contains three. Unless each section is separated by a comma, this can be extremely complicated. Many columns will be created since each column will be separated using the spaces.

This functionality works best when the original data is simple and organised, but can help with complicated starting data, even if some manual adjustments must be performed upon completion.