Parsing unevenly spaced columns from text in PHP
Published by Nicholas Dunbar on August 23rd, 2014
If you want to parse unevenly or jaggedly spaced columns out of a text file in PHP, below are two functions that will help you do it.
Let's say that you have the following text output:
col1          col2   col3
====          ====   ====
1 a b c d e   103    14 as d9
2 a          103   14 as d9
3 a            103  14 as d9
And we want to transform the data into the following:
col1        | col2 | col3     |
1 a b c d e | 103  | 14 as d9 |
2 a         | 103  | 14 as d9 |
3 a         | 103  | 14 as d9 |
We need an algorithm that distinguishes large gaps from small ones and makes a weighted estimate of which column each piece of data belongs to. Perhaps you are parsing columns from a data source that frequently changes the number of spaces between columns, and that breaks your parser. This approach is more robust than slicing each row at the positions of the column headers, because it tolerates a small degree of noise or error in the text you are parsing.
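As a rough illustration of the idea, here is a minimal sketch and not the original function from this post. It assumes that a gap of two or more spaces separates columns, that single spaces belong to the data, and that the "weighted estimate" can be approximated by picking the header whose start position is nearest. The names findColumnStarts, parseJaggedRow and $minGap are made up for this example.

```php
<?php
// Hypothetical helper: byte offsets at which each header word starts.
// Assumes header names contain no spaces.
function findColumnStarts(string $headerLine): array
{
    preg_match_all('/\S+/', $headerLine, $m, PREG_OFFSET_CAPTURE);
    return array_column($m[0], 1);
}

// Hypothetical parser: split a row on "large" gaps ($minGap or more spaces)
// and assign each chunk to the column whose header start is nearest.
function parseJaggedRow(string $line, array $columnStarts, int $minGap = 2): array
{
    if (!$columnStarts) {
        return [];
    }
    $row = array_fill(0, count($columnStarts), '');

    // Each chunk comes back as [text, offset in the line].
    $chunks = preg_split(
        '/ {' . max(1, $minGap) . ',}/',
        rtrim($line),
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE
    );

    foreach ($chunks as [$text, $offset]) {
        // Skip any leading small gap so the offset points at real text.
        $offset += strlen($text) - strlen(ltrim($text));
        $text = trim($text);
        if ($text === '') {
            continue;
        }

        // Nearest header start wins; a simple stand-in for a weighted estimate.
        $best = 0;
        foreach ($columnStarts as $i => $start) {
            if (abs($offset - $start) < abs($offset - $columnStarts[$best])) {
                $best = $i;
            }
        }

        // Two chunks landing in the same column are joined with a space.
        $row[$best] = $row[$best] === '' ? $text : $row[$best] . ' ' . $text;
    }
    return $row;
}
```

To use it, pass the header line to findColumnStarts once, then call parseJaggedRow for every data line with the resulting offsets. A row whose values drift a character or two to the left or right of the headers still ends up in the intended columns, because only the nearest header matters, not the exact position.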
Here is a more efficient algorithm you can use if your columns have the same positions as the headers.
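A minimal sketch of that simpler approach, again not the original listing: it assumes every value starts exactly where its header starts, so each row can be sliced at the header offsets. The function names headerOffsets and parseFixedRow are hypothetical.

```php
<?php
// Hypothetical helper: byte offsets at which each header word starts.
// Assumes header names contain no spaces.
function headerOffsets(string $headerLine): array
{
    preg_match_all('/\S+/', $headerLine, $m, PREG_OFFSET_CAPTURE);
    return array_column($m[0], 1);
}

// Hypothetical parser: slice a row at the header offsets.
// Only correct when the data is aligned with the headers.
function parseFixedRow(string $line, array $starts): array
{
    $row = [];
    foreach ($starts as $i => $start) {
        // A column runs up to the next header start, or to the end of the line.
        $end = $starts[$i + 1] ?? strlen($line);
        $length = max(0, $end - $start);
        $row[] = trim((string) substr($line, $start, $length));
    }
    return $row;
}
```

This does a single substr call per column and no gap detection at all, which is why it is cheaper. The trade-off is that a value shifted even one character to the left of its header gets cut in the wrong place.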