A reader writes....

Cedron is in the usergroup ‘Regular’
I recently got a letter from a Dorothy in Kansas. She writes:

Gosh, Mr. Dawg, why do you hate tabs so much?

Well, Dottie, may I call you Dottie?  First, a shout out to farm country: you are feeding the world, and we appreciate you so much.  Hug a farmer today.  Kansas is also in the middle of Tornado Alley, so please be careful out there.

Whatever gave you the idea that I hate tabs?  I love tabs, they are the best at what they do, no other character can replace them.  When tabs are outlawed, only outlaws will use tabs.

So, anticipating your next question, what are tabs good for?

They are absolutely the best delimiter to use when passing text files to a spreadsheet program.  Commas, which are commonly used for this, come with a host of problems:

Numeric-looking fields that are actually character strings (you are in a text file after all), like 1,234.56, or values written with the convention of using commas for decimal points, will throw off a parser, so those fields have to be wrapped in quotes.  But, oh no, what if that field also has a quote in it?  Then it has to be escaped (replaced with a sequence that will parse correctly).  There are two prevalent methods: using \" or doubling up the quote "".  You can see a parser has to expect one or the other; it can't accept both and work properly.  So, if you are writing a program that is dealing with unknown values, you have to take all these things into consideration.  What a pain.
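To make the quoting dance concrete, here is a rough sketch of the doubled-quote method on the writing side.  QuoteCsv is a made-up helper name, not something built into Gambas, and it assumes whoever reads the file expects the "" convention rather than \".

```gambas
' Gambas module file

' Hypothetical helper: quote one CSV field using the doubled-quote convention.
Private Function QuoteCsv(Value As String) As String

  Dim Q As String = "\""   ' the double-quote character

  ' Fields without a comma, quote or line break can go out unchanged.
  If (InStr(Value, ",") = 0) And (InStr(Value, Q) = 0) And (InStr(Value, "\n") = 0) And (InStr(Value, "\r") = 0) Then
    Return Value
  Endif

  ' Double every embedded quote, then wrap the whole field in quotes.
  Return Q & Replace(Value, Q, Q & Q) & Q

End

Public Sub Main()

  Print QuoteCsv("1,234.56")           ' prints "1,234.56"
  Print QuoteCsv("Howdy, \"folks\"")   ' prints "Howdy, ""folks"""
  Print QuoteCsv("plain")              ' prints plain

End
```

Note that a plain split on commas will still break those quoted fields apart, so the reading side needs matching logic too.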

Now, let's bring in the tab.  Stick a tab between each field and all those issues go away, poof.  A tab is considered a white space character, so it is invisible when a document is sent to a printer.  If there is even the possibility that there might be a tab character coming in, all you have to do is replace it with the standard escape sequence \t.  Most spreadsheet programs will interpret this literally, so that is what you will see when it is loaded in a cell, but it will be in the right cell.  Spreadsheets don't handle tabs as characters in values very well either, so you are not likely to encounter this.
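That replacement might look something like this on the writing side.  It is only a sketch: EscapeTab is a made-up helper name and the sample fields are invented for illustration.

```gambas
' Gambas module file

' Hypothetical helper: swap a real tab inside a value for the two characters \t.
Private Function EscapeTab(Value As String) As String

  Return Replace(Value, gb.Tab, "\\t")

End

Public Sub Main()

  Dim Fields As String[] = ["Embedded \t tab", "Howdy, \"folks\"", "I'm in column 3"]
  Dim I As Integer

  For I = 0 To Fields.Max
    Fields[I] = EscapeTab(Fields[I])
  Next

  ' Only the delimiters are real tabs now, so the line splits into exactly 3 fields.
  Print Fields.Join(gb.Tab)

End
```

The commas and quotes ride along untouched, which is the whole point.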

To demonstrate, here is a code sample of a writer reading:

Code (gambas)

Public Sub Main()

  Dim IO As File          ' Input/Output stream
  Dim FileName As String
  Dim D As String         ' Delimiter character
  Dim Cell As String
  Dim Cells As String[]
  Dim InputLine As String
  Dim Row As Integer
  Dim Col As Integer

  ' First pass: comma-delimited
  FileName = "~/test.csv"
  D = ","
  GoSub WriteFile
  GoSub ReadFile

  ' Second pass: same data, tab-delimited
  FileName = "~/test.tsv"
  D = gb.Tab
  GoSub WriteFile
  GoSub ReadFile

  Return   ' done; do not fall through into the subroutines below

WriteFile:

  Print FileName

  IO = Open FileName For Output Create

  ' Row 1 has commas inside values, row 2 has an embedded tab and quotes
  Print #IO, "1,234.56"; D; "Howdy, folks"; D; "I'm in column 3"
  Print #IO, "Embedded \t tab"; D; " Howdy, \"folks\""; D; "I'm in column 3"

  Close #IO
  Return

ReadFile:

  IO = Open FileName For Input

  Row = 1
  Do Until Eof(IO)
    Line Input #IO, InputLine
    Col = 1
    ' Naive split on the delimiter, no quote handling at all
    Cells = Split(InputLine, D)
    For Each Cell In Cells
      Print Row, Col, Cell
      Inc Col
    Next
    Inc Row
  Loop

  Close #IO
  Return

End

Your output should look like this:

Code

~/test.csv
1       1       1
1       2       234.56
1       3       Howdy
1       4        folks
1       5       I'm in column 3
2       1       Embedded         tab
2       2        Howdy
2       3        "folks"
2       4       I'm in column 3

~/test.tsv
1       1       1,234.56
1       2       Howdy, folks
1       3       I'm in column 3
2       1       Embedded
2       2        tab
2       3        Howdy, "folks"
2       4       I'm in column 3

So you see, Dottie, neither approach is foolproof, but one is easier to defend against hackers than the other, ummmmm, I mean easier to fix.  Try each in your favorite spreadsheet program to see how they react.

Just keep the tabs in your text files and out of your source code.  And watch out for all those \r's and \n's too; they need the same kind of escaping.  Replacing "\" with "\\" first should keep all of that unambiguous.  But that is a story for another time.  Thanks for reading, and writing, Dottie.  Remember, it's always a good time to go for a walk.
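For the impatient, here is a rough sketch of that escaping order.  EscapeField is a made-up helper name, and the order is the assumption that matters: backslashes get doubled before anything else is touched.

```gambas
' Gambas module file

' Hypothetical helper: make one value safe for a tab-delimited line.
Private Function EscapeField(Value As String) As String

  ' Backslashes first, otherwise the ones added below would get doubled too.
  Value = Replace(Value, "\\", "\\\\")
  Value = Replace(Value, gb.Tab, "\\t")
  Value = Replace(Value, "\r", "\\r")
  Value = Replace(Value, "\n", "\\n")
  Return Value

End

Public Sub Main()

  ' A value holding a real tab, a real line break and a literal backslash followed by t.
  Print EscapeField("a\tb\nc\\t")   ' prints a\tb\nc\\t

End
```

The printed text ends up looking just like the string literal in the source, which makes it easy to check by eye.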

Sincerely,

Mr. Dawg

.... and carry a big stick!