help-glpk

Re: [Help-glpk] comments in csv data files


From: Chris Wolf
Subject: Re: [Help-glpk] comments in csv data files
Date: Wed, 19 May 2010 15:52:33 -0400

Sure.  It actually appears that the list software performed an XML substitution
of the numeric character entities in the script, thus mangling it by replacing
them with literal newlines and carriage returns.  To get around that, I'm
attaching a zip of the script.

  -Chris   

On 5/19/10 10:26 AM, Nigel Galloway wrote:
> Thanks for the script, let's hope it's needed soon.
> 
> The possibility of using XML and XSLT was discussed previously but not taken
> up. Subsequent correspondence has indicated that the csv format was mainly
> required for reading into Excel. As well as editing the XML, XMLFox can in
> fact save the edited XML as an Excel spreadsheet or MS Access database, thus
> doing away with the csv altogether.
> 
> There is a wide variety of modern XML-aware tools (for editing, display and
> conversion) running on a variety of platforms. XSLT is the standard for batch
> processing and for shared applications.
> 
>> ----- Original Message -----
>> From: Chris Wolf <address@hidden>
>> To: address@hidden, Nigel Galloway <address@hidden>
>> Subject: Re: [Help-glpk] comments in csv data files
>> Date: Mon, 17 May 2010 11:29:00 -0400
>>
>>
>> As Andrew points out, there is no standard way to embed comments in the
>> CSV format.
>>
>> There are a number of ways you can embed comments in an XML document.  One
>> is to declare an element or attribute for this purpose in the schema (or
>> DTD).  The other, more general way is to use SGML-style comments, i.e.
>> "<!-- this is a comment -->".
>>
>> The advantage of the former technique is that the comment is not actually
>> an XML comment, but an XML-parsable part of the document, which XML
>> processors can access and pass on in an XML processing pipeline.
>>
>> Probably the simplest approach would be to just use SGML-style comments,
>> which are ignored by XML readers in a pipeline - there's no need to use
>> "grep".
>>
>> I would also advise against depending on a GUI-based editor such as this
>> "XMLFox", since it's platform-dependent (Windows only) and requires
>> manual interaction in a production workflow.
>>
>> You are much better off writing a simple XSLT script to process your
>> XML document to perform any transformations that are needed, such as
>> re-mapping column order, filtering out columns, converting to CSV, etc.
>>
>> In this way, running the XSLT is a simple one-liner command on
>> any platform.
>>
>> Here is an XML=>CSV example:
>>
>> http://stackoverflow.com/questions/365312/xml-to-csv-using-xslt
>>
>> The command invocation on MacOS/Linux would be:
>>
>> $ xsltproc tocsv.xsl data.xml  > data.csv
>>
>> On Windows the invocation would be:
>>
>> c:\> msxsl data.xml tocsv.xsl > data.csv
>>
>> (Note the command arguments are reversed with "msxsl")
>>
>> I cleaned up the XSL script a little:
>>
>>
>> tocsv.xsl:
>>
>> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>    <!-- Convert XML to CSV. Assumes only one level of nesting. -->
>>    <xsl:output method="text" encoding="iso-8859-1"/>
>>    <xsl:strip-space elements="*" />
>>
>>    <xsl:template match="/*/child::*">
>>      <xsl:for-each select="child::*">
>>        <xsl:if test="position() != last()">"<xsl:value-of select="normalize-space(.)"/>",</xsl:if>
>>        <xsl:if test="position() = last()">"<xsl:value-of select="normalize-space(.)"/>"<xsl:text>&#xA;</xsl:text>
>>        </xsl:if>
>>      </xsl:for-each>
>>    </xsl:template>
>> </xsl:stylesheet>
>>
>> If you really want DOS line endings in your CSV result, you can
>> replace "&#xA;" with "&#xD;&#xA;".
>>
>> This script can be easily extended to re-map and/or filter columns.
>>
>>
>> Regards,
>>
>> Chris Wolf
>>
>>
>> On 5/17/10 10:40 AM, Nigel Galloway wrote:
>>> I might have mentioned the benefits of XML before. It has been 
>>> helpfully pointed out that a program may be written in grep to 
>>> convert csv to XML. Careful research has revealed that Excel can 
>>> read csv and save it as XML.
>>>
>>> How, then, to include the useful comments?
>>>
>>> It would be possible to include the comments and then write a 
>>> program in grep to remove them.
>>>
>>> One of the benefits of XML is that someone may already have done 
>>> this for me!!!!
>>>
>>> You are looking to create an Excel file by selecting only some of
>>> the columns from an XML. The columns are not in the correct
>>> sequence for the Excel so they need to be mapped.
>>> You have an XML that contains columns A, B, C, D, E, F, etc.
>>> You only need data from columns B, D, and F.
>>> But in the output Excel file the sequence has to be F, B, and D.
>>>
>>>
>>> To accomplish the task we will use XMLFox Advance, which is a
>>> useful XML and XSD schema editor. By using XMLFox Advance you can
>>> output data to several other data format files. The editor allows
>>> you to export XML tables or whole XML documents to the following:
>>> a TXT file, a CSV (Comma Separated Value) file, an HTML page, a
>>> PDF, an Excel file, an MS Access database, or an upload into an
>>> MS SQL Server database.
>>>
>>> Full details of the above (reproduced without any permission) may 
>>> be found at:
>>>
>>> http://www.xmlfox.com/convert_xml.htm
>>>
>>>
>>>> ----- Original Message -----
>>>> From: Andrew Makhorin <address@hidden>
>>>> To: address@hidden
>>>> Subject: [Help-glpk] comments in csv data files
>>>> Date: Sun, 16 May 2010 13:36:57 +0400
>>>>
>>>>
>>>> I found that it would be convenient to allow comment lines in csv data
>>>> files read from mathprog models thru the table statement. Unfortunately,
>>>> the RFC document 4180 that specifies the csv format says nothing about
>>>> such a feature.
>>>>
>>>> Probably, a comment line can be indicated by its very first character
>>>> (like in many scripting languages): '#', ';', '*', or maybe '%'.
>>>> Another issue is whether to allow comment lines everywhere in the file
>>>> or only in the beginning. The latter seems safer, because the first line
>>>> contains field names which, as a rule, contain no special characters.
>>>>
>>>> Any opinions/suggestions are appreciated. Thanks.
>>>>
>>>> Andrew Makhorin
>>>>
>>>>
>>>> _______________________________________________
>>>> Help-glpk mailing list
>>>> address@hidden
>>>> http://lists.gnu.org/mailman/listinfo/help-glpk
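
Andrew's proposal above (mark comment lines by their first character, and allow
them only before the header line) can be tried out as a small pre-filter. The
following is only a sketch in Python, not how GLPK's csv table driver works:
the function name read_csv_with_leading_comments is hypothetical, and '#' is
assumed as the marker out of the candidates listed in the thread.

```python
import csv
import io

COMMENT_CHARS = "#"  # assumption: the thread also floats ';', '*' and '%'


def read_csv_with_leading_comments(text, comment_chars=COMMENT_CHARS):
    """Parse CSV text, skipping comment lines that precede the header row.

    Only lines *before* the header are treated as comments, so a data
    field that happens to start with '#' later in the file is preserved.
    """
    lines = text.splitlines(keepends=True)
    markers = tuple(comment_chars)
    start = 0
    # Advance past the leading run of comment lines.
    while start < len(lines) and lines[start].startswith(markers):
        start += 1
    # Hand the remainder (header row + data) to the standard CSV parser.
    reader = csv.DictReader(io.StringIO("".join(lines[start:])))
    return list(reader)


# Illustrative data only.
sample = (
    "# plant capacities, in cases\n"
    "# illustrative values only\n"
    "plant,capacity\n"
    "Seattle,350\n"
    "San-Diego,600\n"
)

rows = read_csv_with_leading_comments(sample)
for row in rows:
    print(row["plant"], row["capacity"])
```

Restricting comments to the top of the file keeps the filter trivial, since
nothing inside the data section has to be inspected or unquoted.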

Attachment: tocsv.xsl.zip
Description: Zip archive
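
For readers without an XSLT processor at hand, the two points made in the quoted
message (XML comments are ignored by the reader, and columns can be filtered and
re-mapped during conversion) can be illustrated with the Python standard
library. This is a sketch, not a replacement for tocsv.xsl: the function name
xml_to_csv and the sample document are hypothetical, and the column names A to F
and the output order F, B, D are taken from the example earlier in the thread.
Like the stylesheet, it assumes only one level of nesting.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document: one level of nesting, plus an XML comment.
XML_DATA = """<table>
  <!-- this comment is dropped by the parser -->
  <row><A>1</A><B>2</B><C>3</C><D>4</D><E>5</E><F>6</F></row>
  <row><A>7</A><B>8</B><C>9</C><D>10</D><E>11</E><F>12</F></row>
</table>"""


def xml_to_csv(xml_text, columns):
    """Emit one quoted CSV line per record, keeping only `columns`, in that order."""
    root = ET.fromstring(xml_text)  # comments are discarded by the default parser
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for record in root:
        # Collapse runs of whitespace, similar to XSLT's normalize-space().
        values = [" ".join((record.findtext(name) or "").split())
                  for name in columns]
        writer.writerow(values)
    return out.getvalue()


# Keep columns B, D and F only, written in the order F, B, D.
print(xml_to_csv(XML_DATA, ["F", "B", "D"]), end="")
```

Changing lineterminator to "\r\n" gives DOS line endings, which corresponds to
the entity swap described for the stylesheet.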

