Thanks for your advice. I'll give it a go now, but at first glance I think I'm missing something in my understanding of how to proceed. From the way my original post is quoted in your reply, it looks like the three attachments have been inlined - perhaps mangled by the mail list software. I'll forward the original post to you in the hope that the .st file I attached to it is then loadable.

cheers -ben

Tudor Girba wrote:
> Hi Ben,
>
> I did not look into the details (mostly because you did not provide easily loadable code), but from what I can read:
> - you need to create a stack in your parser to make sure that the rows go to the right component
> - one way to do it is to define a components variable in the Parser
> - make sure you declare it in ignoredNames
>     Parser class>>ignoredNames
>         ^ super ignoredNames , #(components)
> - in componentNamesRow, you need to add to components with the newly created one
> - always associate the dataRows to the last created component
>
> Cheers,
> Doru
>
>
> On 26 Dec 2012, at 08:17, Ben Coman <btc@openInWorld.com> wrote:
>
>> I tried PetitParser for the first time a few days ago, to import some data required for testing my Masters project (which is due RealSoonNow(c)). I found it easy to adapt PetitCSV for part of my requirement, but am having trouble extending it further. I was hoping to get a lot done this vacation on completing my project but this is slowing me down.
>>
>> Now since everyone has better things to do this festive break, I am offering two small bounties of AUD $50 to hopefully draw the attention of someone more familiar with PetitParser, for whom this will hopefully be much quicker than my random experimenting.
>>
>> The attached LEImportSKMPowerToolsData.txt shows a sample of what I need to parse. A single file containing multiple tables (12 of them) with the data of each table in CSV format.
>> Tables start with a line "// --- tablename ---", followed by column names on the next line, followed by multiple lines of data.
>>
>> Using the attached LEImportSKMPowerTools.st (also see attached LEImportSKMPowerTools.txt formatted for quick reference)
>> I can successfully parse the first table of the following sample...
>> =========
>> // --- Bus ---
>> //<ComponentName>,<ComponentType>,<SystemNominalVoltage>,<AF_ArcType>,<AF_WorkingDistance>
>> 00BFA10,10,415,In Box,609.6
>> 00BFA20,10,415,In Box,609.6
>>
>> // --- Cable ---
>> //<ComponentName>,<SystemNominalVoltage>,<Phase>,<CableSize>,<NeutralSize>,<Length>,<Size/Do Not Size>
>> 11P08023,11500,ABC,240,,680.0,Do Not Size
>> 11P08024,11500,ABC,240,,610.0,Do Not Size
>> =========
>> ... but the next table ends up as a subpart of the first as shown in the following result for the above sample...
>> #(
>>   #('Bus'
>>     #('ComponentName' 'ComponentType' 'SystemNominalVoltage' 'AF_ArcType' 'AF_WorkingDistance')
>>     #(
>>       #('00BFA10' '10' '415' 'In Box' '609.6')
>>       #('00BFA20' '10' '415' 'In Box' '609.6')
>>       #('')
>>       #('// --- Cable ---')
>>       #('//<ComponentName>' '<SystemNominalVoltage>' '<Phase>' '<CableSize>' '<NeutralSize>' '<Length>' '<Size/Do Not Size>')
>>       #('11P08023' '11500' 'ABC' '240' '' '680.0' 'Do Not Size')
>>       #('11P08024' '11500' 'ABC' '240' '' '610.0' 'Do Not Size')
>>     )
>>   )
>> )
>>
>> Essentially I don't know how to close off the pattern matching at the blank line following the data rows, so that the next table can be started. Also, something I haven't looked at yet: LEImportSKMPowerToolsData.txt has comment lines starting with '//' that need to be ignored along with the blank lines.
>>
>> So two things are required for the first $50.
>> 1. The above result needs to instead have two close-round-brackets in place of the blank line #('') so that #('Cable' appears at the same level as #('Bus'
>> 2. The comments and blank lines need to be ignored.
>>
>> In addition, all of the PetitParser documentation that I find says "parse a string (or stream)" but I see no actual samples of using a stream. Changing from using a string to a stream is pretty basic and I should probably be able to guess, but while I have your attention I may as well learn what is best practice.
>>
>> ------------
>> I also offer a second AUD $50 for the following...
>> 1. A method to which I pass a filename, that will return an object which can be accessed with the tablename to return a collection of rows,
>> 2. A cell in one of those rows can be accessed by the column name.
>>
>> This will just be a temporary step to move the data into my own internal format, so it doesn't need to be elegant. I had two rough separate ideas:
>> a. A simple dictionary of array of dictionaries.
>> b. A method that when passed a column name will return the array index into a row.
>>
>> ------------
>> So I hope that it is worth someone's while to attend to this quickly. Technical discussion should continue here. I am not quite sure yet how I'll handle multiple solutions. The method of payment will need to be discussed privately with contributors once requirements are met.
>>
>> hope you are all enjoying your festive season...
>> cheers -ben
>>
>> 'From Pharo1.4 of 18 April 2012 [Latest update: #14457] on 26 December 2012 at 10:58:18 am'!
>> PPCompositeParser subclass: #LEImportSKMPowerTools
>>     instanceVariableNames: 'endOfLine nonComma start tableName columnNames columnName columnNamesRow table dataCell dataRow dataRows'
>>     classVariableNames: ''
>>     poolDictionaries: ''
>>     category: 'Lektrek-ImportExport'!
>>
>> !LEImportSKMPowerTools methodsFor: 'as yet unclassified' stamp: 'BenComan 12/24/2012 15:23'!
>> dataCell
>>     ^ nonComma star flatten
>>     ==> [ :nodes | nodes value ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'as yet unclassified' stamp: 'BenComan 12/24/2012 19:37'!
>> dataRows
>>     ^ (dataRow delimitedBy: endOfLine) ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'as yet unclassified' stamp: 'BenComan 12/24/2012 12:18'!
>> endOfLine
>>     ^ #newline asParser token! !
>>
>> !LEImportSKMPowerTools methodsFor: 'as yet unclassified' stamp: 'BenComan 12/24/2012 12:23'!
>> nonComma
>>     ^ PPPredicateObjectParser anyExceptAnyOf: {Character tab . Character cr . Character lf . $, }! !
>>
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/24/2012 15:38'!
>> columnName
>>     ^ $< asParser , (#letter asParser / #digit asParser / #space asParser / $_ asParser / $/ asParser ) star flatten , $> asParser
>>     ==> [ :nodes | nodes second value ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/24/2012 14:54'!
>> columnNames
>>     ^ ((columnName delimitedBy: $, asParser token) ) ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/24/2012 14:56'!
>> columnNamesRow
>>     ^ '//' asParser, columnNames
>>     ==> [ :nodes | nodes second value ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/24/2012 15:23'!
>> dataRow
>>     ^ (dataCell delimitedBy: $, asParser token) ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/26/2012 10:30'!
>> start
>>     ^ table star end! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/26/2012 10:38'!
>> table
>>     ^ ( (tableName , endOfLine, columnNamesRow , endOfLine , dataRows)
>>     ==> [ :nodes | { nodes first . nodes third . nodes fifth } ] )! !
>>
>> !LEImportSKMPowerTools methodsFor: 'grammar' stamp: 'BenComan 12/24/2012 14:24'!
>> tableName
>>     ^ '// --- ' asParser, #word asParser star flatten, ' ---' asParser
>>     ==> [ :nodes | nodes second value ]! !
>> PPCompositeParser subclass: #LEImportSKMPowerTools
>>     instanceVariableNames: 'endOfLine nonComma start table tableName columnNamesRow columnNames columnName dataRows dataRow dataCell'
>>
>> LEImportSKMPowerTools>>start
>>     ^ table star end
>>
>> LEImportSKMPowerTools>>table
>>     ^ ( (tableName , endOfLine, columnNamesRow , endOfLine , dataRows)
>>     ==> [ :nodes | { nodes first . nodes third . nodes fifth } ] )
>>
>> LEImportSKMPowerTools>>tableName
>>     ^ '// --- ' asParser, #word asParser star flatten, ' ---' asParser
>>     ==> [ :nodes | nodes second value ]
>>
>> LEImportSKMPowerTools>>columnNamesRow
>>     ^ '//' asParser, columnNames
>>     ==> [ :nodes | nodes second value ]
>>
>> LEImportSKMPowerTools>>columnNames
>>     ^ ((columnName delimitedBy: $, asParser token) )
>>     ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]
>>
>> LEImportSKMPowerTools>>columnName
>>     ^ $< asParser , (#letter asParser / #digit asParser / #space asParser / $_ asParser / $/ asParser ) star flatten , $> asParser
>>     ==> [ :nodes | nodes second value ]
>>
>> LEImportSKMPowerTools>>dataRows
>>     ^ (dataRow delimitedBy: endOfLine)
>>     ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]
>>
>> LEImportSKMPowerTools>>dataRow
>>     ^ (dataCell delimitedBy: $, asParser token)
>>     ==> [ :nodes | nodes reject: [ :each | each class = PPToken ] ]
>>
>> LEImportSKMPowerTools>>dataCell
>>     ^ nonComma star flatten
>>     ==> [ :nodes | nodes value ]
>>
>> LEImportSKMPowerTools>>endOfLine
>>     ^ #newline asParser token
>>
>> LEImportSKMPowerTools>>nonComma
>>     ^ PPPredicateObjectParser anyExceptAnyOf: {Character tab . Character cr . Character lf . $, }
>>
>> // Export start : 12/18/12 18:18:07
>> // Datablock format : All Input Data
>> // Query name : *ALL COMPONENTS
>>
>> // --- Bus ---
>> //<ComponentName>,<ComponentType>,<SystemNominalVoltage>,<AF_ArcType>,<AF_WorkingDistance>
>> 00BFA10,10,415,In Box,609.6
>> 00BFA20,10,415,In Box,609.6
>>
>> // The following field(s) are enumerated data types
>> // Either the quoted text or integer may be used in importing
>>
>> // <InService>: "Out"=0 "In"=1
>> // <Phase>: "None"=0 "A"=1 "B"=2 "C"=4 "AB"=3 "AC"=5 "BC"=6 "ABC"=7 "AB MidTap"=48 "BC MidTap"=96 "CA MidTap"=80 "ABC MidTap"=112
>> // <Size/Do Not Size>: "Size"=0 "Do Not Size"=1
>>
>> // --- Cable ---
>> //<ComponentName>,<SystemNominalVoltage>,<Phase>,<CableSize>,<NeutralSize>,<Length>,<Size/Do Not Size>
>> 11P08023,11500,ABC,240,,680.0,Do Not Size
>> 11P08024,11500,ABC,240,,610.0,Do Not Size
>> _______________________________________________
>> Moose-dev mailing list
>> Moose-dev@iam.unibe.ch
>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>>
>
> --
> www.tudorgirba.com
>
> "We are all great at making mistakes."
>
> _______________________________________________
> Moose-dev mailing list
> Moose-dev@iam.unibe.ch
> https://www.iam.unibe.ch/mailman/listinfo/moose-dev
>
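
For idea (a) in the second bounty quoted above, "a simple dictionary of array of dictionaries", a minimal sketch in plain Pharo Smalltalk (no PetitParser calls) might look like the following. It assumes the parser already returns one #(tableName columnNames dataRows) triple per table, with 'Cable' at the same level as 'Bus' as requested in the first bounty. The literal data is cut down from the sample file, and the names parsed, tables and record are illustrative only, not taken from the attachments.

```smalltalk
"Illustrative input: one #(tableName columnNames dataRows) triple per table.
 This is an assumed parser result, trimmed to three columns per table."
| parsed tables |
parsed := #(
    #('Bus'
        #('ComponentName' 'ComponentType' 'SystemNominalVoltage')
        #( #('00BFA10' '10' '415') #('00BFA20' '10' '415') ))
    #('Cable'
        #('ComponentName' 'SystemNominalVoltage' 'Phase')
        #( #('11P08023' '11500' 'ABC') #('11P08024' '11500' 'ABC') )) ).

"tables maps a table name to an Array of Dictionaries, one Dictionary per row."
tables := Dictionary new.
parsed do: [ :table |
    | columns |
    columns := table second.
    tables
        at: table first
        put: (table third collect: [ :row |
            | record |
            record := Dictionary new.
            "with:do: pairs each column name with the cell at the same index;
             it signals an error if the two collections differ in size."
            columns with: row do: [ :name :cell | record at: name put: cell ].
            record ]) ].

"Look up a cell by table name, row position and column name."
((tables at: 'Cable') at: 2) at: 'Phase'
```

Evaluating the last expression in a workspace should answer 'ABC'. Because with:do: insists on equal sizes, a data row with a missing or extra cell would raise an error here rather than be silently misaligned, which seems a reasonable default for a temporary import step.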